Michele Dolfi
d5f161d0f5
apply changes to the picture data annotations
...
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2024-10-16 13:24:21 +02:00
Michele Dolfi
dd2982cce1
pin models, core and adapt example
...
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2024-10-16 10:57:05 +02:00
Christoph Auer
84438bd8a8
Merge from main and remove conflicts
...
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2024-10-15 16:22:28 +02:00
Christoph Auer
a66c4ee8eb
Merge branch 'cau/input-format-abstraction' of github.com:DS4SD/docling into cau/input-format-abstraction
2024-10-15 14:58:10 +02:00
Christoph Auer
27f4ed3620
Enable mypy and fix many reported errors
...
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2024-10-15 14:58:00 +02:00
Michele Dolfi
f49d7881d0
pin docling-core and glm
...
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2024-10-15 14:35:02 +02:00
Christoph Auer
dac82ca7f2
Import statement updates from docling-core
...
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2024-10-15 10:11:10 +02:00
Christoph Auer
5b33b12660
renaming BaseTableData
...
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2024-10-14 17:01:50 +02:00
Panos Vagenas
d504432c1e
docs: introduce docs site (#141)
...
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
2024-10-14 14:13:13 +02:00
Michele Dolfi
245b6c4c01
pin picture data with molecule
...
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2024-10-13 18:07:43 +02:00
Michele Dolfi
7c8d7e222e
use new PictureData
...
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2024-10-13 16:48:16 +02:00
Christoph Auer
69f0ab419c
Bump docling-core version
...
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2024-10-11 16:55:01 +02:00
github-actions[bot]
4672b24c1a
chore: bump version to 1.20.0 [skip ci]
2024-10-11 13:48:02 +00:00
Christoph Auer
5e4944f15f
feat: new experimental docling-parse v2 backend (#131)
...
---------
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
2024-10-11 15:12:49 +02:00
Michele Dolfi
331ab36f04
Merge remote-tracking branch 'origin/main' into cau/input-format-abstraction
...
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2024-10-11 11:23:04 +02:00
github-actions[bot]
2ec39636f0
chore: bump version to 1.19.1 [skip ci]
2024-10-11 08:52:09 +00:00
Michele Dolfi
1bcad334f2
pin docling-parse release
...
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2024-10-10 18:30:09 +02:00
Michele Dolfi
bde8186700
update pinning
...
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2024-10-10 17:54:05 +02:00
Michele Dolfi
50c05b262a
pin updates compatible with each other
...
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2024-10-10 17:40:32 +02:00
Christoph Auer
7cad290ceb
Refactor test data, legacy usage and more
...
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2024-10-10 13:54:44 +02:00
Panos Vagenas
5f1bd9e9c8
docs: simplify LlamaIndex example using Docling extension (#135)
...
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
2024-10-09 22:17:56 +02:00
Christoph Auer
b5a27386c1
Merge from main, update OCR model and test cases
...
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2024-10-09 16:04:19 +02:00
Panos Vagenas
6924999f1f
chore: explicitly manage pandas dependency (#134)
...
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
2024-10-09 14:50:39 +02:00
github-actions[bot]
0ffc1708d2
chore: bump version to 1.19.0 [skip ci]
2024-10-08 17:42:29 +00:00
Michele Dolfi
f96ea86a00
feat: add options for choosing OCR engines (#118)
...
---------
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
Co-authored-by: Nikos Livathinos <nli@zurich.ibm.com>
Co-authored-by: Peter Staar <taa@zurich.ibm.com>
2024-10-08 19:07:08 +02:00
Christoph Auer
c0447206af
Merge from main
...
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2024-10-08 14:42:33 +02:00
Maxim Lysak
89e58ca730
Added HTML backend implementation, few improvements for other backends
...
Signed-off-by: Maxim Lysak <mly@zurich.ibm.com>
2024-10-08 11:14:44 +02:00
Maxim Lysak
bea9fc22af
Added mspowerpoint backend first implementation, improvements on msword backend
...
Signed-off-by: Maxim Lysak <mly@zurich.ibm.com>
2024-10-07 14:55:21 +02:00
github-actions[bot]
9b82ae3324
chore: bump version to 1.18.0 [skip ci]
2024-10-03 17:16:00 +00:00
Maxim Lysak
2422f706a1
feat: new torch-based docling models (#120)
...
---------
Signed-off-by: Maxim Lysak <mly@zurich.ibm.com>
Co-authored-by: Maxim Lysak <mly@zurich.ibm.com>
2024-10-03 18:42:33 +02:00
github-actions[bot]
9ebbbc1245
chore: bump version to 1.17.0 [skip ci]
2024-10-03 13:44:52 +00:00
Michele Dolfi
d44c62d7ce
feat: windows support (#122)
...
* feat: windows support
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* add Windows in README
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
---------
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2024-10-03 14:23:47 +02:00
Christoph Auer
0a86529afb
Repinning
...
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2024-09-30 13:47:22 +02:00
github-actions[bot]
cde671cf34
chore: bump version to 1.16.1 [skip ci]
2024-09-27 14:36:40 +00:00
Michele Dolfi
34bd887a7f
fix: allow usage of opencv 4.6.x (#110)
...
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2024-09-27 15:51:43 +02:00
github-actions[bot]
6760571fe1
chore: bump version to 1.16.0 [skip ci]
2024-09-27 06:21:15 +00:00
Christoph Auer
9ffd1dc396
Merge from main
2024-09-26 18:06:08 +02:00
Christoph Auer
0ee82a5e78
Bump deepsearch-glm
...
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2024-09-25 16:05:54 +02:00
Christoph Auer
ad2bd714d4
Update GT test files for pages
...
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2024-09-25 15:54:55 +02:00
Panos Vagenas
39977b5631
chore: move examples extras to respective group (#103)
...
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
2024-09-25 15:47:48 +02:00
Christoph Auer
3efc2bbbf4
Apply renamings to DocItemLabel
...
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2024-09-25 12:22:02 +02:00
github-actions[bot]
3dfd02a7e9
chore: bump version to 1.15.0 [skip ci]
2024-09-24 15:58:16 +00:00
Michele Dolfi
6a03c208ec
feat: add figure in markdown (#98)
...
* feat: add figures in markdown
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* update to new docling-core and update test results with figures
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* update with improved docling-core
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
---------
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2024-09-24 17:28:23 +02:00
Christoph Auer
850a521195
Update lockfile
...
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2024-09-24 16:26:22 +02:00
Christoph Auer
33373ac0dd
Switch everything to use label enum, and more
...
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2024-09-24 16:00:39 +02:00
github-actions[bot]
001d214a13
chore: bump version to 1.14.0 [skip ci]
2024-09-24 13:38:23 +00:00
Christoph Auer
867e06f9f2
Merge from main
...
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2024-09-24 12:05:17 +02:00
github-actions[bot]
c65a01c9b7
chore: bump version to 1.13.1 [skip ci]
2024-09-23 19:04:01 +00:00
Peter W. J. Staar
4794ce460a
fix: updated the render_as_doctags with the new arguments from docling-core (#93)
...
* updated the render_as_doctags with the new arguments from docling-core
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* ensuring that docling-core is >1.5.0 to accommodate the latest export-to-doctags parameters
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* added the doctags tests
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* updated the README
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* fix poetry lock
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* Fix formatting problems
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* fixed the doctag export in docling/utils/export.py
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* propagate xsize and ysize
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
---------
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
2024-09-23 20:12:18 +02:00
Christoph Auer
abb6dddea8
Reorganize imports from docling-core
...
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2024-09-20 10:53:52 +02:00