Christoph Auer
|
07206c5b3e
|
Merge branch 'cau/input-format-abstraction' of github.com:DS4SD/docling into cau/input-format-abstraction
|
2024-10-16 13:49:04 +02:00 |
|
Christoph Auer
|
5c862b5971
|
Update docling-core pinnings
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
|
2024-10-16 13:49:01 +02:00 |
|
Maxim Lysak
|
a07a187150
|
Added and fixed origin for msword and mspowerpoint backend
Signed-off-by: Maxim Lysak <mly@zurich.ibm.com>
|
2024-10-16 13:32:50 +02:00 |
|
Christoph Auer
|
515ab04947
|
Update docling-core pinnings
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
|
2024-10-16 13:32:15 +02:00 |
|
Michele Dolfi
|
d5f161d0f5
|
apply changes to the picture data annotations
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
|
2024-10-16 13:24:21 +02:00 |
|
Christoph Auer
|
c123e5a812
|
Update docling-core pinnings
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
|
2024-10-16 11:43:23 +02:00 |
|
Michele Dolfi
|
dd2982cce1
|
pin models, core and adapt example
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
|
2024-10-16 10:57:05 +02:00 |
|
Christoph Auer
|
8a25230240
|
Update v2 documentation
|
2024-10-16 10:28:40 +02:00 |
|
Christoph Auer
|
df3ff47914
|
Add migration instructions to doc (cont)
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
|
2024-10-15 19:08:45 +02:00 |
|
Michele Dolfi
|
cd8e3dce76
|
fix generation of images and adapt examples
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
|
2024-10-15 17:43:47 +02:00 |
|
Michele Dolfi
|
75feef259d
|
Merge branch 'cau/input-format-abstraction' of github.com:DS4SD/docling into cau/input-format-abstraction
|
2024-10-15 17:10:30 +02:00 |
|
Michele Dolfi
|
1cb11be06f
|
add options to generate images
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
|
2024-10-15 17:09:54 +02:00 |
|
Christoph Auer
|
74e0452b6a
|
Add migration instructions to doc (wip)
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
|
2024-10-15 17:08:48 +02:00 |
|
Christoph Auer
|
9d15f4d5bf
|
Adjust CI examples path
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
|
2024-10-15 16:36:21 +02:00 |
|
Christoph Auer
|
40bb84d2de
|
Change output folder for examples.
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
|
2024-10-15 16:34:45 +02:00 |
|
Panos Vagenas
|
c1794a79e2
|
add v2 docs placeholder [skip ci] (#145)
* add v2 docs placeholder [skip ci]
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
* Remove conflicts
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
---------
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
|
2024-10-15 16:30:35 +02:00 |
|
Christoph Auer
|
84438bd8a8
|
Merge from main and remove conflicts
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
|
2024-10-15 16:22:28 +02:00 |
|
Christoph Auer
|
ba9eaf1bd7
|
CLI and error handling fixes
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
|
2024-10-15 15:58:39 +02:00 |
|
Christoph Auer
|
a66c4ee8eb
|
Merge branch 'cau/input-format-abstraction' of github.com:DS4SD/docling into cau/input-format-abstraction
|
2024-10-15 14:58:10 +02:00 |
|
Christoph Auer
|
27f4ed3620
|
Enable mypy and fix many reported errors
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
|
2024-10-15 14:58:00 +02:00 |
|
Michele Dolfi
|
f49d7881d0
|
pin docling-core and glm
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
|
2024-10-15 14:35:02 +02:00 |
|
Maxim Lysak
|
115435a835
|
Fixes for lists handling in docx
Signed-off-by: Maxim Lysak <mly@zurich.ibm.com>
|
2024-10-15 14:33:37 +02:00 |
|
Christoph Auer
|
d687f93d52
|
Merge branch 'cau/input-format-abstraction' of github.com:DS4SD/docling into cau/input-format-abstraction
|
2024-10-15 10:52:23 +02:00 |
|
Christoph Auer
|
fa5d972291
|
Merge remaining changes from main
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
|
2024-10-15 10:52:16 +02:00 |
|
Panos Vagenas
|
6b8835b234
|
switch convert_all output type from Iterable to Iterator (#143)
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
|
2024-10-15 10:29:45 +02:00 |
|
Christoph Auer
|
dac82ca7f2
|
Import statement updates from docling-core
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
|
2024-10-15 10:11:10 +02:00 |
|
Christoph Auer
|
8710506072
|
Merge branch 'cau/input-format-abstraction' of github.com:DS4SD/docling into cau/input-format-abstraction
|
2024-10-15 09:50:18 +02:00 |
|
Christoph Auer
|
afafb97b87
|
Update CLI
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
|
2024-10-15 09:50:06 +02:00 |
|
Maxim Lysak
|
aa22fd31db
|
small corrections to pptx
|
2024-10-15 09:43:06 +02:00 |
|
Christoph Auer
|
5b33b12660
|
renaming BaseTableData
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
|
2024-10-14 17:01:50 +02:00 |
|
Christoph Auer
|
b964c4bb69
|
Merge branch 'cau/input-format-abstraction' of github.com:DS4SD/docling into cau/input-format-abstraction
|
2024-10-14 16:54:56 +02:00 |
|
Christoph Auer
|
57de8ad63a
|
Fix generate_multimodal_pages
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
|
2024-10-14 16:52:58 +02:00 |
|
Maxim Lysak
|
98ca58ffd0
|
added support for enumerated lists
Signed-off-by: Maxim Lysak <mly@zurich.ibm.com>
|
2024-10-14 16:49:19 +02:00 |
|
Christoph Auer
|
3f0b01702b
|
Update example export code
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
|
2024-10-14 16:40:40 +02:00 |
|
Christoph Auer
|
a50ba57a1f
|
Merge branch 'cau/input-format-abstraction' of github.com:DS4SD/docling into cau/input-format-abstraction
|
2024-10-14 16:36:20 +02:00 |
|
Christoph Auer
|
497ddb34a8
|
Big refactoring for legacy_document support
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
|
2024-10-14 16:36:11 +02:00 |
|
Maxim Lysak
|
e87bf9ae06
|
Updated pptx backend, fixes issues with lists, also added more different list cases to example
Signed-off-by: Maxim Lysak <mly@zurich.ibm.com>
|
2024-10-14 16:20:17 +02:00 |
|
Panos Vagenas
|
d504432c1e
|
docs: introduce docs site (#141)
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
|
2024-10-14 14:13:13 +02:00 |
|
Michele Dolfi
|
08ab628e75
|
use self.artifacts_path
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
|
2024-10-14 09:03:49 +02:00 |
|
Michele Dolfi
|
ab8f71511b
|
fix artifacts_path via pipeline_options
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
|
2024-10-14 08:57:15 +02:00 |
|
Michele Dolfi
|
2b1e72d327
|
refactor: fix type of tesseractocr options (#140)
Signed-off-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com>
|
2024-10-14 08:40:22 +02:00 |
|
Michele Dolfi
|
245b6c4c01
|
pin picture data with molecule
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
|
2024-10-13 18:07:43 +02:00 |
|
Michele Dolfi
|
ddb509628e
|
use do_ flag in pipeline_options
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
|
2024-10-13 16:54:46 +02:00 |
|
Michele Dolfi
|
7c8d7e222e
|
use new PictureData
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
|
2024-10-13 16:48:16 +02:00 |
|
Michele Dolfi
|
c1ed447c21
|
propagate raises, add enrichment model, some renaming
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
|
2024-10-13 16:03:19 +02:00 |
|
Michele Dolfi
|
941b51aa3e
|
missing renamed files
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
|
2024-10-11 18:10:45 +02:00 |
|
Michele Dolfi
|
7f10a546d3
|
Merge branch 'cau/input-format-abstraction' of github.com:DS4SD/docling into cau/input-format-abstraction
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
|
2024-10-11 17:04:01 +02:00 |
|
Michele Dolfi
|
98f1a4597e
|
rename and refactor *model*
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
|
2024-10-11 16:57:40 +02:00 |
|
Christoph Auer
|
69f0ab419c
|
Bump docling-core version
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
|
2024-10-11 16:55:01 +02:00 |
|
Christoph Auer
|
2a259b9723
|
Merge branch 'cau/input-format-abstraction' of github.com:DS4SD/docling into cau/input-format-abstraction
|
2024-10-11 16:47:20 +02:00 |
|