Commit Graph

21 Commits

Author SHA1 Message Date
Michele Dolfi
7f10a546d3 Merge branch 'cau/input-format-abstraction' of github.com:DS4SD/docling into cau/input-format-abstraction
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2024-10-11 17:04:01 +02:00
Michele Dolfi
98f1a4597e rename and refactor *model*
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2024-10-11 16:57:40 +02:00
Christoph Auer
6efcf0a5a5 Add image format support to PdfBackend
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2024-10-11 16:47:15 +02:00
Christoph Auer
d0fccb9342 Merge from simplify-conv-api
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2024-10-11 15:57:08 +02:00
Christoph Auer
95c1f80087 Change code to use unordered/ordered list, robustifications
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2024-10-11 14:53:38 +02:00
Panos Vagenas
136f16e85a feat!: simplify conversion API (#139)
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
2024-10-11 14:52:37 +02:00
Christoph Auer
52713f0cf5 Optionally produce legacy_doc
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2024-10-11 12:57:47 +02:00
Christoph Auer
025983f07b Backend error handling fixes
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2024-10-11 11:18:47 +02:00
Christoph Auer
304d16029a More renaming, design enrichment interface
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2024-10-11 10:21:31 +02:00
Michele Dolfi
3794f8245e add example PNG
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2024-10-10 18:29:26 +02:00
Christoph Auer
7cad290ceb Refactor test data, legacy usage and more
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2024-10-10 13:54:44 +02:00
Christoph Auer
0dfbd0b6fc Update examples and test cases
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2024-10-09 15:20:27 +02:00
Christoph Auer
080042d06d Merge from upstream
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2024-10-08 16:40:55 +02:00
Christoph Auer
203cf19b1b Lots of improvements
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2024-10-08 16:38:42 +02:00
Maxim Lysak
07d952acf9 Improved backends
Signed-off-by: Maxim Lysak <mly@zurich.ibm.com>
2024-10-08 16:37:47 +02:00
Christoph Auer
c0447206af Merge from main
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2024-10-08 14:42:33 +02:00
Maxim Lysak
89e58ca730 Added HTML backend implementation, few improvements for other backends
Signed-off-by: Maxim Lysak <mly@zurich.ibm.com>
2024-10-08 11:14:44 +02:00
Maxim Lysak
f773d8a621 Improved demo code, that saves output mds to files
Signed-off-by: Maxim Lysak <mly@zurich.ibm.com>
2024-10-07 17:25:17 +02:00
Maxim Lysak
bea9fc22af Added mspowerpoint backend first implementation, improvements on msword backend
Signed-off-by: Maxim Lysak <mly@zurich.ibm.com>
2024-10-07 14:55:21 +02:00
Maxim Lysak
cefc34e8d8 Working on a first version of DOCX native backend
Signed-off-by: Maxim Lysak <mly@zurich.ibm.com>
2024-10-04 18:19:40 +02:00
Christoph Auer
1fa7cd9855 Fundamental refactoring for multi-format support
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2024-10-01 16:54:09 +02:00