Christoph Auer | ba9eaf1bd7 | CLI and error handling fixes; Signed-off-by: Christoph Auer <cau@zurich.ibm.com> | 2024-10-15 15:58:39 +02:00
Christoph Auer | a66c4ee8eb | Merge branch 'cau/input-format-abstraction' of github.com:DS4SD/docling into cau/input-format-abstraction | 2024-10-15 14:58:10 +02:00
Christoph Auer | 27f4ed3620 | Enable mypy and fix many reported errors; Signed-off-by: Christoph Auer <cau@zurich.ibm.com> | 2024-10-15 14:58:00 +02:00
Maxim Lysak | 115435a835 | Fixes for lists handling in docx; Signed-off-by: Maxim Lysak <mly@zurich.ibm.com> | 2024-10-15 14:33:37 +02:00
Christoph Auer | dac82ca7f2 | Import statement updates from docling-core; Signed-off-by: Christoph Auer <cau@zurich.ibm.com> | 2024-10-15 10:11:10 +02:00
Christoph Auer | 5b33b12660 | renaming BaseTableData; Signed-off-by: Christoph Auer <cau@zurich.ibm.com> | 2024-10-14 17:01:50 +02:00
Michele Dolfi | 7c8d7e222e | use new PictureData; Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> | 2024-10-13 16:48:16 +02:00
Christoph Auer | 6efcf0a5a5 | Add image format support to PdfBackend; Signed-off-by: Christoph Auer <cau@zurich.ibm.com> | 2024-10-11 16:47:15 +02:00
Christoph Auer | 95c1f80087 | Change code to use unordered/ordered list, robustifications; Signed-off-by: Christoph Auer <cau@zurich.ibm.com> | 2024-10-11 14:53:38 +02:00
Christoph Auer | 025983f07b | Backend error handling fixes; Signed-off-by: Christoph Auer <cau@zurich.ibm.com> | 2024-10-11 11:18:47 +02:00
Maxim Lysak | da0700f959 | Fixes for docx backend; Signed-off-by: Maxim Lysak <mly@zurich.ibm.com> | 2024-10-09 16:52:44 +02:00
Christoph Auer | 0dfbd0b6fc | Update examples and test cases; Signed-off-by: Christoph Auer <cau@zurich.ibm.com> | 2024-10-09 15:20:27 +02:00
Christoph Auer | c0447206af | Merge from main; Signed-off-by: Christoph Auer <cau@zurich.ibm.com> | 2024-10-08 14:42:33 +02:00
Maxim Lysak | 89e58ca730 | Added HTML backend implementation, few improvements for other backends; Signed-off-by: Maxim Lysak <mly@zurich.ibm.com> | 2024-10-08 11:14:44 +02:00
Maxim Lysak | bea9fc22af | Added mspowerpoint backend first implementation, improvements on msword backend; Signed-off-by: Maxim Lysak <mly@zurich.ibm.com> | 2024-10-07 14:55:21 +02:00
Maxim Lysak | 1346843301 | Improved docx parsing; Signed-off-by: Maxim Lysak <mly@zurich.ibm.com> | 2024-10-07 13:00:50 +02:00
Maxim Lysak | cefc34e8d8 | Working on a first version of DOCX native backend; Signed-off-by: Maxim Lysak <mly@zurich.ibm.com> | 2024-10-04 18:19:40 +02:00
Christoph Auer | 1fa7cd9855 | Fundamental refactoring for multi-format support; Signed-off-by: Christoph Auer <cau@zurich.ibm.com> | 2024-10-01 16:54:09 +02:00