Michele Dolfi
90dd676422
feat: update parser with bytesio interface and set as new default backend (#32)
* update parser with bytesio interface
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* change default backend
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* update DEFAULT_BACKEND
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
---------
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2024-08-14 12:30:00 +02:00
Michele Dolfi
79ef8d2f2f
fix: update (vuln) deps (#29)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2024-08-07 17:29:36 +02:00
Maxim Lysak
b8f5e38a8c
feat: introducing docling_backend (#26)
Uses our own docling_parse to reliably get PDF cells.
To get page images, this backend uses pypdfium2.
Signed-off-by: Maxim Lysak <mly@zurich.ibm.com>
Co-authored-by: Maxim Lysak <mly@zurich.ibm.com>
2024-08-07 16:22:36 +02:00
Panos Vagenas
d2d9543415
fix: set page number using 1-based indexing (#22)
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
2024-07-31 14:28:44 +02:00
Panos Vagenas
d603137383
feat: add simplified single-doc conversion (#20)
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
2024-07-26 16:55:33 +02:00
Michele Dolfi
54b3dda141
fix: add easyocr to main deps for valid extra (#19)
* fix: add easyocr to main deps for valid extra
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* remove group
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
---------
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2024-07-24 14:11:26 +02:00
Michele Dolfi
b0725e0aa6
fix: expose ocr as extra (#18)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2024-07-24 11:14:17 +02:00
Michele Dolfi
7bc20adc16
pin docling-ibm-models 1.1.0 with python 3.10 support (#15)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2024-07-18 17:27:48 +02:00
Panos Vagenas
eb0b208272
chore: switch to docling-core Markdown export (#14)
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
2024-07-18 16:10:05 +02:00
Michele Dolfi
fb72688ff7
feat: enable python 3.12 support by updating glm (#8)
* update deepsearch-glm for python 3.12 support
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* enable python 3.12 in ci tests
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
---------
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2024-07-17 14:03:26 +02:00
Christoph Auer
e2d996753b
Initial commit
2024-07-15 09:42:42 +02:00