mirror of https://github.com/DS4SD/docling.git
synced 2025-07-25 19:44:34 +00:00
fix: Support for RTL programmatic documents

fix(parser): detect and handle rotated pages
fix(parser): fix bug causing duplicated text
fix(formula): improve stopping criteria
chore: update lock file
fix: temporary constrain beautifulsoup

* switch to code formula model v1.0.1 and new test pdf
  Signed-off-by: Matteo-Omenetti <Matteo.Omenetti1@ibm.com>
* switch to code formula model v1.0.1 and new test pdf
  Signed-off-by: Matteo-Omenetti <Matteo.Omenetti1@ibm.com>
* cleaned up the data folder in the tests
  Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* switch to code formula model v1.0.1 and new test pdf
  Signed-off-by: Matteo-Omenetti <Matteo.Omenetti1@ibm.com>
* added three test-files for right-to-left
  Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* fix black
  Signed-off-by: Matteo-Omenetti <Matteo.Omenetti1@ibm.com>
* added new gt for test_e2e_conversion
  Signed-off-by: Matteo-Omenetti <Matteo.Omenetti1@ibm.com>
* added new gt for test_e2e_conversion
  Signed-off-by: Matteo-Omenetti <Matteo.Omenetti1@ibm.com>
* Add code to expose text direction of cell
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* new test file
  Signed-off-by: Matteo-Omenetti <Matteo.Omenetti1@ibm.com>
* update lock
  Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* fix mypy reports
  Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* fix example filepaths
  Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* add test data results
  Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* pin wheel of latest docling-parse release
  Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* use latest docling-core
  Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* remove debugging code
  Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* fix path to files in example
  Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* Revert unwanted RTL additions
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Fix test data paths in examples
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

---------

Signed-off-by: Matteo-Omenetti <Matteo.Omenetti1@ibm.com>
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Matteo-Omenetti <Matteo.Omenetti1@ibm.com>
Co-authored-by: Peter Staar <taa@zurich.ibm.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
||
---|---|---|
2203.01017v2.doctags.txt | ||
2203.01017v2.json | ||
2203.01017v2.md | ||
2203.01017v2.pages.json | ||
2206.01062.doctags.txt | ||
2206.01062.json | ||
2206.01062.md | ||
2206.01062.pages.json | ||
2305.03393v1-pg9.doctags.txt | ||
2305.03393v1-pg9.json | ||
2305.03393v1-pg9.md | ||
2305.03393v1-pg9.pages.json | ||
2305.03393v1.doctags.txt | ||
2305.03393v1.json | ||
2305.03393v1.md | ||
2305.03393v1.pages.json | ||
amt_handbook_sample.doctags.txt | ||
amt_handbook_sample.json | ||
amt_handbook_sample.md | ||
amt_handbook_sample.pages.json | ||
code_and_formula.doctags.txt | ||
code_and_formula.json | ||
code_and_formula.md | ||
code_and_formula.pages.json | ||
picture_classification.doctags.txt | ||
picture_classification.json | ||
picture_classification.md | ||
picture_classification.pages.json | ||
redp5110_sampled.doctags.txt | ||
redp5110_sampled.json | ||
redp5110_sampled.md | ||
redp5110_sampled.pages.json | ||
right_to_left_01.doctags.txt | ||
right_to_left_01.json | ||
right_to_left_01.md | ||
right_to_left_01.pages.json | ||
right_to_left_02.doctags.txt | ||
right_to_left_02.json | ||
right_to_left_02.md | ||
right_to_left_02.pages.json | ||
right_to_left_03.doctags.txt | ||
right_to_left_03.json | ||
right_to_left_03.md | ||
right_to_left_03.pages.json |