Mirror of https://github.com/DS4SD/docling.git, synced 2025-07-27 20:44:16 +00:00
* Upgraded Layout Postprocessing, sending old code back to ERZ
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Implement hierarchical cluster layout processing
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Pass nested cluster processing through full pipeline
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Pass nested clusters through GLM as payload
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Move to_docling_document from ds-glm to this repo
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Clean up imports again
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* feat(Accelerator): Introduce options to control the num_threads and device from API, envvars, CLI.
  - Introduce the AcceleratorOptions, AcceleratorDevice and use them to set the device where the models run.
  - Introduce the accelerator_utils with a function to decide the device and resolve the AUTO setting.
  - Refactor how the docling-ibm-models are called to match the new init signature of models.
  - Translate the accelerator options to the specific inputs for third-party models.
  - Extend the docling CLI with parameters to set the num_threads and device.
  - Add new unit tests.
  - Write a new example of how to use the accelerator options.
* fix: Improve the pydantic objects in the pipeline_options and imports.
  Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: TableStructureModel: Refactor the artifacts path to use the new structure for fast/accurate model
  Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* Updated test ground-truth
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Updated test ground-truth (again), bugfix for empty layout
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* fix: Do proper check to set the device in EasyOCR, RapidOCR.
  Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* Rollback changes from main
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Update test gt
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Remove unused debug settings
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Review fixes
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Nail the accelerator defaults for MPS
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

---------

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>

||
---|---|---|
2203.01017v2.doctags.txt | ||
2203.01017v2.json | ||
2203.01017v2.md | ||
2203.01017v2.pages.json | ||
2206.01062.doctags.txt | ||
2206.01062.json | ||
2206.01062.md | ||
2206.01062.pages.json | ||
2305.03393v1-pg9.doctags.txt | ||
2305.03393v1-pg9.json | ||
2305.03393v1-pg9.md | ||
2305.03393v1-pg9.pages.json | ||
2305.03393v1.doctags.txt | ||
2305.03393v1.json | ||
2305.03393v1.md | ||
2305.03393v1.pages.json | ||
example_01.html.itxt | ||
example_01.html.json | ||
example_01.html.md | ||
example_02.html.itxt | ||
example_02.html.json | ||
example_02.html.md | ||
example_03.html.itxt | ||
example_03.html.json | ||
example_03.html.md | ||
example_04.html.itxt | ||
example_04.html.json | ||
example_04.html.md | ||
lorem_ipsum.docx.itxt | ||
lorem_ipsum.docx.json | ||
lorem_ipsum.docx.md | ||
powerpoint_sample.pptx.itxt | ||
powerpoint_sample.pptx.json | ||
powerpoint_sample.pptx.md | ||
powerpoint_with_image.pptx.itxt | ||
powerpoint_with_image.pptx.json | ||
powerpoint_with_image.pptx.md | ||
redp5110_sampled.doctags.txt | ||
redp5110_sampled.json | ||
redp5110_sampled.md | ||
redp5110_sampled.pages.json | ||
tablecell.docx.itxt | ||
tablecell.docx.json | ||
tablecell.docx.md | ||
test_01.asciidoc.md | ||
test_02.asciidoc.md | ||
test_emf_docx.docx.itxt | ||
test_emf_docx.docx.json | ||
test_emf_docx.docx.md | ||
test-01.xlsx.itxt | ||
test-01.xlsx.json | ||
test-01.xlsx.md | ||
unit_test_01.html.itxt | ||
unit_test_01.html.json | ||
unit_test_01.html.md | ||
unit_test_headers.docx.itxt | ||
unit_test_headers.docx.json | ||
unit_test_headers.docx.md | ||
unit_test_lists.docx.itxt | ||
unit_test_lists.docx.json | ||
unit_test_lists.docx.md | ||
wiki_duck.html.itxt | ||
wiki_duck.html.json | ||
wiki_duck.html.md | ||
word_sample.docx.itxt | ||
word_sample.docx.json | ||
word_sample.docx.md | ||
word_sample.json | ||
word_sample.md | ||
word_sample.yaml |
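
The commit message at the top of this listing mentions accelerator options that control `num_threads` and the device, with an AUTO setting that gets resolved to a concrete device. The following is a minimal, standard-library-only sketch of that idea, not the code from the repository. All names here (`SketchDevice`, `SketchOptions`, `options_from_env`, `resolve_device`, and the `SKETCH_*` environment variables) are hypothetical, and the resolution order (CUDA first, then MPS, then CPU, with a CPU fallback for unavailable devices) is an assumption made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Mapping


class SketchDevice(str, Enum):
    """Hypothetical device enum; not the library's AcceleratorDevice."""
    AUTO = "auto"
    CPU = "cpu"
    CUDA = "cuda"
    MPS = "mps"


@dataclass
class SketchOptions:
    """Hypothetical options object holding thread count and device."""
    num_threads: int = 4
    device: SketchDevice = SketchDevice.AUTO


def options_from_env(env: Mapping[str, str]) -> SketchOptions:
    """Build options from environment-style key/value pairs.

    The SKETCH_* variable names are made up for this example.
    """
    defaults = SketchOptions()
    return SketchOptions(
        num_threads=int(env.get("SKETCH_NUM_THREADS", defaults.num_threads)),
        device=SketchDevice(env.get("SKETCH_DEVICE", defaults.device.value).lower()),
    )


def resolve_device(
    requested: SketchDevice, has_cuda: bool, has_mps: bool
) -> SketchDevice:
    """Turn a requested device into a concrete one.

    Assumed policy: AUTO prefers CUDA, then MPS, then CPU; an explicit
    request for an unavailable device falls back to CPU.
    """
    available = {SketchDevice.CPU}
    if has_cuda:
        available.add(SketchDevice.CUDA)
    if has_mps:
        available.add(SketchDevice.MPS)

    if requested is SketchDevice.AUTO:
        for candidate in (SketchDevice.CUDA, SketchDevice.MPS):
            if candidate in available:
                return candidate
        return SketchDevice.CPU
    return requested if requested in available else SketchDevice.CPU
```

In a real setup the two availability flags would come from the runtime (for example a framework's device checks). Passing them in as plain booleans keeps the sketch testable without any accelerator installed.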