Mirror of https://github.com/DS4SD/docling.git, synced 2025-07-26 20:14:47 +00:00
* Implement new reading-order model, replacing DS GLM model (WIP)
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Update reading-order model branch
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Update lockfile [skip ci]
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add captions, footnotes and merges [skip ci]
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Updates for reading-order implementation
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Updates for reading-order implementation
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Update tests and lockfile
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Fixes, update tests
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add normalization, update tests again
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Update tests with code
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Push final lockfile
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* sanitize text
  Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* Include furniture, Update tests with furniture
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Fix content_layer assignment
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* chore: Delete empty file docling/models/ds_glm_model.py
  Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>

---------

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Nikos Livathinos <nli@zurich.ibm.com>

Name | ||
---|---|---|
data | ||
data_scanned | ||
__init__.py | ||
test_backend_asciidoc.py | ||
test_backend_csv.py | ||
test_backend_docling_json.py | ||
test_backend_docling_parse_v2.py | ||
test_backend_docling_parse.py | ||
test_backend_html.py | ||
test_backend_jats.py | ||
test_backend_markdown.py | ||
test_backend_msexcel.py | ||
test_backend_msword.py | ||
test_backend_patent_uspto.py | ||
test_backend_pdfium.py | ||
test_backend_pptx.py | ||
test_cli.py | ||
test_code_formula.py | ||
test_data_gen_flag.py | ||
test_document_picture_classifier.py | ||
test_e2e_conversion.py | ||
test_e2e_ocr_conversion.py | ||
test_input_doc.py | ||
test_interfaces.py | ||
test_invalid_input.py | ||
test_legacy_format_transform.py | ||
test_options.py | ||
verify_utils.py |