docling/tests at d8a81c31686449a0bd3a56c0bc8475fead658ba9 - docling - Zorio's Git

mirrors/docling

mirror of https://github.com/DS4SD/docling.git synced 2025-12-08 20:58:11 +00:00

Christoph Auer c93e36988f feat: Implement new reading-order model (#916)

* Implement new reading-order model, replacing DS GLM model (WIP)

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Update reading-order model branch

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Update lockfile [skip ci]

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Add captions, footnotes and merges [skip ci]

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Updates for reading-order implementation

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Updates for reading-order implementation

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Update tests and lockfile

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Fixes, update tests

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Add normalization, update tests again

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Update tests with code

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Push final lockfile

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* sanitize text

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* Include furniture, Update tests with furniture

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Fix content_layer assignment

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* chore: Delete empty file docling/models/ds_glm_model.py

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>

---------

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Nikos Livathinos <nli@zurich.ibm.com>

2025-02-20 17:51:17 +01:00

feat: Implement new reading-order model (#916)

2025-02-20 17:51:17 +01:00

feat: Implement new reading-order model (#916)

2025-02-20 17:51:17 +01:00

| File | Last commit | Date |
| --- | --- | --- |
| `__init__.py` | fix: Add unit tests (#51) | 2024-08-30 14:08:20 +02:00 |
| `test_backend_asciidoc.py` | fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) | 2025-02-07 08:43:31 +01:00 |
| `test_backend_csv.py` | test: validate actual docitems in tests (#966) | 2025-02-14 17:47:53 +01:00 |
| `test_backend_docling_json.py` | feat: add Docling JSON ingestion (#783) | 2025-01-24 18:05:23 +01:00 |
| `test_backend_docling_parse_v2.py` | fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) | 2025-02-07 08:43:31 +01:00 |
| `test_backend_docling_parse.py` | fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) | 2025-02-07 08:43:31 +01:00 |
| `test_backend_html.py` | test: avoid testing exact JSON (#1027) | 2025-02-20 16:20:07 +01:00 |
| `test_backend_jats.py` | test: avoid testing exact JSON (#1027) | 2025-02-20 16:20:07 +01:00 |
| `test_backend_markdown.py` | fix(markdown): handle nested lists (#910) | 2025-02-07 12:55:12 +01:00 |
| `test_backend_msexcel.py` | test: avoid testing exact JSON (#1027) | 2025-02-20 16:20:07 +01:00 |
| `test_backend_msword.py` | test: avoid testing exact JSON (#1027) | 2025-02-20 16:20:07 +01:00 |
| `test_backend_patent_uspto.py` | test: avoid testing exact JSON (#1027) | 2025-02-20 16:20:07 +01:00 |
| `test_backend_pdfium.py` | fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) | 2025-02-07 08:43:31 +01:00 |
| `test_backend_pptx.py` | test: avoid testing exact JSON (#1027) | 2025-02-20 16:20:07 +01:00 |
| `test_cli.py` | fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) | 2025-02-07 08:43:31 +01:00 |
| `test_code_formula.py` | fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) | 2025-02-07 08:43:31 +01:00 |
| `test_data_gen_flag.py` | fix(markdown): handle nested lists (#910) | 2025-02-07 12:55:12 +01:00 |
| `test_document_picture_classifier.py` | fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) | 2025-02-07 08:43:31 +01:00 |
| `test_e2e_conversion.py` | fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) | 2025-02-07 08:43:31 +01:00 |
| `test_e2e_ocr_conversion.py` | feat: Python 3.13 support (#841) | 2025-01-30 17:26:42 +01:00 |
| `test_input_doc.py` | feat(xml-jats): parse XML JATS documents (#967) | 2025-02-17 10:43:31 +01:00 |
| `test_interfaces.py` | fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) | 2025-02-07 08:43:31 +01:00 |
| `test_invalid_input.py` | fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) | 2025-02-07 08:43:31 +01:00 |
| `test_legacy_format_transform.py` | fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) | 2025-02-07 08:43:31 +01:00 |
| `test_options.py` | fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) | 2025-02-07 08:43:31 +01:00 |
| `verify_utils.py` | test: avoid testing exact JSON (#1027) | 2025-02-20 16:20:07 +01:00 |