docling/tests/data/groundtruth/docling_v1
Maxim Lysak 6e75f0b5d3
fix: Revise DocTags, fix iterate_items to output content_layer in items (#965)
* Testing fix for docling-core dt

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* fix: Fix code_formula test unit, update test-cases

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* fix: Fix code-formula model for new docling-core

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* fix: Update fixes

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Update test cases for office formats

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Update deps and lockfile

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Clean up imports

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

---------

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Maksym Lysak <mly@zurich.ibm.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
2025-02-17 14:11:55 +01:00
..
2203.01017v2.doctags.txt feat: Updated Layout processing with forms and key-value areas (#530) 2024-12-17 17:32:24 +01:00
2203.01017v2.json fix: Revise DocTags, fix iterate_items to output content_layer in items (#965) 2025-02-17 14:11:55 +01:00
2203.01017v2.md feat: Updated Layout processing with forms and key-value areas (#530) 2024-12-17 17:32:24 +01:00
2203.01017v2.pages.json fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
2206.01062.doctags.txt feat: Updated Layout processing with forms and key-value areas (#530) 2024-12-17 17:32:24 +01:00
2206.01062.json fix: Revise DocTags, fix iterate_items to output content_layer in items (#965) 2025-02-17 14:11:55 +01:00
2206.01062.md feat: Updated Layout processing with forms and key-value areas (#530) 2024-12-17 17:32:24 +01:00
2206.01062.pages.json fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
2305.03393v1-pg9.doctags.txt feat!: Docling v2 (#117) 2024-10-16 21:02:03 +02:00
2305.03393v1-pg9.json feat: Add content_layer property to items to address body, furniture and other roles (#735) 2025-02-10 12:07:49 +01:00
2305.03393v1-pg9.md feat!: Docling v2 (#117) 2024-10-16 21:02:03 +02:00
2305.03393v1-pg9.pages.json fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
2305.03393v1.doctags.txt feat: Updated Layout processing with forms and key-value areas (#530) 2024-12-17 17:32:24 +01:00
2305.03393v1.json fix: Revise DocTags, fix iterate_items to output content_layer in items (#965) 2025-02-17 14:11:55 +01:00
2305.03393v1.md feat: Updated Layout processing with forms and key-value areas (#530) 2024-12-17 17:32:24 +01:00
2305.03393v1.pages.json fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
amt_handbook_sample.doctags.txt docs: Add example for inspection of picture content (#624) 2025-01-29 10:39:00 +01:00
amt_handbook_sample.json feat: Add content_layer property to items to address body, furniture and other roles (#735) 2025-02-10 12:07:49 +01:00
amt_handbook_sample.md docs: Add example for inspection of picture content (#624) 2025-01-29 10:39:00 +01:00
amt_handbook_sample.pages.json fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
code_and_formula.doctags.txt fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
code_and_formula.json fix: Revise DocTags, fix iterate_items to output content_layer in items (#965) 2025-02-17 14:11:55 +01:00
code_and_formula.md fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
code_and_formula.pages.json fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
picture_classification.doctags.txt feat: New document picture classifier (#805) 2025-01-24 18:05:51 +01:00
picture_classification.json fix: Revise DocTags, fix iterate_items to output content_layer in items (#965) 2025-02-17 14:11:55 +01:00
picture_classification.md feat: New document picture classifier (#805) 2025-01-24 18:05:51 +01:00
picture_classification.pages.json fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
redp5110_sampled.doctags.txt feat: Updated Layout processing with forms and key-value areas (#530) 2024-12-17 17:32:24 +01:00
redp5110_sampled.json fix: Revise DocTags, fix iterate_items to output content_layer in items (#965) 2025-02-17 14:11:55 +01:00
redp5110_sampled.md feat: Updated Layout processing with forms and key-value areas (#530) 2024-12-17 17:32:24 +01:00
redp5110_sampled.pages.json fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
right_to_left_01.doctags.txt fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
right_to_left_01.json fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
right_to_left_01.md fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
right_to_left_01.pages.json fix: Revise DocTags, fix iterate_items to output content_layer in items (#965) 2025-02-17 14:11:55 +01:00
right_to_left_02.doctags.txt fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
right_to_left_02.json fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
right_to_left_02.md fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
right_to_left_02.pages.json fix: Revise DocTags, fix iterate_items to output content_layer in items (#965) 2025-02-17 14:11:55 +01:00
right_to_left_03.doctags.txt fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
right_to_left_03.json fix: Revise DocTags, fix iterate_items to output content_layer in items (#965) 2025-02-17 14:11:55 +01:00
right_to_left_03.md fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
right_to_left_03.pages.json fix: Revise DocTags, fix iterate_items to output content_layer in items (#965) 2025-02-17 14:11:55 +01:00