docling/tests at a458e298ca64da2c6df29d953e95645525817bed

mirror of https://github.com/DS4SD/docling.git synced 2025-12-08 20:58:11 +00:00

Peter W. J. Staar a458e298ca fix: added extraction of byte-images in excel (#804)

* fix(msexcel): ignore Mypy checking for _find_images_in_sheet function

Signed-off-by: Jiun An Tsai <andrew@247365-Macbook.local>

* fixed some issues

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* reformatted the code

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* pinned pillow in pyproject

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

---------

Signed-off-by: Jiun An Tsai <andrew@247365-Macbook.local>
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Jiun An Tsai <andrew@247365-Macbook.local>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>

2025-01-24 18:48:02 +01:00

fix: added extraction of byte-images in excel (#804)

2025-01-24 18:48:02 +01:00

feat: Updated Layout processing with forms and key-value areas (#530)

2024-12-17 17:32:24 +01:00

__init__.py

fix: Add unit tests (#51)

2024-08-30 14:08:20 +02:00

test_backend_asciidoc.py

feat: Add pipeline timings and toggle visualization, establish debug settings (#183)

2024-10-30 15:04:19 +01:00

test_backend_docling_json.py

feat: add Docling JSON ingestion (#783)

2025-01-24 18:05:23 +01:00

test_backend_docling_parse_v2.py

chore: make tests lighter (#228)

2024-11-04 14:02:28 +01:00

test_backend_docling_parse.py

chore: make tests lighter (#228)

2024-11-04 14:02:28 +01:00

test_backend_html.py

fix: fix duplicate title and heading + add e2e tests for html and docx (#186)

2024-10-30 13:14:56 +01:00

test_backend_msexcel.py

fix: added extraction of byte-images in excel (#804)

2025-01-24 18:48:02 +01:00

test_backend_msword.py

fix: fix duplicate title and heading + add e2e tests for html and docx (#186)

2024-10-30 13:14:56 +01:00

test_backend_patent_uspto.py

feat: create a backend to parse USPTO patents into DoclingDocument (#606)

2024-12-17 16:35:23 +01:00

test_backend_pdfium.py

chore: make tests lighter (#228)

2024-11-04 14:02:28 +01:00

test_backend_pptx.py

feat: Extracting picture data for raster images found in PPTX (#349)

2024-11-18 15:22:28 +01:00

test_backend_pubmed.py

feat: Create a backend to transform PubMed XML files to DoclingDocument (#557)

2024-12-17 19:27:09 +01:00

test_cli.py

test: generate file from CLI in a temporary directory (#618)

2024-12-17 16:35:42 +01:00

test_code_formula.py

feat: Code and equation model for PDF and code blocks in markdown (#752)

2025-01-24 16:54:22 +01:00

test_document_picture_classifier.py

feat: New document picture classifier (#805)

2025-01-24 18:05:51 +01:00

test_e2e_conversion.py

feat: Add pipeline timings and toggle visualization, establish debug settings (#183)

2024-10-30 15:04:19 +01:00

test_e2e_ocr_conversion.py

feat: add "auto" language for TesseractOcr (#759)

2025-01-23 12:40:50 +01:00

test_input_doc.py

feat: add Docling JSON ingestion (#783)

2025-01-24 18:05:23 +01:00

test_interfaces.py

fix: improve handling of disallowed formats (#429)

2024-12-03 12:45:32 +01:00

test_invalid_input.py

fix: improve handling of disallowed formats (#429)

2024-12-03 12:45:32 +01:00

test_legacy_format_transform.py

fix: fix duplicate title and heading + add e2e tests for html and docx (#186)

2024-10-30 13:14:56 +01:00

test_options.py

feat: Introduce support for GPU Accelerators (#593)

2024-12-13 17:45:22 +01:00

verify_utils.py

feat(OCR): Introduce the OcrOptions.force_full_page_ocr parameter that forces a full page OCR scanning (#290)

2024-11-12 09:46:14 +01:00