docling/tests
Cesar Berrospi Ramis 40145b59b3 fix(docx): merged cells not properly converted
Fix conversion issue of merged cells in Word tables leading to repeated text.
Simplify Word table conversion code.
Add docx file with several table formats for regression tests.

Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
2025-01-31 18:30:28 +01:00
data fix(docx): merged cells not properly converted 2025-01-31 18:30:28 +01:00
data_scanned docs: Add example for inspection of picture content (#624) 2025-01-29 10:39:00 +01:00
__init__.py fix: Add unit tests (#51) 2024-08-30 14:08:20 +02:00
test_backend_asciidoc.py feat: Add pipeline timings and toggle visualization, establish debug settings (#183) 2024-10-30 15:04:19 +01:00
test_backend_docling_json.py feat: add Docling JSON ingestion (#783) 2025-01-24 18:05:23 +01:00
test_backend_docling_parse_v2.py chore: make tests lighter (#228) 2024-11-04 14:02:28 +01:00
test_backend_docling_parse.py chore: make tests lighter (#228) 2024-11-04 14:02:28 +01:00
test_backend_html.py fix: parse html with omitted body tag (#818) 2025-01-27 16:59:00 +01:00
test_backend_markdown.py fix: fix single newline handling in MD backend (#824) 2025-01-28 19:05:55 +01:00
test_backend_msexcel.py chore: add missing imports to Office type tests (#826) 2025-01-28 16:17:44 +01:00
test_backend_msword.py fix(docx): merged cells not properly converted 2025-01-31 18:30:28 +01:00
test_backend_patent_uspto.py docs: description of supported formats and backends (#788) 2025-01-26 08:10:33 +01:00
test_backend_pdfium.py chore: make tests lighter (#228) 2024-11-04 14:02:28 +01:00
test_backend_pptx.py chore: add missing imports to Office type tests (#826) 2025-01-28 16:17:44 +01:00
test_backend_pubmed.py docs: description of supported formats and backends (#788) 2025-01-26 08:10:33 +01:00
test_cli.py test: generate file from CLI in a temporary directory (#618) 2024-12-17 16:35:42 +01:00
test_code_formula.py feat: Code and equation model for PDF and code blocks in markdown (#752) 2025-01-24 16:54:22 +01:00
test_document_picture_classifier.py feat: New document picture classifier (#805) 2025-01-24 18:05:51 +01:00
test_e2e_conversion.py docs: Add example for inspection of picture content (#624) 2025-01-29 10:39:00 +01:00
test_e2e_ocr_conversion.py feat: Python 3.13 support (#841) 2025-01-30 17:26:42 +01:00
test_input_doc.py feat: Add option to define page range (#852) 2025-01-31 15:23:00 +01:00
test_interfaces.py fix: improve handling of disallowed formats (#429) 2024-12-03 12:45:32 +01:00
test_invalid_input.py fix: improve handling of disallowed formats (#429) 2024-12-03 12:45:32 +01:00
test_legacy_format_transform.py fix: fix duplicate title and heading + add e2e tests for html and docx (#186) 2024-10-30 13:14:56 +01:00
test_options.py feat: Add option to define page range (#852) 2025-01-31 15:23:00 +01:00
verify_utils.py feat(OCR): Introduce the OcrOptions.force_full_page_ocr parameter that forces a full page OCR scanning (#290) 2024-11-12 09:46:14 +01:00