docling/tests/data
Cesar Berrospi Ramis 0cd81a8122
fix(docx): merged table cells not properly converted (#857)
* fix(docx): merged cells not properly converted

Fix a conversion issue with merged cells in Word tables that led to repeated text.
Simplify Word table conversion code.
Add docx file with several table formats for regression tests.

Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>

* chore: add type hinting to docx backend

Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>

2025-02-03 10:20:03 +01:00
docx fix(docx): merged table cells not properly converted (#857) 2025-02-03 10:20:03 +01:00
groundtruth fix(docx): merged table cells not properly converted (#857) 2025-02-03 10:20:03 +01:00
html fix: parse html with omitted body tag (#818) 2025-01-27 16:59:00 +01:00
md fix(markdown): fix empty block handling (#843) 2025-01-30 16:22:29 +01:00
pptx feat: Extracting picture data for raster images found in PPTX (#349) 2024-11-18 15:22:28 +01:00
pubmed feat: Create a backend to transform PubMed XML files to DoclingDocument (#557) 2024-12-17 19:27:09 +01:00
uspto feat: create a backend to parse USPTO patents into DoclingDocument (#606) 2024-12-17 16:35:23 +01:00
xlsx fix: added extraction of byte-images in excel (#804) 2025-01-24 18:48:02 +01:00
2203.01017v2.pdf fix: Add unit tests (#51) 2024-08-30 14:08:20 +02:00
2206.01062.pdf fix: Add unit tests (#51) 2024-08-30 14:08:20 +02:00
2305.03393v1-pg9-img.png feat!: Docling v2 (#117) 2024-10-16 21:02:03 +02:00
2305.03393v1-pg9.pdf fix: Add unit tests (#51) 2024-08-30 14:08:20 +02:00
2305.03393v1.pdf fix: Add unit tests (#51) 2024-08-30 14:08:20 +02:00
amt_handbook_sample.pdf docs: Add example for inspection of picture content (#624) 2025-01-29 10:39:00 +01:00
code_and_formula.pdf feat: Code and equation model for PDF and code blocks in markdown (#752) 2025-01-24 16:54:22 +01:00
picture_classification.pdf feat: New document picture classifier (#805) 2025-01-24 18:05:51 +01:00
redp5110_sampled.pdf chore: make tests lighter (#228) 2024-11-04 14:02:28 +01:00
test_01.asciidoc feat: Support AsciiDoc and Markdown input format (#168) 2024-10-23 16:14:26 +02:00
test_02.asciidoc feat: Support AsciiDoc and Markdown input format (#168) 2024-10-23 16:14:26 +02:00