docling/tests/data
Cesar Berrospi Ramis 40145b59b3 fix(docx): merged cells not properly converted
Fix a conversion issue with merged cells in Word tables that led to repeated text.
Simplify Word table conversion code.
Add docx file with several table formats for regression tests.

Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
2025-01-31 18:30:28 +01:00
docx fix(docx): merged cells not properly converted 2025-01-31 18:30:28 +01:00
groundtruth fix(docx): merged cells not properly converted 2025-01-31 18:30:28 +01:00
html fix: parse html with omitted body tag (#818) 2025-01-27 16:59:00 +01:00
md fix(markdown): fix empty block handling (#843) 2025-01-30 16:22:29 +01:00
pptx feat: Extracting picture data for raster images found in PPTX (#349) 2024-11-18 15:22:28 +01:00
pubmed feat: Create a backend to transform PubMed XML files to DoclingDocument (#557) 2024-12-17 19:27:09 +01:00
uspto feat: create a backend to parse USPTO patents into DoclingDocument (#606) 2024-12-17 16:35:23 +01:00
xlsx fix: added extraction of byte-images in excel (#804) 2025-01-24 18:48:02 +01:00
2203.01017v2.pdf fix: Add unit tests (#51) 2024-08-30 14:08:20 +02:00
2206.01062.pdf fix: Add unit tests (#51) 2024-08-30 14:08:20 +02:00
2305.03393v1-pg9-img.png feat!: Docling v2 (#117) 2024-10-16 21:02:03 +02:00
2305.03393v1-pg9.pdf fix: Add unit tests (#51) 2024-08-30 14:08:20 +02:00
2305.03393v1.pdf fix: Add unit tests (#51) 2024-08-30 14:08:20 +02:00
amt_handbook_sample.pdf docs: Add example for inspection of picture content (#624) 2025-01-29 10:39:00 +01:00
code_and_formula.pdf feat: Code and equation model for PDF and code blocks in markdown (#752) 2025-01-24 16:54:22 +01:00
picture_classification.pdf feat: New document picture classifier (#805) 2025-01-24 18:05:51 +01:00
redp5110_sampled.pdf chore: make tests lighter (#228) 2024-11-04 14:02:28 +01:00
test_01.asciidoc feat: Support AsciiDoc and Markdown input format (#168) 2024-10-23 16:14:26 +02:00
test_02.asciidoc feat: Support AsciiDoc and Markdown input format (#168) 2024-10-23 16:14:26 +02:00