docx
fix: Fixes for wordx (#432)
2024-11-26 14:44:43 +01:00
groundtruth
docs: Add example for inspection of picture content (#624)
2025-01-29 10:39:00 +01:00
html
fix: parse html with omitted body tag (#818)
2025-01-27 16:59:00 +01:00
md
fix: fix single newline handling in MD backend (#824)
2025-01-28 19:05:55 +01:00
pptx
feat: Extracting picture data for raster images found in PPTX (#349)
2024-11-18 15:22:28 +01:00
pubmed
feat: Create a backend to transform PubMed XML files to DoclingDocument (#557)
2024-12-17 19:27:09 +01:00
uspto
feat: create a backend to parse USPTO patents into DoclingDocument (#606)
2024-12-17 16:35:23 +01:00
xlsx
fix: added extraction of byte-images in excel (#804)
2025-01-24 18:48:02 +01:00
2203.01017v2.pdf
fix: Add unit tests (#51)
2024-08-30 14:08:20 +02:00
2206.01062.pdf
fix: Add unit tests (#51)
2024-08-30 14:08:20 +02:00
2305.03393v1-pg9-img.png
feat!: Docling v2 (#117)
2024-10-16 21:02:03 +02:00
2305.03393v1-pg9.pdf
fix: Add unit tests (#51)
2024-08-30 14:08:20 +02:00
2305.03393v1.pdf
fix: Add unit tests (#51)
2024-08-30 14:08:20 +02:00
amt_handbook_sample.pdf
docs: Add example for inspection of picture content (#624)
2025-01-29 10:39:00 +01:00
code_and_formula.pdf
feat: Code and equation model for PDF and code blocks in markdown (#752)
2025-01-24 16:54:22 +01:00
picture_classification.pdf
feat: New document picture classifier (#805)
2025-01-24 18:05:51 +01:00
redp5110_sampled.pdf
chore: make tests lighter (#228)
2024-11-04 14:02:28 +01:00
test_01.asciidoc
feat: Support AsciiDoc and Markdown input format (#168)
2024-10-23 16:14:26 +02:00
test_02.asciidoc
feat: Support AsciiDoc and Markdown input format (#168)
2024-10-23 16:14:26 +02:00