Mirror of https://github.com/DS4SD/docling.git, synced 2025-07-26 20:14:47 +00:00
* fix: parse HTML files without body tag

  Parse HTML files without 'body' tag, since it is optional in HTML5 specification.

  Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>

* test: ensure docling converts HTML without body tag

  Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>

---------

Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>

- docx
- groundtruth
- html
- md
- pptx
- pubmed
- uspto
- xlsx
- 2203.01017v2.pdf
- 2206.01062.pdf
- 2305.03393v1-pg9-img.png
- 2305.03393v1-pg9.pdf
- 2305.03393v1.pdf
- code_and_formula.pdf
- picture_classification.pdf
- redp5110_sampled.pdf
- test_01.asciidoc
- test_02.asciidoc