Mirror of https://github.com/DS4SD/docling.git (synced 2025-12-09 13:18:24 +00:00)
fix(html): Parse text in div elements as TextItem (#1041)
feat(html): Parse text in div elements as TextItem

Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
committed by GitHub
parent 1d17e7397a
commit 1b0ead6907
7
tests/data/groundtruth/docling_v2/example_06.html.itxt
Normal file
@@ -0,0 +1,7 @@
+item-0 at level 0: unspecified: group _root_
+item-1 at level 1: paragraph: This is a div with text.
+item-2 at level 1: paragraph: This is another div with text.
+item-3 at level 1: paragraph: This is a regular paragraph.
+item-4 at level 1: paragraph: This is a third div
+with a new line.
+item-5 at level 1: paragraph: This is a fourth div with a bold paragraph.
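The ground-truth file above records each div's text as its own paragraph item, alongside regular paragraphs. As a rough illustration of that behavior, and not docling's actual HTML backend, here is a minimal sketch built on Python's standard `html.parser`. The class `DivTextCollector`, the helper `to_indented_text`, and the `SAMPLE_HTML` input are all hypothetical names invented for this example; the real `example_06.html` input is not shown in this commit view, and any indentation the real `.itxt` file may use is not reproduced.

```python
from html.parser import HTMLParser


class DivTextCollector(HTMLParser):
    """Hypothetical helper: collect the text of each top-level <div> or <p>."""

    BLOCK_TAGS = {"div", "p"}

    def __init__(self):
        super().__init__()
        self.depth = 0        # nesting depth inside block tags
        self.buffer = []      # text chunks of the block currently being read
        self.paragraphs = []  # finished paragraph strings

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.BLOCK_TAGS and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                # The outermost block closed: emit its text as one paragraph.
                text = "".join(self.buffer).strip()
                if text:
                    self.paragraphs.append(text)
                self.buffer = []

    def handle_data(self, data):
        # Inline tags such as <b> are ignored, so their text joins the block.
        if self.depth > 0:
            self.buffer.append(data)


def to_indented_text(paragraphs):
    """Render paragraphs in the line format seen in the ground-truth file."""
    lines = ["item-0 at level 0: unspecified: group _root_"]
    for i, text in enumerate(paragraphs, start=1):
        lines.append(f"item-{i} at level 1: paragraph: {text}")
    return "\n".join(lines)


# Assumed input, made up for this sketch (not the real example_06.html).
SAMPLE_HTML = (
    "<body>"
    "<div>This is a div with text.</div>"
    "<p>This is a regular paragraph.</p>"
    "<div>This is a div with <b>bold</b> text.</div>"
    "</body>"
)

collector = DivTextCollector()
collector.feed(SAMPLE_HTML)
collector.close()
print(to_indented_text(collector.paragraphs))
```

The sketch flattens nested blocks into the outermost one, which is a simplification chosen for brevity; how the actual backend handles nested elements is not visible in this diff.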