krrome
|
94fcc46aa9
|
feat(html): Support formatting tags in HTML texts (#2111)
* add parsing for formatting tags in HTML backend
Signed-off-by: Roman Kayan BAZG <roman.kayan@bazg.admin.ch>
fix latest tests + wiki_duck result files.
Signed-off-by: Roman Kayan BAZG <roman.kayan@bazg.admin.ch>
* convert _collect_parent_format_tags to staticmethod
Signed-off-by: Roman Kayan BAZG <roman.kayan@bazg.admin.ch>
---------
Signed-off-by: Roman Kayan BAZG <roman.kayan@bazg.admin.ch>
|
2025-08-22 10:37:34 +02:00 |
|
Cesar Berrospi Ramis
|
a069b1175b
|
refactor(HTML): handle text from styled html (#1960)
* A new HTML backend that handles styled HTML (ignores it) as well as images.
Images are parsed as placeholders with a caption, if it exists.
Co-authored-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
Co-authored-by: vaaale <2428222+vaaale@users.noreply.github.com>
Signed-off-by: Alexander Vaagan <alexander.vaagan@gmail.com>
Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
Signed-off-by: vaaale <2428222+vaaale@users.noreply.github.com>
* tests(HTML): re-enable test_ordered_lists
Re-enable test_ordered_lists regression test for the HTML backend since
docling-core now supports ordered lists with custom start value.
Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
---------
Signed-off-by: Alexander Vaagan <alexander.vaagan@gmail.com>
Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
Signed-off-by: vaaale <2428222+vaaale@users.noreply.github.com>
Co-authored-by: Alexander Vaagan <2428222+vaaale@users.noreply.github.com>
|
2025-07-22 13:16:31 +02:00 |
|
Cesar Berrospi Ramis
|
ed20124544
|
fix(html): handle address, details, and summary tags (#1436)
* fix(html): handle 'address' tag
Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
* fix(html): handle 'details' tag
Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
---------
Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
|
2025-04-23 09:30:59 +02:00 |
|
Cesar Berrospi Ramis
|
1b0ead6907
|
fix(html): Parse text in div elements as TextItem (#1041)
feat(html): Parse text in div elements as TextItem
Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
|
2025-02-24 12:38:29 +01:00 |
|