docling/tests/data/html/formatting.html
krrome 94fcc46aa9 feat(html): Support formatting tags in HTML texts (#2111)
* add parsing for formatting tags in HTML backend

Signed-off-by: Roman Kayan BAZG <roman.kayan@bazg.admin.ch>

fix latest tests + wiki_duck result files.

Signed-off-by: Roman Kayan BAZG <roman.kayan@bazg.admin.ch>

* convert _collect_parent_format_tags to staticmethod

Signed-off-by: Roman Kayan BAZG <roman.kayan@bazg.admin.ch>

---------

Signed-off-by: Roman Kayan BAZG <roman.kayan@bazg.admin.ch>
2025-08-22 10:37:34 +02:00
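
The commit message describes collecting the formatting tags that enclose a piece of text. As a rough illustration of that idea, here is a minimal sketch using only Python's standard `html.parser`. It is not docling's implementation: `FORMAT_TAGS`, `FormatRunCollector`, and `collect_runs` are hypothetical names, and the mapping of tags to flags (for example treating `ins` as underline) is an assumption made for this example.

```python
from html.parser import HTMLParser

# Hypothetical tag-to-flag table for illustration only; not docling's mapping.
FORMAT_TAGS = {
    "b": "bold", "strong": "bold",
    "i": "italic", "em": "italic",
    "s": "strikethrough", "del": "strikethrough",
    "u": "underline", "ins": "underline",
    "sub": "subscript", "sup": "superscript",
}


class FormatRunCollector(HTMLParser):
    """Collects (text, flags) runs, where flags come from enclosing format tags."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of currently open formatting tags
        self.runs = []       # list of (text, frozenset of flags)

    def handle_starttag(self, tag, attrs):
        if tag in FORMAT_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in self.open_tags:
            # Close the innermost open tag with this name.
            idx = len(self.open_tags) - 1 - self.open_tags[::-1].index(tag)
            del self.open_tags[idx]

    def handle_data(self, data):
        text = data.strip()
        if text:
            flags = frozenset(FORMAT_TAGS[t] for t in self.open_tags)
            self.runs.append((text, flags))


def collect_runs(html):
    """Return the formatted text runs found in an HTML string."""
    parser = FormatRunCollector()
    parser.feed(html)
    parser.close()
    return parser.runs
```

Because flags are stored as a set, nested tags with the same meaning, such as `<strong><b>…</b></strong>`, collapse to a single `bold` flag, which is one plausible way to treat the mixed cases in the fixture below.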


<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>HTML Formatting Tags Demo</title>
</head>
<body>
<h1>HTML Text Formatting Examples</h1>
<p>
This is a <b>bold (b)</b> example and right next to it we have a
<strong>strong emphasis (strong)</strong>.
Notice that <strong><b>strong + bold mixed</b></strong> looks similar but carries additional semantic meaning.
</p>
<p>
Here is an <i>italic (i)</i> word and an <em>emphasis (em)</em> example.
Sometimes we combine them like <i><em>italic + emphasis together</em></i>.
</p>
<p>
Now let's look at text that appears crossed out:
<s>strikethrough with s</s> and
<del>deleted with del</del>.
You can also mix them: <s><del>double strikethrough (s + del)</del></s>.
</p>
<p>
To highlight insertions or underlines:
<u>underlined with u</u>,
<ins>inserted with ins</ins>.
A combination could be: <u><ins>underline + insertion together</ins></u>.
</p>
<p>
Subscript and superscript examples:
Water is written as H<sub>2</sub>O using sub.
The mathematical expression x<sup>2</sup> + y<sup>3</sup> uses sup.
They can also be combined: CO<sub>2</sub><sup>*</sup>.
</p>
<p>
Mixing several: This sentence has <strong><em>strong + emphasis</em></strong>,
some <b><u>bold + underline</u></b>, and a formula like a<sup>2</sup> + b<sub>3</sub>.
</p>
</body>
</html>