fix(HTML): replace non-standard Unicode characters (#2006)

chore(HTML): replace non-standard Unicode characters for better downstream tasks

Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
Cesar Berrospi Ramis
2025-07-29 11:05:35 +02:00
committed by GitHub
parent aae42b37a8
commit 86f70128aa
8 changed files with 125 additions and 52 deletions
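
For readers unfamiliar with this kind of change, here is a minimal sketch of the replacement described in the commit message. It is not the code from this commit: the `REPLACEMENTS` table and the `normalize_text` helper are hypothetical names, and the set of characters covered is an assumption chosen for illustration. Only the em dash case is confirmed by the hunk below.

```python
# Hypothetical mapping from typographic Unicode characters to ASCII.
# The character set is an assumption; the hunk below only confirms
# that an em dash ends up as a plain hyphen in "text".
REPLACEMENTS = {
    "\u2014": "-",   # em dash
    "\u2013": "-",   # en dash
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
    "\u00a0": " ",   # no-break space
}

# str.maketrans accepts a dict whose keys are single characters.
_TABLE = str.maketrans(REPLACEMENTS)


def normalize_text(text: str) -> str:
    """Return text with the mapped characters replaced by ASCII ones."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    orig = "parsing diverse formats \u2014 including HTML \u2014 and more"
    print(normalize_text(orig))
```

In the hunk below, the `orig` field keeps the source characters while `text` carries the normalized form, so a helper like this would apply only to `text`.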

@@ -133,7 +133,7 @@
"label": "text",
"prov": [],
"orig": "Docling simplifies document processing, parsing diverse formats — including HTML — and providing seamless integrations with the gen AI ecosystem.",
-"text": "Docling simplifies document processing, parsing diverse formats including HTML and providing seamless integrations with the gen AI ecosystem."
+"text": "Docling simplifies document processing, parsing diverse formats - including HTML - and providing seamless integrations with the gen AI ecosystem."
},
{
"self_ref": "#/texts/3",