mirror of https://github.com/DS4SD/docling.git, synced 2025-07-27 12:34:22 +00:00
Latest commit message:

* chore: bump version to 2.28.4 [skip ci]
  Signed-off-by: Rafael Teixeira de Lima <Rafael.td.lima@gmail.com>
* Improve text parsing
  Signed-off-by: Rafael Teixeira de Lima <Rafael.td.lima@gmail.com>
* fix: Tesseract OCR CLI can't process images composed with numbers only (#1201)
  fix wrong type text extracted by tesseract_ocr_cli_model
  Signed-off-by: gvl4 <Guilhem.VERMOREL@3ds.com>
  Co-authored-by: gvl4 <Guilhem.VERMOREL@3ds.com>
  Signed-off-by: Rafael Teixeira de Lima <Rafael.td.lima@gmail.com>
* Flexibilize heading detection
  Signed-off-by: Rafael Teixeira de Lima <Rafael.td.lima@gmail.com>
* Fix trailing space
  Signed-off-by: Rafael Teixeira de Lima <Rafael.td.lima@gmail.com>
* Remove trailing space
  Signed-off-by: Rafael Teixeira de Lima <Rafael.td.lima@gmail.com>

Signed-off-by: Rafael Teixeira de Lima <Rafael.td.lima@gmail.com>
Signed-off-by: gvl4 <Guilhem.VERMOREL@3ds.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Guilhem VERMOREL <83694424+guilhemvermorel@users.noreply.github.com>
Co-authored-by: gvl4 <Guilhem.VERMOREL@3ds.com>
Signed-off-by: Benichou <fbenichou@deloitte.ca>

Name | ||
---|---|---|
docx | ||
json | ||
xml | ||
__init__.py | ||
abstract_backend.py | ||
asciidoc_backend.py | ||
csv_backend.py | ||
docling_parse_backend.py | ||
docling_parse_v2_backend.py | ||
docling_parse_v4_backend.py | ||
html_backend.py | ||
md_backend.py | ||
msexcel_backend.py | ||
mspowerpoint_backend.py | ||
msword_backend.py | ||
pdf_backend.py | ||
pypdfium2_backend.py |