mirror of https://github.com/DS4SD/docling.git
synced 2025-07-23 18:45:00 +00:00
* Draft implementation of Doctag backend
  Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
* Updated VLM pipeline doctags to docling conversion, now properly supports lists
  Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
* preparing to migrate to new doctags deserializer
  Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
* re-using DocTagsDocument.from_doctags_and_image_pairs
  Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
* satisfying mypy and other checks
  Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
* Added support for force_backend_text parameter
  Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
* removed unnecessary transformation
  Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
* Cleaned up
  Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
* Update tests
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Updated readme
  Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Maksym Lysak <mly@zurich.ibm.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>

||
---|---|---|
data | ||
data_scanned | ||
__init__.py | ||
test_backend_asciidoc.py | ||
test_backend_csv.py | ||
test_backend_docling_json.py | ||
test_backend_docling_parse_v2.py | ||
test_backend_docling_parse_v4.py | ||
test_backend_docling_parse.py | ||
test_backend_html.py | ||
test_backend_jats.py | ||
test_backend_markdown.py | ||
test_backend_msexcel.py | ||
test_backend_msword.py | ||
test_backend_patent_uspto.py | ||
test_backend_pdfium.py | ||
test_backend_pptx.py | ||
test_cli.py | ||
test_code_formula.py | ||
test_data_gen_flag.py | ||
test_document_picture_classifier.py | ||
test_e2e_conversion.py | ||
test_e2e_ocr_conversion.py | ||
test_input_doc.py | ||
test_interfaces.py | ||
test_invalid_input.py | ||
test_legacy_format_transform.py | ||
test_options.py | ||
verify_utils.py |