Mirror of https://github.com/DS4SD/docling.git, synced 2025-07-25 19:44:34 +00:00
* chore(xml-jats): separate authors and affiliations

  In XML PubMed (JATS) backend, convert authors and affiliations as they are typically rendered on PDFs.

  Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>

* fix(xml-jats): replace new line character by a space

  Instead of removing new line character from text, replace it by a space character.

  Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>

* feat(xml-jats): improve existing parser and extend features

  Partially support lists, respect reading order, parse more sections, support equations, better text formatting.

  Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>

* chore(xml-jats): rename PubMed objects to JATS

  Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>

---------

Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>

Name | ||
---|---|---|
data | ||
data_scanned | ||
__init__.py | ||
test_backend_asciidoc.py | ||
test_backend_csv.py | ||
test_backend_docling_json.py | ||
test_backend_docling_parse_v2.py | ||
test_backend_docling_parse.py | ||
test_backend_html.py | ||
test_backend_jats.py | ||
test_backend_markdown.py | ||
test_backend_msexcel.py | ||
test_backend_msword.py | ||
test_backend_patent_uspto.py | ||
test_backend_pdfium.py | ||
test_backend_pptx.py | ||
test_cli.py | ||
test_code_formula.py | ||
test_data_gen_flag.py | ||
test_document_picture_classifier.py | ||
test_e2e_conversion.py | ||
test_e2e_ocr_conversion.py | ||
test_input_doc.py | ||
test_interfaces.py | ||
test_invalid_input.py | ||
test_legacy_format_transform.py | ||
test_options.py | ||
verify_utils.py |