docling/tests
Cesar Berrospi Ramis de7b963b09
fix(html): use 'start' attribute when parsing ordered lists from HTML docs (#1062)
* fix(html): use 'start' attribute in ordered lists

When parsing ordered lists in HTML, take into account the 'start' attribute if it exists.

Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>

* chore(html): reduce verbosity in HTML backend

Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>

---------

Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
2025-02-27 09:46:57 +01:00
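
The behavior described in the commit message above (honoring the 'start' attribute of an ordered list) can be illustrated with a minimal sketch. This is not the Docling HTML backend code: it uses only Python's standard html.parser, handles flat (non-nested) lists only, and the names OrderedListParser and number_ordered_items are hypothetical, chosen for this illustration. It assumes that a missing or non-integer 'start' value falls back to 1.

```python
from html.parser import HTMLParser


class OrderedListParser(HTMLParser):
    """Collect (number, text) pairs for <li> items inside <ol> lists.

    Hypothetical helper for illustration; flat lists only.
    """

    def __init__(self):
        super().__init__()
        self.items = []      # collected (number, text) pairs
        self._counters = []  # one running counter per open <ol>
        self._text = None    # text buffer of the <li> being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "ol":
            start = dict(attrs).get("start")
            # Assumption: fall back to 1 when 'start' is missing or not an integer.
            try:
                first = int(start)
            except (TypeError, ValueError):
                first = 1
            self._counters.append(first)
        elif tag == "li" and self._counters:
            self._text = []

    def handle_data(self, data):
        if self._text is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "li" and self._text is not None and self._counters:
            # Record the item under the current number, then advance the counter.
            self.items.append((self._counters[-1], "".join(self._text).strip()))
            self._counters[-1] += 1
            self._text = None
        elif tag == "ol" and self._counters:
            self._counters.pop()


def number_ordered_items(html):
    """Return the numbered items of the ordered lists found in an HTML string."""
    parser = OrderedListParser()
    parser.feed(html)
    parser.close()
    return parser.items


if __name__ == "__main__":
    print(number_ordered_items('<ol start="5"><li>five</li><li>six</li></ol>'))
    # [(5, 'five'), (6, 'six')]
```

Without the 'start' attribute the same list would be numbered from 1, which is the case the fix is meant to distinguish.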
data fix(html): Parse text in div elements as TextItem (#1041) 2025-02-24 12:38:29 +01:00
data_scanned feat: Implement new reading-order model (#916) 2025-02-20 17:51:17 +01:00
__init__.py fix: Add unit tests (#51) 2024-08-30 14:08:20 +02:00
test_backend_asciidoc.py fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
test_backend_csv.py test: avoid testing exact JSON in CSV backend (#1038) 2025-02-24 08:10:40 +01:00
test_backend_docling_json.py feat: add Docling JSON ingestion (#783) 2025-01-24 18:05:23 +01:00
test_backend_docling_parse_v2.py fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
test_backend_docling_parse.py fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
test_backend_html.py fix(html): use 'start' attribute when parsing ordered lists from HTML docs (#1062) 2025-02-27 09:46:57 +01:00
test_backend_jats.py test: avoid testing exact JSON in CSV backend (#1038) 2025-02-24 08:10:40 +01:00
test_backend_markdown.py fix(markdown): handle nested lists (#910) 2025-02-07 12:55:12 +01:00
test_backend_msexcel.py test: avoid testing exact JSON in CSV backend (#1038) 2025-02-24 08:10:40 +01:00
test_backend_msword.py test: avoid testing exact JSON in CSV backend (#1038) 2025-02-24 08:10:40 +01:00
test_backend_patent_uspto.py test: avoid testing exact JSON (#1027) 2025-02-20 16:20:07 +01:00
test_backend_pdfium.py fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
test_backend_pptx.py test: avoid testing exact JSON in CSV backend (#1038) 2025-02-24 08:10:40 +01:00
test_cli.py fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
test_code_formula.py fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
test_data_gen_flag.py fix(markdown): handle nested lists (#910) 2025-02-07 12:55:12 +01:00
test_document_picture_classifier.py fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
test_e2e_conversion.py fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
test_e2e_ocr_conversion.py feat: Python 3.13 support (#841) 2025-01-30 17:26:42 +01:00
test_input_doc.py feat(xml-jats): parse XML JATS documents (#967) 2025-02-17 10:43:31 +01:00
test_interfaces.py fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
test_invalid_input.py fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
test_legacy_format_transform.py fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
test_options.py fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
verify_utils.py test: avoid testing exact JSON in CSV backend (#1038) 2025-02-24 08:10:40 +01:00