fix(HTML): ensure correct concatenation of child strings in table cells and list items
Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
* Adding plain latex equations to table cells
Signed-off-by: Rafael Teixeira de Lima <Rafael.td.lima@gmail.com>
* Adding test files
Signed-off-by: Rafael Teixeira de Lima <Rafael.td.lima@gmail.com>
---------
Signed-off-by: Rafael Teixeira de Lima <Rafael.td.lima@gmail.com>
* Initial plan
* Fix multi-page TIFF image support
Co-authored-by: cau-git <60343111+cau-git@users.noreply.github.com>
* add RGB conversion
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Remove pointless test
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add multi-page TIFF test data and verification tests
Co-authored-by: cau-git <60343111+cau-git@users.noreply.github.com>
* Revert "Add multi-page TIFF test data and verification tests"
This reverts commit 130a10e2d9.
* Proper test for 2 page tiff file
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* DCO Remediation Commit for copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
I, copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>, hereby add my Signed-off-by to this commit: 420df478f3
I, copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>, hereby add my Signed-off-by to this commit: c1d722725f
I, Christoph Auer <cau@zurich.ibm.com>, hereby add my Signed-off-by to this commit: 6aa85cc933
I, copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>, hereby add my Signed-off-by to this commit: 130a10e2d9
I, Christoph Auer <cau@zurich.ibm.com>, hereby add my Signed-off-by to this commit: d571f36299
I, Christoph Auer <cau@zurich.ibm.com>, hereby add my Signed-off-by to this commit: 2aab66288b
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Proper test for 2 page tiff file (2)
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
---------
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: cau-git <60343111+cau-git@users.noreply.github.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
* A new HTML backend that handles styled html (ignors it) as well as images.
Images are parsed as placeholders with a caption, if it exists.
Co-authored-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
Co-authored-by: vaaale <2428222+vaaale@users.noreply.github.com>
Signed-off-by: Alexander Vaagan <alexander.vaagan@gmail.com>
Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
Signed-off-by: vaaale <2428222+vaaale@users.noreply.github.com>
* tests(HTML): re-enable test_ordered_lists
Re-enable test_ordered_lists regression test for the HTML backend since
docling-core now supports ordered lists with custom start value.
Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
---------
Signed-off-by: Alexander Vaagan <alexander.vaagan@gmail.com>
Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
Signed-off-by: vaaale <2428222+vaaale@users.noreply.github.com>
Co-authored-by: Alexander Vaagan <2428222+vaaale@users.noreply.github.com>
* docs: add documentation for confidence scores
Signed-off-by: Fabiano Franz <contact@fabianofranz.com>
* Increase focus on confidence grades, scores are informational only
Signed-off-by: Fabiano Franz <contact@fabianofranz.com>
* Update confidence_scores.md
Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>
---------
Signed-off-by: Fabiano Franz <contact@fabianofranz.com>
Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>
Co-authored-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>
I, Christoph Auer <cau@zurich.ibm.com>, hereby add my Signed-off-by to this commit: fa71cde950
I, Ubuntu <ubuntu@ip-172-31-30-253.eu-central-1.compute.internal>, hereby add my Signed-off-by to this commit: d66da87d96
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Use device_map for transformer models
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add accelerate
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Relax accelerate min version
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Make pipeline cache+init thread-safe
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
---------
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Fix a bug in parsing HTML tables in HTML backend.
Fix a bug in test file that prevented JATS backend tests.
Ensure that the JATS backend creates headings with the right level.
Remove unnecessary data files for testing JATS backend.
Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
* Initial plan
* Fix granite vision model URL from preview to stable version
Co-authored-by: cau-git <60343111+cau-git@users.noreply.github.com>
* Update to granite vision 3.3
Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>
* Update to granite vision 3.3 (2)
Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>
---------
Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: cau-git <60343111+cau-git@users.noreply.github.com>