* Add DoclingParseV3 backend implementation
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Use docling-core with docling-parse types
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Fixes and test updates
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Fix streams
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Fix streams
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Reset tests
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Update test cases
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Update unit tests
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add back DoclingParse v1 backend, pipeline options
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Update locks
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* fix: update docling-core to 2.22.0
Update dependency library docling-core to latest release 2.22.0
Fix regression tests and ground truth files
Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
* Ground-truth files updated
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Update tests, use TextCell.from_ocr property
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Text fixes, new test data
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Rename docling backend to v4
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Test all backends, fixes
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Reset all tests to use docling-parse v1 for now
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Fixes for DPv4 backend init, better test coverage
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* test_input_doc use default backend
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
---------
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
Co-authored-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
* Equation groups
Signed-off-by: Rafael Teixeira de Lima <Rafael.td.lima@gmail.com>
* fix: Proper handling of orphan IDs in layout postprocessing (#1118)
* Fix the handling of orphan IDs in layout postprocessing
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Update test cases
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
---------
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Signed-off-by: Rafael Teixeira de Lima <Rafael.td.lima@gmail.com>
* chore: bump version to 2.25.2 [skip ci]
* docs: add description of DOCLING_ARTIFACTS_PATH env var (#1124)
add env var in docs
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Signed-off-by: Rafael Teixeira de Lima <Rafael.td.lima@gmail.com>
* fix(CLI): fix help message for abort options (#1130)
fix help message
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Signed-off-by: Rafael Teixeira de Lima <Rafael.td.lima@gmail.com>
* perf: New revision code formula model and document picture classifier (#1140)
* New version of code formula model
Signed-off-by: Matteo-Omenetti <Matteo.Omenetti1@ibm.com>
* New version of document picture classifier
Signed-off-by: Matteo-Omenetti <Matteo.Omenetti1@ibm.com>
* New code formula model
Signed-off-by: Matteo-Omenetti <Matteo.Omenetti1@ibm.com>
* Restored original code formula test PDF
Signed-off-by: Matteo-Omenetti <Matteo.Omenetti1@ibm.com>
---------
Signed-off-by: Matteo-Omenetti <Matteo.Omenetti1@ibm.com>
Co-authored-by: Matteo-Omenetti <Matteo.Omenetti1@ibm.com>
Signed-off-by: Rafael Teixeira de Lima <Rafael.td.lima@gmail.com>
* feat: Use new TableFormer model weights and default to accurate model version (#1100)
* feat: New tableformer model weights [WIP]
Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>
* Updated TF version
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
* Updated tests after merging with main; switched to accurate TF model by default
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
---------
Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
Co-authored-by: Maksym Lysak <mly@zurich.ibm.com>
Signed-off-by: Rafael Teixeira de Lima <Rafael.td.lima@gmail.com>
* chore: bump version to 2.26.0 [skip ci]
* fix: Pass tests, update docling-core to 2.22.0 (#1150)
fix: update docling-core to 2.22.0
Update dependency library docling-core to latest release 2.22.0
Fix regression tests and ground truth files
Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
* Updating content hash
Signed-off-by: Rafael Teixeira de Lima <Rafael.td.lima@gmail.com>
---------
Signed-off-by: Rafael Teixeira de Lima <Rafael.td.lima@gmail.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Signed-off-by: Matteo-Omenetti <Matteo.Omenetti1@ibm.com>
Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
Co-authored-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com>
Co-authored-by: Matteo <43417658+Matteo-Omenetti@users.noreply.github.com>
Co-authored-by: Matteo-Omenetti <Matteo.Omenetti1@ibm.com>
Co-authored-by: Maksym Lysak <mly@zurich.ibm.com>
Co-authored-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>