* Use device_map for transformer models
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add accelerate
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Relax accelerate min version
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Make pipeline cache+init thread-safe
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
---------
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Establish layout_model spec and example instantations
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Updated naming
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Back to uppercase constants
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* fix deps issue with openai-whipser>numba>llvmlite
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Pull v1 changed test GT from main
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
---------
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Integrate ListItemMarkerProcessor into document assembly
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Update to final version
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Update all test cases
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Upgrade deps
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
---------
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* scaffolding in place
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* doing scaffolding for audio pipeline
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* WIP: got first transcription working
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* all working, time to start cleaning up
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* first working ASR pipeline
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* added openai-whisper as a first transcription model
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* updating with asr_options
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* finalised the first working ASR pipeline with Whisper
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* use whisper from the latest git commit
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* Update docling/datamodel/pipeline_options.py
Co-authored-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com>
Signed-off-by: Peter W. J. Staar <91719829+PeterStaar-IBM@users.noreply.github.com>
* Update docling/datamodel/pipeline_options.py
Co-authored-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com>
Signed-off-by: Peter W. J. Staar <91719829+PeterStaar-IBM@users.noreply.github.com>
* updated comment
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* AudioBackend -> DummyBackend
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* file rename
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Rename to NoOpBackend, add test for ASR pipeline
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Support every format in NoOpBackend
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add missing audio file and test
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Install ffmpeg system dependency for ASR test
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
---------
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Signed-off-by: Peter W. J. Staar <91719829+PeterStaar-IBM@users.noreply.github.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
* feat: adding new vlm-models support
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* fixed the transformers
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* got microsoft/Phi-4-multimodal-instruct to work
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* working on vlm's
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* refactoring the VLM part
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* all working, now serious refacgtoring necessary
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* refactoring the download_model
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* added the formulate_prompt
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* pixtral 12b runs via MLX and native transformers
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* added the VlmPredictionToken
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* refactoring minimal_vlm_pipeline
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* fixed the MyPy
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* added pipeline_model_specializations file
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* need to get Phi4 working again ...
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* finalising last points for vlms support
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* fixed the pipeline for Phi4
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* streamlining all code
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* reformatted the code
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* fixing the tests
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* added the html backend to the VLM pipeline
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* fixed the static load_from_doctags
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* restore stable imports
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* use AutoModelForVision2Seq for Pixtral and review example (including rename)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* remove unused value
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* refactor instances of VLM models
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* skip compare example in CI
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* use lowercase and uppercase only
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* add new minimal_vlm example and refactor pipeline_options_vlm_model for cleaner import
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* rename pipeline_vlm_model_spec
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* move more argument to options and simplify model init
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* add supported_devices
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* remove not-needed function
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* exclude minimal_vlm
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* missing file
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* add message for transformers version
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* rename to specs
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* use module import and remove MLX from non-darwin
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* remove hf_vlm_model and add extra_generation_args
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* use single HF VLM model class
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* remove torch type
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* add docs for vision models
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
---------
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
* feat: Add visualization of bbox on page with html export.
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* updated the cli
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* reformatted code
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* updated the cli argument to show_layout
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
---------
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* Add DoclingParseV3 backend implementation
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Use docling-core with docling-parse types
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Fixes and test updates
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Fix streams
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Fix streams
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Reset tests
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* update test cases
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* update test units
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add back DoclingParse v1 backend, pipeline options
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Update locks
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* fix: update docling-core to 2.22.0
Update dependency library docling-core to latest release 2.22.0
Fix regression tests and ground truth files
Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
* Ground-truth files updated
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Update tests, use TextCell.from_ocr property
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Text fixes, new test data
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Rename docling backend to v4
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Test all backends, fixes
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Reset all tests to use docling-parse v1 for now
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Fixes for DPv4 backend init, better test coverage
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* test_input_doc use default backend
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
---------
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
Co-authored-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
* Equation groups
Signed-off-by: Rafael Teixeira de Lima <Rafael.td.lima@gmail.com>
* fix: Proper handling of orphan IDs in layout postprocessing (#1118)
* Fix the handling of orphan IDs in layout postprocessing
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Update test cases
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
---------
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Signed-off-by: Rafael Teixeira de Lima <Rafael.td.lima@gmail.com>
* chore: bump version to 2.25.2 [skip ci]
* docs: add description of DOCLING_ARTIFACTS_PATH env var (#1124)
add env var in docs
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Signed-off-by: Rafael Teixeira de Lima <Rafael.td.lima@gmail.com>
* fix(CLI): fix help message for abort options (#1130)
fix help message
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Signed-off-by: Rafael Teixeira de Lima <Rafael.td.lima@gmail.com>
* perf: New revision code formula model and document picture classifier (#1140)
* new version code formula model
Signed-off-by: Matteo-Omenetti <Matteo.Omenetti1@ibm.com>
* new version document picture classifier
Signed-off-by: Matteo-Omenetti <Matteo.Omenetti1@ibm.com>
* new code formula model
Signed-off-by: Matteo-Omenetti <Matteo.Omenetti1@ibm.com>
* restored original code formula test pdf
Signed-off-by: Matteo-Omenetti <Matteo.Omenetti1@ibm.com>
---------
Signed-off-by: Matteo-Omenetti <Matteo.Omenetti1@ibm.com>
Co-authored-by: Matteo-Omenetti <Matteo.Omenetti1@ibm.com>
Signed-off-by: Rafael Teixeira de Lima <Rafael.td.lima@gmail.com>
* feat: Use new TableFormer model weights and default to accurate model version (#1100)
* feat: New tableformer model weights [WIP]
Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>
* Updated TF version
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
* Updated tests, after merging with Main, Switched to Accurate TF model by default
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
---------
Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
Co-authored-by: Maksym Lysak <mly@zurich.ibm.com>
Signed-off-by: Rafael Teixeira de Lima <Rafael.td.lima@gmail.com>
* chore: bump version to 2.26.0 [skip ci]
* fix: Pass tests, update docling-core to 2.22.0 (#1150)
fix: update docling-core to 2.22.0
Update dependency library docling-core to latest release 2.22.0
Fix regression tests and ground truth files
Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
* Updating content hash
Signed-off-by: Rafael Teixeira de Lima <Rafael.td.lima@gmail.com>
---------
Signed-off-by: Rafael Teixeira de Lima <Rafael.td.lima@gmail.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Signed-off-by: Matteo-Omenetti <Matteo.Omenetti1@ibm.com>
Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
Co-authored-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com>
Co-authored-by: Matteo <43417658+Matteo-Omenetti@users.noreply.github.com>
Co-authored-by: Matteo-Omenetti <Matteo.Omenetti1@ibm.com>
Co-authored-by: Maksym Lysak <mly@zurich.ibm.com>
Co-authored-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
fix: update docling-core to 2.22.0
Update dependency library docling-core to latest release 2.22.0
Fix regression tests and ground truth files
Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>