* updated the backend and pyproject.toml
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* updated the version and test files
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* updated the lock
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* forgot to add 1 updated test-file
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* updated the lock
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
---------
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* feat: Switch default layout model to DOCLING_LAYOUT_HERON. Update the unit test data.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* Use default layout model in model_downloader default args
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Use default layout model in model_downloader default args
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Update docling-models tag for TableFormer
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Update test GT
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Update test GT (from linux CPU)
Signed-off-by: Ubuntu <ubuntu@ip-172-31-30-253.eu-central-1.compute.internal>
* fix: Ensure that the visualisations happen on copies of the page image
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* chore: Pinpoint docling-ibm-models to the fix branch for the ReadingOrderPredictor
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* chore: Update uv.lock
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* chore: Update tests GT to match the Heron layout model and the improved reading order model in Linux
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Introduce the verify_doctags optional parameter in conversion tests to control if a doctags
comparison should take place. Skip doctags comparisons for certain tests.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* chore: Generate tests GT on Mac
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* chore: Remove the pinning of the docling-ibm-models and use the release 3.9.1
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
---------
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Signed-off-by: Ubuntu <ubuntu@ip-172-31-30-253.eu-central-1.compute.internal>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-30-253.eu-central-1.compute.internal>
* Add DocumentConverter.extract and full extraction pipeline
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add DocumentConverter.extract template arg
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add NuExtract model
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add Extraction pipeline
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add proper test, support pydantic class types
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add qr bill example
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add base_extraction_pipeline
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add types
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Update typing of ExtractionResult and inner fields
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Factor out extract to DocumentExtractor
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Address mypy issues
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add DocumentExtractor
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Resolve circular import issue
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Clean up imports, remove Optional for template arg
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Move new type definitions into datamodel
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Update comments
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Respect page-range, disable test_extraction for CI
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
---------
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* feat: exploring new version
* DCO Remediation Commit for Georg Heiler <georg.kf.heiler@gmail.com>
I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: 5815c8f81b0e5ce400332597b6795e5a97ecf775
Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>
* chore: autoformat
DCO Remediation Commit for Georg Heiler <georg.kf.heiler@gmail.com>
I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: 5815c8f81b0e5ce400332597b6795e5a97ecf775
* feat: enable configurable runtime for rapidocr and handle new result better;
Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>
* chore: fix linter
Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>
* chore: use new server model
* chore: change default engine type to onnx
* chore: tests update for new rapidocr
* fix: rebase from main and fix clashes
* DCO Remediation Commit for Georg Heiler <georg.kf.heiler@gmail.com>
I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: 5815c8f81b0e5ce400332597b6795e5a97ecf775
I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: 02f9db85f562e5cdfda40c52fee55cfd4030d70a
I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: a7bcb205faedb881f94a89b3bbd29cb31ccd54f0
I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: a39482a98cbcff7a825c8321134732af0c65930a
I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: 63e9d717fa26951566b02761f3fdfc752c31f805
I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: ef12a6ec1ea2846a8a8e2e776eeaa59c2a0c4dfe
Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>
* DCO Remediation Commit for Georg Heiler <georg.kf.heiler@gmail.com>
I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: 2222d2340387f8d9d66f3ca9d8e21a0945a44e7a
I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: bc6a1dc507d7f146ec4797a2d3840414f46ac64d
I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: 56e0d67da7c57d4b5caf8eaef8dff7056c3efd32
I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: 871ca21271412006c76acf3c19426140efed3d50
I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: 7b1b77159da729d483a581a86c7309acba1712a7
I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: a792a714a43e19a91b2b782f54621c1c5efda632
Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>
* DCO Remediation Commit for Georg Heiler <georg.kf.heiler@gmail.com>
I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: d1fed26323ff829b716bc667fe69532839363e45
I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: 346ec1cad943765f886e5d17fb0a54221124689c
I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: 4d0bbe5bd6e9f7261b97362ff8823af244267089
I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: 34a5ad53892a7064a6bf35f890d344d464c78b2f
I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: 9151959db3ad53535011d1cfdcf9181fdf936bb1
I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: 8ef5536f2c098826c6c0a05190f8a80614c3f3cb
Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>
* DCO Remediation Commit for Georg Heiler <georg.kf.heiler@gmail.com>
I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: 7e18637a35
I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: 63fb8ff599
I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: 0cb9444fb8
I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: 38940d9978
I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: b6d461ac42
I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: ee55eb3408
Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>
---------
Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>
* Prepare existing codes for use with new multi-stage VLM pipeline
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add multithreaded VLM pipeline
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add VLM task interpreters
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add VLM task interpreters
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Remove prints
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Fix KeyboardInterrupt behaviour
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add VLLM backend support, optimize process_images
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Tweak defaults
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Implement proper batch inference for HuggingFaceTransformersVlmModel
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Small fixes
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Cleanup hf_transformers_model batching impl
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Adjust example instatiation of multi-stage VLM pipeline
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add GoT OCR 2.0
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Factor out changes without multi-stage pipeline
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Reset defaults for generation
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Cleanup
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add torch.compile, fix temperature setting in gen_kwargs
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Expose page_batch_size on CLI
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add torch_dtype bfloat16 to SMOLDOCLING and SMOLVLM model spec
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Clip off pad_token
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
---------
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Use device_map for transformer models
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add accelerate
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Relax accelerate min version
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Make pipeline cache+init thread-safe
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
---------
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Establish layout_model spec and example instantations
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Updated naming
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Back to uppercase constants
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* fix deps issue with openai-whipser>numba>llvmlite
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Pull v1 changed test GT from main
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
---------
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Integrate ListItemMarkerProcessor into document assembly
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Update to final version
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Update all test cases
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Upgrade deps
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
---------
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* feat: adding new vlm-models support
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* fixed the transformers
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* got microsoft/Phi-4-multimodal-instruct to work
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* working on vlm's
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* refactoring the VLM part
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* all working, now serious refacgtoring necessary
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* refactoring the download_model
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* added the formulate_prompt
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* pixtral 12b runs via MLX and native transformers
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* added the VlmPredictionToken
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* refactoring minimal_vlm_pipeline
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* fixed the MyPy
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* added pipeline_model_specializations file
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* need to get Phi4 working again ...
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* finalising last points for vlms support
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* fixed the pipeline for Phi4
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* streamlining all code
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* reformatted the code
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* fixing the tests
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* added the html backend to the VLM pipeline
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* fixed the static load_from_doctags
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* restore stable imports
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* use AutoModelForVision2Seq for Pixtral and review example (including rename)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* remove unused value
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* refactor instances of VLM models
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* skip compare example in CI
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* use lowercase and uppercase only
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* add new minimal_vlm example and refactor pipeline_options_vlm_model for cleaner import
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* rename pipeline_vlm_model_spec
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* move more argument to options and simplify model init
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* add supported_devices
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* remove not-needed function
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* exclude minimal_vlm
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* missing file
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* add message for transformers version
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* rename to specs
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* use module import and remove MLX from non-darwin
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* remove hf_vlm_model and add extra_generation_args
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* use single HF VLM model class
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* remove torch type
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* add docs for vision models
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
---------
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
* feat: Add visualization of bbox on page with html export.
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* updated the cli
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* reformatted code
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* updated the cli argument to show_layout
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
---------
Signed-off-by: Peter Staar <taa@zurich.ibm.com>