Peter Staar
f4c1836c96
functional working two-stage, need to implement a good prompt now to leverage bounding boxes
...
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
2025-07-10 16:15:54 +02:00
Peter Staar
b2d5c783ae
working two-stage vlm approach from the cli
...
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
2025-07-10 15:38:15 +02:00
Peter Staar
fb74d0c5b3
working TwoStageVlmModel
...
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
2025-07-10 15:11:53 +02:00
Peter Staar
b2336830eb
fixed the circular dependencies
...
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
2025-07-10 10:35:47 +02:00
Peter Staar
70872e6539
merged with main and refactored the code to fix MyPy
...
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
2025-07-10 09:58:06 +02:00
Peter Staar
e596143bf8
Merge branch 'main' into dev/add-two-stage-vlm
2025-07-10 06:52:31 +02:00
Peter Staar
0f395688b8
refactored the code and added vlm2stage as a cli option
...
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
2025-07-10 06:48:34 +02:00
Christoph Auer
2b8616d6d5
feat: Layout model specification and multiple choices ( #1910 )
...
* Establish layout_model spec and example instantiations
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
* Updated naming
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
* Back to uppercase constants
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
* fix deps issue with openai-whisper>numba>llvmlite
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
* Pull v1 changed test GT from main
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
---------
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
2025-07-10 06:37:27 +02:00
Panos Vagenas
ec588df971
feat: enable precision control in float serialization ( #1914 )
...
* chore: propagate precision control in float serialization
Signed-off-by: Panos Vagenas <pva@zurich.ibm.com >
* parametrize float serialization, propagate core updates
Signed-off-by: Panos Vagenas <pva@zurich.ibm.com >
* update test float precision
Signed-off-by: Panos Vagenas <pva@zurich.ibm.com >
* repin docling-core
Signed-off-by: Panos Vagenas <pva@zurich.ibm.com >
---------
Signed-off-by: Panos Vagenas <pva@zurich.ibm.com >
2025-07-09 16:39:17 +02:00
Peter Staar
dcf6fd6a41
fixed the MyPy complaints
...
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
2025-07-09 06:48:03 +02:00
Clément Doumouro
931eb55b88
fix(ocr-utils): unit test and fix the rotate_bounding_box function ( #1897 )
...
Signed-off-by: Clément Doumouro <clement.doumouro@gmail.com >
2025-07-08 18:03:29 +02:00
Peter Staar
c10e2920a4
refactoring redundant code and fixing mypy errors
...
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
2025-07-08 16:37:20 +02:00
Peter Staar
b5479ab971
working on MyPy
...
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
2025-07-08 15:05:54 +02:00
Peter Staar
49e9a00c05
merged in layout-model-spec
...
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
2025-07-08 13:29:30 +02:00
Christoph Auer
517230b9c4
Updated naming
...
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
2025-07-08 13:07:56 +02:00
Christoph Auer
af0461e5b1
Move to pipeline_options.layout_options.model
...
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
2025-07-08 11:30:42 +02:00
Christoph Auer
f2094f858b
Establish layout_model spec and example instantiations
...
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
2025-07-08 10:23:18 +02:00
Peter Staar
810446c8dc
feat: working on a two stage VLM model
...
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
2025-07-08 09:49:39 +02:00
Peter Staar
4eceefa47c
feat: add TwoStageVlmModel
...
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
2025-07-08 07:38:48 +02:00
geoHeil
a07ba863c4
feat: add image-text-to-text models in transformers ( #1772 )
...
* feat(dolphin): add dolphin support
Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com >
* rename
Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com >
* reformat
Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com >
* fix mypy
Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com >
* add prompt style and examples
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com >
---------
Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com >
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com >
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com >
2025-07-08 05:54:57 +02:00
VIktor Kuropiantnyk
e25873d557
fix: docs are missing osd packages for tesseract on RHEL ( #1905 )
...
Fixed missing packages in the docs on tesseract
Signed-off-by: Viktor Kuropiatnyk <vku@zurich.ibm.com >
2025-07-07 17:06:26 +02:00
Shkarupa Alex
b8813eea80
feat(vlm): Dynamic prompts ( #1808 )
...
* Unify temperature options for Vlm models
* Dynamic prompt support with example
* DCO Remediation Commit for Shkarupa Alex <shkarupa.alex@gmail.com >
I, Shkarupa Alex <shkarupa.alex@gmail.com >, hereby add my Signed-off-by to this commit: 34d446cb98
I, Shkarupa Alex <shkarupa.alex@gmail.com >, hereby add my Signed-off-by to this commit: 9c595d574f
Signed-off-by: Shkarupa Alex <shkarupa.alex@gmail.com >
* Replace Page with SegmentedPage
* Fix example HF repo link
Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com >
* Sign-off
Signed-off-by: Shkarupa Alex <shkarupa.alex@gmail.com >
* DCO Remediation Commit for Shkarupa Alex <shkarupa.alex@gmail.com >
I, Shkarupa Alex <shkarupa.alex@gmail.com >, hereby add my Signed-off-by to this commit: 1a162066dd
Signed-off-by: Shkarupa Alex <shkarupa.alex@gmail.com >
Signed-off-by: Shkarupa Alex <shkarupa.alex@gmail.com >
* Use lmstudio-community model
Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com >
* Swap inference engine to LM Studio
Signed-off-by: Shkarupa Alex <shkarupa.alex@gmail.com >
---------
Signed-off-by: Shkarupa Alex <shkarupa.alex@gmail.com >
Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com >
Co-authored-by: Christoph Auer <60343111+cau-git@users.noreply.github.com >
2025-07-07 16:58:42 +02:00
Michele Dolfi
edd4356aac
fix: use only backend for picture classifier ( #1904 )
...
use backend for picture classifier
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com >
2025-07-07 16:23:16 +02:00
Michele Dolfi
dd8fde7f19
fix: typo in asr options ( #1902 )
...
fix typo
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com >
2025-07-07 08:59:14 +02:00
github-actions[bot]
f4a1c06937
chore: bump version to 2.40.0 [skip ci]
v2.40.0
2025-07-04 15:31:36 +00:00
Christoph Auer
ec6cf6f7e8
feat: Introduce LayoutOptions to control layout postprocessing behaviour ( #1870 )
...
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
2025-07-04 15:36:13 +02:00
Christoph Auer
598c9c53d4
fix: Secure torch model inits with global locks ( #1884 )
...
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
2025-07-04 07:27:26 +02:00
Qiefan Jiang
13865c06f5
perf(msexcel): _find_table_bounds use iter_rows/iter_cols instead of Worksheet.cell ( #1875 )
...
* perf(msexcel): _find_table_bounds use iter_rows/iter_cols instead of sheet.cell
* DCO Remediation Commit for Qiefan Jiang <jiangqiefan@bytedance.com >
I, Qiefan Jiang <jiangqiefan@bytedance.com >, hereby add my Signed-off-by to this commit: 274102a8d4
Signed-off-by: Qiefan Jiang <jiangqiefan@bytedance.com >
* fix lint
* DCO Remediation Commit for Qiefan Jiang <jiangqiefan@bytedance.com >
I, Qiefan Jiang <jiangqiefan@bytedance.com >, hereby add my Signed-off-by to this commit: b6b5b090a9
Signed-off-by: Qiefan Jiang <jiangqiefan@bytedance.com >
---------
Signed-off-by: Qiefan Jiang <jiangqiefan@bytedance.com >
2025-07-03 13:12:06 +02:00
William Easton
3089cf2d26
perf: Move expensive imports closer to usage ( #1863 )
...
* Move expensive imports closer to usage
Signed-off-by: William Easton <bill.easton@elastic.co >
* DCO Remediation Commit for William Easton <bill.easton@elastic.co >
I, William Easton <bill.easton@elastic.co >, hereby add my Signed-off-by to this commit: 8a7412ce5bb131a01bb6403067aeb948c9093b0b
Signed-off-by: William Easton <bill.easton@elastic.co >
* formatting fixes
Signed-off-by: William Easton <bill.easton@elastic.co >
* DCO Remediation Commit for William Easton <bill.easton@elastic.co >
I, William Easton <bill.easton@elastic.co >, hereby add my Signed-off-by to this commit: 8a7412ce5bb131a01bb6403067aeb948c9093b0b
I, William Easton <bill.easton@elastic.co >, hereby add my Signed-off-by to this commit: 963e34325071db5e844841f10c27b396a054a0a1
Signed-off-by: William Easton <bill.easton@elastic.co >
* Fix baseocrmodel test issue
Signed-off-by: William Easton <bill.easton@elastic.co >
---------
Signed-off-by: William Easton <bill.easton@elastic.co >
2025-07-01 22:27:17 +02:00
Christoph Auer
56a0e104f7
feat: Integrate ListItemMarkerProcessor into document assembly ( #1825 )
...
* Integrate ListItemMarkerProcessor into document assembly
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
* Update to final version
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
* Update all test cases
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
* Upgrade deps
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
---------
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
2025-07-01 10:04:58 +02:00
Christoph Auer
bdfee4e2d0
chore: Safer unloading of DPv4 backend ( #1867 )
...
fix: Safer unloading of DPv4 backend
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
2025-06-30 14:41:21 +02:00
Nikos Livathinos
ae39a9411a
fix: Ensure that TesseractOcrModel does not crash in case OSD is not installed ( #1866 )
...
fix: Ensure that TesseractOcrModel does not crash if tesseract OSD is not installed
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com >
2025-06-30 10:55:56 +02:00
github-actions[bot]
bb99be6c24
chore: bump version to 2.39.0 [skip ci]
v2.39.0
2025-06-27 15:37:53 +00:00
Panos Vagenas
0533da1923
feat: leverage new list modeling, capture default markers ( #1856 )
...
* chore: update docling-core & regenerate test data
Signed-off-by: Panos Vagenas <pva@zurich.ibm.com >
* update backends to leverage new list modeling
Signed-off-by: Panos Vagenas <pva@zurich.ibm.com >
* repin docling-core
Signed-off-by: Panos Vagenas <pva@zurich.ibm.com >
* ensure availability of latest docling-core API
Signed-off-by: Panos Vagenas <pva@zurich.ibm.com >
---------
Signed-off-by: Panos Vagenas <pva@zurich.ibm.com >
2025-06-27 16:37:15 +02:00
Michael Honaker
e79e4f0ab6
fix(markdown): make parsing of rich table cells valid ( #1821 )
...
* fix: update md table classification
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com >
* Fix ground truth header changes
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com >
* Fix merge issues
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com >
* Fix minor ground truth errors
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com >
---------
Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com >
2025-06-26 19:50:45 +02:00
github-actions[bot]
ee4781075a
chore: bump version to 2.38.1 [skip ci]
v2.38.1
2025-06-25 16:27:46 +00:00
pranaymiri
d337825b8e
fix: updated granite vision model version for picture description ( #1852 )
...
* updated granite model version
* DCO Remediation Commit for Miriyala Pranay <miriyalapranay146@gmail.com >
I, Miriyala Pranay <miriyalapranay146@gmail.com >, hereby add my Signed-off-by to this commit: 5de0d5034c
Signed-off-by: Miriyala Pranay <miriyalapranay146@gmail.com >
---------
Signed-off-by: Miriyala Pranay <miriyalapranay146@gmail.com >
2025-06-25 17:49:56 +02:00
Panos Vagenas
7c5614a37a
fix(markdown): fix single-formatted headings & list items ( #1820 )
...
* fix(markdown): fix formatting & inline edge cases (show behavior before change)
Signed-off-by: Panos Vagenas <pva@zurich.ibm.com >
* add change and updated test data
Signed-off-by: Panos Vagenas <pva@zurich.ibm.com >
* update lock
Signed-off-by: Panos Vagenas <pva@zurich.ibm.com >
* improve test case
Signed-off-by: Panos Vagenas <pva@zurich.ibm.com >
---------
Signed-off-by: Panos Vagenas <pva@zurich.ibm.com >
2025-06-25 13:05:06 +02:00
Michele Dolfi
41e8cae26b
fix: fix response type of ollama ( #1850 )
...
fix response type of ollama
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com >
2025-06-25 11:33:09 +02:00
Allen N.
4002de1f92
fix: Handle missing runs to avoid out of range exception ( #1844 )
...
Fixes #1681 on upstream
Signed-off-by: Allen Nikka <allennikka@gmail.com >
2025-06-25 07:55:27 +02:00
github-actions[bot]
1dc63d0aa9
chore: bump version to 2.38.0 [skip ci]
v2.38.0
2025-06-23 18:14:24 +00:00
Peter W. J. Staar
f3ae3029b8
docs: update readme and add ASR example ( #1836 )
...
* updated the README
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
* added minimal_asr_pipeline
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
* Updated README and added ASR example
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
* Updated docs.index.md
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
* updated CI and mkdocs
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
* added link to existing audio file
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
* added link to existing audio file
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
* reformatting
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
---------
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
2025-06-23 18:55:16 +02:00
Peter W. J. Staar
1557e7ce3e
feat: Support audio input ( #1763 )
...
* scaffolding in place
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
* doing scaffolding for audio pipeline
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
* WIP: got first transcription working
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
* all working, time to start cleaning up
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
* first working ASR pipeline
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
* added openai-whisper as a first transcription model
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
* updating with asr_options
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
* finalised the first working ASR pipeline with Whisper
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
* use whisper from the latest git commit
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com >
* Update docling/datamodel/pipeline_options.py
Co-authored-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com >
Signed-off-by: Peter W. J. Staar <91719829+PeterStaar-IBM@users.noreply.github.com >
* Update docling/datamodel/pipeline_options.py
Co-authored-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com >
Signed-off-by: Peter W. J. Staar <91719829+PeterStaar-IBM@users.noreply.github.com >
* updated comment
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
* AudioBackend -> DummyBackend
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
* file rename
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
* Rename to NoOpBackend, add test for ASR pipeline
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
* Support every format in NoOpBackend
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
* Add missing audio file and test
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
* Install ffmpeg system dependency for ASR test
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
---------
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com >
Signed-off-by: Peter W. J. Staar <91719829+PeterStaar-IBM@users.noreply.github.com >
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com >
Co-authored-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com >
Co-authored-by: Christoph Auer <cau@zurich.ibm.com >
2025-06-23 14:47:26 +02:00
Cesar Berrospi Ramis
d26dac61a8
fix(docx): ensure list items have a list parent ( #1827 )
...
Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com >
2025-06-20 14:47:25 +02:00
mkrssg
1350a8d3e5
fix(msword_backend): Identify text in the same line after an image #1425 ( #1610 )
...
* fix(msword_backend): Identify text in the same line after an image / image anchor #1425
Signed-off-by: Michael Krissgau <michael.krissgau@ibm.com >
* test: add test file and case for fix(msword_backend): Identify text in the same line after an image / image anchor #1425
Signed-off-by: Michael Krissgau <michael.krissgau@ibm.com >
* test: added groundtruth test files for fix(msword_backend): Identify text in the same line after an image / image anchor #1425
Signed-off-by: Michael Krissgau <michael.krissgau@ibm.com >
* fix: extraneous empty paragraphs for test files
Signed-off-by: Michael Krissgau <michael.krissgau@ibm.com >
---------
Signed-off-by: Michael Krissgau <michael.krissgau@ibm.com >
Co-authored-by: Michael Krissgau <michael.krissgau@ibm.com >
2025-06-20 10:55:30 +02:00
Michele Dolfi
64ac043786
docs: support running examples from root or subfolder ( #1816 )
...
support running examples from root or subfolder
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com >
2025-06-19 11:10:40 +02:00
Christoph Auer
dd7f64ff28
fix: Ensure uninitialized pages are removed before assembling document ( #1812 )
...
Ensure uninitialized pages are removed before assembling document
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
2025-06-19 07:33:25 +02:00
Panos Vagenas
861abcdcb0
feat(markdown): add formatting & improve inline support ( #1804 )
...
feat(markdown): support formatting & hyperlinks
Signed-off-by: Panos Vagenas <pva@zurich.ibm.com >
2025-06-18 15:57:57 +02:00
Shkarupa Alex
215b540f6c
feat: Maximum image size for Vlm models ( #1802 )
...
* Image scale moved to base vlm options.
Added max_size image limit (options and vlm models).
* DCO Remediation Commit for Shkarupa Alex <shkarupa.alex@gmail.com >
I, Shkarupa Alex <shkarupa.alex@gmail.com >, hereby add my Signed-off-by to this commit: e93602a0d0
Signed-off-by: Shkarupa Alex <shkarupa.alex@gmail.com >
---------
Signed-off-by: Shkarupa Alex <shkarupa.alex@gmail.com >
2025-06-18 12:57:37 +02:00
Mahafuzur Rahman
dbab30e92c
fix: formula conversion with page_range param set ( #1791 )
...
When the page_range param is used for formula conversion,
the system throws a list index out of range error.
Included tests to validate that the fix works.
Signed-off-by: Masum <masumsofts@yahoo.com >
2025-06-17 13:58:45 +02:00