Michele Dolfi
6d279f1c41
add docs for vision models
...
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-06-02 15:16:23 +02:00
Michele Dolfi
738385004a
Merge remote-tracking branch 'origin/main' into dev/add-other-vlm-models
2025-06-02 14:08:23 +02:00
Edgar Hipp
11ca4f7a7b
docs: fix typo in index.md (#1676)
...
Signed-off-by: Edgar Hipp <hipp.edg@gmail.com>
2025-06-02 12:35:59 +02:00
Michele Dolfi
c0847c97a7
use module import and remove MLX from non-darwin
...
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-06-02 10:45:46 +02:00
Michele Dolfi
b9c1698263
rename to specs
...
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-06-02 10:40:06 +02:00
Michele Dolfi
7f6df727e3
add supported_devices
...
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-06-01 21:12:43 +02:00
Michele Dolfi
3ff1712787
rename pipeline_vlm_model_spec
...
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-06-01 18:29:20 +02:00
Michele Dolfi
2bd15cc809
add new minimal_vlm example and refactor pipeline_options_vlm_model for cleaner import
...
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-06-01 18:24:04 +02:00
Michele Dolfi
0b2c1d5eda
refactor instances of VLM models
...
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-06-01 16:55:56 +02:00
Michele Dolfi
9dbf08a084
use AutoModelForVision2Seq for Pixtral and review example (including rename)
...
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-06-01 16:30:58 +02:00
Peter Staar
a4e6777bb3
fixed the merge conflicts
...
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
2025-05-23 16:30:18 +02:00
Peter Staar
e93cc3ce09
fixing the tests
...
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
2025-05-18 07:38:06 +02:00
Peter Staar
0c7c7c11c2
reformatted the code
...
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
2025-05-16 16:31:11 +02:00
Peter Staar
d5b6c871cf
streamlining all code
...
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
2025-05-16 16:27:27 +02:00
Peter Staar
661f7c9780
fixed the pipeline for Phi4
...
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
2025-05-16 15:55:49 +02:00
Peter Staar
d41b856961
finalising last points for vlms support
...
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
2025-05-16 12:39:26 +02:00
Panos Vagenas
7c4c356e76
chore: fix chunking example data link (#1596)
...
Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>
2025-05-16 08:44:47 +02:00
Peter Staar
fc61258273
merged with main
...
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
2025-05-15 07:46:06 +02:00
Peter Staar
e2c95d09bc
need to get Phi4 working again ...
...
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
2025-05-15 07:32:55 +02:00
Peter Staar
7c67d2b2fe
fixed the MyPy
...
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
2025-05-14 17:51:43 +02:00
Panos Vagenas
9f28abf061
docs: add advanced chunking & serialization example (#1589)
...
Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>
2025-05-14 14:35:07 +02:00
Peter Staar
a3716b1961
refactoring minimal_vlm_pipeline
...
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
2025-05-14 13:57:32 +02:00
Peter Staar
7c97b494ec
added the VlmPredictionToken
...
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
2025-05-14 12:23:46 +02:00
Elwin
12dab0a1e8
feat: support image/webp file type (#1415)
...
* support image/webp file type
Signed-off-by: Elwin <61868295+hzhaoy@users.noreply.github.com>
Signed-off-by: Elwin <hzywong@gmail.com>
* docs: add webp image format in supported_formats.md
Signed-off-by: Elwin <61868295+hzhaoy@users.noreply.github.com>
Signed-off-by: Elwin <hzywong@gmail.com>
* test: add a test case for `image/webp` file
Signed-off-by: Elwin <hzywong@gmail.com>
* style: apply styling
Signed-off-by: Elwin <hzywong@gmail.com>
* test: update test case of converting `image/webp` file with more ocr engines
Signed-off-by: Elwin <hzywong@gmail.com>
* style: apply styling
Signed-off-by: Elwin <hzywong@gmail.com>
* rename test file
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
---------
Signed-off-by: Elwin <61868295+hzhaoy@users.noreply.github.com>
Signed-off-by: Elwin <hzywong@gmail.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
2025-05-14 09:47:28 +02:00
Peter Staar
f159075b67
pixtral 12b runs via MLX and native transformers
...
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
2025-05-14 07:39:20 +02:00
Peter Staar
4c0bc61e54
refactoring the download_model
...
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
2025-05-14 05:31:54 +02:00
Peter Staar
3407955a47
all working, now serious refactoring necessary
...
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
2025-05-13 18:23:55 +02:00
Peter Staar
7fbe021359
working on VLMs
...
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
2025-05-13 06:07:11 +02:00
Peter Staar
bd2d01f0ac
Merge branch 'main' into dev/add-other-vlm-models
2025-05-12 08:52:52 +02:00
Oleg Lavrovsky
844babb390
docs: update links in data_prep_kit (#1559)
...
Update data_prep_kit.md
The links were broken, since the repository was renamed. I also noticed that PDF2Parquet is now referred to as Docling2Parquet.
Signed-off-by: Oleg Lavrovsky <31819+loleg@users.noreply.github.com>
2025-05-11 20:38:25 +02:00
Peter Staar
18e1ec4df2
feat: adding new vlm-models support
...
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
2025-05-11 09:30:10 +02:00
Panos Vagenas
3220a592e7
docs: add serialization docs, update chunking docs (#1556)
...
* docs: add serializers docs, update chunking docs
Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>
* update notebook to improve MD table rendering
Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>
---------
Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>
2025-05-08 21:43:01 +02:00
nkh0472
a097ccd8d5
chore: typo fix (#1465)
...
* typo fix
Signed-off-by: nkh0472 <67589323+nkh0472@users.noreply.github.com>
* chore: typo fix
Signed-off-by: nkh0472 <67589323+nkh0472@users.noreply.github.com>
* chore: typo fix
Signed-off-by: nkh0472 <67589323+nkh0472@users.noreply.github.com>
* chore: typo fix
Signed-off-by: nkh0472 <67589323+nkh0472@users.noreply.github.com>
* chore: typo fix
Signed-off-by: nkh0472 <67589323+nkh0472@users.noreply.github.com>
* chore: typo fix
Signed-off-by: nkh0472 <67589323+nkh0472@users.noreply.github.com>
* chore: typo fix
Signed-off-by: nkh0472 <67589323+nkh0472@users.noreply.github.com>
* chore: typo fix
Signed-off-by: nkh0472 <67589323+nkh0472@users.noreply.github.com>
* chore: typo fix
Signed-off-by: nkh0472 <67589323+nkh0472@users.noreply.github.com>
* chore: typo fix
Signed-off-by: nkh0472 <67589323+nkh0472@users.noreply.github.com>
* chore: typo fix
Signed-off-by: nkh0472 <67589323+nkh0472@users.noreply.github.com>
* chore: typo fix
Signed-off-by: nkh0472 <67589323+nkh0472@users.noreply.github.com>
* chore: typo fix
Signed-off-by: nkh0472 <67589323+nkh0472@users.noreply.github.com>
* chore: typo fix
Signed-off-by: nkh0472 <67589323+nkh0472@users.noreply.github.com>
---------
Signed-off-by: nkh0472 <67589323+nkh0472@users.noreply.github.com>
2025-04-28 08:52:09 +02:00
Emmanuel Ferdman
3afbe6c969
docs: update supported formats guide (#1463)
...
Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
2025-04-28 08:51:54 +02:00
Ryan Lin
a2fbbba9f7
feat: add tutorial using Milvus and Docling for RAG pipeline (#1449)
...
* feat: add milvus rag with docling tutorial
Signed-off-by: Ryan Lin <linjinhong@yandex.com>
* chore: run pre-commit
Signed-off-by: Ryan Lin <linjinhong@yandex.com>
* feat: add RAG with Milvus example to mkdocs
Signed-off-by: Ryan Lin <linjinhong@yandex.com>
---------
Signed-off-by: Ryan Lin <linjinhong@yandex.com>
2025-04-25 09:12:35 +02:00
nkh0472
c2470ed216
docs: Fix wrong output format in example code (#1427)
...
fix: wrong output format
Signed-off-by: nkh0472 <67589323+nkh0472@users.noreply.github.com>
2025-04-22 12:32:55 +02:00
Michele Dolfi
64918a81ac
docs: Add OpenSSF Best Practices badge (#1430)
...
* docs: add openssf badge
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* add badge to docs
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
---------
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-04-22 11:23:28 +02:00
Ben Cox
995b3b0ab1
docs: Typo fixes in docling_document.md (#1400)
...
Signed-off-by: Ben Cox <1038350+ind1go@users.noreply.github.com>
2025-04-22 08:49:08 +02:00
Leandro Rosas
88948b0bba
docs: Updated the [Usage] link in architecture.md (#1416)
...
Fixed the [Usage] link in architecture.md
Changed the usage link in the tip box from "../usage.md#adjust-pipeline-features" to "../usage/index.md#adjust-pipeline-features" as the previous link is not valid.
Signed-off-by: Leandro Rosas <36343022+leandrosas101@users.noreply.github.com>
2025-04-19 10:20:52 +02:00
Felix Dittrich
a7dd59c5cb
docs(ocr): Add docs entry for OnnxTR OCR plugin (#1382)
...
feat(ocr): Add docs entry for OnnxTR OCR plugin
Signed-off-by: felix <felixdittrich92@gmail.com>
2025-04-15 09:46:59 +02:00
Michele Dolfi
5458a88464
ci: add coverage and ruff (#1383)
...
* add coverage calculation and push
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* new codecov version and usage of token
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* enable ruff formatter instead of black and isort
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* apply ruff lint fixes
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* apply ruff unsafe fixes
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* add removed imports
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* runs 1 on linter issues
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* finalize linter fixes
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* Update pyproject.toml
Co-authored-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
Signed-off-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com>
---------
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Signed-off-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com>
Co-authored-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
2025-04-14 18:01:26 +02:00
Juil Park
a026b4e84b
docs: Add Notes for Installing in Intel macOS (#1377)
...
docs: Add Notes for Intel macOS
Signed-off-by: Juil Park <park@juil.dev>
2025-04-14 10:21:13 +02:00
Peter W. J. Staar
c0ba88edf1
feat(cli): add option for html with split-page mode (#1355)
...
* updated the cli to output html in split-page mode
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* add pin for new docling-core with html split argument
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* relock with fixed html export in docling-core
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* update test results
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* update more tests
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* update example
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* update lock with docling-core fixes
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* update test results
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* add again chunking extras
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
---------
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
2025-04-14 08:41:50 +02:00
Gabe Goodhart
c605edd8e9
feat: OllamaVlmModel for Granite Vision 3.2 (#1337)
...
* build: Add ollama sdk dependency
Branch: OllamaVlmModel
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
* feat: Add option plumbing for OllamaVlmOptions in pipeline_options
Branch: OllamaVlmModel
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
* feat: Full implementation of OllamaVlmModel
Branch: OllamaVlmModel
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
* feat: Connect "granite_vision_ollama" pipeline option to CLI
Branch: OllamaVlmModel
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
* Revert "build: Add ollama sdk dependency"
After consideration, we're going to use the generic OpenAI API instead
of the Ollama-specific API to avoid duplicate work.
This reverts commit bc6b366468cdd66b52540aac9c7d8b584ab48ad0.
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
* refactor: Move OpenAI API call logic into utils.utils
This will allow reuse of this logic in a generic VLM model
NOTE: There is a subtle change here in the ordering of the text prompt and
the image in the call to the OpenAI API. When run against Ollama, this
ordering makes a big difference. If the prompt comes before the image, the
result is terse and not usable whereas the prompt coming after the image
works as expected and matches the non-OpenAI chat API.
Branch: OllamaVlmModel
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
* refactor: Refactor from Ollama SDK to generic OpenAI API
Branch: OllamaVlmModel
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
* fix: Linting, formatting, and bug fixes
The one bug fix was in the timeout arg to openai_image_request. Otherwise,
this is all style changes to get MyPy and black passing cleanly.
Branch: OllamaVlmModel
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
* remove model from download enum
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* generalize input args for other API providers
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* rename and refactor
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* add example
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* require flag for remote services
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* disable example from CI
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* add examples to docs
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
---------
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
2025-04-10 18:03:04 +02:00
Michele Dolfi
2e99e5a54f
docs: add plugins docs (#1319)
...
add plugin docs
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-04-08 09:44:37 +02:00
Panos Vagenas
71148eb381
docs: add visual grounding example (#1270)
...
Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>
2025-04-02 14:03:19 +02:00
Clément Doumouro
0974ba4e1c
docs(examples): batch conversion doc raises_on_error (#1147)
...
Signed-off-by: Clément Doumouro <clement.doumouro@gmail.com>
2025-03-25 11:14:39 +01:00
Maxim Lysak
1c26769785
feat(SmolDocling): Support MLX acceleration in VLM pipeline (#1199)
...
* Initial implementation to support MLX for VLM pipeline and SmolDocling
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
* mlx_model unit
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
* Add CLI choices for VLM pipeline and model
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Initial implementation to support MLX for VLM pipeline and SmolDocling
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
* mlx_model unit
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
* Add CLI choices for VLM pipeline and model
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Updated minimal vlm pipeline example
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
* make vlm_pipeline python3.9 compatible
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
* Fixed extract_text_from_backend definition
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
* Updated README
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
* Updated example
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
* Updated documentation
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
* corrections in the documentation
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
* Cosmetic changes
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
---------
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Maksym Lysak <mly@zurich.ibm.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
2025-03-19 15:38:54 +01:00
Michele Dolfi
1d680b0a32
docs: Linux Foundation AI & Data (#1183)
...
* point the auxiliary files to the community repo and add lfai in README
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* update docs index
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
---------
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-03-19 09:05:57 +01:00
Michele Dolfi
54a78c307d
docs: move apify to docs (#1182)
...
move apify to docs
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-03-18 16:43:55 +01:00