Merge branch 'dev/add-granite-docling-extension' of github.com:DS4SD/docling into dev/add-granite-docling-extension

Christoph Auer
2025-09-16 16:34:09 +02:00
10 changed files with 41 additions and 13 deletions


@@ -32,7 +32,7 @@ from docling.pipeline.vlm_pipeline import VlmPipeline
source = "https://arxiv.org/pdf/2501.17887"
###### USING SIMPLE DEFAULT VALUES
-# - SmolDocling model
+# - GraniteDocling model
# - Using the transformers framework
converter = DocumentConverter(
@@ -53,7 +53,7 @@ print(doc.export_to_markdown())
# For more options see the `compare_vlm_models.py` example.
pipeline_options = VlmPipelineOptions(
-vlm_options=vlm_model_specs.SMOLDOCLING_MLX,
+vlm_options=vlm_model_specs.GRANITEDOCLING_MLX,
)
converter = DocumentConverter(

docs/index.md (2 changes)

@@ -28,7 +28,7 @@ Docling simplifies document processing, parsing diverse formats — including ad
* 🔒 Local execution capabilities for sensitive data and air-gapped environments
* 🤖 Plug-and-play [integrations][integrations] incl. LangChain, LlamaIndex, Crew AI & Haystack for agentic AI
* 🔍 Extensive OCR support for scanned PDFs and images
-* 👓 Support of several Visual Language Models ([SmolDocling](https://huggingface.co/ds4sd/SmolDocling-256M-preview))
+* 👓 Support of several Visual Language Models ([GraniteDocling](https://huggingface.co/ibm-granite/granite-docling-258M))
* 🎙️ Support for Audio with Automatic Speech Recognition (ASR) models
* 🔌 Connect to any agent using the [Docling MCP](https://docling-project.github.io/docling/usage/mcp/) server
* 💻 Simple and convenient CLI

docs/usage/index.md (4 changes)

@@ -31,9 +31,9 @@ You can additionally use Docling directly from your terminal, for instance:
docling https://arxiv.org/pdf/2206.01062
```
-The CLI provides various options, such as 🥚[SmolDocling](https://huggingface.co/ds4sd/SmolDocling-256M-preview) (incl. MLX acceleration) & other VLMs:
+The CLI provides various options, such as 🥚[GraniteDocling](https://huggingface.co/ibm-granite/granite-docling-258M) (incl. MLX acceleration) & other VLMs:
```bash
-docling --pipeline vlm --vlm-model smoldocling https://arxiv.org/pdf/2206.01062
+docling --pipeline vlm --vlm-model granitedocling https://arxiv.org/pdf/2206.01062
```
For all available options, run `docling --help` or check the [CLI reference](../reference/cli.md).
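
The updated invocation in this hunk can also be assembled programmatically, for example when scripting batch conversions. The following is a minimal sketch that only builds the argument list and does not run anything. The helper name `build_docling_command` is hypothetical. The flags and the `granitedocling` value are taken from the diff above.

```python
import shlex


def build_docling_command(source: str, vlm_model: str = "granitedocling") -> list[str]:
    """Assemble the argv list for a VLM-pipeline conversion (hypothetical helper).

    Only the flags shown in the hunk above are used; nothing is executed here.
    """
    return ["docling", "--pipeline", "vlm", "--vlm-model", vlm_model, source]


cmd = build_docling_command("https://arxiv.org/pdf/2206.01062")
# Print the command as a single shell-safe string for inspection.
print(shlex.join(cmd))
```

Passing `cmd` to `subprocess.run` would execute it, assuming the `docling` CLI is installed and on the `PATH`.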


@@ -45,6 +45,8 @@ The following table reports the models currently available out-of-the-box.
| Model instance | Model | Framework | Device | Num pages | Inference time (sec) |
| ---------------|------ | --------- | ------ | --------- | ---------------------|
+| `vlm_model_specs.GRANITEDOCLING_TRANSFORMERS` | [ibm-granite/granite-docling-258M](https://huggingface.co/ibm-granite/granite-docling-258M) | `Transformers/AutoModelForVision2Seq` | MPS | 1 | - |
+| `vlm_model_specs.GRANITEDOCLING_MLX` | [ibm-granite/granite-docling-258M-mlx-bf16](https://huggingface.co/ibm-granite/granite-docling-258M-mlx-bf16) | `MLX`| MPS | 1 | - |
| `vlm_model_specs.SMOLDOCLING_TRANSFORMERS` | [ds4sd/SmolDocling-256M-preview](https://huggingface.co/ds4sd/SmolDocling-256M-preview) | `Transformers/AutoModelForVision2Seq` | MPS | 1 | 102.212 |
| `vlm_model_specs.SMOLDOCLING_MLX` | [ds4sd/SmolDocling-256M-preview-mlx-bf16](https://huggingface.co/ds4sd/SmolDocling-256M-preview-mlx-bf16) | `MLX`| MPS | 1 | 6.15453 |
| `vlm_model_specs.QWEN25_VL_3B_MLX` | [mlx-community/Qwen2.5-VL-3B-Instruct-bf16](https://huggingface.co/mlx-community/Qwen2.5-VL-3B-Instruct-bf16) | `MLX`| MPS | 1 | 23.4951 |
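
The two new GraniteDocling rows report no inference time yet (`-`), so only the existing rows can be compared. As a rough illustration of how to read the timing column, the sketch below computes relative speedups from the three timed rows visible in this hunk. The dictionary and the `speedup` helper are illustrative names, not part of Docling.

```python
# Inference times in seconds (1 page, MPS), copied from the table rows
# in this hunk that report a value. The GraniteDocling rows are omitted
# because their timing column is still "-".
inference_time_sec = {
    "SMOLDOCLING_TRANSFORMERS": 102.212,
    "SMOLDOCLING_MLX": 6.15453,
    "QWEN25_VL_3B_MLX": 23.4951,
}


def speedup(baseline: str, candidate: str) -> float:
    """Return how many times faster `candidate` is than `baseline`."""
    return inference_time_sec[baseline] / inference_time_sec[candidate]


# MLX vs. Transformers for the same SmolDocling model.
print(f"{speedup('SMOLDOCLING_TRANSFORMERS', 'SMOLDOCLING_MLX'):.1f}x")  # → 16.6x
```

These are single-page measurements on one device, so the ratio is indicative only.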