update docs and README

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Author: Michele Dolfi
Date: 2025-09-15 15:44:42 +02:00
commit 43d3c74bb2
parent c5a59eb979
6 changed files with 11 additions and 9 deletions


@@ -36,7 +36,7 @@ Docling simplifies document processing, parsing diverse formats — including ad
 * 🔒 Local execution capabilities for sensitive data and air-gapped environments
 * 🤖 Plug-and-play [integrations][integrations] incl. LangChain, LlamaIndex, Crew AI & Haystack for agentic AI
 * 🔍 Extensive OCR support for scanned PDFs and images
-* 👓 Support of several Visual Language Models ([SmolDocling](https://huggingface.co/ds4sd/SmolDocling-256M-preview))
+* 👓 Support of several Visual Language Models ([GraniteDocling](https://huggingface.co/ibm-granite/granite-docling-258M))
 * 🎙️ Audio support with Automatic Speech Recognition (ASR) models
 * 🔌 Connect to any agent using the [MCP server](https://docling-project.github.io/docling/usage/mcp/)
 * 💻 Simple and convenient CLI
@@ -88,9 +88,9 @@ Docling has a built-in CLI to run conversions.
 docling https://arxiv.org/pdf/2206.01062
 ```
-You can also use 🥚[SmolDocling](https://huggingface.co/ds4sd/SmolDocling-256M-preview) and other VLMs via Docling CLI:
+You can also use 🥚[GraniteDocling](https://huggingface.co/ibm-granite/granite-docling-258M) and other VLMs via Docling CLI:
 ```bash
-docling --pipeline vlm --vlm-model smoldocling https://arxiv.org/pdf/2206.01062
+docling --pipeline vlm --vlm-model granitedocling https://arxiv.org/pdf/2206.01062
 ```
 This will use MLX acceleration on supported Apple Silicon hardware.


@@ -9,7 +9,7 @@ from docling.pipeline.vlm_pipeline import VlmPipeline
 source = "https://arxiv.org/pdf/2501.17887"
 ###### USING SIMPLE DEFAULT VALUES
-# - SmolDocling model
+# - GraniteDocling model
 # - Using the transformers framework
 converter = DocumentConverter(
@@ -29,7 +29,7 @@ print(doc.export_to_markdown())
 # For more options see the compare_vlm_models.py example.
 pipeline_options = VlmPipelineOptions(
-    vlm_options=vlm_model_specs.SMOLDOCLING_MLX,
+    vlm_options=vlm_model_specs.GRANITEDOCLING_MLX,
 )
 converter = DocumentConverter(

docs/index.md

@@ -28,7 +28,7 @@ Docling simplifies document processing, parsing diverse formats — including ad
 * 🔒 Local execution capabilities for sensitive data and air-gapped environments
 * 🤖 Plug-and-play [integrations][integrations] incl. LangChain, LlamaIndex, Crew AI & Haystack for agentic AI
 * 🔍 Extensive OCR support for scanned PDFs and images
-* 👓 Support of several Visual Language Models ([SmolDocling](https://huggingface.co/ds4sd/SmolDocling-256M-preview))
+* 👓 Support of several Visual Language Models ([GraniteDocling](https://huggingface.co/ibm-granite/granite-docling-258M))
 * 🎙️ Support for Audio with Automatic Speech Recognition (ASR) models
 * 🔌 Connect to any agent using the [Docling MCP](https://docling-project.github.io/docling/usage/mcp/) server
 * 💻 Simple and convenient CLI

docs/usage/index.md

@@ -31,9 +31,9 @@ You can additionally use Docling directly from your terminal, for instance:
 docling https://arxiv.org/pdf/2206.01062
 ```
-The CLI provides various options, such as 🥚[SmolDocling](https://huggingface.co/ds4sd/SmolDocling-256M-preview) (incl. MLX acceleration) & other VLMs:
+The CLI provides various options, such as 🥚[GraniteDocling](https://huggingface.co/ibm-granite/granite-docling-258M) (incl. MLX acceleration) & other VLMs:
 ```bash
-docling --pipeline vlm --vlm-model smoldocling https://arxiv.org/pdf/2206.01062
+docling --pipeline vlm --vlm-model granitedocling https://arxiv.org/pdf/2206.01062
 ```
 For all available options, run `docling --help` or check the [CLI reference](../reference/cli.md).


@@ -45,6 +45,8 @@ The following table reports the models currently available out-of-the-box.
 | Model instance | Model | Framework | Device | Num pages | Inference time (sec) |
 | ---------------|------ | --------- | ------ | --------- | ---------------------|
+| `vlm_model_specs.GRANITEDOCLING_TRANSFORMERS` | [ibm-granite/granite-docling-258M](https://huggingface.co/ibm-granite/granite-docling-258M) | `Transformers/AutoModelForVision2Seq` | MPS | 1 | - |
+| `vlm_model_specs.GRANITEDOCLING_MLX` | [ibm-granite/granite-docling-258M-mlx-bf16](https://huggingface.co/ibm-granite/granite-docling-258M-mlx-bf16) | `MLX` | MPS | 1 | - |
 | `vlm_model_specs.SMOLDOCLING_TRANSFORMERS` | [ds4sd/SmolDocling-256M-preview](https://huggingface.co/ds4sd/SmolDocling-256M-preview) | `Transformers/AutoModelForVision2Seq` | MPS | 1 | 102.212 |
 | `vlm_model_specs.SMOLDOCLING_MLX` | [ds4sd/SmolDocling-256M-preview-mlx-bf16](https://huggingface.co/ds4sd/SmolDocling-256M-preview-mlx-bf16) | `MLX` | MPS | 1 | 6.15453 |
 | `vlm_model_specs.QWEN25_VL_3B_MLX` | [mlx-community/Qwen2.5-VL-3B-Instruct-bf16](https://huggingface.co/mlx-community/Qwen2.5-VL-3B-Instruct-bf16) | `MLX` | MPS | 1 | 23.4951 |


@@ -83,7 +83,7 @@ nav:
 - "Custom conversion": examples/custom_convert.py
 - "Batch conversion": examples/batch_convert.py
 - "Multi-format conversion": examples/run_with_formats.py
-- "VLM pipeline with SmolDocling": examples/minimal_vlm_pipeline.py
+- "VLM pipeline with GraniteDocling": examples/minimal_vlm_pipeline.py
 - "VLM pipeline with remote model": examples/vlm_pipeline_api_model.py
 - "VLM comparison": examples/compare_vlm_models.py
 - "ASR pipeline with Whisper": examples/minimal_asr_pipeline.py