Mirror of https://github.com/DS4SD/docling.git (synced 2025-12-08 20:58:11 +00:00)
update docs and README
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
docs/examples/minimal_vlm_pipeline.py (4 changes)
@@ -9,7 +9,7 @@ from docling.pipeline.vlm_pipeline import VlmPipeline
 source = "https://arxiv.org/pdf/2501.17887"

 ###### USING SIMPLE DEFAULT VALUES
-# - SmolDocling model
+# - GraniteDocling model
 # - Using the transformers framework

 converter = DocumentConverter(
@@ -29,7 +29,7 @@ print(doc.export_to_markdown())
 # For more options see the compare_vlm_models.py example.

 pipeline_options = VlmPipelineOptions(
-    vlm_options=vlm_model_specs.SMOLDOCLING_MLX,
+    vlm_options=vlm_model_specs.GRANITEDOCLING_MLX,
 )

 converter = DocumentConverter(
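Taken together, the two hunks above amount to a script along the following lines. This is a sketch assembled from the diff, not a verbatim copy of the example file: the exact import paths and the `PdfFormatOption`/`InputFormat` wiring are assumptions based on Docling's usual converter setup, and running it requires `docling` installed (the MLX model spec additionally assumes Apple Silicon).

```python
# Minimal VLM pipeline after this commit: GraniteDocling replaces SmolDocling.
# Sketch assembled from the diff hunks above; import paths are assumptions.

SOURCE = "https://arxiv.org/pdf/2501.17887"  # sample document from the diff context


def convert_with_granitedocling(source: str) -> str:
    # Imports are kept inside the function so this module loads even where
    # docling (and its model weights) are not installed.
    from docling.datamodel import vlm_model_specs
    from docling.datamodel.base_models import InputFormat
    from docling.datamodel.pipeline_options import VlmPipelineOptions
    from docling.document_converter import DocumentConverter, PdfFormatOption
    from docling.pipeline.vlm_pipeline import VlmPipeline

    # Second hunk: select the MLX build of GraniteDocling.
    pipeline_options = VlmPipelineOptions(
        vlm_options=vlm_model_specs.GRANITEDOCLING_MLX,
    )
    converter = DocumentConverter(
        format_options={
            InputFormat.PDF: PdfFormatOption(
                pipeline_cls=VlmPipeline,
                pipeline_options=pipeline_options,
            )
        }
    )
    doc = converter.convert(source=source).document
    return doc.export_to_markdown()


# Calling convert_with_granitedocling(SOURCE) downloads the model weights
# on first use and runs inference, so it is not invoked at import time.
```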
docs/index.md (2 changes)
@@ -28,7 +28,7 @@ Docling simplifies document processing, parsing diverse formats — including ad
 * 🔒 Local execution capabilities for sensitive data and air-gapped environments
 * 🤖 Plug-and-play [integrations][integrations] incl. LangChain, LlamaIndex, Crew AI & Haystack for agentic AI
 * 🔍 Extensive OCR support for scanned PDFs and images
-* 👓 Support of several Visual Language Models ([SmolDocling](https://huggingface.co/ds4sd/SmolDocling-256M-preview))
+* 👓 Support of several Visual Language Models ([GraniteDocling](https://huggingface.co/ibm-granite/granite-docling-258M))
 * 🎙️ Support for Audio with Automatic Speech Recognition (ASR) models
 * 🔌 Connect to any agent using the [Docling MCP](https://docling-project.github.io/docling/usage/mcp/) server
 * 💻 Simple and convenient CLI
docs/usage/index.md (4 changes)
@@ -31,9 +31,9 @@ You can additionally use Docling directly from your terminal, for instance:
 docling https://arxiv.org/pdf/2206.01062
 ```

-The CLI provides various options, such as 🥚[SmolDocling](https://huggingface.co/ds4sd/SmolDocling-256M-preview) (incl. MLX acceleration) & other VLMs:
+The CLI provides various options, such as 🥚[GraniteDocling](https://huggingface.co/ibm-granite/granite-docling-258M) (incl. MLX acceleration) & other VLMs:
 ```bash
-docling --pipeline vlm --vlm-model smoldocling https://arxiv.org/pdf/2206.01062
+docling --pipeline vlm --vlm-model granitedocling https://arxiv.org/pdf/2206.01062
 ```

 For all available options, run `docling --help` or check the [CLI reference](../reference/cli.md).
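When driving the updated CLI from a script, the invocation in the hunk above can be assembled programmatically. This is a hypothetical helper, not part of Docling; it only builds the argument list shown in the diff (the `docling` executable itself must be on PATH to actually run it, e.g. via `subprocess.run`).

```python
import shlex


def docling_vlm_cmd(url: str, model: str = "granitedocling") -> list[str]:
    # Build the CLI invocation from the usage snippet above:
    #   docling --pipeline vlm --vlm-model granitedocling <url>
    return shlex.split(f"docling --pipeline vlm --vlm-model {model} {url}")


# e.g. subprocess.run(docling_vlm_cmd("https://arxiv.org/pdf/2206.01062"))
```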
docs/usage/vision_models.md (2 changes)
@@ -45,6 +45,8 @@ The following table reports the models currently available out-of-the-box.

 | Model instance | Model | Framework | Device | Num pages | Inference time (sec) |
 | ---------------|------ | --------- | ------ | --------- | ---------------------|
+| `vlm_model_specs.GRANITEDOCLING_TRANSFORMERS` | [ibm-granite/granite-docling-258M](https://huggingface.co/ibm-granite/granite-docling-258M) | `Transformers/AutoModelForVision2Seq` | MPS | 1 | - |
+| `vlm_model_specs.GRANITEDOCLING_MLX` | [ibm-granite/granite-docling-258M-mlx-bf16](https://huggingface.co/ibm-granite/granite-docling-258M-mlx-bf16) | `MLX`| MPS | 1 | - |
 | `vlm_model_specs.SMOLDOCLING_TRANSFORMERS` | [ds4sd/SmolDocling-256M-preview](https://huggingface.co/ds4sd/SmolDocling-256M-preview) | `Transformers/AutoModelForVision2Seq` | MPS | 1 | 102.212 |
 | `vlm_model_specs.SMOLDOCLING_MLX` | [ds4sd/SmolDocling-256M-preview-mlx-bf16](https://huggingface.co/ds4sd/SmolDocling-256M-preview-mlx-bf16) | `MLX`| MPS | 1 | 6.15453 |
 | `vlm_model_specs.QWEN25_VL_3B_MLX` | [mlx-community/Qwen2.5-VL-3B-Instruct-bf16](https://huggingface.co/mlx-community/Qwen2.5-VL-3B-Instruct-bf16) | `MLX`| MPS | 1 | 23.4951 |