Merge branch 'dev/mlx' of github.com:DS4SD/docling into dev/mlx

This commit is contained in:
Christoph Auer 2025-03-19 14:56:28 +01:00
commit 39a949df6e
4 changed files with 33 additions and 13 deletions

View File

@@ -35,7 +35,7 @@ Docling simplifies document processing, parsing diverse formats — including ad
 * 🔒 Local execution capabilities for sensitive data and air-gapped environments
 * 🤖 Plug-and-play [integrations][integrations] incl. LangChain, LlamaIndex, Crew AI & Haystack for agentic AI
 * 🔍 Extensive OCR support for scanned PDFs and images
-* 🥚 Support of Visual Language Models ([SmolDocling](https://huggingface.co/ds4sd/SmolDocling-256M-preview))
+* 🥚 Support of Visual Language Models ([SmolDocling](https://huggingface.co/ds4sd/SmolDocling-256M-preview)) 🆕
 * 💻 Simple and convenient CLI

 ### Coming soon
@@ -57,7 +57,7 @@ More [detailed installation instructions](https://docling-project.github.io/docl
 ## Getting started

-To convert individual documents, use `convert()`, for example:
+To convert individual documents with Python, use `convert()`, for example:

 ```python
 from docling.document_converter import DocumentConverter

@@ -71,6 +71,21 @@ print(result.document.export_to_markdown())  # output: "## Docling Technical Rep
 More [advanced usage options](https://docling-project.github.io/docling/usage/) are available in
 the docs.
+## CLI
+
+Docling has a built-in CLI to run conversions.
+A simple example would look like this:
+
+```bash
+docling https://arxiv.org/pdf/2206.01062
+```
+
+You can also use 🥚[SmolDocling](https://huggingface.co/ds4sd/SmolDocling-256M-preview) via the Docling CLI:
+
+```bash
+docling --pipeline vlm --vlm-model smoldocling https://arxiv.org/pdf/2206.01062
+```
+
+This will use MLX acceleration on supported Apple Silicon hardware.
+Read more [here](https://docling-project.github.io/docling/usage/).
 ## Documentation

 Check out Docling's [documentation](https://docling-project.github.io/docling/) for details on

View File

@@ -68,18 +68,13 @@ for source in sources:
     res = converter.convert(source)

-    print("------------------------------------------------")
-    print("MD:")
-    print("------------------------------------------------")
     print("")
     print(res.document.export_to_markdown())

-    doctags = ""
     for page in res.pages:
         print("")
         print("Predicted page in DOCTAGS:")
         print(page.predictions.vlm_response.text)
-        doctags += page.predictions.vlm_response.text

     res.document.save_as_html(
         filename=Path("{}/{}.html".format(out_path, res.input.file.stem)),
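The `"{}/{}.html".format(...)` expression above only joins the output directory with the input file's stem; a stdlib-only sketch with stand-in names:

```python
from pathlib import Path

out_path = Path("scratch")           # stand-in for the script's out_path
input_file = Path("2206.01062.pdf")  # stand-in for res.input.file
html_name = Path("{}/{}.html".format(out_path, input_file.stem))
print(html_name.as_posix())  # scratch/2206.01062.html
```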
@@ -90,14 +85,17 @@ for source in sources:
     with (out_path / f"{res.input.file.stem}.json").open("w") as fp:
         fp.write(json.dumps(res.document.export_to_dict()))

-    with (out_path / f"{res.input.file.stem}.md").open("w") as fp:
-        fp.write(res.document.export_to_markdown())
+    res.document.save_as_json(
+        out_path / f"{res.input.file.stem}.json",
+        image_mode=ImageRefMode.PLACEHOLDER,
+    )

-    with (out_path / f"{res.input.file.stem}.doctag").open("w") as fp:
-        fp.write(doctags)
+    res.document.save_as_markdown(
+        out_path / f"{res.input.file.stem}.md",
+        image_mode=ImageRefMode.PLACEHOLDER,
+    )
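The JSON branch above is a plain dict-to-file dump; the pattern round-trips with the standard library alone (stand-in dict in place of `res.document.export_to_dict()`):

```python
import json
import tempfile
from pathlib import Path

doc_dict = {"schema_name": "DoclingDocument", "texts": []}  # stand-in dict
out_path = Path(tempfile.mkdtemp())
stem = "2206.01062"  # stand-in for res.input.file.stem

# Write the dict as JSON next to the other per-document outputs.
with (out_path / f"{stem}.json").open("w") as fp:
    fp.write(json.dumps(doc_dict))

# Reading it back recovers the original dict.
loaded = json.loads((out_path / f"{stem}.json").read_text())
```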
     pg_num = res.document.num_pages()
     print("")

     inference_time = time.time() - start_time
     print(
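The `inference_time` bookkeeping follows the usual wall-clock pattern (a minimal sketch; in the full script `start_time` is set before the conversion loop):

```python
import time

start_time = time.time()
# ... conversion work would run here ...
inference_time = time.time() - start_time
print(f"Inference time: {inference_time:.2f} sec.")
```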

View File

@@ -26,7 +26,7 @@ Docling simplifies document processing, parsing diverse formats — including ad
 * 🔒 Local execution capabilities for sensitive data and air-gapped environments
 * 🤖 Plug-and-play [integrations][integrations] incl. LangChain, LlamaIndex, Crew AI & Haystack for agentic AI
 * 🔍 Extensive OCR support for scanned PDFs and images
-* 🥚 Support of Visual Language Models ([SmolDocling](https://huggingface.co/ds4sd/SmolDocling-256M-preview))
+* 🥚 Support of Visual Language Models ([SmolDocling](https://huggingface.co/ds4sd/SmolDocling-256M-preview)) 🆕
 * 💻 Simple and convenient CLI

 ### Coming soon

View File

@@ -26,6 +26,13 @@ To see all available options (export formats etc.) run `docling --help`. More de
 ### Advanced options

+#### SmolDocling via CLI
+
+You can also use 🥚[SmolDocling](https://huggingface.co/ds4sd/SmolDocling-256M-preview) via the Docling CLI:
+
+```bash
+docling --pipeline vlm --vlm-model smoldocling https://arxiv.org/pdf/2206.01062
+```
+
+This will use MLX acceleration on supported Apple Silicon hardware.
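Whether a machine falls into that hardware class can be probed from Python (a stdlib sketch; the actual accelerator selection happens inside docling/MLX):

```python
import platform

# Rough check for the Apple Silicon class mentioned above; docling/MLX
# make the real accelerator decision internally.
is_apple_silicon = platform.system() == "Darwin" and platform.machine() == "arm64"
print(is_apple_silicon)
```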
 #### Model prefetching and offline usage

 By default, models are downloaded automatically upon first usage. If you would prefer