Merge branch 'dev/mlx' of github.com:DS4SD/docling into dev/mlx

Christoph Auer 2025-03-19 14:56:28 +01:00
commit 39a949df6e
4 changed files with 33 additions and 13 deletions


@ -35,7 +35,7 @@ Docling simplifies document processing, parsing diverse formats — including ad
* 🔒 Local execution capabilities for sensitive data and air-gapped environments
* 🤖 Plug-and-play [integrations][integrations] incl. LangChain, LlamaIndex, Crew AI & Haystack for agentic AI
* 🔍 Extensive OCR support for scanned PDFs and images
* 🥚 Support of Visual Language Models ([SmolDocling](https://huggingface.co/ds4sd/SmolDocling-256M-preview))
* 🥚 Support of Visual Language Models ([SmolDocling](https://huggingface.co/ds4sd/SmolDocling-256M-preview)) 🆕
* 💻 Simple and convenient CLI
### Coming soon
@ -57,7 +57,7 @@ More [detailed installation instructions](https://docling-project.github.io/docl
## Getting started
To convert individual documents, use `convert()`, for example:
To convert individual documents with Python, use `convert()`, for example:
```python
from docling.document_converter import DocumentConverter
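# A minimal sketch of the rest of this README snippet (unchanged context elided
# by the diff); the source URL is illustrative.
source = "https://arxiv.org/pdf/2408.09869"  # document per local path or URL
converter = DocumentConverter()
result = converter.convert(source)
print(result.document.export_to_markdown())  # output: "## Docling Technical Report[...]"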
@ -71,6 +71,21 @@ print(result.document.export_to_markdown()) # output: "## Docling Technical Rep
More [advanced usage options](https://docling-project.github.io/docling/usage/) are available in
the docs.
## CLI
Docling has a built-in CLI to run conversions.
A simple example would look like this:
```bash
docling https://arxiv.org/pdf/2206.01062
```
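To choose export formats or an output directory from the CLI, a sketch like the following should work, assuming the `--to` and `--output` options (run `docling --help` for the authoritative list):
```bash
# Assumed flags: --to selects export formats, --output sets the target directory
docling --to md --to json --output ./scratch https://arxiv.org/pdf/2206.01062
```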
You can also use 🥚[SmolDocling](https://huggingface.co/ds4sd/SmolDocling-256M-preview) via the Docling CLI:
```bash
docling --pipeline vlm --vlm-model smoldocling https://arxiv.org/pdf/2206.01062
```
This will use MLX acceleration on supported Apple Silicon hardware.
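The same VLM pipeline can also be selected programmatically; the following is a rough sketch, assuming the `VlmPipeline`, `VlmPipelineOptions`, and `PdfFormatOption` names from the Docling codebase (see the linked docs for the exact API):
```python
from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import VlmPipelineOptions  # assumed location
from docling.document_converter import DocumentConverter, PdfFormatOption
from docling.pipeline.vlm_pipeline import VlmPipeline  # assumed module path

# Route PDF inputs through the VLM pipeline instead of the standard PDF pipeline.
converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(
            pipeline_cls=VlmPipeline,
            pipeline_options=VlmPipelineOptions(),
        )
    }
)
result = converter.convert("https://arxiv.org/pdf/2206.01062")
print(result.document.export_to_markdown())
```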
Read more [here](https://docling-project.github.io/docling/usage/).
## Documentation
Check out Docling's [documentation](https://docling-project.github.io/docling/), for details on


@ -68,18 +68,13 @@ for source in sources:
    res = converter.convert(source)
    print("------------------------------------------------")
    print("MD:")
    print("------------------------------------------------")
    print("")
    print(res.document.export_to_markdown())
    doctags = ""
    for page in res.pages:
        print("")
        print("Predicted page in DOCTAGS:")
        print(page.predictions.vlm_response.text)
        doctags += page.predictions.vlm_response.text
    res.document.save_as_html(
        filename=Path("{}/{}.html".format(out_path, res.input.file.stem)),
@ -90,14 +85,17 @@ for source in sources:
    with (out_path / f"{res.input.file.stem}.json").open("w") as fp:
        fp.write(json.dumps(res.document.export_to_dict()))
    with (out_path / f"{res.input.file.stem}.md").open("w") as fp:
        fp.write(res.document.export_to_markdown())
    res.document.save_as_json(
        out_path / f"{res.input.file.stem}.json",
        image_mode=ImageRefMode.PLACEHOLDER,
    )
    with (out_path / f"{res.input.file.stem}.doctag").open("w") as fp:
        fp.write(doctags)
    res.document.save_as_markdown(
        out_path / f"{res.input.file.stem}.md",
        image_mode=ImageRefMode.PLACEHOLDER,
    )
    pg_num = res.document.num_pages()
    print("")
    inference_time = time.time() - start_time
    print(


@ -26,7 +26,7 @@ Docling simplifies document processing, parsing diverse formats — including ad
* 🔒 Local execution capabilities for sensitive data and air-gapped environments
* 🤖 Plug-and-play [integrations][integrations] incl. LangChain, LlamaIndex, Crew AI & Haystack for agentic AI
* 🔍 Extensive OCR support for scanned PDFs and images
* 🥚 Support of Visual Language Models ([SmolDocling](https://huggingface.co/ds4sd/SmolDocling-256M-preview))
* 🥚 Support of Visual Language Models ([SmolDocling](https://huggingface.co/ds4sd/SmolDocling-256M-preview)) 🆕
* 💻 Simple and convenient CLI
### Coming soon


@ -26,6 +26,13 @@ To see all available options (export formats etc.) run `docling --help`. More de
### Advanced options
#### SmolDocling via CLI
You can also use 🥚[SmolDocling](https://huggingface.co/ds4sd/SmolDocling-256M-preview) via the Docling CLI:
```bash
docling --pipeline vlm --vlm-model smoldocling https://arxiv.org/pdf/2206.01062
```
This will use MLX acceleration on supported Apple Silicon hardware.
#### Model prefetching and offline usage
By default, models are downloaded automatically upon first usage. If you would prefer