mirror of
https://github.com/DS4SD/docling.git
synced 2025-12-08 12:48:28 +00:00
* adding granite-docling preview Signed-off-by: Peter Staar <taa@zurich.ibm.com> * updated the model specs Signed-off-by: Peter Staar <taa@zurich.ibm.com> * typo Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use granite-docling and add to the model downloader Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * update docs and README Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * Update final repo_ids for GraniteDocling Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Update final repo_ids for GraniteDocling Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Fix model name in CLI usage example Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com> * Fix VLM model name in README.md Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com> --------- Signed-off-by: Peter Staar <taa@zurich.ibm.com> Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> Signed-off-by: Christoph Auer <cau@zurich.ibm.com> Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com> Co-authored-by: Peter Staar <taa@zurich.ibm.com> Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
4.7 KiB
Vendored
4.7 KiB
Vendored
Docling simplifies document processing, parsing diverse formats — including advanced PDF understanding — and providing seamless integrations with the gen AI ecosystem.
Features
- 🗂️ Parsing of multiple document formats incl. PDF, DOCX, PPTX, XLSX, HTML, WAV, MP3, images (PNG, TIFF, JPEG, ...), and more
- 📑 Advanced PDF understanding incl. page layout, reading order, table structure, code, formulas, image classification, and more
- 🧬 Unified, expressive DoclingDocument representation format
- ↪️ Various export formats and options, including Markdown, HTML, DocTags and lossless JSON
- 🔒 Local execution capabilities for sensitive data and air-gapped environments
- 🤖 Plug-and-play integrations incl. LangChain, LlamaIndex, Crew AI & Haystack for agentic AI
- 🔍 Extensive OCR support for scanned PDFs and images
- 👓 Support of several Visual Language Models (GraniteDocling)
- 🎙️ Support for Audio with Automatic Speech Recognition (ASR) models
- 🔌 Connect to any agent using the Docling MCP server
- 💻 Simple and convenient CLI
What's new
- 📤 Structured [information extraction][extraction] [🧪 beta]
- 📑 New layout model (Heron) by default, for faster PDF parsing
- 🔌 MCP server for agentic applications
Coming soon
- 📝 Metadata extraction, including title, authors, references & language
- 📝 Chart understanding (Barchart, Piechart, LinePlot, etc)
- 📝 Complex chemistry understanding (Molecular structures)
- 📝 Parsing of Web Video Text Tracks (WebVTT) files
Get started
Check out our getting started page to get the ball rolling!
Live assistant
Do you want to leverage the power of AI and get live support on Docling? Try out the Chat with Dosu functionalities provided by our friends at Dosu.
LF AI & Data
Docling is hosted as a project in the LF AI & Data Foundation.
IBM ❤️ Open Source AI
The project was started by the AI for knowledge team at IBM Research Zurich.
