# Docling

[![arXiv](https://img.shields.io/badge/arXiv-2408.09869-b31b1b.svg)](https://arxiv.org/abs/2408.09869)
[![PyPI version](https://img.shields.io/pypi/v/docling)](https://pypi.org/project/docling/)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/docling)](https://pypi.org/project/docling/)
[![Poetry](https://img.shields.io/endpoint?url=https://python-poetry.org/badge/v0.json)](https://python-poetry.org/)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![Imports: isort](https://img.shields.io/badge/%20imports-isort-%231674b1?style=flat&labelColor=ef8336)](https://pycqa.github.io/isort/)
[![Pydantic v2](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/pydantic/pydantic/main/docs/badge/v2.json)](https://pydantic.dev)
[![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit&logoColor=white)](https://github.com/pre-commit/pre-commit)
[![License MIT](https://img.shields.io/github/license/DS4SD/docling)](https://opensource.org/licenses/MIT)
[![PyPI Downloads](https://static.pepy.tech/badge/docling/month)](https://pepy.tech/projects/docling)

Docling parses documents and exports them to the desired format with ease and speed.

## Features

* 🗂️ Parsing of [multiple document formats][supported_formats], including PDF, DOCX, XLSX, HTML, images, and more
* 📑 Advanced PDF understanding, including page layout, reading order, and table structure
* 🧬 Unified, expressive [DoclingDocument][docling_document] representation format
* ↪️ Various [export formats][supported_formats] and options, including Markdown, HTML, and lossless JSON
* 🔒 Local execution capabilities for sensitive data and air-gapped environments
* 🤖 Plug-and-play [integrations][integrations], including LangChain, LlamaIndex, Crew AI, and Haystack for agentic AI
* 🔍 OCR support for scanned PDFs and images
* 💻 Simple and convenient CLI

### Coming soon

* ♾️ Equation & code extraction
* 📝 Metadata extraction, including title, authors, references & language

## Get started

* Concepts: Learn Docling fundamentals
* Examples: Try out recipes for various use cases, including conversion, RAG, and more
* Integrations: Check out integrations with popular frameworks and tools
* Reference: See more API details
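
As a quick starting point, the parse-and-export flow described in the feature list can be sketched in a few lines of Python. This is a minimal sketch rather than a complete recipe: it assumes a recent Docling v2 release installed via `pip install docling`, and the helper name `convert_to_markdown` is hypothetical, introduced here only for illustration.

```python
import sys


def convert_to_markdown(source: str) -> str:
    """Convert a document (local path or URL) to Markdown using Docling."""
    # Imported inside the function so this module still loads on machines
    # where docling is not installed yet (assumption: `pip install docling`).
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    # convert() parses the input and returns a conversion result whose
    # `.document` attribute holds the unified DoclingDocument.
    result = converter.convert(source)
    return result.document.export_to_markdown()


# Usage: python convert.py <path-or-URL>
if __name__ == "__main__" and len(sys.argv) > 1:
    print(convert_to_markdown(sys.argv[1]))
```

The first conversion may take a while because Docling typically needs to fetch its models before running. Other export formats, such as HTML or lossless JSON, are covered in the Reference section.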

## IBM ❤️ Open Source AI

Docling has been brought to you by IBM.

[supported_formats]: ./supported_formats.md
[docling_document]: ./concepts/docling_document.md
[integrations]: ./integrations/index.md