Mirror of https://github.com/DS4SD/docling.git (synced 2025-07-27 20:44:16 +00:00)
Docling
Docling bundles PDF document conversion to JSON and Markdown in an easy, self-contained package.
Features

- Converts any PDF document to JSON or Markdown format, stable and lightning fast.
- Understands detailed page layout and reading order, and recovers table structures.
- Extracts metadata from the document, such as title, authors, references and language.
- Includes OCR support for scanned PDFs.
- Integrates easily with LLM app / RAG frameworks like LlamaIndex and LangChain.
- Provides a simple and convenient CLI.
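To make the PDF-to-Markdown feature concrete, here is a minimal sketch of a conversion helper. It assumes the Python API of Docling 2.x (`DocumentConverter.convert` returning a result whose `document` has `export_to_markdown`); older releases used different method names, so check the version you have installed. The helper name `pdf_to_markdown` and its `converter` parameter are illustrative and not part of Docling.

```python
def pdf_to_markdown(source, converter=None):
    """Convert one PDF (local path or URL) to a Markdown string.

    `converter` is a hypothetical injection point: any object with a
    `convert(source)` method whose result exposes
    `.document.export_to_markdown()`. If omitted, a Docling converter
    is created.
    """
    if converter is None:
        # Assumption: Docling 2.x is installed and exposes this class.
        # Imported lazily so this module loads even without Docling.
        from docling.document_converter import DocumentConverter

        converter = DocumentConverter()

    result = converter.convert(source)
    return result.document.export_to_markdown()
```

With Docling installed, calling `pdf_to_markdown("paper.pdf")` (a placeholder file name) should return the document as Markdown text, which can then be written to disk or passed on to a RAG pipeline. Accepting the converter as a parameter lets one instance be reused across many files rather than constructing a new one per call.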