mirror of
https://github.com/DS4SD/docling.git
synced 2025-12-08 20:58:11 +00:00
docs: introduce docs site (#141)
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
This commit is contained in:
29
docs/index.md
Normal file
29
docs/index.md
Normal file
@@ -0,0 +1,29 @@
|
||||
# Docling
|
||||
|
||||
<p align="center">
|
||||
<a href="https://ds4sd.github.io/docling/">
|
||||
<img loading="lazy" alt="Docling" src="assets/logo.png" width="150" />
|
||||
</a>
|
||||
</p>
|
||||
|
||||
|
||||
[](https://arxiv.org/abs/2408.09869)
|
||||
[](https://pypi.org/project/docling/)
|
||||

|
||||
[](https://python-poetry.org/)
|
||||
[](https://github.com/psf/black)
|
||||
[](https://pycqa.github.io/isort/)
|
||||
[](https://pydantic.dev)
|
||||
[](https://github.com/pre-commit/pre-commit)
|
||||
[](https://opensource.org/licenses/MIT)
|
||||
|
||||
Docling bundles PDF document conversion to JSON and Markdown in an easy, self-contained package.
|
||||
|
||||
## Features
|
||||
|
||||
* ⚡ Converts any PDF document to JSON or Markdown format, stable and lightning fast
|
||||
* 📑 Understands detailed page layout, reading order and recovers table structures
|
||||
* 📝 Extracts metadata from the document, such as title, authors, references and language
|
||||
* 🔍 Includes OCR support for scanned PDFs
|
||||
* 🤖 Integrates easily with LLM app / RAG frameworks like LlamaIndex 🦙 & LangChain 🦜🔗
|
||||
* 💻 Provides a simple and convenient CLI
|
||||
Reference in New Issue
Block a user