From 704393613f32cbb875657eb5a965b74c9b57cb3b Mon Sep 17 00:00:00 2001
From: Christoph Auer
Date: Wed, 30 Oct 2024 14:14:00 +0100
Subject: [PATCH] Rewrite feature items on README

Signed-off-by: Christoph Auer
---
 README.md     | 6 +++---
 docs/index.md | 4 ++--
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index ae923ccc..c3af0f79 100644
--- a/README.md
+++ b/README.md
@@ -22,9 +22,9 @@ Docling parses documents and exports them to the desired format with ease and sp
 
 ## Features
 
-* 🗂️ Multi-format support for input (PDF, DOCX, PPTX, Bitmap images, HTML, AsciiDoc, MarkDown) and output (Markdown, JSON, YAML)
-* 📑 Advanced PDF document understanding incl. page layout, reading order & table structures
-* 🧩 Strongly typed Pydantic v2 data structure named [DoclingDocument](https://ds4sd.github.io/docling/concepts/docling_document/) which supports hierarchies and provides native iterators and chunkers.
+* 🗂️ Reads popular document formats (PDF, DOCX, PPTX, Images, HTML, AsciiDoc, Markdown) and exports to Markdown and JSON
+* 📑 Advanced PDF document understanding including page layout, reading order & table structures
+* 🧩 Unified, expressive [DoclingDocument](https://ds4sd.github.io/docling/concepts/docling_document/) representation format
 * 📝 Metadata extraction, including title, authors, references & language
 * 🤖 Seamless LlamaIndex 🦙 & LangChain 🦜🔗 integration for powerful RAG / QA applications
 * 🔍 OCR support for scanned PDFs
diff --git a/docs/index.md b/docs/index.md
index d561f490..68cdd12a 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -19,9 +19,9 @@ Docling parses documents and exports them to the desired format with ease and sp
 
 ## Features
 
-* 🗂️ Multi-format support for input (PDF, DOCX, PPTX, Bitmap images, HTML, AsciiDoc, MarkDown) and output (Markdown, JSON, YAML)
+* 🗂️ Reads popular document formats (PDF, DOCX, PPTX, Images, HTML, AsciiDoc, Markdown) and exports to Markdown and JSON
 * 📑 Advanced PDF document understanding incl. page layout, reading order & table structures
-* 🧩 Strongly typed Pydantic v2 data structure named [DoclingDocument](./concepts/docling_document.md) which supports hierarchies and provides native iterators and chunkers.
+* 🧩 Unified, expressive [DoclingDocument](./concepts/docling_document.md) representation format
 * 📝 Metadata extraction, including title, authors, references & language
 * 🤖 Seamless LlamaIndex 🦙 & LangChain 🦜🔗 integration for powerful RAG / QA applications
 * 🔍 OCR support for scanned PDFs