diff --git a/README.md b/README.md
index cbd15460..69eccd7c 100644
--- a/README.md
+++ b/README.md
@@ -11,7 +11,7 @@

 [![arXiv](https://img.shields.io/badge/arXiv-2408.09869-b31b1b.svg)](https://arxiv.org/abs/2408.09869)
-[![Docs](https://img.shields.io/badge/docs-live-brightgreen)](https://ds4sd.github.io/docling/)
+[![Docs](https://img.shields.io/badge/docs-live-brightgreen)](https://docling-project.github.io/docling/)
 [![PyPI version](https://img.shields.io/pypi/v/docling)](https://pypi.org/project/docling/)
 [![PyPI - Python Version](https://img.shields.io/pypi/pyversions/docling)](https://pypi.org/project/docling/)
 [![Poetry](https://img.shields.io/endpoint?url=https://python-poetry.org/badge/v0.json)](https://python-poetry.org/)
@@ -51,7 +51,7 @@ pip install docling

 Works on macOS, Linux and Windows environments. Both x86_64 and arm64 architectures.

-More [detailed installation instructions](https://ds4sd.github.io/docling/installation/) are available in the docs.
+More [detailed installation instructions](https://docling-project.github.io/docling/installation/) are available in the docs.

 ## Getting started

@@ -66,23 +66,23 @@ result = converter.convert(source)
 print(result.document.export_to_markdown())  # output: "## Docling Technical Report[...]"
 ```

-More [advanced usage options](https://ds4sd.github.io/docling/usage/) are available in
+More [advanced usage options](https://docling-project.github.io/docling/usage/) are available in
 the docs.

 ## Documentation

-Check out Docling's [documentation](https://ds4sd.github.io/docling/), for details on
+Check out Docling's [documentation](https://docling-project.github.io/docling/), for details on
 installation, usage, concepts, recipes, extensions, and more.

 ## Examples

-Go hands-on with our [examples](https://ds4sd.github.io/docling/examples/),
+Go hands-on with our [examples](https://docling-project.github.io/docling/examples/),
 demonstrating how to address different application use cases with Docling.

 ## Integrations

 To further accelerate your AI application development, check out Docling's native
-[integrations](https://ds4sd.github.io/docling/integrations/) with popular frameworks
+[integrations](https://docling-project.github.io/docling/integrations/) with popular frameworks
 and tools.

 ## Get help and support
@@ -123,6 +123,6 @@ For individual model usage, please refer to the model licenses found in the orig

 Docling has been brought to you by IBM.

-[supported_formats]: https://ds4sd.github.io/docling/usage/supported_formats/
-[docling_document]: https://ds4sd.github.io/docling/concepts/docling_document/
-[integrations]: https://ds4sd.github.io/docling/integrations/
+[supported_formats]: https://docling-project.github.io/docling/usage/supported_formats/
+[docling_document]: https://docling-project.github.io/docling/concepts/docling_document/
+[integrations]: https://docling-project.github.io/docling/integrations/
diff --git a/docling/cli/models.py b/docling/cli/models.py
index cc4a43ac..7bc313c1 100644
--- a/docling/cli/models.py
+++ b/docling/cli/models.py
@@ -121,7 +121,7 @@ def download(
         "Using the CLI:",
         f"`docling --artifacts-path={output_dir} FILE`",
         "\n",
-        "Using Python: see the documentation at .",
+        "Using Python: see the documentation at .",
     )


diff --git a/docling/models/ocr_mac_model.py b/docling/models/ocr_mac_model.py
index 38bcf1ca..1860a262 100644
--- a/docling/models/ocr_mac_model.py
+++ b/docling/models/ocr_mac_model.py
@@ -26,7 +26,7 @@ class OcrMacModel(BaseOcrModel):
                 "ocrmac is not correctly installed. "
                 "Please install it via `pip install ocrmac` to use this OCR engine. "
                 "Alternatively, Docling has support for other OCR engines. See the documentation: "
-                "https://ds4sd.github.io/docling/installation/"
+                "https://docling-project.github.io/docling/installation/"
             )
             try:
                 from ocrmac import ocrmac
diff --git a/docling/models/tesseract_ocr_model.py b/docling/models/tesseract_ocr_model.py
index c41806f5..e06cddc2 100644
--- a/docling/models/tesseract_ocr_model.py
+++ b/docling/models/tesseract_ocr_model.py
@@ -31,14 +31,14 @@ class TesseractOcrModel(BaseOcrModel):
                 "Note that tesserocr might have to be manually compiled for working with "
                 "your Tesseract installation. The Docling documentation provides examples for it. "
                 "Alternatively, Docling has support for other OCR engines. See the documentation: "
-                "https://ds4sd.github.io/docling/installation/"
+                "https://docling-project.github.io/docling/installation/"
             )
             missing_langs_errmsg = (
                 "tesserocr is not correctly configured. No language models have been detected. "
                 "Please ensure that the TESSDATA_PREFIX envvar points to tesseract languages dir. "
                 "You can find more information how to setup other OCR engines in Docling "
                 "documentation: "
-                "https://ds4sd.github.io/docling/installation/"
+                "https://docling-project.github.io/docling/installation/"
             )

             try:
diff --git a/docs/examples/backend_xml_rag.ipynb b/docs/examples/backend_xml_rag.ipynb
index 3bba3e81..091f116d 100644
--- a/docs/examples/backend_xml_rag.ipynb
+++ b/docs/examples/backend_xml_rag.ipynb
@@ -36,7 +36,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "This is an example of using [Docling](https://ds4sd.github.io/docling/) for converting structured data (XML) into a unified document\n",
+    "This is an example of using [Docling](https://docling-project.github.io/docling/) for converting structured data (XML) into a unified document\n",
     "representation format, `DoclingDocument`, and leverage its riched structured content for RAG applications.\n",
     "\n",
     "Data used in this example consist of patents from the [United States Patent and Trademark Office (USPTO)](https://www.uspto.gov/) and medical\n",
diff --git a/docs/examples/hybrid_chunking.ipynb b/docs/examples/hybrid_chunking.ipynb
index 6a5f5882..2f6d9457 100644
--- a/docs/examples/hybrid_chunking.ipynb
+++ b/docs/examples/hybrid_chunking.ipynb
@@ -103,7 +103,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "> 👉 **NOTE**: As you see above, using the `HybridChunker` can sometimes lead to a warning from the transformers library, however this is a \"false alarm\" — for details check [here](https://ds4sd.github.io/docling/faq/#hybridchunker-triggers-warning-token-indices-sequence-length-is-longer-than-the-specified-maximum-sequence-length-for-this-model)."
+    "> 👉 **NOTE**: As you see above, using the `HybridChunker` can sometimes lead to a warning from the transformers library, however this is a \"false alarm\" — for details check [here](https://docling-project.github.io/docling/faq/#hybridchunker-triggers-warning-token-indices-sequence-length-is-longer-than-the-specified-maximum-sequence-length-for-this-model)."
    ]
   },
   {
diff --git a/docs/examples/rag_azuresearch.ipynb b/docs/examples/rag_azuresearch.ipynb
index dcfd19e3..9f867b1d 100644
--- a/docs/examples/rag_azuresearch.ipynb
+++ b/docs/examples/rag_azuresearch.ipynb
@@ -36,7 +36,7 @@
     "## A recipe 🧑‍🍳 🐥 💚\n",
     "\n",
     "This notebook demonstrates how to build a Retrieval-Augmented Generation (RAG) system using:\n",
-    "- [Docling](https://ds4sd.github.io/docling/) for document parsing and chunking\n",
+    "- [Docling](https://docling-project.github.io/docling/) for document parsing and chunking\n",
     "- [Azure AI Search](https://azure.microsoft.com/products/ai-services/ai-search/?msockid=0109678bea39665431e37323ebff6723) for vector indexing and retrieval\n",
     "- [Azure OpenAI](https://azure.microsoft.com/products/ai-services/openai-service?msockid=0109678bea39665431e37323ebff6723) for embeddings and chat completion\n",
     "\n",
diff --git a/docs/examples/rag_weaviate.ipynb b/docs/examples/rag_weaviate.ipynb
index 6f033e11..7c020f43 100644
--- a/docs/examples/rag_weaviate.ipynb
+++ b/docs/examples/rag_weaviate.ipynb
@@ -29,7 +29,7 @@
     "\n",
     "## A recipe 🧑‍🍳 🐥 💚\n",
     "\n",
-    "This is a code recipe that uses [Weaviate](https://weaviate.io/) to perform RAG over PDF documents parsed by [Docling](https://ds4sd.github.io/docling/).\n",
+    "This is a code recipe that uses [Weaviate](https://weaviate.io/) to perform RAG over PDF documents parsed by [Docling](https://docling-project.github.io/docling/).\n",
     "\n",
     "In this notebook, we accomplish the following:\n",
     "* Parse the top machine learning papers on [arXiv](https://arxiv.org/) using Docling\n",
diff --git a/mkdocs.yml b/mkdocs.yml
index 816e9851..8f2adf24 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -1,5 +1,5 @@
 site_name: Docling
-site_url: https://ds4sd.github.io/docling/
+site_url: https://docling-project.github.io/docling/
 repo_name: docling-project/docling
 repo_url: https://github.com/docling-project/docling

diff --git a/tests/data/groundtruth/docling_v2/word_tables.docx.html b/tests/data/groundtruth/docling_v2/word_tables.docx.html
index 2dc087f7..73b3335e 100644
--- a/tests/data/groundtruth/docling_v2/word_tables.docx.html
+++ b/tests/data/groundtruth/docling_v2/word_tables.docx.html
@@ -2,7 +2,7 @@
+ href="https://docling-project.github.io/docling/assets/logo.png"/>
 Powered by Docling