update README

Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
Panos Vagenas 2024-07-25 22:58:05 +02:00
parent b9fd50e7de
commit 33d5d7d787


@@ -37,12 +37,28 @@ pip install docling
To develop for Docling, you need Python 3.10 / 3.11 / 3.12 and Poetry. You can then install from your local clone's root dir:
```bash
poetry install --all-extras
```
## Usage
### Convert a single document
To convert individual PDF documents, use `convert_single()`, for example:
```python
from docling.document_converter import DocumentConverter
source = "https://arxiv.org/pdf/2206.01062" # PDF path or URL
converter = DocumentConverter()
doc = converter.convert_single(source)
print(doc.export_to_markdown()) # output: "## DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis [...]"
```
### Convert a batch of documents
For an example of converting multiple documents, see [convert.py](https://github.com/DS4SD/docling/blob/main/examples/convert.py).
From a local repo clone, you can run it with:
```
python examples/convert.py
```
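The linked `convert.py` remains the reference for batch conversion. As a minimal sketch only, the `convert_single()` call shown earlier can also be looped over several sources. The helper names `make_docling_converter` and `convert_many`, the output file naming, and the error handling below are illustrative assumptions, not part of Docling's API.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, Tuple


def make_docling_converter() -> Callable[[str], str]:
    """Return a function that converts one PDF path or URL to Markdown."""
    # Imported lazily so convert_many() can be exercised without Docling installed.
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()  # created once and reused for every source

    def to_markdown(source: str) -> str:
        return converter.convert_single(source).export_to_markdown()

    return to_markdown


def convert_many(
    sources: Iterable[str],
    to_markdown: Callable[[str], str],
    out_dir: str,
) -> Tuple[Dict[str, Path], Dict[str, str]]:
    """Convert each source in turn (hypothetical helper, not a Docling function).

    Returns (written, failed): output paths for the sources that converted,
    and error messages for the ones that raised.
    """
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    written: Dict[str, Path] = {}
    failed: Dict[str, str] = {}
    for source in sources:
        # Name the output after the last path or URL segment (assumed layout).
        name = source.rstrip("/").rsplit("/", 1)[-1] or "document"
        try:
            markdown = to_markdown(source)
        except Exception as exc:  # keep going if one document fails
            failed[source] = str(exc)
            continue
        out_path = target / f"{name}.md"
        out_path.write_text(markdown, encoding="utf-8")
        written[source] = out_path
    return written, failed
```

With Docling installed, `convert_many(["https://arxiv.org/pdf/2206.01062"], make_docling_converter(), "out")` would write `out/2206.01062.md`. Sources that share a final path segment overwrite each other in this sketch.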