docs: improve examples (#27)

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Michele Dolfi
2024-08-07 17:16:35 +02:00
committed by GitHub
parent 20cbe7c24a
commit 9550db8e64
5 changed files with 139 additions and 25 deletions


@@ -56,17 +56,21 @@ print(doc.export_to_markdown()) # output: "## DocLayNet: A Large Human-Annotate
### Convert a batch of documents
-For an example of batch-converting documents, see [convert.py](https://github.com/DS4SD/docling/blob/main/examples/convert.py).
+For an example of batch-converting documents, see [batch_convert.py](https://github.com/DS4SD/docling/blob/main/examples/batch_convert.py).
From a local repo clone, you can run it with:
```
-python examples/convert.py
+python examples/batch_convert.py
```
The output of the above command will be written to `./scratch`.
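To make the batch pattern concrete, here is a minimal sketch of a loop that converts several documents and writes one Markdown file per input into `./scratch`. It is not the contents of `batch_convert.py`: the `batch_convert` function and its `convert_one` parameter are hypothetical names introduced for illustration, and `convert_one` stands in for whatever call actually performs the conversion.

```python
from pathlib import Path
from typing import Callable, Iterable


def batch_convert(
    inputs: Iterable[Path],
    convert_one: Callable[[Path], str],
    output_dir: Path = Path("./scratch"),
) -> list[Path]:
    """Convert each input and write one Markdown file per document.

    `convert_one` is a hypothetical stand-in for the real conversion call
    (assumption: it takes a path and returns Markdown text). It is not
    part of docling's API.
    """
    output_dir.mkdir(parents=True, exist_ok=True)
    written: list[Path] = []
    for src in inputs:
        try:
            markdown = convert_one(src)
        except Exception as exc:
            # One failing document should not abort the whole batch.
            print(f"Skipping {src}: {exc}")
            continue
        dest = output_dir / f"{src.stem}.md"
        dest.write_text(markdown, encoding="utf-8")
        written.append(dest)
    return written
```

Passing the converter in as a callable keeps the loop independent of any particular library version, and catching errors per document lets the rest of the batch finish.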
### Adjust pipeline features
The example file [custom_convert.py](https://github.com/DS4SD/docling/blob/main/examples/custom_convert.py) contains multiple ways
one can adjust the conversion pipeline and features.
#### Control pipeline options
You can control if table structure recognition or OCR should be performed by arguments passed to `DocumentConverter`: