mirror of
https://github.com/DS4SD/docling.git
synced 2025-08-01 15:02:21 +00:00
docs: add explicit artifacts path example
[skip ci] Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
This commit is contained in:
parent
244ca69cfd
commit
aa187552e5
@ -108,6 +108,30 @@ doc_converter = DocumentConverter(
|
||||
)
|
||||
```
|
||||
|
||||
##### Provide specific artifacts path
|
||||
|
||||
By default, artifacts such as models are downloaded automatically upon first usage. If you would prefer to use a local path where the artifacts have been explicitly prefetched, you can do that as follows:
|
||||
|
||||
```python
|
||||
from docling.datamodel.base_models import InputFormat
|
||||
from docling.datamodel.pipeline_options import PdfPipelineOptions
|
||||
from docling.document_converter import DocumentConverter, PdfFormatOption
|
||||
from docling.pipeline.standard_pdf_pipeline import StandardPdfPipeline
|
||||
|
||||
#to explicitly prefetch:
|
||||
# artifacts_path = StandardPdfPipeline.download_models_hf()
|
||||
|
||||
artifacts_path = "/local/path/to/artifacts"
|
||||
|
||||
pipeline_options = PdfPipelineOptions(artifacts_path=artifacts_path)
|
||||
doc_converter = DocumentConverter(
|
||||
format_options={
|
||||
InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options)
|
||||
}
|
||||
)
|
||||
```
|
||||
|
||||
|
||||
#### Impose limits on the document size
|
||||
|
||||
You can limit the file size and number of pages which should be allowed to process per document:
|
||||
|
Loading…
Reference in New Issue
Block a user