mirror of
https://github.com/DS4SD/docling.git
synced 2025-12-08 12:48:28 +00:00
fix: refine conversion result (#52)
- fields `output` & `assembled` need not be optional
- introduced "synonym" `ConversionResult` for `ConvertedDocument` & deprecated the latter

Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
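The "synonym plus deprecation" pattern named in the commit message can be illustrated with a minimal sketch. This is not the actual docling implementation: the field types, the default values, and the use of a subclass that emits a `DeprecationWarning` are all assumptions chosen for illustration. Only the class names `ConversionResult` and `ConvertedDocument` and the field names `output` and `assembled` come from the commit message.

```python
import warnings
from dataclasses import dataclass, field


@dataclass
class ConversionResult:
    # Hypothetical stand-in types: both fields always hold a value
    # (empty by default) instead of being Optional[...] = None.
    output: dict = field(default_factory=dict)
    assembled: list = field(default_factory=list)


class ConvertedDocument(ConversionResult):
    """Deprecated synonym kept so that existing callers keep working."""

    def __init__(self, *args, **kwargs):
        # Assumed mechanism: warn on construction, then behave exactly
        # like ConversionResult.
        warnings.warn(
            "ConvertedDocument is deprecated; use ConversionResult instead",
            DeprecationWarning,
            stacklevel=2,
        )
        super().__init__(*args, **kwargs)
```

Under these assumptions, code that still constructs `ConvertedDocument` gets an object that passes `isinstance(obj, ConversionResult)`, so callers can migrate to the new name gradually.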
@@ -49,10 +49,10 @@ To convert individual PDF documents, use `convert_single()`, for example:
 ```python
 from docling.document_converter import DocumentConverter
 
-source = "https://arxiv.org/pdf/2206.01062" # PDF path or URL
+source = "https://arxiv.org/pdf/2408.09869" # PDF path or URL
 converter = DocumentConverter()
-doc = converter.convert_single(source)
-print(doc.render_as_markdown()) # output: "## DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis [...]"
+result = converter.convert_single(source)
+print(result.render_as_markdown()) # output: "## Docling Technical Report[...]"
 ```
 
 ### Convert a batch of documents
@@ -118,7 +118,7 @@ You can convert PDFs from a binary stream instead of from the filesystem as foll
 buf = BytesIO(your_binary_stream)
 docs = [DocumentStream(filename="my_doc.pdf", stream=buf)]
 conv_input = DocumentConversionInput.from_streams(docs)
-converted_docs = doc_converter.convert(conv_input)
+results = doc_converter.convert(conv_input)
 ```
 ### Limit resource usage