mirror of https://github.com/DS4SD/docling.git
synced 2025-12-08 20:58:11 +00:00
fix: updated the render_as_doctags with the new arguments from docling-core (#93)
* updated the render_as_doctags with the new arguments from docling-core
  Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* ensuring that docling-core is >1.5.0 to accommodate the latest export-to-doctags parameters
  Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* added the doctags tests
  Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* updated the README
  Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* fix poetry lock
  Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* Fix formatting problems
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* fixed the doctag export in docling/utils/export.py
  Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* propagate xsize and ysize
  Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

---------

Signed-off-by: Peter Staar <taa@zurich.ibm.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
committed by GitHub
parent dce9934a0f
commit 4794ce460a
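
The commit message requires docling-core newer than 1.5.0 so that the new export-to-doctags parameters are available. Below is a minimal sketch of a runtime guard for that requirement, assuming plain numeric release strings such as `1.5.1`. The helper names `parse_version`, `is_newer_than`, and `docling_core_supports_new_doctags` are hypothetical and not part of docling; the project itself pins the dependency through Poetry rather than checking at runtime.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a release string such as '1.5.0' into (1, 5, 0).

    Parsing stops at the first component that is not purely numeric,
    so a pre-release like '1.5.0rc1' becomes (1, 5).
    """
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_newer_than(installed: str, floor: str = "1.5.0") -> bool:
    """Return True when `installed` is strictly newer than `floor`."""
    return parse_version(installed) > parse_version(floor)


def docling_core_supports_new_doctags() -> bool:
    """Check the installed docling-core distribution (hypothetical helper).

    Returns False when docling-core is not installed at all.
    """
    try:
        return is_newer_than(version("docling-core"))
    except PackageNotFoundError:
        return False
```

A caller could run `docling_core_supports_new_doctags()` before invoking `render_as_doctags()` and fall back to the Markdown export when it returns `False`.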
@@ -70,7 +70,9 @@ from docling.document_converter import DocumentConverter
source = "https://arxiv.org/pdf/2408.09869"  # PDF path or URL
converter = DocumentConverter()
result = converter.convert_single(source)

print(result.render_as_markdown())  # output: "## Docling Technical Report[...]"
print(result.render_as_doctags())  # output: "<document><title><page_1><loc_20>..."
```

### Convert a batch of documents