docling/docling
Nikos Livathinos c6b3763ecb
feat(OCR): Introduce the OcrOptions.force_full_page_ocr parameter that forces a full page OCR scanning (#290)
- When the OCR is forced, any existing PDF cells are rejected.
- Introduce the force-ocr cmd parameter in docling CLI.
- Update unit tests.
- Add the full_page_ocr.py example in mkdocs.

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
2024-11-12 09:46:14 +01:00
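The commit message above says that forcing OCR rejects any existing PDF cells. The following is a minimal sketch of that selection rule, not docling's implementation. Only the name `force_full_page_ocr` comes from the commit. The `Page` model, the `select_cells` helper, and the fallback behaviour when OCR is not forced are hypothetical, chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OcrOptions:
    # Parameter name taken from the commit; this class is only a stand-in.
    force_full_page_ocr: bool = False


@dataclass
class Page:
    # Hypothetical model: cells read from the PDF's embedded text layer.
    pdf_cells: List[str] = field(default_factory=list)
    # Hypothetical model: cells an OCR engine would return for the full page.
    ocr_cells: List[str] = field(default_factory=list)


def select_cells(page: Page, options: OcrOptions) -> List[str]:
    """Pick which text cells to keep for a page (illustrative helper)."""
    if options.force_full_page_ocr:
        # Forced OCR: existing PDF cells are rejected, only OCR output is kept.
        return list(page.ocr_cells)
    # Assumed fallback for this sketch: prefer embedded PDF cells,
    # and use OCR output only when the page has no embedded text.
    return list(page.pdf_cells) if page.pdf_cells else list(page.ocr_cells)


page = Page(pdf_cells=["embedded text"], ocr_cells=["ocr text"])
default_cells = select_cells(page, OcrOptions())
forced_cells = select_cells(page, OcrOptions(force_full_page_ocr=True))
```

With the flag left at its default the embedded cells are kept. With it set, only the OCR cells remain.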
backend fix: Added handling of grouped elements in pptx backend (#307) 2024-11-11 16:38:21 +01:00
cli feat(OCR): Introduce the OcrOptions.force_full_page_ocr parameter that forces a full page OCR scanning (#290) 2024-11-12 09:46:14 +01:00
datamodel feat(OCR): Introduce the OcrOptions.force_full_page_ocr parameter that forces a full page OCR scanning (#290) 2024-11-12 09:46:14 +01:00
models feat(OCR): Introduce the OcrOptions.force_full_page_ocr parameter that forces a full page OCR scanning (#290) 2024-11-12 09:46:14 +01:00
pipeline feat: Add pipeline timings and toggle visualization, establish debug settings (#183) 2024-10-30 15:04:19 +01:00
utils feat: Add pipeline timings and toggle visualization, establish debug settings (#183) 2024-10-30 15:04:19 +01:00
__init__.py Initial commit 2024-07-15 09:42:42 +02:00
document_converter.py fix: allow to explicitly initialize the pipeline (#189) 2024-10-30 17:54:53 +01:00