Mirror of https://github.com/DS4SD/docling.git (synced 2025-12-08 12:48:28 +00:00)
docs: fix examples rendering (#2281)
fix examples rendering

Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>
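The change in this commit replaces a leading module docstring with comment lines under a `# %% [markdown]` marker and closes the cell with `# %%`, a layout that tools such as Jupytext treat as cell markers. As a rough illustration of that rewrite, here is a minimal sketch using only the standard library. The helper name `docstring_to_markdown_cell` is hypothetical and is not part of Docling or of this commit; the edits in the commit itself appear to have been made by hand.

```python
import ast


def docstring_to_markdown_cell(source: str) -> str:
    """Rewrite a leading module docstring as a `# %% [markdown]` comment cell.

    Hypothetical helper for illustration; not part of Docling.
    """
    tree = ast.parse(source)
    doc = ast.get_docstring(tree, clean=True)
    if doc is None:
        return source  # no module docstring, nothing to convert

    # The docstring is the first statement; everything after it stays as code.
    first_stmt = tree.body[0]
    rest = source.splitlines()[first_stmt.end_lineno:]

    cell = ["# %% [markdown]"]
    for line in doc.splitlines():
        # Blank lines stay blank, mirroring the layout used in the commit.
        cell.append(f"# {line}" if line else "")
    # "# %%" ends the Markdown cell and starts the code cell.
    return "\n".join(cell + ["# %%"] + rest) + "\n"
```

Calling the helper on a file whose first statement is a docstring returns the same code with the docstring turned into a comment cell; a file without a module docstring is returned unchanged.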
48
docs/examples/batch_convert.py
vendored
@@ -1,32 +1,32 @@
-"""
-Batch convert multiple PDF files and export results in several formats.
+# %% [markdown]
+# Batch convert multiple PDF files and export results in several formats.
 
-What this example does
-- Loads a small set of sample PDFs.
-- Runs the Docling PDF pipeline once per file.
-- Writes outputs to `scratch/` in multiple formats (JSON, HTML, Markdown, text, doctags, YAML).
+# What this example does
+# - Loads a small set of sample PDFs.
+# - Runs the Docling PDF pipeline once per file.
+# - Writes outputs to `scratch/` in multiple formats (JSON, HTML, Markdown, text, doctags, YAML).
 
-Prerequisites
-- Install Docling and dependencies as described in the repository README.
-- Ensure you can import `docling` from your Python environment.
-# - YAML export requires `PyYAML` (`pip install pyyaml`).
+# Prerequisites
+# - Install Docling and dependencies as described in the repository README.
+# - Ensure you can import `docling` from your Python environment.
+# <!-- YAML export requires `PyYAML` (`pip install pyyaml`). -->
 
-Input documents
-- By default, this example uses a few PDFs from `tests/data/pdf/` in the repo.
-- If you cloned without test data, or want to use your own files, edit
-`input_doc_paths` below to point to PDFs on your machine.
+# Input documents
+# - By default, this example uses a few PDFs from `tests/data/pdf/` in the repo.
+# - If you cloned without test data, or want to use your own files, edit
+# `input_doc_paths` below to point to PDFs on your machine.
 
-Output formats (controlled by flags)
-- `USE_V2 = True` enables the current Docling document exports (recommended).
-- `USE_LEGACY = False` keeps legacy Deep Search exports disabled.
-You can set it to `True` if you need legacy formats for compatibility tests.
+# Output formats (controlled by flags)
+# - `USE_V2 = True` enables the current Docling document exports (recommended).
+# - `USE_LEGACY = False` keeps legacy Deep Search exports disabled.
+# You can set it to `True` if you need legacy formats for compatibility tests.
 
-Notes
-- Set `pipeline_options.generate_page_images = True` to include page images in HTML.
-- The script logs conversion progress and raises if any documents fail.
-# - This example shows both helper methods like `save_as_*` and lower-level
-# `export_to_*` + manual file writes; outputs may overlap intentionally.
-"""
+# Notes
+# - Set `pipeline_options.generate_page_images = True` to include page images in HTML.
+# - The script logs conversion progress and raises if any documents fail.
+# <!-- This example shows both helper methods like `save_as_*` and lower-level
+# `export_to_*` + manual file writes; outputs may overlap intentionally. -->
+# %%
 
 import json
 import logging
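The notes in this file say the script logs conversion progress and raises if any documents fail. A minimal sketch of that bookkeeping follows, written without Docling so it runs on its own. The function name `summarize_results` and the status strings `"success"`, `"partial_success"` and `"failure"` are placeholders chosen for this sketch; the actual example may use different names and status types.

```python
import logging
from typing import Iterable, Tuple

_log = logging.getLogger(__name__)


def summarize_results(results: Iterable[Tuple[str, str]]) -> Tuple[int, int, int]:
    """Count outcomes per document and raise if any document failed.

    `results` holds (file name, status) pairs. The status strings are
    placeholders for this sketch, not Docling's own status values.
    """
    success = partial = failure = 0
    for name, status in results:
        if status == "success":
            success += 1
        elif status == "partial_success":
            partial += 1
            _log.info("Document %s was partially converted.", name)
        else:
            failure += 1
            _log.info("Document %s failed to convert.", name)

    total = success + partial + failure
    _log.info(
        "Processed %d docs, of which %d failed and %d were partially converted.",
        total, failure, partial,
    )
    # Fail loudly so a batch run cannot silently drop documents.
    if failure > 0:
        raise RuntimeError(f"Failed to convert {failure} of {total} documents.")
    return success, partial, failure
```

Raising only after the loop means every document is still attempted and logged before the batch is reported as failed.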
4
docs/examples/minimal.py
vendored
@@ -1,7 +1,4 @@
 # %% [markdown]
-# Simple conversion: one document to Markdown
-# ==========================================
-#
 # What this example does
 # - Converts a single source (URL or local file path) to a unified Docling
 # document and prints Markdown to stdout.
@@ -17,6 +14,7 @@
 # Notes
 # - The converter auto-detects supported formats (PDF, DOCX, HTML, PPTX, images, etc.).
 # - For batch processing or saving outputs to files, see `docs/examples/batch_convert.py`.
+# %%
 
 from docling.document_converter import DocumentConverter
 
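The header comments of `minimal.py` describe converting a single source (URL or local file path) into a Docling document and printing Markdown. The body of the example is not shown in this diff, so the following is only a sketch of that flow. It relies on `DocumentConverter.convert` and `document.export_to_markdown` as understood from Docling's public API; the wrapper `to_markdown` and its optional `converter` parameter are assumptions added here so the function can be exercised without Docling installed.

```python
from typing import Any, Optional


def to_markdown(source: str, converter: Optional[Any] = None) -> str:
    """Convert one source (URL or local path) and return its Markdown.

    `converter` is a hypothetical injection point for this sketch; when it is
    omitted, a Docling `DocumentConverter` is created.
    """
    if converter is None:
        # Imported lazily so a stand-in converter can be used in tests.
        from docling.document_converter import DocumentConverter

        converter = DocumentConverter()
    result = converter.convert(source)
    return result.document.export_to_markdown()
```

With Docling installed, `print(to_markdown("path/to/file.pdf"))` would print the Markdown to stdout, where the path is a placeholder for a document on your machine.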