docs: Describe examples (#2262)

* Update .py examples with clearer guidance,
update out of date imports and calls

Signed-off-by: Mingxuan Zhao <43148277+mingxzhao@users.noreply.github.com>

* Fix minimal.py string error, fix ruff format error

Signed-off-by: Mingxuan Zhao <43148277+mingxzhao@users.noreply.github.com>

* fix more CI issues

Signed-off-by: Mingxuan Zhao <43148277+mingxzhao@users.noreply.github.com>

---------

Signed-off-by: Mingxuan Zhao <43148277+mingxzhao@users.noreply.github.com>
This commit is contained in:
Mingxuan Zhao
2025-09-16 10:00:38 -04:00
committed by GitHub
parent 0e95171dd6
commit ff351fd40c
21 changed files with 608 additions and 85 deletions

View File

@@ -1,9 +1,32 @@
# %% [markdown]
# Simple conversion: one document to Markdown
# ==========================================
#
# What this example does
# - Converts a single source (URL or local file path) to a unified Docling
# document and prints Markdown to stdout.
#
# Requirements
# - Python 3.9+
# - Install Docling: `pip install docling`
#
# How to run
# - Use the default sample URL: `python docs/examples/minimal.py`
# - To use your own file or URL, edit the `source` variable below.
#
# Notes
# - The converter auto-detects supported formats (PDF, DOCX, HTML, PPTX, images, etc.).
# - For batch processing or saving outputs to files, see `docs/examples/batch_convert.py`.
from docling.document_converter import DocumentConverter
source = "https://arxiv.org/pdf/2408.09869" # document per local path or URL
# Change this to a local path or another URL if desired.
# Note: using the default URL requires network access; if offline, provide a
# local file path (e.g., Path("/path/to/file.pdf")).
source = "https://arxiv.org/pdf/2408.09869"
converter = DocumentConverter()
doc = converter.convert(source).document
result = converter.convert(source)
print(doc.export_to_markdown())
# output: ## Docling Technical Report [...]"
# Print Markdown to stdout.
print(result.document.export_to_markdown())