docs: add automatic generation of CLI reference (#325)

* docs: add automatic generation of CLI reference

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* install deps for building CLI ref

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

---------

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
This commit is contained in:
Michele Dolfi
2024-11-15 13:18:17 +01:00
committed by GitHub
parent 25fd149c38
commit ca8524ecae
7 changed files with 48 additions and 68 deletions

9
docs/cli.md Normal file
View File

@@ -0,0 +1,9 @@
# CLI Reference
This page provides documentation for our command line tools.
::: mkdocs-click
:module: docling.cli.main
:command: click_app
:prog_name: docling
:style: table

View File

@@ -22,54 +22,7 @@ A simple example would look like this:
docling https://arxiv.org/pdf/2206.01062
```
To see all available options (export formats etc.) run `docling --help`.
<details>
<summary><b>CLI reference</b></summary>
Here are the available options as of this writing (for an up-to-date listing, run `docling --help`):
```console
$ docling --help
Usage: docling [OPTIONS] source
╭─ Arguments ───────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ * input_sources source PDF files to convert. Can be local file / directory paths or URL. [default: None] │
│ [required] │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Options ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --from [docx|pptx|html|image|pdf|asciidoc|md] Specify input formats to convert from. │
│ Defaults to all formats. │
│ [default: None] │
│ --to [md|json|text|doctags] Specify output formats. Defaults to │
│ Markdown. │
│ [default: None] │
│ --ocr --no-ocr If enabled, the bitmap content will be │
│ processed using OCR. │
│ [default: ocr] │
│ --ocr-engine [easyocr|tesseract_cli|tesseract] The OCR engine to use. │
│ [default: easyocr] │
│ --pdf-backend [pypdfium2|dlparse_v1|dlparse_v2] The PDF backend to use. │
│ [default: dlparse_v1] │
│ --table-mode [fast|accurate] The mode to use in the table structure │
│ model. │
│ [default: fast] │
│ --artifacts-path PATH If provided, the location of the model │
│ artifacts. │
│ [default: None] │
│ --abort-on-error --no-abort-on-error If enabled, the bitmap content will be │
│ processed using OCR. │
│ [default: no-abort-on-error] │
│ --output PATH Output directory where results are │
│ saved. │
│ [default: .] │
│ --version Show version information. │
│ --help Show this message and exit. │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
```
</details>
To see all available options (export formats etc.) run `docling --help`. More details in the [CLI reference page](./cli.md).