docs: add documentation for confidence scores (#1912)

* docs: add documentation for confidence scores

Signed-off-by: Fabiano Franz <contact@fabianofranz.com>

* Increase focus on confidence grades, scores are informational only

Signed-off-by: Fabiano Franz <contact@fabianofranz.com>

* Update confidence_scores.md

Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>

---------

Signed-off-by: Fabiano Franz <contact@fabianofranz.com>
Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>
Co-authored-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>
2025-07-23 18:45:00 +00:00 · 2025-07-21 05:16:17 -03:00 · 2025-07-21 05:16:17 -03:00 · 5d98bcea1b
commit 5d98bcea1b
parent 7561be537a
3 changed files with 62 additions and 0 deletions
--- a/docs/assets/confidence_scores.png
+++ b/docs/assets/confidence_scores.png
--- a/docs/concepts/confidence_scores.md
+++ b/docs/concepts/confidence_scores.md
@@ -0,0 +1,61 @@
+## Introduction
+
+**Confidence grades** were introduced in [v2.34.0](https://github.com/docling-project/docling/releases/tag/v2.34.0) to help users understand how well a conversion performed and guide decisions about post-processing workflows. They are available in the [`confidence`](../../reference/document_converter/#docling.document_converter.ConversionResult.confidence) field of the [`ConversionResult`](../../reference/document_converter/#docling.document_converter.ConversionResult) object returned by the document converter.
+
+## Purpose
+
+Complex layouts, poor scan quality, or challenging formatting can lead to suboptimal document conversion results that may require additional attention or alternative conversion pipelines.
+
+Confidence scores provide a quantitative assessment of document conversion quality. Each confidence report includes a **numerical score** (0.0 to 1.0) measuring conversion accuracy, and a **quality grade** (poor, fair, good, excellent) for quick interpretation.
+
+!!! note "Focus on quality grades!"
+
+    Focus on the document-level grade fields, `mean_grade` and `low_grade`, to assess overall conversion quality. Numerical scores are used internally and are informational only; their computation and weighting may change in the future.
+
+Use cases for confidence grades include:
+
+- Identify documents requiring manual review after conversion
+- Route each document type to the most appropriate conversion pipeline
+- Set confidence thresholds for unattended batch conversions
+- Catch potential conversion issues early in your workflow
+
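The batch-triage use cases above can be sketched as a small routing function. This is a minimal illustration, not part of the library: the `Grade` enum, the `triage` function, and the `"accept"` / `"review"` / `"reconvert"` labels are all hypothetical. It assumes the two document-level grades (`mean_grade` and `low_grade`, read from the `confidence` field of the conversion result) are available as strings matching the grade names in this document.

```python
from enum import Enum


class Grade(Enum):
    """Hypothetical stand-in ordering for the four grade names in this document."""
    POOR = 0
    FAIR = 1
    GOOD = 2
    EXCELLENT = 3


def triage(mean_grade: str, low_grade: str, minimum: Grade = Grade.GOOD) -> str:
    """Return a routing decision for one converted document (illustrative policy)."""
    mean = Grade[mean_grade.upper()]
    low = Grade[low_grade.upper()]
    if mean.value < minimum.value:
        # Overall quality is below the bar: try a different pipeline.
        return "reconvert"
    if low.value < minimum.value:
        # Fine on average, but the weakest areas need a human look.
        return "review"
    return "accept"


if __name__ == "__main__":
    for grades in [("excellent", "good"), ("good", "fair"), ("fair", "poor")]:
        print(grades, "->", triage(*grades))
```

Checking `mean_grade` first and `low_grade` second separates documents that are weak overall from those that are mostly fine but have isolated problem areas.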
+## Concepts
+
+### Scores and grades
+
+A confidence report contains *scores* and *grades*:
+
+- **Scores**: Numerical values between 0.0 and 1.0, where higher values indicate better conversion quality. They are used internally and are informational only
+- **Grades**: Categorical quality assessments based on score thresholds, used to assess the overall conversion confidence:
+  - `POOR`
+  - `FAIR`
+  - `GOOD`
+  - `EXCELLENT`
+
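The relationship between scores and grades can be sketched as a threshold lookup. The cut-off values below are placeholders chosen only for illustration: this document does not state the thresholds the library uses, and they may change. `ILLUSTRATIVE_THRESHOLDS` and `score_to_grade` are hypothetical names.

```python
import math

# Hypothetical cut-offs, for illustration only; the real thresholds are
# internal to the library and are not specified in this document.
ILLUSTRATIVE_THRESHOLDS = [(0.9, "EXCELLENT"), (0.8, "GOOD"), (0.5, "FAIR")]


def score_to_grade(score: float) -> str:
    """Map a score in [0.0, 1.0] to a grade name using the placeholder cut-offs."""
    if math.isnan(score) or not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be within [0.0, 1.0], got {score!r}")
    for cutoff, grade in ILLUSTRATIVE_THRESHOLDS:
        if score >= cutoff:
            return grade
    return "POOR"
```

Because the mapping is internal, downstream code is better off comparing grades than hard-coding numeric cut-offs like these.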
+### Types of confidence calculated
+
+Each confidence report includes four component scores and grades:
+
+- **`layout_score`**: Overall quality of document element recognition 
+- **`ocr_score`**: Quality of OCR-extracted content
+- **`parse_score`**: 10th percentile score of digital text cells (emphasizes problem areas)
+- **`table_score`**: Table extraction quality *(not yet implemented)*
+
+### Summary grades
+
+Two aggregate grades provide overall document quality assessment:
+
+- **`mean_grade`**: Grade derived from the average of the four component scores
+- **`low_grade`**: Grade derived from the 5th percentile score (highlights worst-performing areas)
+
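To see why the two aggregates can differ, here is a minimal sketch that computes a mean and a 5th percentile over a set of component scores. It rests on two assumptions not stated in this document: missing components (such as the unimplemented `table_score`) are represented as NaN and skipped, and the percentile uses linear interpolation between sorted values. `mean_and_low` is a hypothetical helper, not a library function.

```python
import math


def mean_and_low(scores, q=0.05):
    """Return (mean, q-quantile) of the available scores, ignoring NaN entries."""
    valid = sorted(s for s in scores if not math.isnan(s))
    if not valid:
        raise ValueError("no component scores available")
    mean = sum(valid) / len(valid)
    # Linear interpolation between the two nearest sorted values (assumed method).
    pos = q * (len(valid) - 1)
    lower = math.floor(pos)
    upper = min(lower + 1, len(valid) - 1)
    low = valid[lower] + (pos - lower) * (valid[upper] - valid[lower])
    return mean, low


if __name__ == "__main__":
    # layout, ocr, parse, and a missing table score
    print(mean_and_low([0.9, 0.8, 0.4, float("nan")]))
```

The low quantile stays close to the weakest component, so a document can have a reasonable mean while its low value flags a problem area.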
+### Page-level vs document-level
+
+Confidence grades are calculated at two levels:
+
+- **Page-level**: Individual scores and grades for each page, stored in the `pages` field
+- **Document-level**: Overall scores and grades for the entire document, calculated as averages of the page-level scores and stored in identically named fields in the root [`ConfidenceReport`](../../reference/document_converter/#docling.document_converter.ConversionResult.confidence)
+
+### Example
+
+![confidence_scores](../assets/confidence_scores.png)
+
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -68,6 +68,7 @@ nav:
    - Architecture: concepts/architecture.md
    - Docling Document: concepts/docling_document.md
    - Serialization: concepts/serialization.md
+    - Confidence Scores: concepts/confidence_scores.md
    - Chunking: concepts/chunking.md
    - Plugins: concepts/plugins.md
  - Examples: