Merge branch 'docling-project:main' into main

2025-07-27 04:24:45 +00:00 · 2025-05-27 00:11:41 +05:30 · 7c7baf814d
commit 7c7baf814d
parent a8a119c93b 2579d89510
110 changed files with 99417 additions and 896 deletions
--- a/.actor/Dockerfile
+++ b/.actor/Dockerfile
@ -64,7 +64,6 @@ ENV EASYOCR_MODULE_PATH=/tmp/easyocr-models
 COPY --chown=1000:1000 .actor/actor.sh .actor/actor.sh
 COPY --chown=1000:1000 .actor/actor.json .actor/actor.json
 COPY --chown=1000:1000 .actor/input_schema.json .actor/input_schema.json
 COPY --chown=1000:1000 .actor/docling_processor.py .actor/docling_processor.py
 RUN chmod +x .actor/actor.sh
 # Copy the build files from builder
--- a/.actor/README.md
+++ b/.actor/README.md
@ -2,7 +2,7 @@
 [![Docling Actor](https://apify.com/actor-badge?actor=vancura/docling?fpr=docling)](https://apify.com/vancura/docling)
-This Actor (specification v1) wraps the [Docling project](https://ds4sd.github.io/docling/) to provide serverless document processing in the cloud. It can process complex documents (PDF, DOCX, images) and convert them into structured formats (Markdown, JSON, HTML, Text, or DocTags) with optional OCR support.
+This Actor (specification v1) wraps the [Docling project](https://github.com/docling-project/docling) to provide serverless document processing in the cloud. It can process complex documents (PDF, DOCX, images) and convert them into structured formats (Markdown, JSON, HTML, Text, or DocTags) with optional OCR support.
 ## What are Actors?
@ -14,7 +14,7 @@ This Actor (specification v1) wraps the [Docling project](https://ds4sd.github.i
 2. [Usage](#usage)
 3. [Input Parameters](#input-parameters)
 4. [Output](#output)
-5. [Performance & Resources](#performance--resources)
+5. [Performance and Resources](#performance-and-resources)
 6. [Troubleshooting](#troubleshooting)
 7. [Local Development](#local-development)
 8. [Architecture](#architecture)
@ -190,7 +190,7 @@ Access logs via:
 apify key-value-stores get-record DOCLING_LOG
 ```
-## Performance & Resources
+## Performance and Resources
 - **Docker Image Size**: ~4GB
 - **Memory Requirements**:
--- a/.actor/actor.json
+++ b/.actor/actor.json
@ -1,10 +1,10 @@
 {
  "actorSpecification": 1,
  "name": "docling",
-  "version": "0.0",
+  "version": "1.0",
  "environmentVariables": {},
  "dockerFile": "./Dockerfile",
-  "input": "./input_schema.json",
+  "inputSchema": "./input_schema.json",
  "scripts": {
    "run": "./actor.sh"
  }
--- a/.actor/actor.sh
+++ b/.actor/actor.sh
@ -154,17 +154,6 @@ else
    echo "Warning: No build files directory found. Some tools may be unavailable."
 fi
 # Copy Python processor script to tools directory
 PYTHON_SCRIPT_PATH="$(dirname "$0")/docling_processor.py"
 if [ -f "$PYTHON_SCRIPT_PATH" ]; then
    echo "Copying Python processor script to tools directory..."
    cp "$PYTHON_SCRIPT_PATH" "$TOOLS_DIR/"
    chmod +x "$TOOLS_DIR/docling_processor.py"
 else
    echo "ERROR: Python processor script not found at $PYTHON_SCRIPT_PATH"
    exit 1
 fi
 # Check OCR directories and ensure they're writable
 echo "Checking OCR directory permissions..."
 OCR_DIR="/opt/app-root/src/.EasyOCR"
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@ -1,3 +1,51 @@
 ## [v2.34.0](https://github.com/docling-project/docling/releases/tag/v2.34.0) - 2025-05-22
 ### Feature
 * **ocr:** Auto-detect rotated pages in Tesseract ([#1167](https://github.com/docling-project/docling/issues/1167)) ([`45265bf`](https://github.com/docling-project/docling/commit/45265bf8b1a6d6ad5367bb3f17fb3fa9d4366a05))
 * Establish confidence estimation for document and pages ([#1313](https://github.com/docling-project/docling/issues/1313)) ([`9087524`](https://github.com/docling-project/docling/commit/90875247e5813da1de17f3cd4475937e8bd45571))
 ### Fix
 * Fix ZeroDivisionError for cell_bbox.area() ([#1636](https://github.com/docling-project/docling/issues/1636)) ([`c2f595d`](https://github.com/docling-project/docling/commit/c2f595d2830ca2e28e68c5da606e89541264f156))
 * **integration:** Update the Apify Actor integration ([#1619](https://github.com/docling-project/docling/issues/1619)) ([`14d4f5b`](https://github.com/docling-project/docling/commit/14d4f5b109fa65d777ab147b3ce9b5174d020a5d))
 ## [v2.33.0](https://github.com/docling-project/docling/releases/tag/v2.33.0) - 2025-05-20
 ### Feature
 * Add textbox content extraction in msword_backend ([#1538](https://github.com/docling-project/docling/issues/1538)) ([`12a0e64`](https://github.com/docling-project/docling/commit/12a0e648929ce75da73617904792a50f5145fe4a))
 ### Fix
 * Fix issue with detecting docx files, and files with upper case extensions ([#1609](https://github.com/docling-project/docling/issues/1609)) ([`f4d9d41`](https://github.com/docling-project/docling/commit/f4d9d4111b0a6eb87fc1c05a56618fc430d1e7a2))
 * Load_from_doctags static usage ([#1617](https://github.com/docling-project/docling/issues/1617)) ([`0e00a26`](https://github.com/docling-project/docling/commit/0e00a263fa0c45f6cf2ae0bd94f9387c28e51ed0))
 * Incorrect force_backend_text behaviour for VLM DocTag pipelines ([#1371](https://github.com/docling-project/docling/issues/1371)) ([`f2e9c07`](https://github.com/docling-project/docling/commit/f2e9c0784c842612641171754ce51362e298088d))
 * **pypdfium:** Resolve overlapping text when merging bounding boxes ([#1549](https://github.com/docling-project/docling/issues/1549)) ([`98b5eeb`](https://github.com/docling-project/docling/commit/98b5eeb8440d34ac84f58271c8b8eea88881260a))
 ## [v2.32.0](https://github.com/docling-project/docling/releases/tag/v2.32.0) - 2025-05-14
 ### Feature
 * Improve parallelization for remote services API calls ([#1548](https://github.com/docling-project/docling/issues/1548)) ([`3a04f2a`](https://github.com/docling-project/docling/commit/3a04f2a367e32913f91faa2325f928b85112e632))
 * Support image/webp file type ([#1415](https://github.com/docling-project/docling/issues/1415)) ([`12dab0a`](https://github.com/docling-project/docling/commit/12dab0a1e8d181d99e4711ffdbbc33d158234fb4))
 ### Fix
 * **ocr:** Orig field in TesseractOcrCliModel as str ([#1553](https://github.com/docling-project/docling/issues/1553)) ([`9f8b479`](https://github.com/docling-project/docling/commit/9f8b479f17bbfaf79c3c897980ad15742ec86568))
 * **settings:** Fix nested settings load via environment variables ([#1551](https://github.com/docling-project/docling/issues/1551)) ([`2efb7a7`](https://github.com/docling-project/docling/commit/2efb7a7c06a8e51516cc9b93e5dbcdea69f562fa))
 ### Documentation
 * Add advanced chunking & serialization example ([#1589](https://github.com/docling-project/docling/issues/1589)) ([`9f28abf`](https://github.com/docling-project/docling/commit/9f28abf0610560645b40352dfdfc3525fa86c28d))
 ## [v2.31.2](https://github.com/docling-project/docling/releases/tag/v2.31.2) - 2025-05-13
 ### Fix
 * AsciiDoc header identification (#1562) ([#1563](https://github.com/docling-project/docling/issues/1563)) ([`4046d0b`](https://github.com/docling-project/docling/commit/4046d0b2f38254679de5fc78aaf2fe630d6bb61c))
 * Restrict click version and update lock file ([#1582](https://github.com/docling-project/docling/issues/1582)) ([`8baa85a`](https://github.com/docling-project/docling/commit/8baa85a49d3a456d198c52aac8e0b4ac70c92e72))
 ## [v2.31.1](https://github.com/docling-project/docling/releases/tag/v2.31.1) - 2025-05-12
 ### Fix
--- a/docling/backend/asciidoc_backend.py
+++ b/docling/backend/asciidoc_backend.py
@ -287,7 +287,7 @@ class AsciiDocBackend(DeclarativeDocumentBackend):
    #   =========   Section headers
    def _is_section_header(self, line):
-        return re.match(r"^==+", line)
+        return re.match(r"^==+\s+", line)
    def _parse_section_header(self, line):
        match = re.match(r"^(=+)\s+(.*)", line)
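Review note: the effect of the tightened `_is_section_header` pattern can be checked in isolation. This is a minimal sketch, assuming each line reaches the check without a trailing newline; the sample strings and the `is_header` helper are illustrative and are not taken from the project's test data.

```python
import re

OLD_PATTERN = r"^==+"     # pattern before this change
NEW_PATTERN = r"^==+\s+"  # pattern after this change


def is_header(pattern: str, line: str) -> bool:
    """Return True when the line matches the given header pattern."""
    return re.match(pattern, line) is not None


# "====" is an AsciiDoc block delimiter, not a section title.
samples = ["== Section title", "====", "==no-space"]
for line in samples:
    print(repr(line), is_header(OLD_PATTERN, line), is_header(NEW_PATTERN, line))
```

With the new pattern, only lines where the run of `=` is followed by whitespace are treated as headers, which matches what `_parse_section_header` expects.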
--- a/docling/backend/docling_parse_backend.py
+++ b/docling/backend/docling_parse_backend.py
@ -60,7 +60,7 @@ class DoclingParsePageBackend(PdfPageBackend):
                coord_origin=CoordOrigin.BOTTOMLEFT,
            ).to_top_left_origin(page_height=page_size.height * scale)
-            overlap_frac = cell_bbox.intersection_area_with(bbox) / cell_bbox.area()
+            overlap_frac = cell_bbox.intersection_over_self(bbox)
            if overlap_frac > 0.5:
                if len(text_piece) > 0:
--- a/docling/backend/docling_parse_v2_backend.py
+++ b/docling/backend/docling_parse_v2_backend.py
@ -71,7 +71,7 @@ class DoclingParseV2PageBackend(PdfPageBackend):
                coord_origin=CoordOrigin.BOTTOMLEFT,
            ).to_top_left_origin(page_height=page_size.height * scale)
-            overlap_frac = cell_bbox.intersection_area_with(bbox) / cell_bbox.area()
+            overlap_frac = cell_bbox.intersection_over_self(bbox)
            if overlap_frac > 0.5:
                if len(text_piece) > 0:
--- a/docling/backend/docling_parse_v4_backend.py
+++ b/docling/backend/docling_parse_v4_backend.py
@ -46,7 +46,7 @@ class DoclingParseV4PageBackend(PdfPageBackend):
                .scaled(scale)
            )
-            overlap_frac = cell_bbox.intersection_area_with(bbox) / cell_bbox.area()
+            overlap_frac = cell_bbox.intersection_over_self(bbox)
            if overlap_frac > 0.5:
                if len(text_piece) > 0:
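Review note: the three backends now delegate the overlap ratio to `intersection_over_self`, which appears to correspond to the `ZeroDivisionError` fix for `cell_bbox.area()` listed in the changelog. The sketch below shows the general idea with a hypothetical `Box` class. It is not the `BoundingBox` implementation from docling-core, and the `eps` threshold is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical axis-aligned box with a top-left origin (t < b)."""

    l: float
    t: float
    r: float
    b: float

    def area(self) -> float:
        return max(0.0, self.r - self.l) * max(0.0, self.b - self.t)

    def intersection_area(self, other: "Box") -> float:
        width = min(self.r, other.r) - max(self.l, other.l)
        height = min(self.b, other.b) - max(self.t, other.t)
        return max(0.0, width) * max(0.0, height)

    def intersection_over_self(self, other: "Box", eps: float = 1e-9) -> float:
        own_area = self.area()
        if own_area <= eps:
            # Degenerate box: report no overlap instead of dividing by zero.
            return 0.0
        return self.intersection_area(other) / own_area


cell = Box(0, 0, 10, 10)
region = Box(5, 0, 20, 10)
degenerate = Box(3, 3, 3, 8)  # zero width, so zero area

print(cell.intersection_over_self(region))
print(degenerate.intersection_over_self(region))
```

The unguarded expression `intersection_area / area()` would raise `ZeroDivisionError` for the degenerate cell above.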
--- a/docling/backend/msword_backend.py
+++ b/docling/backend/msword_backend.py
@ -2,7 +2,7 @@ import logging
 import re
 from io import BytesIO
 from pathlib import Path
-from typing import Any, Optional, Union
+from typing import Any, List, Optional, Union
 from docling_core.types.doc import (
    DocItemLabel,
@ -24,7 +24,6 @@ from docx.text.hyperlink import Hyperlink
 from docx.text.paragraph import Paragraph
 from docx.text.run import Run
 from lxml import etree
 from lxml.etree import XPath
 from PIL import Image, UnidentifiedImageError
 from pydantic import AnyUrl
 from typing_extensions import override
@ -59,6 +58,11 @@ class MsWordDocumentBackend(DeclarativeDocumentBackend):
        self.parents: dict[int, Optional[NodeItem]] = {}
        self.numbered_headers: dict[int, int] = {}
        self.equation_bookends: str = "<eq>{EQ}</eq>"
        # Track processed textbox elements to avoid duplication
        self.processed_textbox_elements: List[int] = []
        # Track content hash of processed paragraphs to avoid duplicate content
        self.processed_paragraph_content: List[str] = []
        for i in range(-1, self.max_levels):
            self.parents[i] = None
@ -175,10 +179,74 @@ class MsWordDocumentBackend(DeclarativeDocumentBackend):
                "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
                "r": "http://schemas.openxmlformats.org/officeDocument/2006/relationships",
                "w": "http://schemas.openxmlformats.org/wordprocessingml/2006/main",
                "wp": "http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing",
                "mc": "http://schemas.openxmlformats.org/markup-compatibility/2006",
                "v": "urn:schemas-microsoft-com:vml",
                "wps": "http://schemas.microsoft.com/office/word/2010/wordprocessingShape",
                "w10": "urn:schemas-microsoft-com:office:word",
                "a14": "http://schemas.microsoft.com/office/drawing/2010/main",
            }
-            xpath_expr = XPath(".//a:blip", namespaces=namespaces)
+            xpath_expr = etree.XPath(".//a:blip", namespaces=namespaces)
            drawing_blip = xpath_expr(element)
            # Check for textbox content - check multiple textbox formats
            # Only process if the element hasn't been processed before
            element_id = id(element)
            if element_id not in self.processed_textbox_elements:
                # Modern Word textboxes
                txbx_xpath = etree.XPath(
                    ".//w:txbxContent|.//v:textbox//w:p", namespaces=namespaces
                )
                textbox_elements = txbx_xpath(element)
                # No modern textboxes found, check for alternate/legacy textbox formats
                if not textbox_elements and tag_name in ["drawing", "pict"]:
                    # Additional checks for textboxes in DrawingML and VML formats
                    alt_txbx_xpath = etree.XPath(
                        ".//wps:txbx//w:p|.//w10:wrap//w:p|.//a:p//a:t",
                        namespaces=namespaces,
                    )
                    textbox_elements = alt_txbx_xpath(element)
                    # Check for shape text that's not in a standard textbox
                    if not textbox_elements:
                        shape_text_xpath = etree.XPath(
                            ".//a:bodyPr/ancestor::*//a:t|.//a:txBody//a:t",
                            namespaces=namespaces,
                        )
                        shape_text_elements = shape_text_xpath(element)
                        if shape_text_elements:
                            # Create custom text elements from shape text
                            text_content = " ".join(
                                [t.text for t in shape_text_elements if t.text]
                            )
                            if text_content.strip():
                                _log.debug(f"Found shape text: {text_content[:50]}...")
                                # Create a paragraph-like element to process with standard handler
                                level = self._get_level()
                                shape_group = doc.add_group(
                                    label=GroupLabel.SECTION,
                                    parent=self.parents[level - 1],
                                    name="shape-text",
                                )
                                doc.add_text(
                                    label=DocItemLabel.PARAGRAPH,
                                    parent=shape_group,
                                    text=text_content,
                                )
                if textbox_elements:
                    # Mark the parent element as processed
                    self.processed_textbox_elements.append(element_id)
                    # Also mark all found textbox elements as processed
                    for tb_element in textbox_elements:
                        self.processed_textbox_elements.append(id(tb_element))
                    _log.debug(
                        f"Found textbox content with {len(textbox_elements)} elements"
                    )
                    self._handle_textbox_content(textbox_elements, docx_obj, doc)
            # Check for Tables
            if element.tag.endswith("tbl"):
                try:
@ -291,15 +359,17 @@ class MsWordDocumentBackend(DeclarativeDocumentBackend):
    @classmethod
    def _get_format_from_run(cls, run: Run) -> Optional[Formatting]:
-        has_any_formatting = run.bold or run.italic or run.underline
-        return (
-            Formatting(
-                bold=run.bold or False,
-                italic=run.italic or False,
-                underline=run.underline or False,
-            )
-            if has_any_formatting
-            else None
+        # The .bold and .italic properties are booleans, but .underline can be an enum
+        # like WD_UNDERLINE.THICK (value 6), so we need to convert it to a boolean
+        has_bold = run.bold or False
+        has_italic = run.italic or False
+        # Convert any non-None underline value to True
+        has_underline = bool(run.underline is not None and run.underline)
+
+        return Formatting(
+            bold=has_bold,
            italic=has_italic,
            underline=has_underline,
        )
    def _get_paragraph_elements(self, paragraph: Paragraph):
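Review note: the reason `run.underline or False` is not enough can be shown with a stand-in enum. This is a sketch under the assumption that the underline value is `None`, a `bool`, or an enum member; `FakeUnderline` and `underline_to_bool` are hypothetical names used only for illustration.

```python
from enum import Enum
from typing import Union


class FakeUnderline(Enum):
    """Hypothetical stand-in for an underline-style enum such as WD_UNDERLINE."""

    SINGLE = 1
    THICK = 6


def underline_to_bool(value: Union[None, bool, FakeUnderline]) -> bool:
    # Same expression as in the diff: None and False map to False,
    # True and enum members map to True.
    return bool(value is not None and value)


# The old expression leaks the enum member instead of producing a bool.
print(FakeUnderline.THICK or False)

for value in (None, False, True, FakeUnderline.THICK):
    print(repr(value), underline_to_bool(value))
```

Passing the leaked enum member to a field typed as `bool` is the kind of mismatch the new conversion avoids.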
@ -355,6 +425,182 @@ class MsWordDocumentBackend(DeclarativeDocumentBackend):
        return paragraph_elements
    def _get_paragraph_position(self, paragraph_element):
        """Extract vertical position information from paragraph element."""
        # First try to directly get the index from w:p element that has an order-related attribute
        if (
            hasattr(paragraph_element, "getparent")
            and paragraph_element.getparent() is not None
        ):
            parent = paragraph_element.getparent()
            # Get all paragraph siblings
            paragraphs = [
                p for p in parent.getchildren() if etree.QName(p).localname == "p"
            ]
            # Find index of current paragraph within its siblings
            try:
                paragraph_index = paragraphs.index(paragraph_element)
                return paragraph_index  # Use index as position for consistent ordering
            except ValueError:
                pass
        # Look for position hints in element attributes and ancestor elements
        for elem in (*[paragraph_element], *paragraph_element.iterancestors()):
            # Check for direct position attributes
            for attr_name in ["y", "top", "positionY", "y-position", "position"]:
                value = elem.get(attr_name)
                if value:
                    try:
                        # Remove any non-numeric characters (like 'pt', 'px', etc.)
                        clean_value = re.sub(r"[^0-9.]", "", value)
                        if clean_value:
                            return float(clean_value)
                    except (ValueError, TypeError):
                        pass
            # Check for position in transform attribute
            transform = elem.get("transform")
            if transform:
                # Extract translation component from transform matrix
                match = re.search(r"translate\([^,]+,\s*([0-9.]+)", transform)
                if match:
                    try:
                        return float(match.group(1))
                    except ValueError:
                        pass
            # Check for anchors or relative position indicators in Word format
            # 'dist' attributes can indicate relative positioning
            for attr_name in ["distT", "distB", "anchor", "relativeFrom"]:
                if elem.get(attr_name) is not None:
                    return elem.sourceline  # Use the XML source line number as fallback
        # For VML shapes, look for specific attributes
        for ns_uri in paragraph_element.nsmap.values():
            if "vml" in ns_uri:
                # Try to extract position from style attribute
                style = paragraph_element.get("style")
                if style:
                    match = re.search(r"top:([0-9.]+)pt", style)
                    if match:
                        try:
                            return float(match.group(1))
                        except ValueError:
                            pass
        # If no better position indicator found, use XML source line number as proxy for order
        return (
            paragraph_element.sourceline
            if hasattr(paragraph_element, "sourceline")
            else None
        )
    def _collect_textbox_paragraphs(self, textbox_elements):
        """Collect and organize paragraphs from textbox elements."""
        processed_paragraphs = []
        container_paragraphs = {}
        for element in textbox_elements:
            element_id = id(element)
            # Skip if we've already processed this exact element
            if element_id in processed_paragraphs:
                continue
            tag_name = etree.QName(element).localname
            processed_paragraphs.append(element_id)
            # Handle paragraphs directly found (VML textboxes)
            if tag_name == "p":
                # Find the containing textbox or shape element
                container_id = None
                for ancestor in element.iterancestors():
                    if any(ns in ancestor.tag for ns in ["textbox", "shape", "txbx"]):
                        container_id = id(ancestor)
                        break
                if container_id not in container_paragraphs:
                    container_paragraphs[container_id] = []
                container_paragraphs[container_id].append(
                    (element, self._get_paragraph_position(element))
                )
            # Handle txbxContent elements (Word DrawingML textboxes)
            elif tag_name == "txbxContent":
                paragraphs = element.findall(".//w:p", namespaces=element.nsmap)
                container_id = id(element)
                if container_id not in container_paragraphs:
                    container_paragraphs[container_id] = []
                for p in paragraphs:
                    p_id = id(p)
                    if p_id not in processed_paragraphs:
                        processed_paragraphs.append(p_id)
                        container_paragraphs[container_id].append(
                            (p, self._get_paragraph_position(p))
                        )
            else:
                # Try to extract any paragraphs from unknown elements
                paragraphs = element.findall(".//w:p", namespaces=element.nsmap)
                container_id = id(element)
                if container_id not in container_paragraphs:
                    container_paragraphs[container_id] = []
                for p in paragraphs:
                    p_id = id(p)
                    if p_id not in processed_paragraphs:
                        processed_paragraphs.append(p_id)
                        container_paragraphs[container_id].append(
                            (p, self._get_paragraph_position(p))
                        )
        return container_paragraphs
    def _handle_textbox_content(
        self,
        textbox_elements: list,
        docx_obj: DocxDocument,
        doc: DoclingDocument,
    ) -> None:
        """Process textbox content and add it to the document structure."""
        level = self._get_level()
        # Create a textbox group to contain all text from the textbox
        textbox_group = doc.add_group(
            label=GroupLabel.SECTION, parent=self.parents[level - 1], name="textbox"
        )
        # Set this as the current parent to ensure textbox content
        # is properly nested in document structure
        original_parent = self.parents[level]
        self.parents[level] = textbox_group
        # Collect and organize paragraphs
        container_paragraphs = self._collect_textbox_paragraphs(textbox_elements)
        # Process all paragraphs
        all_paragraphs = []
        # Sort paragraphs within each container, then process containers
        for container_id, paragraphs in container_paragraphs.items():
            # Sort by vertical position within each container
            sorted_container_paragraphs = sorted(
                paragraphs,
                key=lambda x: (
                    x[1] is None,
                    x[1] if x[1] is not None else float("inf"),
                ),
            )
            # Add the sorted paragraphs to our processing list
            all_paragraphs.extend(sorted_container_paragraphs)
        # Process all the paragraphs
        for p, _ in all_paragraphs:
            self._handle_text_elements(p, docx_obj, doc, is_from_textbox=True)
        # Restore original parent
        self.parents[level] = original_parent
        return
    def _handle_equations_in_text(self, element, text):
        only_texts = []
        only_equations = []
@ -423,10 +669,21 @@ class MsWordDocumentBackend(DeclarativeDocumentBackend):
        element: BaseOxmlElement,
        docx_obj: DocxDocument,
        doc: DoclingDocument,
        is_from_textbox: bool = False,
    ) -> None:
        paragraph = Paragraph(element, docx_obj)
        # Skip if from a textbox and this exact paragraph content was already processed
        raw_text = paragraph.text
        if is_from_textbox and raw_text:
            # Create a simple hash of content to detect duplicates
            content_hash = f"{len(raw_text)}:{raw_text[:50]}"
            if content_hash in self.processed_paragraph_content:
                _log.debug(f"Skipping duplicate paragraph content: {content_hash}")
                return
            self.processed_paragraph_content.append(content_hash)
        text, equations = self._handle_equations_in_text(element=element, text=raw_text)
        if text is None:
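Review note: two small techniques in the textbox handling can be tried on their own: the sort key that places paragraphs without a position last, and the content key used to skip duplicate paragraphs. The sketch below copies both expressions from the diff; the helper names and sample data are hypothetical.

```python
from typing import Optional


def position_sort_key(item: tuple[str, Optional[float]]) -> tuple[bool, float]:
    """Sort by position, placing entries without a position at the end."""
    _, position = item
    return (position is None, position if position is not None else float("inf"))


def content_key(raw_text: str) -> str:
    """Length plus the first 50 characters, as in the duplicate check."""
    return f"{len(raw_text)}:{raw_text[:50]}"


paragraphs = [("c", None), ("b", 7.0), ("a", 2)]
ordered = sorted(paragraphs, key=position_sort_key)
print([name for name, _ in ordered])

# Two different texts with equal length and an identical 50-character prefix
# produce the same key, so the second one would be skipped as a duplicate.
first = "x" * 50 + "A"
second = "x" * 50 + "B"
print(content_key(first) == content_key(second))
```

The key is cheap to compute, at the cost of treating such near-identical paragraphs as duplicates.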
--- a/docling/backend/pypdfium2_backend.py
+++ b/docling/backend/pypdfium2_backend.py
@ -175,13 +175,18 @@ class PyPdfiumPageBackend(PdfPageBackend):
                if len(group) == 1:
                    return group[0]
                merged_text = "".join(cell.text for cell in group)
                merged_bbox = BoundingBox(
                    l=min(cell.rect.to_bounding_box().l for cell in group),
                    t=min(cell.rect.to_bounding_box().t for cell in group),
                    r=max(cell.rect.to_bounding_box().r for cell in group),
                    b=max(cell.rect.to_bounding_box().b for cell in group),
                )
                assert self._ppage is not None
                self.text_page = self._ppage.get_textpage()
                bbox = merged_bbox.to_bottom_left_origin(page_size.height)
                merged_text = self.text_page.get_text_bounded(*bbox.as_tuple())
                return TextCell(
                    index=group[0].index,
                    text=merged_text,
--- a/docling/datamodel/base_models.py
+++ b/docling/datamodel/base_models.py
@ -1,6 +1,9 @@
 import math
 from collections import defaultdict
 from enum import Enum
-from typing import TYPE_CHECKING, Dict, List, Optional, Union
+from typing import TYPE_CHECKING, Annotated, Dict, List, Literal, Optional, Union
 import numpy as np
 from docling_core.types.doc import (
    BoundingBox,
    DocItemLabel,
@ -16,7 +19,7 @@ from docling_core.types.io import (
    DocumentStream,
 )
 from PIL.Image import Image
-from pydantic import BaseModel, ConfigDict
+from pydantic import BaseModel, ConfigDict, Field, computed_field
 if TYPE_CHECKING:
    from docling.backend.pdf_backend import PdfPageBackend
@ -91,6 +94,7 @@ FormatToMimeType: Dict[InputFormat, List[str]] = {
        "image/tiff",
        "image/gif",
        "image/bmp",
        "image/webp",
    ],
    InputFormat.PDF: ["application/pdf"],
    InputFormat.ASCIIDOC: ["text/asciidoc"],
@ -298,3 +302,97 @@ class OpenAiApiResponse(BaseModel):
    choices: List[OpenAiResponseChoice]
    created: int
    usage: OpenAiResponseUsage
 # Create a type alias for score values
 ScoreValue = float
 class QualityGrade(str, Enum):
    POOR = "poor"
    FAIR = "fair"
    GOOD = "good"
    EXCELLENT = "excellent"
    UNSPECIFIED = "unspecified"
 class PageConfidenceScores(BaseModel):
    parse_score: ScoreValue = np.nan
    layout_score: ScoreValue = np.nan
    table_score: ScoreValue = np.nan
    ocr_score: ScoreValue = np.nan
    def _score_to_grade(self, score: ScoreValue) -> QualityGrade:
        if score < 0.5:
            return QualityGrade.POOR
        elif score < 0.8:
            return QualityGrade.FAIR
        elif score < 0.9:
            return QualityGrade.GOOD
        elif score >= 0.9:
            return QualityGrade.EXCELLENT
        return QualityGrade.UNSPECIFIED
    @computed_field  # type: ignore
    @property
    def mean_grade(self) -> QualityGrade:
        return self._score_to_grade(self.mean_score)
    @computed_field  # type: ignore
    @property
    def low_grade(self) -> QualityGrade:
        return self._score_to_grade(self.low_score)
    @computed_field  # type: ignore
    @property
    def mean_score(self) -> ScoreValue:
        return ScoreValue(
            np.nanmean(
                [
                    self.ocr_score,
                    self.table_score,
                    self.layout_score,
                    self.parse_score,
                ]
            )
        )
    @computed_field  # type: ignore
    @property
    def low_score(self) -> ScoreValue:
        return ScoreValue(
            np.nanquantile(
                [
                    self.ocr_score,
                    self.table_score,
                    self.layout_score,
                    self.parse_score,
                ],
                q=0.05,
            )
        )
 class ConfidenceReport(PageConfidenceScores):
    pages: Dict[int, PageConfidenceScores] = Field(
        default_factory=lambda: defaultdict(PageConfidenceScores)
    )
    @computed_field  # type: ignore
    @property
    def mean_score(self) -> ScoreValue:
        return ScoreValue(
            np.nanmean(
                [c.mean_score for c in self.pages.values()],
            )
        )
    @computed_field  # type: ignore
    @property
    def low_score(self) -> ScoreValue:
        return ScoreValue(
            np.nanmean(
                [c.low_score for c in self.pages.values()],
            )
        )
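Review note: the grading thresholds and the NaN handling in `PageConfidenceScores` can be reproduced without numpy. This is a pure-Python sketch; `score_to_grade` mirrors the comparisons in the diff, while `nan_mean` is a hypothetical stand-in for `np.nanmean` and not the code used in the commit.

```python
import math


def score_to_grade(score: float) -> str:
    if score < 0.5:
        return "poor"
    elif score < 0.8:
        return "fair"
    elif score < 0.9:
        return "good"
    elif score >= 0.9:
        return "excellent"
    # Reached only for NaN, because every comparison with NaN is False.
    return "unspecified"


def nan_mean(values: list[float]) -> float:
    """Mean over the values that are not NaN; NaN when none are present."""
    present = [v for v in values if not math.isnan(v)]
    return sum(present) / len(present) if present else math.nan


scores = [0.9, math.nan, 0.7]  # e.g. one score was never computed
print(score_to_grade(nan_mean(scores)))
print(score_to_grade(math.nan))
```

Since all four scores default to NaN, a page with no recorded scores grades as `unspecified` and not as `poor`.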
--- a/docling/datamodel/document.py
+++ b/docling/datamodel/document.py
@ -47,7 +47,7 @@ from docling_core.types.legacy_doc.document import (
 )
 from docling_core.utils.file import resolve_source_to_stream
 from docling_core.utils.legacy import docling_document_to_legacy
-from pydantic import BaseModel
+from pydantic import BaseModel, Field
 from typing_extensions import deprecated
 from docling.backend.abstract_backend import (
@ -56,6 +56,7 @@ from docling.backend.abstract_backend import (
 )
 from docling.datamodel.base_models import (
    AssembledUnit,
    ConfidenceReport,
    ConversionStatus,
    DocumentStream,
    ErrorItem,
@ -201,6 +202,7 @@ class ConversionResult(BaseModel):
    pages: List[Page] = []
    assembled: AssembledUnit = AssembledUnit()
    timings: Dict[str, ProfilingItem] = {}
    confidence: ConfidenceReport = Field(default_factory=ConfidenceReport)
    document: DoclingDocument = _EMPTY_DOCLING_DOC
@ -302,7 +304,7 @@ class _DocumentConversionInput(BaseModel):
                    if ("." in obj.name and not obj.name.startswith("."))
                    else ""
                )
-                mime = _DocumentConversionInput._mime_from_extension(ext)
+                mime = _DocumentConversionInput._mime_from_extension(ext.lower())
            if mime is not None and mime.lower() == "application/zip":
                objname = obj.name.lower()
                if objname.endswith(".xlsx"):
@ -376,6 +378,13 @@ class _DocumentConversionInput(BaseModel):
            mime = FormatToMimeType[InputFormat.JSON_DOCLING][0]
        elif ext in FormatToExtensions[InputFormat.PDF]:
            mime = FormatToMimeType[InputFormat.PDF][0]
        elif ext in FormatToExtensions[InputFormat.DOCX]:
            mime = FormatToMimeType[InputFormat.DOCX][0]
        elif ext in FormatToExtensions[InputFormat.PPTX]:
            mime = FormatToMimeType[InputFormat.PPTX][0]
        elif ext in FormatToExtensions[InputFormat.XLSX]:
            mime = FormatToMimeType[InputFormat.XLSX][0]
        return mime
    @staticmethod
--- a/docling/datamodel/pipeline_options.py
+++ b/docling/datamodel/pipeline_options.py
@ -225,6 +225,7 @@ class PictureDescriptionApiOptions(PictureDescriptionBaseOptions):
    headers: Dict[str, str] = {}
    params: Dict[str, Any] = {}
    timeout: float = 20
    concurrency: int = 1
    prompt: str = "Describe this image in a few sentences."
    provenance: str = ""
@ -295,6 +296,7 @@ class ApiVlmOptions(BaseVlmOptions):
    params: Dict[str, Any] = {}
    scale: float = 2.0
    timeout: float = 60
    concurrency: int = 1
    response_format: ResponseFormat
--- a/docling/datamodel/settings.py
+++ b/docling/datamodel/settings.py
@ -56,13 +56,15 @@ class DebugSettings(BaseModel):
 class AppSettings(BaseSettings):
-    model_config = SettingsConfigDict(env_prefix="DOCLING_", env_nested_delimiter="_")
+    model_config = SettingsConfigDict(
        env_prefix="DOCLING_", env_nested_delimiter="_", env_nested_max_split=1
    )
-    perf: BatchConcurrencySettings
+    perf: BatchConcurrencySettings = BatchConcurrencySettings()
-    debug: DebugSettings
+    debug: DebugSettings = DebugSettings()
    cache_dir: Path = Path.home() / ".cache" / "docling"
    artifacts_path: Optional[Path] = None
-settings = AppSettings(perf=BatchConcurrencySettings(), debug=DebugSettings())
+settings = AppSettings()
--- a/docling/models/api_vlm_model.py
+++ b/docling/models/api_vlm_model.py
@ -1,4 +1,5 @@
 from collections.abc import Iterable
 from concurrent.futures import ThreadPoolExecutor
 from docling.datamodel.base_models import Page, VlmPrediction
 from docling.datamodel.document import ConversionResult
@ -27,6 +28,7 @@ class ApiVlmModel(BasePageModel):
                )
            self.timeout = self.vlm_options.timeout
            self.concurrency = self.vlm_options.concurrency
            self.prompt_content = (
                f"This is a page from a document.\n{self.vlm_options.prompt}"
            )
@ -38,10 +40,10 @@ class ApiVlmModel(BasePageModel):
    def __call__(
        self, conv_res: ConversionResult, page_batch: Iterable[Page]
    ) -> Iterable[Page]:
-        for page in page_batch:
+        def _vlm_request(page):
            assert page._backend is not None
            if not page._backend.is_valid():
-                yield page
+                return page
            else:
                with TimeRecorder(conv_res, "vlm"):
                    assert page.size is not None
@ -63,4 +65,7 @@ class ApiVlmModel(BasePageModel):
                    page.predictions.vlm_response = VlmPrediction(text=page_tags)
-                yield page
+                return page
        with ThreadPoolExecutor(max_workers=self.concurrency) as executor:
            yield from executor.map(_vlm_request, page_batch)
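The change above replaces a sequential loop with `ThreadPoolExecutor.map`. A minimal sketch of that pattern follows, assuming a hypothetical `fake_request` function in place of the real API call. It illustrates that `executor.map` yields results in input order even when later items finish first, so the generator still returns pages in their original sequence.

```python
import time
from collections.abc import Iterable, Iterator
from concurrent.futures import ThreadPoolExecutor


def fake_request(item: int) -> str:
    # Hypothetical stand-in for a remote API call; earlier items take longer.
    time.sleep(0.01 * (5 - item))
    return f"response-{item}"


def process_batch(items: Iterable[int], concurrency: int = 1) -> Iterator[str]:
    # Same shape as the diff: a worker function handed to executor.map.
    with ThreadPoolExecutor(max_workers=concurrency) as executor:
        # map() yields results in the order of the inputs, not completion order.
        yield from executor.map(fake_request, items)


results = list(process_batch(range(5), concurrency=3))
print(results)
```

With the default `concurrency=1` the behaviour matches the old sequential loop, which is presumably why the new option defaults to 1.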
--- a/docling/models/layout_model.py
+++ b/docling/models/layout_model.py
@ -5,6 +5,7 @@ from collections.abc import Iterable
 from pathlib import Path
 from typing import Optional
 import numpy as np
 from docling_core.types.doc import DocItemLabel
 from docling_ibm_models.layoutmodel.layout_predictor import LayoutPredictor
 from PIL import Image
@ -184,6 +185,14 @@ class LayoutModel(BasePageModel):
                    ).postprocess()
                    # processed_clusters, processed_cells = clusters, page.cells
                    conv_res.confidence.pages[page.page_no].layout_score = float(
                        np.mean([c.confidence for c in processed_clusters])
                    )
                    conv_res.confidence.pages[page.page_no].ocr_score = float(
                        np.mean([c.confidence for c in processed_cells if c.from_ocr])
                    )
                    page.cells = processed_cells
                    page.predictions.layout = LayoutPrediction(
                        clusters=processed_clusters
--- a/docling/models/page_assemble_model.py
+++ b/docling/models/page_assemble_model.py
@ -3,6 +3,7 @@ import re
 from collections.abc import Iterable
 from typing import List
 import numpy as np
 from pydantic import BaseModel
 from docling.datamodel.base_models import (
--- a/docling/models/page_preprocessing_model.py
+++ b/docling/models/page_preprocessing_model.py
@ -1,11 +1,13 @@
 import re
 from collections.abc import Iterable
 from pathlib import Path
 from typing import Optional
 import numpy as np
 from PIL import ImageDraw
 from pydantic import BaseModel
-from docling.datamodel.base_models import Page
+from docling.datamodel.base_models import Page, ScoreValue
 from docling.datamodel.document import ConversionResult
 from docling.datamodel.settings import settings
 from docling.models.base_model import BasePageModel
@ -21,6 +23,14 @@ class PagePreprocessingModel(BasePageModel):
    def __init__(self, options: PagePreprocessingOptions):
        self.options = options
        # Pre-compiled regex patterns for efficiency
        self.GLYPH_RE = re.compile(r"GLYPH<[0-9A-Fa-f]+>")
        self.SLASH_G_RE = re.compile(r"(?:/G\d+){2,}")
        self.FRAG_RE = re.compile(r"\b[A-Za-z](?:/[a-z]{1,3}\.[a-z]{1,3}){2,}\b")
        self.SLASH_NUMBER_GARBAGE_RE = re.compile(
            r"(?:/\w+\s*){2,}"
        )  # Two or more "/token " sequences
    def __call__(
        self, conv_res: ConversionResult, page_batch: Iterable[Page]
    ) -> Iterable[Page]:
@ -60,6 +70,18 @@ class PagePreprocessingModel(BasePageModel):
        if self.options.create_parsed_page:
            page.parsed_page = page._backend.get_segmented_page()
        # Rate the text quality from the PDF parser, and aggregate on page
        text_scores = []
        for c in page.cells:
            score = self.rate_text_quality(c.text)
            text_scores.append(score)
        conv_res.confidence.pages[page.page_no].parse_score = float(
            np.nanquantile(
                text_scores, q=0.10
            )  # To emphasise problems in the parse_score, we take the 10th percentile score of all text cells.
        )
        # DEBUG code:
        def draw_text_boxes(image, cells, show: bool = False):
            draw = ImageDraw.Draw(image)
@ -88,3 +110,30 @@ class PagePreprocessingModel(BasePageModel):
            draw_text_boxes(page.get_image(scale=1.0), page.cells)
        return page
    def rate_text_quality(self, text: str) -> float:
        # Hard errors: if any of these patterns are found, return 0.0 immediately.
        blacklist_chars = ["\ufffd"]  # U+FFFD, the Unicode replacement character
        if (
            any(text.find(c) >= 0 for c in blacklist_chars)
            or self.GLYPH_RE.search(text)
            or self.SLASH_G_RE.search(text)
            or self.SLASH_NUMBER_GARBAGE_RE.match(
                text
            )  # Check if text is mostly slash-number pattern
        ):
            return 0.0
        penalty = 0.0
        # Apply a penalty only if the fragmented words pattern occurs at least three times.
        frag_matches = self.FRAG_RE.findall(text)
        if len(frag_matches) >= 3:
            penalty += 0.1 * len(frag_matches)
        # Additional heuristic: if the average token length is below 2, add a penalty.
        # tokens = text.split()
        # if tokens and (sum(map(len, tokens)) / len(tokens)) < 2:
        #    penalty += 0.2
        return max(1.0 - penalty, 0.0)
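The hard-error checks in `rate_text_quality` can be tried in isolation. The sketch below copies three of the regular expressions from the diff into module-level names (`SLASH_TOKEN_RE` and `has_hard_error` are illustrative names, not part of docling) and runs them on made-up sample strings. Note that the slash-token pattern is applied with `match`, so it only fires when the garbage starts at the beginning of the text.

```python
import re

# Patterns copied from the diff; the module-level names are illustrative.
GLYPH_RE = re.compile(r"GLYPH<[0-9A-Fa-f]+>")
SLASH_G_RE = re.compile(r"(?:/G\d+){2,}")
SLASH_TOKEN_RE = re.compile(r"(?:/\w+\s*){2,}")


def has_hard_error(text: str) -> bool:
    # True when any pattern that forces a score of 0.0 is present.
    return bool(
        "\ufffd" in text
        or GLYPH_RE.search(text)
        or SLASH_G_RE.search(text)
        or SLASH_TOKEN_RE.match(text)  # anchored at the start of the text
    )


# Made-up sample strings for illustration only.
samples = {
    "clean": "Total revenue was 4.2 million.",
    "glyph": "GLYPH<0A3F> GLYPH<12>",
    "slash_g": "text /G21/G35 text",
    "slash_tokens": "/a1 /b2 /c3",
    "path_inside": "see /usr/bin for details",
}
flags = {name: has_hard_error(text) for name, text in samples.items()}
print(flags)
```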
--- a/docling/models/picture_description_api_model.py
+++ b/docling/models/picture_description_api_model.py
@ -1,4 +1,5 @@
 from collections.abc import Iterable
 from concurrent.futures import ThreadPoolExecutor
 from pathlib import Path
 from typing import Optional, Type, Union
@ -37,6 +38,7 @@ class PictureDescriptionApiModel(PictureDescriptionBaseModel):
            accelerator_options=accelerator_options,
        )
        self.options: PictureDescriptionApiOptions
        self.concurrency = self.options.concurrency
        if self.enabled:
            if not enable_remote_services:
@ -48,8 +50,8 @@ class PictureDescriptionApiModel(PictureDescriptionBaseModel):
    def _annotate_images(self, images: Iterable[Image.Image]) -> Iterable[str]:
        # Note: technically we could make a batch request here,
        # but not all APIs will allow for it. For example, vllm won't allow more than 1.
-        for image in images:
+        def _api_request(image):
-            yield api_image_request(
+            return api_image_request(
                image=image,
                prompt=self.options.prompt,
                url=self.options.url,
@ -57,3 +59,6 @@ class PictureDescriptionApiModel(PictureDescriptionBaseModel):
                headers=self.options.headers,
                **self.options.params,
            )
        with ThreadPoolExecutor(max_workers=self.concurrency) as executor:
            yield from executor.map(_api_request, images)
--- a/docling/models/tesseract_ocr_cli_model.py
+++ b/docling/models/tesseract_ocr_cli_model.py
@ -2,6 +2,7 @@ import csv
 import io
 import logging
 import os
 import subprocess
 import tempfile
 from collections.abc import Iterable
 from pathlib import Path
@ -10,7 +11,7 @@ from typing import List, Optional, Tuple, Type
 import pandas as pd
 from docling_core.types.doc import BoundingBox, CoordOrigin
-from docling_core.types.doc.page import BoundingRectangle, TextCell
+from docling_core.types.doc.page import TextCell
 from docling.datamodel.base_models import Page
 from docling.datamodel.document import ConversionResult
@ -21,7 +22,11 @@ from docling.datamodel.pipeline_options import (
 )
 from docling.datamodel.settings import settings
 from docling.models.base_ocr_model import BaseOcrModel
-from docling.utils.ocr_utils import map_tesseract_script
+from docling.utils.ocr_utils import (
    map_tesseract_script,
    parse_tesseract_orientation,
    tesseract_box_to_bounding_rectangle,
 )
 from docling.utils.profiling import TimeRecorder
 _log = logging.getLogger(__name__)
@ -49,6 +54,7 @@ class TesseractOcrCliModel(BaseOcrModel):
        self._version: Optional[str] = None
        self._tesseract_languages: Optional[List[str]] = None
        self._script_prefix: Optional[str] = None
        self._is_auto: bool = "auto" in self.options.lang
        if self.enabled:
            try:
@ -93,14 +99,13 @@ class TesseractOcrCliModel(BaseOcrModel):
        return name, version
-    def _run_tesseract(self, ifilename: str):
+    def _run_tesseract(self, ifilename: str, osd: pd.DataFrame):
        r"""
        Run tesseract CLI
        """
        cmd = [self.options.tesseract_cmd]
-
-        if "auto" in self.options.lang:
-            lang = self._detect_language(ifilename)
+        if self._is_auto:
+            lang = self._parse_language(osd)
            if lang is not None:
                cmd.append("-l")
                cmd.append(lang)
@ -115,13 +120,12 @@ class TesseractOcrCliModel(BaseOcrModel):
        cmd += [ifilename, "stdout", "tsv"]
        _log.info("command: {}".format(" ".join(cmd)))
-        proc = Popen(cmd, stdout=PIPE, stderr=DEVNULL)
-        output, _ = proc.communicate()
+        output = subprocess.run(cmd, stdout=PIPE, stderr=DEVNULL, check=True)
        # _log.info(output)
        # Decode the byte string to a regular string
-        decoded_data = output.decode("utf-8")
+        decoded_data = output.stdout.decode("utf-8")
        # _log.info(decoded_data)
        # Read the TSV file generated by Tesseract
@ -139,22 +143,24 @@ class TesseractOcrCliModel(BaseOcrModel):
        return df_filtered
-    def _detect_language(self, ifilename: str):
+    def _perform_osd(self, ifilename: str) -> pd.DataFrame:
        r"""
        Run tesseract in PSM 0 mode to detect the language
        """
        assert self._tesseract_languages is not None
        cmd = [self.options.tesseract_cmd]
        cmd.extend(["--psm", "0", "-l", "osd", ifilename, "stdout"])
        _log.info("command: {}".format(" ".join(cmd)))
-        proc = Popen(cmd, stdout=PIPE, stderr=DEVNULL)
-        output, _ = proc.communicate()
-        decoded_data = output.decode("utf-8")
+        output = subprocess.run(cmd, capture_output=True, check=True)
+        decoded_data = output.stdout.decode("utf-8")
        df_detected = pd.read_csv(
            io.StringIO(decoded_data), sep=":", header=None, names=["key", "value"]
        )
-        scripts = df_detected.loc[df_detected["key"] == "Script"].value.tolist()
+        return df_detected
    def _parse_language(self, df_osd: pd.DataFrame) -> Optional[str]:
        assert self._tesseract_languages is not None
        scripts = df_osd.loc[df_osd["key"] == "Script"].value.tolist()
        if len(scripts) == 0:
            _log.warning("Tesseract cannot detect the script of the page")
            return None
@ -182,9 +188,8 @@ class TesseractOcrCliModel(BaseOcrModel):
        cmd = [self.options.tesseract_cmd]
        cmd.append("--list-langs")
        _log.info("command: {}".format(" ".join(cmd)))
-        proc = Popen(cmd, stdout=PIPE, stderr=DEVNULL)
-        output, _ = proc.communicate()
-        decoded_data = output.decode("utf-8")
+        output = subprocess.run(cmd, stdout=PIPE, stderr=DEVNULL, check=True)
+        decoded_data = output.stdout.decode("utf-8")
        df_list = pd.read_csv(io.StringIO(decoded_data), header=None)
        self._tesseract_languages = df_list[0].tolist()[1:]
@ -203,7 +208,7 @@ class TesseractOcrCliModel(BaseOcrModel):
            yield from page_batch
            return
-        for page in page_batch:
+        for page_i, page in enumerate(page_batch):
            assert page._backend is not None
            if not page._backend.is_valid():
                yield page
@ -212,7 +217,7 @@ class TesseractOcrCliModel(BaseOcrModel):
                    ocr_rects = self.get_ocr_rects(page)
                    all_ocr_cells = []
-                    for ocr_rect in ocr_rects:
+                    for ocr_rect_i, ocr_rect in enumerate(ocr_rects):
                        # Skip zero area boxes
                        if ocr_rect.area() == 0:
                            continue
@ -225,8 +230,42 @@ class TesseractOcrCliModel(BaseOcrModel):
                            ) as image_file:
                                fname = image_file.name
                                high_res_image.save(image_file)
-
+                            doc_orientation = 0
-                            df_result = self._run_tesseract(fname)
+                            try:
                                df_osd = self._perform_osd(fname)
                                doc_orientation = _parse_orientation(df_osd)
                            except subprocess.CalledProcessError as exc:
                                _log.error(
                                    "OSD failed (doc %s, page: %s, "
                                    "OCR rectangle: %s, processed image file %s):\n %s",
                                    conv_res.input.file,
                                    page_i,
                                    ocr_rect_i,
                                    image_file,
                                    exc.stderr,
                                )
                                # Skipping if OSD fail when in auto mode, otherwise proceed
                                # to OCR in the hope OCR will succeed while OSD failed
                                if self._is_auto:
                                    continue
                            if doc_orientation != 0:
                                high_res_image = high_res_image.rotate(
                                    -doc_orientation, expand=True
                                )
                                high_res_image.save(fname)
                            try:
                                df_result = self._run_tesseract(fname, df_osd)
                            except subprocess.CalledProcessError as exc:
                                _log.error(
                                    "tesseract OCR failed (doc %s, page: %s, "
                                    "OCR rectangle: %s, processed image file %s):\n %s",
                                    conv_res.input.file,
                                    page_i,
                                    ocr_rect_i,
                                    image_file,
                                    exc.stderr,
                                )
                                continue
                        finally:
                            if os.path.exists(fname):
                                os.remove(fname)
@ -238,31 +277,30 @@ class TesseractOcrCliModel(BaseOcrModel):
                            text = row["text"]
                            conf = row["conf"]
-                            l = float(row["left"])  # noqa: E741
-                            b = float(row["top"])
-                            w = float(row["width"])
-                            h = float(row["height"])
-
-                            t = b + h
-                            r = l + w
-
+                            left, top = float(row["left"]), float(row["top"])
+                            right = left + float(row["width"])
+                            bottom = top + row["height"]
+                            bbox = BoundingBox(
+                                l=left,
+                                t=top,
+                                r=right,
+                                b=bottom,
                                coord_origin=CoordOrigin.TOPLEFT,
                            )
                            rect = tesseract_box_to_bounding_rectangle(
                                bbox,
                                original_offset=ocr_rect,
                                scale=self.scale,
                                orientation=doc_orientation,
                                im_size=high_res_image.size,
                            )
                            cell = TextCell(
                                index=ix,
                                text=str(text),
-                                orig=text,
+                                orig=str(text),
                                from_ocr=True,
                                confidence=conf / 100.0,
-                                rect=BoundingRectangle.from_bounding_box(
-                                    BoundingBox.from_tuple(
-                                        coord=(
-                                            (l / self.scale) + ocr_rect.l,
-                                            (b / self.scale) + ocr_rect.t,
-                                            (r / self.scale) + ocr_rect.l,
-                                            (t / self.scale) + ocr_rect.t,
-                                        ),
-                                        origin=CoordOrigin.TOPLEFT,
-                                    )
-                                ),
+                                rect=rect,
                            )
                            all_ocr_cells.append(cell)
@ -278,3 +316,9 @@ class TesseractOcrCliModel(BaseOcrModel):
    @classmethod
    def get_options_type(cls) -> Type[OcrOptions]:
        return TesseractCliOcrOptions
 def _parse_orientation(df_osd: pd.DataFrame) -> int:
    orientations = df_osd.loc[df_osd["key"] == "Orientation in degrees"].value.tolist()
    orientation = parse_tesseract_orientation(orientations[0].strip())
    return orientation
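`_perform_osd` and `_parse_orientation` rely on Tesseract's `--psm 0` output being a list of `key: value` lines. The sketch below shows that parsing step without pandas, on an illustrative sample of OSD output. The sample values and the `parse_osd` helper are assumptions for demonstration, and the conversion done by `parse_tesseract_orientation` is not reproduced here.

```python
# Illustrative sample of `tesseract --psm 0 -l osd` output (values made up).
SAMPLE_OSD_OUTPUT = """\
Page number: 0
Orientation in degrees: 270
Rotate: 90
Orientation confidence: 3.04
Script: Latin
Script confidence: 1.33
"""


def parse_osd(output: str) -> dict[str, str]:
    # Split each line on the first colon, mirroring sep=":" in the diff.
    fields: dict[str, str] = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


osd = parse_osd(SAMPLE_OSD_OUTPUT)
orientation = int(osd["Orientation in degrees"])
script = osd["Script"]
print(orientation, script)
```

Running OSD once and reusing the parsed result for both orientation and script is what lets the diff pass `df_osd` into `_run_tesseract` instead of invoking Tesseract a second time.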
--- a/docling/models/tesseract_ocr_model.py
+++ b/docling/models/tesseract_ocr_model.py
@ -1,12 +1,11 @@
 from __future__ import annotations
 import logging
 from collections.abc import Iterable
 from pathlib import Path
-from typing import Optional, Type
+from typing import Iterable, Optional, Type
 from docling_core.types.doc import BoundingBox, CoordOrigin
-from docling_core.types.doc.page import BoundingRectangle, TextCell
+from docling_core.types.doc.page import TextCell
 from docling.datamodel.base_models import Page
 from docling.datamodel.document import ConversionResult
@ -17,7 +16,11 @@ from docling.datamodel.pipeline_options import (
 )
 from docling.datamodel.settings import settings
 from docling.models.base_ocr_model import BaseOcrModel
-from docling.utils.ocr_utils import map_tesseract_script
+from docling.utils.ocr_utils import (
    map_tesseract_script,
    parse_tesseract_orientation,
    tesseract_box_to_bounding_rectangle,
 )
 from docling.utils.profiling import TimeRecorder
 _log = logging.getLogger(__name__)
@ -38,7 +41,7 @@ class TesseractOcrModel(BaseOcrModel):
            accelerator_options=accelerator_options,
        )
        self.options: TesseractOcrOptions
-
+        self._is_auto: bool = "auto" in self.options.lang
        self.scale = 3  # multiplier for 72 dpi == 216 dpi.
        self.reader = None
        self.script_readers: dict[str, tesserocr.PyTessBaseAPI] = {}
@ -95,13 +98,13 @@ class TesseractOcrModel(BaseOcrModel):
            if lang == "auto":
                self.reader = tesserocr.PyTessBaseAPI(**tesserocr_kwargs)
-                self.osd_reader = tesserocr.PyTessBaseAPI(
-                    **{"lang": "osd", "psm": tesserocr.PSM.OSD_ONLY} | tesserocr_kwargs
-                )
            else:
                self.reader = tesserocr.PyTessBaseAPI(
                    **{"lang": lang} | tesserocr_kwargs,
                )
+            self.osd_reader = tesserocr.PyTessBaseAPI(
+                **{"lang": "osd", "psm": tesserocr.PSM.OSD_ONLY} | tesserocr_kwargs
+            )
            self.reader_RIL = tesserocr.RIL
    def __del__(self):
@ -118,19 +121,20 @@ class TesseractOcrModel(BaseOcrModel):
            yield from page_batch
            return
-        for page in page_batch:
+        for page_i, page in enumerate(page_batch):
            assert page._backend is not None
            if not page._backend.is_valid():
                yield page
            else:
                with TimeRecorder(conv_res, "ocr"):
                    assert self.reader is not None
                    assert self.osd_reader is not None
                    assert self._tesserocr_languages is not None
                    ocr_rects = self.get_ocr_rects(page)
                    all_ocr_cells = []
-                    for ocr_rect in ocr_rects:
+                    for ocr_rect_i, ocr_rect in enumerate(ocr_rects):
                        # Skip zero area boxes
                        if ocr_rect.area() == 0:
                            continue
@ -139,16 +143,27 @@ class TesseractOcrModel(BaseOcrModel):
                        )
                        local_reader = self.reader
-                        if "auto" in self.options.lang:
-                            assert self.osd_reader is not None
                        self.osd_reader.SetImage(high_res_image)
                        osd = self.osd_reader.DetectOrientationScript()
-
-                            # No text, probably
+                        # No text, or Orientation and Script detection failure
                        if osd is None:
                            _log.error(
                                "OSD failed for doc (doc %s, page: %s, "
                                "OCR rectangle: %s)",
                                conv_res.input.file,
                                page_i,
                                ocr_rect_i,
                            )
                            # Skipping if OSD fail when in auto mode, otherwise proceed
                            # to OCR in the hope OCR will succeed while OSD failed
                            if self._is_auto:
                                continue
-
+                        doc_orientation = parse_tesseract_orientation(osd["orient_deg"])
                        if doc_orientation != 0:
                            high_res_image = high_res_image.rotate(
                                -doc_orientation, expand=True
                            )
                        if self._is_auto:
                            script = osd["script_name"]
                            script = map_tesseract_script(script)
                            lang = f"{self.script_prefix}{script}"
@ -188,11 +203,23 @@ class TesseractOcrModel(BaseOcrModel):
                            # Extract text within the bounding box
                            text = local_reader.GetUTF8Text().strip()
                            confidence = local_reader.MeanTextConf()
-                            left = box["x"] / self.scale
-                            bottom = box["y"] / self.scale
-                            right = (box["x"] + box["w"]) / self.scale
-                            top = (box["y"] + box["h"]) / self.scale
-
+                            left, top = box["x"], box["y"]
+                            right = left + box["w"]
+                            bottom = top + box["h"]
+                            bbox = BoundingBox(
+                                l=left,
                                t=top,
                                r=right,
                                b=bottom,
                                coord_origin=CoordOrigin.TOPLEFT,
                            )
                            rect = tesseract_box_to_bounding_rectangle(
                                bbox,
                                original_offset=ocr_rect,
                                scale=self.scale,
                                orientation=doc_orientation,
                                im_size=high_res_image.size,
                            )
                            cells.append(
                                TextCell(
                                    index=ix,
@ -200,12 +227,7 @@ class TesseractOcrModel(BaseOcrModel):
                                    orig=text,
                                    from_ocr=True,
                                    confidence=confidence,
-                                    rect=BoundingRectangle.from_bounding_box(
-                                        BoundingBox.from_tuple(
-                                            coord=(left, top, right, bottom),
-                                            origin=CoordOrigin.TOPLEFT,
-                                        ),
-                                    ),
+                                    rect=rect,
                                )
                            )
--- a/docling/pipeline/standard_pdf_pipeline.py
+++ b/docling/pipeline/standard_pdf_pipeline.py
@ -3,11 +3,12 @@ import warnings
 from pathlib import Path
 from typing import Optional, cast
 import numpy as np
 from docling_core.types.doc import DocItem, ImageRef, PictureItem, TableItem
 from docling.backend.abstract_backend import AbstractDocumentBackend
 from docling.backend.pdf_backend import PdfDocumentBackend
-from docling.datamodel.base_models import AssembledUnit, Page
+from docling.datamodel.base_models import AssembledUnit, Page, PageConfidenceScores
 from docling.datamodel.document import ConversionResult
 from docling.datamodel.pipeline_options import PdfPipelineOptions
 from docling.datamodel.settings import settings
@ -60,7 +61,7 @@ class StandardPdfPipeline(PaginatedPipeline):
            or self.pipeline_options.generate_table_images
        )
-        self.glm_model = ReadingOrderModel(options=ReadingOrderOptions())
+        self.reading_order_model = ReadingOrderModel(options=ReadingOrderOptions())
        ocr_model = self.get_ocr_model(artifacts_path=artifacts_path)
@ -197,7 +198,7 @@ class StandardPdfPipeline(PaginatedPipeline):
                elements=all_elements, headers=all_headers, body=all_body
            )
-            conv_res.document = self.glm_model(conv_res)
+            conv_res.document = self.reading_order_model(conv_res)
            # Generate page images in the output
            if self.pipeline_options.generate_page_images:
@ -244,6 +245,30 @@ class StandardPdfPipeline(PaginatedPipeline):
                            cropped_im, dpi=int(72 * scale)
                        )
            # Aggregate confidence values for document:
            if len(conv_res.pages) > 0:
                conv_res.confidence.layout_score = float(
                    np.nanmean(
                        [c.layout_score for c in conv_res.confidence.pages.values()]
                    )
                )
                conv_res.confidence.parse_score = float(
                    np.nanquantile(
                        [c.parse_score for c in conv_res.confidence.pages.values()],
                        q=0.1,  # parse score should relate to worst 10% of pages.
                    )
                )
                conv_res.confidence.table_score = float(
                    np.nanmean(
                        [c.table_score for c in conv_res.confidence.pages.values()]
                    )
                )
                conv_res.confidence.ocr_score = float(
                    np.nanmean(
                        [c.ocr_score for c in conv_res.confidence.pages.values()]
                    )
                )
        return conv_res
    @classmethod
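The aggregation above uses a mean for the layout, table, and OCR scores but a 10% quantile for the parse score. The sketch below shows why the low quantile is more sensitive to a few bad pages. `nan_quantile` is a hypothetical pure-Python stand-in that approximates the default linear interpolation of `np.nanquantile`, and the page scores are made up.

```python
import math


def nan_quantile(values: list[float], q: float) -> float:
    # Hypothetical stand-in for np.nanquantile (linear interpolation, NaNs ignored).
    clean = sorted(v for v in values if not math.isnan(v))
    if not clean:
        return math.nan
    position = q * (len(clean) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return clean[lower] + fraction * (clean[upper] - clean[lower])


# Made-up document: 17 clean pages, 3 badly parsed pages, 1 page without a score.
page_scores = [1.0] * 17 + [0.0] * 3 + [math.nan]

valid = [v for v in page_scores if not math.isnan(v)]
mean_score = sum(valid) / len(valid)
low_quantile = nan_quantile(page_scores, q=0.1)
print(mean_score, low_quantile)
```

In this example the mean still looks healthy, while the 10% quantile drops to the score of the worst pages, which matches the intent stated in the inline comment.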
--- a/docling/pipeline/vlm_pipeline.py
+++ b/docling/pipeline/vlm_pipeline.py
@ -3,7 +3,7 @@ from io import BytesIO
 from pathlib import Path
 from typing import List, Optional, Union, cast
-# from docling_core.types import DoclingDocument
+from docling_core.types import DoclingDocument
 from docling_core.types.doc import BoundingBox, DocItem, ImageRef, PictureItem, TextItem
 from docling_core.types.doc.document import DocTagsDocument
 from PIL import Image as PILImage
@ -133,24 +133,22 @@ class VlmPipeline(PaginatedPipeline):
                doctags_doc = DocTagsDocument.from_doctags_and_image_pairs(
                    doctags_list_c, image_list_c
                )
-                conv_res.document.load_from_doctags(doctags_doc)
+                conv_res.document = DoclingDocument.load_from_doctags(doctags_doc)
                # If forced backend text, replace model predicted text with backend one
-                if page.size:
                if self.force_backend_text:
                    scale = self.pipeline_options.images_scale
                    for element, _level in conv_res.document.iterate_items():
-                            if (
-                                not isinstance(element, TextItem)
-                                or len(element.prov) == 0
-                            ):
+                        if not isinstance(element, TextItem) or len(element.prov) == 0:
+                            continue
+                        page_ix = element.prov[0].page_no - 1
+                        page = conv_res.pages[page_ix]
                        if not page.size:
                            continue
                        crop_bbox = (
                            element.prov[0]
                            .bbox.scaled(scale=scale)
-                                .to_top_left_origin(
-                                    page_height=page.size.height * scale
-                                )
+                            .to_top_left_origin(page_height=page.size.height * scale)
                        )
                        txt = self.extract_text_from_backend(page, crop_bbox)
                        element.text = txt
--- a/docling/utils/layout_postprocessor.py
+++ b/docling/utils/layout_postprocessor.py
@ -90,17 +90,12 @@ class SpatialClusterIndex:
        containment_threshold: float,
    ) -> bool:
        """Check if two bboxes overlap sufficiently."""
-        area1, area2 = bbox1.area(), bbox2.area()
-        if area1 <= 0 or area2 <= 0:
+        if bbox1.area() <= 0 or bbox2.area() <= 0:
            return False
-        overlap_area = bbox1.intersection_area_with(bbox2)
-        if overlap_area <= 0:
-            return False
-        iou = overlap_area / (area1 + area2 - overlap_area)
-        containment1 = overlap_area / area1
-        containment2 = overlap_area / area2
+        iou = bbox1.intersection_over_union(bbox2)
+        containment1 = bbox1.intersection_over_self(bbox2)
+        containment2 = bbox2.intersection_over_self(bbox1)
        return (
            iou > overlap_threshold
@ -321,9 +316,7 @@ class LayoutPostprocessor:
        for special in special_clusters:
            contained = []
            for cluster in self.regular_clusters:
-                overlap = cluster.bbox.intersection_area_with(special.bbox)
-                if overlap > 0:
-                    containment = overlap / cluster.bbox.area()
+                containment = cluster.bbox.intersection_over_self(special.bbox)
                if containment > 0.8:
                    contained.append(cluster)
@ -379,9 +372,7 @@ class LayoutPostprocessor:
            for regular in self.regular_clusters:
                if regular.label == DocItemLabel.TABLE:
                    # Calculate overlap
-                    overlap = regular.bbox.intersection_area_with(wrapper.bbox)
-                    wrapper_area = wrapper.bbox.area()
-                    overlap_ratio = overlap / wrapper_area
+                    overlap_ratio = wrapper.bbox.intersection_over_self(regular.bbox)
                    conf_diff = wrapper.confidence - regular.confidence
@ -421,8 +412,7 @@ class LayoutPostprocessor:
        # Rule 2: CODE vs others
        if candidate.label == DocItemLabel.CODE:
            # Calculate how much of the other cluster is contained within the CODE cluster
-            overlap = other.bbox.intersection_area_with(candidate.bbox)
-            containment = overlap / other.bbox.area()
+            containment = other.bbox.intersection_over_self(candidate.bbox)
            if containment > 0.8:  # other is 80% contained within CODE
                return True
@ -586,11 +576,9 @@ class LayoutPostprocessor:
                if cell.rect.to_bounding_box().area() <= 0:
                    continue
-                overlap = cell.rect.to_bounding_box().intersection_area_with(
+                overlap_ratio = cell.rect.to_bounding_box().intersection_over_self(
                    cluster.bbox
                )
-                overlap_ratio = overlap / cell.rect.to_bounding_box().area()
                if overlap_ratio > best_overlap:
                    best_overlap = overlap_ratio
                    best_cluster = cluster
--- a/docling/utils/ocr_utils.py
+++ b/docling/utils/ocr_utils.py
@ -1,3 +1,11 @@
 from typing import Optional, Tuple
 from docling_core.types.doc import BoundingBox, CoordOrigin
 from docling_core.types.doc.page import BoundingRectangle
 from docling.utils.orientation import CLIPPED_ORIENTATIONS, rotate_bounding_box
 def map_tesseract_script(script: str) -> str:
    r""" """
    if script == "Katakana" or script == "Hiragana":
@ -7,3 +15,55 @@ def map_tesseract_script(script: str) -> str:
    elif script == "Korean":
        script = "Hangul"
    return script
 def parse_tesseract_orientation(orientation: str) -> int:
    # Tesseract reports orientation as one of [0, 90, 180, 270] degrees clockwise;
    # bounding rectangle angles are in [0, 360) degrees counterclockwise
    parsed = int(orientation)
    if parsed not in CLIPPED_ORIENTATIONS:
        msg = (
            f"invalid tesseract document orientation {orientation}, "
            f"expected orientation: {sorted(CLIPPED_ORIENTATIONS)}"
        )
        raise ValueError(msg)
    parsed = -parsed
    parsed %= 360
    return parsed
 def tesseract_box_to_bounding_rectangle(
    bbox: BoundingBox,
    *,
    original_offset: Optional[BoundingBox] = None,
    scale: float,
    orientation: int,
    im_size: Tuple[int, int],
 ) -> BoundingRectangle:
    # box is in the top, left, height, width format, top left coordinates
    rect = rotate_bounding_box(bbox, angle=-orientation, im_size=im_size)
    rect = BoundingRectangle(
        r_x0=rect.r_x0 / scale,
        r_y0=rect.r_y0 / scale,
        r_x1=rect.r_x1 / scale,
        r_y1=rect.r_y1 / scale,
        r_x2=rect.r_x2 / scale,
        r_y2=rect.r_y2 / scale,
        r_x3=rect.r_x3 / scale,
        r_y3=rect.r_y3 / scale,
        coord_origin=CoordOrigin.TOPLEFT,
    )
    if original_offset is not None:
        if original_offset.coord_origin is not CoordOrigin.TOPLEFT:
            msg = f"expected coordinate origin to be {CoordOrigin.TOPLEFT.value}"
            raise ValueError(msg)
        rect.r_x0 += original_offset.l
        rect.r_x1 += original_offset.l
        rect.r_x2 += original_offset.l
        rect.r_x3 += original_offset.l
        rect.r_y0 += original_offset.t
        rect.r_y1 += original_offset.t
        rect.r_y2 += original_offset.t
        rect.r_y3 += original_offset.t
    return rect
--- a/docling/utils/orientation.py
+++ b/docling/utils/orientation.py
@ -0,0 +1,71 @@
 from typing import Tuple
 from docling_core.types.doc import BoundingBox, CoordOrigin
 from docling_core.types.doc.page import BoundingRectangle
 CLIPPED_ORIENTATIONS = [0, 90, 180, 270]
 def rotate_bounding_box(
    bbox: BoundingBox, angle: int, im_size: Tuple[int, int]
 ) -> BoundingRectangle:
    # The box is given as (left, top, width, height) in TOPLEFT coordinates.
    # The bounding rectangle starts with r_0 at the bottom left, whatever the
    # coordinate system; the other corners follow counterclockwise.
    bbox = bbox.to_top_left_origin(im_size[1])
    left, top, width, height = bbox.l, bbox.t, bbox.width, bbox.height
    im_h, im_w = im_size
    angle = angle % 360
    if angle == 0:
        r_x0 = left
        r_y0 = top + height
        r_x1 = r_x0 + width
        r_y1 = r_y0
        r_x2 = r_x0 + width
        r_y2 = r_y0 - height
        r_x3 = r_x0
        r_y3 = r_y0 - height
    elif angle == 90:
        r_x0 = im_w - (top + height)
        r_y0 = left
        r_x1 = r_x0
        r_y1 = r_y0 + width
        r_x2 = r_x0 + height
        r_y2 = r_y0 + width
        r_x3 = r_x0 + height
        r_y3 = r_y0
    elif angle == 180:
        r_x0 = im_h - left
        r_y0 = im_w - (top + height)
        r_x1 = r_x0 - width
        r_y1 = r_y0
        r_x2 = r_x0 - width
        r_y2 = r_y0 + height
        r_x3 = r_x0
        r_y3 = r_y0 + height
    elif angle == 270:
        r_x0 = top + height
        r_y0 = im_h - left
        r_x1 = r_x0
        r_y1 = r_y0 - width
        r_x2 = r_x0 - height
        r_y2 = r_y0 - width
        r_x3 = r_x0 - height
        r_y3 = r_y0
    else:
        msg = (
            f"invalid orientation {angle}, expected values in:"
            f" {sorted(CLIPPED_ORIENTATIONS)}"
        )
        raise ValueError(msg)
    return BoundingRectangle(
        r_x0=r_x0,
        r_y0=r_y0,
        r_x1=r_x1,
        r_y1=r_y1,
        r_x2=r_x2,
        r_y2=r_y2,
        r_x3=r_x3,
        r_y3=r_y3,
        coord_origin=CoordOrigin.TOPLEFT,
    )
--- a/docs/concepts/chunking.md
+++ b/docs/concepts/chunking.md
@ -71,7 +71,10 @@ tokens), &
 chunks with same headings & captions) — users can opt out of this step via param
 `merge_peers` (by default `True`)
-👉 Example: see  [here](../examples/hybrid_chunking.ipynb).
+👉 Usage examples:
 - [Hybrid chunking](../examples/hybrid_chunking.ipynb)
 - [Advanced chunking & serialization](../examples/advanced_chunking_and_serialization.ipynb)
 ## Hierarchical Chunker
--- a/docs/examples/advanced_chunking_and_serialization.ipynb
+++ b/docs/examples/advanced_chunking_and_serialization.ipynb
@ -0,0 +1,559 @@
 {
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Advanced chunking & serialization"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Overview"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this notebook we show how to customize the serialization strategies that come into\n",
    "play during chunking."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setup"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We will work with a document that contains some [picture annotations](../pictures_description):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from docling_core.types.doc.document import DoclingDocument\n",
    "\n",
    "SOURCE = \"./data/2408.09869v3_enriched.json\"\n",
    "\n",
    "doc = DoclingDocument.load_from_json(SOURCE)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Below we define the chunker (for more details check out [Hybrid Chunking](../hybrid_chunking)):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from docling_core.transforms.chunker.hybrid_chunker import HybridChunker\n",
    "from docling_core.transforms.chunker.tokenizer.base import BaseTokenizer\n",
    "from docling_core.transforms.chunker.tokenizer.huggingface import HuggingFaceTokenizer\n",
    "from transformers import AutoTokenizer\n",
    "\n",
    "EMBED_MODEL_ID = \"sentence-transformers/all-MiniLM-L6-v2\"\n",
    "\n",
    "tokenizer: BaseTokenizer = HuggingFaceTokenizer(\n",
    "    tokenizer=AutoTokenizer.from_pretrained(EMBED_MODEL_ID),\n",
    ")\n",
    "chunker = HybridChunker(tokenizer=tokenizer)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tokenizer.get_max_tokens()=512\n"
     ]
    }
   ],
   "source": [
    "print(f\"{tokenizer.get_max_tokens()=}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we define some helper functions:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "from typing import Iterable, Optional\n",
    "\n",
    "from docling_core.transforms.chunker.base import BaseChunk\n",
    "from docling_core.transforms.chunker.hierarchical_chunker import DocChunk\n",
    "from docling_core.types.doc.labels import DocItemLabel\n",
    "from rich.console import Console\n",
    "from rich.panel import Panel\n",
    "\n",
    "console = Console(\n",
    "    width=200,  # for getting Markdown tables rendered nicely\n",
    ")\n",
    "\n",
    "\n",
    "def find_n_th_chunk_with_label(\n",
    "    iter: Iterable[BaseChunk], n: int, label: DocItemLabel\n",
    ") -> Optional[DocChunk]:\n",
    "    num_found = -1\n",
    "    for i, chunk in enumerate(iter):\n",
    "        doc_chunk = DocChunk.model_validate(chunk)\n",
    "        for it in doc_chunk.meta.doc_items:\n",
    "            if it.label == label:\n",
    "                num_found += 1\n",
    "                if num_found == n:\n",
    "                    return i, chunk\n",
    "    return None, None\n",
    "\n",
    "\n",
    "def print_chunk(chunks, chunk_pos):\n",
    "    chunk = chunks[chunk_pos]\n",
    "    ctx_text = chunker.contextualize(chunk=chunk)\n",
    "    num_tokens = tokenizer.count_tokens(text=ctx_text)\n",
    "    doc_items_refs = [it.self_ref for it in chunk.meta.doc_items]\n",
    "    title = f\"{chunk_pos=} {num_tokens=} {doc_items_refs=}\"\n",
    "    console.print(Panel(ctx_text, title=title))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Table serialization"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Using the default strategy"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Below we inspect the first chunk containing a table — using the default serialization strategy:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Token indices sequence length is longer than the specified maximum sequence length for this model (652 > 512). Running this sequence through the model will result in indexing errors\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">╭────────────────────────────────────────────────────────────── chunk_pos=13 num_tokens=426 doc_items_refs=['#/texts/72', '#/tables/0'] ───────────────────────────────────────────────────────────────╮\n",
       "│ Docling Technical Report                                                                                                                                                                             │\n",
       "│ 4 Performance                                                                                                                                                                                        │\n",
       "│ Table 1: Runtime characteristics of Docling with the standard model pipeline and settings, on our test dataset of 225 pages, on two different systems. OCR is disabled. We show the time-to-solution │\n",
       "│ (TTS), computed throughput in pages per second, and the peak memory used (resident set size) for both the Docling-native PDF backend and for the pypdfium backend, using 4 and 16 threads.           │\n",
       "│                                                                                                                                                                                                      │\n",
       "│ Apple M3 Max, Thread budget. = 4. Apple M3 Max, native backend.TTS = 177 s 167 s. Apple M3 Max, native backend.Pages/s = 1.27 1.34. Apple M3 Max, native backend.Mem = 6.20 GB. Apple M3 Max,        │\n",
       "│ pypdfium backend.TTS = 103 s 92 s. Apple M3 Max, pypdfium backend.Pages/s = 2.18 2.45. Apple M3 Max, pypdfium backend.Mem = 2.56 GB. (16 cores) Intel(R) Xeon E5-2690, Thread budget. = 16 4 16. (16 │\n",
       "│ cores) Intel(R) Xeon E5-2690, native backend.TTS = 375 s 244 s. (16 cores) Intel(R) Xeon E5-2690, native backend.Pages/s = 0.60 0.92. (16 cores) Intel(R) Xeon E5-2690, native backend.Mem = 6.16    │\n",
       "│ GB. (16 cores) Intel(R) Xeon E5-2690, pypdfium backend.TTS = 239 s 143 s. (16 cores) Intel(R) Xeon E5-2690, pypdfium backend.Pages/s = 0.94 1.57. (16 cores) Intel(R) Xeon E5-2690, pypdfium         │\n",
       "│ backend.Mem = 2.42 GB                                                                                                                                                                                │\n",
       "╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯\n",
       "</pre>\n"
      ],
      "text/plain": [
       "╭────────────────────────────────────────────────────────────── chunk_pos=13 num_tokens=426 doc_items_refs=['#/texts/72', '#/tables/0'] ───────────────────────────────────────────────────────────────╮\n",
       "│ Docling Technical Report                                                                                                                                                                             │\n",
       "│ 4 Performance                                                                                                                                                                                        │\n",
       "│ Table 1: Runtime characteristics of Docling with the standard model pipeline and settings, on our test dataset of 225 pages, on two different systems. OCR is disabled. We show the time-to-solution │\n",
       "│ (TTS), computed throughput in pages per second, and the peak memory used (resident set size) for both the Docling-native PDF backend and for the pypdfium backend, using 4 and 16 threads.           │\n",
       "│                                                                                                                                                                                                      │\n",
       "│ Apple M3 Max, Thread budget. = 4. Apple M3 Max, native backend.TTS = 177 s 167 s. Apple M3 Max, native backend.Pages/s = 1.27 1.34. Apple M3 Max, native backend.Mem = 6.20 GB. Apple M3 Max,        │\n",
       "│ pypdfium backend.TTS = 103 s 92 s. Apple M3 Max, pypdfium backend.Pages/s = 2.18 2.45. Apple M3 Max, pypdfium backend.Mem = 2.56 GB. (16 cores) Intel(R) Xeon E5-2690, Thread budget. = 16 4 16. (16 │\n",
       "│ cores) Intel(R) Xeon E5-2690, native backend.TTS = 375 s 244 s. (16 cores) Intel(R) Xeon E5-2690, native backend.Pages/s = 0.60 0.92. (16 cores) Intel(R) Xeon E5-2690, native backend.Mem = 6.16    │\n",
       "│ GB. (16 cores) Intel(R) Xeon E5-2690, pypdfium backend.TTS = 239 s 143 s. (16 cores) Intel(R) Xeon E5-2690, pypdfium backend.Pages/s = 0.94 1.57. (16 cores) Intel(R) Xeon E5-2690, pypdfium         │\n",
       "│ backend.Mem = 2.42 GB                                                                                                                                                                                │\n",
       "╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "chunker = HybridChunker(tokenizer=tokenizer)\n",
    "\n",
    "chunk_iter = chunker.chunk(dl_doc=doc)\n",
    "\n",
    "chunks = list(chunk_iter)\n",
    "i, chunk = find_n_th_chunk_with_label(chunks, n=0, label=DocItemLabel.TABLE)\n",
    "print_chunk(\n",
    "    chunks=chunks,\n",
    "    chunk_pos=i,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-info\">\n",
    "    <strong>INFO</strong>: As shown above, using the <code>HybridChunker</code> can sometimes trigger a warning from the transformers library. This is a false alarm; for details, see the <a href=\"https://docling-project.github.io/docling/faq/#hybridchunker-triggers-warning-token-indices-sequence-length-is-longer-than-the-specified-maximum-sequence-length-for-this-model\">FAQ entry</a>.\n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Configuring a different strategy"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can configure a different serialization strategy. In the example below, we specify a different table serializer that serializes tables to Markdown instead of the triplet notation used by default:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">╭────────────────────────────────────────────────────────────── chunk_pos=13 num_tokens=431 doc_items_refs=['#/texts/72', '#/tables/0'] ───────────────────────────────────────────────────────────────╮\n",
       "│ Docling Technical Report                                                                                                                                                                             │\n",
       "│ 4 Performance                                                                                                                                                                                        │\n",
       "│ Table 1: Runtime characteristics of Docling with the standard model pipeline and settings, on our test dataset of 225 pages, on two different systems. OCR is disabled. We show the time-to-solution │\n",
       "│ (TTS), computed throughput in pages per second, and the peak memory used (resident set size) for both the Docling-native PDF backend and for the pypdfium backend, using 4 and 16 threads.           │\n",
       "│                                                                                                                                                                                                      │\n",
       "│ | CPU                              | Thread budget   | native backend   | native backend   | native backend   | pypdfium backend   | pypdfium backend   | pypdfium backend   |                       │\n",
       "│ |----------------------------------|-----------------|------------------|------------------|------------------|--------------------|--------------------|--------------------|                       │\n",
       "│ |                                  |                 | TTS              | Pages/s          | Mem              | TTS                | Pages/s            | Mem                |                       │\n",
       "│ | Apple M3 Max                     | 4               | 177 s 167 s      | 1.27 1.34        | 6.20 GB          | 103 s 92 s         | 2.18 2.45          | 2.56 GB            |                       │\n",
       "│ | (16 cores) Intel(R) Xeon E5-2690 | 16 4 16         | 375 s 244 s      | 0.60 0.92        | 6.16 GB          | 239 s 143 s        | 0.94 1.57          | 2.42 GB            |                       │\n",
       "╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯\n",
       "</pre>\n"
      ],
      "text/plain": [
       "╭────────────────────────────────────────────────────────────── chunk_pos=13 num_tokens=431 doc_items_refs=['#/texts/72', '#/tables/0'] ───────────────────────────────────────────────────────────────╮\n",
       "│ Docling Technical Report                                                                                                                                                                             │\n",
       "│ 4 Performance                                                                                                                                                                                        │\n",
       "│ Table 1: Runtime characteristics of Docling with the standard model pipeline and settings, on our test dataset of 225 pages, on two different systems. OCR is disabled. We show the time-to-solution │\n",
       "│ (TTS), computed throughput in pages per second, and the peak memory used (resident set size) for both the Docling-native PDF backend and for the pypdfium backend, using 4 and 16 threads.           │\n",
       "│                                                                                                                                                                                                      │\n",
       "│ | CPU                              | Thread budget   | native backend   | native backend   | native backend   | pypdfium backend   | pypdfium backend   | pypdfium backend   |                       │\n",
       "│ |----------------------------------|-----------------|------------------|------------------|------------------|--------------------|--------------------|--------------------|                       │\n",
       "│ |                                  |                 | TTS              | Pages/s          | Mem              | TTS                | Pages/s            | Mem                |                       │\n",
       "│ | Apple M3 Max                     | 4               | 177 s 167 s      | 1.27 1.34        | 6.20 GB          | 103 s 92 s         | 2.18 2.45          | 2.56 GB            |                       │\n",
       "│ | (16 cores) Intel(R) Xeon E5-2690 | 16 4 16         | 375 s 244 s      | 0.60 0.92        | 6.16 GB          | 239 s 143 s        | 0.94 1.57          | 2.42 GB            |                       │\n",
       "╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from docling_core.transforms.chunker.hierarchical_chunker import (\n",
    "    ChunkingDocSerializer,\n",
    "    ChunkingSerializerProvider,\n",
    ")\n",
    "from docling_core.transforms.serializer.markdown import MarkdownTableSerializer\n",
    "\n",
    "\n",
    "class MDTableSerializerProvider(ChunkingSerializerProvider):\n",
    "    def get_serializer(self, doc):\n",
    "        return ChunkingDocSerializer(\n",
    "            doc=doc,\n",
    "            table_serializer=MarkdownTableSerializer(),  # configuring a different table serializer\n",
    "        )\n",
    "\n",
    "\n",
    "chunker = HybridChunker(\n",
    "    tokenizer=tokenizer,\n",
    "    serializer_provider=MDTableSerializerProvider(),\n",
    ")\n",
    "\n",
    "chunk_iter = chunker.chunk(dl_doc=doc)\n",
    "\n",
    "chunks = list(chunk_iter)\n",
    "i, chunk = find_n_th_chunk_with_label(chunks, n=0, label=DocItemLabel.TABLE)\n",
    "print_chunk(\n",
    "    chunks=chunks,\n",
    "    chunk_pos=i,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Picture serialization"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Using the default strategy"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Below we inspect the first chunk containing a picture.\n",
    "\n",
    "Even when using the default strategy, we can modify the relevant parameters, e.g. which placeholder is used for pictures:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">╭───────────────────────────────────────────────── chunk_pos=0 num_tokens=117 doc_items_refs=['#/pictures/0', '#/texts/2', '#/texts/3', '#/texts/4'] ──────────────────────────────────────────────────╮\n",
       "│ Docling Technical Report                                                                                                                                                                             │\n",
       "│ &lt;!-- image --&gt;                                                                                                                                                                                       │\n",
       "│ Version 1.0                                                                                                                                                                                          │\n",
       "│ Christoph Auer Maksym Lysak Ahmed Nassar Michele Dolfi Nikolaos Livathinos Panos Vagenas Cesar Berrospi Ramis Matteo Omenetti Fabian Lindlbauer Kasper Dinkla Lokesh Mishra Yusik Kim Shubham Gupta  │\n",
       "│ Rafael Teixeira de Lima Valery Weber Lucas Morin Ingmar Meijer Viktor Kuropiatnyk Peter W. J. Staar                                                                                                  │\n",
       "│ AI4K Group, IBM Research R¨ uschlikon, Switzerland                                                                                                                                                   │\n",
       "╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯\n",
       "</pre>\n"
      ],
      "text/plain": [
       "╭───────────────────────────────────────────────── chunk_pos=0 num_tokens=117 doc_items_refs=['#/pictures/0', '#/texts/2', '#/texts/3', '#/texts/4'] ──────────────────────────────────────────────────╮\n",
       "│ Docling Technical Report                                                                                                                                                                             │\n",
       "│ <!-- image -->                                                                                                                                                                                       │\n",
       "│ Version 1.0                                                                                                                                                                                          │\n",
       "│ Christoph Auer Maksym Lysak Ahmed Nassar Michele Dolfi Nikolaos Livathinos Panos Vagenas Cesar Berrospi Ramis Matteo Omenetti Fabian Lindlbauer Kasper Dinkla Lokesh Mishra Yusik Kim Shubham Gupta  │\n",
       "│ Rafael Teixeira de Lima Valery Weber Lucas Morin Ingmar Meijer Viktor Kuropiatnyk Peter W. J. Staar                                                                                                  │\n",
       "│ AI4K Group, IBM Research R¨ uschlikon, Switzerland                                                                                                                                                   │\n",
       "╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from docling_core.transforms.serializer.markdown import MarkdownParams\n",
    "\n",
    "\n",
    "class ImgPlaceholderSerializerProvider(ChunkingSerializerProvider):\n",
    "    def get_serializer(self, doc):\n",
    "        return ChunkingDocSerializer(\n",
    "            doc=doc,\n",
    "            params=MarkdownParams(\n",
    "                image_placeholder=\"<!-- image -->\",\n",
    "            ),\n",
    "        )\n",
    "\n",
    "\n",
    "chunker = HybridChunker(\n",
    "    tokenizer=tokenizer,\n",
    "    serializer_provider=ImgPlaceholderSerializerProvider(),\n",
    ")\n",
    "\n",
    "chunk_iter = chunker.chunk(dl_doc=doc)\n",
    "\n",
    "chunks = list(chunk_iter)\n",
    "i, chunk = find_n_th_chunk_with_label(chunks, n=0, label=DocItemLabel.PICTURE)\n",
    "print_chunk(\n",
    "    chunks=chunks,\n",
    "    chunk_pos=i,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Using a custom strategy"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Below we define and use a custom picture serialization strategy that leverages picture annotations:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "from typing import Any\n",
    "\n",
    "from docling_core.transforms.serializer.base import (\n",
    "    BaseDocSerializer,\n",
    "    SerializationResult,\n",
    ")\n",
    "from docling_core.transforms.serializer.common import create_ser_result\n",
    "from docling_core.transforms.serializer.markdown import MarkdownPictureSerializer\n",
    "from docling_core.types.doc.document import (\n",
    "    PictureClassificationData,\n",
    "    PictureDescriptionData,\n",
    "    PictureItem,\n",
    "    PictureMoleculeData,\n",
    ")\n",
    "from typing_extensions import override\n",
    "\n",
    "\n",
    "class AnnotationPictureSerializer(MarkdownPictureSerializer):\n",
    "    @override\n",
    "    def serialize(\n",
    "        self,\n",
    "        *,\n",
    "        item: PictureItem,\n",
    "        doc_serializer: BaseDocSerializer,\n",
    "        doc: DoclingDocument,\n",
    "        **kwargs: Any,\n",
    "    ) -> SerializationResult:\n",
    "        text_parts: list[str] = []\n",
    "        for annotation in item.annotations:\n",
    "            if isinstance(annotation, PictureClassificationData):\n",
    "                predicted_class = (\n",
    "                    annotation.predicted_classes[0].class_name\n",
    "                    if annotation.predicted_classes\n",
    "                    else None\n",
    "                )\n",
    "                if predicted_class is not None:\n",
    "                    text_parts.append(f\"Picture type: {predicted_class}\")\n",
    "            elif isinstance(annotation, PictureMoleculeData):\n",
    "                text_parts.append(f\"SMILES: {annotation.smi}\")\n",
    "            elif isinstance(annotation, PictureDescriptionData):\n",
    "                text_parts.append(f\"Picture description: {annotation.text}\")\n",
    "\n",
    "        text_res = \"\\n\".join(text_parts)\n",
    "        text_res = doc_serializer.post_process(text=text_res)\n",
    "        return create_ser_result(text=text_res, span_source=item)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">╭───────────────────────────────────────────────── chunk_pos=0 num_tokens=128 doc_items_refs=['#/pictures/0', '#/texts/2', '#/texts/3', '#/texts/4'] ──────────────────────────────────────────────────╮\n",
       "│ Docling Technical Report                                                                                                                                                                             │\n",
       "│ Picture description: In this image we can see a cartoon image of a duck holding a paper.                                                                                                             │\n",
       "│ Version 1.0                                                                                                                                                                                          │\n",
       "│ Christoph Auer Maksym Lysak Ahmed Nassar Michele Dolfi Nikolaos Livathinos Panos Vagenas Cesar Berrospi Ramis Matteo Omenetti Fabian Lindlbauer Kasper Dinkla Lokesh Mishra Yusik Kim Shubham Gupta  │\n",
       "│ Rafael Teixeira de Lima Valery Weber Lucas Morin Ingmar Meijer Viktor Kuropiatnyk Peter W. J. Staar                                                                                                  │\n",
       "│ AI4K Group, IBM Research R¨ uschlikon, Switzerland                                                                                                                                                   │\n",
       "╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯\n",
       "</pre>\n"
      ],
      "text/plain": [
       "╭───────────────────────────────────────────────── chunk_pos=0 num_tokens=128 doc_items_refs=['#/pictures/0', '#/texts/2', '#/texts/3', '#/texts/4'] ──────────────────────────────────────────────────╮\n",
       "│ Docling Technical Report                                                                                                                                                                             │\n",
       "│ Picture description: In this image we can see a cartoon image of a duck holding a paper.                                                                                                             │\n",
       "│ Version 1.0                                                                                                                                                                                          │\n",
       "│ Christoph Auer Maksym Lysak Ahmed Nassar Michele Dolfi Nikolaos Livathinos Panos Vagenas Cesar Berrospi Ramis Matteo Omenetti Fabian Lindlbauer Kasper Dinkla Lokesh Mishra Yusik Kim Shubham Gupta  │\n",
       "│ Rafael Teixeira de Lima Valery Weber Lucas Morin Ingmar Meijer Viktor Kuropiatnyk Peter W. J. Staar                                                                                                  │\n",
       "│ AI4K Group, IBM Research R¨ uschlikon, Switzerland                                                                                                                                                   │\n",
       "╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "class ImgAnnotationSerializerProvider(ChunkingSerializerProvider):\n",
    "    def get_serializer(self, doc: DoclingDocument):\n",
    "        return ChunkingDocSerializer(\n",
    "            doc=doc,\n",
    "            picture_serializer=AnnotationPictureSerializer(),  # configuring a different picture serializer\n",
    "        )\n",
    "\n",
    "\n",
    "chunker = HybridChunker(\n",
    "    tokenizer=tokenizer,\n",
    "    serializer_provider=ImgAnnotationSerializerProvider(),\n",
    ")\n",
    "\n",
    "chunk_iter = chunker.chunk(dl_doc=doc)\n",
    "\n",
    "chunks = list(chunk_iter)\n",
    "i, chunk = find_n_th_chunk_with_label(chunks, n=0, label=DocItemLabel.PICTURE)\n",
    "print_chunk(\n",
    "    chunks=chunks,\n",
    "    chunk_pos=i,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
 }
--- a/docs/examples/data/2408.09869v3_enriched.json
+++ b/docs/examples/data/2408.09869v3_enriched.json
--- a/docs/examples/hybrid_chunking.ipynb
+++ b/docs/examples/hybrid_chunking.ipynb
@ -410,23 +410,6 @@
    "\n",
    "    print()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Configuring serialization"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can additionally customize the serialization strategy via a user-provided\n",
    "[serializer provider](../../concepts/serialization).\n",
    "\n",
    "For usage examples check out [this notebook](https://github.com/docling-project/docling-core/blob/main/examples/chunking_and_serialization.ipynb)."
   ]
  }
 ],
 "metadata": {
--- a/docs/examples/minimal.py
+++ b/docs/examples/minimal.py
@ -1,7 +1,9 @@
 from docling.document_converter import DocumentConverter
 source = "https://arxiv.org/pdf/2408.09869"  # document per local path or URL
 converter = DocumentConverter()
-result = converter.convert(source)
+doc = converter.convert(source).document
-print(result.document.export_to_markdown())
+
 print(doc.export_to_markdown())
 # output: ## Docling Technical Report [...]
--- a/docs/usage/supported_formats.md
+++ b/docs/usage/supported_formats.md
@ -14,7 +14,7 @@ Below you can find a listing of all supported input and output formats.
 | AsciiDoc | |
 | HTML, XHTML | |
 | CSV | |
-| PNG, JPEG, TIFF, BMP | Image formats |
+| PNG, JPEG, TIFF, BMP, WEBP | Image formats |
 Schema-specific support:
--- a/mkdocs.yml
+++ b/mkdocs.yml
@ -88,10 +88,10 @@ nav:
      - "Simple translation": examples/translate.py
      - examples/backend_csv.ipynb
      - examples/backend_xml_rag.ipynb
-    - 📤 Serialization:
+    - ✂️ Serialization & chunking:
      - examples/serialization.ipynb
    - ✂️ Chunking:
      - examples/hybrid_chunking.ipynb
      - examples/advanced_chunking_and_serialization.ipynb
    - 🤖 RAG with AI dev frameworks:
      - examples/rag_haystack.ipynb
      - examples/rag_langchain.ipynb
--- a/poetry.lock
+++ b/poetry.lock
@ -2434,8 +2434,11 @@ files = [
    {file = "lxml-5.4.0-cp36-cp36m-win_amd64.whl", hash = "sha256:7ce1a171ec325192c6a636b64c94418e71a1964f56d002cc28122fceff0b6121"},
    {file = "lxml-5.4.0-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:795f61bcaf8770e1b37eec24edf9771b307df3af74d1d6f27d812e15a9ff3872"},
    {file = "lxml-5.4.0-cp37-cp37m-manylinux_2_12_i686.manylinux2010_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:29f451a4b614a7b5b6c2e043d7b64a15bd8304d7e767055e8ab68387a8cacf4e"},
    {file = "lxml-5.4.0-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:891f7f991a68d20c75cb13c5c9142b2a3f9eb161f1f12a9489c82172d1f133c0"},
    {file = "lxml-5.4.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:4aa412a82e460571fad592d0f93ce9935a20090029ba08eca05c614f99b0cc92"},
    {file = "lxml-5.4.0-cp37-cp37m-manylinux_2_28_aarch64.whl", hash = "sha256:ac7ba71f9561cd7d7b55e1ea5511543c0282e2b6450f122672a2694621d63b7e"},
    {file = "lxml-5.4.0-cp37-cp37m-manylinux_2_28_x86_64.whl", hash = "sha256:c5d32f5284012deaccd37da1e2cd42f081feaa76981f0eaa474351b68df813c5"},
    {file = "lxml-5.4.0-cp37-cp37m-musllinux_1_2_aarch64.whl", hash = "sha256:ce31158630a6ac85bddd6b830cffd46085ff90498b397bd0a259f59d27a12188"},
    {file = "lxml-5.4.0-cp37-cp37m-musllinux_1_2_x86_64.whl", hash = "sha256:31e63621e073e04697c1b2d23fcb89991790eef370ec37ce4d5d469f40924ed6"},
    {file = "lxml-5.4.0-cp37-cp37m-win32.whl", hash = "sha256:be2ba4c3c5b7900246a8f866580700ef0d538f2ca32535e991027bdaba944063"},
    {file = "lxml-5.4.0-cp37-cp37m-win_amd64.whl", hash = "sha256:09846782b1ef650b321484ad429217f5154da4d6e786636c38e434fa32e94e49"},
@ -3925,15 +3928,19 @@ files = [
    {file = "onnxruntime-1.22.0-cp310-cp310-macosx_13_0_universal2.whl", hash = "sha256:85d8826cc8054e4d6bf07f779dc742a363c39094015bdad6a08b3c18cfe0ba8c"},
    {file = "onnxruntime-1.22.0-cp310-cp310-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:468c9502a12f6f49ec335c2febd22fdceecc1e4cc96dfc27e419ba237dff5aff"},
    {file = "onnxruntime-1.22.0-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:681fe356d853630a898ee05f01ddb95728c9a168c9460e8361d0a240c9b7cb97"},
    {file = "onnxruntime-1.22.0-cp310-cp310-win_amd64.whl", hash = "sha256:20bca6495d06925631e201f2b257cc37086752e8fe7b6c83a67c6509f4759bc9"},
    {file = "onnxruntime-1.22.0-cp311-cp311-macosx_13_0_universal2.whl", hash = "sha256:8d6725c5b9a681d8fe72f2960c191a96c256367887d076b08466f52b4e0991df"},
    {file = "onnxruntime-1.22.0-cp311-cp311-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:fef17d665a917866d1f68f09edc98223b9a27e6cb167dec69da4c66484ad12fd"},
    {file = "onnxruntime-1.22.0-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:b978aa63a9a22095479c38371a9b359d4c15173cbb164eaad5f2cd27d666aa65"},
    {file = "onnxruntime-1.22.0-cp311-cp311-win_amd64.whl", hash = "sha256:03d3ef7fb11adf154149d6e767e21057e0e577b947dd3f66190b212528e1db31"},
    {file = "onnxruntime-1.22.0-cp312-cp312-macosx_13_0_universal2.whl", hash = "sha256:f3c0380f53c1e72a41b3f4d6af2ccc01df2c17844072233442c3a7e74851ab97"},
    {file = "onnxruntime-1.22.0-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:c8601128eaef79b636152aea76ae6981b7c9fc81a618f584c15d78d42b310f1c"},
    {file = "onnxruntime-1.22.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:6964a975731afc19dc3418fad8d4e08c48920144ff590149429a5ebe0d15fb3c"},
    {file = "onnxruntime-1.22.0-cp312-cp312-win_amd64.whl", hash = "sha256:c0d534a43d1264d1273c2d4f00a5a588fa98d21117a3345b7104fa0bbcaadb9a"},
    {file = "onnxruntime-1.22.0-cp313-cp313-macosx_13_0_universal2.whl", hash = "sha256:fe7c051236aae16d8e2e9ffbfc1e115a0cc2450e873a9c4cb75c0cc96c1dae07"},
    {file = "onnxruntime-1.22.0-cp313-cp313-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:6a6bbed10bc5e770c04d422893d3045b81acbbadc9fb759a2cd1ca00993da919"},
    {file = "onnxruntime-1.22.0-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:9fe45ee3e756300fccfd8d61b91129a121d3d80e9d38e01f03ff1295badc32b8"},
    {file = "onnxruntime-1.22.0-cp313-cp313-win_amd64.whl", hash = "sha256:5a31d84ef82b4b05d794a4ce8ba37b0d9deb768fd580e36e17b39e0b4840253b"},
    {file = "onnxruntime-1.22.0-cp313-cp313t-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:0a2ac5bd9205d831541db4e508e586e764a74f14efdd3f89af7fd20e1bf4a1ed"},
    {file = "onnxruntime-1.22.0-cp313-cp313t-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:64845709f9e8a2809e8e009bc4c8f73b788cee9c6619b7d9930344eae4c9cd36"},
 ]
@ -8040,4 +8047,4 @@ vlm = ["accelerate", "transformers", "transformers"]
 [metadata]
 lock-version = "2.0"
 python-versions = "^3.9"
-content-hash = "123ae9256733cfb7eb3e9092c6be578dc1c535993ac2f7529832e43314fe32a4"
+content-hash = "9c243f65c2244898b695cfbd24de9848ed8207d8dba7f8f46e1c2d75a2247248"
--- a/pyproject.toml
+++ b/pyproject.toml
@ -1,6 +1,6 @@
 [tool.poetry]
 name = "docling"
-version = "2.31.1"  # DO NOT EDIT, updated automatically
+version = "2.34.0"  # DO NOT EDIT, updated automatically
 description = "SDK and CLI for parsing PDF, DOCX, HTML, and more, to a unified document representation for powering downstream workflows such as gen AI applications."
 authors = [
  "Christoph Auer <cau@zurich.ibm.com>",
@ -46,7 +46,7 @@ packages = [{ include = "docling" }]
 ######################
 python = "^3.9"
 pydantic = "^2.0.0"
-docling-core = {version = "^2.26.0", extras = ["chunking"]}
+docling-core = {version = "^2.29.0", extras = ["chunking"]}
 docling-ibm-models = "^3.4.0"
 docling-parse = "^4.0.0"
 filetype = "^1.2.0"
--- a/tests/data/docx/textbox.docx
+++ b/tests/data/docx/textbox.docx
--- a/tests/data/groundtruth/docling_v1/2305.03393v1-pg9.json
+++ b/tests/data/groundtruth/docling_v1/2305.03393v1-pg9.json
@ -213,9 +213,9 @@
      "prov": [
        {
          "bbox": [
-            139.6674041748047,
+            139.66746520996094,
            322.5054626464844,
-            475.00927734375,
+            475.0093078613281,
            454.4546203613281
          ],
          "page": 1,
--- a/tests/data/groundtruth/docling_v1/2305.03393v1-pg9.pages.json
+++ b/tests/data/groundtruth/docling_v1/2305.03393v1-pg9.pages.json
@ -2646,7 +2646,7 @@
              "b": 102.78223000000003,
              "coord_origin": "TOPLEFT"
            },
-            "confidence": 0.9373533129692078,
+            "confidence": 0.9373531937599182,
            "cells": [
              {
                "index": 0,
@ -2686,7 +2686,7 @@
              "b": 102.78223000000003,
              "coord_origin": "TOPLEFT"
            },
-            "confidence": 0.8858679533004761,
+            "confidence": 0.8858677744865417,
            "cells": [
              {
                "index": 1,
@ -2881,7 +2881,7 @@
              "b": 255.42400999999995,
              "coord_origin": "TOPLEFT"
            },
-            "confidence": 0.9850425124168396,
+            "confidence": 0.98504239320755,
            "cells": [
              {
                "index": 7,
@ -3096,7 +3096,7 @@
              "b": 327.98218,
              "coord_origin": "TOPLEFT"
            },
-            "confidence": 0.9591907262802124,
+            "confidence": 0.9591910243034363,
            "cells": [
              {
                "index": 15,
@ -3280,9 +3280,9 @@
            "id": 0,
            "label": "table",
            "bbox": {
-              "l": 139.6674041748047,
+              "l": 139.66746520996094,
              "t": 337.5453796386719,
-              "r": 475.00927734375,
+              "r": 475.0093078613281,
              "b": 469.4945373535156,
              "coord_origin": "TOPLEFT"
            },
@ -7852,7 +7852,7 @@
              "b": 618.3,
              "coord_origin": "TOPLEFT"
            },
-            "confidence": 0.9849976301193237,
+            "confidence": 0.9849975109100342,
            "cells": [
              {
                "index": 93,
@ -8184,9 +8184,9 @@
              "id": 0,
              "label": "table",
              "bbox": {
-                "l": 139.6674041748047,
+                "l": 139.66746520996094,
                "t": 337.5453796386719,
-                "r": 475.00927734375,
+                "r": 475.0093078613281,
                "b": 469.4945373535156,
                "coord_origin": "TOPLEFT"
              },
@ -13582,7 +13582,7 @@
              "b": 102.78223000000003,
              "coord_origin": "TOPLEFT"
            },
-            "confidence": 0.9373533129692078,
+            "confidence": 0.9373531937599182,
            "cells": [
              {
                "index": 0,
@ -13628,7 +13628,7 @@
              "b": 102.78223000000003,
              "coord_origin": "TOPLEFT"
            },
-            "confidence": 0.8858679533004761,
+            "confidence": 0.8858677744865417,
            "cells": [
              {
                "index": 1,
@ -13841,7 +13841,7 @@
              "b": 255.42400999999995,
              "coord_origin": "TOPLEFT"
            },
-            "confidence": 0.9850425124168396,
+            "confidence": 0.98504239320755,
            "cells": [
              {
                "index": 7,
@ -14062,7 +14062,7 @@
              "b": 327.98218,
              "coord_origin": "TOPLEFT"
            },
-            "confidence": 0.9591907262802124,
+            "confidence": 0.9591910243034363,
            "cells": [
              {
                "index": 15,
@ -14252,9 +14252,9 @@
            "id": 0,
            "label": "table",
            "bbox": {
-              "l": 139.6674041748047,
+              "l": 139.66746520996094,
              "t": 337.5453796386719,
-              "r": 475.00927734375,
+              "r": 475.0093078613281,
              "b": 469.4945373535156,
              "coord_origin": "TOPLEFT"
            },
@ -19713,7 +19713,7 @@
              "b": 618.3,
              "coord_origin": "TOPLEFT"
            },
-            "confidence": 0.9849976301193237,
+            "confidence": 0.9849975109100342,
            "cells": [
              {
                "index": 93,
@ -20224,7 +20224,7 @@
              "b": 255.42400999999995,
              "coord_origin": "TOPLEFT"
            },
-            "confidence": 0.9850425124168396,
+            "confidence": 0.98504239320755,
            "cells": [
              {
                "index": 7,
@ -20445,7 +20445,7 @@
              "b": 327.98218,
              "coord_origin": "TOPLEFT"
            },
-            "confidence": 0.9591907262802124,
+            "confidence": 0.9591910243034363,
            "cells": [
              {
                "index": 15,
@ -20635,9 +20635,9 @@
            "id": 0,
            "label": "table",
            "bbox": {
-              "l": 139.6674041748047,
+              "l": 139.66746520996094,
              "t": 337.5453796386719,
-              "r": 475.00927734375,
+              "r": 475.0093078613281,
              "b": 469.4945373535156,
              "coord_origin": "TOPLEFT"
            },
@ -26096,7 +26096,7 @@
              "b": 618.3,
              "coord_origin": "TOPLEFT"
            },
-            "confidence": 0.9849976301193237,
+            "confidence": 0.9849975109100342,
            "cells": [
              {
                "index": 93,
@ -26440,7 +26440,7 @@
              "b": 102.78223000000003,
              "coord_origin": "TOPLEFT"
            },
-            "confidence": 0.9373533129692078,
+            "confidence": 0.9373531937599182,
            "cells": [
              {
                "index": 0,
@ -26486,7 +26486,7 @@
              "b": 102.78223000000003,
              "coord_origin": "TOPLEFT"
            },
-            "confidence": 0.8858679533004761,
+            "confidence": 0.8858677744865417,
            "cells": [
              {
                "index": 1,
--- a/tests/data/groundtruth/docling_v1/multi_page.doctags.txt
+++ b/tests/data/groundtruth/docling_v1/multi_page.doctags.txt
@ -0,0 +1,55 @@
 <document>
 <subtitle-level-1><location><page_1><loc_12><loc_90><loc_44><loc_91></location>The Evolution of the Word Processor</subtitle-level-1>
 <paragraph><location><page_1><loc_12><loc_85><loc_84><loc_88></location>The concept of the word processor predates modern computers and has evolved through several technological milestones.</paragraph>
 <subtitle-level-1><location><page_1><loc_12><loc_81><loc_55><loc_83></location>Pre-Digital Era (19th - Early 20th Century)</subtitle-level-1>
 <paragraph><location><page_1><loc_12><loc_73><loc_85><loc_80></location>The origins of word processing can be traced back to the invention of the typewriter in the mid-19th century. Patented in 1868 by Christopher Latham Sholes, the typewriter revolutionized written communication by enabling people to produce legible, professional documents more efficiently than handwriting.</paragraph>
 <paragraph><location><page_1><loc_12><loc_65><loc_85><loc_71></location>During this period, the term "word processing" didn't exist, but the typewriter laid the groundwork for future developments. Over time, advancements such as carbon paper (for copies) and the electric typewriter (introduced by IBM in 1935) improved the speed and convenience of document creation.</paragraph>
 <subtitle-level-1><location><page_1><loc_12><loc_58><loc_57><loc_60></location>The Birth of Word Processing (1960s - 1970s)</subtitle-level-1>
 <paragraph><location><page_1><loc_12><loc_52><loc_88><loc_56></location>The term "word processor" first emerged in the 1960s and referred to any system designed to streamline written communication and document production. Early word processors were not software programs but rather standalone machines.</paragraph>
 <paragraph><location><page_1><loc_15><loc_43><loc_87><loc_50></location>- · IBM MT/ST (Magnetic Tape/Selectric Typewriter) : Introduced in 1964, this machine combined IBM's Selectric typewriter with magnetic tape storage. It allowed users to record, edit, and replay typed content-an early example of digital text storage.</paragraph>
 <paragraph><location><page_1><loc_15><loc_38><loc_84><loc_43></location>- · Wang Laboratories : In the 1970s, Wang introduced dedicated word processing machines. These devices, like the Wang 1200, featured small screens and floppy disks, making them revolutionary for their time.</paragraph>
 <paragraph><location><page_1><loc_12><loc_33><loc_86><loc_37></location>These machines were primarily used in offices, where secretarial pools benefited from their ability to make revisions without retyping entire documents.</paragraph>
 <subtitle-level-1><location><page_1><loc_12><loc_27><loc_52><loc_28></location>The Rise of Personal Computers (1980s)</subtitle-level-1>
 <paragraph><location><page_1><loc_12><loc_22><loc_87><loc_25></location>The advent of personal computers in the late 1970s and early 1980s transformed word processing from a niche tool to an essential technology for businesses and individuals alike.</paragraph>
 <paragraph><location><page_1><loc_15><loc_15><loc_88><loc_20></location>- · WordStar (1978) : Developed for the CP/M operating system, WordStar was one of the first widely used word processing programs. It featured early examples of modern features like cut, copy, and paste.</paragraph>
 <paragraph><location><page_1><loc_15><loc_10><loc_88><loc_15></location>- · Microsoft Word (1983) : Microsoft launched Word for MS-DOS in 1983, introducing a graphical user interface (GUI) and mouse support. Over the years, Microsoft Word became the industry standard for word processing.</paragraph>
 <paragraph><location><page_2><loc_12><loc_87><loc_87><loc_91></location>Other notable software from this era included WordPerfect, which was popular among legal professionals, and Apple's MacWrite, which leveraged the Macintosh's graphical capabilities.</paragraph>
 <subtitle-level-1><location><page_2><loc_12><loc_80><loc_46><loc_81></location>The Modern Era (1990s - Present)</subtitle-level-1>
 <paragraph><location><page_2><loc_12><loc_75><loc_86><loc_78></location>By the 1990s, word processing software had become more sophisticated, with features like spell check, grammar check, templates, and collaborative tools.</paragraph>
 <paragraph><location><page_2><loc_15><loc_70><loc_83><loc_73></location>- · Microsoft Office Suite : Microsoft continued to dominate with its Office Suite, integrating Word with other productivity tools like Excel and PowerPoint.</paragraph>
 <paragraph><location><page_2><loc_15><loc_67><loc_87><loc_70></location>- · OpenOffice and LibreOffice : Open-source alternatives emerged in the early 2000s, offering free and flexible word processing options.</paragraph>
 <paragraph><location><page_2><loc_15><loc_62><loc_88><loc_67></location>- · Google Docs (2006) : The introduction of cloud-based word processing revolutionized collaboration. Google Docs enabled real-time editing and sharing, making it a staple for teams and remote work.</paragraph>
 <subtitle-level-1><location><page_2><loc_12><loc_55><loc_39><loc_57></location>Future of Word Processing</subtitle-level-1>
 <paragraph><location><page_2><loc_12><loc_45><loc_87><loc_53></location>Today, word processors are more than just tools for typing. They integrate artificial intelligence for grammar and style suggestions (e.g., Grammarly), voice-to-text features, and advanced layout options. As AI continues to advance, word processors may evolve into even more intuitive tools that predict user needs, automate repetitive tasks, and support richer multimedia integration.</paragraph>
 <paragraph><location><page_2><loc_12><loc_35><loc_87><loc_40></location>From the clunky typewriters of the 19th century to the AI-powered cloud tools of today, the word processor has come a long way. It remains an essential tool for communication and creativity, shaping how we write and share ideas.</paragraph>
 <subtitle-level-1><location><page_3><loc_12><loc_90><loc_46><loc_91></location>Specialized Word Processing Tools</subtitle-level-1>
 <paragraph><location><page_3><loc_12><loc_83><loc_86><loc_88></location>In addition to general-purpose word processors, specialized tools have emerged to cater to specific industries and needs. These tools incorporate unique features tailored to their users' workflows:</paragraph>
 <paragraph><location><page_3><loc_15><loc_73><loc_87><loc_81></location>- · Academic and Technical Writing : Tools like LaTeX gained popularity among academics, scientists, and engineers. Unlike traditional word processors, LaTeX focuses on precise formatting, particularly for complex mathematical equations, scientific papers, and technical documents. It relies on a markup language to produce polished documents suitable for publishing.</paragraph>
 <paragraph><location><page_3><loc_15><loc_67><loc_85><loc_73></location>- · Screenwriting Software : For screenwriters, tools like Final Draft and Celtx are specialized to handle scripts for film and television. These programs automate the formatting of dialogue, scene descriptions, and other elements unique to screenwriting.</paragraph>
 <paragraph><location><page_3><loc_15><loc_60><loc_88><loc_67></location>- · Legal Document Processors : Word processors tailored for legal professionals, like WordPerfect, offered features such as redlining (early version tracking) and document comparison. Even today, many law firms rely on these tools due to their robust formatting options for contracts and legal briefs.</paragraph>
 <subtitle-level-1><location><page_3><loc_12><loc_53><loc_57><loc_55></location>Key Features That Changed Word Processing</subtitle-level-1>
 <paragraph><location><page_3><loc_12><loc_47><loc_86><loc_52></location>The evolution of word processors wasn't just about hardware or software improvements-it was about the features that revolutionized how people wrote and edited. Some of these transformative features include:</paragraph>
 <paragraph><location><page_3><loc_15><loc_42><loc_86><loc_45></location>- 1. Undo/Redo : Introduced in the 1980s, the ability to undo mistakes and redo actions made experimentation and error correction much easier.</paragraph>
 <paragraph><location><page_3><loc_15><loc_38><loc_87><loc_42></location>- 2. Spell Check and Grammar Check : By the 1990s, these became standard, allowing users to spot errors automatically.</paragraph>
 <paragraph><location><page_3><loc_15><loc_35><loc_82><loc_38></location>- 3. Templates : Pre-designed formats for documents, such as resumes, letters, and invoices, helped users save time.</paragraph>
 <paragraph><location><page_3><loc_15><loc_32><loc_84><loc_35></location>- 4. Track Changes : A game-changer for collaboration, this feature allowed multiple users to suggest edits while maintaining the original text.</paragraph>
 <paragraph><location><page_3><loc_15><loc_27><loc_88><loc_32></location>- 5. Real-Time Collaboration : Tools like Google Docs and Microsoft 365 enabled multiple users to edit the same document simultaneously, forever changing teamwork dynamics.</paragraph>
 <subtitle-level-1><location><page_3><loc_12><loc_20><loc_52><loc_22></location>The Cultural Impact of Word Processors</subtitle-level-1>
 <paragraph><location><page_3><loc_12><loc_14><loc_87><loc_18></location>The word processor didn't just change workplaces-it changed culture. It democratized writing, enabling anyone with access to a computer to produce professional-quality documents. This shift had profound implications for education, business, and creative fields:</paragraph>
 <paragraph><location><page_4><loc_15><loc_87><loc_86><loc_91></location>- · Accessibility : Writers no longer needed expensive publishing equipment or training in typesetting to create polished work. This accessibility paved the way for selfpublishing, blogging, and even fan fiction communities.</paragraph>
 <paragraph><location><page_4><loc_15><loc_82><loc_88><loc_87></location>- · Education : Word processors became a cornerstone of education, teaching students not only how to write essays but also how to use technology effectively. Features like bibliography generators and integrated research tools enhanced learning.</paragraph>
 <paragraph><location><page_4><loc_15><loc_77><loc_87><loc_82></location>- · Creative Writing : Writers gained powerful tools to organize their ideas. Programs like Scrivener allowed authors to manage large projects, from novels to screenplays, with features like chapter outlines and character notes.</paragraph>
 <subtitle-level-1><location><page_4><loc_12><loc_70><loc_50><loc_72></location>Word Processors in a Post-Digital Era</subtitle-level-1>
 <paragraph><location><page_4><loc_12><loc_67><loc_88><loc_68></location>As we move further into the 21st century, the role of the word processor continues to evolve:</paragraph>
 <paragraph><location><page_4><loc_15><loc_58><loc_88><loc_65></location>- 1. Artificial Intelligence : Modern word processors are leveraging AI to suggest content improvements. Tools like Grammarly, ProWritingAid, and even native features in Word now analyze tone, conciseness, and clarity. Some AI systems can even generate entire paragraphs or rewrite sentences.</paragraph>
 <paragraph><location><page_4><loc_15><loc_52><loc_86><loc_58></location>- 2. Integration with Other Tools : Word processors are no longer standalone. They integrate with task managers, cloud storage, and project management platforms. For instance, Google Docs syncs with Google Drive, while Microsoft Word integrates seamlessly with OneDrive and Teams.</paragraph>
 <paragraph><location><page_4><loc_15><loc_45><loc_84><loc_52></location>- 3. Voice Typing : Speech-to-text capabilities have made word processing more accessible, particularly for those with disabilities. Tools like Dragon NaturallySpeaking and built-in options in Google Docs and Microsoft Word have made dictation mainstream.</paragraph>
 <paragraph><location><page_4><loc_15><loc_40><loc_87><loc_45></location>- 4. Multimedia Documents : Word processing has expanded beyond text. Modern tools allow users to embed images, videos, charts, and interactive elements, transforming simple documents into rich multimedia experiences.</paragraph>
 <paragraph><location><page_4><loc_15><loc_35><loc_86><loc_40></location>- 5. Cross-Platform Accessibility : Thanks to cloud computing, documents can now be accessed and edited across devices. Whether you're on a desktop, tablet, or smartphone, you can continue working seamlessly.</paragraph>
 <subtitle-level-1><location><page_4><loc_12><loc_29><loc_38><loc_30></location>A Glimpse Into the Future</subtitle-level-1>
 <paragraph><location><page_4><loc_12><loc_24><loc_87><loc_27></location>The word processor's future lies in adaptability and intelligence. Some exciting possibilities include:</paragraph>
 <paragraph><location><page_4><loc_15><loc_19><loc_87><loc_22></location>- · Fully AI-Assisted Writing : Imagine a word processor that understands your writing style, drafts emails, or creates entire essays based on minimal input.</paragraph>
 <paragraph><location><page_4><loc_15><loc_14><loc_88><loc_19></location>- · Immersive Interfaces : As augmented reality (AR) and virtual reality (VR) technology advance, users may be able to write and edit in 3D spaces, collaborating in virtual environments.</paragraph>
 <paragraph><location><page_4><loc_15><loc_11><loc_87><loc_14></location>- · Hyper-Personalization : Word processors could offer dynamic suggestions based on industry-specific needs, user habits, or even regional language variations.</paragraph>
 <paragraph><location><page_5><loc_12><loc_80><loc_86><loc_88></location>The journey of the word processor-from clunky typewriters to AI-powered platforms-reflects humanity's broader technological progress. What began as a tool to simply replace handwriting has transformed into a powerful ally for creativity, communication, and collaboration. As technology continues to advance, the word processor will undoubtedly remain at the heart of how we express ideas and connect with one another.</paragraph>
 </document>
--- a/tests/data/groundtruth/docling_v1/multi_page.json
+++ b/tests/data/groundtruth/docling_v1/multi_page.json
--- a/tests/data/groundtruth/docling_v1/multi_page.md
+++ b/tests/data/groundtruth/docling_v1/multi_page.md
@ -0,0 +1,105 @@
 ## The Evolution of the Word Processor
 The concept of the word processor predates modern computers and has evolved through several technological milestones.
 ## Pre-Digital Era (19th - Early 20th Century)
 The origins of word processing can be traced back to the invention of the typewriter in the mid-19th century. Patented in 1868 by Christopher Latham Sholes, the typewriter revolutionized written communication by enabling people to produce legible, professional documents more efficiently than handwriting.
 During this period, the term "word processing" didn't exist, but the typewriter laid the groundwork for future developments. Over time, advancements such as carbon paper (for copies) and the electric typewriter (introduced by IBM in 1935) improved the speed and convenience of document creation.
 ## The Birth of Word Processing (1960s - 1970s)
 The term "word processor" first emerged in the 1960s and referred to any system designed to streamline written communication and document production. Early word processors were not software programs but rather standalone machines.
 - · IBM MT/ST (Magnetic Tape/Selectric Typewriter) : Introduced in 1964, this machine combined IBM's Selectric typewriter with magnetic tape storage. It allowed users to record, edit, and replay typed content-an early example of digital text storage.
 - · Wang Laboratories : In the 1970s, Wang introduced dedicated word processing machines. These devices, like the Wang 1200, featured small screens and floppy disks, making them revolutionary for their time.
 These machines were primarily used in offices, where secretarial pools benefited from their ability to make revisions without retyping entire documents.
 ## The Rise of Personal Computers (1980s)
 The advent of personal computers in the late 1970s and early 1980s transformed word processing from a niche tool to an essential technology for businesses and individuals alike.
 - · WordStar (1978) : Developed for the CP/M operating system, WordStar was one of the first widely used word processing programs. It featured early examples of modern features like cut, copy, and paste.
 - · Microsoft Word (1983) : Microsoft launched Word for MS-DOS in 1983, introducing a graphical user interface (GUI) and mouse support. Over the years, Microsoft Word became the industry standard for word processing.
 Other notable software from this era included WordPerfect, which was popular among legal professionals, and Apple's MacWrite, which leveraged the Macintosh's graphical capabilities.
 ## The Modern Era (1990s - Present)
 By the 1990s, word processing software had become more sophisticated, with features like spell check, grammar check, templates, and collaborative tools.
 - · Microsoft Office Suite : Microsoft continued to dominate with its Office Suite, integrating Word with other productivity tools like Excel and PowerPoint.
 - · OpenOffice and LibreOffice : Open-source alternatives emerged in the early 2000s, offering free and flexible word processing options.
 - · Google Docs (2006) : The introduction of cloud-based word processing revolutionized collaboration. Google Docs enabled real-time editing and sharing, making it a staple for teams and remote work.
 ## Future of Word Processing
 Today, word processors are more than just tools for typing. They integrate artificial intelligence for grammar and style suggestions (e.g., Grammarly), voice-to-text features, and advanced layout options. As AI continues to advance, word processors may evolve into even more intuitive tools that predict user needs, automate repetitive tasks, and support richer multimedia integration.
 From the clunky typewriters of the 19th century to the AI-powered cloud tools of today, the word processor has come a long way. It remains an essential tool for communication and creativity, shaping how we write and share ideas.
 ## Specialized Word Processing Tools
 In addition to general-purpose word processors, specialized tools have emerged to cater to specific industries and needs. These tools incorporate unique features tailored to their users' workflows:
 - · Academic and Technical Writing : Tools like LaTeX gained popularity among academics, scientists, and engineers. Unlike traditional word processors, LaTeX focuses on precise formatting, particularly for complex mathematical equations, scientific papers, and technical documents. It relies on a markup language to produce polished documents suitable for publishing.
 - · Screenwriting Software : For screenwriters, tools like Final Draft and Celtx are specialized to handle scripts for film and television. These programs automate the formatting of dialogue, scene descriptions, and other elements unique to screenwriting.
 - · Legal Document Processors : Word processors tailored for legal professionals, like WordPerfect, offered features such as redlining (early version tracking) and document comparison. Even today, many law firms rely on these tools due to their robust formatting options for contracts and legal briefs.
 ## Key Features That Changed Word Processing
 The evolution of word processors wasn't just about hardware or software improvements-it was about the features that revolutionized how people wrote and edited. Some of these transformative features include:
 - 1. Undo/Redo : Introduced in the 1980s, the ability to undo mistakes and redo actions made experimentation and error correction much easier.
 - 2. Spell Check and Grammar Check : By the 1990s, these became standard, allowing users to spot errors automatically.
 - 3. Templates : Pre-designed formats for documents, such as resumes, letters, and invoices, helped users save time.
 - 4. Track Changes : A game-changer for collaboration, this feature allowed multiple users to suggest edits while maintaining the original text.
 - 5. Real-Time Collaboration : Tools like Google Docs and Microsoft 365 enabled multiple users to edit the same document simultaneously, forever changing teamwork dynamics.
 ## The Cultural Impact of Word Processors
 The word processor didn't just change workplaces-it changed culture. It democratized writing, enabling anyone with access to a computer to produce professional-quality documents. This shift had profound implications for education, business, and creative fields:
 - · Accessibility : Writers no longer needed expensive publishing equipment or training in typesetting to create polished work. This accessibility paved the way for self-publishing, blogging, and even fan fiction communities.
 - · Education : Word processors became a cornerstone of education, teaching students not only how to write essays but also how to use technology effectively. Features like bibliography generators and integrated research tools enhanced learning.
 - · Creative Writing : Writers gained powerful tools to organize their ideas. Programs like Scrivener allowed authors to manage large projects, from novels to screenplays, with features like chapter outlines and character notes.
 ## Word Processors in a Post-Digital Era
 As we move further into the 21st century, the role of the word processor continues to evolve:
 - 1. Artificial Intelligence : Modern word processors are leveraging AI to suggest content improvements. Tools like Grammarly, ProWritingAid, and even native features in Word now analyze tone, conciseness, and clarity. Some AI systems can even generate entire paragraphs or rewrite sentences.
 - 2. Integration with Other Tools : Word processors are no longer standalone. They integrate with task managers, cloud storage, and project management platforms. For instance, Google Docs syncs with Google Drive, while Microsoft Word integrates seamlessly with OneDrive and Teams.
 - 3. Voice Typing : Speech-to-text capabilities have made word processing more accessible, particularly for those with disabilities. Tools like Dragon NaturallySpeaking and built-in options in Google Docs and Microsoft Word have made dictation mainstream.
 - 4. Multimedia Documents : Word processing has expanded beyond text. Modern tools allow users to embed images, videos, charts, and interactive elements, transforming simple documents into rich multimedia experiences.
 - 5. Cross-Platform Accessibility : Thanks to cloud computing, documents can now be accessed and edited across devices. Whether you're on a desktop, tablet, or smartphone, you can continue working seamlessly.
 ## A Glimpse Into the Future
 The word processor's future lies in adaptability and intelligence. Some exciting possibilities include:
 - · Fully AI-Assisted Writing : Imagine a word processor that understands your writing style, drafts emails, or creates entire essays based on minimal input.
 - · Immersive Interfaces : As augmented reality (AR) and virtual reality (VR) technology advance, users may be able to write and edit in 3D spaces, collaborating in virtual environments.
 - · Hyper-Personalization : Word processors could offer dynamic suggestions based on industry-specific needs, user habits, or even regional language variations.
 The journey of the word processor-from clunky typewriters to AI-powered platforms-reflects humanity's broader technological progress. What began as a tool to simply replace handwriting has transformed into a powerful ally for creativity, communication, and collaboration. As technology continues to advance, the word processor will undoubtedly remain at the heart of how we express ideas and connect with one another.
--- a/tests/data/groundtruth/docling_v1/multi_page.pages.json
+++ b/tests/data/groundtruth/docling_v1/multi_page.pages.json
--- a/tests/data/groundtruth/docling_v2/2305.03393v1-pg9.json
+++ b/tests/data/groundtruth/docling_v2/2305.03393v1-pg9.json
@ -336,9 +336,9 @@
        {
          "page_no": 1,
          "bbox": {
-            "l": 139.6674041748047,
+            "l": 139.66746520996094,
            "t": 454.4546203613281,
-            "r": 475.00927734375,
+            "r": 475.0093078613281,
            "b": 322.5054626464844,
            "coord_origin": "BOTTOMLEFT"
          },
--- a/tests/data/groundtruth/docling_v2/2305.03393v1-pg9.pages.json
+++ b/tests/data/groundtruth/docling_v2/2305.03393v1-pg9.pages.json
@ -2646,7 +2646,7 @@
              "b": 102.78223000000003,
              "coord_origin": "TOPLEFT"
            },
-            "confidence": 0.9373533129692078,
+            "confidence": 0.9373531937599182,
            "cells": [
              {
                "index": 0,
@ -2686,7 +2686,7 @@
              "b": 102.78223000000003,
              "coord_origin": "TOPLEFT"
            },
-            "confidence": 0.8858679533004761,
+            "confidence": 0.8858677744865417,
            "cells": [
              {
                "index": 1,
@ -2881,7 +2881,7 @@
              "b": 255.42400999999995,
              "coord_origin": "TOPLEFT"
            },
-            "confidence": 0.9850425124168396,
+            "confidence": 0.98504239320755,
            "cells": [
              {
                "index": 7,
@ -3096,7 +3096,7 @@
              "b": 327.98218,
              "coord_origin": "TOPLEFT"
            },
-            "confidence": 0.9591907262802124,
+            "confidence": 0.9591910243034363,
            "cells": [
              {
                "index": 15,
@ -3280,9 +3280,9 @@
            "id": 0,
            "label": "table",
            "bbox": {
-              "l": 139.6674041748047,
+              "l": 139.66746520996094,
              "t": 337.5453796386719,
-              "r": 475.00927734375,
+              "r": 475.0093078613281,
              "b": 469.4945373535156,
              "coord_origin": "TOPLEFT"
            },
@ -7852,7 +7852,7 @@
              "b": 618.3,
              "coord_origin": "TOPLEFT"
            },
-            "confidence": 0.9849976301193237,
+            "confidence": 0.9849975109100342,
            "cells": [
              {
                "index": 93,
@ -8184,9 +8184,9 @@
              "id": 0,
              "label": "table",
              "bbox": {
-                "l": 139.6674041748047,
+                "l": 139.66746520996094,
                "t": 337.5453796386719,
-                "r": 475.00927734375,
+                "r": 475.0093078613281,
                "b": 469.4945373535156,
                "coord_origin": "TOPLEFT"
              },
@ -13582,7 +13582,7 @@
              "b": 102.78223000000003,
              "coord_origin": "TOPLEFT"
            },
-            "confidence": 0.9373533129692078,
+            "confidence": 0.9373531937599182,
            "cells": [
              {
                "index": 0,
@ -13628,7 +13628,7 @@
              "b": 102.78223000000003,
              "coord_origin": "TOPLEFT"
            },
-            "confidence": 0.8858679533004761,
+            "confidence": 0.8858677744865417,
            "cells": [
              {
                "index": 1,
@ -13841,7 +13841,7 @@
              "b": 255.42400999999995,
              "coord_origin": "TOPLEFT"
            },
-            "confidence": 0.9850425124168396,
+            "confidence": 0.98504239320755,
            "cells": [
              {
                "index": 7,
@ -14062,7 +14062,7 @@
              "b": 327.98218,
              "coord_origin": "TOPLEFT"
            },
-            "confidence": 0.9591907262802124,
+            "confidence": 0.9591910243034363,
            "cells": [
              {
                "index": 15,
@ -14252,9 +14252,9 @@
            "id": 0,
            "label": "table",
            "bbox": {
-              "l": 139.6674041748047,
+              "l": 139.66746520996094,
              "t": 337.5453796386719,
-              "r": 475.00927734375,
+              "r": 475.0093078613281,
              "b": 469.4945373535156,
              "coord_origin": "TOPLEFT"
            },
@ -19713,7 +19713,7 @@
              "b": 618.3,
              "coord_origin": "TOPLEFT"
            },
-            "confidence": 0.9849976301193237,
+            "confidence": 0.9849975109100342,
            "cells": [
              {
                "index": 93,
@ -20224,7 +20224,7 @@
              "b": 255.42400999999995,
              "coord_origin": "TOPLEFT"
            },
-            "confidence": 0.9850425124168396,
+            "confidence": 0.98504239320755,
            "cells": [
              {
                "index": 7,
@ -20445,7 +20445,7 @@
              "b": 327.98218,
              "coord_origin": "TOPLEFT"
            },
-            "confidence": 0.9591907262802124,
+            "confidence": 0.9591910243034363,
            "cells": [
              {
                "index": 15,
@ -20635,9 +20635,9 @@
            "id": 0,
            "label": "table",
            "bbox": {
-              "l": 139.6674041748047,
+              "l": 139.66746520996094,
              "t": 337.5453796386719,
-              "r": 475.00927734375,
+              "r": 475.0093078613281,
              "b": 469.4945373535156,
              "coord_origin": "TOPLEFT"
            },
@ -26096,7 +26096,7 @@
              "b": 618.3,
              "coord_origin": "TOPLEFT"
            },
-            "confidence": 0.9849976301193237,
+            "confidence": 0.9849975109100342,
            "cells": [
              {
                "index": 93,
@ -26440,7 +26440,7 @@
              "b": 102.78223000000003,
              "coord_origin": "TOPLEFT"
            },
-            "confidence": 0.9373533129692078,
+            "confidence": 0.9373531937599182,
            "cells": [
              {
                "index": 0,
@ -26486,7 +26486,7 @@
              "b": 102.78223000000003,
              "coord_origin": "TOPLEFT"
            },
-            "confidence": 0.8858679533004761,
+            "confidence": 0.8858677744865417,
            "cells": [
              {
                "index": 1,
--- a/tests/data/groundtruth/docling_v2/equations.docx.json
+++ b/tests/data/groundtruth/docling_v2/equations.docx.json
@ -245,7 +245,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "And that is an equation by itself. Cheers!",
-      "text": "And that is an equation by itself. Cheers!"
+      "text": "And that is an equation by itself. Cheers!",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/6",
@ -269,7 +275,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "This is another equation:",
-      "text": "This is another equation:"
+      "text": "This is another equation:",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/8",
@ -305,7 +317,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text.",
-      "text": "This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text."
+      "text": "This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text.",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/11",
@ -413,7 +431,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "And that is an equation by itself. Cheers!",
-      "text": "And that is an equation by itself. Cheers!"
+      "text": "And that is an equation by itself. Cheers!",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/20",
@ -437,7 +461,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "This is another equation:",
-      "text": "This is another equation:"
+      "text": "This is another equation:",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/22",
@ -485,7 +515,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text.",
-      "text": "This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text."
+      "text": "This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text. This is text.",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/26",
@ -593,7 +629,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "And that is an equation by itself. Cheers!",
-      "text": "And that is an equation by itself. Cheers!"
+      "text": "And that is an equation by itself. Cheers!",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/35",
--- a/tests/data/groundtruth/docling_v2/lorem_ipsum.docx.json
+++ b/tests/data/groundtruth/docling_v2/lorem_ipsum.docx.json
@ -61,7 +61,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Proin elit mi, fermentum vitae dolor facilisis, porttitor mollis quam. Cras quam massa, venenatis faucibus libero vel, euismod sollicitudin ipsum. Aliquam semper sapien leo, ac ultrices nibh mollis congue. Cras luctus ultrices est, ut scelerisque eros euismod ut. Curabitur ac tincidunt felis, non scelerisque lectus. Praesent sollicitudin vulputate est id consequat. Vestibulum pharetra ligula sit amet varius porttitor. Sed eros diam, gravida non varius at, scelerisque in libero. Ut auctor finibus mauris sit amet ornare. Sed facilisis leo at urna rhoncus, in facilisis arcu eleifend. Sed tincidunt lacinia fermentum. Cras non purus fringilla, semper quam non, sodales sem. Nulla facilisi.",
-      "text": "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Proin elit mi, fermentum vitae dolor facilisis, porttitor mollis quam. Cras quam massa, venenatis faucibus libero vel, euismod sollicitudin ipsum. Aliquam semper sapien leo, ac ultrices nibh mollis congue. Cras luctus ultrices est, ut scelerisque eros euismod ut. Curabitur ac tincidunt felis, non scelerisque lectus. Praesent sollicitudin vulputate est id consequat. Vestibulum pharetra ligula sit amet varius porttitor. Sed eros diam, gravida non varius at, scelerisque in libero. Ut auctor finibus mauris sit amet ornare. Sed facilisis leo at urna rhoncus, in facilisis arcu eleifend. Sed tincidunt lacinia fermentum. Cras non purus fringilla, semper quam non, sodales sem. Nulla facilisi."
+      "text": "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Proin elit mi, fermentum vitae dolor facilisis, porttitor mollis quam. Cras quam massa, venenatis faucibus libero vel, euismod sollicitudin ipsum. Aliquam semper sapien leo, ac ultrices nibh mollis congue. Cras luctus ultrices est, ut scelerisque eros euismod ut. Curabitur ac tincidunt felis, non scelerisque lectus. Praesent sollicitudin vulputate est id consequat. Vestibulum pharetra ligula sit amet varius porttitor. Sed eros diam, gravida non varius at, scelerisque in libero. Ut auctor finibus mauris sit amet ornare. Sed facilisis leo at urna rhoncus, in facilisis arcu eleifend. Sed tincidunt lacinia fermentum. Cras non purus fringilla, semper quam non, sodales sem. Nulla facilisi.",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/1",
@ -85,7 +91,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Duis condimentum dui eget ullamcorper maximus. Nulla tortor lectus, hendrerit at diam fermentum, euismod ornare orci. Integer ac mauris sed augue ultricies pellentesque. Etiam condimentum turpis a risus dictum, sed tempor arcu vestibulum. Quisque at venenatis tellus. Morbi id lobortis elit. In gravida metus at ornare suscipit. Donec euismod nibh sit amet commodo porttitor. Integer commodo sit amet nisi vel accumsan. Donec lacinia posuere porta. Pellentesque vulputate porta risus, vel consectetur nisl gravida sit amet. Nam scelerisque enim sodales lacus tempor, et tristique ante aliquet.",
-      "text": "Duis condimentum dui eget ullamcorper maximus. Nulla tortor lectus, hendrerit at diam fermentum, euismod ornare orci. Integer ac mauris sed augue ultricies pellentesque. Etiam condimentum turpis a risus dictum, sed tempor arcu vestibulum. Quisque at venenatis tellus. Morbi id lobortis elit. In gravida metus at ornare suscipit. Donec euismod nibh sit amet commodo porttitor. Integer commodo sit amet nisi vel accumsan. Donec lacinia posuere porta. Pellentesque vulputate porta risus, vel consectetur nisl gravida sit amet. Nam scelerisque enim sodales lacus tempor, et tristique ante aliquet."
+      "text": "Duis condimentum dui eget ullamcorper maximus. Nulla tortor lectus, hendrerit at diam fermentum, euismod ornare orci. Integer ac mauris sed augue ultricies pellentesque. Etiam condimentum turpis a risus dictum, sed tempor arcu vestibulum. Quisque at venenatis tellus. Morbi id lobortis elit. In gravida metus at ornare suscipit. Donec euismod nibh sit amet commodo porttitor. Integer commodo sit amet nisi vel accumsan. Donec lacinia posuere porta. Pellentesque vulputate porta risus, vel consectetur nisl gravida sit amet. Nam scelerisque enim sodales lacus tempor, et tristique ante aliquet.",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/3",
@ -109,7 +121,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Maecenas id neque pharetra, eleifend lectus a, vehicula sapien. Aliquam erat volutpat. Ut arcu erat, blandit id elementum at, aliquet pretium mauris. Nulla at semper orci. Nunc sed maximus metus. Duis eget tristique arcu. Phasellus fringilla augue est, ut bibendum est bibendum vitae. Nam et urna interdum, egestas velit a, consectetur metus. Pellentesque facilisis vehicula orci, eu posuere justo imperdiet non. Vestibulum tincidunt orci ac lorem consequat semper. Fusce semper sollicitudin orci, id lacinia nulla faucibus eu. Donec ut nisl metus.",
-      "text": "Maecenas id neque pharetra, eleifend lectus a, vehicula sapien. Aliquam erat volutpat. Ut arcu erat, blandit id elementum at, aliquet pretium mauris. Nulla at semper orci. Nunc sed maximus metus. Duis eget tristique arcu. Phasellus fringilla augue est, ut bibendum est bibendum vitae. Nam et urna interdum, egestas velit a, consectetur metus. Pellentesque facilisis vehicula orci, eu posuere justo imperdiet non. Vestibulum tincidunt orci ac lorem consequat semper. Fusce semper sollicitudin orci, id lacinia nulla faucibus eu. Donec ut nisl metus."
+      "text": "Maecenas id neque pharetra, eleifend lectus a, vehicula sapien. Aliquam erat volutpat. Ut arcu erat, blandit id elementum at, aliquet pretium mauris. Nulla at semper orci. Nunc sed maximus metus. Duis eget tristique arcu. Phasellus fringilla augue est, ut bibendum est bibendum vitae. Nam et urna interdum, egestas velit a, consectetur metus. Pellentesque facilisis vehicula orci, eu posuere justo imperdiet non. Vestibulum tincidunt orci ac lorem consequat semper. Fusce semper sollicitudin orci, id lacinia nulla faucibus eu. Donec ut nisl metus.",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/5",
@ -133,7 +151,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Duis ac tellus sed turpis feugiat aliquam sed vel justo. Fusce sit amet volutpat massa. Duis tristique finibus metus quis tincidunt. Etiam dapibus fringilla diam at pharetra. Vivamus dolor est, hendrerit ac ligula nec, pharetra lacinia sapien. Phasellus at malesuada orci. Maecenas est justo, mollis non ultrices ut, sagittis commodo odio. Integer viverra mauris pellentesque bibendum vestibulum. Sed eu felis mattis, efficitur justo non, finibus lorem. Phasellus viverra diam et sapien imperdiet interdum. Cras a convallis libero. Integer maximus dui vel lorem hendrerit, sit amet convallis ligula lobortis. Duis eu lacus elementum, scelerisque nunc eget, dignissim libero. Suspendisse mi quam, vehicula sit amet pellentesque rhoncus, blandit eu nisl.",
-      "text": "Duis ac tellus sed turpis feugiat aliquam sed vel justo. Fusce sit amet volutpat massa. Duis tristique finibus metus quis tincidunt. Etiam dapibus fringilla diam at pharetra. Vivamus dolor est, hendrerit ac ligula nec, pharetra lacinia sapien. Phasellus at malesuada orci. Maecenas est justo, mollis non ultrices ut, sagittis commodo odio. Integer viverra mauris pellentesque bibendum vestibulum. Sed eu felis mattis, efficitur justo non, finibus lorem. Phasellus viverra diam et sapien imperdiet interdum. Cras a convallis libero. Integer maximus dui vel lorem hendrerit, sit amet convallis ligula lobortis. Duis eu lacus elementum, scelerisque nunc eget, dignissim libero. Suspendisse mi quam, vehicula sit amet pellentesque rhoncus, blandit eu nisl."
+      "text": "Duis ac tellus sed turpis feugiat aliquam sed vel justo. Fusce sit amet volutpat massa. Duis tristique finibus metus quis tincidunt. Etiam dapibus fringilla diam at pharetra. Vivamus dolor est, hendrerit ac ligula nec, pharetra lacinia sapien. Phasellus at malesuada orci. Maecenas est justo, mollis non ultrices ut, sagittis commodo odio. Integer viverra mauris pellentesque bibendum vestibulum. Sed eu felis mattis, efficitur justo non, finibus lorem. Phasellus viverra diam et sapien imperdiet interdum. Cras a convallis libero. Integer maximus dui vel lorem hendrerit, sit amet convallis ligula lobortis. Duis eu lacus elementum, scelerisque nunc eget, dignissim libero. Suspendisse mi quam, vehicula sit amet pellentesque rhoncus, blandit eu nisl.",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/7",
@ -157,7 +181,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Nunc vehicula mattis erat ac consectetur. Etiam pharetra mauris ut tempor pellentesque. Sed vel libero vitae ante tempus sagittis vel sit amet dolor. Etiam faucibus viverra sodales. Pellentesque ullamcorper magna libero, non malesuada dui bibendum quis. Donec sed dolor non sem luctus volutpat. Morbi vel diam ut urna euismod gravida a id lectus. Vestibulum vel mauris eu tellus hendrerit dapibus. Etiam scelerisque lacus vel ante ultricies vulputate. In ullamcorper malesuada justo, vel scelerisque nisl lacinia at. Donec sodales interdum ipsum, ac bibendum ipsum pharetra interdum. Vivamus condimentum ac ante vel aliquam. Ut consectetur eu nibh nec gravida. Vestibulum accumsan, purus at mollis rutrum, sapien tortor accumsan purus, vitae fermentum urna mauris ut lacus. Fusce vitae leo sollicitudin, vehicula turpis eu, tempus nibh.",
-      "text": "Nunc vehicula mattis erat ac consectetur. Etiam pharetra mauris ut tempor pellentesque. Sed vel libero vitae ante tempus sagittis vel sit amet dolor. Etiam faucibus viverra sodales. Pellentesque ullamcorper magna libero, non malesuada dui bibendum quis. Donec sed dolor non sem luctus volutpat. Morbi vel diam ut urna euismod gravida a id lectus. Vestibulum vel mauris eu tellus hendrerit dapibus. Etiam scelerisque lacus vel ante ultricies vulputate. In ullamcorper malesuada justo, vel scelerisque nisl lacinia at. Donec sodales interdum ipsum, ac bibendum ipsum pharetra interdum. Vivamus condimentum ac ante vel aliquam. Ut consectetur eu nibh nec gravida. Vestibulum accumsan, purus at mollis rutrum, sapien tortor accumsan purus, vitae fermentum urna mauris ut lacus. Fusce vitae leo sollicitudin, vehicula turpis eu, tempus nibh."
+      "text": "Nunc vehicula mattis erat ac consectetur. Etiam pharetra mauris ut tempor pellentesque. Sed vel libero vitae ante tempus sagittis vel sit amet dolor. Etiam faucibus viverra sodales. Pellentesque ullamcorper magna libero, non malesuada dui bibendum quis. Donec sed dolor non sem luctus volutpat. Morbi vel diam ut urna euismod gravida a id lectus. Vestibulum vel mauris eu tellus hendrerit dapibus. Etiam scelerisque lacus vel ante ultricies vulputate. In ullamcorper malesuada justo, vel scelerisque nisl lacinia at. Donec sodales interdum ipsum, ac bibendum ipsum pharetra interdum. Vivamus condimentum ac ante vel aliquam. Ut consectetur eu nibh nec gravida. Vestibulum accumsan, purus at mollis rutrum, sapien tortor accumsan purus, vitae fermentum urna mauris ut lacus. Fusce vitae leo sollicitudin, vehicula turpis eu, tempus nibh.",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    }
  ],
  "pictures": [],
--- a/tests/data/groundtruth/docling_v2/multi_page.doctags.txt
+++ b/tests/data/groundtruth/docling_v2/multi_page.doctags.txt
@ -0,0 +1,66 @@
 <doctag><section_header_level_1><loc_60><loc_43><loc_221><loc_51>The Evolution of the Word Processor</section_header_level_1>
 <text><loc_60><loc_59><loc_418><loc_76>The concept of the word processor predates modern computers and has evolved through several technological milestones.</text>
 <section_header_level_1><loc_60><loc_84><loc_274><loc_93>Pre-Digital Era (19th - Early 20th Century)</section_header_level_1>
 <text><loc_60><loc_102><loc_427><loc_134>The origins of word processing can be traced back to the invention of the typewriter in the mid-19th century. Patented in 1868 by Christopher Latham Sholes, the typewriter revolutionized written communication by enabling people to produce legible, professional documents more efficiently than handwriting.</text>
 <text><loc_60><loc_143><loc_424><loc_175>During this period, the term "word processing" didn't exist, but the typewriter laid the groundwork for future developments. Over time, advancements such as carbon paper (for copies) and the electric typewriter (introduced by IBM in 1935) improved the speed and convenience of document creation.</text>
 <section_header_level_1><loc_60><loc_201><loc_283><loc_209>The Birth of Word Processing (1960s - 1970s)</section_header_level_1>
 <text><loc_60><loc_218><loc_440><loc_242>The term "word processor" first emerged in the 1960s and referred to any system designed to streamline written communication and document production. Early word processors were not software programs but rather standalone machines.</text>
 <unordered_list><list_item><loc_76><loc_251><loc_435><loc_283>· IBM MT/ST (Magnetic Tape/Selectric Typewriter) : Introduced in 1964, this machine combined IBM's Selectric typewriter with magnetic tape storage. It allowed users to record, edit, and replay typed content-an early example of digital text storage.</list_item>
 <list_item><loc_76><loc_284><loc_418><loc_308>· Wang Laboratories : In the 1970s, Wang introduced dedicated word processing machines. These devices, like the Wang 1200, featured small screens and floppy disks, making them revolutionary for their time.</list_item>
 </unordered_list>
 <text><loc_60><loc_316><loc_432><loc_333>These machines were primarily used in offices, where secretarial pools benefited from their ability to make revisions without retyping entire documents.</text>
 <section_header_level_1><loc_60><loc_358><loc_258><loc_367>The Rise of Personal Computers (1980s)</section_header_level_1>
 <text><loc_60><loc_375><loc_433><loc_392>The advent of personal computers in the late 1970s and early 1980s transformed word processing from a niche tool to an essential technology for businesses and individuals alike.</text>
 <unordered_list><list_item><loc_76><loc_400><loc_439><loc_424>· WordStar (1978) : Developed for the CP/M operating system, WordStar was one of the first widely used word processing programs. It featured early examples of modern features like cut, copy, and paste.</list_item>
 <list_item><loc_76><loc_425><loc_441><loc_449>· Microsoft Word (1983) : Microsoft launched Word for MS-DOS in 1983, introducing a graphical user interface (GUI) and mouse support. Over the years, Microsoft Word became the industry standard for word processing.</list_item>
 </unordered_list>
 <page_break>
 <text><loc_60><loc_43><loc_434><loc_67>Other notable software from this era included WordPerfect, which was popular among legal professionals, and Apple's MacWrite, which leveraged the Macintosh's graphical capabilities.</text>
 <section_header_level_1><loc_60><loc_93><loc_229><loc_101>The Modern Era (1990s - Present)</section_header_level_1>
 <text><loc_60><loc_110><loc_429><loc_126>By the 1990s, word processing software had become more sophisticated, with features like spell check, grammar check, templates, and collaborative tools.</text>
 <unordered_list><list_item><loc_76><loc_135><loc_413><loc_151>· Microsoft Office Suite : Microsoft continued to dominate with its Office Suite, integrating Word with other productivity tools like Excel and PowerPoint.</list_item>
 <list_item><loc_76><loc_151><loc_435><loc_167>· OpenOffice and LibreOffice : Open-source alternatives emerged in the early 2000s, offering free and flexible word processing options.</list_item>
 <list_item><loc_76><loc_167><loc_441><loc_192>· Google Docs (2006) : The introduction of cloud-based word processing revolutionized collaboration. Google Docs enabled real-time editing and sharing, making it a staple for teams and remote work.</list_item>
 </unordered_list>
 <section_header_level_1><loc_60><loc_217><loc_195><loc_226>Future of Word Processing</section_header_level_1>
 <text><loc_60><loc_234><loc_437><loc_275>Today, word processors are more than just tools for typing. They integrate artificial intelligence for grammar and style suggestions (e.g., Grammarly), voice-to-text features, and advanced layout options. As AI continues to advance, word processors may evolve into even more intuitive tools that predict user needs, automate repetitive tasks, and support richer multimedia integration.</text>
 <text><loc_60><loc_300><loc_433><loc_325>From the clunky typewriters of the 19th century to the AI-powered cloud tools of today, the word processor has come a long way. It remains an essential tool for communication and creativity, shaping how we write and share ideas.</text>
 <page_break>
 <section_header_level_1><loc_60><loc_43><loc_232><loc_52>Specialized Word Processing Tools</section_header_level_1>
 <text><loc_60><loc_60><loc_432><loc_85>In addition to general-purpose word processors, specialized tools have emerged to cater to specific industries and needs. These tools incorporate unique features tailored to their users' workflows:</text>
 <unordered_list><list_item><loc_76><loc_93><loc_436><loc_134>· Academic and Technical Writing : Tools like LaTeX gained popularity among academics, scientists, and engineers. Unlike traditional word processors, LaTeX focuses on precise formatting, particularly for complex mathematical equations, scientific papers, and technical documents. It relies on a markup language to produce polished documents suitable for publishing.</list_item>
 <list_item><loc_76><loc_134><loc_423><loc_167>· Screenwriting Software : For screenwriters, tools like Final Draft and Celtx are specialized to handle scripts for film and television. These programs automate the formatting of dialogue, scene descriptions, and other elements unique to screenwriting.</list_item>
 <list_item><loc_76><loc_167><loc_441><loc_200>· Legal Document Processors : Word processors tailored for legal professionals, like WordPerfect, offered features such as redlining (early version tracking) and document comparison. Even today, many law firms rely on these tools due to their robust formatting options for contracts and legal briefs.</list_item>
 </unordered_list>
 <section_header_level_1><loc_60><loc_225><loc_286><loc_234>Key Features That Changed Word Processing</section_header_level_1>
 <text><loc_60><loc_242><loc_432><loc_267>The evolution of word processors wasn't just about hardware or software improvements-it was about the features that revolutionized how people wrote and edited. Some of these transformative features include:</text>
 <unordered_list><list_item><loc_76><loc_275><loc_428><loc_291>1. Undo/Redo : Introduced in the 1980s, the ability to undo mistakes and redo actions made experimentation and error correction much easier.</list_item>
 <list_item><loc_76><loc_292><loc_434><loc_308>2. Spell Check and Grammar Check : By the 1990s, these became standard, allowing users to spot errors automatically.</list_item>
 <list_item><loc_76><loc_308><loc_409><loc_324>3. Templates : Pre-designed formats for documents, such as resumes, letters, and invoices, helped users save time.</list_item>
 <list_item><loc_76><loc_324><loc_422><loc_340>4. Track Changes : A game-changer for collaboration, this feature allowed multiple users to suggest edits while maintaining the original text.</list_item>
 <list_item><loc_76><loc_341><loc_438><loc_365>5. Real-Time Collaboration : Tools like Google Docs and Microsoft 365 enabled multiple users to edit the same document simultaneously, forever changing teamwork dynamics.</list_item>
 </unordered_list>
 <section_header_level_1><loc_60><loc_390><loc_262><loc_399>The Cultural Impact of Word Processors</section_header_level_1>
 <text><loc_60><loc_408><loc_436><loc_432>The word processor didn't just change workplaces-it changed culture. It democratized writing, enabling anyone with access to a computer to produce professional-quality documents. This shift had profound implications for education, business, and creative fields:</text>
 <page_break>
 <unordered_list><list_item><loc_76><loc_43><loc_432><loc_67>· Accessibility : Writers no longer needed expensive publishing equipment or training in typesetting to create polished work. This accessibility paved the way for selfpublishing, blogging, and even fan fiction communities.</list_item>
 <list_item><loc_76><loc_67><loc_438><loc_92>· Education : Word processors became a cornerstone of education, teaching students not only how to write essays but also how to use technology effectively. Features like bibliography generators and integrated research tools enhanced learning.</list_item>
 <list_item><loc_76><loc_92><loc_433><loc_117>· Creative Writing : Writers gained powerful tools to organize their ideas. Programs like Scrivener allowed authors to manage large projects, from novels to screenplays, with features like chapter outlines and character notes.</list_item>
 </unordered_list>
 <section_header_level_1><loc_60><loc_142><loc_248><loc_151>Word Processors in a Post-Digital Era</section_header_level_1>
 <text><loc_60><loc_159><loc_438><loc_167>As we move further into the 21st century, the role of the word processor continues to evolve:</text>
 <unordered_list><list_item><loc_76><loc_176><loc_440><loc_208>1. Artificial Intelligence : Modern word processors are leveraging AI to suggest content improvements. Tools like Grammarly, ProWritingAid, and even native features in Word now analyze tone, conciseness, and clarity. Some AI systems can even generate entire paragraphs or rewrite sentences.</list_item>
 <list_item><loc_76><loc_208><loc_432><loc_241>2. Integration with Other Tools : Word processors are no longer standalone. They integrate with task managers, cloud storage, and project management platforms. For instance, Google Docs syncs with Google Drive, while Microsoft Word integrates seamlessly with OneDrive and Teams.</list_item>
 <list_item><loc_76><loc_241><loc_422><loc_274>3. Voice Typing : Speech-to-text capabilities have made word processing more accessible, particularly for those with disabilities. Tools like Dragon NaturallySpeaking and built-in options in Google Docs and Microsoft Word have made dictation mainstream.</list_item>
 <list_item><loc_76><loc_274><loc_434><loc_298>4. Multimedia Documents : Word processing has expanded beyond text. Modern tools allow users to embed images, videos, charts, and interactive elements, transforming simple documents into rich multimedia experiences.</list_item>
 <list_item><loc_76><loc_299><loc_429><loc_323>5. Cross-Platform Accessibility : Thanks to cloud computing, documents can now be accessed and edited across devices. Whether you're on a desktop, tablet, or smartphone, you can continue working seamlessly.</list_item>
 </unordered_list>
 <section_header_level_1><loc_60><loc_348><loc_192><loc_357>A Glimpse Into the Future</section_header_level_1>
 <text><loc_60><loc_366><loc_433><loc_382>The word processor's future lies in adaptability and intelligence. Some exciting possibilities include:</text>
 <unordered_list><list_item><loc_76><loc_390><loc_435><loc_406>· Fully AI-Assisted Writing : Imagine a word processor that understands your writing style, drafts emails, or creates entire essays based on minimal input.</list_item>
 <list_item><loc_76><loc_407><loc_441><loc_431>· Immersive Interfaces : As augmented reality (AR) and virtual reality (VR) technology advance, users may be able to write and edit in 3D spaces, collaborating in virtual environments.</list_item>
 <list_item><loc_76><loc_431><loc_436><loc_447>· Hyper-Personalization : Word processors could offer dynamic suggestions based on industry-specific needs, user habits, or even regional language variations.</list_item>
 </unordered_list>
 <page_break>
 <text><loc_60><loc_59><loc_429><loc_100>The journey of the word processor-from clunky typewriters to AI-powered platformsreflects humanity's broader technological progress. What began as a tool to simply replace handwriting has transformed into a powerful ally for creativity, communication, and collaboration. As technology continues to advance, the word processor will undoubtedly remain at the heart of how we express ideas and connect with one another.</text>
 </doctag>
--- a/tests/data/groundtruth/docling_v2/multi_page.json
+++ b/tests/data/groundtruth/docling_v2/multi_page.json
--- a/tests/data/groundtruth/docling_v2/multi_page.md
+++ b/tests/data/groundtruth/docling_v2/multi_page.md
@ -0,0 +1,87 @@
 ## The Evolution of the Word Processor
 The concept of the word processor predates modern computers and has evolved through several technological milestones.
 ## Pre-Digital Era (19th - Early 20th Century)
 The origins of word processing can be traced back to the invention of the typewriter in the mid-19th century. Patented in 1868 by Christopher Latham Sholes, the typewriter revolutionized written communication by enabling people to produce legible, professional documents more efficiently than handwriting.
 During this period, the term "word processing" didn't exist, but the typewriter laid the groundwork for future developments. Over time, advancements such as carbon paper (for copies) and the electric typewriter (introduced by IBM in 1935) improved the speed and convenience of document creation.
 ## The Birth of Word Processing (1960s - 1970s)
 The term "word processor" first emerged in the 1960s and referred to any system designed to streamline written communication and document production. Early word processors were not software programs but rather standalone machines.
 - · IBM MT/ST (Magnetic Tape/Selectric Typewriter) : Introduced in 1964, this machine combined IBM's Selectric typewriter with magnetic tape storage. It allowed users to record, edit, and replay typed content-an early example of digital text storage.
 - · Wang Laboratories : In the 1970s, Wang introduced dedicated word processing machines. These devices, like the Wang 1200, featured small screens and floppy disks, making them revolutionary for their time.
 These machines were primarily used in offices, where secretarial pools benefited from their ability to make revisions without retyping entire documents.
 ## The Rise of Personal Computers (1980s)
 The advent of personal computers in the late 1970s and early 1980s transformed word processing from a niche tool to an essential technology for businesses and individuals alike.
 - · WordStar (1978) : Developed for the CP/M operating system, WordStar was one of the first widely used word processing programs. It featured early examples of modern features like cut, copy, and paste.
 - · Microsoft Word (1983) : Microsoft launched Word for MS-DOS in 1983, introducing a graphical user interface (GUI) and mouse support. Over the years, Microsoft Word became the industry standard for word processing.
 Other notable software from this era included WordPerfect, which was popular among legal professionals, and Apple's MacWrite, which leveraged the Macintosh's graphical capabilities.
 ## The Modern Era (1990s - Present)
 By the 1990s, word processing software had become more sophisticated, with features like spell check, grammar check, templates, and collaborative tools.
 - · Microsoft Office Suite : Microsoft continued to dominate with its Office Suite, integrating Word with other productivity tools like Excel and PowerPoint.
 - · OpenOffice and LibreOffice : Open-source alternatives emerged in the early 2000s, offering free and flexible word processing options.
 - · Google Docs (2006) : The introduction of cloud-based word processing revolutionized collaboration. Google Docs enabled real-time editing and sharing, making it a staple for teams and remote work.
 ## Future of Word Processing
 Today, word processors are more than just tools for typing. They integrate artificial intelligence for grammar and style suggestions (e.g., Grammarly), voice-to-text features, and advanced layout options. As AI continues to advance, word processors may evolve into even more intuitive tools that predict user needs, automate repetitive tasks, and support richer multimedia integration.
 From the clunky typewriters of the 19th century to the AI-powered cloud tools of today, the word processor has come a long way. It remains an essential tool for communication and creativity, shaping how we write and share ideas.
 ## Specialized Word Processing Tools
 In addition to general-purpose word processors, specialized tools have emerged to cater to specific industries and needs. These tools incorporate unique features tailored to their users' workflows:
 - · Academic and Technical Writing : Tools like LaTeX gained popularity among academics, scientists, and engineers. Unlike traditional word processors, LaTeX focuses on precise formatting, particularly for complex mathematical equations, scientific papers, and technical documents. It relies on a markup language to produce polished documents suitable for publishing.
 - · Screenwriting Software : For screenwriters, tools like Final Draft and Celtx are specialized to handle scripts for film and television. These programs automate the formatting of dialogue, scene descriptions, and other elements unique to screenwriting.
 - · Legal Document Processors : Word processors tailored for legal professionals, like WordPerfect, offered features such as redlining (early version tracking) and document comparison. Even today, many law firms rely on these tools due to their robust formatting options for contracts and legal briefs.
 ## Key Features That Changed Word Processing
 The evolution of word processors wasn't just about hardware or software improvements-it was about the features that revolutionized how people wrote and edited. Some of these transformative features include:
 - 1. Undo/Redo : Introduced in the 1980s, the ability to undo mistakes and redo actions made experimentation and error correction much easier.
 - 2. Spell Check and Grammar Check : By the 1990s, these became standard, allowing users to spot errors automatically.
 - 3. Templates : Pre-designed formats for documents, such as resumes, letters, and invoices, helped users save time.
 - 4. Track Changes : A game-changer for collaboration, this feature allowed multiple users to suggest edits while maintaining the original text.
 - 5. Real-Time Collaboration : Tools like Google Docs and Microsoft 365 enabled multiple users to edit the same document simultaneously, forever changing teamwork dynamics.
 ## The Cultural Impact of Word Processors
 The word processor didn't just change workplaces-it changed culture. It democratized writing, enabling anyone with access to a computer to produce professional-quality documents. This shift had profound implications for education, business, and creative fields:
 - · Accessibility : Writers no longer needed expensive publishing equipment or training in typesetting to create polished work. This accessibility paved the way for selfpublishing, blogging, and even fan fiction communities.
 - · Education : Word processors became a cornerstone of education, teaching students not only how to write essays but also how to use technology effectively. Features like bibliography generators and integrated research tools enhanced learning.
 - · Creative Writing : Writers gained powerful tools to organize their ideas. Programs like Scrivener allowed authors to manage large projects, from novels to screenplays, with features like chapter outlines and character notes.
 ## Word Processors in a Post-Digital Era
 As we move further into the 21st century, the role of the word processor continues to evolve:
 - 1. Artificial Intelligence : Modern word processors are leveraging AI to suggest content improvements. Tools like Grammarly, ProWritingAid, and even native features in Word now analyze tone, conciseness, and clarity. Some AI systems can even generate entire paragraphs or rewrite sentences.
 - 2. Integration with Other Tools : Word processors are no longer standalone. They integrate with task managers, cloud storage, and project management platforms. For instance, Google Docs syncs with Google Drive, while Microsoft Word integrates seamlessly with OneDrive and Teams.
 - 3. Voice Typing : Speech-to-text capabilities have made word processing more accessible, particularly for those with disabilities. Tools like Dragon NaturallySpeaking and built-in options in Google Docs and Microsoft Word have made dictation mainstream.
 - 4. Multimedia Documents : Word processing has expanded beyond text. Modern tools allow users to embed images, videos, charts, and interactive elements, transforming simple documents into rich multimedia experiences.
 - 5. Cross-Platform Accessibility : Thanks to cloud computing, documents can now be accessed and edited across devices. Whether you're on a desktop, tablet, or smartphone, you can continue working seamlessly.
 ## A Glimpse Into the Future
 The word processor's future lies in adaptability and intelligence. Some exciting possibilities include:
 - · Fully AI-Assisted Writing : Imagine a word processor that understands your writing style, drafts emails, or creates entire essays based on minimal input.
 - · Immersive Interfaces : As augmented reality (AR) and virtual reality (VR) technology advance, users may be able to write and edit in 3D spaces, collaborating in virtual environments.
 - · Hyper-Personalization : Word processors could offer dynamic suggestions based on industry-specific needs, user habits, or even regional language variations.
 The journey of the word processor-from clunky typewriters to AI-powered platformsreflects humanity's broader technological progress. What began as a tool to simply replace handwriting has transformed into a powerful ally for creativity, communication, and collaboration. As technology continues to advance, the word processor will undoubtedly remain at the heart of how we express ideas and connect with one another.
--- a/tests/data/groundtruth/docling_v2/multi_page.pages.json
+++ b/tests/data/groundtruth/docling_v2/multi_page.pages.json
--- a/tests/data/groundtruth/docling_v2/powerpoint_sample.pptx.json
+++ b/tests/data/groundtruth/docling_v2/powerpoint_sample.pptx.json
@ -326,8 +326,8 @@
          ]
        }
      ],
-      "orig": "Let\u2019s introduce a list",
+      "orig": "Let’s introduce a list",
-      "text": "Let\u2019s introduce a list"
+      "text": "Let’s introduce a list"
    },
    {
      "self_ref": "#/texts/4",
--- a/tests/data/groundtruth/docling_v2/tablecell.docx.json
+++ b/tests/data/groundtruth/docling_v2/tablecell.docx.json
@ -74,6 +74,12 @@
      "prov": [],
      "orig": "Hello world1",
      "text": "Hello world1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -88,6 +94,12 @@
      "prov": [],
      "orig": "Hello2",
      "text": "Hello2",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -113,7 +125,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Some text before",
-      "text": "Some text before"
+      "text": "Some text before",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/4",
@ -149,7 +167,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Some text after",
-      "text": "Some text after"
+      "text": "Some text after",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    }
  ],
  "pictures": [],
--- a/tests/data/groundtruth/docling_v2/test_emf_docx.docx.json
+++ b/tests/data/groundtruth/docling_v2/test_emf_docx.docx.json
@ -55,7 +55,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Test with three images in unusual formats",
-      "text": "Test with three images in unusual formats"
+      "text": "Test with three images in unusual formats",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/1",
@ -67,7 +73,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Raster in emf:",
-      "text": "Raster in emf:"
+      "text": "Raster in emf:",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/2",
@ -79,7 +91,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Vector in emf:",
-      "text": "Vector in emf:"
+      "text": "Vector in emf:",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/3",
@ -91,7 +109,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Raster in webp:",
-      "text": "Raster in webp:"
+      "text": "Raster in webp:",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    }
  ],
  "pictures": [
--- a/tests/data/groundtruth/docling_v2/unit_test_formatting.docx.json
+++ b/tests/data/groundtruth/docling_v2/unit_test_formatting.docx.json
@ -232,6 +232,12 @@
      "prov": [],
      "orig": "hyperlink",
      "text": "hyperlink",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "hyperlink": "https:/github.com/DS4SD/docling"
    },
    {
@ -263,7 +269,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Normal",
-      "text": "Normal"
+      "text": "Normal",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/6",
@ -329,7 +341,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "and",
-      "text": "and"
+      "text": "and",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/10",
@ -342,6 +360,12 @@
      "prov": [],
      "orig": "hyperlink",
      "text": "hyperlink",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "hyperlink": "https:/github.com/DS4SD/docling"
    },
    {
@ -354,7 +378,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "on the same line",
-      "text": "on the same line"
+      "text": "on the same line",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/12",
@ -439,6 +469,12 @@
      "prov": [],
      "orig": "Some",
      "text": "Some",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -513,6 +549,12 @@
      "prov": [],
      "orig": "Nested",
      "text": "Nested",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
--- a/tests/data/groundtruth/docling_v2/unit_test_headers.docx.json
+++ b/tests/data/groundtruth/docling_v2/unit_test_headers.docx.json
@ -133,7 +133,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Paragraph 1.1",
-      "text": "Paragraph 1.1"
+      "text": "Paragraph 1.1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/5",
@ -157,7 +163,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Paragraph 1.2",
-      "text": "Paragraph 1.2"
+      "text": "Paragraph 1.2",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/7",
@ -222,7 +234,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Paragraph 1.1.1",
-      "text": "Paragraph 1.1.1"
+      "text": "Paragraph 1.1.1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/11",
@ -246,7 +264,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Paragraph 1.1.2",
-      "text": "Paragraph 1.1.2"
+      "text": "Paragraph 1.1.2",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/13",
@ -314,7 +338,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Paragraph 1.1.1",
-      "text": "Paragraph 1.1.1"
+      "text": "Paragraph 1.1.1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/17",
@ -338,7 +368,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Paragraph 1.1.2",
-      "text": "Paragraph 1.1.2"
+      "text": "Paragraph 1.1.2",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/19",
@ -406,7 +442,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Paragraph 1.2.3.1",
-      "text": "Paragraph 1.2.3.1"
+      "text": "Paragraph 1.2.3.1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/23",
@ -430,7 +472,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Paragraph 1.2.3.1",
-      "text": "Paragraph 1.2.3.1"
+      "text": "Paragraph 1.2.3.1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/25",
@ -513,7 +561,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Paragraph 2.1",
-      "text": "Paragraph 2.1"
+      "text": "Paragraph 2.1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/30",
@ -537,7 +591,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Paragraph 2.2",
-      "text": "Paragraph 2.2"
+      "text": "Paragraph 2.2",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/32",
@ -602,7 +662,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Paragraph 2.1.1.1",
-      "text": "Paragraph 2.1.1.1"
+      "text": "Paragraph 2.1.1.1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/36",
@ -626,7 +692,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Paragraph 2.1.1.1",
-      "text": "Paragraph 2.1.1.1"
+      "text": "Paragraph 2.1.1.1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/38",
@ -694,7 +766,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Paragraph 2.1.1",
-      "text": "Paragraph 2.1.1"
+      "text": "Paragraph 2.1.1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/42",
@ -718,7 +796,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Paragraph 2.1.2",
-      "text": "Paragraph 2.1.2"
+      "text": "Paragraph 2.1.2",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/44",
--- a/tests/data/groundtruth/docling_v2/unit_test_headers_numbered.docx.json
+++ b/tests/data/groundtruth/docling_v2/unit_test_headers_numbered.docx.json
@ -209,7 +209,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Paragraph 1.1",
-      "text": "Paragraph 1.1"
+      "text": "Paragraph 1.1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/5",
@ -233,7 +239,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Paragraph 1.2",
-      "text": "Paragraph 1.2"
+      "text": "Paragraph 1.2",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/7",
@ -298,7 +310,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Paragraph 1.1.1",
-      "text": "Paragraph 1.1.1"
+      "text": "Paragraph 1.1.1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/11",
@ -322,7 +340,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Paragraph 1.1.2",
-      "text": "Paragraph 1.1.2"
+      "text": "Paragraph 1.1.2",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/13",
@ -390,7 +414,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Paragraph 1.1.1",
-      "text": "Paragraph 1.1.1"
+      "text": "Paragraph 1.1.1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/17",
@ -414,7 +444,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Paragraph 1.1.2",
-      "text": "Paragraph 1.1.2"
+      "text": "Paragraph 1.1.2",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/19",
@ -482,7 +518,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Paragraph 1.2.3.1",
-      "text": "Paragraph 1.2.3.1"
+      "text": "Paragraph 1.2.3.1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/23",
@ -506,7 +548,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Paragraph 1.2.3.1",
-      "text": "Paragraph 1.2.3.1"
+      "text": "Paragraph 1.2.3.1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/25",
@ -567,7 +615,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Paragraph 2.1",
-      "text": "Paragraph 2.1"
+      "text": "Paragraph 2.1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/30",
@ -591,7 +645,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Paragraph 2.2",
-      "text": "Paragraph 2.2"
+      "text": "Paragraph 2.2",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/32",
@ -656,7 +716,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Paragraph 2.1.1.1",
-      "text": "Paragraph 2.1.1.1"
+      "text": "Paragraph 2.1.1.1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/36",
@ -680,7 +746,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Paragraph 2.1.1.1",
-      "text": "Paragraph 2.1.1.1"
+      "text": "Paragraph 2.1.1.1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/38",
@ -748,7 +820,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Paragraph 2.1.1",
-      "text": "Paragraph 2.1.1"
+      "text": "Paragraph 2.1.1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/42",
@ -772,7 +850,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Paragraph 2.1.2",
-      "text": "Paragraph 2.1.2"
+      "text": "Paragraph 2.1.2",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/44",
--- a/tests/data/groundtruth/docling_v2/unit_test_lists.docx.json
+++ b/tests/data/groundtruth/docling_v2/unit_test_lists.docx.json
@ -365,7 +365,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Paragraph 2.1.1",
-      "text": "Paragraph 2.1.1"
+      "text": "Paragraph 2.1.1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/4",
@ -389,7 +395,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Paragraph 2.1.2",
-      "text": "Paragraph 2.1.2"
+      "text": "Paragraph 2.1.2",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/6",
@ -434,6 +446,12 @@
      "prov": [],
      "orig": "List item 1",
      "text": "List item 1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -448,6 +466,12 @@
      "prov": [],
      "orig": "List item 2",
      "text": "List item 2",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -462,6 +486,12 @@
      "prov": [],
      "orig": "List item 3",
      "text": "List item 3",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -508,6 +538,12 @@
      "prov": [],
      "orig": "List item a",
      "text": "List item a",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -522,6 +558,12 @@
      "prov": [],
      "orig": "List item b",
      "text": "List item b",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -536,6 +578,12 @@
      "prov": [],
      "orig": "List item c",
      "text": "List item c",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -582,6 +630,12 @@
      "prov": [],
      "orig": "List item 1",
      "text": "List item 1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -596,6 +650,12 @@
      "prov": [],
      "orig": "List item 2",
      "text": "List item 2",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -610,6 +670,12 @@
      "prov": [],
      "orig": "List item 1.1",
      "text": "List item 1.1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -624,6 +690,12 @@
      "prov": [],
      "orig": "List item 1.2",
      "text": "List item 1.2",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -638,6 +710,12 @@
      "prov": [],
      "orig": "List item 1.3",
      "text": "List item 1.3",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -652,6 +730,12 @@
      "prov": [],
      "orig": "List item 3",
      "text": "List item 3",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -698,6 +782,12 @@
      "prov": [],
      "orig": "List item 1",
      "text": "List item 1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -712,6 +802,12 @@
      "prov": [],
      "orig": "List item 1.1",
      "text": "List item 1.1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -726,6 +822,12 @@
      "prov": [],
      "orig": "List item 2",
      "text": "List item 2",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -772,6 +874,12 @@
      "prov": [],
      "orig": "List item 1",
      "text": "List item 1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -786,6 +894,12 @@
      "prov": [],
      "orig": "List item 1.1",
      "text": "List item 1.1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -800,6 +914,12 @@
      "prov": [],
      "orig": "List item 1.1.1",
      "text": "List item 1.1.1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -814,6 +934,12 @@
      "prov": [],
      "orig": "List item 3",
      "text": "List item 3",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -866,6 +992,12 @@
      "prov": [],
      "orig": "List item 1",
      "text": "List item 1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -880,6 +1012,12 @@
      "prov": [],
      "orig": "List item 2",
      "text": "List item 2",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -894,6 +1032,12 @@
      "prov": [],
      "orig": "List item 1.1",
      "text": "List item 1.1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -908,6 +1052,12 @@
      "prov": [],
      "orig": "List item 1.2",
      "text": "List item 1.2",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -922,6 +1072,12 @@
      "prov": [],
      "orig": "List item 1.2.1",
      "text": "List item 1.2.1",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -936,6 +1092,12 @@
      "prov": [],
      "orig": "List item 3",
      "text": "List item 3",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
--- a/tests/data/groundtruth/docling_v2/wiki_duck.html.json
+++ b/tests/data/groundtruth/docling_v2/wiki_duck.html.json
--- a/tests/data/groundtruth/docling_v2/word_sample.docx.json
+++ b/tests/data/groundtruth/docling_v2/word_sample.docx.json
@ -101,7 +101,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Summer activities",
-      "text": "Summer activities"
+      "text": "Summer activities",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/1",
@ -138,7 +144,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Duck",
-      "text": "Duck"
+      "text": "Duck",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/3",
@ -150,7 +162,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Figure 1: This is a cute duckling",
-      "text": "Figure 1: This is a cute duckling"
+      "text": "Figure 1: This is a cute duckling",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/4",
@ -180,8 +198,8 @@
      "content_layer": "body",
      "label": "section_header",
      "prov": [],
-      "orig": "Let\u2019s swim!",
+      "orig": "Let’s swim!",
-      "text": "Let\u2019s swim!",
+      "text": "Let’s swim!",
      "level": 1
    },
    {
@ -194,7 +212,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "To get started with swimming, first lay down in a water and try not to drown:",
-      "text": "To get started with swimming, first lay down in a water and try not to drown:"
+      "text": "To get started with swimming, first lay down in a water and try not to drown:",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/6",
@ -207,6 +231,12 @@
      "prov": [],
      "orig": "You can relax and look around",
      "text": "You can relax and look around",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -221,6 +251,12 @@
      "prov": [],
      "orig": "Paddle about",
      "text": "Paddle about",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -235,6 +271,12 @@
      "prov": [],
      "orig": "Enjoy summer warmth",
      "text": "Enjoy summer warmth",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -247,8 +289,14 @@
      "content_layer": "body",
      "label": "paragraph",
      "prov": [],
-      "orig": "Also, don\u2019t forget:",
+      "orig": "Also, don’t forget:",
-      "text": "Also, don\u2019t forget:"
+      "text": "Also, don’t forget:",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/10",
@ -261,6 +309,12 @@
      "prov": [],
      "orig": "Wear sunglasses",
      "text": "Wear sunglasses",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -273,8 +327,14 @@
      "content_layer": "body",
      "label": "list_item",
      "prov": [],
-      "orig": "Don\u2019t forget to drink water",
+      "orig": "Don’t forget to drink water",
-      "text": "Don\u2019t forget to drink water",
+      "text": "Don’t forget to drink water",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -289,6 +349,12 @@
      "prov": [],
      "orig": "Use sun cream",
      "text": "Use sun cream",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -301,8 +367,14 @@
      "content_layer": "body",
      "label": "paragraph",
      "prov": [],
-      "orig": "Hmm, what else\u2026",
+      "orig": "Hmm, what else…",
-      "text": "Hmm, what else\u2026"
+      "text": "Hmm, what else…",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/14",
@ -335,8 +407,8 @@
      "content_layer": "body",
      "label": "section_header",
      "prov": [],
-      "orig": "Let\u2019s eat",
+      "orig": "Let’s eat",
-      "text": "Let\u2019s eat",
+      "text": "Let’s eat",
      "level": 2
    },
    {
@ -348,8 +420,14 @@
      "content_layer": "body",
      "label": "paragraph",
      "prov": [],
-      "orig": "After we had a good day of swimming in the lake, it\u2019s important to eat something nice",
+      "orig": "After we had a good day of swimming in the lake, it’s important to eat something nice",
-      "text": "After we had a good day of swimming in the lake, it\u2019s important to eat something nice"
+      "text": "After we had a good day of swimming in the lake, it’s important to eat something nice",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/16",
@ -361,7 +439,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "I like to eat leaves",
-      "text": "I like to eat leaves"
+      "text": "I like to eat leaves",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/17",
@ -373,7 +457,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "Here are some interesting things a respectful duck could eat:",
-      "text": "Here are some interesting things a respectful duck could eat:"
+      "text": "Here are some interesting things a respectful duck could eat:",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/18",
@ -396,8 +486,14 @@
      "content_layer": "body",
      "label": "paragraph",
      "prov": [],
-      "orig": "And let\u2019s add another list in the end:",
+      "orig": "And let’s add another list in the end:",
-      "text": "And let\u2019s add another list in the end:"
+      "text": "And let’s add another list in the end:",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/20",
@ -410,6 +506,12 @@
      "prov": [],
      "orig": "Leaves",
      "text": "Leaves",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -424,6 +526,12 @@
      "prov": [],
      "orig": "Berries",
      "text": "Berries",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    },
@ -438,6 +546,12 @@
      "prov": [],
      "orig": "Grain",
      "text": "Grain",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      },
      "enumerated": false,
      "marker": "-"
    }
--- a/tests/data/groundtruth/docling_v2/word_tables.docx.json
+++ b/tests/data/groundtruth/docling_v2/word_tables.docx.json
@ -114,7 +114,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "A uniform table",
-      "text": "A uniform table"
+      "text": "A uniform table",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/2",
@ -138,7 +144,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "A non-uniform table with horizontal spans",
-      "text": "A non-uniform table with horizontal spans"
+      "text": "A non-uniform table with horizontal spans",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/4",
@ -162,7 +174,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "A non-uniform table with horizontal spans in inner columns",
-      "text": "A non-uniform table with horizontal spans in inner columns"
+      "text": "A non-uniform table with horizontal spans in inner columns",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/6",
@ -186,7 +204,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "A non-uniform table with vertical spans",
-      "text": "A non-uniform table with vertical spans"
+      "text": "A non-uniform table with vertical spans",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/8",
@ -210,7 +234,13 @@
      "label": "paragraph",
      "prov": [],
      "orig": "A non-uniform table with all kinds of spans and empty cells",
-      "text": "A non-uniform table with all kinds of spans and empty cells"
+      "text": "A non-uniform table with all kinds of spans and empty cells",
      "formatting": {
        "bold": false,
        "italic": false,
        "underline": false,
        "strikethrough": false
      }
    },
    {
      "self_ref": "#/texts/10",
--- a/tests/data/pdf/multi_page.pdf
+++ b/tests/data/pdf/multi_page.pdf
--- a/tests/data/webp/groundtruth/docling_v2/webp-test.doctags.txt
+++ b/tests/data/webp/groundtruth/docling_v2/webp-test.doctags.txt
@ -0,0 +1,2 @@
 <doctag><text><loc_60><loc_46><loc_424><loc_91>Docling bundles PDF document conversion to JSON and Markdown in an easy self contained package</text>
 </doctag>
--- a/tests/data/webp/groundtruth/docling_v2/webp-test.json
+++ b/tests/data/webp/groundtruth/docling_v2/webp-test.json
@ -0,0 +1,77 @@
 {
  "schema_name": "DoclingDocument",
  "version": "1.3.0",
  "name": "webp-test",
  "origin": {
    "mimetype": "application/pdf",
    "binary_hash": 16115062463007057787,
    "filename": "webp-test.webp",
    "uri": null
  },
  "furniture": {
    "self_ref": "#/furniture",
    "parent": null,
    "children": [],
    "content_layer": "furniture",
    "name": "_root_",
    "label": "unspecified"
  },
  "body": {
    "self_ref": "#/body",
    "parent": null,
    "children": [
      {
        "cref": "#/texts/0"
      }
    ],
    "content_layer": "body",
    "name": "_root_",
    "label": "unspecified"
  },
  "groups": [],
  "texts": [
    {
      "self_ref": "#/texts/0",
      "parent": {
        "cref": "#/body"
      },
      "children": [],
      "content_layer": "body",
      "label": "text",
      "prov": [
        {
          "page_no": 1,
          "bbox": {
            "l": 238.19302423176944,
            "t": 2570.0959833241664,
            "r": 1696.0985546594009,
            "b": 2315.204273887442,
            "coord_origin": "BOTTOMLEFT"
          },
          "charspan": [
            0,
            94
          ]
        }
      ],
      "orig": "Docling bundles PDF document conversion to JSON and Markdown in an easy self contained package",
      "text": "Docling bundles PDF document conversion to JSON and Markdown in an easy self contained package",
      "formatting": null,
      "hyperlink": null
    }
  ],
  "pictures": [],
  "tables": [],
  "key_value_items": [],
  "form_items": [],
  "pages": {
    "1": {
      "size": {
        "width": 2000.0,
        "height": 2829.0
      },
      "image": null,
      "page_no": 1
    }
  }
 }
--- a/tests/data/webp/groundtruth/docling_v2/webp-test.md
+++ b/tests/data/webp/groundtruth/docling_v2/webp-test.md
@ -0,0 +1 @@
 Docling bundles PDF document conversion to JSON and Markdown in an easy self contained package
--- a/tests/data/webp/groundtruth/docling_v2/webp-test.pages.json
+++ b/tests/data/webp/groundtruth/docling_v2/webp-test.pages.json
@ -0,0 +1,388 @@
 [
  {
    "page_no": 0,
    "size": {
      "width": 2000.0,
      "height": 2829.0
    },
    "cells": [
      {
        "index": 0,
        "rgba": {
          "r": 0,
          "g": 0,
          "b": 0,
          "a": 255
        },
        "rect": {
          "r_x0": 246.4065456254215,
          "r_y0": 329.06770715202435,
          "r_x1": 1691.991797818404,
          "r_y1": 329.06770715202435,
          "r_x2": 1691.991797818404,
          "r_y2": 258.9040166758338,
          "r_x3": 246.4065456254215,
          "r_y3": 258.9040166758338,
          "coord_origin": "TOPLEFT"
        },
        "text": "Docling bundles PDF document conversion to",
        "orig": "Docling bundles PDF document conversion to",
        "text_direction": "left_to_right",
        "confidence": 1.0,
        "from_ocr": true
      },
      {
        "index": 1,
        "rgba": {
          "r": 0,
          "g": 0,
          "b": 0,
          "a": 255
        },
        "rect": {
          "r_x0": 238.19302423176944,
          "r_y0": 415.36904822716525,
          "r_x1": 1696.0985546594009,
          "r_y1": 415.36904822716525,
          "r_x2": 1696.0985546594009,
          "r_y2": 345.20535775097477,
          "r_x3": 238.19302423176944,
          "r_y3": 345.20535775097477,
          "coord_origin": "TOPLEFT"
        },
        "text": "JSON and Markdown in an easy self contained",
        "orig": "JSON and Markdown in an easy self contained",
        "text_direction": "left_to_right",
        "confidence": 1.0,
        "from_ocr": true
      },
      {
        "index": 2,
        "rgba": {
          "r": 0,
          "g": 0,
          "b": 0,
          "a": 255
        },
        "rect": {
          "r_x0": 245.43122061153045,
          "r_y0": 513.795726112558,
          "r_x1": 514.3223724413002,
          "r_y1": 513.795726112558,
          "r_x2": 514.3223724413002,
          "r_y2": 436.0574704074058,
          "r_x3": 245.43122061153045,
          "r_y3": 436.0574704074058,
          "coord_origin": "TOPLEFT"
        },
        "text": "package",
        "orig": "package",
        "text_direction": "left_to_right",
        "confidence": 1.0,
        "from_ocr": true
      }
    ],
    "parsed_page": null,
    "predictions": {
      "layout": {
        "clusters": [
          {
            "id": 0,
            "label": "text",
            "bbox": {
              "l": 238.19302423176944,
              "t": 258.9040166758338,
              "r": 1696.0985546594009,
              "b": 513.795726112558,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.9721010327339172,
            "cells": [
              {
                "index": 0,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 246.4065456254215,
                  "r_y0": 329.06770715202435,
                  "r_x1": 1691.991797818404,
                  "r_y1": 329.06770715202435,
                  "r_x2": 1691.991797818404,
                  "r_y2": 258.9040166758338,
                  "r_x3": 246.4065456254215,
                  "r_y3": 258.9040166758338,
                  "coord_origin": "TOPLEFT"
                },
                "text": "Docling bundles PDF document conversion to",
                "orig": "Docling bundles PDF document conversion to",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              },
              {
                "index": 1,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 238.19302423176944,
                  "r_y0": 415.36904822716525,
                  "r_x1": 1696.0985546594009,
                  "r_y1": 415.36904822716525,
                  "r_x2": 1696.0985546594009,
                  "r_y2": 345.20535775097477,
                  "r_x3": 238.19302423176944,
                  "r_y3": 345.20535775097477,
                  "coord_origin": "TOPLEFT"
                },
                "text": "JSON and Markdown in an easy self contained",
                "orig": "JSON and Markdown in an easy self contained",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              },
              {
                "index": 2,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 245.43122061153045,
                  "r_y0": 513.795726112558,
                  "r_x1": 514.3223724413002,
                  "r_y1": 513.795726112558,
                  "r_x2": 514.3223724413002,
                  "r_y2": 436.0574704074058,
                  "r_x3": 245.43122061153045,
                  "r_y3": 436.0574704074058,
                  "coord_origin": "TOPLEFT"
                },
                "text": "package",
                "orig": "package",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          }
        ]
      },
      "tablestructure": {
        "table_map": {}
      },
      "figures_classification": null,
      "equations_prediction": null,
      "vlm_response": null
    },
    "assembled": {
      "elements": [
        {
          "label": "text",
          "id": 0,
          "page_no": 0,
          "cluster": {
            "id": 0,
            "label": "text",
            "bbox": {
              "l": 238.19302423176944,
              "t": 258.9040166758338,
              "r": 1696.0985546594009,
              "b": 513.795726112558,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.9721010327339172,
            "cells": [
              {
                "index": 0,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 246.4065456254215,
                  "r_y0": 329.06770715202435,
                  "r_x1": 1691.991797818404,
                  "r_y1": 329.06770715202435,
                  "r_x2": 1691.991797818404,
                  "r_y2": 258.9040166758338,
                  "r_x3": 246.4065456254215,
                  "r_y3": 258.9040166758338,
                  "coord_origin": "TOPLEFT"
                },
                "text": "Docling bundles PDF document conversion to",
                "orig": "Docling bundles PDF document conversion to",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              },
              {
                "index": 1,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 238.19302423176944,
                  "r_y0": 415.36904822716525,
                  "r_x1": 1696.0985546594009,
                  "r_y1": 415.36904822716525,
                  "r_x2": 1696.0985546594009,
                  "r_y2": 345.20535775097477,
                  "r_x3": 238.19302423176944,
                  "r_y3": 345.20535775097477,
                  "coord_origin": "TOPLEFT"
                },
                "text": "JSON and Markdown in an easy self contained",
                "orig": "JSON and Markdown in an easy self contained",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              },
              {
                "index": 2,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 245.43122061153045,
                  "r_y0": 513.795726112558,
                  "r_x1": 514.3223724413002,
                  "r_y1": 513.795726112558,
                  "r_x2": 514.3223724413002,
                  "r_y2": 436.0574704074058,
                  "r_x3": 245.43122061153045,
                  "r_y3": 436.0574704074058,
                  "coord_origin": "TOPLEFT"
                },
                "text": "package",
                "orig": "package",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          "text": "Docling bundles PDF document conversion to JSON and Markdown in an easy self contained package"
        }
      ],
      "body": [
        {
          "label": "text",
          "id": 0,
          "page_no": 0,
          "cluster": {
            "id": 0,
            "label": "text",
            "bbox": {
              "l": 238.19302423176944,
              "t": 258.9040166758338,
              "r": 1696.0985546594009,
              "b": 513.795726112558,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.9721010327339172,
            "cells": [
              {
                "index": 0,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 246.4065456254215,
                  "r_y0": 329.06770715202435,
                  "r_x1": 1691.991797818404,
                  "r_y1": 329.06770715202435,
                  "r_x2": 1691.991797818404,
                  "r_y2": 258.9040166758338,
                  "r_x3": 246.4065456254215,
                  "r_y3": 258.9040166758338,
                  "coord_origin": "TOPLEFT"
                },
                "text": "Docling bundles PDF document conversion to",
                "orig": "Docling bundles PDF document conversion to",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              },
              {
                "index": 1,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 238.19302423176944,
                  "r_y0": 415.36904822716525,
                  "r_x1": 1696.0985546594009,
                  "r_y1": 415.36904822716525,
                  "r_x2": 1696.0985546594009,
                  "r_y2": 345.20535775097477,
                  "r_x3": 238.19302423176944,
                  "r_y3": 345.20535775097477,
                  "coord_origin": "TOPLEFT"
                },
                "text": "JSON and Markdown in an easy self contained",
                "orig": "JSON and Markdown in an easy self contained",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              },
              {
                "index": 2,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 245.43122061153045,
                  "r_y0": 513.795726112558,
                  "r_x1": 514.3223724413002,
                  "r_y1": 513.795726112558,
                  "r_x2": 514.3223724413002,
                  "r_y2": 436.0574704074058,
                  "r_x3": 245.43122061153045,
                  "r_y3": 436.0574704074058,
                  "coord_origin": "TOPLEFT"
                },
                "text": "package",
                "orig": "package",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          "text": "Docling bundles PDF document conversion to JSON and Markdown in an easy self contained package"
        }
      ],
      "headers": []
    }
  }
 ]
--- a/tests/data/webp/webp-test.webp
+++ b/tests/data/webp/webp-test.webp
--- a/tests/data_scanned/groundtruth/docling_v1/ocr_test.json
+++ b/tests/data_scanned/groundtruth/docling_v1/ocr_test.json
@ -44,10 +44,10 @@
      "prov": [
        {
          "bbox": [
-            69.0,
+            70.90211866351085,
-            688.5883585611979,
+            689.216658542347,
-            506.6666666666667,
+            504.8720079864275,
-            767.2550252278646
+            764.9216921155637
          ],
          "page": 1,
          "span": [
--- a/tests/data_scanned/groundtruth/docling_v1/ocr_test.pages.json
+++ b/tests/data_scanned/groundtruth/docling_v1/ocr_test.pages.json
@ -15,20 +15,20 @@
          "a": 255
        },
        "rect": {
-          "r_x0": 71.33333333333333,
+          "r_x0": 73.34702132031646,
-          "r_y0": 99.33333333333333,
+          "r_y0": 97.99999977896755,
-          "r_x1": 506.6666666666667,
+          "r_x1": 503.64955224479564,
-          "r_y1": 99.33333333333333,
+          "r_y1": 97.99999977896755,
-          "r_x2": 506.6666666666667,
+          "r_x2": 503.64955224479564,
-          "r_y2": 74.66666666666667,
+          "r_y2": 76.99999977896756,
-          "r_x3": 71.33333333333333,
+          "r_x3": 73.34702132031646,
-          "r_y3": 74.66666666666667,
+          "r_y3": 76.99999977896756,
          "coord_origin": "TOPLEFT"
        },
        "text": "Docling bundles PDF document conversion to",
        "orig": "Docling bundles PDF document conversion to",
        "text_direction": "left_to_right",
-        "confidence": 0.9555703127793324,
+        "confidence": 1.0,
        "from_ocr": true
      },
      {
@ -40,20 +40,20 @@
          "a": 255
        },
        "rect": {
-          "r_x0": 69.0,
+          "r_x0": 70.90211866351085,
-          "r_y0": 126.66666666666667,
+          "r_y0": 124.83139551297342,
-          "r_x1": 506.6666666666667,
+          "r_x1": 504.8720079864275,
-          "r_y1": 126.66666666666667,
+          "r_y1": 124.83139551297342,
-          "r_x2": 506.6666666666667,
+          "r_x2": 504.8720079864275,
-          "r_y2": 100.66666666666667,
+          "r_y2": 102.66666671251768,
-          "r_x3": 69.0,
+          "r_x3": 70.90211866351085,
-          "r_y3": 100.66666666666667,
+          "r_y3": 102.66666671251768,
          "coord_origin": "TOPLEFT"
        },
        "text": "JSON and Markdown in an easy self contained",
        "orig": "JSON and Markdown in an easy self contained",
        "text_direction": "left_to_right",
-        "confidence": 0.9741098171752292,
+        "confidence": 1.0,
        "from_ocr": true
      },
      {
@ -65,20 +65,20 @@
          "a": 255
        },
        "rect": {
-          "r_x0": 70.66666666666667,
+          "r_x0": 73.10852522817731,
-          "r_y0": 153.33333333333334,
+          "r_y0": 152.70503335218433,
-          "r_x1": 154.0,
+          "r_x1": 153.04479435252625,
-          "r_y1": 153.33333333333334,
+          "r_y1": 152.70503335218433,
-          "r_x2": 154.0,
+          "r_x2": 153.04479435252625,
-          "r_y2": 128.66666666666666,
+          "r_y2": 130.00136157890958,
-          "r_x3": 70.66666666666667,
+          "r_x3": 73.10852522817731,
-          "r_y3": 128.66666666666666,
+          "r_y3": 130.00136157890958,
          "coord_origin": "TOPLEFT"
        },
        "text": "package",
        "orig": "package",
        "text_direction": "left_to_right",
-        "confidence": 0.6702765056141881,
+        "confidence": 1.0,
        "from_ocr": true
      }
    ],
@ -90,10 +90,10 @@
            "id": 0,
            "label": "text",
            "bbox": {
-              "l": 69.0,
+              "l": 70.90211866351085,
-              "t": 74.66666666666667,
+              "t": 76.99999977896756,
-              "r": 506.6666666666667,
+              "r": 504.8720079864275,
-              "b": 153.33333333333334,
+              "b": 152.70503335218433,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.9715733528137207,
@ -107,20 +107,20 @@
                  "a": 255
                },
                "rect": {
-                  "r_x0": 71.33333333333333,
+                  "r_x0": 73.34702132031646,
-                  "r_y0": 99.33333333333333,
+                  "r_y0": 97.99999977896755,
-                  "r_x1": 506.6666666666667,
+                  "r_x1": 503.64955224479564,
-                  "r_y1": 99.33333333333333,
+                  "r_y1": 97.99999977896755,
-                  "r_x2": 506.6666666666667,
+                  "r_x2": 503.64955224479564,
-                  "r_y2": 74.66666666666667,
+                  "r_y2": 76.99999977896756,
-                  "r_x3": 71.33333333333333,
+                  "r_x3": 73.34702132031646,
-                  "r_y3": 74.66666666666667,
+                  "r_y3": 76.99999977896756,
                  "coord_origin": "TOPLEFT"
                },
                "text": "Docling bundles PDF document conversion to",
                "orig": "Docling bundles PDF document conversion to",
                "text_direction": "left_to_right",
-                "confidence": 0.9555703127793324,
+                "confidence": 1.0,
                "from_ocr": true
              },
              {
@ -132,20 +132,20 @@
                  "a": 255
                },
                "rect": {
-                  "r_x0": 69.0,
+                  "r_x0": 70.90211866351085,
-                  "r_y0": 126.66666666666667,
+                  "r_y0": 124.83139551297342,
-                  "r_x1": 506.6666666666667,
+                  "r_x1": 504.8720079864275,
-                  "r_y1": 126.66666666666667,
+                  "r_y1": 124.83139551297342,
-                  "r_x2": 506.6666666666667,
+                  "r_x2": 504.8720079864275,
-                  "r_y2": 100.66666666666667,
+                  "r_y2": 102.66666671251768,
-                  "r_x3": 69.0,
+                  "r_x3": 70.90211866351085,
-                  "r_y3": 100.66666666666667,
+                  "r_y3": 102.66666671251768,
                  "coord_origin": "TOPLEFT"
                },
                "text": "JSON and Markdown in an easy self contained",
                "orig": "JSON and Markdown in an easy self contained",
                "text_direction": "left_to_right",
-                "confidence": 0.9741098171752292,
+                "confidence": 1.0,
                "from_ocr": true
              },
              {
@ -157,20 +157,20 @@
                  "a": 255
                },
                "rect": {
-                  "r_x0": 70.66666666666667,
+                  "r_x0": 73.10852522817731,
-                  "r_y0": 153.33333333333334,
+                  "r_y0": 152.70503335218433,
-                  "r_x1": 154.0,
+                  "r_x1": 153.04479435252625,
-                  "r_y1": 153.33333333333334,
+                  "r_y1": 152.70503335218433,
-                  "r_x2": 154.0,
+                  "r_x2": 153.04479435252625,
-                  "r_y2": 128.66666666666666,
+                  "r_y2": 130.00136157890958,
-                  "r_x3": 70.66666666666667,
+                  "r_x3": 73.10852522817731,
-                  "r_y3": 128.66666666666666,
+                  "r_y3": 130.00136157890958,
                  "coord_origin": "TOPLEFT"
                },
                "text": "package",
                "orig": "package",
                "text_direction": "left_to_right",
-                "confidence": 0.6702765056141881,
+                "confidence": 1.0,
                "from_ocr": true
              }
            ],
@ -195,10 +195,10 @@
            "id": 0,
            "label": "text",
            "bbox": {
-              "l": 69.0,
+              "l": 70.90211866351085,
-              "t": 74.66666666666667,
+              "t": 76.99999977896756,
-              "r": 506.6666666666667,
+              "r": 504.8720079864275,
-              "b": 153.33333333333334,
+              "b": 152.70503335218433,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.9715733528137207,
@ -212,20 +212,20 @@
                  "a": 255
                },
                "rect": {
-                  "r_x0": 71.33333333333333,
+                  "r_x0": 73.34702132031646,
-                  "r_y0": 99.33333333333333,
+                  "r_y0": 97.99999977896755,
-                  "r_x1": 506.6666666666667,
+                  "r_x1": 503.64955224479564,
-                  "r_y1": 99.33333333333333,
+                  "r_y1": 97.99999977896755,
-                  "r_x2": 506.6666666666667,
+                  "r_x2": 503.64955224479564,
-                  "r_y2": 74.66666666666667,
+                  "r_y2": 76.99999977896756,
-                  "r_x3": 71.33333333333333,
+                  "r_x3": 73.34702132031646,
-                  "r_y3": 74.66666666666667,
+                  "r_y3": 76.99999977896756,
                  "coord_origin": "TOPLEFT"
                },
                "text": "Docling bundles PDF document conversion to",
                "orig": "Docling bundles PDF document conversion to",
                "text_direction": "left_to_right",
-                "confidence": 0.9555703127793324,
+                "confidence": 1.0,
                "from_ocr": true
              },
              {
@ -237,20 +237,20 @@
                  "a": 255
                },
                "rect": {
-                  "r_x0": 69.0,
+                  "r_x0": 70.90211866351085,
-                  "r_y0": 126.66666666666667,
+                  "r_y0": 124.83139551297342,
-                  "r_x1": 506.6666666666667,
+                  "r_x1": 504.8720079864275,
-                  "r_y1": 126.66666666666667,
+                  "r_y1": 124.83139551297342,
-                  "r_x2": 506.6666666666667,
+                  "r_x2": 504.8720079864275,
-                  "r_y2": 100.66666666666667,
+                  "r_y2": 102.66666671251768,
-                  "r_x3": 69.0,
+                  "r_x3": 70.90211866351085,
-                  "r_y3": 100.66666666666667,
+                  "r_y3": 102.66666671251768,
                  "coord_origin": "TOPLEFT"
                },
                "text": "JSON and Markdown in an easy self contained",
                "orig": "JSON and Markdown in an easy self contained",
                "text_direction": "left_to_right",
-                "confidence": 0.9741098171752292,
+                "confidence": 1.0,
                "from_ocr": true
              },
              {
@ -262,20 +262,20 @@
                  "a": 255
                },
                "rect": {
-                  "r_x0": 70.66666666666667,
+                  "r_x0": 73.10852522817731,
-                  "r_y0": 153.33333333333334,
+                  "r_y0": 152.70503335218433,
-                  "r_x1": 154.0,
+                  "r_x1": 153.04479435252625,
-                  "r_y1": 153.33333333333334,
+                  "r_y1": 152.70503335218433,
-                  "r_x2": 154.0,
+                  "r_x2": 153.04479435252625,
-                  "r_y2": 128.66666666666666,
+                  "r_y2": 130.00136157890958,
-                  "r_x3": 70.66666666666667,
+                  "r_x3": 73.10852522817731,
-                  "r_y3": 128.66666666666666,
+                  "r_y3": 130.00136157890958,
                  "coord_origin": "TOPLEFT"
                },
                "text": "package",
                "orig": "package",
                "text_direction": "left_to_right",
-                "confidence": 0.6702765056141881,
+                "confidence": 1.0,
                "from_ocr": true
              }
            ],
@ -293,10 +293,10 @@
            "id": 0,
            "label": "text",
            "bbox": {
-              "l": 69.0,
+              "l": 70.90211866351085,
-              "t": 74.66666666666667,
+              "t": 76.99999977896756,
-              "r": 506.6666666666667,
+              "r": 504.8720079864275,
-              "b": 153.33333333333334,
+              "b": 152.70503335218433,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.9715733528137207,
@ -310,20 +310,20 @@
                  "a": 255
                },
                "rect": {
-                  "r_x0": 71.33333333333333,
+                  "r_x0": 73.34702132031646,
-                  "r_y0": 99.33333333333333,
+                  "r_y0": 97.99999977896755,
-                  "r_x1": 506.6666666666667,
+                  "r_x1": 503.64955224479564,
-                  "r_y1": 99.33333333333333,
+                  "r_y1": 97.99999977896755,
-                  "r_x2": 506.6666666666667,
+                  "r_x2": 503.64955224479564,
-                  "r_y2": 74.66666666666667,
+                  "r_y2": 76.99999977896756,
-                  "r_x3": 71.33333333333333,
+                  "r_x3": 73.34702132031646,
-                  "r_y3": 74.66666666666667,
+                  "r_y3": 76.99999977896756,
                  "coord_origin": "TOPLEFT"
                },
                "text": "Docling bundles PDF document conversion to",
                "orig": "Docling bundles PDF document conversion to",
                "text_direction": "left_to_right",
-                "confidence": 0.9555703127793324,
+                "confidence": 1.0,
                "from_ocr": true
              },
              {
@ -335,20 +335,20 @@
                  "a": 255
                },
                "rect": {
-                  "r_x0": 69.0,
+                  "r_x0": 70.90211866351085,
-                  "r_y0": 126.66666666666667,
+                  "r_y0": 124.83139551297342,
-                  "r_x1": 506.6666666666667,
+                  "r_x1": 504.8720079864275,
-                  "r_y1": 126.66666666666667,
+                  "r_y1": 124.83139551297342,
-                  "r_x2": 506.6666666666667,
+                  "r_x2": 504.8720079864275,
-                  "r_y2": 100.66666666666667,
+                  "r_y2": 102.66666671251768,
-                  "r_x3": 69.0,
+                  "r_x3": 70.90211866351085,
-                  "r_y3": 100.66666666666667,
+                  "r_y3": 102.66666671251768,
                  "coord_origin": "TOPLEFT"
                },
                "text": "JSON and Markdown in an easy self contained",
                "orig": "JSON and Markdown in an easy self contained",
                "text_direction": "left_to_right",
-                "confidence": 0.9741098171752292,
+                "confidence": 1.0,
                "from_ocr": true
              },
              {
@ -360,20 +360,20 @@
                  "a": 255
                },
                "rect": {
-                  "r_x0": 70.66666666666667,
+                  "r_x0": 73.10852522817731,
-                  "r_y0": 153.33333333333334,
+                  "r_y0": 152.70503335218433,
-                  "r_x1": 154.0,
+                  "r_x1": 153.04479435252625,
-                  "r_y1": 153.33333333333334,
+                  "r_y1": 152.70503335218433,
-                  "r_x2": 154.0,
+                  "r_x2": 153.04479435252625,
-                  "r_y2": 128.66666666666666,
+                  "r_y2": 130.00136157890958,
-                  "r_x3": 70.66666666666667,
+                  "r_x3": 73.10852522817731,
-                  "r_y3": 128.66666666666666,
+                  "r_y3": 130.00136157890958,
                  "coord_origin": "TOPLEFT"
                },
                "text": "package",
                "orig": "package",
                "text_direction": "left_to_right",
-                "confidence": 0.6702765056141881,
+                "confidence": 1.0,
                "from_ocr": true
              }
            ],
--- a/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated.doctags.txt
+++ b/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated.doctags.txt
@ -0,0 +1,3 @@
 <document>
 <paragraph><location><page_1><loc_16><loc_12><loc_18><loc_26></location>package</paragraph>
 </document>
--- a/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated.json
+++ b/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated.json
@ -0,0 +1 @@
 {"_name": "", "type": "pdf-document", "description": {"title": null, "abstract": null, "authors": null, "affiliations": null, "subjects": null, "keywords": null, "publication_date": null, "languages": null, "license": null, "publishers": null, "url_refs": null, "references": null, "publication": null, "reference_count": null, "citation_count": null, "citation_date": null, "advanced": null, "analytics": null, "logs": [], "collection": null, "acquisition": null}, "file-info": {"filename": "ocr_test_rotated.pdf", "filename-prov": null, "document-hash": "4a282813d93824eaa9bc2a0b2a0d6d626ecc8f5f380bd1320e2dd3e8e53c2ba6", "#-pages": 1, "collection-name": null, "description": null, "page-hashes": [{"hash": "f8a4dc72d8b159f69d0bc968b97f3fb9e0ac59dcb3113492432755835935d9b3", "model": "default", "page": 1}]}, "main-text": [{"prov": [{"bbox": [131.21306574279092, 74.12495603322407, 152.19606490864376, 154.19400205373182], "page": 1, "span": [0, 7], "__ref_s3_data": null}], "text": "package", "type": "paragraph", "payload": null, "name": "Text", "font": null}], "figures": [], "tables": [], "bitmaps": null, "equations": [], "footnotes": [], "page-dimensions": [{"height": 595.201171875, "page": 1, "width": 841.9216918945312}], "page-footers": [], "page-headers": [], "_s3_data": null, "identifiers": null}
--- a/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated.md
+++ b/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated.md
@ -0,0 +1 @@
 package
--- a/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated.pages.json
+++ b/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated.pages.json
@ -0,0 +1 @@
 [{"page_no": 0, "size": {"width": 841.9216918945312, "height": 595.201171875}, "cells": [{"id": 0, "text": "Docling bundles PDF document conversion to", "bbox": {"l": 77.10171546422428, "t": 89.23887398109309, "r": 96.6831586150625, "b": 520.7638577050515, "coord_origin": "TOPLEFT"}}, {"id": 1, "text": "JSON and Markdown in an easy self contained", "bbox": {"l": 100.55299576256091, "t": 89.12381765643227, "r": 124.91101654503161, "b": 523.3155494272656, "coord_origin": "TOPLEFT"}}, {"id": 2, "text": "package", "bbox": {"l": 131.21306574279092, "t": 441.0071698212682, "r": 152.19606490864376, "b": 521.0762158417759, "coord_origin": "TOPLEFT"}}], "predictions": {"layout": {"clusters": [{"id": 0, "label": "page_header", "bbox": {"l": 77.10171546422428, "t": 89.12381765643227, "r": 124.91101654503161, "b": 523.3155494272656, "coord_origin": "TOPLEFT"}, "confidence": 0.6016772389411926, "cells": [{"id": 0, "text": "Docling bundles PDF document conversion to", "bbox": {"l": 77.10171546422428, "t": 89.23887398109309, "r": 96.6831586150625, "b": 520.7638577050515, "coord_origin": "TOPLEFT"}}, {"id": 1, "text": "JSON and Markdown in an easy self contained", "bbox": {"l": 100.55299576256091, "t": 89.12381765643227, "r": 124.91101654503161, "b": 523.3155494272656, "coord_origin": "TOPLEFT"}}], "children": []}, {"id": 1, "label": "text", "bbox": {"l": 131.21306574279092, "t": 441.0071698212682, "r": 152.19606490864376, "b": 521.0762158417759, "coord_origin": "TOPLEFT"}, "confidence": 0.5234212875366211, "cells": [{"id": 2, "text": "package", "bbox": {"l": 131.21306574279092, "t": 441.0071698212682, "r": 152.19606490864376, "b": 521.0762158417759, "coord_origin": "TOPLEFT"}}], "children": []}]}, "tablestructure": {"table_map": {}}, "figures_classification": null, "equations_prediction": null, "vlm_response": null}, "assembled": {"elements": [{"label": "page_header", "id": 0, "page_no": 0, "cluster": {"id": 0, "label": "page_header", "bbox": {"l": 77.10171546422428, "t": 
89.12381765643227, "r": 124.91101654503161, "b": 523.3155494272656, "coord_origin": "TOPLEFT"}, "confidence": 0.6016772389411926, "cells": [{"id": 0, "text": "Docling bundles PDF document conversion to", "bbox": {"l": 77.10171546422428, "t": 89.23887398109309, "r": 96.6831586150625, "b": 520.7638577050515, "coord_origin": "TOPLEFT"}}, {"id": 1, "text": "JSON and Markdown in an easy self contained", "bbox": {"l": 100.55299576256091, "t": 89.12381765643227, "r": 124.91101654503161, "b": 523.3155494272656, "coord_origin": "TOPLEFT"}}], "children": []}, "text": "Docling bundles PDF document conversion to JSON and Markdown in an easy self contained"}, {"label": "text", "id": 1, "page_no": 0, "cluster": {"id": 1, "label": "text", "bbox": {"l": 131.21306574279092, "t": 441.0071698212682, "r": 152.19606490864376, "b": 521.0762158417759, "coord_origin": "TOPLEFT"}, "confidence": 0.5234212875366211, "cells": [{"id": 2, "text": "package", "bbox": {"l": 131.21306574279092, "t": 441.0071698212682, "r": 152.19606490864376, "b": 521.0762158417759, "coord_origin": "TOPLEFT"}}], "children": []}, "text": "package"}], "body": [{"label": "text", "id": 1, "page_no": 0, "cluster": {"id": 1, "label": "text", "bbox": {"l": 131.21306574279092, "t": 441.0071698212682, "r": 152.19606490864376, "b": 521.0762158417759, "coord_origin": "TOPLEFT"}, "confidence": 0.5234212875366211, "cells": [{"id": 2, "text": "package", "bbox": {"l": 131.21306574279092, "t": 441.0071698212682, "r": 152.19606490864376, "b": 521.0762158417759, "coord_origin": "TOPLEFT"}}], "children": []}, "text": "package"}], "headers": [{"label": "page_header", "id": 0, "page_no": 0, "cluster": {"id": 0, "label": "page_header", "bbox": {"l": 77.10171546422428, "t": 89.12381765643227, "r": 124.91101654503161, "b": 523.3155494272656, "coord_origin": "TOPLEFT"}, "confidence": 0.6016772389411926, "cells": [{"id": 0, "text": "Docling bundles PDF document conversion to", "bbox": {"l": 77.10171546422428, "t": 89.23887398109309, "r": 
96.6831586150625, "b": 520.7638577050515, "coord_origin": "TOPLEFT"}}, {"id": 1, "text": "JSON and Markdown in an easy self contained", "bbox": {"l": 100.55299576256091, "t": 89.12381765643227, "r": 124.91101654503161, "b": 523.3155494272656, "coord_origin": "TOPLEFT"}}], "children": []}, "text": "Docling bundles PDF document conversion to JSON and Markdown in an easy self contained"}]}}]
--- a/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated_180.doctags.txt
+++ b/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated_180.doctags.txt
@ -0,0 +1,4 @@
 <document>
 <paragraph><location><page_1><loc_74><loc_16><loc_88><loc_18></location>package</paragraph>
 <paragraph><location><page_1><loc_15><loc_9><loc_88><loc_15></location>Docling bundles PDF document conversion to JSON and Markdown in an easy self contained</paragraph>
 </document>
--- a/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated_180.json
+++ b/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated_180.json
@ -0,0 +1,106 @@
 {
  "_name": "",
  "type": "pdf-document",
  "description": {
    "title": null,
    "abstract": null,
    "authors": null,
    "affiliations": null,
    "subjects": null,
    "keywords": null,
    "publication_date": null,
    "languages": null,
    "license": null,
    "publishers": null,
    "url_refs": null,
    "references": null,
    "publication": null,
    "reference_count": null,
    "citation_count": null,
    "citation_date": null,
    "advanced": null,
    "analytics": null,
    "logs": [],
    "collection": null,
    "acquisition": null
  },
  "file-info": {
    "filename": "ocr_test_rotated_180.pdf",
    "filename-prov": null,
    "document-hash": "a9cbfe0f2a71171face9ee31d2347ca4195649670ad75680520d67d4a863f982",
    "#-pages": 1,
    "collection-name": null,
    "description": null,
    "page-hashes": [
      {
        "hash": "baca27070f05dd84cf0903ded39bcf0fc1fa6ef0ac390e79cf8ba90c8c33ba49",
        "model": "default",
        "page": 1
      }
    ]
  },
  "main-text": [
    {
      "prov": [
        {
          "bbox": [
            441.304584329099,
            132.09610360960653,
            521.9863114205704,
            151.67751306395223
          ],
          "page": 1,
          "span": [
            0,
            7
          ],
          "__ref_s3_data": null
        }
      ],
      "text": "package",
      "type": "paragraph",
      "payload": null,
      "name": "Text",
      "font": null
    },
    {
      "prov": [
        {
          "bbox": [
            89.12133215549848,
            77.02339849621205,
            523.3501733013318,
            124.86176457554109
          ],
          "page": 1,
          "span": [
            0,
            86
          ],
          "__ref_s3_data": null
        }
      ],
      "text": "Docling bundles PDF document conversion to JSON and Markdown in an easy self contained",
      "type": "paragraph",
      "payload": null,
      "name": "Text",
      "font": null
    }
  ],
  "figures": [],
  "tables": [],
  "bitmaps": null,
  "equations": [],
  "footnotes": [],
  "page-dimensions": [
    {
      "height": 841.9216918945312,
      "page": 1,
      "width": 595.201171875
    }
  ],
  "page-footers": [],
  "page-headers": [],
  "_s3_data": null,
  "identifiers": null
 }
--- a/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated_180.md
+++ b/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated_180.md
@ -0,0 +1,3 @@
 package
 Docling bundles PDF document conversion to JSON and Markdown in an easy self contained
--- a/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated_180.pages.json
+++ b/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated_180.pages.json
@ -0,0 +1,445 @@
 [
  {
    "page_no": 0,
    "size": {
      "width": 595.201171875,
      "height": 841.9216918945312
    },
    "cells": [
      {
        "index": 0,
        "rgba": {
          "r": 0,
          "g": 0,
          "b": 0,
          "a": 255
        },
        "rect": {
          "r_x0": 90.46133071208328,
          "r_y0": 764.8982933983192,
          "r_x1": 520.7638616365624,
          "r_y1": 764.8982933983192,
          "r_x2": 520.7638616365624,
          "r_y2": 744.0929853742306,
          "r_x3": 90.46133071208328,
          "r_y3": 744.0929853742306,
          "coord_origin": "TOPLEFT"
        },
        "text": "Docling bundles PDF document conversion to",
        "orig": "Docling bundles PDF document conversion to",
        "text_direction": "left_to_right",
        "confidence": 1.0,
        "from_ocr": true
      },
      {
        "index": 1,
        "rgba": {
          "r": 0,
          "g": 0,
          "b": 0,
          "a": 255
        },
        "rect": {
          "r_x0": 89.12133215549848,
          "r_y0": 741.5247710689902,
          "r_x1": 523.3501733013318,
          "r_y1": 741.5247710689902,
          "r_x2": 523.3501733013318,
          "r_y2": 717.0599273189902,
          "r_x3": 89.12133215549848,
          "r_y3": 717.0599273189902,
          "coord_origin": "TOPLEFT"
        },
        "text": "JSON and Markdown in an easy self contained",
        "orig": "JSON and Markdown in an easy self contained",
        "text_direction": "left_to_right",
        "confidence": 1.0,
        "from_ocr": true
      },
      {
        "index": 2,
        "rgba": {
          "r": 0,
          "g": 0,
          "b": 0,
          "a": 255
        },
        "rect": {
          "r_x0": 441.304584329099,
          "r_y0": 709.8255882849247,
          "r_x1": 521.9863114205704,
          "r_y1": 709.8255882849247,
          "r_x2": 521.9863114205704,
          "r_y2": 690.244178830579,
          "r_x3": 441.304584329099,
          "r_y3": 690.244178830579,
          "coord_origin": "TOPLEFT"
        },
        "text": "package",
        "orig": "package",
        "text_direction": "left_to_right",
        "confidence": 1.0,
        "from_ocr": true
      }
    ],
    "parsed_page": null,
    "predictions": {
      "layout": {
        "clusters": [
          {
            "id": 0,
            "label": "text",
            "bbox": {
              "l": 89.12133215549848,
              "t": 717.0599273189902,
              "r": 523.3501733013318,
              "b": 764.8982933983192,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.7318570613861084,
            "cells": [
              {
                "index": 0,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 90.46133071208328,
                  "r_y0": 764.8982933983192,
                  "r_x1": 520.7638616365624,
                  "r_y1": 764.8982933983192,
                  "r_x2": 520.7638616365624,
                  "r_y2": 744.0929853742306,
                  "r_x3": 90.46133071208328,
                  "r_y3": 744.0929853742306,
                  "coord_origin": "TOPLEFT"
                },
                "text": "Docling bundles PDF document conversion to",
                "orig": "Docling bundles PDF document conversion to",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              },
              {
                "index": 1,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 89.12133215549848,
                  "r_y0": 741.5247710689902,
                  "r_x1": 523.3501733013318,
                  "r_y1": 741.5247710689902,
                  "r_x2": 523.3501733013318,
                  "r_y2": 717.0599273189902,
                  "r_x3": 89.12133215549848,
                  "r_y3": 717.0599273189902,
                  "coord_origin": "TOPLEFT"
                },
                "text": "JSON and Markdown in an easy self contained",
                "orig": "JSON and Markdown in an easy self contained",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          {
            "id": 2,
            "label": "text",
            "bbox": {
              "l": 441.304584329099,
              "t": 690.244178830579,
              "r": 521.9863114205704,
              "b": 709.8255882849247,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.5982133150100708,
            "cells": [
              {
                "index": 2,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 441.304584329099,
                  "r_y0": 709.8255882849247,
                  "r_x1": 521.9863114205704,
                  "r_y1": 709.8255882849247,
                  "r_x2": 521.9863114205704,
                  "r_y2": 690.244178830579,
                  "r_x3": 441.304584329099,
                  "r_y3": 690.244178830579,
                  "coord_origin": "TOPLEFT"
                },
                "text": "package",
                "orig": "package",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          }
        ]
      },
      "tablestructure": {
        "table_map": {}
      },
      "figures_classification": null,
      "equations_prediction": null,
      "vlm_response": null
    },
    "assembled": {
      "elements": [
        {
          "label": "text",
          "id": 0,
          "page_no": 0,
          "cluster": {
            "id": 0,
            "label": "text",
            "bbox": {
              "l": 89.12133215549848,
              "t": 717.0599273189902,
              "r": 523.3501733013318,
              "b": 764.8982933983192,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.7318570613861084,
            "cells": [
              {
                "index": 0,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 90.46133071208328,
                  "r_y0": 764.8982933983192,
                  "r_x1": 520.7638616365624,
                  "r_y1": 764.8982933983192,
                  "r_x2": 520.7638616365624,
                  "r_y2": 744.0929853742306,
                  "r_x3": 90.46133071208328,
                  "r_y3": 744.0929853742306,
                  "coord_origin": "TOPLEFT"
                },
                "text": "Docling bundles PDF document conversion to",
                "orig": "Docling bundles PDF document conversion to",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              },
              {
                "index": 1,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 89.12133215549848,
                  "r_y0": 741.5247710689902,
                  "r_x1": 523.3501733013318,
                  "r_y1": 741.5247710689902,
                  "r_x2": 523.3501733013318,
                  "r_y2": 717.0599273189902,
                  "r_x3": 89.12133215549848,
                  "r_y3": 717.0599273189902,
                  "coord_origin": "TOPLEFT"
                },
                "text": "JSON and Markdown in an easy self contained",
                "orig": "JSON and Markdown in an easy self contained",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          "text": "Docling bundles PDF document conversion to JSON and Markdown in an easy self contained"
        },
        {
          "label": "text",
          "id": 2,
          "page_no": 0,
          "cluster": {
            "id": 2,
            "label": "text",
            "bbox": {
              "l": 441.304584329099,
              "t": 690.244178830579,
              "r": 521.9863114205704,
              "b": 709.8255882849247,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.5982133150100708,
            "cells": [
              {
                "index": 2,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 441.304584329099,
                  "r_y0": 709.8255882849247,
                  "r_x1": 521.9863114205704,
                  "r_y1": 709.8255882849247,
                  "r_x2": 521.9863114205704,
                  "r_y2": 690.244178830579,
                  "r_x3": 441.304584329099,
                  "r_y3": 690.244178830579,
                  "coord_origin": "TOPLEFT"
                },
                "text": "package",
                "orig": "package",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          "text": "package"
        }
      ],
      "body": [
        {
          "label": "text",
          "id": 0,
          "page_no": 0,
          "cluster": {
            "id": 0,
            "label": "text",
            "bbox": {
              "l": 89.12133215549848,
              "t": 717.0599273189902,
              "r": 523.3501733013318,
              "b": 764.8982933983192,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.7318570613861084,
            "cells": [
              {
                "index": 0,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 90.46133071208328,
                  "r_y0": 764.8982933983192,
                  "r_x1": 520.7638616365624,
                  "r_y1": 764.8982933983192,
                  "r_x2": 520.7638616365624,
                  "r_y2": 744.0929853742306,
                  "r_x3": 90.46133071208328,
                  "r_y3": 744.0929853742306,
                  "coord_origin": "TOPLEFT"
                },
                "text": "Docling bundles PDF document conversion to",
                "orig": "Docling bundles PDF document conversion to",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              },
              {
                "index": 1,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 89.12133215549848,
                  "r_y0": 741.5247710689902,
                  "r_x1": 523.3501733013318,
                  "r_y1": 741.5247710689902,
                  "r_x2": 523.3501733013318,
                  "r_y2": 717.0599273189902,
                  "r_x3": 89.12133215549848,
                  "r_y3": 717.0599273189902,
                  "coord_origin": "TOPLEFT"
                },
                "text": "JSON and Markdown in an easy self contained",
                "orig": "JSON and Markdown in an easy self contained",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          "text": "Docling bundles PDF document conversion to JSON and Markdown in an easy self contained"
        },
        {
          "label": "text",
          "id": 2,
          "page_no": 0,
          "cluster": {
            "id": 2,
            "label": "text",
            "bbox": {
              "l": 441.304584329099,
              "t": 690.244178830579,
              "r": 521.9863114205704,
              "b": 709.8255882849247,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.5982133150100708,
            "cells": [
              {
                "index": 2,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 441.304584329099,
                  "r_y0": 709.8255882849247,
                  "r_x1": 521.9863114205704,
                  "r_y1": 709.8255882849247,
                  "r_x2": 521.9863114205704,
                  "r_y2": 690.244178830579,
                  "r_x3": 441.304584329099,
                  "r_y3": 690.244178830579,
                  "coord_origin": "TOPLEFT"
                },
                "text": "package",
                "orig": "package",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          "text": "package"
        }
      ],
      "headers": []
    }
  }
 ]
--- a/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated_270.doctags.txt
+++ b/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated_270.doctags.txt
@ -0,0 +1,3 @@
 <document>
 <paragraph><location><page_1><loc_82><loc_74><loc_84><loc_88></location>package</paragraph>
 </document>
--- a/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated_270.json
+++ b/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated_270.json
@ -0,0 +1,83 @@
 {
  "_name": "",
  "type": "pdf-document",
  "description": {
    "title": null,
    "abstract": null,
    "authors": null,
    "affiliations": null,
    "subjects": null,
    "keywords": null,
    "publication_date": null,
    "languages": null,
    "license": null,
    "publishers": null,
    "url_refs": null,
    "references": null,
    "publication": null,
    "reference_count": null,
    "citation_count": null,
    "citation_date": null,
    "advanced": null,
    "analytics": null,
    "logs": [],
    "collection": null,
    "acquisition": null
  },
  "file-info": {
    "filename": "ocr_test_rotated_270.pdf",
    "filename-prov": null,
    "document-hash": "52f54e7183bdb73aa3713c7b169baca93e276963a138418c26e7d6a1ea128f14",
    "#-pages": 1,
    "collection-name": null,
    "description": null,
    "page-hashes": [
      {
        "hash": "59bc9ddba89e7b008185dd16d384493beb034686e5670546786390c5d237a304",
        "model": "default",
        "page": 1
      }
    ]
  },
  "main-text": [
    {
      "prov": [
        {
          "bbox": [
            691.4680194659409,
            442.3948768148814,
            709.8255850278712,
            523.0765988200898
          ],
          "page": 1,
          "span": [
            0,
            7
          ],
          "__ref_s3_data": null
        }
      ],
      "text": "package",
      "type": "paragraph",
      "payload": null,
      "name": "Text",
      "font": null
    }
  ],
  "figures": [],
  "tables": [],
  "bitmaps": null,
  "equations": [],
  "footnotes": [],
  "page-dimensions": [
    {
      "height": 595.201171875,
      "page": 1,
      "width": 841.9216918945312
    }
  ],
  "page-footers": [],
  "page-headers": [],
  "_s3_data": null,
  "identifiers": null
 }
--- a/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated_270.md
+++ b/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated_270.md
@ -0,0 +1 @@
 package
--- a/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated_270.pages.json
+++ b/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated_270.pages.json
@ -0,0 +1,446 @@
 [
  {
    "page_no": 0,
    "size": {
      "width": 841.9216918945312,
      "height": 595.201171875
    },
    "cells": [
      {
        "index": 0,
        "rgba": {
          "r": 0,
          "g": 0,
          "b": 0,
          "a": 255
        },
        "rect": {
          "r_x0": 744.0930045534915,
          "r_y0": 504.87200373583954,
          "r_x1": 764.8982839673505,
          "r_y1": 504.87200373583954,
          "r_x2": 764.8982839673505,
          "r_y2": 73.34702001188118,
          "r_x3": 744.0930045534915,
          "r_y3": 73.34702001188118,
          "coord_origin": "TOPLEFT"
        },
        "text": "Docling bundles PDF document conversion to",
        "orig": "Docling bundles PDF document conversion to",
        "text_direction": "left_to_right",
        "confidence": 1.0,
        "from_ocr": true
      },
      {
        "index": 1,
        "rgba": {
          "r": 0,
          "g": 0,
          "b": 0,
          "a": 255
        },
        "rect": {
          "r_x0": 717.1685859527342,
          "r_y0": 504.8720063438988,
          "r_x1": 737.9738558298501,
          "r_y1": 504.8720063438988,
          "r_x2": 737.9738558298501,
          "r_y2": 70.90211702098213,
          "r_x3": 717.1685859527342,
          "r_y3": 70.90211702098213,
          "coord_origin": "TOPLEFT"
        },
        "text": "JSON and Markdown in an easy self contained",
        "orig": "JSON and Markdown in an easy self contained",
        "text_direction": "left_to_right",
        "confidence": 1.0,
        "from_ocr": true
      },
      {
        "index": 2,
        "rgba": {
          "r": 0,
          "g": 0,
          "b": 0,
          "a": 255
        },
        "rect": {
          "r_x0": 691.4680194659409,
          "r_y0": 152.80629506011857,
          "r_x1": 709.8255850278712,
          "r_y1": 152.80629506011857,
          "r_x2": 709.8255850278712,
          "r_y2": 72.12457305491027,
          "r_x3": 691.4680194659409,
          "r_y3": 72.12457305491027,
          "coord_origin": "TOPLEFT"
        },
        "text": "package",
        "orig": "package",
        "text_direction": "left_to_right",
        "confidence": 1.0,
        "from_ocr": true
      }
    ],
    "parsed_page": null,
    "predictions": {
      "layout": {
        "clusters": [
          {
            "id": 0,
            "label": "page_header",
            "bbox": {
              "l": 717.1685859527342,
              "t": 70.90211702098213,
              "r": 764.8982839673505,
              "b": 504.8720063438988,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.6915205121040344,
            "cells": [
              {
                "index": 0,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 744.0930045534915,
                  "r_y0": 504.87200373583954,
                  "r_x1": 764.8982839673505,
                  "r_y1": 504.87200373583954,
                  "r_x2": 764.8982839673505,
                  "r_y2": 73.34702001188118,
                  "r_x3": 744.0930045534915,
                  "r_y3": 73.34702001188118,
                  "coord_origin": "TOPLEFT"
                },
                "text": "Docling bundles PDF document conversion to",
                "orig": "Docling bundles PDF document conversion to",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              },
              {
                "index": 1,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 717.1685859527342,
                  "r_y0": 504.8720063438988,
                  "r_x1": 737.9738558298501,
                  "r_y1": 504.8720063438988,
                  "r_x2": 737.9738558298501,
                  "r_y2": 70.90211702098213,
                  "r_x3": 717.1685859527342,
                  "r_y3": 70.90211702098213,
                  "coord_origin": "TOPLEFT"
                },
                "text": "JSON and Markdown in an easy self contained",
                "orig": "JSON and Markdown in an easy self contained",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          {
            "id": 8,
            "label": "text",
            "bbox": {
              "l": 691.4680194659409,
              "t": 72.12457305491027,
              "r": 709.8255850278712,
              "b": 152.80629506011857,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 1.0,
            "cells": [
              {
                "index": 2,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 691.4680194659409,
                  "r_y0": 152.80629506011857,
                  "r_x1": 709.8255850278712,
                  "r_y1": 152.80629506011857,
                  "r_x2": 709.8255850278712,
                  "r_y2": 72.12457305491027,
                  "r_x3": 691.4680194659409,
                  "r_y3": 72.12457305491027,
                  "coord_origin": "TOPLEFT"
                },
                "text": "package",
                "orig": "package",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          }
        ]
      },
      "tablestructure": {
        "table_map": {}
      },
      "figures_classification": null,
      "equations_prediction": null,
      "vlm_response": null
    },
    "assembled": {
      "elements": [
        {
          "label": "page_header",
          "id": 0,
          "page_no": 0,
          "cluster": {
            "id": 0,
            "label": "page_header",
            "bbox": {
              "l": 717.1685859527342,
              "t": 70.90211702098213,
              "r": 764.8982839673505,
              "b": 504.8720063438988,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.6915205121040344,
            "cells": [
              {
                "index": 0,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 744.0930045534915,
                  "r_y0": 504.87200373583954,
                  "r_x1": 764.8982839673505,
                  "r_y1": 504.87200373583954,
                  "r_x2": 764.8982839673505,
                  "r_y2": 73.34702001188118,
                  "r_x3": 744.0930045534915,
                  "r_y3": 73.34702001188118,
                  "coord_origin": "TOPLEFT"
                },
                "text": "Docling bundles PDF document conversion to",
                "orig": "Docling bundles PDF document conversion to",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              },
              {
                "index": 1,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 717.1685859527342,
                  "r_y0": 504.8720063438988,
                  "r_x1": 737.9738558298501,
                  "r_y1": 504.8720063438988,
                  "r_x2": 737.9738558298501,
                  "r_y2": 70.90211702098213,
                  "r_x3": 717.1685859527342,
                  "r_y3": 70.90211702098213,
                  "coord_origin": "TOPLEFT"
                },
                "text": "JSON and Markdown in an easy self contained",
                "orig": "JSON and Markdown in an easy self contained",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          "text": "Docling bundles PDF document conversion to JSON and Markdown in an easy self contained"
        },
        {
          "label": "text",
          "id": 8,
          "page_no": 0,
          "cluster": {
            "id": 8,
            "label": "text",
            "bbox": {
              "l": 691.4680194659409,
              "t": 72.12457305491027,
              "r": 709.8255850278712,
              "b": 152.80629506011857,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 1.0,
            "cells": [
              {
                "index": 2,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 691.4680194659409,
                  "r_y0": 152.80629506011857,
                  "r_x1": 709.8255850278712,
                  "r_y1": 152.80629506011857,
                  "r_x2": 709.8255850278712,
                  "r_y2": 72.12457305491027,
                  "r_x3": 691.4680194659409,
                  "r_y3": 72.12457305491027,
                  "coord_origin": "TOPLEFT"
                },
                "text": "package",
                "orig": "package",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          "text": "package"
        }
      ],
      "body": [
        {
          "label": "text",
          "id": 8,
          "page_no": 0,
          "cluster": {
            "id": 8,
            "label": "text",
            "bbox": {
              "l": 691.4680194659409,
              "t": 72.12457305491027,
              "r": 709.8255850278712,
              "b": 152.80629506011857,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 1.0,
            "cells": [
              {
                "index": 2,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 691.4680194659409,
                  "r_y0": 152.80629506011857,
                  "r_x1": 709.8255850278712,
                  "r_y1": 152.80629506011857,
                  "r_x2": 709.8255850278712,
                  "r_y2": 72.12457305491027,
                  "r_x3": 691.4680194659409,
                  "r_y3": 72.12457305491027,
                  "coord_origin": "TOPLEFT"
                },
                "text": "package",
                "orig": "package",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          "text": "package"
        }
      ],
      "headers": [
        {
          "label": "page_header",
          "id": 0,
          "page_no": 0,
          "cluster": {
            "id": 0,
            "label": "page_header",
            "bbox": {
              "l": 717.1685859527342,
              "t": 70.90211702098213,
              "r": 764.8982839673505,
              "b": 504.8720063438988,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.6915205121040344,
            "cells": [
              {
                "index": 0,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 744.0930045534915,
                  "r_y0": 504.87200373583954,
                  "r_x1": 764.8982839673505,
                  "r_y1": 504.87200373583954,
                  "r_x2": 764.8982839673505,
                  "r_y2": 73.34702001188118,
                  "r_x3": 744.0930045534915,
                  "r_y3": 73.34702001188118,
                  "coord_origin": "TOPLEFT"
                },
                "text": "Docling bundles PDF document conversion to",
                "orig": "Docling bundles PDF document conversion to",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              },
              {
                "index": 1,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 717.1685859527342,
                  "r_y0": 504.8720063438988,
                  "r_x1": 737.9738558298501,
                  "r_y1": 504.8720063438988,
                  "r_x2": 737.9738558298501,
                  "r_y2": 70.90211702098213,
                  "r_x3": 717.1685859527342,
                  "r_y3": 70.90211702098213,
                  "coord_origin": "TOPLEFT"
                },
                "text": "JSON and Markdown in an easy self contained",
                "orig": "JSON and Markdown in an easy self contained",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          "text": "Docling bundles PDF document conversion to JSON and Markdown in an easy self contained"
        }
      ]
    }
  }
 ]
--- a/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated_90.doctags.txt
+++ b/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated_90.doctags.txt
@ -0,0 +1,3 @@
 <document>
 <paragraph><location><page_1><loc_16><loc_12><loc_18><loc_26></location>package</paragraph>
 </document>
--- a/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated_90.json
+++ b/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated_90.json
@ -0,0 +1,83 @@
 {
  "_name": "",
  "type": "pdf-document",
  "description": {
    "title": null,
    "abstract": null,
    "authors": null,
    "affiliations": null,
    "subjects": null,
    "keywords": null,
    "publication_date": null,
    "languages": null,
    "license": null,
    "publishers": null,
    "url_refs": null,
    "references": null,
    "publication": null,
    "reference_count": null,
    "citation_count": null,
    "citation_date": null,
    "advanced": null,
    "analytics": null,
    "logs": [],
    "collection": null,
    "acquisition": null
  },
  "file-info": {
    "filename": "ocr_test_rotated_90.pdf",
    "filename-prov": null,
    "document-hash": "4a282813d93824eaa9bc2a0b2a0d6d626ecc8f5f380bd1320e2dd3e8e53c2ba6",
    "#-pages": 1,
    "collection-name": null,
    "description": null,
    "page-hashes": [
      {
        "hash": "f8a4dc72d8b159f69d0bc968b97f3fb9e0ac59dcb3113492432755835935d9b3",
        "model": "default",
        "page": 1
      }
    ]
  },
  "main-text": [
    {
      "prov": [
        {
          "bbox": [
            131.21306574279092,
            74.12495603322407,
            152.19606490864376,
            154.19400205373182
          ],
          "page": 1,
          "span": [
            0,
            7
          ],
          "__ref_s3_data": null
        }
      ],
      "text": "package",
      "type": "paragraph",
      "payload": null,
      "name": "Text",
      "font": null
    }
  ],
  "figures": [],
  "tables": [],
  "bitmaps": null,
  "equations": [],
  "footnotes": [],
  "page-dimensions": [
    {
      "height": 595.201171875,
      "page": 1,
      "width": 841.9216918945312
    }
  ],
  "page-footers": [],
  "page-headers": [],
  "_s3_data": null,
  "identifiers": null
 }
--- a/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated_90.md
+++ b/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated_90.md
@ -0,0 +1 @@
 package
--- a/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated_90.pages.json
+++ b/tests/data_scanned/groundtruth/docling_v1/ocr_test_rotated_90.pages.json
@ -0,0 +1,446 @@
 [
  {
    "page_no": 0,
    "size": {
      "width": 841.9216918945312,
      "height": 595.201171875
    },
    "cells": [
      {
        "index": 0,
        "rgba": {
          "r": 0,
          "g": 0,
          "b": 0,
          "a": 255
        },
        "rect": {
          "r_x0": 77.10171546422428,
          "r_y0": 520.7638577050515,
          "r_x1": 96.6831586150625,
          "r_y1": 520.7638577050515,
          "r_x2": 96.6831586150625,
          "r_y2": 89.23887398109309,
          "r_x3": 77.10171546422428,
          "r_y3": 89.23887398109309,
          "coord_origin": "TOPLEFT"
        },
        "text": "Docling bundles PDF document conversion to",
        "orig": "Docling bundles PDF document conversion to",
        "text_direction": "left_to_right",
        "confidence": 1.0,
        "from_ocr": true
      },
      {
        "index": 1,
        "rgba": {
          "r": 0,
          "g": 0,
          "b": 0,
          "a": 255
        },
        "rect": {
          "r_x0": 100.55299576256091,
          "r_y0": 523.3155494272656,
          "r_x1": 124.91101654503161,
          "r_y1": 523.3155494272656,
          "r_x2": 124.91101654503161,
          "r_y2": 89.12381765643227,
          "r_x3": 100.55299576256091,
          "r_y3": 89.12381765643227,
          "coord_origin": "TOPLEFT"
        },
        "text": "JSON and Markdown in an easy self contained",
        "orig": "JSON and Markdown in an easy self contained",
        "text_direction": "left_to_right",
        "confidence": 1.0,
        "from_ocr": true
      },
      {
        "index": 2,
        "rgba": {
          "r": 0,
          "g": 0,
          "b": 0,
          "a": 255
        },
        "rect": {
          "r_x0": 131.21306574279092,
          "r_y0": 521.0762158417759,
          "r_x1": 152.19606490864376,
          "r_y1": 521.0762158417759,
          "r_x2": 152.19606490864376,
          "r_y2": 441.0071698212682,
          "r_x3": 131.21306574279092,
          "r_y3": 441.0071698212682,
          "coord_origin": "TOPLEFT"
        },
        "text": "package",
        "orig": "package",
        "text_direction": "left_to_right",
        "confidence": 1.0,
        "from_ocr": true
      }
    ],
    "parsed_page": null,
    "predictions": {
      "layout": {
        "clusters": [
          {
            "id": 0,
            "label": "page_header",
            "bbox": {
              "l": 77.10171546422428,
              "t": 89.12381765643227,
              "r": 124.91101654503161,
              "b": 523.3155494272656,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.6016772389411926,
            "cells": [
              {
                "index": 0,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 77.10171546422428,
                  "r_y0": 520.7638577050515,
                  "r_x1": 96.6831586150625,
                  "r_y1": 520.7638577050515,
                  "r_x2": 96.6831586150625,
                  "r_y2": 89.23887398109309,
                  "r_x3": 77.10171546422428,
                  "r_y3": 89.23887398109309,
                  "coord_origin": "TOPLEFT"
                },
                "text": "Docling bundles PDF document conversion to",
                "orig": "Docling bundles PDF document conversion to",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              },
              {
                "index": 1,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 100.55299576256091,
                  "r_y0": 523.3155494272656,
                  "r_x1": 124.91101654503161,
                  "r_y1": 523.3155494272656,
                  "r_x2": 124.91101654503161,
                  "r_y2": 89.12381765643227,
                  "r_x3": 100.55299576256091,
                  "r_y3": 89.12381765643227,
                  "coord_origin": "TOPLEFT"
                },
                "text": "JSON and Markdown in an easy self contained",
                "orig": "JSON and Markdown in an easy self contained",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          {
            "id": 1,
            "label": "text",
            "bbox": {
              "l": 131.21306574279092,
              "t": 441.0071698212682,
              "r": 152.19606490864376,
              "b": 521.0762158417759,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.5234212875366211,
            "cells": [
              {
                "index": 2,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 131.21306574279092,
                  "r_y0": 521.0762158417759,
                  "r_x1": 152.19606490864376,
                  "r_y1": 521.0762158417759,
                  "r_x2": 152.19606490864376,
                  "r_y2": 441.0071698212682,
                  "r_x3": 131.21306574279092,
                  "r_y3": 441.0071698212682,
                  "coord_origin": "TOPLEFT"
                },
                "text": "package",
                "orig": "package",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          }
        ]
      },
      "tablestructure": {
        "table_map": {}
      },
      "figures_classification": null,
      "equations_prediction": null,
      "vlm_response": null
    },
    "assembled": {
      "elements": [
        {
          "label": "page_header",
          "id": 0,
          "page_no": 0,
          "cluster": {
            "id": 0,
            "label": "page_header",
            "bbox": {
              "l": 77.10171546422428,
              "t": 89.12381765643227,
              "r": 124.91101654503161,
              "b": 523.3155494272656,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.6016772389411926,
            "cells": [
              {
                "index": 0,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 77.10171546422428,
                  "r_y0": 520.7638577050515,
                  "r_x1": 96.6831586150625,
                  "r_y1": 520.7638577050515,
                  "r_x2": 96.6831586150625,
                  "r_y2": 89.23887398109309,
                  "r_x3": 77.10171546422428,
                  "r_y3": 89.23887398109309,
                  "coord_origin": "TOPLEFT"
                },
                "text": "Docling bundles PDF document conversion to",
                "orig": "Docling bundles PDF document conversion to",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              },
              {
                "index": 1,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 100.55299576256091,
                  "r_y0": 523.3155494272656,
                  "r_x1": 124.91101654503161,
                  "r_y1": 523.3155494272656,
                  "r_x2": 124.91101654503161,
                  "r_y2": 89.12381765643227,
                  "r_x3": 100.55299576256091,
                  "r_y3": 89.12381765643227,
                  "coord_origin": "TOPLEFT"
                },
                "text": "JSON and Markdown in an easy self contained",
                "orig": "JSON and Markdown in an easy self contained",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          "text": "Docling bundles PDF document conversion to JSON and Markdown in an easy self contained"
        },
        {
          "label": "text",
          "id": 1,
          "page_no": 0,
          "cluster": {
            "id": 1,
            "label": "text",
            "bbox": {
              "l": 131.21306574279092,
              "t": 441.0071698212682,
              "r": 152.19606490864376,
              "b": 521.0762158417759,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.5234212875366211,
            "cells": [
              {
                "index": 2,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 131.21306574279092,
                  "r_y0": 521.0762158417759,
                  "r_x1": 152.19606490864376,
                  "r_y1": 521.0762158417759,
                  "r_x2": 152.19606490864376,
                  "r_y2": 441.0071698212682,
                  "r_x3": 131.21306574279092,
                  "r_y3": 441.0071698212682,
                  "coord_origin": "TOPLEFT"
                },
                "text": "package",
                "orig": "package",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          "text": "package"
        }
      ],
      "body": [
        {
          "label": "text",
          "id": 1,
          "page_no": 0,
          "cluster": {
            "id": 1,
            "label": "text",
            "bbox": {
              "l": 131.21306574279092,
              "t": 441.0071698212682,
              "r": 152.19606490864376,
              "b": 521.0762158417759,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.5234212875366211,
            "cells": [
              {
                "index": 2,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 131.21306574279092,
                  "r_y0": 521.0762158417759,
                  "r_x1": 152.19606490864376,
                  "r_y1": 521.0762158417759,
                  "r_x2": 152.19606490864376,
                  "r_y2": 441.0071698212682,
                  "r_x3": 131.21306574279092,
                  "r_y3": 441.0071698212682,
                  "coord_origin": "TOPLEFT"
                },
                "text": "package",
                "orig": "package",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          "text": "package"
        }
      ],
      "headers": [
        {
          "label": "page_header",
          "id": 0,
          "page_no": 0,
          "cluster": {
            "id": 0,
            "label": "page_header",
            "bbox": {
              "l": 77.10171546422428,
              "t": 89.12381765643227,
              "r": 124.91101654503161,
              "b": 523.3155494272656,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.6016772389411926,
            "cells": [
              {
                "index": 0,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 77.10171546422428,
                  "r_y0": 520.7638577050515,
                  "r_x1": 96.6831586150625,
                  "r_y1": 520.7638577050515,
                  "r_x2": 96.6831586150625,
                  "r_y2": 89.23887398109309,
                  "r_x3": 77.10171546422428,
                  "r_y3": 89.23887398109309,
                  "coord_origin": "TOPLEFT"
                },
                "text": "Docling bundles PDF document conversion to",
                "orig": "Docling bundles PDF document conversion to",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              },
              {
                "index": 1,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 100.55299576256091,
                  "r_y0": 523.3155494272656,
                  "r_x1": 124.91101654503161,
                  "r_y1": 523.3155494272656,
                  "r_x2": 124.91101654503161,
                  "r_y2": 89.12381765643227,
                  "r_x3": 100.55299576256091,
                  "r_y3": 89.12381765643227,
                  "coord_origin": "TOPLEFT"
                },
                "text": "JSON and Markdown in an easy self contained",
                "orig": "JSON and Markdown in an easy self contained",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          "text": "Docling bundles PDF document conversion to JSON and Markdown in an easy self contained"
        }
      ]
    }
  }
 ]
--- a/tests/data_scanned/groundtruth/docling_v2/ocr_test.doctags.txt
+++ b/tests/data_scanned/groundtruth/docling_v2/ocr_test.doctags.txt
@ -1,2 +1,2 @@
-<doctag><text><loc_58><loc_44><loc_426><loc_91>Docling bundles PDF document conversion to JSON and Markdown in an easy self contained package</text>
+<doctag><text><loc_60><loc_46><loc_424><loc_91>Docling bundles PDF document conversion to JSON and Markdown in an easy self contained package</text>
 </doctag>
--- a/tests/data_scanned/groundtruth/docling_v2/ocr_test.json
+++ b/tests/data_scanned/groundtruth/docling_v2/ocr_test.json
@ -42,10 +42,10 @@
        {
          "page_no": 1,
          "bbox": {
-            "l": 69.0,
+            "l": 70.90211866351085,
-            "t": 767.2550252278646,
+            "t": 764.9216921155637,
-            "r": 506.6666666666667,
+            "r": 504.8720079864275,
-            "b": 688.5883585611979,
+            "b": 689.216658542347,
            "coord_origin": "BOTTOMLEFT"
          },
          "charspan": [
--- a/tests/data_scanned/groundtruth/docling_v2/ocr_test.pages.json
+++ b/tests/data_scanned/groundtruth/docling_v2/ocr_test.pages.json
@ -15,20 +15,20 @@
          "a": 255
        },
        "rect": {
-          "r_x0": 71.33333333333333,
+          "r_x0": 73.34702132031646,
-          "r_y0": 99.33333333333333,
+          "r_y0": 97.99999977896755,
-          "r_x1": 506.6666666666667,
+          "r_x1": 503.64955224479564,
-          "r_y1": 99.33333333333333,
+          "r_y1": 97.99999977896755,
-          "r_x2": 506.6666666666667,
+          "r_x2": 503.64955224479564,
-          "r_y2": 74.66666666666667,
+          "r_y2": 76.99999977896756,
-          "r_x3": 71.33333333333333,
+          "r_x3": 73.34702132031646,
-          "r_y3": 74.66666666666667,
+          "r_y3": 76.99999977896756,
          "coord_origin": "TOPLEFT"
        },
        "text": "Docling bundles PDF document conversion to",
        "orig": "Docling bundles PDF document conversion to",
        "text_direction": "left_to_right",
-        "confidence": 0.9555703127793324,
+        "confidence": 1.0,
        "from_ocr": true
      },
      {
@ -40,20 +40,20 @@
          "a": 255
        },
        "rect": {
-          "r_x0": 69.0,
+          "r_x0": 70.90211866351085,
-          "r_y0": 126.66666666666667,
+          "r_y0": 124.83139551297342,
-          "r_x1": 506.6666666666667,
+          "r_x1": 504.8720079864275,
-          "r_y1": 126.66666666666667,
+          "r_y1": 124.83139551297342,
-          "r_x2": 506.6666666666667,
+          "r_x2": 504.8720079864275,
-          "r_y2": 100.66666666666667,
+          "r_y2": 102.66666671251768,
-          "r_x3": 69.0,
+          "r_x3": 70.90211866351085,
-          "r_y3": 100.66666666666667,
+          "r_y3": 102.66666671251768,
          "coord_origin": "TOPLEFT"
        },
        "text": "JSON and Markdown in an easy self contained",
        "orig": "JSON and Markdown in an easy self contained",
        "text_direction": "left_to_right",
-        "confidence": 0.9741098171752292,
+        "confidence": 1.0,
        "from_ocr": true
      },
      {
@ -65,20 +65,20 @@
          "a": 255
        },
        "rect": {
-          "r_x0": 70.66666666666667,
+          "r_x0": 73.10852522817731,
-          "r_y0": 153.33333333333334,
+          "r_y0": 152.70503335218433,
-          "r_x1": 154.0,
+          "r_x1": 153.04479435252625,
-          "r_y1": 153.33333333333334,
+          "r_y1": 152.70503335218433,
-          "r_x2": 154.0,
+          "r_x2": 153.04479435252625,
-          "r_y2": 128.66666666666666,
+          "r_y2": 130.00136157890958,
-          "r_x3": 70.66666666666667,
+          "r_x3": 73.10852522817731,
-          "r_y3": 128.66666666666666,
+          "r_y3": 130.00136157890958,
          "coord_origin": "TOPLEFT"
        },
        "text": "package",
        "orig": "package",
        "text_direction": "left_to_right",
-        "confidence": 0.6702765056141881,
+        "confidence": 1.0,
        "from_ocr": true
      }
    ],
@ -90,10 +90,10 @@
            "id": 0,
            "label": "text",
            "bbox": {
-              "l": 69.0,
+              "l": 70.90211866351085,
-              "t": 74.66666666666667,
+              "t": 76.99999977896756,
-              "r": 506.6666666666667,
+              "r": 504.8720079864275,
-              "b": 153.33333333333334,
+              "b": 152.70503335218433,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.9715733528137207,
@ -107,20 +107,20 @@
                  "a": 255
                },
                "rect": {
-                  "r_x0": 71.33333333333333,
+                  "r_x0": 73.34702132031646,
-                  "r_y0": 99.33333333333333,
+                  "r_y0": 97.99999977896755,
-                  "r_x1": 506.6666666666667,
+                  "r_x1": 503.64955224479564,
-                  "r_y1": 99.33333333333333,
+                  "r_y1": 97.99999977896755,
-                  "r_x2": 506.6666666666667,
+                  "r_x2": 503.64955224479564,
-                  "r_y2": 74.66666666666667,
+                  "r_y2": 76.99999977896756,
-                  "r_x3": 71.33333333333333,
+                  "r_x3": 73.34702132031646,
-                  "r_y3": 74.66666666666667,
+                  "r_y3": 76.99999977896756,
                  "coord_origin": "TOPLEFT"
                },
                "text": "Docling bundles PDF document conversion to",
                "orig": "Docling bundles PDF document conversion to",
                "text_direction": "left_to_right",
-                "confidence": 0.9555703127793324,
+                "confidence": 1.0,
                "from_ocr": true
              },
              {
@ -132,20 +132,20 @@
                  "a": 255
                },
                "rect": {
-                  "r_x0": 69.0,
+                  "r_x0": 70.90211866351085,
-                  "r_y0": 126.66666666666667,
+                  "r_y0": 124.83139551297342,
-                  "r_x1": 506.6666666666667,
+                  "r_x1": 504.8720079864275,
-                  "r_y1": 126.66666666666667,
+                  "r_y1": 124.83139551297342,
-                  "r_x2": 506.6666666666667,
+                  "r_x2": 504.8720079864275,
-                  "r_y2": 100.66666666666667,
+                  "r_y2": 102.66666671251768,
-                  "r_x3": 69.0,
+                  "r_x3": 70.90211866351085,
-                  "r_y3": 100.66666666666667,
+                  "r_y3": 102.66666671251768,
                  "coord_origin": "TOPLEFT"
                },
                "text": "JSON and Markdown in an easy self contained",
                "orig": "JSON and Markdown in an easy self contained",
                "text_direction": "left_to_right",
-                "confidence": 0.9741098171752292,
+                "confidence": 1.0,
                "from_ocr": true
              },
              {
@ -157,20 +157,20 @@
                  "a": 255
                },
                "rect": {
-                  "r_x0": 70.66666666666667,
+                  "r_x0": 73.10852522817731,
-                  "r_y0": 153.33333333333334,
+                  "r_y0": 152.70503335218433,
-                  "r_x1": 154.0,
+                  "r_x1": 153.04479435252625,
-                  "r_y1": 153.33333333333334,
+                  "r_y1": 152.70503335218433,
-                  "r_x2": 154.0,
+                  "r_x2": 153.04479435252625,
-                  "r_y2": 128.66666666666666,
+                  "r_y2": 130.00136157890958,
-                  "r_x3": 70.66666666666667,
+                  "r_x3": 73.10852522817731,
-                  "r_y3": 128.66666666666666,
+                  "r_y3": 130.00136157890958,
                  "coord_origin": "TOPLEFT"
                },
                "text": "package",
                "orig": "package",
                "text_direction": "left_to_right",
-                "confidence": 0.6702765056141881,
+                "confidence": 1.0,
                "from_ocr": true
              }
            ],
@ -195,10 +195,10 @@
            "id": 0,
            "label": "text",
            "bbox": {
-              "l": 69.0,
+              "l": 70.90211866351085,
-              "t": 74.66666666666667,
+              "t": 76.99999977896756,
-              "r": 506.6666666666667,
+              "r": 504.8720079864275,
-              "b": 153.33333333333334,
+              "b": 152.70503335218433,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.9715733528137207,
@ -212,20 +212,20 @@
                  "a": 255
                },
                "rect": {
-                  "r_x0": 71.33333333333333,
+                  "r_x0": 73.34702132031646,
-                  "r_y0": 99.33333333333333,
+                  "r_y0": 97.99999977896755,
-                  "r_x1": 506.6666666666667,
+                  "r_x1": 503.64955224479564,
-                  "r_y1": 99.33333333333333,
+                  "r_y1": 97.99999977896755,
-                  "r_x2": 506.6666666666667,
+                  "r_x2": 503.64955224479564,
-                  "r_y2": 74.66666666666667,
+                  "r_y2": 76.99999977896756,
-                  "r_x3": 71.33333333333333,
+                  "r_x3": 73.34702132031646,
-                  "r_y3": 74.66666666666667,
+                  "r_y3": 76.99999977896756,
                  "coord_origin": "TOPLEFT"
                },
                "text": "Docling bundles PDF document conversion to",
                "orig": "Docling bundles PDF document conversion to",
                "text_direction": "left_to_right",
-                "confidence": 0.9555703127793324,
+                "confidence": 1.0,
                "from_ocr": true
              },
              {
@ -237,20 +237,20 @@
                  "a": 255
                },
                "rect": {
-                  "r_x0": 69.0,
+                  "r_x0": 70.90211866351085,
-                  "r_y0": 126.66666666666667,
+                  "r_y0": 124.83139551297342,
-                  "r_x1": 506.6666666666667,
+                  "r_x1": 504.8720079864275,
-                  "r_y1": 126.66666666666667,
+                  "r_y1": 124.83139551297342,
-                  "r_x2": 506.6666666666667,
+                  "r_x2": 504.8720079864275,
-                  "r_y2": 100.66666666666667,
+                  "r_y2": 102.66666671251768,
-                  "r_x3": 69.0,
+                  "r_x3": 70.90211866351085,
-                  "r_y3": 100.66666666666667,
+                  "r_y3": 102.66666671251768,
                  "coord_origin": "TOPLEFT"
                },
                "text": "JSON and Markdown in an easy self contained",
                "orig": "JSON and Markdown in an easy self contained",
                "text_direction": "left_to_right",
-                "confidence": 0.9741098171752292,
+                "confidence": 1.0,
                "from_ocr": true
              },
              {
@ -262,20 +262,20 @@
                  "a": 255
                },
                "rect": {
-                  "r_x0": 70.66666666666667,
+                  "r_x0": 73.10852522817731,
-                  "r_y0": 153.33333333333334,
+                  "r_y0": 152.70503335218433,
-                  "r_x1": 154.0,
+                  "r_x1": 153.04479435252625,
-                  "r_y1": 153.33333333333334,
+                  "r_y1": 152.70503335218433,
-                  "r_x2": 154.0,
+                  "r_x2": 153.04479435252625,
-                  "r_y2": 128.66666666666666,
+                  "r_y2": 130.00136157890958,
-                  "r_x3": 70.66666666666667,
+                  "r_x3": 73.10852522817731,
-                  "r_y3": 128.66666666666666,
+                  "r_y3": 130.00136157890958,
                  "coord_origin": "TOPLEFT"
                },
                "text": "package",
                "orig": "package",
                "text_direction": "left_to_right",
-                "confidence": 0.6702765056141881,
+                "confidence": 1.0,
                "from_ocr": true
              }
            ],
@ -293,10 +293,10 @@
            "id": 0,
            "label": "text",
            "bbox": {
-              "l": 69.0,
+              "l": 70.90211866351085,
-              "t": 74.66666666666667,
+              "t": 76.99999977896756,
-              "r": 506.6666666666667,
+              "r": 504.8720079864275,
-              "b": 153.33333333333334,
+              "b": 152.70503335218433,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.9715733528137207,
@ -310,20 +310,20 @@
                  "a": 255
                },
                "rect": {
-                  "r_x0": 71.33333333333333,
+                  "r_x0": 73.34702132031646,
-                  "r_y0": 99.33333333333333,
+                  "r_y0": 97.99999977896755,
-                  "r_x1": 506.6666666666667,
+                  "r_x1": 503.64955224479564,
-                  "r_y1": 99.33333333333333,
+                  "r_y1": 97.99999977896755,
-                  "r_x2": 506.6666666666667,
+                  "r_x2": 503.64955224479564,
-                  "r_y2": 74.66666666666667,
+                  "r_y2": 76.99999977896756,
-                  "r_x3": 71.33333333333333,
+                  "r_x3": 73.34702132031646,
-                  "r_y3": 74.66666666666667,
+                  "r_y3": 76.99999977896756,
                  "coord_origin": "TOPLEFT"
                },
                "text": "Docling bundles PDF document conversion to",
                "orig": "Docling bundles PDF document conversion to",
                "text_direction": "left_to_right",
-                "confidence": 0.9555703127793324,
+                "confidence": 1.0,
                "from_ocr": true
              },
              {
@ -335,20 +335,20 @@
                  "a": 255
                },
                "rect": {
-                  "r_x0": 69.0,
+                  "r_x0": 70.90211866351085,
-                  "r_y0": 126.66666666666667,
+                  "r_y0": 124.83139551297342,
-                  "r_x1": 506.6666666666667,
+                  "r_x1": 504.8720079864275,
-                  "r_y1": 126.66666666666667,
+                  "r_y1": 124.83139551297342,
-                  "r_x2": 506.6666666666667,
+                  "r_x2": 504.8720079864275,
-                  "r_y2": 100.66666666666667,
+                  "r_y2": 102.66666671251768,
-                  "r_x3": 69.0,
+                  "r_x3": 70.90211866351085,
-                  "r_y3": 100.66666666666667,
+                  "r_y3": 102.66666671251768,
                  "coord_origin": "TOPLEFT"
                },
                "text": "JSON and Markdown in an easy self contained",
                "orig": "JSON and Markdown in an easy self contained",
                "text_direction": "left_to_right",
-                "confidence": 0.9741098171752292,
+                "confidence": 1.0,
                "from_ocr": true
              },
              {
@ -360,20 +360,20 @@
                  "a": 255
                },
                "rect": {
-                  "r_x0": 70.66666666666667,
+                  "r_x0": 73.10852522817731,
-                  "r_y0": 153.33333333333334,
+                  "r_y0": 152.70503335218433,
-                  "r_x1": 154.0,
+                  "r_x1": 153.04479435252625,
-                  "r_y1": 153.33333333333334,
+                  "r_y1": 152.70503335218433,
-                  "r_x2": 154.0,
+                  "r_x2": 153.04479435252625,
-                  "r_y2": 128.66666666666666,
+                  "r_y2": 130.00136157890958,
-                  "r_x3": 70.66666666666667,
+                  "r_x3": 73.10852522817731,
-                  "r_y3": 128.66666666666666,
+                  "r_y3": 130.00136157890958,
                  "coord_origin": "TOPLEFT"
                },
                "text": "package",
                "orig": "package",
                "text_direction": "left_to_right",
-                "confidence": 0.6702765056141881,
+                "confidence": 1.0,
                "from_ocr": true
              }
            ],
--- a/tests/data_scanned/groundtruth/docling_v2/ocr_test_rotated_180.doctags.txt
+++ b/tests/data_scanned/groundtruth/docling_v2/ocr_test_rotated_180.doctags.txt
@ -0,0 +1,3 @@
 <doctag><text><loc_371><loc_410><loc_438><loc_422>package</text>
 <text><loc_75><loc_426><loc_440><loc_454>Docling bundles PDF document conversion to JSON and Markdown in an easy self contained</text>
 </doctag>
--- a/tests/data_scanned/groundtruth/docling_v2/ocr_test_rotated_180.json
+++ b/tests/data_scanned/groundtruth/docling_v2/ocr_test_rotated_180.json
@ -0,0 +1,109 @@
 {
  "schema_name": "DoclingDocument",
  "version": "1.3.0",
  "name": "ocr_test_rotated_180",
  "origin": {
    "mimetype": "application/pdf",
    "binary_hash": 2530576989861832966,
    "filename": "ocr_test_rotated_180.pdf",
    "uri": null
  },
  "furniture": {
    "self_ref": "#/furniture",
    "parent": null,
    "children": [],
    "content_layer": "furniture",
    "name": "_root_",
    "label": "unspecified"
  },
  "body": {
    "self_ref": "#/body",
    "parent": null,
    "children": [
      {
        "cref": "#/texts/0"
      },
      {
        "cref": "#/texts/1"
      }
    ],
    "content_layer": "body",
    "name": "_root_",
    "label": "unspecified"
  },
  "groups": [],
  "texts": [
    {
      "self_ref": "#/texts/0",
      "parent": {
        "cref": "#/body"
      },
      "children": [],
      "content_layer": "body",
      "label": "text",
      "prov": [
        {
          "page_no": 1,
          "bbox": {
            "l": 441.304584329099,
            "t": 151.67751306395223,
            "r": 521.9863114205704,
            "b": 132.09610360960653,
            "coord_origin": "BOTTOMLEFT"
          },
          "charspan": [
            0,
            7
          ]
        }
      ],
      "orig": "package",
      "text": "package",
      "formatting": null,
      "hyperlink": null
    },
    {
      "self_ref": "#/texts/1",
      "parent": {
        "cref": "#/body"
      },
      "children": [],
      "content_layer": "body",
      "label": "text",
      "prov": [
        {
          "page_no": 1,
          "bbox": {
            "l": 89.12133215549848,
            "t": 124.86176457554109,
            "r": 523.3501733013318,
            "b": 77.02339849621205,
            "coord_origin": "BOTTOMLEFT"
          },
          "charspan": [
            0,
            86
          ]
        }
      ],
      "orig": "Docling bundles PDF document conversion to JSON and Markdown in an easy self contained",
      "text": "Docling bundles PDF document conversion to JSON and Markdown in an easy self contained",
      "formatting": null,
      "hyperlink": null
    }
  ],
  "pictures": [],
  "tables": [],
  "key_value_items": [],
  "form_items": [],
  "pages": {
    "1": {
      "size": {
        "width": 595.201171875,
        "height": 841.9216918945312
      },
      "image": null,
      "page_no": 1
    }
  }
 }
--- a/tests/data_scanned/groundtruth/docling_v2/ocr_test_rotated_180.md
+++ b/tests/data_scanned/groundtruth/docling_v2/ocr_test_rotated_180.md
@ -0,0 +1,3 @@
 package
 Docling bundles PDF document conversion to JSON and Markdown in an easy self contained
--- a/tests/data_scanned/groundtruth/docling_v2/ocr_test_rotated_180.pages.json
+++ b/tests/data_scanned/groundtruth/docling_v2/ocr_test_rotated_180.pages.json
@ -0,0 +1,445 @@
 [
  {
    "page_no": 0,
    "size": {
      "width": 595.201171875,
      "height": 841.9216918945312
    },
    "cells": [
      {
        "index": 0,
        "rgba": {
          "r": 0,
          "g": 0,
          "b": 0,
          "a": 255
        },
        "rect": {
          "r_x0": 90.46133071208328,
          "r_y0": 764.8982933983192,
          "r_x1": 520.7638616365624,
          "r_y1": 764.8982933983192,
          "r_x2": 520.7638616365624,
          "r_y2": 744.0929853742306,
          "r_x3": 90.46133071208328,
          "r_y3": 744.0929853742306,
          "coord_origin": "TOPLEFT"
        },
        "text": "Docling bundles PDF document conversion to",
        "orig": "Docling bundles PDF document conversion to",
        "text_direction": "left_to_right",
        "confidence": 1.0,
        "from_ocr": true
      },
      {
        "index": 1,
        "rgba": {
          "r": 0,
          "g": 0,
          "b": 0,
          "a": 255
        },
        "rect": {
          "r_x0": 89.12133215549848,
          "r_y0": 741.5247710689902,
          "r_x1": 523.3501733013318,
          "r_y1": 741.5247710689902,
          "r_x2": 523.3501733013318,
          "r_y2": 717.0599273189902,
          "r_x3": 89.12133215549848,
          "r_y3": 717.0599273189902,
          "coord_origin": "TOPLEFT"
        },
        "text": "JSON and Markdown in an easy self contained",
        "orig": "JSON and Markdown in an easy self contained",
        "text_direction": "left_to_right",
        "confidence": 1.0,
        "from_ocr": true
      },
      {
        "index": 2,
        "rgba": {
          "r": 0,
          "g": 0,
          "b": 0,
          "a": 255
        },
        "rect": {
          "r_x0": 441.304584329099,
          "r_y0": 709.8255882849247,
          "r_x1": 521.9863114205704,
          "r_y1": 709.8255882849247,
          "r_x2": 521.9863114205704,
          "r_y2": 690.244178830579,
          "r_x3": 441.304584329099,
          "r_y3": 690.244178830579,
          "coord_origin": "TOPLEFT"
        },
        "text": "package",
        "orig": "package",
        "text_direction": "left_to_right",
        "confidence": 1.0,
        "from_ocr": true
      }
    ],
    "parsed_page": null,
    "predictions": {
      "layout": {
        "clusters": [
          {
            "id": 0,
            "label": "text",
            "bbox": {
              "l": 89.12133215549848,
              "t": 717.0599273189902,
              "r": 523.3501733013318,
              "b": 764.8982933983192,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.7318570613861084,
            "cells": [
              {
                "index": 0,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 90.46133071208328,
                  "r_y0": 764.8982933983192,
                  "r_x1": 520.7638616365624,
                  "r_y1": 764.8982933983192,
                  "r_x2": 520.7638616365624,
                  "r_y2": 744.0929853742306,
                  "r_x3": 90.46133071208328,
                  "r_y3": 744.0929853742306,
                  "coord_origin": "TOPLEFT"
                },
                "text": "Docling bundles PDF document conversion to",
                "orig": "Docling bundles PDF document conversion to",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              },
              {
                "index": 1,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 89.12133215549848,
                  "r_y0": 741.5247710689902,
                  "r_x1": 523.3501733013318,
                  "r_y1": 741.5247710689902,
                  "r_x2": 523.3501733013318,
                  "r_y2": 717.0599273189902,
                  "r_x3": 89.12133215549848,
                  "r_y3": 717.0599273189902,
                  "coord_origin": "TOPLEFT"
                },
                "text": "JSON and Markdown in an easy self contained",
                "orig": "JSON and Markdown in an easy self contained",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          {
            "id": 2,
            "label": "text",
            "bbox": {
              "l": 441.304584329099,
              "t": 690.244178830579,
              "r": 521.9863114205704,
              "b": 709.8255882849247,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.5982133150100708,
            "cells": [
              {
                "index": 2,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 441.304584329099,
                  "r_y0": 709.8255882849247,
                  "r_x1": 521.9863114205704,
                  "r_y1": 709.8255882849247,
                  "r_x2": 521.9863114205704,
                  "r_y2": 690.244178830579,
                  "r_x3": 441.304584329099,
                  "r_y3": 690.244178830579,
                  "coord_origin": "TOPLEFT"
                },
                "text": "package",
                "orig": "package",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          }
        ]
      },
      "tablestructure": {
        "table_map": {}
      },
      "figures_classification": null,
      "equations_prediction": null,
      "vlm_response": null
    },
    "assembled": {
      "elements": [
        {
          "label": "text",
          "id": 0,
          "page_no": 0,
          "cluster": {
            "id": 0,
            "label": "text",
            "bbox": {
              "l": 89.12133215549848,
              "t": 717.0599273189902,
              "r": 523.3501733013318,
              "b": 764.8982933983192,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.7318570613861084,
            "cells": [
              {
                "index": 0,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 90.46133071208328,
                  "r_y0": 764.8982933983192,
                  "r_x1": 520.7638616365624,
                  "r_y1": 764.8982933983192,
                  "r_x2": 520.7638616365624,
                  "r_y2": 744.0929853742306,
                  "r_x3": 90.46133071208328,
                  "r_y3": 744.0929853742306,
                  "coord_origin": "TOPLEFT"
                },
                "text": "Docling bundles PDF document conversion to",
                "orig": "Docling bundles PDF document conversion to",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              },
              {
                "index": 1,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 89.12133215549848,
                  "r_y0": 741.5247710689902,
                  "r_x1": 523.3501733013318,
                  "r_y1": 741.5247710689902,
                  "r_x2": 523.3501733013318,
                  "r_y2": 717.0599273189902,
                  "r_x3": 89.12133215549848,
                  "r_y3": 717.0599273189902,
                  "coord_origin": "TOPLEFT"
                },
                "text": "JSON and Markdown in an easy self contained",
                "orig": "JSON and Markdown in an easy self contained",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          "text": "Docling bundles PDF document conversion to JSON and Markdown in an easy self contained"
        },
        {
          "label": "text",
          "id": 2,
          "page_no": 0,
          "cluster": {
            "id": 2,
            "label": "text",
            "bbox": {
              "l": 441.304584329099,
              "t": 690.244178830579,
              "r": 521.9863114205704,
              "b": 709.8255882849247,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.5982133150100708,
            "cells": [
              {
                "index": 2,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 441.304584329099,
                  "r_y0": 709.8255882849247,
                  "r_x1": 521.9863114205704,
                  "r_y1": 709.8255882849247,
                  "r_x2": 521.9863114205704,
                  "r_y2": 690.244178830579,
                  "r_x3": 441.304584329099,
                  "r_y3": 690.244178830579,
                  "coord_origin": "TOPLEFT"
                },
                "text": "package",
                "orig": "package",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          "text": "package"
        }
      ],
      "body": [
        {
          "label": "text",
          "id": 0,
          "page_no": 0,
          "cluster": {
            "id": 0,
            "label": "text",
            "bbox": {
              "l": 89.12133215549848,
              "t": 717.0599273189902,
              "r": 523.3501733013318,
              "b": 764.8982933983192,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.7318570613861084,
            "cells": [
              {
                "index": 0,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 90.46133071208328,
                  "r_y0": 764.8982933983192,
                  "r_x1": 520.7638616365624,
                  "r_y1": 764.8982933983192,
                  "r_x2": 520.7638616365624,
                  "r_y2": 744.0929853742306,
                  "r_x3": 90.46133071208328,
                  "r_y3": 744.0929853742306,
                  "coord_origin": "TOPLEFT"
                },
                "text": "Docling bundles PDF document conversion to",
                "orig": "Docling bundles PDF document conversion to",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              },
              {
                "index": 1,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 89.12133215549848,
                  "r_y0": 741.5247710689902,
                  "r_x1": 523.3501733013318,
                  "r_y1": 741.5247710689902,
                  "r_x2": 523.3501733013318,
                  "r_y2": 717.0599273189902,
                  "r_x3": 89.12133215549848,
                  "r_y3": 717.0599273189902,
                  "coord_origin": "TOPLEFT"
                },
                "text": "JSON and Markdown in an easy self contained",
                "orig": "JSON and Markdown in an easy self contained",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          "text": "Docling bundles PDF document conversion to JSON and Markdown in an easy self contained"
        },
        {
          "label": "text",
          "id": 2,
          "page_no": 0,
          "cluster": {
            "id": 2,
            "label": "text",
            "bbox": {
              "l": 441.304584329099,
              "t": 690.244178830579,
              "r": 521.9863114205704,
              "b": 709.8255882849247,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.5982133150100708,
            "cells": [
              {
                "index": 2,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 441.304584329099,
                  "r_y0": 709.8255882849247,
                  "r_x1": 521.9863114205704,
                  "r_y1": 709.8255882849247,
                  "r_x2": 521.9863114205704,
                  "r_y2": 690.244178830579,
                  "r_x3": 441.304584329099,
                  "r_y3": 690.244178830579,
                  "coord_origin": "TOPLEFT"
                },
                "text": "package",
                "orig": "package",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          "text": "package"
        }
      ],
      "headers": []
    }
  }
 ]
--- a/tests/data_scanned/groundtruth/docling_v2/ocr_test_rotated_270.doctags.txt
+++ b/tests/data_scanned/groundtruth/docling_v2/ocr_test_rotated_270.doctags.txt
@ -0,0 +1,3 @@
 <doctag><page_header><loc_426><loc_60><loc_454><loc_424>Docling bundles PDF document conversion to JSON and Markdown in an easy self contained</page_header>
 <text><loc_411><loc_61><loc_422><loc_128>package</text>
 </doctag>
--- a/tests/data_scanned/groundtruth/docling_v2/ocr_test_rotated_270.json
+++ b/tests/data_scanned/groundtruth/docling_v2/ocr_test_rotated_270.json
@ -0,0 +1,109 @@
 {
  "schema_name": "DoclingDocument",
  "version": "1.3.0",
  "name": "ocr_test_rotated_270",
  "origin": {
    "mimetype": "application/pdf",
    "binary_hash": 10890858393843077593,
    "filename": "ocr_test_rotated_270.pdf",
    "uri": null
  },
  "furniture": {
    "self_ref": "#/furniture",
    "parent": null,
    "children": [],
    "content_layer": "furniture",
    "name": "_root_",
    "label": "unspecified"
  },
  "body": {
    "self_ref": "#/body",
    "parent": null,
    "children": [
      {
        "cref": "#/texts/0"
      },
      {
        "cref": "#/texts/1"
      }
    ],
    "content_layer": "body",
    "name": "_root_",
    "label": "unspecified"
  },
  "groups": [],
  "texts": [
    {
      "self_ref": "#/texts/0",
      "parent": {
        "cref": "#/body"
      },
      "children": [],
      "content_layer": "furniture",
      "label": "page_header",
      "prov": [
        {
          "page_no": 1,
          "bbox": {
            "l": 717.1685859527342,
            "t": 524.2990548540179,
            "r": 764.8982839673505,
            "b": 90.32916553110118,
            "coord_origin": "BOTTOMLEFT"
          },
          "charspan": [
            0,
            86
          ]
        }
      ],
      "orig": "Docling bundles PDF document conversion to JSON and Markdown in an easy self contained",
      "text": "Docling bundles PDF document conversion to JSON and Markdown in an easy self contained",
      "formatting": null,
      "hyperlink": null
    },
    {
      "self_ref": "#/texts/1",
      "parent": {
        "cref": "#/body"
      },
      "children": [],
      "content_layer": "body",
      "label": "text",
      "prov": [
        {
          "page_no": 1,
          "bbox": {
            "l": 691.4680194659409,
            "t": 523.0765988200898,
            "r": 709.8255850278712,
            "b": 442.3948768148814,
            "coord_origin": "BOTTOMLEFT"
          },
          "charspan": [
            0,
            7
          ]
        }
      ],
      "orig": "package",
      "text": "package",
      "formatting": null,
      "hyperlink": null
    }
  ],
  "pictures": [],
  "tables": [],
  "key_value_items": [],
  "form_items": [],
  "pages": {
    "1": {
      "size": {
        "width": 841.9216918945312,
        "height": 595.201171875
      },
      "image": null,
      "page_no": 1
    }
  }
 }
--- a/tests/data_scanned/groundtruth/docling_v2/ocr_test_rotated_270.md
+++ b/tests/data_scanned/groundtruth/docling_v2/ocr_test_rotated_270.md
@ -0,0 +1 @@
 package
--- a/tests/data_scanned/groundtruth/docling_v2/ocr_test_rotated_270.pages.json
+++ b/tests/data_scanned/groundtruth/docling_v2/ocr_test_rotated_270.pages.json
@ -0,0 +1,446 @@
 [
  {
    "page_no": 0,
    "size": {
      "width": 841.9216918945312,
      "height": 595.201171875
    },
    "cells": [
      {
        "index": 0,
        "rgba": {
          "r": 0,
          "g": 0,
          "b": 0,
          "a": 255
        },
        "rect": {
          "r_x0": 744.0930045534915,
          "r_y0": 504.87200373583954,
          "r_x1": 764.8982839673505,
          "r_y1": 504.87200373583954,
          "r_x2": 764.8982839673505,
          "r_y2": 73.34702001188118,
          "r_x3": 744.0930045534915,
          "r_y3": 73.34702001188118,
          "coord_origin": "TOPLEFT"
        },
        "text": "Docling bundles PDF document conversion to",
        "orig": "Docling bundles PDF document conversion to",
        "text_direction": "left_to_right",
        "confidence": 1.0,
        "from_ocr": true
      },
      {
        "index": 1,
        "rgba": {
          "r": 0,
          "g": 0,
          "b": 0,
          "a": 255
        },
        "rect": {
          "r_x0": 717.1685859527342,
          "r_y0": 504.8720063438988,
          "r_x1": 737.9738558298501,
          "r_y1": 504.8720063438988,
          "r_x2": 737.9738558298501,
          "r_y2": 70.90211702098213,
          "r_x3": 717.1685859527342,
          "r_y3": 70.90211702098213,
          "coord_origin": "TOPLEFT"
        },
        "text": "JSON and Markdown in an easy self contained",
        "orig": "JSON and Markdown in an easy self contained",
        "text_direction": "left_to_right",
        "confidence": 1.0,
        "from_ocr": true
      },
      {
        "index": 2,
        "rgba": {
          "r": 0,
          "g": 0,
          "b": 0,
          "a": 255
        },
        "rect": {
          "r_x0": 691.4680194659409,
          "r_y0": 152.80629506011857,
          "r_x1": 709.8255850278712,
          "r_y1": 152.80629506011857,
          "r_x2": 709.8255850278712,
          "r_y2": 72.12457305491027,
          "r_x3": 691.4680194659409,
          "r_y3": 72.12457305491027,
          "coord_origin": "TOPLEFT"
        },
        "text": "package",
        "orig": "package",
        "text_direction": "left_to_right",
        "confidence": 1.0,
        "from_ocr": true
      }
    ],
    "parsed_page": null,
    "predictions": {
      "layout": {
        "clusters": [
          {
            "id": 0,
            "label": "page_header",
            "bbox": {
              "l": 717.1685859527342,
              "t": 70.90211702098213,
              "r": 764.8982839673505,
              "b": 504.8720063438988,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.6915205121040344,
            "cells": [
              {
                "index": 0,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 744.0930045534915,
                  "r_y0": 504.87200373583954,
                  "r_x1": 764.8982839673505,
                  "r_y1": 504.87200373583954,
                  "r_x2": 764.8982839673505,
                  "r_y2": 73.34702001188118,
                  "r_x3": 744.0930045534915,
                  "r_y3": 73.34702001188118,
                  "coord_origin": "TOPLEFT"
                },
                "text": "Docling bundles PDF document conversion to",
                "orig": "Docling bundles PDF document conversion to",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              },
              {
                "index": 1,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 717.1685859527342,
                  "r_y0": 504.8720063438988,
                  "r_x1": 737.9738558298501,
                  "r_y1": 504.8720063438988,
                  "r_x2": 737.9738558298501,
                  "r_y2": 70.90211702098213,
                  "r_x3": 717.1685859527342,
                  "r_y3": 70.90211702098213,
                  "coord_origin": "TOPLEFT"
                },
                "text": "JSON and Markdown in an easy self contained",
                "orig": "JSON and Markdown in an easy self contained",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          {
            "id": 8,
            "label": "text",
            "bbox": {
              "l": 691.4680194659409,
              "t": 72.12457305491027,
              "r": 709.8255850278712,
              "b": 152.80629506011857,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 1.0,
            "cells": [
              {
                "index": 2,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 691.4680194659409,
                  "r_y0": 152.80629506011857,
                  "r_x1": 709.8255850278712,
                  "r_y1": 152.80629506011857,
                  "r_x2": 709.8255850278712,
                  "r_y2": 72.12457305491027,
                  "r_x3": 691.4680194659409,
                  "r_y3": 72.12457305491027,
                  "coord_origin": "TOPLEFT"
                },
                "text": "package",
                "orig": "package",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          }
        ]
      },
      "tablestructure": {
        "table_map": {}
      },
      "figures_classification": null,
      "equations_prediction": null,
      "vlm_response": null
    },
    "assembled": {
      "elements": [
        {
          "label": "page_header",
          "id": 0,
          "page_no": 0,
          "cluster": {
            "id": 0,
            "label": "page_header",
            "bbox": {
              "l": 717.1685859527342,
              "t": 70.90211702098213,
              "r": 764.8982839673505,
              "b": 504.8720063438988,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.6915205121040344,
            "cells": [
              {
                "index": 0,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 744.0930045534915,
                  "r_y0": 504.87200373583954,
                  "r_x1": 764.8982839673505,
                  "r_y1": 504.87200373583954,
                  "r_x2": 764.8982839673505,
                  "r_y2": 73.34702001188118,
                  "r_x3": 744.0930045534915,
                  "r_y3": 73.34702001188118,
                  "coord_origin": "TOPLEFT"
                },
                "text": "Docling bundles PDF document conversion to",
                "orig": "Docling bundles PDF document conversion to",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              },
              {
                "index": 1,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 717.1685859527342,
                  "r_y0": 504.8720063438988,
                  "r_x1": 737.9738558298501,
                  "r_y1": 504.8720063438988,
                  "r_x2": 737.9738558298501,
                  "r_y2": 70.90211702098213,
                  "r_x3": 717.1685859527342,
                  "r_y3": 70.90211702098213,
                  "coord_origin": "TOPLEFT"
                },
                "text": "JSON and Markdown in an easy self contained",
                "orig": "JSON and Markdown in an easy self contained",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          "text": "Docling bundles PDF document conversion to JSON and Markdown in an easy self contained"
        },
        {
          "label": "text",
          "id": 8,
          "page_no": 0,
          "cluster": {
            "id": 8,
            "label": "text",
            "bbox": {
              "l": 691.4680194659409,
              "t": 72.12457305491027,
              "r": 709.8255850278712,
              "b": 152.80629506011857,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 1.0,
            "cells": [
              {
                "index": 2,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 691.4680194659409,
                  "r_y0": 152.80629506011857,
                  "r_x1": 709.8255850278712,
                  "r_y1": 152.80629506011857,
                  "r_x2": 709.8255850278712,
                  "r_y2": 72.12457305491027,
                  "r_x3": 691.4680194659409,
                  "r_y3": 72.12457305491027,
                  "coord_origin": "TOPLEFT"
                },
                "text": "package",
                "orig": "package",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          "text": "package"
        }
      ],
      "body": [
        {
          "label": "text",
          "id": 8,
          "page_no": 0,
          "cluster": {
            "id": 8,
            "label": "text",
            "bbox": {
              "l": 691.4680194659409,
              "t": 72.12457305491027,
              "r": 709.8255850278712,
              "b": 152.80629506011857,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 1.0,
            "cells": [
              {
                "index": 2,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 691.4680194659409,
                  "r_y0": 152.80629506011857,
                  "r_x1": 709.8255850278712,
                  "r_y1": 152.80629506011857,
                  "r_x2": 709.8255850278712,
                  "r_y2": 72.12457305491027,
                  "r_x3": 691.4680194659409,
                  "r_y3": 72.12457305491027,
                  "coord_origin": "TOPLEFT"
                },
                "text": "package",
                "orig": "package",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          "text": "package"
        }
      ],
      "headers": [
        {
          "label": "page_header",
          "id": 0,
          "page_no": 0,
          "cluster": {
            "id": 0,
            "label": "page_header",
            "bbox": {
              "l": 717.1685859527342,
              "t": 70.90211702098213,
              "r": 764.8982839673505,
              "b": 504.8720063438988,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.6915205121040344,
            "cells": [
              {
                "index": 0,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 744.0930045534915,
                  "r_y0": 504.87200373583954,
                  "r_x1": 764.8982839673505,
                  "r_y1": 504.87200373583954,
                  "r_x2": 764.8982839673505,
                  "r_y2": 73.34702001188118,
                  "r_x3": 744.0930045534915,
                  "r_y3": 73.34702001188118,
                  "coord_origin": "TOPLEFT"
                },
                "text": "Docling bundles PDF document conversion to",
                "orig": "Docling bundles PDF document conversion to",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              },
              {
                "index": 1,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 717.1685859527342,
                  "r_y0": 504.8720063438988,
                  "r_x1": 737.9738558298501,
                  "r_y1": 504.8720063438988,
                  "r_x2": 737.9738558298501,
                  "r_y2": 70.90211702098213,
                  "r_x3": 717.1685859527342,
                  "r_y3": 70.90211702098213,
                  "coord_origin": "TOPLEFT"
                },
                "text": "JSON and Markdown in an easy self contained",
                "orig": "JSON and Markdown in an easy self contained",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          "text": "Docling bundles PDF document conversion to JSON and Markdown in an easy self contained"
        }
      ]
    }
  }
 ]
--- a/tests/data_scanned/groundtruth/docling_v2/ocr_test_rotated_90.doctags.txt
+++ b/tests/data_scanned/groundtruth/docling_v2/ocr_test_rotated_90.doctags.txt
@ -0,0 +1,3 @@
 <doctag><page_header><loc_46><loc_75><loc_74><loc_440>Docling bundles PDF document conversion to JSON and Markdown in an easy self contained</page_header>
 <text><loc_78><loc_370><loc_90><loc_438>package</text>
 </doctag>
--- a/tests/data_scanned/groundtruth/docling_v2/ocr_test_rotated_90.json
+++ b/tests/data_scanned/groundtruth/docling_v2/ocr_test_rotated_90.json
@ -0,0 +1,109 @@
 {
  "schema_name": "DoclingDocument",
  "version": "1.3.0",
  "name": "ocr_test_rotated_90",
  "origin": {
    "mimetype": "application/pdf",
    "binary_hash": 6989291015361162334,
    "filename": "ocr_test_rotated_90.pdf",
    "uri": null
  },
  "furniture": {
    "self_ref": "#/furniture",
    "parent": null,
    "children": [],
    "content_layer": "furniture",
    "name": "_root_",
    "label": "unspecified"
  },
  "body": {
    "self_ref": "#/body",
    "parent": null,
    "children": [
      {
        "cref": "#/texts/0"
      },
      {
        "cref": "#/texts/1"
      }
    ],
    "content_layer": "body",
    "name": "_root_",
    "label": "unspecified"
  },
  "groups": [],
  "texts": [
    {
      "self_ref": "#/texts/0",
      "parent": {
        "cref": "#/body"
      },
      "children": [],
      "content_layer": "furniture",
      "label": "page_header",
      "prov": [
        {
          "page_no": 1,
          "bbox": {
            "l": 77.10171546422428,
            "t": 506.07735421856773,
            "r": 124.91101654503161,
            "b": 71.88562244773436,
            "coord_origin": "BOTTOMLEFT"
          },
          "charspan": [
            0,
            86
          ]
        }
      ],
      "orig": "Docling bundles PDF document conversion to JSON and Markdown in an easy self contained",
      "text": "Docling bundles PDF document conversion to JSON and Markdown in an easy self contained",
      "formatting": null,
      "hyperlink": null
    },
    {
      "self_ref": "#/texts/1",
      "parent": {
        "cref": "#/body"
      },
      "children": [],
      "content_layer": "body",
      "label": "text",
      "prov": [
        {
          "page_no": 1,
          "bbox": {
            "l": 131.21306574279092,
            "t": 154.19400205373182,
            "r": 152.19606490864376,
            "b": 74.12495603322407,
            "coord_origin": "BOTTOMLEFT"
          },
          "charspan": [
            0,
            7
          ]
        }
      ],
      "orig": "package",
      "text": "package",
      "formatting": null,
      "hyperlink": null
    }
  ],
  "pictures": [],
  "tables": [],
  "key_value_items": [],
  "form_items": [],
  "pages": {
    "1": {
      "size": {
        "width": 841.9216918945312,
        "height": 595.201171875
      },
      "image": null,
      "page_no": 1
    }
  }
 }
--- a/tests/data_scanned/groundtruth/docling_v2/ocr_test_rotated_90.md
+++ b/tests/data_scanned/groundtruth/docling_v2/ocr_test_rotated_90.md
@ -0,0 +1 @@
 package
--- a/tests/data_scanned/groundtruth/docling_v2/ocr_test_rotated_90.pages.json
+++ b/tests/data_scanned/groundtruth/docling_v2/ocr_test_rotated_90.pages.json
@ -0,0 +1,446 @@
 [
  {
    "page_no": 0,
    "size": {
      "width": 841.9216918945312,
      "height": 595.201171875
    },
    "cells": [
      {
        "index": 0,
        "rgba": {
          "r": 0,
          "g": 0,
          "b": 0,
          "a": 255
        },
        "rect": {
          "r_x0": 77.10171546422428,
          "r_y0": 520.7638577050515,
          "r_x1": 96.6831586150625,
          "r_y1": 520.7638577050515,
          "r_x2": 96.6831586150625,
          "r_y2": 89.23887398109309,
          "r_x3": 77.10171546422428,
          "r_y3": 89.23887398109309,
          "coord_origin": "TOPLEFT"
        },
        "text": "Docling bundles PDF document conversion to",
        "orig": "Docling bundles PDF document conversion to",
        "text_direction": "left_to_right",
        "confidence": 1.0,
        "from_ocr": true
      },
      {
        "index": 1,
        "rgba": {
          "r": 0,
          "g": 0,
          "b": 0,
          "a": 255
        },
        "rect": {
          "r_x0": 100.55299576256091,
          "r_y0": 523.3155494272656,
          "r_x1": 124.91101654503161,
          "r_y1": 523.3155494272656,
          "r_x2": 124.91101654503161,
          "r_y2": 89.12381765643227,
          "r_x3": 100.55299576256091,
          "r_y3": 89.12381765643227,
          "coord_origin": "TOPLEFT"
        },
        "text": "JSON and Markdown in an easy self contained",
        "orig": "JSON and Markdown in an easy self contained",
        "text_direction": "left_to_right",
        "confidence": 1.0,
        "from_ocr": true
      },
      {
        "index": 2,
        "rgba": {
          "r": 0,
          "g": 0,
          "b": 0,
          "a": 255
        },
        "rect": {
          "r_x0": 131.21306574279092,
          "r_y0": 521.0762158417759,
          "r_x1": 152.19606490864376,
          "r_y1": 521.0762158417759,
          "r_x2": 152.19606490864376,
          "r_y2": 441.0071698212682,
          "r_x3": 131.21306574279092,
          "r_y3": 441.0071698212682,
          "coord_origin": "TOPLEFT"
        },
        "text": "package",
        "orig": "package",
        "text_direction": "left_to_right",
        "confidence": 1.0,
        "from_ocr": true
      }
    ],
    "parsed_page": null,
    "predictions": {
      "layout": {
        "clusters": [
          {
            "id": 0,
            "label": "page_header",
            "bbox": {
              "l": 77.10171546422428,
              "t": 89.12381765643227,
              "r": 124.91101654503161,
              "b": 523.3155494272656,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.6016772389411926,
            "cells": [
              {
                "index": 0,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 77.10171546422428,
                  "r_y0": 520.7638577050515,
                  "r_x1": 96.6831586150625,
                  "r_y1": 520.7638577050515,
                  "r_x2": 96.6831586150625,
                  "r_y2": 89.23887398109309,
                  "r_x3": 77.10171546422428,
                  "r_y3": 89.23887398109309,
                  "coord_origin": "TOPLEFT"
                },
                "text": "Docling bundles PDF document conversion to",
                "orig": "Docling bundles PDF document conversion to",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              },
              {
                "index": 1,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 100.55299576256091,
                  "r_y0": 523.3155494272656,
                  "r_x1": 124.91101654503161,
                  "r_y1": 523.3155494272656,
                  "r_x2": 124.91101654503161,
                  "r_y2": 89.12381765643227,
                  "r_x3": 100.55299576256091,
                  "r_y3": 89.12381765643227,
                  "coord_origin": "TOPLEFT"
                },
                "text": "JSON and Markdown in an easy self contained",
                "orig": "JSON and Markdown in an easy self contained",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          {
            "id": 1,
            "label": "text",
            "bbox": {
              "l": 131.21306574279092,
              "t": 441.0071698212682,
              "r": 152.19606490864376,
              "b": 521.0762158417759,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.5234212875366211,
            "cells": [
              {
                "index": 2,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 131.21306574279092,
                  "r_y0": 521.0762158417759,
                  "r_x1": 152.19606490864376,
                  "r_y1": 521.0762158417759,
                  "r_x2": 152.19606490864376,
                  "r_y2": 441.0071698212682,
                  "r_x3": 131.21306574279092,
                  "r_y3": 441.0071698212682,
                  "coord_origin": "TOPLEFT"
                },
                "text": "package",
                "orig": "package",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          }
        ]
      },
      "tablestructure": {
        "table_map": {}
      },
      "figures_classification": null,
      "equations_prediction": null,
      "vlm_response": null
    },
    "assembled": {
      "elements": [
        {
          "label": "page_header",
          "id": 0,
          "page_no": 0,
          "cluster": {
            "id": 0,
            "label": "page_header",
            "bbox": {
              "l": 77.10171546422428,
              "t": 89.12381765643227,
              "r": 124.91101654503161,
              "b": 523.3155494272656,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.6016772389411926,
            "cells": [
              {
                "index": 0,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 77.10171546422428,
                  "r_y0": 520.7638577050515,
                  "r_x1": 96.6831586150625,
                  "r_y1": 520.7638577050515,
                  "r_x2": 96.6831586150625,
                  "r_y2": 89.23887398109309,
                  "r_x3": 77.10171546422428,
                  "r_y3": 89.23887398109309,
                  "coord_origin": "TOPLEFT"
                },
                "text": "Docling bundles PDF document conversion to",
                "orig": "Docling bundles PDF document conversion to",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              },
              {
                "index": 1,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 100.55299576256091,
                  "r_y0": 523.3155494272656,
                  "r_x1": 124.91101654503161,
                  "r_y1": 523.3155494272656,
                  "r_x2": 124.91101654503161,
                  "r_y2": 89.12381765643227,
                  "r_x3": 100.55299576256091,
                  "r_y3": 89.12381765643227,
                  "coord_origin": "TOPLEFT"
                },
                "text": "JSON and Markdown in an easy self contained",
                "orig": "JSON and Markdown in an easy self contained",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          "text": "Docling bundles PDF document conversion to JSON and Markdown in an easy self contained"
        },
        {
          "label": "text",
          "id": 1,
          "page_no": 0,
          "cluster": {
            "id": 1,
            "label": "text",
            "bbox": {
              "l": 131.21306574279092,
              "t": 441.0071698212682,
              "r": 152.19606490864376,
              "b": 521.0762158417759,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.5234212875366211,
            "cells": [
              {
                "index": 2,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 131.21306574279092,
                  "r_y0": 521.0762158417759,
                  "r_x1": 152.19606490864376,
                  "r_y1": 521.0762158417759,
                  "r_x2": 152.19606490864376,
                  "r_y2": 441.0071698212682,
                  "r_x3": 131.21306574279092,
                  "r_y3": 441.0071698212682,
                  "coord_origin": "TOPLEFT"
                },
                "text": "package",
                "orig": "package",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          "text": "package"
        }
      ],
      "body": [
        {
          "label": "text",
          "id": 1,
          "page_no": 0,
          "cluster": {
            "id": 1,
            "label": "text",
            "bbox": {
              "l": 131.21306574279092,
              "t": 441.0071698212682,
              "r": 152.19606490864376,
              "b": 521.0762158417759,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.5234212875366211,
            "cells": [
              {
                "index": 2,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 131.21306574279092,
                  "r_y0": 521.0762158417759,
                  "r_x1": 152.19606490864376,
                  "r_y1": 521.0762158417759,
                  "r_x2": 152.19606490864376,
                  "r_y2": 441.0071698212682,
                  "r_x3": 131.21306574279092,
                  "r_y3": 441.0071698212682,
                  "coord_origin": "TOPLEFT"
                },
                "text": "package",
                "orig": "package",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          "text": "package"
        }
      ],
      "headers": [
        {
          "label": "page_header",
          "id": 0,
          "page_no": 0,
          "cluster": {
            "id": 0,
            "label": "page_header",
            "bbox": {
              "l": 77.10171546422428,
              "t": 89.12381765643227,
              "r": 124.91101654503161,
              "b": 523.3155494272656,
              "coord_origin": "TOPLEFT"
            },
            "confidence": 0.6016772389411926,
            "cells": [
              {
                "index": 0,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 77.10171546422428,
                  "r_y0": 520.7638577050515,
                  "r_x1": 96.6831586150625,
                  "r_y1": 520.7638577050515,
                  "r_x2": 96.6831586150625,
                  "r_y2": 89.23887398109309,
                  "r_x3": 77.10171546422428,
                  "r_y3": 89.23887398109309,
                  "coord_origin": "TOPLEFT"
                },
                "text": "Docling bundles PDF document conversion to",
                "orig": "Docling bundles PDF document conversion to",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              },
              {
                "index": 1,
                "rgba": {
                  "r": 0,
                  "g": 0,
                  "b": 0,
                  "a": 255
                },
                "rect": {
                  "r_x0": 100.55299576256091,
                  "r_y0": 523.3155494272656,
                  "r_x1": 124.91101654503161,
                  "r_y1": 523.3155494272656,
                  "r_x2": 124.91101654503161,
                  "r_y2": 89.12381765643227,
                  "r_x3": 100.55299576256091,
                  "r_y3": 89.12381765643227,
                  "coord_origin": "TOPLEFT"
                },
                "text": "JSON and Markdown in an easy self contained",
                "orig": "JSON and Markdown in an easy self contained",
                "text_direction": "left_to_right",
                "confidence": 1.0,
                "from_ocr": true
              }
            ],
            "children": []
          },
          "text": "Docling bundles PDF document conversion to JSON and Markdown in an easy self contained"
        }
      ]
    }
  }
 ]
@ -0,0 +1,2 @@
 <doctag><text><loc_60><loc_46><loc_424><loc_91>Docling bundles PDF document conversion to JSON and Markdown in an easy self contained package</text>
 </doctag>
@ -0,0 +1 @@
 Docling bundles PDF document conversion to JSON and Markdown in an easy self contained package
@ -0,0 +1 @@
							{"_name": "", "type": "pdf-document", "description": {"title": null, "abstract": null, "authors": null, "affiliations": null, "subjects": null, "keywords": null, "publication_date": null, "languages": null, "license": null, "publishers": null, "url_refs": null, "references": null, "publication": null, "reference_count": null, "citation_count": null, "citation_date": null, "advanced": null, "analytics": null, "logs": [], "collection": null, "acquisition": null}, "file-info": {"filename": "ocr_test_rotated.pdf", "filename-prov": null, "document-hash": "4a282813d93824eaa9bc2a0b2a0d6d626ecc8f5f380bd1320e2dd3e8e53c2ba6", "#-pages": 1, "collection-name": null, "description": null, "page-hashes": [{"hash": "f8a4dc72d8b159f69d0bc968b97f3fb9e0ac59dcb3113492432755835935d9b3", "model": "default", "page": 1}]}, "main-text": [{"prov": [{"bbox": [131.21306574279092, 74.12495603322407, 152.19606490864376, 154.19400205373182], "page": 1, "span": [0, 7], "__ref_s3_data": null}], "text": "package", "type": "paragraph", "payload": null, "name": "Text", "font": null}], "figures": [], "tables": [], "bitmaps": null, "equations": [], "footnotes": [], "page-dimensions": [{"height": 595.201171875, "page": 1, "width": 841.9216918945312}], "page-footers": [], "page-headers": [], "_s3_data": null, "identifiers": null}
@ -0,0 +1 @@
							[{"page_no": 0, "size": {"width": 841.9216918945312, "height": 595.201171875}, "cells": [{"id": 0, "text": "Docling bundles PDF document conversion to", "bbox": {"l": 77.10171546422428, "t": 89.23887398109309, "r": 96.6831586150625, "b": 520.7638577050515, "coord_origin": "TOPLEFT"}}, {"id": 1, "text": "JSON and Markdown in an easy self contained", "bbox": {"l": 100.55299576256091, "t": 89.12381765643227, "r": 124.91101654503161, "b": 523.3155494272656, "coord_origin": "TOPLEFT"}}, {"id": 2, "text": "package", "bbox": {"l": 131.21306574279092, "t": 441.0071698212682, "r": 152.19606490864376, "b": 521.0762158417759, "coord_origin": "TOPLEFT"}}], "predictions": {"layout": {"clusters": [{"id": 0, "label": "page_header", "bbox": {"l": 77.10171546422428, "t": 89.12381765643227, "r": 124.91101654503161, "b": 523.3155494272656, "coord_origin": "TOPLEFT"}, "confidence": 0.6016772389411926, "cells": [{"id": 0, "text": "Docling bundles PDF document conversion to", "bbox": {"l": 77.10171546422428, "t": 89.23887398109309, "r": 96.6831586150625, "b": 520.7638577050515, "coord_origin": "TOPLEFT"}}, {"id": 1, "text": "JSON and Markdown in an easy self contained", "bbox": {"l": 100.55299576256091, "t": 89.12381765643227, "r": 124.91101654503161, "b": 523.3155494272656, "coord_origin": "TOPLEFT"}}], "children": []}, {"id": 1, "label": "text", "bbox": {"l": 131.21306574279092, "t": 441.0071698212682, "r": 152.19606490864376, "b": 521.0762158417759, "coord_origin": "TOPLEFT"}, "confidence": 0.5234212875366211, "cells": [{"id": 2, "text": "package", "bbox": {"l": 131.21306574279092, "t": 441.0071698212682, "r": 152.19606490864376, "b": 521.0762158417759, "coord_origin": "TOPLEFT"}}], "children": []}]}, "tablestructure": {"table_map": {}}, "figures_classification": null, "equations_prediction": null, "vlm_response": null}, "assembled": {"elements": [{"label": "page_header", "id": 0, "page_no": 0, "cluster": {"id": 0, "label": "page_header", "bbox": {"l": 77.10171546422428, "t": 
89.12381765643227, "r": 124.91101654503161, "b": 523.3155494272656, "coord_origin": "TOPLEFT"}, "confidence": 0.6016772389411926, "cells": [{"id": 0, "text": "Docling bundles PDF document conversion to", "bbox": {"l": 77.10171546422428, "t": 89.23887398109309, "r": 96.6831586150625, "b": 520.7638577050515, "coord_origin": "TOPLEFT"}}, {"id": 1, "text": "JSON and Markdown in an easy self contained", "bbox": {"l": 100.55299576256091, "t": 89.12381765643227, "r": 124.91101654503161, "b": 523.3155494272656, "coord_origin": "TOPLEFT"}}], "children": []}, "text": "Docling bundles PDF document conversion to JSON and Markdown in an easy self contained"}, {"label": "text", "id": 1, "page_no": 0, "cluster": {"id": 1, "label": "text", "bbox": {"l": 131.21306574279092, "t": 441.0071698212682, "r": 152.19606490864376, "b": 521.0762158417759, "coord_origin": "TOPLEFT"}, "confidence": 0.5234212875366211, "cells": [{"id": 2, "text": "package", "bbox": {"l": 131.21306574279092, "t": 441.0071698212682, "r": 152.19606490864376, "b": 521.0762158417759, "coord_origin": "TOPLEFT"}}], "children": []}, "text": "package"}], "body": [{"label": "text", "id": 1, "page_no": 0, "cluster": {"id": 1, "label": "text", "bbox": {"l": 131.21306574279092, "t": 441.0071698212682, "r": 152.19606490864376, "b": 521.0762158417759, "coord_origin": "TOPLEFT"}, "confidence": 0.5234212875366211, "cells": [{"id": 2, "text": "package", "bbox": {"l": 131.21306574279092, "t": 441.0071698212682, "r": 152.19606490864376, "b": 521.0762158417759, "coord_origin": "TOPLEFT"}}], "children": []}, "text": "package"}], "headers": [{"label": "page_header", "id": 0, "page_no": 0, "cluster": {"id": 0, "label": "page_header", "bbox": {"l": 77.10171546422428, "t": 89.12381765643227, "r": 124.91101654503161, "b": 523.3155494272656, "coord_origin": "TOPLEFT"}, "confidence": 0.6016772389411926, "cells": [{"id": 0, "text": "Docling bundles PDF document conversion to", "bbox": {"l": 77.10171546422428, "t": 89.23887398109309, "r": 
96.6831586150625, "b": 520.7638577050515, "coord_origin": "TOPLEFT"}}, {"id": 1, "text": "JSON and Markdown in an easy self contained", "bbox": {"l": 100.55299576256091, "t": 89.12381765643227, "r": 124.91101654503161, "b": 523.3155494272656, "coord_origin": "TOPLEFT"}}], "children": []}, "text": "Docling bundles PDF document conversion to JSON and Markdown in an easy self contained"}]}}]
@ -0,0 +1,3 @@
 package

 Docling bundles PDF document conversion to JSON and Markdown in an easy self contained
@ -1,2 +1,2 @@
-<doctag><text><loc_58><loc_44><loc_426><loc_91>Docling bundles PDF document conversion to JSON and Markdown in an easy self contained package</text>
+<doctag><text><loc_60><loc_46><loc_424><loc_91>Docling bundles PDF document conversion to JSON and Markdown in an easy self contained package</text>
 </doctag>