Commit Graph

  • 4bcd483d2d
    Fix typos stephencox-ict 2025-07-15 13:53:58 +1200
  • a436be7367
    feat: Add option to control empty clusters in layout postprocessing (#1940) Christoph Auer 2025-07-14 18:32:01 +0200
  • 3e71e6fc6e Add option to control empty clusters in layout postprocessing Christoph Auer 2025-07-14 12:20:37 +0200
  • 130a10e2d9 Add multi-page TIFF test data and verification tests copilot-swe-agent[bot] 2025-07-14 08:23:50 +0000
  • db1daf91f5
    DCO Remediation Commit for William Easton <bill.easton@elastic.co> William Easton 2025-07-12 20:13:06 -0500
  • df5c15195b
    Support hierarchical markdown William Easton 2025-07-12 20:10:27 -0500
  • be5d0f71a3 DCO Remediation Commit for Gustavo Lima <crymerom@gmail.com> Gustavo Lima 2025-07-11 16:42:52 +0100
  • 84e45d7191 docs: fix README getting started to reflect single document conversion Gustavo Lima 2025-07-11 16:36:49 +0100
  • b6765b0c09
    Merge 117add0396 into 95e70962f1 benichou 2025-07-11 12:13:33 +0200
  • 6e0b3dcaf1 Remove pointless test Christoph Auer 2025-07-11 10:27:51 +0200
  • 05f51b30d9 add RGB conversion Christoph Auer 2025-07-11 10:26:43 +0200
  • 6aa85cc933 Merge branch 'main' of github.com:DS4SD/docling into copilot/fix-1903 Christoph Auer 2025-07-11 10:21:21 +0200
  • 95e70962f1
    fix: KeyError: 'fPr' when processing latex fractions in DOCX files (#1926) Copilot 2025-07-11 09:52:14 +0200
  • 20d2ff8ee0 Relax accelerate min version Christoph Auer 2025-07-11 09:51:13 +0200
  • 1c54af1a94 Add accelerate Christoph Auer 2025-07-11 09:47:40 +0200
  • c8e9dc4c86 Merge branch 'main' of github.com:DS4SD/docling into cau/thread-safety-fixes-again Christoph Auer 2025-07-11 09:46:36 +0200
  • c5fb353f10
    fix: Change granite vision model URL from preview to stable version (#1925) Copilot 2025-07-11 08:46:03 +0200
  • c1d722725f Fix multi-page TIFF image support copilot-swe-agent[bot] 2025-07-10 15:04:28 +0000
  • 420df478f3 Initial plan copilot-swe-agent[bot] 2025-07-10 14:48:19 +0000
  • 6c4bf9d087 chore: bump version to 2.41.0 [skip ci] v2.41.0 github-actions[bot] 2025-07-10 14:25:05 +0000
  • 9d0c200573
    Update to granite vision 3.3 (2) Christoph Auer 2025-07-10 16:24:09 +0200
  • b16b9f5d27
    Update to granite vision 3.3 Christoph Auer 2025-07-10 16:23:52 +0200
  • f4c1836c96 functional working two-stage, need to implement a good prompt now to leverage bounding boxes dev/add-two-stage-vlm Peter Staar 2025-07-10 16:15:54 +0200
  • b2d5c783ae working two-stage vlm approach from the cli Peter Staar 2025-07-10 15:38:15 +0200
  • e85bcb8d0b Use debug logging, remove unnecesary test Christoph Auer 2025-07-10 15:33:55 +0200
  • 8f3c6ebe3c Merge branch 'main' of github.com:DS4SD/docling into copilot/fix-1915 Christoph Auer 2025-07-10 15:32:42 +0200
  • cc6193b3b9
    test: Update tests to use default PDF backend (DPv4) (#1923) Christoph Auer 2025-07-10 15:16:56 +0200
  • fb74d0c5b3 working TwoStageVlmModel Peter Staar 2025-07-10 15:11:53 +0200
  • c72572b36c Add comprehensive test for OMML fraction fPr fix copilot-swe-agent[bot] 2025-07-10 13:11:52 +0000
  • be63114d43 Fix granite vision model URL from preview to stable version copilot-swe-agent[bot] 2025-07-10 13:10:34 +0000
  • 7bd828f121 Initial analysis and fix for KeyError: 'fPr' in OMML fraction processing copilot-swe-agent[bot] 2025-07-10 13:06:21 +0000
  • d02d2772db Initial plan copilot-swe-agent[bot] 2025-07-10 13:01:54 +0000
  • ff5b560a3f Initial plan copilot-swe-agent[bot] 2025-07-10 13:01:18 +0000
  • 460b247b66 OCR tests use DPv1 until rotation bugs are fixed Christoph Auer 2025-07-10 11:39:22 +0200
  • 8a9228a9a2 Update tests to use default PDF backend (DPv4) Christoph Auer 2025-07-10 10:38:34 +0200
  • b2336830eb fixed the circular dependenciea Peter Staar 2025-07-10 10:35:47 +0200
  • 70872e6539 merged with main and refactored the code to fix MyPy Peter Staar 2025-07-10 09:58:06 +0200
  • e596143bf8 Merge branch 'main' into dev/add-two-stage-vlm Peter Staar 2025-07-10 06:52:31 +0200
  • 0f395688b8 refactored the code and added vlm2stage as a cli option Peter Staar 2025-07-10 06:48:34 +0200
  • fb900115ee
    Merge 7b4a4457e8 into 2b8616d6d5 Clément Doumouro 2025-07-10 04:37:57 +0000
  • 2b8616d6d5
    feat: Layout model specification and multiple choices (#1910) Christoph Auer 2025-07-10 06:37:27 +0200
  • 34059a8021 Pull v1 changed test GT from main Christoph Auer 2025-07-09 18:08:05 +0200
  • 4175344d3d Merge from main, update tests Christoph Auer 2025-07-09 17:11:08 +0200
  • 7b4a4457e8 fix(layout,table): update e2e test Clément Doumouro 2025-07-09 17:00:31 +0200
  • bba05d1c37 fix(layout,table): orientation-aware layout and table detection Clément Doumouro 2025-07-09 13:17:42 +0200
  • 8ffa01bc9f fix(layout,table): orientation-aware layout and table detection Clément Doumouro 2025-07-04 10:12:36 +0200
  • a47fd8372d fix(ocr): refactor rotation utilities Clément Doumouro 2025-04-08 17:28:06 +0200
  • c0f170bb72 fix(ocr): move bounding bow rotation util to orientation.py Clément Doumouro 2025-04-08 15:08:24 +0200
  • 389e2389e7 fix(ocr): fix layout debug Clément Doumouro 2025-04-08 10:54:19 +0200
  • b54eeb185f fix(ocr): rotate image to the natural orientation before layout prediction Clément Doumouro 2025-04-04 17:31:45 +0200
  • 05123c9342 Use device_map for transformer models Christoph Auer 2025-07-09 16:49:21 +0200
  • ec588df971
    feat: enable precision control in float serialization (#1914) Panos Vagenas 2025-07-09 16:39:17 +0200
  • 892d103483 repin docling-core Panos Vagenas 2025-07-09 14:41:43 +0200
  • 6c8a2a67cb update test float precision Panos Vagenas 2025-07-09 14:14:37 +0200
  • 785c6b37f5 parametrize float serialization, propagate core updates Panos Vagenas 2025-07-09 13:49:11 +0200
  • b16b4ea069 chore: propagate precision control in float serialization Panos Vagenas 2025-07-09 10:50:06 +0200
  • 43cae7be7c fix deps issue with openai-whipser>numba>llvmlite Christoph Auer 2025-07-09 10:28:32 +0200
  • 09f618e90c Merge from main Christoph Auer 2025-07-09 09:27:28 +0200
  • dcf6fd6a41 fixed the MyPy complaining Peter Staar 2025-07-09 06:48:03 +0200
  • 5ad5fc36ee docs: add documentation for confidence scores Fabiano Franz 2025-07-08 17:39:54 -0300
  • 931eb55b88
    fix(ocr-utils): unit test and fix the rotate_bounding_box function (#1897) Clément Doumouro 2025-07-08 18:03:29 +0200
  • 9ed842ce6d Back to uppercase constants Christoph Auer 2025-07-08 16:41:40 +0200
  • c10e2920a4 refactoring redundant code and fixing mypy errors Peter Staar 2025-07-08 16:37:20 +0200
  • b5479ab971 working on MyPy Peter Staar 2025-07-08 15:05:54 +0200
  • 49e9a00c05 merged in layout-model-spec Peter Staar 2025-07-08 13:29:30 +0200
  • 517230b9c4 Updated naming Christoph Auer 2025-07-08 13:07:56 +0200
  • 5a794392e2
    Speed up function _parse_orientation by 242% Here’s how you should rewrite the code for **maximum speed** based on your profiler. codeflash-ai[bot] 2025-07-08 09:35:38 +0000
  • af0461e5b1 Move to pipeline_options.layout_options.model Christoph Auer 2025-07-08 11:24:06 +0200
  • f2094f858b Establish layout_model spec and example instantations Christoph Auer 2025-07-08 10:23:18 +0200
  • 810446c8dc feat: working on a two stage VLM model Peter Staar 2025-07-08 09:49:39 +0200
  • 5e1e82ab3b Add ability to preprocess VLM response Shkarupa Alex 2025-07-08 09:29:22 +0300
  • 3b8deae9ce
    Speed up method LayoutPostprocessor._process_special_clusters by 653% Here are targeted optimizations based on the profiling output and the code. codeflash-ai[bot] 2025-07-08 05:43:53 +0000
  • 4eceefa47c feat: add TwoStageVlmModel Peter Staar 2025-07-08 07:38:48 +0200
  • a07ba863c4
    feat: add image-text-to-text models in transformers (#1772) geoHeil 2025-07-08 05:54:57 +0200
  • 218d7d4a85 add prompt style and examples Michele Dolfi 2025-07-07 20:08:10 +0200
  • 721916e22c Merge remote-tracking branch 'origin/main' into feat/add-dolphin Michele Dolfi 2025-07-07 18:41:34 +0200
  • e25873d557
    fix: docs are missing osd packages for tesseract on RHEL (#1905) VIktor Kuropiantnyk 2025-07-07 17:06:26 +0200
  • b8813eea80
    feat(vlm): Dynamic prompts (#1808) Shkarupa Alex 2025-07-07 17:58:42 +0300
  • edd4356aac
    fix: use only backend for picture classifier (#1904) Michele Dolfi 2025-07-07 16:23:16 +0200
  • a60141eca3 Fixed missing packages in the docs on tesseract Viktor Kuropiatnyk 2025-07-07 16:10:16 +0200
  • 91dedc3b63 Merge branch 'vlm-dynamic-prompt' of https://github.com/shkarupa-alex/docling into vlm-dynamic-prompt Shkarupa Alex 2025-07-07 17:09:11 +0300
  • 4c916d65fe Swap inference engine to LM Studio Shkarupa Alex 2025-07-07 17:04:29 +0300
  • b23925a74e
    Use lmstudio-community model Christoph Auer 2025-07-07 14:57:38 +0200
  • 8f858c89ef use backend for picture classifier Michele Dolfi 2025-07-07 14:47:02 +0200
  • adace463f3 fix(ocr-utils): unit test and fix the rotate_bounding_box function Clément Doumouro 2025-07-04 09:51:14 +0200
  • a1df985ef4 DCO Remediation Commit for Shkarupa Alex <shkarupa.alex@gmail.com> Shkarupa Alex 2025-07-07 14:30:31 +0300
  • 5d209f5db4 Merge branch 'vlm-dynamic-prompt' of https://github.com/shkarupa-alex/docling into vlm-dynamic-prompt Shkarupa Alex 2025-07-07 14:29:20 +0300
  • 865546a0cc Sign-off Shkarupa Alex 2025-07-07 14:27:56 +0300
  • 347834d757
    Fix example HF repo link Christoph Auer 2025-07-07 13:20:22 +0200
  • 1a162066dd Replace Page with SegmentedPage Shkarupa Alex 2025-07-07 12:41:22 +0300
  • dd8fde7f19
    fix: typo in asr options (#1902) Michele Dolfi 2025-07-07 08:59:14 +0200
  • ea2cf8dbd5 fix typo Michele Dolfi 2025-07-07 07:08:22 +0200
  • f4a1c06937 chore: bump version to 2.40.0 [skip ci] v2.40.0 github-actions[bot] 2025-07-04 15:31:36 +0000
  • ec6cf6f7e8
    feat: Introduce LayoutOptions to control layout postprocessing behaviour (#1870) Christoph Auer 2025-07-04 15:36:13 +0200
  • 6ddedfbbfb Resolve conflicts Christoph Auer 2025-07-04 14:48:50 +0200
  • 598c9c53d4
    fix: Secure torch model inits with global locks (#1884) Christoph Auer 2025-07-04 07:27:26 +0200
  • 13865c06f5
    perf(msexcel): _find_table_bounds use iter_rows/iter_cols instead of Worksheet.cell (#1875) Qiefan Jiang 2025-07-03 19:12:06 +0800
  • 0b69608dbf DCO Remediation Commit for Qiefan Jiang <jiangqiefan@bytedance.com> Qiefan Jiang 2025-07-03 10:50:01 +0800
  • b6b5b090a9 fix lint Qiefan Jiang 2025-07-03 10:27:52 +0800
  • c0ef74d9cc fix: Secure torch model inits with global locks Christoph Auer 2025-07-02 17:20:38 +0200