docling

mirror of https://github.com/DS4SD/docling.git synced 2025-12-08 12:48:28 +00:00

Author	SHA1	Message	Date
Michele Dolfi	8af228f1e2	docs(examples): processing parquet file of images (#2641 ) * add example processing parquet file of images Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * vlm using vllm api Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use openvino and add more docs Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add default input file Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * change default to standard for running in CI Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use simple rapidocr without openvino in the CI example Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-11-19 06:39:25 +01:00
Michele Dolfi	da4c2e9dbe	fix: remove py3.14 requirement for default rapidocr (#2639 ) * remove py3.14 requirement for default rapidocr Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * remove easyocr Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-11-18 17:23:43 +01:00
Ryan Soliveres	d549445e78	docs: Move Installation and Quickstart (Usage) under Getting started (#2644 ) * docs: Move Installation and Quickstart (Usage) under Getting started Moved Installation and Usage (Quickstart) under Getting started section Rename installation folder to documentation folder Rename installation/index.md to documentation/installation.md Duplicate usage/index.md to documentation directory and rename it to documentation/quickstart.md Add redirection from installation and usage Signed-off-by: Ryan S <ryansoliveres@users.noreply.github.com> * docs: Move Installation and Quickstart under Getting started Signed-off-by: ryansoliveres <ryan.soliveres@yahoo.com> * docs: Move Installation and Quickstart under Getting started Signed-off-by: ryansoliveres <ryan.soliveres@yahoo.com> * git commit -m "DCO Remediation Commit for rysoliveres <ryan.soliveres@yahoo.com> I, rysoliveres <ryan.soliveres@yahoo.com>, hereby add my Signed-off-by to this commit: `b7ae13e3d8` Signed-off-by: rysoliveres <ryan.soliveres@yahoo.com>" Signed-off-by: ryansoliveres <ryan.soliveres@yahoo.com> * git commit --allow-empty -m "DCO Remediation Commit for rysoliveres <ryan.soliveres@yahoo.com> I, rysoliveres <ryan.soliveres@yahoo.com>, hereby add my Signed-off-by to this commit: `b7ae13e3d8` Signed-off-by: rysoliveres <ryan.soliveres@yahoo.com>" Signed-off-by: ryansoliveres <ryan.soliveres@yahoo.com> * DCO Remediation Commit for rysoliveres <ryan.soliveres@yahoo.com> I, rysoliveres <ryan.soliveres@yahoo.com>, hereby add my Signed-off-by to this commit: `b7ae13e3d8` Signed-off-by: rysoliveres <ryan.soliveres@yahoo.com> Signed-off-by: ryansoliveres <ryan.soliveres@yahoo.com> * DCO Remediation Commit for rysoliveres <ryan.soliveres@yahoo.com> I, rysoliveres <ryan.soliveres@yahoo.com>, hereby add my Signed-off-by to this commit: `b7ae13e3d8` Signed-off-by: rysoliveres <ryan.soliveres@yahoo.com> Signed-off-by: ryansoliveres <ryan.soliveres@yahoo.com> --------- Signed-off-by: Ryan S <ryansoliveres@users.noreply.github.com> Signed-off-by: ryansoliveres <ryan.soliveres@yahoo.com>	2025-11-18 17:09:41 +01:00
Panos Vagenas	ac9fc585bb	docs: add redirection from getting started page (#2640 ) Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>	2025-11-17 14:13:51 +01:00
Cesar Berrospi Ramis	f5528623a7	docs(examples): remove deprecation warnings with export_to_dataframe (#2638 ) fix: remove deprecation warnings with export_to_dataframe Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>	2025-11-17 12:48:41 +01:00
github-actions[bot]	d6ddf9f4cb	chore: bump version to 2.62.0 [skip ci] v2.62.0	2025-11-17 11:34:08 +00:00
Peter W. J. Staar	3495b73de8	feat: add the Image backend (#2627 ) * feat: add the Image backend Signed-off-by: Peter Staar <taa@zurich.ibm.com> * fixed the pre-commit Signed-off-by: Peter Staar <taa@zurich.ibm.com> * Fixed single- versus multi-frame image formats Signed-off-by: Peter Staar <taa@zurich.ibm.com> * fix: Proper usage of ImageDocumentBackend in the pipeline, deprecate old code. Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * fix: Adapt tests Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * fix: correct mets_gbs backend test Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * fix: Make ImagePageBackend.get_bitmap_rects() yield Signed-off-by: Christoph Auer <cau@zurich.ibm.com> --------- Signed-off-by: Peter Staar <taa@zurich.ibm.com> Signed-off-by: Christoph Auer <cau@zurich.ibm.com> Co-authored-by: Christoph Auer <cau@zurich.ibm.com>	2025-11-17 11:37:22 +01:00
Robyn Johnson	ae30373ee7	docs: combine Home and Getting Started pages (#2600 ) * Update mkdocs.yml Remove navigations.sections feature so that navigation menus will collapse & expand. They are collapsed by default. * docs: add sign-off DCO Remediation Commit for Robyn J <bobbinrobyn@users.noreply.github.com> I, Robyn J <bobbinrobyn@users.noreply.github.com>, hereby add my Signed-off-by to this commit: `b7d7441827` Signed-off-by: Robyn J <bobbinrobyn@users.noreply.github.com> * docs: Combine Home and Getting Started page Combine home and getting stated pages, and rename the page "Documentation" Signed-off-by: Robyn J <bobbinrobyn@users.noreply.github.com> --------- Signed-off-by: Robyn J <bobbinrobyn@users.noreply.github.com>	2025-11-14 13:29:25 +01:00
Peter W. J. Staar	14b436d590	fix: correct the model-repo name (#2624 ) * fix: correct the model-repo name Signed-off-by: Peter Staar <taa@zurich.ibm.com> * udated model-id Signed-off-by: Peter Staar <taa@zurich.ibm.com> * reformatted code Signed-off-by: Peter Staar <taa@zurich.ibm.com> --------- Signed-off-by: Peter Staar <taa@zurich.ibm.com>	2025-11-14 13:21:08 +01:00
Christoph Auer	4852d8b4f2	feat(experimental): Layout + VLM model with layout prompt (#2244 ) * adding granite-docling preview Signed-off-by: Peter Staar <taa@zurich.ibm.com> * updated the model specs Signed-off-by: Peter Staar <taa@zurich.ibm.com> * Add Layout+VLM pipeline with prompt injection, ApiVlmModel updates Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Update layout injection, move to experimental Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Adjust defaults Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Map Layout+VLM pipeline to GraniteDoclign Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Remove base_prompt from layout injection prompt Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Reinstate custom prompt Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * add demo_layout file that produces with vs without layout injection Signed-off-by: Peter El Hachem <peter.el.hachem@ibm.com> Signed-off-by: ElHachem02 <peterelhachem02@gmail.com> * feat: wrap vlm_inference around process_images Signed-off-by: ElHachem02 <peterelhachem02@gmail.com> * feat: carry input prompt + number of input tokens Signed-off-by: ElHachem02 <peterelhachem02@gmail.com> * fix: adapt example to run on local test file Signed-off-by: ElHachem02 <peterelhachem02@gmail.com> * fix: example now expects single document Signed-off-by: ElHachem02 <peterelhachem02@gmail.com> * feat: add layout example to EXAMPLES_TO_SKIP Signed-off-by: ElHachem02 <peterelhachem02@gmail.com> * feat: address comments on git Signed-off-by: ElHachem02 <peterelhachem02@gmail.com> * feat: add inference wrapper for hf_transformers + carry input prompt Signed-off-by: ElHachem02 <peterelhachem02@gmail.com> * Feat: add track_input_prompt to ApiVlmOptions, and track input prompt as part of api vlm Signed-off-by: ElHachem02 <peterelhachem02@gmail.com> * fix: Ensure backward-compatible build_prompt by adding _internal_page ag Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * fix: Ensure backward-compatible build_prompt by adding _internal_page ag Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Fixes for demo Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Typing fixes Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Restoring lost changes in vllm_model Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Restoring vlm_pipeline_api_model example Signed-off-by: Christoph Auer <cau@zurich.ibm.com> --------- Signed-off-by: Peter Staar <taa@zurich.ibm.com> Signed-off-by: Christoph Auer <cau@zurich.ibm.com> Signed-off-by: Peter El Hachem <peter.el.hachem@ibm.com> Signed-off-by: ElHachem02 <peterelhachem02@gmail.com> Co-authored-by: Peter Staar <taa@zurich.ibm.com> Co-authored-by: ElHachem02 <peterelhachem02@gmail.com>	2025-11-12 13:42:09 +01:00
Cesar Berrospi Ramis	054c4a634d	fix(docx): parse page headers and footers (#2599 ) * fix(docx): parse page headers and footers Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * chore(docx): rename _add_header with _add_heading To avoid confusion, rename _add_header function name with _add_heading since the function is about adding section headings. Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * chore(docx): extend the page header and footer parsing to any content type Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * chore(docx): fix _add_header_footer function Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> --------- Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>	2025-11-10 16:10:12 +01:00
github-actions[bot]	463051b852	chore: bump version to 2.61.2 [skip ci] v2.61.2	2025-11-10 11:44:59 +00:00
Panos Vagenas	5c27567c41	fix: default to EasyOCR in Python 3.14 (#2605 ) fix: default to EasyOCR in Python 3.14 Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>	2025-11-10 12:09:00 +01:00
Peter W. J. Staar	06ae8ae29a	chore: replace ds4sd with docling-project (#2596 ) Signed-off-by: Peter Staar <taa@zurich.ibm.com>	2025-11-07 11:25:56 +01:00
github-actions[bot]	c21327cd74	chore: bump version to 2.61.1 [skip ci] v2.61.1	2025-11-06 05:19:20 +00:00
Cesar Berrospi Ramis	ef623ffcee	fix(docx): slow table parsing (#2553 ) * chore(docx): remove unnecessary import Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * fix(docx): simplify parsing of simple tables Simplify the parsing of tables with just text (no rich cells). Move nested function group_cell_elements out of _handle_tables for readability. Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * chore(docx): reuse method for finding inline pictures Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * chore(docx): format strikethrough text Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * tests(docx): use fixtures to avoid converting same file multiple times Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * fix(docx): remove unnecessary argument docx_obj in functions Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * tests(docx): add test for rich table cells Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * chore(docx): small improvements in backend and its unit tests Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * chore(docx): parse superscript and subscript formatted text Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> --------- Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>	2025-11-06 05:25:53 +01:00
Cesar Berrospi Ramis	0ba8d5d9e3	fix(html): slow table parsing (#2582 ) * fix(html): simplify parsing of simple table cells Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * tests(html): add test for rich table cells Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * fix(html): ensure table cells with formatted text are parsed as RichTableCell Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * refactor(html): simplify process_rich_table_cells since only rich cells are processed Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * fix(html): formatted cell runs should be parsed as text items respecting the order Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * chore: pin latest docling-core and update uv.lock Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * chore: upgrade dependencies on uv.lock Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> --------- Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>	2025-11-06 05:25:36 +01:00
Robyn Johnson	8da3d287ed	docs: make navigation menus collapse and expand (#2573 ) * Update mkdocs.yml Remove navigations.sections feature so that navigation menus will collapse & expand. They are collapsed by default. * docs: add sign-off DCO Remediation Commit for Robyn J <bobbinrobyn@users.noreply.github.com> I, Robyn J <bobbinrobyn@users.noreply.github.com>, hereby add my Signed-off-by to this commit: `b7d7441827` Signed-off-by: Robyn J <bobbinrobyn@users.noreply.github.com> --------- Signed-off-by: Robyn J <bobbinrobyn@users.noreply.github.com>	2025-11-06 05:25:19 +01:00
github-actions[bot]	0ccc0a3245	chore: bump version to 2.61.0 [skip ci] v2.61.0	2025-11-06 04:25:06 +00:00
Panos Vagenas	fa925741b6	fix: temporarily pin NuExtract to working revision (#2588 ) * fix: temporarily pin NuExtract revision NuExtract rev 489efed was causing MPS errors Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com> * Revise revision comment for NuExtract transformer Updated revision comment for NU_EXTRACT_2B_TRANSFORMERS. Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com> * pass revision to model download Signed-off-by: Panos Vagenas <pva@zurich.ibm.com> --------- Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com> Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>	2025-11-05 21:23:12 +01:00
peets	6a04e27352	feat(vlm): track generated tokens and stop reasons for VLM models (#2543 ) * feat: add enum StopReason and use it in VlmPrediction Signed-off-by: ElHachem02 <peterelhachem02@gmail.com> * add vlm_inference time for api calls and track stop reason Signed-off-by: ElHachem02 <peterelhachem02@gmail.com> * fix: rename enum to VlmStopReason Signed-off-by: ElHachem02 <peterelhachem02@gmail.com> * Propagate partial success status if page reaches max tokens Signed-off-by: ElHachem02 <peterelhachem02@gmail.com> * feat: page with generation stopped by loop detector create partial success status Signed-off-by: Peter El Hachem <peter.el.hachem@ibm.com> * Add hint for future improvement Signed-off-by: Peter El Hachem <peter.el.hachem@ibm.com> * fix: remove vlm_stop_reason from extracted page data, add UNSPECIFIED state as VlmStopReason to avoid null value Signed-off-by: Peter El Hachem <peter.el.hachem@ibm.com> --------- Signed-off-by: ElHachem02 <peterelhachem02@gmail.com> Signed-off-by: Peter El Hachem <peter.el.hachem@ibm.com> Co-authored-by: Peter El Hachem <peter.el.hachem@ibm.com>	2025-11-04 19:39:09 +01:00
정물결	1a5146abc9	fix(ocr): use PSM integer values directly instead of constructor (#2578 ) * fix(ocr): use PSM integer values directly instead of constructor - Use integer psm value directly instead of calling tesserocr.PSM() - Fixed in both main_psm and script_readers initialization - tesserocr.PSM is a class with integer constants, not an enum Fixes #2576 * DCO Remediation Commit for mulgyeol <mulgyeoljung@gmail.com> I, mulgyeol <mulgyeoljung@gmail.com>, hereby add my Signed-off-by to this commit: `da63a17a3c` Signed-off-by: mulgyeol <mulgyeoljung@gmail.com> --------- Signed-off-by: mulgyeol <mulgyeoljung@gmail.com>	2025-11-04 19:32:41 +01:00
github-actions[bot]	32a5aed5ea	chore: bump version to 2.60.1 [skip ci] v2.60.1	2025-11-04 11:26:12 +00:00
Panos Vagenas	0e1b0bd816	chore: switch print statements to debug logging (#2569 ) Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>	2025-11-04 11:32:39 +01:00
Johannes Damp	fb737d026e	chore: fix malformed f-string (#2563 ) * fix: incorrect f-string in docling.datamodel.document * DCO Remediation Commit for Johannes Damp <jdamp@users.noreply.github.com> I, Johannes Damp <jdamp@users.noreply.github.com>, hereby add my Signed-off-by to this commit: `0f690a863a` Signed-off-by: Johannes Damp <jdamp@users.noreply.github.com> --------- Signed-off-by: Johannes Damp <jdamp@users.noreply.github.com>	2025-11-04 11:01:26 +01:00
peets	8360aa5449	fix: extract response from api_image_request in picture description (#2571 ) Signed-off-by: Peter El Hachem <peter.el.hachem@ibm.com> Co-authored-by: Peter El Hachem <peter.el.hachem@ibm.com>	2025-11-04 08:39:15 +01:00
github-actions[bot]	3467b0a035	chore: bump version to 2.60.0 [skip ci] v2.60.0	2025-10-31 14:43:29 +00:00
Michele Dolfi	268d027c8f	feat: Use threading in the standard pipeline and move old behavior to legacy (#2452 ) * rename standard to legacy Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * remove old standard pipeline Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * move threaded to standard Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add backwards compatible threaded pipeline Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * Updates for threaded pipeline to lower memory requirements Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * updating deps seem to remove the corrupted double-linked list error Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * update pinning Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use main lock Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add more threadsafe blocks Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * rename batch_timeout_seconds Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> Signed-off-by: Christoph Auer <cau@zurich.ibm.com> Co-authored-by: Christoph Auer <cau@zurich.ibm.com>	2025-10-31 14:42:11 +01:00
Welteam	01577e92d1	docs: Update link to Open WebUI docs (#2549 ) Fix dead link to Open WebUI docs Signed-off-by: Welteam <8932313+Welteam@users.noreply.github.com>	2025-10-31 13:21:11 +01:00
Michele Dolfi	cb100437fa	docs: Update installation options with extras and review FAQ (#2548 ) * revise install docs Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add more FAQ Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-10-31 13:21:01 +01:00
Yasir Ali	741c44fa45	docs: fix typos (#2546 ) docs: fix typos in enrichments.md ('analize' -> 'analyze', 'consise' -> 'concise') Signed-off-by: Yasir Ali <engr23002@gmail.com>	2025-10-31 10:29:34 +01:00
Michele Dolfi	a51275d080	fix(pdf): threadsafe for pypdfium2 backend (#2527 ) * add threadsafe test Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * test backend Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * test threaded pipeline Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add test_pypdfium_threaded_pipeline Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add more threadsafe blocks Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix threadsafe in pypdfium backend Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * remove unneccessary tests Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * restore clean test Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-10-30 17:58:39 +01:00
github-actions[bot]	d27fe92e01	chore: bump version to 2.59.0 [skip ci] v2.59.0	2025-10-30 13:05:56 +00:00
Michele Dolfi	97aa06bfbc	docs: Add details and examples on optimal GPU setup (#2531 ) * docs for GPU optimizations Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * improve time reporting and improve execution Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix standard pipeline Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * tune examples with batch size 64 Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add benchmark results Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * improve docs Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * typo in excluded tests Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * explicit pipeline in table Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-10-30 13:22:05 +01:00
glypt	d9c90eb45e	fix: xlsx cell parsing, now returning values instead of formulas (#2520 ) * fix: xlsx doc parsing, now returning values instead of formulas Signed-off-by: glypt <8trash-can8@protonmail.ch> * fix: add test for better coverage of xlsx backend Signed-off-by: glypt <8trash-can8@protonmail.ch> * fix: add the total of ducks as a formula in the tests/data This also adds the test that the value 310 is contained in the table. Without the fix from the previous commit, it would return "B7+C7" Signed-off-by: glypt <8trash-can8@protonmail.ch> --------- Signed-off-by: glypt <8trash-can8@protonmail.ch>	2025-10-29 11:35:51 +01:00
peets	b6c892b505	feat(vlm): add num_tokens as attribtue for VlmPrediction (#2489 ) * feat: add num_tokens as attribtue for VlmPrediction * feat: implement tokens tracking for api_vlm Signed-off-by: Peter El Hachem <peter.el.hachem@ibm.com> * DCO Remediation Commit for ElHachem02 <peterelhachem02@gmail.com> I, ElHachem02 <peterelhachem02@gmail.com>, hereby add my Signed-off-by to this commit: `311287f562` Signed-off-by: Peter El Hachem <peter.el.hachem@ibm.com> * DCO Remediation Commit for ElHachem02 <peterelhachem02@gmail.com> I, ElHachem02 <peterelhachem02@gmail.com>, hereby add my Signed-off-by to this commit: `311287f562` Signed-off-by: ElHachem02 <peterelhachem02@gmail.com> * update return type Signed-off-by: ElHachem02 <peterelhachem02@gmail.com> * add time recorder for vlm inference and track generated token ids depending on config Signed-off-by: ElHachem02 <peterelhachem02@gmail.com> * update num_tokens to have None as value on exception Signed-off-by: ElHachem02 <peterelhachem02@gmail.com> * set default value of num_tokens to None Signed-off-by: ElHachem02 <peterelhachem02@gmail.com> --------- Signed-off-by: Peter El Hachem <peter.el.hachem@ibm.com> Signed-off-by: ElHachem02 <peterelhachem02@gmail.com> Signed-off-by: peets <100425207+ElHachem02@users.noreply.github.com> Co-authored-by: Peter El Hachem <peter.el.hachem@ibm.com>	2025-10-28 17:18:44 +01:00
Michele Dolfi	cdffb47b9a	feat: Support for Python 3.14 (#2530 ) * fix dependencies for py314 Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add metadata and CI tests Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add back gliner Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * update error message about python 3.14 availability Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * skip tests which cannot run on py 3.14 Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix lint Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * remove vllm from py 3.14 deps Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * safe import for vllm Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * update lock Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * remove torch.compile() Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * update checkbox results after docling-core changes Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * cannot run mlx example in CI Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add test for rapidocr backends and skip onnxruntime on py3.14 Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix other occurances of torch.compile() Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * allow torch.compile for Python <3.14. proper support will be introduced with new torch releases Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-10-28 14:32:15 +01:00
Cesar Berrospi Ramis	9a6fdf936b	docs: update opensearch notebook and backend documentation (#2519 ) * docs(opensearch): update the example notebook RAG with OpenSearch Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * docs(uspto): remove direct usage of the backend class for conversion Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * docs: remove direct usage of backends from documentation Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> --------- Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>	2025-10-27 10:02:50 +01:00
github-actions[bot]	10c1f06b74	chore: bump version to 2.58.0 [skip ci] v2.58.0	2025-10-22 11:31:29 +00:00
Michele Dolfi	bbe82a68d0	feat(pdf): Support for password-protected PDF documents (#2499 ) * add test and example for PDF with password Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use docling-parse with new password feature Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add pdfbackendoptions Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * generalize backend_options and add PdfBackendOptions Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add pdf-password option Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * update exception test Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix docs description Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-10-22 12:48:01 +02:00
Michele Dolfi	89820d01b5	perf: use docling-parse-v4 as default (#2503 ) use doclnig-parse-v4 as default Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-10-21 17:55:43 +02:00
McGuireMark	86556d8367	docs: fix typo in mcp.md (#2502 ) Update mcp.md Typo fix Signed-off-by: McGuireMark <mark.mcguire@nimblegravity.com>	2025-10-21 17:31:28 +02:00
Cesar Berrospi Ramis	4227fcc3e1	fix(markdown): set the correct discriminator in md backend options (#2501 ) Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>	2025-10-21 14:30:48 +02:00
Legoshi	a30e6a7614	feat(backend): add generic options support and HTML image handling modes (#2011 ) * feat: add backend options support to document backends Co-authored-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> Signed-off-by: Leg0shii <dragonsaremyfavourite@gmail.com> Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * feat: enhance document backends with generic backend options and improve HTML image handling Co-authored-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> Signed-off-by: Leg0shii <dragonsaremyfavourite@gmail.com> Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * Refactor tests for declarativebackend Co-authored-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> Signed-off-by: Leg0shii <dragonsaremyfavourite@gmail.com> Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * fix(HTML): improve image caption handling and ensure backend options are set correctly Co-authored-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> Signed-off-by: Leg0shii <dragonsaremyfavourite@gmail.com> Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * fix: enhance HTML backend image handling and add support for local file paths Co-authored-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> Signed-off-by: Leg0shii <dragonsaremyfavourite@gmail.com> Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * chore: Add ground truth data for test data Co-authored-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> Signed-off-by: Leg0shii <dragonsaremyfavourite@gmail.com> Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * fix(HTML): skip loading SVG files in image data handling Co-authored-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> Signed-off-by: Leg0shii <dragonsaremyfavourite@gmail.com> Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * refactor(html): simplify backend options and address gaps Backend options for DeclarativeDocumentBackend classes and only when necessary. Refactor caption parsing in 'img' elements and remove dummy text. Replace deprecated annotations from Typing library with native types. Replace typing annotations according to pydantic guidelines. Some documentation with pydantic annotations. Fix diff issue with test files. Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * tests(html): add tests and fix bugs Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * refactor(html): refactor backend options Move backend option classes to its own module within datamodel package. Rename 'source_location' with 'source_uri' in HTMLBackendOptions. Rename 'image_fetch' with 'fetch_images' in HTMLBackendOptions. Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * refactor(markdown): create a class for the markdown backend options Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> --------- Signed-off-by: Leg0shii <dragonsaremyfavourite@gmail.com> Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> Co-authored-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>	2025-10-21 12:52:17 +02:00
Richard (Huangrui) Chu	b66624bfff	fix(xlsx): speed up by detecting the true last non-empty row/column (#2404 ) * Update msexcel_backend.py Fix #2307, Follow the instruction of https://github.com/docling-project/docling/issues/2307#issuecomment-3327248503. Signed-off-by: Richard (Huangrui) Chu <65276824+HuangruiChu@users.noreply.github.com> * Update msexcel_backend.py Fix error Signed-off-by: Richard (Huangrui) Chu <65276824+HuangruiChu@users.noreply.github.com> * Fix linting issues Signed-off-by: Richard (Huangrui) Chu <65276824+HuangruiChu@users.noreply.github.com> * Add test files and data (Signed-off-by: Huangrui Chu <huangrui.chu.1999@gmail.com>) Signed-off-by: Richard (Huangrui) Chu <65276824+HuangruiChu@users.noreply.github.com> * resolve conflict with test_backend_msexecl; update the boundary Signed-off-by: Richard (Huangrui) Chu <65276824+HuangruiChu@users.noreply.github.com> * chore(xlsx): use a dataclass to represent a bounding rectangle in worksheets Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * chore(xlsx): increase parsing speed by iterating on 'sheet._cells' Increase the parsing speed of the spreadsheet backend by iterating on 'sheets._cells' since this is proportional to the number of created cells. Rename test file to align it to other test files. Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> --------- Signed-off-by: Richard (Huangrui) Chu <65276824+HuangruiChu@users.noreply.github.com> Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> Co-authored-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>	2025-10-21 08:08:20 +02:00
Ken Steele	657ce8b01c	feat(ASR): MLX Whisper Support for Apple Silicon (#2366 ) * add mlx-whisper support * added mlx-whisper example and test. update docling cli to use MLX automatically if present. * fix pre-commit checks and added proper type safety * fixed linter issue * DCO Remediation Commit for Ken Steele <ksteele@gmail.com> I, Ken Steele <ksteele@gmail.com>, hereby add my Signed-off-by to this commit: a979a680e1dc2fee8461401335cfb5dda8cfdd98 I, Ken Steele <ksteele@gmail.com>, hereby add my Signed-off-by to this commit: 9827068382ca946fe1387ed83f747ae509fcf229 I, Ken Steele <ksteele@gmail.com>, hereby add my Signed-off-by to this commit: ebbeb45c7dc266260e1fad6bdb54a7041f8aeed4 I, Ken Steele <ksteele@gmail.com>, hereby add my Signed-off-by to this commit: 2f6fd3cf46c8ca0bb98810191578278f1df87aa3 Signed-off-by: Ken Steele <ksteele@gmail.com> * fix unit tests and code coverage for CI * DCO Remediation Commit for Ken Steele <ksteele@gmail.com> I, Ken Steele <ksteele@gmail.com>, hereby add my Signed-off-by to this commit: 5e61bf11139a2133978db2c8d306be6289aed732 Signed-off-by: Ken Steele <ksteele@gmail.com> * fix CI example test - mlx_whisper_example.py defaults to tests/data/audio/sample_10s.mp3 if no args specified. Signed-off-by: Ken Steele <ksteele@gmail.com> * refactor: centralize audio file extensions and MIME types in base_models.py - Move audio file extensions from CLI hardcoded set to FormatToExtensions[InputFormat.AUDIO] - Add support for additional audio formats: m4a, aac, ogg, flac, mp4, avi, mov - Update FormatToMimeType mapping to include MIME types for all audio formats - Update CLI auto-detection to use centralized FormatToExtensions mapping - Add comprehensive tests for audio file auto-detection and pipeline selection - Ensure explicit pipeline choices are not overridden by auto-detection Fixes issue where only .mp3 and .wav files were processed as audio despite CLI auto-detection working for all formats. The document converter now properly recognizes all audio formats through MIME type detection. Addresses review comments: - Centralizes audio extensions in base_models.py as suggested - Maintains existing auto-detection behavior while using centralized data - Adds proper test coverage for the audio detection functionality All examples and tests pass with the new centralized approach. All audio formats (mp3, wav, m4a, aac, ogg, flac, mp4, avi, mov) now work correctly. Signed-off-by: Ken Steele <ksteele@gmail.com> * feat: address reviewer feedback - improve CLI auto-detection and add explicit model options Review feedback addressed: 1. Fix CLI auto-detection to only switch to ASR pipeline when ALL files are audio - Previously switched if ANY file was audio, now requires ALL files to be audio - Added warning for mixed file types with guidance to use --pipeline asr 2. Add explicit WHISPER_X_MLX and WHISPER_X_NATIVE model options - Users can now force specific implementations if desired - Auto-selecting models (WHISPER_BASE, etc.) still choose best for hardware - Added 12 new explicit model options: _MLX and _NATIVE variants for each size CLI now supports: - Auto-selecting: whisper_tiny, whisper_base, etc. (choose best for hardware) - Explicit MLX: whisper_tiny_mlx, whisper_base_mlx, etc. (force MLX) - Explicit Native: whisper_tiny_native, whisper_base_native, etc. (force native) Addresses reviewer comments from @dolfim-ibm Signed-off-by: Ken Steele <ksteele@gmail.com> * DCO Remediation Commit for Ken Steele <ksteele@gmail.com> I, Ken Steele <ksteele@gmail.com>, hereby add my Signed-off-by to this commit: `c60e72d2b5` I, Ken Steele <ksteele@gmail.com>, hereby add my Signed-off-by to this commit: `94803317a3` I, Ken Steele <ksteele@gmail.com>, hereby add my Signed-off-by to this commit: `21905e8ace` I, Ken Steele <ksteele@gmail.com>, hereby add my Signed-off-by to this commit: `96c669d155` I, Ken Steele <ksteele@gmail.com>, hereby add my Signed-off-by to this commit: `8371c060ea` Signed-off-by: Ken Steele <ksteele@gmail.com> * test(asr): add coverage for MLX options, pipeline helpers, and VLM prompts - tests/test_asr_mlx_whisper.py: verify explicit MLX options (framework, repo ids) - tests/test_asr_pipeline.py: cover _has_text/_determine_status and backend support with proper InputDocument/NoOpBackend wiring - tests/test_interfaces.py: add BaseVlmPageModel.formulate_prompt tests (RAW/NONE/CHAT, invalid style), with minimal InlineVlmOptions scaffold Improves reliability of ASR and VLM components by validating configuration paths and helper logic. Signed-off-by: Ken Steele <ksteele@gmail.com> * test(asr): broaden coverage for model selection, pipeline flows, and VLM prompts - tests/test_asr_mlx_whisper.py - Add MLX/native selector coverage across all Whisper sizes - Validate repo_id choices under MLX and Native paths - Cover fallback path when MPS unavailable and mlx_whisper missing - tests/test_asr_pipeline.py - Relax silent-audio assertion to accept PARTIAL_SUCCESS or SUCCESS - Force CPU native path in helper tests to avoid torch in device selection - Add language handling tests for native/MLX transcribe - Cover native run success (BytesIO) and failure (exception) branches - Cover MLX run success/failure branches with mocked transcribe - Add init path coverage with artifacts_path - tests/test_interfaces.py - Add focused VLM prompt tests (NONE/CHAT variants) Result: all tests passing with significantly improved coverage for ASR model selectors, pipeline execution paths, and VLM prompt formulation. Signed-off-by: Ken Steele <ksteele@gmail.com> * simplify ASR model settings (no pipeline detection needed) Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * clean up disk space in runners Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Ken Steele <ksteele@gmail.com> Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>	2025-10-21 08:05:59 +02:00
Michele Dolfi	a5af082d82	chore: fix parsing of release body message (#2498) Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-10-20 13:41:35 +02:00
Michele Dolfi	5be856fbc0	chore: add action posting to discord (#2486) * add action posting to discord Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * test on push Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * with icon Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * remove testing Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-10-17 16:31:57 +02:00
Michele Dolfi	dd03b53117	docs: discord badge with join link (#2473) * add discord link Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * Add Discord link to social section in mkdocs.yml Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com> * Add Discord link to getting started documentation Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com> Co-authored-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>	2025-10-16 10:13:50 +02:00
Michele Dolfi	1762bb8762	chore: update lock (#2468) update lock Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-10-15 20:35:49 +02:00

751 Commits