Commit Graph

  • 6a432d9115 fix/ran poetry run pre-commit run --all-files to format the file Signed-off-by: Franck Benichou <franck.benichou@sciencespo.fr> Benichou 2025-05-14 15:35:50 -0400
  • 7468137c55 fix/removed generate=True in test_backend_pptx.py in verify_export method to not conflict with main branch Signed-off-by: Franck Benichou <franck.benichou@sciencespo.fr> Benichou 2025-05-13 20:46:08 -0400
  • a5e8c2d1be fix/adding the missing slide size argument in the handle pictures in the mspowerpoint_backend.py file and adding generate=True in the verify export method in the pytest for pptx to ensure the pytest passes appropriately Signed-off-by: Franck Benichou <franck.benichou@sciencespo.fr> Benichou 2025-05-13 20:34:56 -0400
  • 30cfaaf39f fix: run poetry pre-commit all files to black format changes Signed-off-by: Franck Benichou <franck.benichou@sciencespo.fr> Benichou 2025-04-14 22:43:44 -0400
  • 45eb3e79f7 fix/implementing the capture of pptx_image with the same method from docx backend by extracting the drawing blip Benichou 2025-04-08 11:33:52 -0400
  • d8873aa0c9 fix/adding a commit with a signature Signed-off-by: Franck Benichou <franck.benichou@sciencespo.fr> Benichou 2025-04-08 01:00:12 -0400
  • f6d4e67559 fix/implementing the capture of pptx_image with the same method from docx backend by extracting the drawing blip Benichou 2025-04-08 00:53:24 -0400
  • 4c0964c1fd Merge branch 'docling-project:main' into main Ayraf 2025-06-05 20:05:40 +0530
  • 45b9ded576 fix: initialize df_osd to avoid uninitialized variable error IoannisMaras 2025-06-05 17:29:48 +0300
  • a2b83fe4ae fix: Add WEBP to the list of image file extensions (#1711) Eugene 2025-06-05 11:09:27 +0400
  • 3e681217b1 feat: Add WEBP to the list of image file extensions Eugene 2025-06-05 02:46:53 +0400
  • 40df0d74ad chore: bump version to 2.36.1 [skip ci] v2.36.1 github-actions[bot] 2025-06-04 11:43:13 +0000
  • 8846f1a393 fix: remove typer and click constraints (#1707) Michele Dolfi 2025-06-04 13:06:23 +0200
  • 1fd5a17945 release typer and click constraints Michele Dolfi 2025-06-04 12:01:06 +0200
  • be42b03f9b docs: flash-attn usage and install (#1706) Michele Dolfi 2025-06-04 11:09:54 +0200
  • b6af19072f fix link Michele Dolfi 2025-06-04 09:18:30 +0200
  • 32539dc1f9 docs: flash-attn usage and install Michele Dolfi 2025-06-04 09:08:07 +0200
  • bef2fa95cd Merge branch 'docling-project:main' into fix/docx_text_box_extraction AndrewTsai0406 2025-06-03 22:00:12 +0800
  • 96c54dba91 chore: bump version to 2.36.0 [skip ci] v2.36.0 github-actions[bot] 2025-06-03 13:54:25 +0000
  • 744b2a06b5 fix/docx_text_box_extraction JiunAn Tsai 2025-06-03 21:31:45 +0800
  • cdd401847a feat: simplify dependencies, switch to uv (#1700) Michele Dolfi 2025-06-03 15:18:54 +0200
  • add5fa2b26 more constraints Michele Dolfi 2025-06-03 13:42:59 +0200
  • 2e9d40eaa9 constraints for onnxruntime Michele Dolfi 2025-06-03 13:39:09 +0200
  • f4ad079735 Merge remote-tracking branch 'origin/main' into refactor-uv Michele Dolfi 2025-06-03 13:32:13 +0200
  • e78c3f9456 Merge remote-tracking branch 'origin/main' into refactor-uv Michele Dolfi 2025-06-03 13:31:54 +0200
  • 61d0d6c755 test: mark flaky test (#1698) Panos Vagenas 2025-06-03 13:13:44 +0200
  • 83d91c7a91 fix path usage Panos Vagenas 2025-06-03 11:25:02 +0200
  • e056ec3b60 mark textbox file test as flaky Panos Vagenas 2025-06-03 11:14:04 +0200
  • 125aacdce5 refactor with uv Michele Dolfi 2025-06-03 10:47:19 +0200
  • ccf726c937 test: cleanse Word test file Panos Vagenas 2025-06-03 08:54:56 +0200
  • cfdf4cea25 feat: new vlm-models support (#1570) Peter W. J. Staar 2025-06-02 17:01:06 +0200
  • 6d279f1c41 add docs for vision models Michele Dolfi 2025-06-02 15:16:23 +0200
  • 08dcacc5cb chore: bump version to 2.35.0 [skip ci] v2.35.0 github-actions[bot] 2025-06-02 12:30:26 +0000
  • 07045386c6 remove torch type Michele Dolfi 2025-06-02 14:29:03 +0200
  • 738385004a Merge remote-tracking branch 'origin/main' into dev/add-other-vlm-models Michele Dolfi 2025-06-02 14:08:23 +0200
  • ea5719c39d use single HF VLM model class Michele Dolfi 2025-06-02 13:25:51 +0200
  • 8006683007 remove hf_vlm_model and add extra_generation_args Michele Dolfi 2025-06-02 12:58:32 +0200
  • 11ca4f7a7b docs: fix typo in index.md (#1676) Edgar Hipp 2025-06-02 12:35:59 +0200
  • 1c8a1283c4 test: ensure utf-8 in test data utils (#1691) Panos Vagenas 2025-06-02 12:13:19 +0200
  • 38b8b84dcf test: ensure utf-8 in test data utils Panos Vagenas 2025-06-02 11:31:06 +0200
  • c0847c97a7 use module import and remove MLX from non-darwin Michele Dolfi 2025-06-02 10:45:46 +0200
  • b9c1698263 rename to specs Michele Dolfi 2025-06-02 10:40:06 +0200
  • edf984f478 fix/docx_text_box_extraction JiunAn Tsai 2025-06-02 16:12:12 +0800
  • be3b1bd1e9 fix/docx_text_box_extraction JiunAn Tsai 2025-06-02 16:12:12 +0800
  • 76718cb1f9 add message for transformers version Michele Dolfi 2025-06-02 09:55:15 +0200
  • 3ba698984d Merge remote-tracking branch 'origin/main' into dev/add-other-vlm-models Michele Dolfi 2025-06-02 08:46:54 +0200
  • 984cb137f6 fix: guess HTML content starting with script tag (#1673) cp_main_20250602 Cesar Berrospi Ramis 2025-06-02 08:43:24 +0200
  • 55e0703945 missing file Michele Dolfi 2025-06-02 08:40:04 +0200
  • 910743a81a exclude minimal_vlm Michele Dolfi 2025-06-01 21:15:17 +0200
  • ffb7f071c3 remove not-needed function Michele Dolfi 2025-06-01 21:13:54 +0200
  • 7f6df727e3 add supported_devices Michele Dolfi 2025-06-01 21:12:43 +0200
  • 5d21153948 move more argument to options and simplify model init Michele Dolfi 2025-06-01 18:49:00 +0200
  • 3ff1712787 rename pipeline_vlm_model_spec Michele Dolfi 2025-06-01 18:29:20 +0200
  • 2bd15cc809 add new minimal_vlm example and refactor pipeline_options_vlm_model for cleaner import Michele Dolfi 2025-06-01 18:24:04 +0200
  • f63312add6 use lowercase and uppercase only Michele Dolfi 2025-06-01 17:55:16 +0200
  • 8686842478 skip compare example in CI Michele Dolfi 2025-06-01 16:57:48 +0200
  • 0b2c1d5eda refactor instances of VLM models Michele Dolfi 2025-06-01 16:55:56 +0200
  • fb0d979419 remove unused value Michele Dolfi 2025-06-01 16:34:02 +0200
  • 9dbf08a084 use AutoModelForVision2Seq for Pixtral and review example (including rename) Michele Dolfi 2025-06-01 16:30:58 +0200
  • 0cb7520648 restore stable imports Michele Dolfi 2025-06-01 09:06:41 +0200
  • fa561170f6 chore: Update lock with the dependencies for D-FINE nli/layout_dfine Nikos Livathinos 2025-05-31 16:57:09 +0200
  • dcc63ae00b Merge branch 'main' into nli/layout_rtdetr_v2 nli/layout_rtdetr_v2 Nikos Livathinos 2025-05-31 16:55:06 +0200
  • 7aa2be93d6 Merge branch 'main' into nli/layout_dfine Nikos Livathinos 2025-05-31 16:48:28 +0200
  • 30dafd976d chore: Update dependencies to docling-ibm-models and transformers to support D-FINE layout model Nikos Livathinos 2025-05-31 16:39:25 +0200
  • 064a236ebf Merge branch 'main' into html_backend vaaale 2025-05-31 08:57:01 +0200
  • 98b0ef6f22 Add files via upload Dimitri Mbakop 2025-05-30 10:42:24 +0200
  • 6188770f07 docs: fix typo in index.md Edgar Hipp 2025-05-30 00:38:49 +0200
  • 93d98dfa63 test: added groundtruth test files for fix(msword_backend): Identify text in the same line after an image / image anchor #1425 Michael Krissgau 2025-05-29 15:12:55 +0200
  • 84dc120d39 Merge branch 'main' of https://github.com/docling-project/docling into dev/fix_msword_backend_identify_text_after_image Michael Krissgau 2025-05-29 15:04:06 +0200
  • 14520b2dd4 fix: guess HTML content starting with script tag Cesar Berrospi Ramis 2025-05-28 20:13:05 +0200
  • 3942923125 chore: fix or ignore runtime and deprecation warnings (#1660) Cesar Berrospi Ramis 2025-05-28 17:55:31 +0200
  • b3e0042813 chore: exclude data from GH Linguist (#1671) Panos Vagenas 2025-05-28 15:42:34 +0200
  • 62a2ec1218 chore: exclude data from GH Linguist Panos Vagenas 2025-05-28 14:20:18 +0200
  • 4b65566076 Merge branch 'docling-project:main' into main ShiroYasha18 2025-05-28 17:38:17 +0530
  • bf19c5a291 chore: update poetry lock with latest docling-core Cesar Berrospi Ramis 2025-05-28 12:54:39 +0200
  • 53ffc565ca chore: fix or catch deprecation warnings Cesar Berrospi Ramis 2025-05-26 05:47:57 +0200
  • 106951e71e test: add missing ground truth files (#1667) Cesar Berrospi Ramis 2025-05-28 13:26:49 +0200
  • c303265526 Merge 7955903e9b into b356b33059 ka-weihe 2025-05-28 13:13:48 +0200
  • b356b33059 feat: Add visualization of bbox on page with html export. (#1663) Peter W. J. Staar 2025-05-28 13:10:38 +0200
  • 1b5d96eb3f test: add missing ground truth files Cesar Berrospi Ramis 2025-05-27 22:40:53 +0200
  • e36a4200da ran tests ShiroYasha18 2025-05-27 21:18:27 +0530
  • abcbde71b6 run tests ShiroYasha18 2025-05-27 21:16:50 +0530
  • c4c59204d6 Merge branch 'docling-project:main' into main ShiroYasha18 2025-05-27 18:02:13 +0530
  • 51d3450915 fix: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd0 in position 0: invalid continuation byte (#1665) DavidLee 2025-05-27 20:06:05 +0800
  • 58b572a579 Update document.py DavidLee 2025-05-27 17:41:42 +0800
  • 52fbfe1ce3 fix: pptx line break and space handling Martin Wind 2025-05-27 11:16:21 +0200
  • ad313473c4 updated the cli argument to show_layout Peter Staar 2025-05-27 10:43:20 +0200
  • e838818783 reformatted code Peter Staar 2025-05-27 09:01:32 +0200
  • bb12c96094 updated the cli Peter Staar 2025-05-27 08:56:54 +0200
  • c9735de4c6 feat: Add visualization of bbox on page with html export. Peter Staar 2025-05-27 05:41:51 +0200
  • 218b520ec4 xlsm file ShiroYasha18 2025-05-27 05:12:24 +0530
  • 4b8dfa326b Delete tests/test_backend_msexcel_xlsm.py ShiroYasha18 2025-05-27 05:09:29 +0530
  • 22e7635a0a Update document_converter.py ShiroYasha18 2025-05-27 05:07:53 +0530
  • 3ecbd9a115 Update test_backend_msexcel_xlsm.py ShiroYasha18 2025-05-27 05:06:15 +0530
  • 1f645e62c8 Update test_backend_msexcel.py ShiroYasha18 2025-05-27 02:31:43 +0530
  • 96377cb81e Update base_models.py ShiroYasha18 2025-05-27 01:50:53 +0530
  • 7c7baf814d Merge branch 'docling-project:main' into main ShiroYasha18 2025-05-27 00:11:41 +0530
  • 08beb406d9 fix: when .simplify_text_elements() always put a space between chunks, checks for alphanumeric characters creates more problems than it does good. commit new that testfiles that got forgotten in the last commit. Roman Kayan BAZG 2025-05-25 18:14:32 +0200
  • c75b75e8af fix: pptx shape order Martin Wind 2025-05-25 10:31:06 +0200
  • f11880d5ad Merge branch 'main' into html_backend vaaale 2025-05-24 22:46:28 +0200