Commit Graph

  • e49fa7ec4f Update lock Christoph Auer 2025-02-10 10:46:46 +0100
  • f875fbc6cf Update reading-order model branch Christoph Auer 2025-02-10 09:51:52 +0100
  • 3e26597995 chore: bump version to 2.20.0 [skip ci] v2.20.0 github-actions[bot] 2025-02-07 17:46:36 +0000
  • c18f47c5c0 fix: remove unused httpx (#919) Michele Dolfi 2025-02-07 17:51:31 +0100
  • 5156f8e197 remove more usage of httpx Michele Dolfi 2025-02-07 16:59:00 +0100
  • b686e5ab88 use requests instead of httpx Michele Dolfi 2025-02-07 16:58:11 +0100
  • fdb2191bdb remove unused httpx Michele Dolfi 2025-02-07 16:54:39 +0100
  • 4cc6e3ea5e feat: Describe pictures using vision models (#259) Michele Dolfi 2025-02-07 16:30:42 +0100
  • a56dbc5f3f Implement new reading-order model, replacing DS GLM model (WIP) Christoph Auer 2025-02-07 16:19:16 +0100
  • fba3cf9be7 chore: bump version to 2.19.0 [skip ci] v2.19.0 github-actions[bot] 2025-02-07 13:36:54 +0000
  • 80e0bef75a Merge remote-tracking branch 'origin/main' into feat-picture-description Michele Dolfi 2025-02-07 14:02:05 +0100
  • dbb35c7f28 use with_smolvlm in models download Michele Dolfi 2025-02-07 14:01:39 +0100
  • 2909753856 fix name of cli argument Michele Dolfi 2025-02-07 14:00:22 +0100
  • d4eee87b26 apply CLI download login Michele Dolfi 2025-02-07 13:59:24 +0100
  • 02faf5376b refactor: use org--name in artifacts-path (#912) Michele Dolfi 2025-02-07 13:58:05 +0100
  • 12df2ac259 use org--name in artifacts-path Michele Dolfi 2025-02-07 13:25:53 +0100
  • 90b766e2ae fix(markdown): handle nested lists (#910) Panos Vagenas 2025-02-07 12:55:12 +0100
  • 07c65b6084 Rebase from main Christoph Auer 2025-02-07 12:36:27 +0100
  • eec83ca6a1 fix(markdown): handle nested lists Panos Vagenas 2025-02-07 11:43:25 +0100
  • 71242499a1 fix examples path Michele Dolfi 2025-02-07 10:22:57 +0100
  • 90f0428a62 more renaming Michele Dolfi 2025-02-07 10:12:38 +0100
  • b6ed0b34cd do not run with vlm api Michele Dolfi 2025-02-07 10:08:15 +0100
  • 30a9fc5c59 Merge remote-tracking branch 'origin/main' into feat-picture-description Michele Dolfi 2025-02-07 10:07:00 +0100
  • 4e11ce62bb rename model Michele Dolfi 2025-02-07 09:50:49 +0100
  • 9114ada7bc fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) Michele Dolfi 2025-02-07 08:43:31 +0100
  • 3f0e98b1ad Add DoclingParseV3 backend implementation Christoph Auer 2025-02-06 20:29:44 +0100
  • d7df1eae0c Fix test data paths in examples Christoph Auer 2025-02-06 19:17:29 +0100
  • 8979abd865 Revert unwanted RTL additions Christoph Auer 2025-02-06 19:06:32 +0100
  • b7f5cdb230 fix path to files in example Michele Dolfi 2025-02-06 17:04:12 +0100
  • fa0a788ba9 remove debugging code Michele Dolfi 2025-02-06 17:01:53 +0100
  • 8e5ecad9c9 use latest docling-core Michele Dolfi 2025-02-06 16:33:25 +0100
  • 9097f6d099 pin wheel of latest docling-parse release Michele Dolfi 2025-02-06 16:20:40 +0100
  • 81a6d16ae7 add test data results Michele Dolfi 2025-02-06 16:18:30 +0100
  • 23e82a5f49 fix example filepaths Michele Dolfi 2025-02-06 16:12:52 +0100
  • 6d801eff55 Merge remote-tracking branch 'origin/main' into multiple-updates Michele Dolfi 2025-02-06 15:58:30 +0100
  • 69e8a9d499 fix mypy reports Michele Dolfi 2025-02-06 15:55:46 +0100
  • ed74fe2ec0 feat: new artifacts path and CLI utility (#876) Michele Dolfi 2025-02-06 15:46:32 +0100
  • b817b7eb2b minor refactor Panos Vagenas 2025-02-06 14:29:58 +0100
  • 3af9b9d34e simplify downloading specific model(s) Panos Vagenas 2025-02-06 13:46:32 +0100
  • fce6bb14db Merge remote-tracking branch 'origin/dev/add-r2l-tests' into multiple-updates Michele Dolfi 2025-02-06 14:33:18 +0100
  • 3a71a3546d minor refactor Panos Vagenas 2025-02-06 14:29:58 +0100
  • 6ccff9a299 update lock Michele Dolfi 2025-02-06 14:21:27 +0100
  • f0ed5aca1e Merge remote-tracking branch 'origin/main' into feat-picture-description Michele Dolfi 2025-02-06 14:02:51 +0100
  • 15e18f5903 allow only localhost traffic Michele Dolfi 2025-02-06 13:58:06 +0100
  • 8ac000e35e update vlm API Michele Dolfi 2025-02-06 13:51:41 +0100
  • 5131e7ff21 simplify downloading specific model(s) Panos Vagenas 2025-02-06 13:46:32 +0100
  • 722a6eb7b9 fix(msword_backend): handle conversion error in label parsing (#896) Vladimir Gurevich 2025-02-06 13:30:51 +0200
  • 06342a5a28 add generation options Michele Dolfi 2025-02-06 11:12:28 +0100
  • 11e27930c4 vlm description using AutoModelForVision2Seq Michele Dolfi 2025-02-06 10:55:31 +0100
  • 5e4056f222 fix(msword_backend): handle conversion error in label parsing Vladimir Gurevich 2025-02-05 14:46:48 +0200
  • 6fd8666dc1 new test file mao1/code_formula_v1.0.1 Matteo-Omenetti 2025-02-05 22:21:31 +0100
  • 5692cdb19d Merge remote-tracking branch 'origin/main' into fix-artifacts-path Michele Dolfi 2025-02-05 18:27:03 +0100
  • 8d810fd45f update docs Michele Dolfi 2025-02-05 18:24:22 +0100
  • f0a6932e40 remove unused file Michele Dolfi 2025-02-05 18:21:16 +0100
  • 1b8adb860a move function to utils Michele Dolfi 2025-02-05 18:20:26 +0100
  • a11fd1f157 propagate artifacts path usage for ocr models Michele Dolfi 2025-02-05 17:40:52 +0100
  • 0ba08adb26 rename download methods and deprecation warnings Michele Dolfi 2025-02-05 17:04:54 +0100
  • 0f830a6ac9 rename utility to docling-tools Michele Dolfi 2025-02-05 16:58:03 +0100
  • 379a0650b2 Added a README file in a good place of README file Jorge Samuel Teixeira Jordão 2025-02-05 11:10:17 -0300
  • fbe59f7c99 Added a requirements.txt file Jorge Samuel Teixeira Jordão 2025-02-05 10:58:57 -0300
  • e730e59d1d Merge branch 'main' of github.com:DS4SD/docling into cau/handle-furniture Christoph Auer 2025-02-05 12:48:38 +0100
  • 7bdd6868ed Add code to expose text direction of cell dev/add-r2l-tests Christoph Auer 2025-02-05 12:48:12 +0100
  • 5ad6de0560 fix: enrichment models batch size and expose picture classifier (#878) Michele Dolfi 2025-02-05 11:46:01 +0100
  • 9f6aa036b1 added new gt for test_e2e_conversion Matteo-Omenetti 2025-02-05 10:26:10 +0100
  • 0ead58e0df cleanup imports Michele Dolfi 2025-02-05 09:52:38 +0100
  • e7b89e7815 remove batch size from CLI Michele Dolfi 2025-02-05 09:51:55 +0100
  • 654ce001da missing formatting Michele Dolfi 2025-02-04 15:40:44 +0100
  • dc2d03a349 Merge remote-tracking branch 'origin/main' into fix-artifacts-path Michele Dolfi 2025-02-04 15:38:04 +0100
  • dc9e759354 add docling-models utility Michele Dolfi 2025-02-04 15:35:19 +0100
  • 8040a4f19d added new gt for test_e2e_conversion Matteo-Omenetti 2025-02-04 15:24:49 +0100
  • 297e837719 fix black Matteo-Omenetti 2025-02-04 14:56:28 +0100
  • d7c9874a88 added three test-files for right-to-left Peter Staar 2025-02-04 14:49:19 +0100
  • 24163b02d1 fix: update all test cases again Christoph Auer 2025-02-04 14:40:16 +0100
  • 68d1713802 switch to code formula model v1.0.1 and new test pdf Matteo-Omenetti 2025-02-04 14:12:42 +0100
  • 5db82d5b67 cleaned up the data folder in the tests Peter Staar 2025-02-04 13:50:19 +0100
  • 89844a5725 switch to code formula model v1.0.1 and new test pdf Matteo-Omenetti 2025-02-04 13:29:02 +0100
  • 48c57144d2 switch to code formula model v1.0.1 and new test pdf Matteo-Omenetti 2025-02-04 13:28:13 +0100
  • 17448163e7 chore: fix docs search (#880) Panos Vagenas 2025-02-04 11:35:34 +0100
  • c9b0b5aff3 fix: update all test cases Christoph Auer 2025-02-04 10:52:56 +0100
  • 5b4b1929ed revert temporary conditions Panos Vagenas 2025-02-04 10:44:55 +0100
  • 0d0be388fc chore: check docs search fix Panos Vagenas 2025-02-04 10:34:46 +0100
  • 6d3fea0196 docs: Introduce example with custom models for RapidOCR (#874) Nikos Livathinos 2025-02-04 10:07:00 +0100
  • 04f9963396 chore: Update all test GT Christoph Auer 2025-02-04 09:47:09 +0100
  • 00aac6b620 use different batch size in each model Michele Dolfi 2025-02-03 21:58:09 +0100
  • e0b82721c7 expose picture classifier in CLI Michele Dolfi 2025-02-03 21:57:45 +0100
  • 18aad34d67 fix artifacts path Michele Dolfi 2025-02-03 21:13:05 +0100
  • 147c7a1bc9 feat: use w:lastRenderedPageBreaks if present to get approximate pagination David Huggins-Daines 2025-01-29 08:32:52 -0500
  • b5da4080c9 chore: bump version to 2.18.0 [skip ci] v2.18.0 github-actions[bot] 2025-02-03 14:58:50 +0000
  • 6fa5bfd115 Assign content_layer for page_headers and page_footers Christoph Auer 2025-02-03 15:06:19 +0100
  • 5ac2887e4a fix(markdown): fix parsing if doc ending with table (#873) Panos Vagenas 2025-02-03 14:38:38 +0100
  • 1a5eaf079a chore: Exclude the example with custom RapidOCR models from the examples to run in github actions Nikos Livathinos 2025-02-03 14:30:49 +0100
  • dee2b9a50f docs: Introduce example with custom models for RapidOCR Nikos Livathinos 2025-02-03 14:17:08 +0100
  • a40544a546 chore: clean up top-level file (#872) Panos Vagenas 2025-02-03 14:10:12 +0100
  • 7b3c5ddc9d fix(markdown): fix parsing if doc ending with table Panos Vagenas 2025-02-03 14:06:37 +0100
  • d409fb5b69 chore: cleanup top-level file Panos Vagenas 2025-02-03 13:25:42 +0100
  • bfbd70c224 Merge branch 'main' of github.com:DS4SD/docling into cau/pin-docling-parse-pre-3.2 cau/pin-docling-parse-pre-3.2 Christoph Auer 2025-02-03 12:23:24 +0100
  • 94751a78f4 fix(markdown): add support for HTML content (#855) Panos Vagenas 2025-02-03 12:21:05 +0100
  • f4b30fe7a7 fix word test data Panos Vagenas 2025-02-03 11:21:46 +0100
  • 0efd9fe584 chore: pin docling-parse 3.2.0 Christoph Auer 2025-02-03 10:53:01 +0100
  • 7adea4af08 fix(markdown): add support for HTML content Panos Vagenas 2025-01-31 16:51:29 +0100