Commit Graph

  • 0be736227f fix: improve handling of disallowed formats (#429) Christoph Auer 2024-12-03 12:45:32 +0100
  • 34c7c79858 fix: improve handling of disallowed formats (#429) Christoph Auer 2024-12-03 12:45:32 +0100
  • 25a0fa38d1 chore: bump version to 2.8.2 [skip ci] github-actions[bot] 2024-12-03 10:47:29 +0000
  • 2254845da3 chore: bump version to 2.8.2 [skip ci] v2.8.2 github-actions[bot] 2024-12-03 10:47:29 +0000
  • 9f35e368f6 chore: update numpy lock (#500) Michele Dolfi 2024-12-03 11:21:31 +0100
  • 672962a8b2 chore: update numpy lock (#500) Michele Dolfi 2024-12-03 11:21:31 +0100
  • a7e3f713bb fix: ParserError EOF inside string (#470) (#472) guglie 2024-12-03 11:21:18 +0100
  • c90c41c391 fix: ParserError EOF inside string (#470) (#472) guglie 2024-12-03 11:21:18 +0100
  • a01cedbb69 docs: add styling for faq (#502) Michele Dolfi 2024-12-03 11:20:49 +0100
  • 5ba3807f31 docs: add styling for faq (#502) Michele Dolfi 2024-12-03 11:20:49 +0100
  • e9c6462629 Merge remote-tracking branch 'origin/main' into fix-numpy-pinning Michele Dolfi 2024-12-03 10:51:10 +0100
  • 20cc2f375a remove torchaudio Michele Dolfi 2024-12-03 10:40:45 +0100
  • 418d8159bd perf: prevent temp file leftovers, reuse core type (#487) Panos Vagenas 2024-12-03 10:40:28 +0100
  • 051789d017 perf: prevent temp file leftovers, reuse core type (#487) Panos Vagenas 2024-12-03 10:40:28 +0100
  • 639829a1d3 docs: add styling to faq Michele Dolfi 2024-12-03 10:38:43 +0100
  • 7245cc6080 Implement hierachical cluster layout processing Christoph Auer 2024-12-03 10:28:36 +0100
  • 05bffd38f3 Implement hierachical cluster layout processing Christoph Auer 2024-12-03 10:28:36 +0100
  • 32e9b4a2cf fix: PermissionError when using tesseract_ocr_cli_model (#496) Gaspard Petit 2024-12-03 04:22:03 -0500
  • d3f84b2457 fix: PermissionError when using tesseract_ocr_cli_model (#496) Gaspard Petit 2024-12-03 04:22:03 -0500
  • 84abb008f5 chore: update numpy lock Michele Dolfi 2024-12-03 10:04:30 +0100
  • b946d8d9ed Update advanced_chunking_with_merging.ipynb Ben Rood 2024-12-03 17:02:35 +0800
  • a762c8394e fix: use new resolve_source_to_x functions to avoid tempfile leftovers (#490) Michele Dolfi 2024-12-03 09:28:52 +0100
  • 1cd30ed448 chore: removed discriminator Simonas 2024-12-03 09:33:47 +0200
  • a0aa83301c chore: base ocr options, was missing rapid ocr Simonas 2024-12-03 09:30:03 +0200
  • 4131fa3e34 fix: PermissionError when using tesseract_ocr_cli_model Gaspard Petit 2024-12-02 22:11:16 -0500
  • 36bffc3cc7 Fix bug: Excel Backend Andrew Tran 2024-12-02 12:05:05 -0500
  • d1e439de92 use new resolve_source_to_x functions Michele Dolfi 2024-12-02 17:15:43 +0100
  • e0cf80a919 Upgraded Layout Postprocessing, sending old code back to ERZ Christoph Auer 2024-12-02 16:21:14 +0100
  • b9f8f5ac7b Upgraded Layout Postprocessing, sending old code back to ERZ Christoph Auer 2024-12-02 16:21:14 +0100
  • 72e76512bd [skip ci] document import line Panos Vagenas 2024-12-02 16:20:24 +0100
  • 64aff225c6 update docling-core version Panos Vagenas 2024-12-02 16:11:53 +0100
  • 79a4788277 chore: reuse DocumentStream from docling-core Panos Vagenas 2024-12-02 14:58:12 +0100
  • 8e57c85bf4 rename new status, populate ConversionResult errors Panos Vagenas 2024-12-02 13:32:05 +0100
  • 6ca85993f4 docs: typo in faq (#484) Álvaro Huertas 2024-12-02 10:35:24 +0100
  • 33cff98d36 docs: typo in faq (#484) Álvaro Huertas 2024-12-02 10:35:24 +0100
  • 048031d32b docs: add automatic api reference (#475) Michele Dolfi 2024-12-02 09:55:52 +0100
  • d4872103b8 docs: add automatic api reference (#475) Michele Dolfi 2024-12-02 09:55:52 +0100
  • 3a5b436f0e Typo faq.md Álvaro Huertas 2024-12-02 08:45:19 +0100
  • 2f6f6c1b41 fix: tesseract_ocr_cli csv parsing fails when text contains single quotes Gaspard Petit 2024-12-02 01:59:17 -0500
  • 86a81be4a1 Merge branch 'gaspardpetit-fix-permission-error-tesseractcli' of https://github.com/gaspardpetit/docling into gaspardpetit-fix-permission-error-tesseractcli Gaspard Petit 2024-12-01 11:37:53 -0500
  • 8b3bd511cd linter fix - import order Gaspard Petit 2024-12-01 11:37:36 -0500
  • 42c544996d fix: PermissionError when using tesseract_ocr_cli_model Gaspard Petit 2024-11-25 09:19:53 -0500
  • 32d039ed7b docs: add automatic api reference Michele Dolfi 2024-11-29 22:35:21 +0100
  • 0e0360a37b docs: introduce faq section (#468) Michele Dolfi 2024-11-29 22:34:56 +0100
  • 8ccb3c6db6 docs: introduce faq section (#468) Michele Dolfi 2024-11-29 22:34:56 +0100
  • a52305990b fix: ParserError EOF inside string (#470) guglie 2024-11-29 17:25:06 +0100
  • ef98381963 docs: introduce faq section Michele Dolfi 2024-11-29 15:15:44 +0100
  • 1d81b85443 chore: bump version to 2.8.1 [skip ci] github-actions[bot] 2024-11-29 13:04:48 +0000
  • cc46c938b6 chore: bump version to 2.8.1 [skip ci] v2.8.1 github-actions[bot] 2024-11-29 13:04:48 +0000
  • 7bd432496a fix(cli): expose debug options (#467) Michele Dolfi 2024-11-29 13:25:58 +0100
  • dd8de46267 fix(cli): expose debug options (#467) Michele Dolfi 2024-11-29 13:25:58 +0100
  • 861b6a6499 fix: remove unused deps (#466) Michele Dolfi 2024-11-29 13:18:06 +0100
  • af63818df5 fix: remove unused deps (#466) Michele Dolfi 2024-11-29 13:18:06 +0100
  • 44a07ed45f fix(cli): expose debug options Michele Dolfi 2024-11-29 13:05:10 +0100
  • 0d9b28470d fix: remove unused deps Michele Dolfi 2024-11-29 12:46:27 +0100
  • 9d8d698921 docs: extend integration docs & README (#456) Panos Vagenas 2024-11-28 09:41:21 +0100
  • 84c46fdeb3 docs: extend integration docs & README (#456) Panos Vagenas 2024-11-28 09:41:21 +0100
  • d8cce38f3b docs: extend integration docs & README Panos Vagenas 2024-11-28 08:17:20 +0100
  • 0762986cf9 updated the README, still need to update the docsa Peter Staar 2024-11-28 07:51:54 +0100
  • 4138110c6b robustify & simplify format option resolution Panos Vagenas 2024-11-27 19:45:39 +0100
  • 20a2cd0f53 chore: bump version to 2.8.0 [skip ci] github-actions[bot] 2024-11-27 13:29:32 +0000
  • 211f4f7570 chore: bump version to 2.8.0 [skip ci] v2.8.0 github-actions[bot] 2024-11-27 13:29:32 +0000
  • 85b29990be feat(ocr): added support for RapidOCR engine (#415) Swaymaw 2024-11-27 18:27:41 +0530
  • 767563bf8b fix: use correct image index in word backend (#442) Manuel030 2024-11-27 13:45:07 +0100
  • 9f265e9e80 Merge remote-tracking branch 'origin/main' into rapidocr Michele Dolfi 2024-11-27 13:08:23 +0100
  • 29807a2d68 fix: Update tests and examples for docling-core 2.5.1 (#449) Christoph Auer 2024-11-27 13:07:00 +0100
  • 674d533900 Update lockfile for docling-core 2.5.1 Christoph Auer 2024-11-27 12:44:48 +0100
  • 0937da9029 Revert "Fix OCR tests" Christoph Auer 2024-11-27 12:43:00 +0100
  • 732bc6f515 Fix OCR tests Christoph Auer 2024-11-27 12:31:04 +0100
  • 9777f41137 Add export with referenced images to export_figures example Christoph Auer 2024-11-27 10:13:18 +0100
  • 35c73938a7 Update tests for docling-core 2.5.0 Christoph Auer 2024-11-27 10:04:38 +0100
  • 496141a090 resolve merge conflicts Manuel030 2024-11-27 12:11:37 +0100
  • cf2f3a1ceb correct rebase error Manuel030 2024-11-27 11:58:03 +0100
  • 34ef233c42 sign dco Manuel030 2024-11-26 15:25:35 +0100
  • 625a3297ef fix: Fixes for wordx (#432) Maxim Lysak 2024-11-26 14:44:43 +0100
  • 1b7a3756e0 fix image index in word backend Manuel030 2024-11-26 15:13:10 +0100
  • c228f34f44 Merge remote-tracking branch 'origin/main' into rapidocr Michele Dolfi 2024-11-27 11:44:07 +0100
  • c1b6442670 use default device until we enable global management Michele Dolfi 2024-11-27 11:39:23 +0100
  • 74e005df63 fix styling issues and small bug in rapidOcrOptions Swaymaw 2024-11-27 10:52:25 +0530
  • 0348cfb964 simplifying rapidocr options so that device can be changed using a single option for all models Swaymaw 2024-11-27 10:26:43 +0530
  • 0bb1e203b6 improve handling of unsupported types Panos Vagenas 2024-11-26 21:33:32 +0100
  • bafe673b97 fix: Other test fixes Christoph Auer 2024-11-25 14:36:56 +0100
  • c8bf252dbd fix: Remove unnecessary case handling Christoph Auer 2024-11-25 14:35:10 +0100
  • 7df9527733 fix: Fixes and tests for StopIteration on .convert() Christoph Auer 2024-11-25 14:16:28 +0100
  • 6666d9ec07 chore: bump version to 2.7.1 [skip ci] v2.7.1 github-actions[bot] 2024-11-26 15:01:33 +0000
  • d0a1180478 fix: Fixes for wordx (#432) Maxim Lysak 2024-11-26 14:44:43 +0100
  • dfba82679d Updated lxml dependency version Maksym Lysak 2024-11-26 13:09:38 +0100
  • 592534e150 Added test for word file with embedded emf images, re-generated full tests for docx, eased up dependency on lxml Maksym Lysak 2024-11-26 10:22:31 +0100
  • 686affe782 help poetry pinning for python3.9 Michele Dolfi 2024-11-26 07:38:06 +0100
  • 94d1fd41cd Added safety try-except when trying to load pillow image from a docx blob. Added explicit dependency on lxml. Maksym Lysak 2024-11-25 20:20:45 +0100
  • 508bbed8f8 fixes for referencing drawing blip in wordx Maksym Lysak 2024-11-25 16:42:48 +0100
  • 0d12ad1dcc fix: PermissionError when using tesseract_ocr_cli_model Gaspard Petit 2024-11-25 09:19:53 -0500
  • ac1faebfc9 updating pyproject.toml and poetry.lock to fix ci bugs Swaymaw 2024-11-25 15:14:19 +0530
  • 31151291a2 Update layout_utils.py by changing an "if" to an "elif" keyword. Raphaël M.J.I. Larsen 2024-11-22 20:47:27 +0100
  • cbaf2b518a fixing styling format Swaymaw 2024-11-22 14:43:34 +0530
  • a00940f918 fixing styling issues Swaymaw 2024-11-22 14:38:13 +0530
  • 1b86a862e5 Merge branch 'main' of https://github.com/DS4SD/docling into rapidocr Swaymaw 2024-11-22 13:07:38 +0530
  • d7072b4b56 fix: force pydantic < 2.10.0 (#407) Michele Dolfi 2024-11-22 08:23:11 +0100
  • 9bb2e58e59 adding rapidocr engine for ocr in docling swayam-singhal 2024-11-22 12:45:06 +0530
  • 2a1d3fd221 chore: update the README (#409) Peter W. J. Staar 2024-11-21 17:28:53 +0100