Commit Graph

  • 0efd9fe584 chore: pin docling-parse 3.2.0 Christoph Auer 2025-02-03 10:53:01 +0100
  • 7adea4af08 fix(markdown): add support for HTML content Panos Vagenas 2025-01-31 16:51:29 +0100
  • 6a76b49a47 feat: Expose equation exports (#869) Michele Dolfi 2025-02-03 10:31:19 +0100
  • 0cd81a8122 fix(docx): merged table cells not properly converted (#857) Cesar Berrospi Ramis 2025-02-03 10:20:03 +0100
  • 5cf7a4f4a7 update with docling-core release Michele Dolfi 2025-02-03 09:41:49 +0100
  • eff16b62cc fix: Processing of placeholder shapes in pptx that have text but no bbox (#868) Maxim Lysak 2025-02-03 09:33:33 +0100
  • ef2e5e415f update test results Michele Dolfi 2025-02-03 08:53:26 +0100
  • 82f778e09c Processing of placeholder shapes in pptx that have text but no bbox Maksym Lysak 2025-02-03 08:41:53 +0100
  • 817a480038 pin new docling-core and exploit it via assembler changes Michele Dolfi 2025-02-03 08:21:57 +0100
  • 5c98306770 Merge pull request #5 from 0xCarbon/update_01022025 jpcanesin 2025-02-01 19:57:55 -0300
  • 7819e93786 remove mergify João 2025-02-01 19:55:58 -0300
  • 9442d2a47a remove original repo actions João 2025-02-01 19:55:11 -0300
  • 8bcbfc1f25 update with upstream 01022025 João 2025-02-01 19:53:24 -0300
  • e0f89029db chore: add type hinting to docx backend Cesar Berrospi Ramis 2025-01-31 18:30:00 +0100
  • 40145b59b3 fix(docx): merged cells not properly converted Cesar Berrospi Ramis 2025-01-31 16:03:28 +0100
  • b1cf796730 fix: KeyError in tableformer prediction (#854) Maxim Lysak 2025-01-31 17:00:14 +0100
  • 5bdc37378b chore: pin docling-parse PR-91 Christoph Auer 2025-01-31 16:48:40 +0100
  • bdeeb74534 chore: rewrite cumbersome dictionary checking Christoph Auer 2025-01-31 16:26:27 +0100
  • 8f79e95e74 fix for KeyError in tableformer prediction Maksym Lysak 2025-01-31 16:11:58 +0100
  • c33ae89608 chore: pin docling-parse PR 90 Christoph Auer 2025-01-31 15:55:24 +0100
  • 70d68b6164 feat: Add option to define page range (#852) Christoph Auer 2025-01-31 15:23:00 +0100
  • d727b04ad0 feat(docx): Support of SDTs in docx backend (#853) Maxim Lysak 2025-01-31 14:52:24 +0100
  • c82ec123ff Fix poetry.lock Rafael Teixeira de Lima 2025-01-31 14:35:39 +0100
  • 2e0a907601 Merge branch 'main' into rtdl/docx_latex Rafael Teixeira de Lima 2025-01-31 14:28:06 +0100
  • 8c73404668 Support of table of content containers in docx backend Maksym Lysak 2025-01-31 14:01:02 +0100
  • 2dcc582d02 feat: Add option to define page range Christoph Auer 2025-01-31 13:58:07 +0100
  • 2c037ae62e fix: Fixed docx import with headers that are also lists (#842) Maxim Lysak 2025-01-31 10:51:21 +0100
  • 440685c65f Update docling/backend/msword_backend.py Maxim Lysak 2025-01-31 10:20:24 +0100
  • 53f6d95877 Update docling/backend/msword_backend.py Maxim Lysak 2025-01-31 10:20:14 +0100
  • 2a1f8afe7e fix: use new add_code in html backend and add more typing hints (#850) Michele Dolfi 2025-01-31 09:54:17 +0100
  • 97bd818745 fix add_code in html backend and add more typing hints Michele Dolfi 2025-01-31 08:42:55 +0100
  • 4df085aa6c feat: Python 3.13 support (#841) Michele Dolfi 2025-01-30 17:26:42 +0100
  • bccb022fc8 fix(markdown): fix empty block handling (#843) Panos Vagenas 2025-01-30 16:22:29 +0100
  • c94c0d3813 fix(markdown): fix empty block handling Panos Vagenas 2025-01-30 15:03:17 +0100
  • fea0a99a95 fix: Fix for the crash when encountering WMF images in pptx and docx (#837) Maxim Lysak 2025-01-30 14:58:27 +0100
  • 3e5271c095 test with rapidocr only on python <3.13 Michele Dolfi 2025-01-30 14:55:20 +0100
  • 21bd3d94af update docs about python 3.13 Michele Dolfi 2025-01-30 14:44:48 +0100
  • ac828b7ce4 Merge remote-tracking branch 'origin/main' into lock-torch-py3.13 Michele Dolfi 2025-01-30 14:31:25 +0100
  • 3224df9fda Updated faq Maksym Lysak 2025-01-30 14:25:43 +0100
  • 17de54005f activate py3.13 in CI Michele Dolfi 2025-01-30 14:22:45 +0100
  • 6a1cce0a26 latest poetry version in CI Michele Dolfi 2025-01-30 14:22:15 +0100
  • df34a025d8 fix version for python3.13 Michele Dolfi 2025-01-30 14:21:42 +0100
  • b31f60295c fix table in test results Michele Dolfi 2025-01-30 14:11:22 +0100
  • d01a2e73ee test: update results with new docling-core (#839) Michele Dolfi 2025-01-30 14:07:52 +0100
  • c2f048bd2d Fix for docx when headers are also lists, now recorded as appropriate headers and subheaders, unit test included Maksym Lysak 2025-01-30 14:00:09 +0100
  • bd61a2db92 update all deps in the lock Michele Dolfi 2025-01-30 13:44:00 +0100
  • 1c05bbfd78 fix table output in 2203.01017v2.md Michele Dolfi 2025-01-30 12:48:03 +0100
  • 22a64ccf45 test: update results with new docling-core Michele Dolfi 2025-01-30 11:39:51 +0100
  • c93897b3df Fix for the crash when encountering WMF images in pptx and docx backends on non Windows platforms Maksym Lysak 2025-01-30 11:14:39 +0100
  • d7c082894e docs: updated the readme with upcoming features (#831) Peter W. J. Staar 2025-01-30 09:52:54 +0100
  • e2ff4f6c5d updated the docs-index Peter Staar 2025-01-30 08:53:48 +0100
  • 97f444b11c Update test files Rafael Teixeira de Lima 2025-01-29 16:06:33 +0100
  • c7289f647a Add standalone equations as DocItem formula Rafael Teixeira de Lima 2025-01-29 16:05:47 +0100
  • dae30a48aa Merge remote-tracking branch 'origin/main' into feat-picture-description Michele Dolfi 2025-01-29 13:54:12 +0100
  • f9144f2bb6 docs: Add example for inspection of picture content (#624) Christoph Auer 2025-01-29 10:39:00 +0100
  • 5677bed986 updated the readme with upcoming features Peter Staar 2025-01-29 10:03:47 +0100
  • 347244de02 test pin dev branch of docling-core Michele Dolfi 2025-01-29 09:52:17 +0100
  • 4d11d87d06 chore: bump version to 2.17.0 [skip ci] v2.17.0 github-actions[bot] 2025-01-28 18:37:26 +0000
  • 5aed9f8aeb fix: fix single newline handling in MD backend (#824) Panos Vagenas 2025-01-28 19:05:55 +0100
  • adf6353483 fix: use file extension if filetype fails with PDF (#827) Cesar Berrospi Ramis 2025-01-28 19:03:54 +0100
  • ba521dd88f chore: add missing imports to Office type tests (#826) Panos Vagenas 2025-01-28 16:17:44 +0100
  • 67f531e5bc Revert "chore: expose draw_clusters function (#803)" revert-803-refactor_viz Yusik Kim 2025-01-28 16:17:10 +0100
  • f5034944b8 Adding test files Rafael Teixeira de Lima 2025-01-28 15:14:02 +0100
  • 7996dcbf3e Fix test_backend_msword Rafael Teixeira de Lima 2025-01-28 13:59:17 +0100
  • b3b7c387ca Remove prints and backend flag Rafael Teixeira de Lima 2025-01-28 13:39:54 +0100
  • 31e30a2cb7 Add parsing configuration Rafael Teixeira de Lima 2025-01-28 09:41:56 +0100
  • 1f240c7763 Recommit with fixed history Rafael Teixeira de Lima 2025-01-28 09:32:00 +0100
  • e0a8bb5d29 fix: fix single newline handling in MD backend Panos Vagenas 2025-01-28 15:00:38 +0100
  • 76db3a8963 Update test_backend_pptx.py Panos Vagenas 2025-01-28 14:52:43 +0100
  • c7e25a59b5 Update test_backend_msword.py [skip ci] Panos Vagenas 2025-01-28 14:51:50 +0100
  • 5e1fafb518 chore: add missing import to XLSX test Panos Vagenas 2025-01-28 14:48:42 +0100
  • 6ca7daf3b4 fix: use file extension if filetype fails with PDF Cesar Berrospi Ramis 2025-01-28 14:33:06 +0100
  • 6875913e34 docs: document Docling JSON parsing (#819) Panos Vagenas 2025-01-28 13:23:30 +0100
  • e7930b547c update feature list, minor fixes Panos Vagenas 2025-01-28 12:39:13 +0100
  • 5139b48e4e docs: Add SSL verification error mitigation (#821) Anastas Stoyanovsky 2025-01-28 01:22:43 -0500
  • 6882e6c38d feat(CLI): Expose code and formula models in the CLI (#820) Michele Dolfi 2025-01-28 06:26:03 +0100
  • 440371dd1e Add SSL verification error mitigation Anastas Stoyanovsky 2025-01-27 15:46:29 -0500
  • 4d41db3f7a docs(backend XML): do not delete temp file in notebook (#817) Cesar Berrospi Ramis 2025-01-27 18:53:39 +0100
  • 9c6d9223bb feat: expose code and formula models in the CLI Michele Dolfi 2025-01-27 17:50:31 +0100
  • 68272b987a docs: document Docling JSON parsing Panos Vagenas 2025-01-27 17:23:39 +0100
  • a112d7a035 fix: parse html with omitted body tag (#818) Cesar Berrospi Ramis 2025-01-27 16:59:00 +0100
  • 8aedd5819d test: ensure docling converts HTML without body tag Cesar Berrospi Ramis 2025-01-27 16:09:51 +0100
  • 95b293a723 feat: add platform info to CLI version printout (#816) Panos Vagenas 2025-01-27 16:04:57 +0100
  • baf622ffad fix: parse HTML files without body tag Cesar Berrospi Ramis 2025-01-27 15:12:18 +0100
  • a5c360145c add Python implementation & language versions Panos Vagenas 2025-01-27 14:30:53 +0100
  • 740872c081 docs(backend XML): do not delete temp file in notebook Cesar Berrospi Ramis 2025-01-27 14:30:02 +0100
  • 53327552e8 feat(ocr): expose rec_keys_path in RapidOcrOptions to support custom dictionaries (#786) Yorick Terweijden 2025-01-27 14:38:15 +0200
  • 327e9238c4 Update main.py Panos Vagenas 2025-01-27 13:13:12 +0100
  • 8b41861fc6 feat: add platform info to CLI version printout Panos Vagenas 2025-01-27 13:08:38 +0100
  • 9022c6d855 chore: update deps in lockfile (#815) Michele Dolfi 2025-01-27 12:41:18 +0100
  • 30d0afe137 Merge branch 'main' into rtdl/export_latex_docx Rafael Teixeira de Lima 2025-01-27 12:31:59 +0100
  • c32c3b6f2c update poetry lock Rafael Teixeira de Lima 2025-01-27 12:29:46 +0100
  • bf31fdd0b8 style(rapidocr-options): fix alignment of rec_keys_path comment Yorick Terweijden 2025-01-27 13:05:31 +0200
  • 2209fe27c3 chore: update deps in lockfile Michele Dolfi 2025-01-27 11:24:54 +0100
  • 8a4ec77576 docs: typo (#814) Farzad Sunavala 2025-01-27 04:24:26 -0600
  • b3290c1ed9 typo Farzad Sunavala 2025-01-27 03:39:20 -0600
  • f70758bc4f Update rag_azuresearch.ipynb Farzad Sunavala 2025-01-27 03:36:36 -0600
  • b3bf971eb0 Add pylatexenc to exceptions list Rafael Teixeira de Lima 2025-01-27 10:22:28 +0100
  • 9b5e482d1e pre commit fixes, issue with pylatexenc Rafael Teixeira de Lima 2025-01-27 10:02:21 +0100
  • b885b2fa3c docs: added markdown headings to enable TOC in github pages (#808) Farzad Sunavala 2025-01-27 02:40:35 -0600