Commit Graph

  • 4090f9700b add node parser, JSONPath resolution to LI example, refactor demo Panos Vagenas 2024-09-12 09:20:48 +02:00
  • d8ddb559fa docs: add conversion example Panos Vagenas 2024-09-12 06:39:10 +02:00
  • 53569a1023 docs: showcase RAG with LlamaIndex and LangChain (#71) Panos Vagenas 2024-09-11 15:07:08 +02:00
  • 79932b7d69 test: check for stable obj_type (#70) Michele Dolfi 2024-09-11 12:53:59 +02:00
  • e66dc53765 chore: bump version to 1.11.0 [skip ci] v1.11.0 github-actions[bot] 2024-09-10 16:18:59 +00:00
  • bdfdfbf092 feat: adding txt and doctags output (#68) Peter W. J. Staar 2024-09-10 17:30:52 +02:00
  • cd5b6293cc chore: bump version to 1.10.0 [skip ci] v1.10.0 github-actions[bot] 2024-09-10 14:38:07 +00:00
  • 27a7a152e1 feat: linux arm64 support and reducing dependencies (#69) Michele Dolfi 2024-09-10 15:43:27 +02:00
  • 1051eb9465 chore: update README (#65) Panos Vagenas 2024-09-09 12:03:04 +02:00
  • 6f1811e050 chore: fix placeholders in license (#63) Michele Dolfi 2024-09-06 17:10:07 +02:00
  • d3711437f6 chore: bump version to 1.9.0 [skip ci] v1.9.0 github-actions[bot] 2024-09-03 13:33:40 +00:00
  • 1de2e4f924 feat: export document pages as multimodal output (#54) Michele Dolfi 2024-09-03 15:05:35 +02:00
  • 69e5d951a3 docs: Update MAINTAINERS.md (#59) Christoph Auer 2024-09-02 12:34:38 +02:00
  • 85b7348846 docs: Mention quackling on README (#58) Christoph Auer 2024-09-02 12:27:29 +02:00
  • 66ed096c40 chore: bump version to 1.8.5 [skip ci] v1.8.5 github-actions[bot] 2024-08-30 12:37:54 +00:00
  • 48f4d1ba52 fix: Add unit tests (#51) Peter W. J. Staar 2024-08-30 14:08:20 +02:00
  • 256f4d504e chore: bump version to 1.8.4 [skip ci] v1.8.4 github-actions[bot] 2024-08-30 08:47:57 +00:00
  • de85e46ced fix: propagate row_section in tables (#57) Michele Dolfi 2024-08-30 10:36:00 +02:00
  • a8a60d52b1 docs: add instructions for cpu-only installation (#56) Michele Dolfi 2024-08-30 10:20:21 +02:00
  • 5c46749e70 chore: bump version to 1.8.3 [skip ci] v1.8.3 github-actions[bot] 2024-08-28 10:37:38 +00:00
  • f49ee825c3 fix: table cells overlap and model warnings (#53) Michele Dolfi 2024-08-28 12:30:42 +02:00
  • d0403aaebf chore: bump version to 1.8.2 [skip ci] v1.8.2 github-actions[bot] 2024-08-27 09:53:15 +00:00
  • e46a66a176 fix: refine conversion result (#52) Panos Vagenas 2024-08-27 11:50:43 +02:00
  • fe817b11d7 docs: update interface in README (#50) Michele Dolfi 2024-08-26 15:36:39 +02:00
  • 7052bee999 chore: bump version to 1.8.1 [skip ci] v1.8.1 github-actions[bot] 2024-08-26 11:55:37 +00:00
  • 8cc147bc56 fix: align output formats (#49) Michele Dolfi 2024-08-26 13:30:26 +02:00
  • 053eae4bdf chore: bump version to 1.8.0 [skip ci] v1.8.0 github-actions[bot] 2024-08-23 14:24:04 +00:00
  • a294b7e64a feat: Page-level error reporting from PDF backend, introduce PARTIAL_SUCCESS status (#47) Christoph Auer 2024-08-23 16:18:41 +02:00
  • 3226b20779 chore: bump version to 1.7.1 [skip ci] v1.7.1 github-actions[bot] 2024-08-23 11:56:02 +00:00
  • 8808463cec fix: Better raise exception when a page fails to parse (#46) Christoph Auer 2024-08-23 13:51:42 +02:00
  • 7e84533299 fix: Upgrade docling-parse to 1.1.1, safety checks for failed parse on pages (#45) Christoph Auer 2024-08-23 12:51:02 +02:00
  • 1930f08d4e chore: bump version to 1.7.0 [skip ci] v1.7.0 github-actions[bot] 2024-08-22 12:00:25 +00:00
  • a8c6b29a67 feat: Upgrade docling-parse PDF backend and interface to use page-by-page parsing (#44) Christoph Auer 2024-08-22 13:49:37 +02:00
  • f7c50c8b0e chore: bump version to 1.6.3 [skip ci] v1.6.3 github-actions[bot] 2024-08-22 11:02:35 +00:00
  • fac5745dc8 fix: usage of bytesio with docling-parse (#43) Michele Dolfi 2024-08-22 12:59:49 +02:00
  • 1347c01a9e chore: bump version to 1.6.2 [skip ci] v1.6.2 github-actions[bot] 2024-08-22 07:32:54 +00:00
  • 69952682ed fix: remove [ocr] extra to fix wheel install (#42) Michele Dolfi 2024-08-22 09:25:19 +02:00
  • 47c6dab6d2 chore: bump version to 1.6.1 [skip ci] v1.6.1 github-actions[bot] 2024-08-21 17:41:26 +00:00
  • f19871a5a1 fix: Add scipy as dependency (#40) Christoph Auer 2024-08-21 17:21:02 +02:00
  • 4a1ceaf65c Update docling-ibm-models to v1.1.2 (#39) Christoph Auer 2024-08-21 17:12:38 +02:00
  • 22a5c29c63 chore: bump version to 1.6.0 [skip ci] v1.6.0 github-actions[bot] 2024-08-20 13:34:53 +00:00
  • e94d317c02 feat: Add adaptive OCR, factor out treatment of OCR areas and cell filtering (#38) Christoph Auer 2024-08-20 15:28:03 +02:00
  • 47b8ad917e chore: bump version to 1.5.0 [skip ci] v1.5.0 github-actions[bot] 2024-08-20 11:53:52 +00:00
  • 78347bf679 feat: allow computing page images on-demand with scale and cache them (#36) Michele Dolfi 2024-08-20 13:27:19 +02:00
  • c253dd743a Add redbooks to test data, small additions (#35) Christoph Auer 2024-08-20 12:36:00 +02:00
  • a13114bafd docs: add technical paper ref (#37) Michele Dolfi 2024-08-20 12:32:53 +02:00
  • 778e51ef18 chore: bump version to 1.4.0 [skip ci] v1.4.0 github-actions[bot] 2024-08-14 11:46:55 +00:00
  • 349b0e914f fix: allow newer torch versions (#34) Michele Dolfi 2024-08-14 13:37:36 +02:00
  • 90dd676422 feat: update parser with bytesio interface and set as new default backend (#32) Michele Dolfi 2024-08-14 12:30:00 +02:00
  • 61be78a875 Fix class re-mapping for table of contents (#33) Christoph Auer 2024-08-14 11:32:30 +02:00
  • dd0df9f094 chore: bump version to 1.3.0 [skip ci] v1.3.0 github-actions[bot] 2024-08-12 16:29:05 +00:00
  • 63d80edca2 feat: output page images and extracted bbox (#31) Michele Dolfi 2024-08-12 18:25:45 +02:00
  • 0bf4a43ed5 chore: bump version to 1.2.1 [skip ci] v1.2.1 github-actions[bot] 2024-08-07 15:38:00 +00:00
  • 79ef8d2f2f fix: update (vuln) deps (#29) Michele Dolfi 2024-08-07 17:29:36 +02:00
  • 794b20a50a fix: type of path_or_stream in PdfDocumentBackend (#28) Michele Dolfi 2024-08-07 17:20:44 +02:00
  • 9550db8e64 docs: improve examples (#27) Michele Dolfi 2024-08-07 17:16:35 +02:00
  • 20cbe7c24a chore: bump version to 1.2.0 [skip ci] v1.2.0 github-actions[bot] 2024-08-07 14:35:03 +00:00
  • b8f5e38a8c feat: introducing docling_backend (#26) Maxim Lysak 2024-08-07 16:22:36 +02:00
  • 62ba4aaf31 chore: bump version to 1.1.2 [skip ci] v1.1.2 github-actions[bot] 2024-07-31 12:35:59 +00:00
  • d2d9543415 fix: set page number using 1-based indexing (#22) Panos Vagenas 2024-07-31 14:28:44 +02:00
  • e102827753 chore: bump version to 1.1.1 [skip ci] v1.1.1 github-actions[bot] 2024-07-30 12:53:54 +00:00
  • f4bf3d25b9 fix: Correct text extraction for table cells (#21) Maxim Lysak 2024-07-30 14:51:47 +02:00
  • b07c4a7a4a chore: bump version to 1.1.0 [skip ci] v1.1.0 github-actions[bot] 2024-07-26 15:01:56 +00:00
  • d603137383 feat: add simplified single-doc conversion (#20) Panos Vagenas 2024-07-26 16:55:33 +02:00
  • 3eca8b8485 refactor(pypdfium2): just forward input to PdfDocument directly (#17) mara004 2024-07-25 08:54:57 +02:00
  • 6db2b350dd chore: bump version to 1.0.2 [skip ci] v1.0.2 github-actions[bot] 2024-07-24 12:18:21 +00:00
  • 54b3dda141 fix: add easyocr to main deps for valid extra (#19) Michele Dolfi 2024-07-24 14:11:26 +02:00
  • 3e92f0bfba chore: bump version to 1.0.1 [skip ci] v1.0.1 github-actions[bot] 2024-07-24 09:28:47 +00:00
  • b0725e0aa6 fix: expose ocr as extra (#18) Michele Dolfi 2024-07-24 11:14:17 +02:00
  • 9f2add112f chore: bump version to 1.0.0 [skip ci] v1.0.0 github-actions[bot] 2024-07-18 15:52:38 +00:00
  • 71c3a9c8cd feat!: v1.0.0 release (#16) Michele Dolfi 2024-07-18 17:50:14 +02:00
  • 7bc20adc16 pin docling-ibm-models 1.1.0 with python 3.10 support (#15) Michele Dolfi 2024-07-18 17:27:48 +02:00
  • eb0b208272 chore: switch to docling-core Markdown export (#14) Panos Vagenas 2024-07-18 16:10:05 +02:00
  • 28d1c746a6 chore: update README (#13) Panos Vagenas 2024-07-18 11:23:23 +02:00
  • f09ffcc8f4 chore: bump version to 0.4.0 [skip ci] v0.4.0 github-actions[bot] 2024-07-17 14:26:50 +00:00
  • e9526bb11e feat: Optimize table extraction quality, add configuration options (#11) Christoph Auer 2024-07-17 16:13:21 +02:00
  • 3e2ede8107 chore: bump version to 0.3.1 [skip ci] v0.3.1 github-actions[bot] 2024-07-17 13:58:51 +00:00
  • d1d1724537 fix: missing type for default values (#12) Michele Dolfi 2024-07-17 15:54:43 +02:00
  • 2baa35c548 docs: reflect supported Python versions, add badges (#10) Panos Vagenas 2024-07-17 15:49:26 +02:00
  • 0dfa4548d3 chore: bump version to 0.3.0 [skip ci] v0.3.0 github-actions[bot] 2024-07-17 12:11:15 +00:00
  • fb72688ff7 feat: enable python 3.12 support by updating glm (#8) Michele Dolfi 2024-07-17 14:03:26 +02:00
  • 2803222ee1 docs: Add setup with pypi to Readme (#7) Christoph Auer 2024-07-16 14:15:09 +02:00
  • 5c88574d03 chore: bump version to 0.2.0 [skip ci] v0.2.0 github-actions[bot] 2024-07-16 11:37:14 +00:00
  • b1479cf4ec feat: build with ci (#6) Michele Dolfi 2024-07-16 13:34:42 +02:00
  • b4f45ce96b disable docs build (#5) Michele Dolfi 2024-07-16 13:14:44 +02:00
  • e45dc5d1a5 ci: Add Github Actions (#4) Michele Dolfi 2024-07-16 13:05:04 +02:00
  • b9dc892385 Update convert.py (#3) Christoph Auer 2024-07-15 18:02:42 +02:00
  • 05ab89f958 doc: More documentation updates (#2) Christoph Auer 2024-07-15 14:59:53 +02:00
  • 180f70c6e8 docs: Update links, add GH repository to metadata (#1) v0.1.1 Christoph Auer 2024-07-15 12:43:05 +02:00
  • e2d996753b Initial commit Christoph Auer 2024-07-15 09:42:42 +02:00