mirror of
https://github.com/DS4SD/docling.git
synced 2025-12-08 20:58:11 +00:00
v2.50.0 - 2025-09-03
Feature
Fix
v2.49.0 - 2025-09-01
Feature
- [Beta] Extraction with schema (#2138) (9f4bc5b)
- msexcel: Set ContentLayer.INVISIBLE for invisible sheet (#1876) (a283ccf)
Fix
- pypdfium2: Fix OCR bounding box misalignment caused by mismatched rotation metadata (#2039) (4d94e38)
- Translation example (#2166) (9f0286b)
- Extend offline mode for rapidocr fonts (#2155) (9904d14)
Documentation
v2.48.0 - 2025-08-26
Feature
Fix
v2.47.1 - 2025-08-23
Fix
v2.47.0 - 2025-08-22
Feature
- CLI: Option to download arbitrary HuggingFace model (#2123) (cdf079d)
- Batching support for VLMs in transformers backend, add initial VLLM backend (#2094) (3c660c0)
- html: Support formatting tags in HTML texts (#2111) (94fcc46)
Fix
Documentation
- DPK pipeline example using docling library (#2112) (e76298c)
- Add Getting Started page (#2113) (8996d61)
v2.46.0 - 2025-08-20
Feature
Fix
Performance
- Clean up resources with docling-parse v4, no parsed_page output by default (#2105) (5f57ff2)
- Speed up function _parse_orientation (#1934) (8820b55)
v2.45.0 - 2025-08-18
Feature
- Add backend for METS with Google Books profile (#1989) (31087f3)
- html: Support in-line anchor tags in HTML texts (#1659) (9687297)
- vlm: Ability to preprocess VLM response (#1907) (5f050f9)
Documentation
v2.44.0 - 2025-08-12
Feature
Fix
- html: Parse rowspan and colspan when they include non-numerical values (#2048) (ed56f2d)
- Support new mlx-vlm module (#2001) (0130e3a)
- Extend error reporting when verbose logging is enabled (#2017) (2eb760d)
- HTML: Replace non-standard Unicode characters (#2006) (86f7012)
Documentation
v2.43.0 - 2025-07-28
Feature
Fix
- markdown: Ensure correct parsing of nested lists (#1995) (aec29a7)
- HTML: Remove an unnecessary print command (#1988) (945721a)
v2.42.2 - 2025-07-24
Fix
- HTML: Concatenation of child strings in table cells and list items (#1981) (5132f06)
- docx: Adding plain latex equations to table cells (#1986) (0b83609)
- Preserve PARTIAL_SUCCESS status when document timeout hits (#1975) (98e2fcf)
- Multi-page image support (tiff) (#1928) (8d50a59)
Documentation
v2.42.1 - 2025-07-22
Fix
Documentation
- Enrich existing DoclingDocument (#1969) (90a7cc4)
- Add documentation for confidence scores (#1912) (5d98bce)
v2.42.0 - 2025-07-18
Feature
Fix
- Safe pipeline init, use device_map in transformers models (#1917) (cca05c4)
- Fix HTML table parser and JATS backend bugs (#1948) (e1e3053)
- KeyError: 'fPr' when processing latex fractions in DOCX files (#1926) (95e7096)
- Change granite vision model URL from preview to stable version (#1925) (c5fb353)
Documentation
v2.41.0 - 2025-07-10
Feature
- Layout model specification and multiple choices (#1910) (2b8616d)
- Enable precision control in float serialization (#1914) (ec588df)
- Add image-text-to-text models in transformers (#1772) (a07ba86)
- vlm: Dynamic prompts (#1808) (b8813ee)
Fix
- ocr-utils: Unit test and fix the rotate_bounding_box function (#1897) (931eb55)
- Docs are missing osd packages for tesseract on RHEL (#1905) (e25873d)
- Use only backend for picture classifier (#1904) (edd4356)
- Typo in asr options (#1902) (dd8fde7)
v2.40.0 - 2025-07-04
Feature
- Introduce LayoutOptions to control layout postprocessing behaviour (#1870) (ec6cf6f)
- Integrate ListItemMarkerProcessor into document assembly (#1825) (56a0e10)
Fix
- Secure torch model inits with global locks (#1884) (598c9c5)
- Ensure that TesseractOcrModel does not crash in case OSD is not installed (#1866) (ae39a94)
Performance
- msexcel: _find_table_bounds use iter_rows/iter_cols instead of Worksheet.cell (#1875) (13865c0)
- Move expensive imports closer to usage (#1863) (3089cf2)
v2.39.0 - 2025-06-27
Feature
Fix
v2.38.1 - 2025-06-25
Fix
- Updated granite vision model version for picture description (#1852) (d337825)
- markdown: Fix single-formatted headings & list items (#1820) (7c5614a)
- Fix response type of ollama (#1850) (41e8cae)
- Handle missing runs to avoid out of range exception (#1844) (4002de1)
v2.38.0 - 2025-06-23
Feature
- Support audio input (#1763) (1557e7c)
- markdown: Add formatting & improve inline support (#1804) (861abcd)
- Maximum image size for Vlm models (#1802) (215b540)
Fix
- docx: Ensure list items have a list parent (#1827) (d26dac6)
- msword_backend: Identify text in the same line after an image #1425 (#1610) (1350a8d)
- Ensure uninitialized pages are removed before assembling document (#1812) (dd7f64f)
- Formula conversion with page_range param set (#1791) (dbab30e)
Documentation
- Update readme and add ASR example (#1836) (f3ae302)
- Support running examples from root or subfolder (#1816) (64ac043)
v2.37.0 - 2025-06-16
Feature
- Make Page.parsed_page the only source of truth for text cells, add OCR cells to it (#1745) (7d3302c)
- Support xlsm files (#1520) (df14022)
Fix
- Pptx line break and space handling (#1664) (f28d23c)
- asciidoc: Set default size when missing in image directive (#1769) (b886e4d)
- Handle NoneType error in MsPowerpointDocumentBackend (#1747) (7a275c7)
- Prov for merged-elems (#1728) (6613b9e)
- tesseract: Initialize df_osd to avoid uninitialized variable error (#1718) (e979750)
- Allow custom torch_dtype in vlm models (#1735) (f7f3113)
- Improve extraction from textboxes in Word docs (#1701) (9dbcb3d)
- Add WEBP to the list of image file extensions (#1711) (a2b83fe)
Documentation
v2.36.1 - 2025-06-04
Fix
Documentation
v2.36.0 - 2025-06-03
Feature
v2.35.0 - 2025-06-02
Feature
Fix
- Guess HTML content starting with script tag (#1673) (984cb13)
- UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd0 in position 0: invalid continuation byte (#1665) (51d3450)
Documentation
v2.34.0 - 2025-05-22
Feature
- ocr: Auto-detect rotated pages in Tesseract (#1167) (45265bf)
- Establish confidence estimation for document and pages (#1313) (9087524)
Fix
- Fix ZeroDivisionError for cell_bbox.area() (#1636) (c2f595d)
- integration: Update the Apify Actor integration (#1619) (14d4f5b)
v2.33.0 - 2025-05-20
Feature
Fix
- Fix issue with detecting docx files, and files with upper case extensions (#1609) (f4d9d41)
- Load_from_doctags static usage (#1617) (0e00a26)
- Incorrect force_backend_text behaviour for VLM DocTag pipelines (#1371) (f2e9c07)
- pypdfium: Resolve overlapping text when merging bounding boxes (#1549) (98b5eeb)
v2.32.0 - 2025-05-14
Feature
- Improve parallelization for remote services API calls (#1548) (3a04f2a)
- Support image/webp file type (#1415) (12dab0a)
Fix
- ocr: Orig field in TesseractOcrCliModel as str (#1553) (9f8b479)
- settings: Fix nested settings load via environment variables (#1551) (2efb7a7)
Documentation
v2.31.2 - 2025-05-13
Fix
- AsciiDoc header identification (#1562) (#1563) (4046d0b)
- Restrict click version and update lock file (#1582) (8baa85a)
v2.31.1 - 2025-05-12
Fix
- Add smoldocling in download utils (#1577) (127e386)
- HTML: Handle row spans in header rows (#1536) (776e7ec)
- Mime error in document streams (#1523) (f1658ed)
- Usage of hashlib for FIPS (#1512) (7c70573)
- Guard against attribute errors in TesseractOcrModel del (#1494) (4ab7e9d)
- Enable cuda_use_flash_attention2 for PictureDescriptionVlmModel (#1496) (cc45396)
- Updated the time-recorder label for reading order (#1490) (976e92e)
- Incorrect scaling of TableModel bboxes when do_cell_matching is False (#1459) (94d66a0)
Documentation
- Update links in data_prep_kit (#1559) (844babb)
- Add serialization docs, update chunking docs (#1556) (3220a59)
- Update supported formats guide (#1463) (3afbe6c)
v2.31.0 - 2025-04-25
Feature
Fix
- html: Handle address, details, and summary tags (#1436) (ed20124)
- Treat overflowing -v flags as DEBUG (#1419) (8012a3e)
- codecov: Fix codecov argument and yaml file (#1399) (fa7fc9e)
Documentation
- Fix wrong output format in example code (#1427) (c2470ed)
- Add OpenSSF Best Practices badge (#1430) (64918a8)
- Typo fixes in docling_document.md (#1400) (995b3b0)
- Updated the [Usage] link in architecture.md (#1416) (88948b0)
- ocr: Add docs entry for OnnxTR OCR plugin (#1382) (a7dd59c)
- security: More statements about secure development (#1381) (293c28c)
- Add testing in the docs (#1379) (01fbfd5)
- Add Notes for Installing in Intel macOS (#1377) (a026b4e)
v2.30.0 - 2025-04-14
Feature
- cli: Add option for html with split-page mode (#1355) (c0ba88e)
- xlsx: Create a page for each worksheet in XLSX backend (#1332) (eef2bde)
- OllamaVlmModel for Granite Vision 3.2 (#1337) (c605edd)
Fix
- deps: Widen typer upper bound (#1375) (7e40ad3)
- Auto-recognize .xlsx, .docx and .pptx files (#1340) (0de70e7)
- docx: Declare image_data variable when handling pictures (#1359) (415b877)
- Implement PictureDescriptionApiOptions.bitmap_area_threshold (#1248) (2503999)
- Properly address page in pipeline _assemble_document when page_range is provided (#1334) (6b696b5)
v2.29.0 - 2025-04-10
Feature
- Handle tags as code blocks (#1320) (0499cd1)
- docx: Add text formatting and hyperlink support (#630) (bfcab3d)
Fix
- docx: Adding new latex symbols, simplifying how equations are added to text (#1295) (14e9c0c)
- pptx: Check if picture shape has an image attached (#1316) (dc3bf9c)
- docx: Improve text parsing (#1268) (d2d6874)
- Tesseract OCR CLI can't process images composed with numbers only (#1201) (b3d111a)
Documentation
v2.28.4 - 2025-03-29
Fix
v2.28.3 - 2025-03-28
Fix
v2.28.2 - 2025-03-26
Fix
- Improve HTML layer detection, various MD fixes (#1241) (9210812)
- html: Fix HTML parsed heading level (#1244) (85c4df8)
v2.28.1 - 2025-03-25
Fix
- converter: Cache same pipeline class with different options (#1152) (825b226)
- debug: Missing translation of bbox to to_bounding_box (#1220) (6df8827)
- docx: Identifying numbered headers (#1231) (f739d0e)
Documentation
v2.28.0 - 2025-03-19
Feature
- SmolDocling: Support MLX acceleration in VLM pipeline (#1199) (1c26769)
- Add PPTX notes slides (#474) (b454aa1)
- Updated vlm pipeline (with latest changes from docling-core) (#1158) (2f72167)
Fix
- Determine correct page size in DoclingParseV4Backend (#1196) (f5adfb9)
- msword: Fixing function return in equations handling (#1194) (0b707d0)
Documentation
v2.27.0 - 2025-03-18
Feature
- Add factory for ocr engines via plugins (#1010) (6eaae3c)
- Add DoclingParseV4 backend, using high-level docling-parse API (#905) (3960b19)
- actor: Docling Actor on Apify infrastructure (#875) (772487f)
- Equations to latex in MSWord backend (with inline groups) (#1114) (6eb718f)
Fix
- html: Handle nested empty lists (#1154) (f94da44)
- Use first table row as col headers (#1156) (0945973)
- Pass tests, update docling-core to 2.22.0 (#1150) (aa92a57)
Documentation
v2.26.0 - 2025-03-11
Feature
Fix
Documentation
Performance
v2.25.2 - 2025-03-05
Fix
Documentation
v2.25.1 - 2025-03-03
Fix
- Enable locks for threadsafe pdfium (#1052) (8dc0562)
- html: Use 'start' attribute when parsing ordered lists from HTML docs (#1062) (de7b963)
Documentation
v2.25.0 - 2025-02-26
Feature
- [Experimental] Introduce VLM pipeline using HF AutoModelForVision2Seq, featuring SmolDocling model (#1054) (3c9fe76)
- cli: Add option for downloading all models, refine help messages (#1061) (ab683e4)
Fix
- Vlm using artifacts path (#1057) (e197225)
- html: Parse text in div elements as TextItem (#1041) (1b0ead6)
Documentation
v2.24.0 - 2025-02-20
Feature
v2.23.1 - 2025-02-20
Fix
Documentation
v2.23.0 - 2025-02-17
Feature
- Support cuda:n GPU device allocation (#694) (77eb77b)
- xml-jats: Parse XML JATS documents (#967) (428b656)
Fix
v2.22.0 - 2025-02-14
Feature
- Add support for CSV input with new backend to transform CSV files to DoclingDocument (#945) (00d9405)
- Introduce the enable_remote_services option to allow remote connections while processing (#941) (2716c7d)
- Allow artifacts_path to be defined as ENV (#940) (5101e25)
Fix
- Update Pillow constraints (#958) (af19c03)
- Fix the initialization of the TesseractOcrModel (#935) (c47ae70)
Documentation
- Update example Dockerfile with download CLI (#929) (7493d5b)
- Examples for picture descriptions (#951) (2d66e99)
v2.21.0 - 2025-02-10
Feature
v2.20.0 - 2025-02-07
Feature
Fix
v2.19.0 - 2025-02-07
Feature
Fix
- markdown: Handle nested lists (#910) (90b766e)
- Test cases for RTL programmatic PDFs and fixes for the formula model (#903) (9114ada)
- msword_backend: Handle conversion error in label parsing (#896) (722a6eb)
- Enrichment models batch size and expose picture classifier (#878) (5ad6de0)
Documentation
v2.18.0 - 2025-02-03
Feature
- Expose equation exports (#869) (6a76b49)
- Add option to define page range (#852) (70d68b6)
- docx: Support of SDTs in docx backend (#853) (d727b04)
- Python 3.13 support (#841) (4df085a)
Fix
- markdown: Fix parsing if doc ending with table (#873) (5ac2887)
- markdown: Add support for HTML content (#855) (94751a7)
- docx: Merged table cells not properly converted (#857) (0cd81a8)
- Processing of placeholder shapes in pptx that have text but no bbox (#868) (eff16b6)
- KeyError in tableformer prediction (#854) (b1cf796)
- Fixed docx import with headers that are also lists (#842) (2c037ae)
- Use new add_code in html backend and add more typing hints (#850) (2a1f8af)
- markdown: Fix empty block handling (#843) (bccb022)
- Fix for the crash when encountering WMF images in pptx and docx (#837) (fea0a99)
Documentation
- Updated the readme with upcoming features (#831) (d7c0828)
- Add example for inspection of picture content (#624) (f9144f2)
v2.17.0 - 2025-01-28
Feature
- CLI: Expose code and formula models in the CLI (#820) (6882e6c)
- Add platform info to CLI version printout (#816) (95b293a)
- ocr: Expose rec_keys_path in RapidOcrOptions to support custom dictionaries (#786) (5332755)
- Introduce automatic language detection in TesseractOcrCliModel (#800) (3be2fb5)
Fix
- Fix single newline handling in MD backend (#824) (5aed9f8)
- Use file extension if filetype fails with PDF (#827) (adf6353)
- Parse html with omitted body tag (#818) (a112d7a)
Documentation
- Document Docling JSON parsing (#819) (6875913)
- Add SSL verification error mitigation (#821) (5139b48)
- backend XML: Do not delete temp file in notebook (#817) (4d41db3)
- Typo (#814) (8a4ec77)
- Added markdown headings to enable TOC in github pages (#808) (b885b2f)
- Description of supported formats and backends (#788) (c2ae1cc)
v2.16.0 - 2025-01-24
Feature
- New document picture classifier (#805) (16a218d)
- Add Docling JSON ingestion (#783) (88a0e66)
- Code and equation model for PDF and code blocks in markdown (#752) (3213b24)
- Add "auto" language for TesseractOcr (#759) (8543c22)
Fix
- Added extraction of byte-images in excel (#804) (a458e29)
- Update docling-parse-v2 backend version with new parsing fixes (#769) (670a08b)
Documentation
- Fix minor typos (#801) (c58f75d)
- Add Azure RAG example (#675) (9020a93)
- Fix links between docs pages (#697) (c49b352)
- Fix correct Accelerator pipeline options in docs/examples/custom_convert.py (#733) (7686083)
- Example to translate documents (#739) (f7e1cbf)
v2.15.1 - 2025-01-10
Fix
- Improve OCR results, tighten criteria before dropping bitmap areas (#719) (5a060f2)
- Allow earlier requests versions (#716) (e64b5a2)
Documentation
v2.15.0 - 2025-01-08
Feature
Fix
- Correct scaling of debug visualizations, tune OCR (#700) (5cb4cf6)
- Let BeautifulSoup detect the HTML encoding (#695) (42856fd)
- mspowerpoint: Handle invalid images in PowerPoint slides (#650) (d49650c)
Documentation
- Specify docstring types (#702) (ead396a)
- Add link to rag with granite (#698) (6701f34)
- Add integrations, revamp docs (#693) (2d24fae)
- Add OpenContracts as an integration (#679) (569038d)
- Add Weaviate RAG recipe notebook (#451) (2b591f9)
- Document Haystack & Vectara support (#628) (fc645ea)
v2.14.0 - 2024-12-18
Feature
v2.13.0 - 2024-12-17
Feature
- Updated Layout processing with forms and key-value areas (#530) (60dc852)
- Create a backend to parse USPTO patents into DoclingDocument (#606) (4e08750)
- Add Easyocr parameter recog_network (#613) (3b53bd3)
Documentation
- Add Haystack RAG example (#615) (3e599c7)
- Fix the path to the run_with_accelerator.py example (#608) (3bb3bf5)
v2.12.0 - 2024-12-13
Feature
v2.11.0 - 2024-12-12
Feature
Fix
- Do not import python modules from deepsearch-glm (#569) (aee9c0b)
- Handle no result from RapidOcr reader (#558) (f45499c)
- Make enum serializable with human-readable value (#555) (a7df337)
Documentation
v2.10.0 - 2024-12-09
Feature
Fix
- Call into docling-core for legacy document transform (#551) (7972d47)
- Introduce Image format options in CLI. Silence the tqdm downloading messages. (#544) (78f61a8)
v2.9.0 - 2024-12-09
Feature
- Expose new hybrid chunker, update docs (#384) (c8ecdd9)
- MS Word backend: Make detection of headers and other styles localization agnostic (#534) (3e073df)
Fix
- Correcting DefaultText ID for MS Word backend (#537) (eb7ffcd)
- Add py.typed marker file (#531) (9102fe1)
- Enable HTML export in CLI and add options for image mode (#513) (0d11e30)
- Missing text in docx (t tag) when embedded in a table (#528) (b730b2d)
- Restore pydantic version pin after fixes (#512) (c830b92)
- Folder input in cli (#511) (8ada0bc)
Documentation
v2.8.3 - 2024-12-03
Fix
v2.8.2 - 2024-12-03
Fix
- ParserError EOF inside string (#470) (#472) (c90c41c)
- PermissionError when using tesseract_ocr_cli_model (#496) (d3f84b2)
Documentation
- Add styling for faq (#502) (5ba3807)
- Typo in faq (#484) (33cff98)
- Add automatic api reference (#475) (d487210)
- Introduce faq section (#468) (8ccb3c6)
Performance
v2.8.1 - 2024-11-29
Fix
Documentation
v2.8.0 - 2024-11-27
Feature
Fix
- Use correct image index in word backend (#442) (767563b)
- Update tests and examples for docling-core 2.5.1 (#449) (29807a2)
v2.7.1 - 2024-11-26
Fix
Documentation
v2.7.0 - 2024-11-20
Feature
Fix
v2.6.0 - 2024-11-19
Feature
- Added support for exporting DocItem to an image when page image is available (#379) (3f91e7d)
- Expose ocr-lang in CLI (#375) (ed785ea)
- Added excel backend (#334) (926dfd2)
- Extracting picture data for raster images found in PPTX (#349) (7a97d71)
Fix
- Fixing images in the input Word files (#330) (8533039)
- Reduce logging by keeping option for more verbose (#323) (8b437ad)
Documentation
- Fixed typo in v2 example v2 (#378) (911c3bd)
- Add automatic generation of CLI reference (#325) (ca8524e)
- Add architecture outline (#341) (25fd149)
- Fix parameter in usage.md (#332) (835e077)
v2.5.2 - 2024-11-13
Fix
v2.5.1 - 2024-11-12
Fix
Documentation
v2.5.0 - 2024-11-12
Feature
- OCR: Introduce the OcrOptions.force_full_page_ocr parameter that forces a full page OCR scanning (#290) (c6b3763)
Fix
- Configure env prefix for docling settings (#315) (5d4a10b)
- Added handling of grouped elements in pptx backend (#307) (81c8243)
- Allow mps usage for easyocr (#286) (97f214e)
Documentation
v2.4.2 - 2024-11-08
Fix
- EasyOcrModel: Support the use_gpu pipeline parameter in EasyOcrModel. Initialize easyocr (#282) (0eb065e)
v2.4.1 - 2024-11-08
Fix
- tesserocr: Raise Exception if tesserocr has not loaded any languages (#279) (704d792)
- Dockerfile example copy command (#234) (90836db)
Documentation
- Update badges & credits (#248) (a84ec27)
- Add coming-soon section (#235) (5ce02c5)
- Add artifacts-path param to CLI (#233) (d5e65ae)
v2.4.0 - 2024-11-04
Feature
Documentation
- Add explicit artifacts path example (#224) (eeee3b4)
- Update custom convert and dockerfile (#226) (5f5fea9)
- Correct spelling of 'individual' (#219) (41acaa9)
- Update LlamaIndex docs (#196) (244ca69)
v2.3.1 - 2024-10-30
Fix
- Simplify torch dependencies and update pinned docling deps (#190) (eb679cc)
- Allow to explicitly initialize the pipeline (#189) (904d24d)
v2.3.0 - 2024-10-30
Feature
Fix
v2.2.1 - 2024-10-28
Fix
- Fix header levels for DOCX & HTML (#184) (b9f5c74)
- Handling of long sequence of unescaped underscore chars in markdown (#173) (94d0729)
- HTML backend, fixes for Lists and nested texts (#180) (7d19418)
- MD Backend, fixes to properly handle trailing inline text and emphasis in headers (#178) (88c1673)
Documentation
- Update LlamaIndex docs for Docling v2 (#182) (2cece27)
- Fix batch convert (#177) (189d3c2)
- Add export with embedded images (#175) (8d356aa)
v2.2.0 - 2024-10-23
Feature
- Update to docling-parse v2 without history (#170) (4116819)
- Support AsciiDoc and Markdown input format (#168) (3023f18)
Fix
v2.1.0 - 2024-10-18
Feature
Fix
Documentation
- Typo fix (#155) (f799e77)
- Add graphical band in readme (#154) (034a411)
- Add use docling (#150) (61c092f)
v2.0.0 - 2024-10-16
Feature
Breaking
Documentation
v1.20.0 - 2024-10-11
Feature
v1.19.1 - 2024-10-11
Fix
- Remove stderr from tesseract cli and introduce fuzziness in the text validation of OCR tests (#138) (dae2a3b)
Documentation
v1.19.0 - 2024-10-08
Feature
v1.18.0 - 2024-10-03
Feature
v1.17.0 - 2024-10-03
Feature
v1.16.1 - 2024-09-27
Fix
Documentation
v1.16.0 - 2024-09-27
Feature
v1.15.0 - 2024-09-24
Feature
v1.14.0 - 2024-09-24
Feature
Fix
Documentation
v1.13.1 - 2024-09-23
Fix
v1.13.0 - 2024-09-18
Feature
Fix
Documentation
v1.12.2 - 2024-09-17
Fix
v1.12.1 - 2024-09-16
Fix
v1.12.0 - 2024-09-13
Feature
Documentation
v1.11.0 - 2024-09-10
Feature
v1.10.0 - 2024-09-10
Feature
v1.9.0 - 2024-09-03
Feature
Documentation
v1.8.5 - 2024-08-30
Fix
v1.8.4 - 2024-08-30
Fix
Documentation
v1.8.3 - 2024-08-28
Fix
v1.8.2 - 2024-08-27
Fix
Documentation
v1.8.1 - 2024-08-26
Fix
v1.8.0 - 2024-08-23
Feature
v1.7.1 - 2024-08-23
Fix
- Better raise exception when a page fails to parse (#46) (8808463)
- Upgrade docling-parse to 1.1.1, safety checks for failed parse on pages (#45) (7e84533)
v1.7.0 - 2024-08-22
Feature
v1.6.3 - 2024-08-22
Fix
v1.6.2 - 2024-08-22
Fix
v1.6.1 - 2024-08-21
Fix
v1.6.0 - 2024-08-20
Feature
v1.5.0 - 2024-08-20
Feature
Documentation
v1.4.0 - 2024-08-14
Feature
Fix
v1.3.0 - 2024-08-12
Feature
v1.2.1 - 2024-08-07
Fix
Documentation
v1.2.0 - 2024-08-07
Feature
v1.1.2 - 2024-07-31
Fix
v1.1.1 - 2024-07-30
Fix
v1.1.0 - 2024-07-26
Feature
v1.0.2 - 2024-07-24
Fix
v1.0.1 - 2024-07-24
Fix
v1.0.0 - 2024-07-18
Feature
Breaking
v0.4.0 - 2024-07-17
Feature
v0.3.1 - 2024-07-17
Fix
Documentation
v0.3.0 - 2024-07-17
Feature
Documentation
v0.2.0 - 2024-07-16
Feature