docling/docling
Alexander Vaagan 733360c7b2 A new HTML backend that handles styled HTML (ignores it) as well as images.
- Updated unit tests
- Added documentation (Example notebook)

Note: MyPy fails.
This seems to be a known issue with BeautifulSoup:
https://github.com/python/typeshed/pull/13604

Signed-off-by: Alexander Vaagan <alexander.vaagan@gmail.com>
Signed-off-by: vaaale <2428222+vaaale@users.noreply.github.com>
2025-05-24 22:29:22 +02:00
backend A new HTML backend that handles styled HTML (ignores it) as well as images. 2025-05-24 22:29:22 +02:00
chunking feat: expose new hybrid chunker, update docs (#384) 2024-12-09 08:28:29 +01:00
cli fix: add smoldocling in download utils (#1577) 2025-05-12 10:48:07 +02:00
datamodel feat: Improve parallelization for remote services API calls (#1548) 2025-05-14 15:47:55 +02:00
models feat: Improve parallelization for remote services API calls (#1548) 2025-05-14 15:47:55 +02:00
pipeline ci: add coverage and ruff (#1383) 2025-04-14 18:01:26 +02:00
utils fix: add smoldocling in download utils (#1577) 2025-05-12 10:48:07 +02:00
__init__.py Initial commit 2024-07-15 09:42:42 +02:00
document_converter.py fix: usage of hashlib for FIPS (#1512) 2025-05-02 15:03:29 +02:00
exceptions.py feat: Introduce the enable_remote_services option to allow remote connections while processing (#941) 2025-02-12 15:18:01 +01:00
py.typed fix: Add py.typed marker file (#531) 2024-12-06 13:42:14 +01:00