mirror of
https://github.com/DS4SD/docling.git
synced 2025-08-01 15:02:21 +00:00
Add typing, switch to list comprehensions where possible, encapsulate all methods within new chunker implementation, use dataclass instead of unmanaged dictionary, list dependencies in setup installation line. Fix token counting bug due to static initialization of `semchunk.Chunker`. Use expanded chunk typing (from -core) including embedding-specific and gen-specific texts. Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com> |
||
---|---|---|
advanced_chunking_with_merging.ipynb | ||
advanced_chunking.ipynb | ||
batch_convert.py | ||
custom_convert.py | ||
develop_picture_enrichment.py | ||
export_figures.py | ||
export_multimodal.py | ||
export_tables.py | ||
minimal.py | ||
rag_langchain.ipynb | ||
rag_llamaindex.ipynb | ||
run_md.py | ||
run_with_formats.py |