docling/docs/examples
Panos Vagenas ce38baf7f7 add multiple improvements and fixes
Add typing, switch to list comprehensions where possible,
encapsulate all methods within new chunker implementation,
use dataclass instead of unmanaged dictionary,
list dependencies in setup installation line.

Fix token counting bug due to static initialization of
`semchunk.Chunker`.

Use expanded chunk typing (from -core) including
embedding-specific and gen-specific texts.

Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
2024-11-19 23:36:50 +01:00
advanced_chunking_with_merging.ipynb add multiple improvements and fixes 2024-11-19 23:36:50 +01:00
advanced_chunking.ipynb docs: add advanced chunking example 2024-11-01 09:38:37 +01:00
batch_convert.py feat: Add pipeline timings and toggle visualization, establish debug settings (#183) 2024-10-30 15:04:19 +01:00
custom_convert.py feat!: Docling v2 (#117) 2024-10-16 21:02:03 +02:00
develop_picture_enrichment.py feat!: Docling v2 (#117) 2024-10-16 21:02:03 +02:00
export_figures.py docs: add export with embedded images (#175) 2024-10-24 20:19:41 +02:00
export_multimodal.py feat!: Docling v2 (#117) 2024-10-16 21:02:03 +02:00
export_tables.py feat!: Docling v2 (#117) 2024-10-16 21:02:03 +02:00
minimal.py chore: various minor docs fixes (#169) 2024-10-22 15:29:36 +02:00
rag_langchain.ipynb Sample chunking notebook that includes merging, etc. (#193) 2024-11-19 23:12:04 +01:00
rag_llamaindex.ipynb docs: update LlamaIndex docs for Docling v2 (#182) 2024-10-28 14:28:26 +01:00
run_md.py feat: Support AsciiDoc and Markdown input format (#168) 2024-10-23 16:14:26 +02:00
run_with_formats.py fix: fix duplicate title and heading + add e2e tests for html and docx (#186) 2024-10-30 13:14:56 +01:00