Mirror of https://github.com/DS4SD/docling.git (synced 2025-08-02 07:22:14 +00:00)
docs: added markdown headings to enable TOC in github pages
Signed-off-by: Farzad Sunavala <40604067+farzad528@users.noreply.github.com>
parent 9e4ca90db1
commit 21cc7c4451
@@ -33,6 +33,7 @@
 "metadata": {},
 "source": [
 "\n",
+"## A recipe 🧑🍳 🐥 💚\n",
 "\n",
 "This notebook demonstrates how to build a Retrieval-Augmented Generation (RAG) system using:\n",
 "- [Docling](https://ds4sd.github.io/docling/) for document parsing and chunking\n",
@@ -61,7 +62,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Part 0: Prerequisites\n",
+"### Part 0: Prerequisites\n",
 " - **Azure AI Search** resource\n",
 " - **Azure OpenAI** resource with a deployed embedding and chat completion model (e.g. `text-embedding-3-small` and `gpt-4o`) \n",
 " - **Docling 2.12+** (installs `docling_core` automatically) Docling installed (Python 3.8+ environment)\n",
@@ -114,7 +115,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"## Part 1: Parse the PDF with Docling\n",
+"### Part 1: Parse the PDF with Docling\n",
 "\n",
 "We’ll parse the **Microsoft GraphRAG Research Paper** (~15 pages). Parsing should be relatively quick, even on CPU, but it will be faster on a GPU or MPS device if available.\n",
 "\n",
@@ -235,7 +236,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Part 2: Hierarchical Chunking\n",
+"### Part 2: Hierarchical Chunking\n",
 "We convert the `Document` into smaller chunks for embedding and indexing. The built-in `HierarchicalChunker` preserves structure. "
 ]
 },
@@ -382,7 +383,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### Embed and Upsert to Azure AI Search\n"
+"#### Embed and Upsert to Azure AI Search\n"
 ]
 },
 {
@@ -497,7 +498,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Part 4: RAG Query with Azure OpenAI\n",
+"### Part 4: RAG Query with Azure OpenAI\n",
 "Combine retrieval from Azure Search with Chat Completions (aka. grounding your LLM)"
 ]
 },