chore: typo fix (#1465)

---------

Signed-off-by: nkh0472 <67589323+nkh0472@users.noreply.github.com>
Author: nkh0472
Date: 2025-04-28 14:52:09 +08:00
Committed by: GitHub
Parent: 3afbe6c969
Commit: a097ccd8d5
14 changed files with 19 additions and 19 deletions


@@ -569,7 +569,7 @@
 "The `DoclingDocument` format of the converted patents has a rich hierarchical structure, inherited from the original XML document and preserved by the Docling custom backend.\n",
 "In this notebook, we will leverage:\n",
 "- The `SimpleDirectoryReader` pattern to iterate over the exported XML files created in section [Fetch the data](#fetch-the-data).\n",
-"- The LlamaIndex extensions, `DoclingReader` and `DoclingNodeParser`, to ingest the patent chunks into a Milvus vectore store.\n",
+"- The LlamaIndex extensions, `DoclingReader` and `DoclingNodeParser`, to ingest the patent chunks into a Milvus vector store.\n",
 "- The `HierarchicalChunker` implementation, which applies a document-based hierarchical chunking, to leverage the patent structures like sections and paragraphs within sections.\n",
 "\n",
 "Refer to other possible implementations and usage patterns in the [Chunking](../../concepts/chunking/) documentation and the [RAG with LlamaIndex](../rag_llamaindex/) notebook."


@@ -206,7 +206,7 @@
 "source": [
 "Points to notice looking at the output chunks below:\n",
 "- Where possible, we fit the limit of 64 tokens for the metadata-enriched serialization form (see chunk 2)\n",
-"- Where neeeded, we stop before the limit, e.g. see cases of 63 as it would otherwise run into a comma (see chunk 6)\n",
+"- Where needed, we stop before the limit, e.g. see cases of 63 as it would otherwise run into a comma (see chunk 6)\n",
 "- Where possible, we merge undersized peer chunks (see chunk 0)\n",
 "- \"Tail\" chunks trailing right after merges may still be undersized (see chunk 8)"
 ]
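
The 64-token budget and peer-merging behaviour described in this cell map onto docling's `HybridChunker` options; a minimal sketch follows, with the tokenizer and input document assumed. Note that `contextualize` produces the metadata-enriched serialization form and was named `serialize` in older docling-core releases.

```python
from docling.chunking import HybridChunker
from docling.document_converter import DocumentConverter
from transformers import AutoTokenizer

# Input document is illustrative (the Docling technical report).
doc = DocumentConverter().convert("https://arxiv.org/pdf/2408.09869").document

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
chunker = HybridChunker(
    tokenizer=tokenizer,
    max_tokens=64,     # the limit discussed in the cell above
    merge_peers=True,  # merge undersized sibling chunks where possible
)

for i, chunk in enumerate(chunker.chunk(dl_doc=doc)):
    enriched = chunker.contextualize(chunk=chunk)  # metadata-enriched form
    print(f"chunk {i}: {len(tokenizer.tokenize(enriched))} tokens")
```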


@@ -279,7 +279,7 @@
 "## Use other vision models\n",
 "\n",
 "The examples above can also be reproduced using other vision model.\n",
-"The Docling options `PictureDescriptionVlmOptions` allows to speficy your favorite vision model from the Hugging Face Hub."
+"The Docling options `PictureDescriptionVlmOptions` allows to specify your favorite vision model from the Hugging Face Hub."
 ]
 },
 {
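
The cell above points at `PictureDescriptionVlmOptions`; here is a minimal sketch of selecting a Hugging Face vision model through it. The model, prompt, and input URL are chosen for illustration only.

```python
from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import (
    PdfPipelineOptions,
    PictureDescriptionVlmOptions,
)
from docling.document_converter import DocumentConverter, PdfFormatOption

pipeline_options = PdfPipelineOptions()
pipeline_options.do_picture_description = True
pipeline_options.generate_picture_images = True  # keep images for annotation
pipeline_options.images_scale = 2.0
pipeline_options.picture_description_options = PictureDescriptionVlmOptions(
    repo_id="HuggingFaceTB/SmolVLM-256M-Instruct",  # any compatible HF vision model
    prompt="Describe the image in three sentences. Be concise and accurate.",
)

converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options)
    }
)
result = converter.convert("https://arxiv.org/pdf/2501.17887")
```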


@@ -32,7 +32,7 @@
 "\n",
 "Note: For best results, please use **GPU acceleration** to run this notebook. Here are two options for running this notebook:\n",
 "1. **Locally on a MacBook with an Apple Silicon chip.** Converting all documents in the notebook takes ~2 minutes on a MacBook M2 due to Docling's usage of MPS accelerators.\n",
-"2. **Run this notebook on Google Colab.** Converting all documents in the notebook takes ~8 mintutes on a Google Colab T4 GPU.\n"
+"2. **Run this notebook on Google Colab.** Converting all documents in the notebook takes ~8 minutes on a Google Colab T4 GPU.\n"
 ]
 },
 {
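
The MPS and Colab T4 timings above reflect docling's hardware acceleration; a minimal sketch of steering device selection explicitly via `AcceleratorOptions` (thread count illustrative):

```python
from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import (
    AcceleratorDevice,
    AcceleratorOptions,
    PdfPipelineOptions,
)
from docling.document_converter import DocumentConverter, PdfFormatOption

pipeline_options = PdfPipelineOptions()
# AUTO resolves to MPS on Apple Silicon and CUDA on a Colab T4 when available.
pipeline_options.accelerator_options = AcceleratorOptions(
    num_threads=8, device=AcceleratorDevice.AUTO
)

converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options)
    }
)
```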


@@ -43,7 +43,7 @@
 "\n",
 "Note: For best results, please use **GPU acceleration** to run this notebook. Here are two options for running this notebook:\n",
 "1. **Locally on a MacBook with an Apple Silicon chip.** Converting all documents in the notebook takes ~2 minutes on a MacBook M2 due to Docling's usage of MPS accelerators.\n",
-"2. **Run this notebook on Google Colab.** Converting all documents in the notebook takes ~8 mintutes on a Google Colab T4 GPU."
+"2. **Run this notebook on Google Colab.** Converting all documents in the notebook takes ~8 minutes on a Google Colab T4 GPU."
 ]
 },
 {
@@ -716,7 +716,7 @@
 "id": "7tGz49nfUegG"
 },
 "source": [
-"We can see that our RAG pipeline performs relatively well for simple queries, especially given the small size of the dataset. Scaling this method for converting a larger sample of PDFs would require more compute (GPUs) and a more advanced deployment of Weaviate (like Docker, Kubernetes, or Weaviate Cloud). For more information on available Weaviate configurations, check out the [documetation](https://weaviate.io/developers/weaviate/starter-guides/which-weaviate)."
+"We can see that our RAG pipeline performs relatively well for simple queries, especially given the small size of the dataset. Scaling this method for converting a larger sample of PDFs would require more compute (GPUs) and a more advanced deployment of Weaviate (like Docker, Kubernetes, or Weaviate Cloud). For more information on available Weaviate configurations, check out the [documentation](https://weaviate.io/developers/weaviate/starter-guides/which-weaviate)."
 ]
 }
],