docs: update opensearch notebook and backend documentation (#2519)

* docs(opensearch): update the example notebook RAG with OpenSearch

Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>

* docs(uspto): remove direct usage of the backend class for conversion

Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>

* docs: remove direct usage of backends from documentation

Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>

---------

Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>
2025-12-08 20:58:11 +00:00 · 2025-10-27 10:02:50 +01:00
parent 10c1f06b74
commit 9a6fdf936b
3 changed files with 536 additions and 307 deletions
--- a/docs/usage/advanced_options.md
+++ b/docs/usage/advanced_options.md
@@ -163,37 +163,3 @@ result = converter.convert(source)
 ## Limit resource usage

 You can limit the CPU threads used by Docling by setting the environment variable `OMP_NUM_THREADS` accordingly. The default setting is using 4 CPU threads.
-
-
-## Use specific backend converters
-
-!!! note
-
-    This section discusses directly invoking a [backend](../concepts/architecture.md),
-    i.e. using a low-level API. This should only be done when necessary. For most cases,
-    using a `DocumentConverter` (high-level API) as discussed in the sections above
-    should suffice — and is the recommended way.
-
-By default, Docling will try to identify the document format to apply the appropriate conversion backend (see the list of [supported formats](supported_formats.md)).
-You can restrict the `DocumentConverter` to a set of allowed document formats, as shown in the [Multi-format conversion](../examples/run_with_formats.py) example.
-Alternatively, you can also use the specific backend that matches your document content. For instance, you can use `HTMLDocumentBackend` for HTML pages:
-
-```python
-import urllib.request
-from io import BytesIO
-from docling.backend.html_backend import HTMLDocumentBackend
-from docling.datamodel.base_models import InputFormat
-from docling.datamodel.document import InputDocument
-
-url = "https://en.wikipedia.org/wiki/Duck"
-text = urllib.request.urlopen(url).read()
-in_doc = InputDocument(
-    path_or_stream=BytesIO(text),
-    format=InputFormat.HTML,
-    backend=HTMLDocumentBackend,
-    filename="duck.html",
-)
-backend = HTMLDocumentBackend(in_doc=in_doc, path_or_stream=BytesIO(text))
-dl_doc = backend.convert()
-print(dl_doc.export_to_markdown())
-```
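
Note on the hunk above: the removed section invoked `HTMLDocumentBackend` directly. For readers who relied on that snippet, a rough high-level equivalent might look like the sketch below. It is only a sketch under assumptions, not necessarily what the updated docs recommend: the helper name `html_to_markdown` is hypothetical, and it assumes `DocumentStream` is importable from `docling.datamodel.base_models` and that `DocumentConverter` accepts an `allowed_formats` argument, as in the Multi-format conversion example referenced in the removed text.

```python
from io import BytesIO


def html_to_markdown(html: bytes, name: str = "page.html") -> str:
    """Convert raw HTML bytes to Markdown via the high-level DocumentConverter.

    Hypothetical helper for illustration; it is not part of Docling.
    """
    # Imported lazily so this module can be loaded without docling installed.
    # Assumption: DocumentStream lives in docling.datamodel.base_models.
    from docling.datamodel.base_models import DocumentStream, InputFormat
    from docling.document_converter import DocumentConverter

    # Restrict the converter to HTML instead of picking a backend class by hand.
    converter = DocumentConverter(allowed_formats=[InputFormat.HTML])

    # Wrap the in-memory bytes; the ".html" name helps format detection.
    source = DocumentStream(name=name, stream=BytesIO(html))

    result = converter.convert(source)
    return result.document.export_to_markdown()


if __name__ == "__main__":
    sample = b"<html><body><h1>Duck</h1><p>Quack.</p></body></html>"
    print(html_to_markdown(sample, name="duck.html"))
```

Compared with the removed snippet, the caller no longer builds an `InputDocument` or names a backend class, so the choice of backend stays inside the converter.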