mirror of
https://github.com/DS4SD/docling.git
synced 2025-07-27 04:24:45 +00:00
add docs for TESSDATA_PREFIX
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
parent
ea3f720ef5
commit
5bd64779d1
10
README.md
@@ -85,23 +85,31 @@ Works on macOS, Linux and Windows environments. Both x86_64 and arm64 architectu
[Tesseract](https://github.com/tesseract-ocr/tesseract) is a popular OCR engine which is available
on most operating systems. For using this engine with Docling, Tesseract must be installed on your
system, using the packaging tool of your choice. Below we provide example commands.
After installing Tesseract you are expected to provide the path to its language files using the
`TESSDATA_PREFIX` environment variable (note that it must terminate with a slash `/`).
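
The two requirements above (a trailing slash, and a directory that actually holds language files) can be checked before running Docling. The following is a minimal sketch, not part of Docling or Tesseract: the function name `check_tessdata_prefix` is hypothetical, and the check relies on Tesseract language files using the `.traineddata` extension.

```shell
# Hypothetical helper (not part of Docling or Tesseract): checks a candidate
# TESSDATA_PREFIX value before it is used.
check_tessdata_prefix() {
    prefix="$1"
    # The path must end with a slash, as required above.
    case "$prefix" in
        */) ;;
        *) echo "missing trailing slash: $prefix" >&2; return 1 ;;
    esac
    # Tesseract language files use the .traineddata extension.
    for f in "$prefix"*.traineddata; do
        [ -e "$f" ] && return 0
    done
    echo "no .traineddata files in: $prefix" >&2
    return 1
}

# Example call against the current environment; prints a status line either way.
if check_tessdata_prefix "${TESSDATA_PREFIX:-}"; then
    echo "TESSDATA_PREFIX looks usable"
else
    echo "TESSDATA_PREFIX needs attention"
fi
```

An unset or empty variable is reported as a missing trailing slash, since an empty string cannot end with `/`.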

For macOS, we recommend using [Homebrew](https://brew.sh/).

```console
brew install tesseract leptonica pkg-config
TESSDATA_PREFIX=/opt/homebrew/share/tessdata/
echo "Set TESSDATA_PREFIX=${TESSDATA_PREFIX}"
```
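
The `/opt/homebrew` path above is the default Homebrew prefix on Apple silicon; on Intel Macs the default prefix is `/usr/local`. As a sketch, the path can be derived from `brew --prefix` instead of being hard-coded. The helper name `homebrew_tessdata_dir` is hypothetical, and the `share/tessdata/` layout under the prefix is assumed from the path shown above.

```shell
# Hypothetical helper: builds the tessdata path from a Homebrew prefix.
# Assumes the language files sit under <prefix>/share/tessdata/, as in the
# hard-coded path above. A trailing slash on the prefix is tolerated.
homebrew_tessdata_dir() {
    printf '%s/share/tessdata/\n' "${1%/}"
}

# Use the prefix reported by Homebrew when brew is available.
if command -v brew >/dev/null 2>&1; then
    TESSDATA_PREFIX="$(homebrew_tessdata_dir "$(brew --prefix)")"
    echo "Set TESSDATA_PREFIX=${TESSDATA_PREFIX}"
fi
```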

For Debian-based systems.

```console
apt-get install tesseract-ocr tesseract-ocr-eng libtesseract-dev libleptonica-dev pkg-config
TESSDATA_PREFIX=$(dpkg -L tesseract-ocr-eng | grep tessdata$)
echo "Set TESSDATA_PREFIX=${TESSDATA_PREFIX}"
```
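
`dpkg -L` normally lists directories without a trailing slash, so the value captured above may not satisfy the requirement that `TESSDATA_PREFIX` end with `/`. A minimal sketch of a normalization step follows; the function name `with_trailing_slash` and the example directory are made up for illustration.

```shell
# Append exactly one trailing slash, whether or not the value already has one.
# POSIX parameter expansion: ${1%/} strips a single trailing slash if present.
with_trailing_slash() {
    printf '%s/\n' "${1%/}"
}

# Example with a made-up directory (an assumption, not taken from a real system).
TESSDATA_PREFIX="$(with_trailing_slash "/path/to/tessdata")"
echo "Set TESSDATA_PREFIX=${TESSDATA_PREFIX}"
```

In practice the argument would be the output of the `dpkg -L` pipeline rather than a literal path.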

For RHEL systems.

```console
dnf install tesseract tesseract-devel tesseract-langpack-eng leptonica-devel
TESSDATA_PREFIX=/usr/share/tesseract/tessdata/
echo "Set TESSDATA_PREFIX=${TESSDATA_PREFIX}"
```
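
A plain assignment such as the one above sets the variable only in the current shell; child processes do not inherit it unless it is exported. The sketch below reuses the RHEL path from the example and uses `sh -c` as a stand-in for any program started from the shell.

```shell
# Path taken from the RHEL example above.
TESSDATA_PREFIX=/usr/share/tesseract/tessdata/
export TESSDATA_PREFIX

# A child process now sees the value; sh -c stands in for any program
# started from this shell.
sh -c 'echo "child sees TESSDATA_PREFIX=${TESSDATA_PREFIX}"'
```

To keep the setting across sessions, the `export` line could be added to the shell's startup file.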

#### Linking to Tesseract