Nikos Livathinos
|
bb8cd0f7fc
|
fix: Rename the tesseract OCR related classes and filenames
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
|
2024-10-08 16:46:25 +02:00 |
|
Nikos Livathinos
|
70a8a2cc82
|
chore(OCR): Rename class names to use Tesseract for the tesserocr and TesseractCLI for the tesseract process
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
|
2024-10-08 14:44:23 +02:00 |
|
Nikos Livathinos
|
29e65e911b
|
fix(test): Introduce parameter in verify_conversion_result() to allow skipping the verification of the cells. It is used in case of OCR tests.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
|
2024-10-08 14:30:33 +02:00 |
|
Nikos Livathinos
|
be6489bde0
|
fix(tests): Refactor the data_scanned with a very simple document that allows all OCR engines to produce the same result.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
|
2024-10-08 07:07:28 +02:00 |
|
Nikos Livathinos
|
49652eec54
|
feat(tests): Introduce fuzzy text comparison for OCR tests based on Levenshtein edit distance
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
|
2024-10-04 14:13:24 +02:00 |
|
Michele Dolfi
|
f57e4b2afb
|
add tesseract in CI, improve error messages and allow to specify the tesseract cmd
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
|
2024-10-03 18:59:29 +02:00 |
|
Nikos Livathinos
|
e571ab50ee
|
fix(tests): Extend test_e2e_ocr_conversion to cover all OCR engines (easyocr, tesserocr, tesseract)
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
|
2024-10-03 16:49:23 +02:00 |
|
Nikos Livathinos
|
c28846a866
|
feat: Implement the TesserOcrModel. Introduce the test_e2e_ocr_conversion.py unit test.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
|
2024-10-02 18:12:32 +02:00 |
|
Peter Staar
|
a3e2cf5473
|
fixed conflicts
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
|
2024-10-02 17:01:34 +02:00 |
|
Michele Dolfi
|
0b76211eed
|
add examples for switching OCR engine and CLI support
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
|
2024-10-02 16:57:48 +02:00 |
|
Nikos Livathinos
|
c211808742
|
feat: tesseract and tesserocr models. WIP.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
|
2024-10-02 13:35:00 +02:00 |
|