Mirror of https://github.com/DS4SD/docling.git, synced 2025-07-24 02:54:25 +00:00
* scaffolding in place
  Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* doing scaffolding for audio pipeline
  Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* WIP: got first transcription working
  Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* all working, time to start cleaning up
  Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* first working ASR pipeline
  Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* added openai-whisper as a first transcription model
  Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* updating with asr_options
  Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* finalised the first working ASR pipeline with Whisper
  Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* use whisper from the latest git commit
  Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* Update docling/datamodel/pipeline_options.py
  Co-authored-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com>
  Signed-off-by: Peter W. J. Staar <91719829+PeterStaar-IBM@users.noreply.github.com>
* Update docling/datamodel/pipeline_options.py
  Co-authored-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com>
  Signed-off-by: Peter W. J. Staar <91719829+PeterStaar-IBM@users.noreply.github.com>
* updated comment
  Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* AudioBackend -> DummyBackend
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* file rename
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Rename to NoOpBackend, add test for ASR pipeline
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Support every format in NoOpBackend
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add missing audio file and test
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Install ffmpeg system dependency for ASR test
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

---------

Signed-off-by: Peter Staar <taa@zurich.ibm.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Signed-off-by: Peter W. J. Staar <91719829+PeterStaar-IBM@users.noreply.github.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>

* data
* data_scanned
* __init__.py
* test_asr_pipeline.py
* test_backend_asciidoc.py
* test_backend_csv.py
* test_backend_docling_json.py
* test_backend_docling_parse_v2.py
* test_backend_docling_parse_v4.py
* test_backend_docling_parse.py
* test_backend_html.py
* test_backend_jats.py
* test_backend_markdown.py
* test_backend_msexcel.py
* test_backend_msword.py
* test_backend_patent_uspto.py
* test_backend_pdfium.py
* test_backend_pptx.py
* test_backend_webp.py
* test_cli.py
* test_code_formula.py
* test_data_gen_flag.py
* test_document_picture_classifier.py
* test_e2e_conversion.py
* test_e2e_ocr_conversion.py
* test_input_doc.py
* test_interfaces.py
* test_invalid_input.py
* test_legacy_format_transform.py
* test_options.py
* test_settings_load.py
* verify_utils.py