feat: Repetition-based StoppingCriteria for GraniteDocling (#2323)

* Experimental code for repetition detection, vLLM streaming

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Update vLLM streaming

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Update vLLM inference code, CLI and VLM specs

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Fix generation and decoder args for HF model

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Fix vLLM device args

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Cleanup

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Bugfixes

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Remove vLLM streaming for the moment

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Add repetition StoppingCriteria for GraniteDocling/SmolDocling

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Make GenerationStopper base class and port for MLX

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Add streaming support and custom GenerationStopper support for ApiVlmModel

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Fixes for ApiVlmModel

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Fixes for ApiVlmModel

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Fix api_image_request_streaming when GenerationStopper triggers.

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Move DocTagsRepetitionStopper to utility unit, update examples

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

---------

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Christoph Auer
2025-09-30 15:26:09 +02:00
committed by GitHub
parent 68ae7ccf3c
commit 1e9dc43b72
15 changed files with 541 additions and 38 deletions

tests/data_scanned/old_newspaper.png (new binary file, 4.0 MiB, not shown)