mirror of
https://github.com/DS4SD/docling.git
synced 2025-12-08 20:58:11 +00:00
feat: add a backend parser for WebVTT files (#2288)
* feat: add a backend parser for WebVTT files
  Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>
* docs: update README with VTT support
  Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>
* docs: add description to supported formats
  Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>
* chore: upgrade docling-core to unescape WebVTT in markdown
  Pin the new release of docling-core 2.48.2. Do not escape HTML reserved characters when exporting WebVTT documents to markdown.
  Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>
* test: add missing copyright notice
  Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>
---------
Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>
committed by
GitHub
parent
b5628f1227
commit
46efaaefee
17
tests/data/groundtruth/docling_v2/webvtt_example_02.vtt.md
vendored
Normal file
@@ -0,0 +1,17 @@
+00:00.000 --> 00:02.000
+
+Esme (first, loud): It’s a blue apple tree!
+
+00:02.000 --> 00:04.000
+
+Mary: No way!
+
+00:04.000 --> 00:06.000
+
+Esme: Hee!
+
+*laughter*
+
+00:06.000 --> 00:08.000
+
+Mary (loud): That’s awesome!
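The timing lines in this groundtruth file use the WebVTT `mm:ss.ttt --> mm:ss.ttt` form. As an illustration only, the sketch below converts such a line into start and end offsets in seconds. The `parse_timing` helper and its regular expression are hypothetical and are not taken from the docling backend added in this commit. It assumes an optional hours field and does not handle cue settings that may follow the end timestamp.

```python
import re

# Hypothetical helper, not part of docling: matches "[hh:]mm:ss.ttt --> [hh:]mm:ss.ttt".
# Assumption: no cue settings (e.g. "align:start") follow the end timestamp.
_TIMING = re.compile(
    r"^(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})"
    r"\s+-->\s+"
    r"(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})$"
)


def _to_seconds(hours, minutes, seconds, millis):
    """Convert the captured timestamp fields to seconds (hours may be None)."""
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_timing(line: str) -> tuple[float, float]:
    """Return (start, end) in seconds for a WebVTT cue timing line."""
    match = _TIMING.match(line.strip())
    if match is None:
        raise ValueError(f"not a WebVTT timing line: {line!r}")
    fields = match.groups()
    return _to_seconds(*fields[:4]), _to_seconds(*fields[4:])


if __name__ == "__main__":
    print(parse_timing("00:00.000 --> 00:02.000"))
```

A line such as `Mary: No way!` raises `ValueError`, so a caller could use that to tell timing lines from cue text in the exported markdown.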