mirror of https://github.com/DS4SD/docling.git (synced 2025-12-09 13:18:24 +00:00)
feat: add a backend parser for WebVTT files (#2288)
* feat: add a backend parser for WebVTT files

Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>

* docs: update README with VTT support

Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>

* docs: add description to supported formats

Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>

* chore: upgrade docling-core to unescape WebVTT in markdown

Pin the new release of docling-core 2.48.2. Do not escape HTML reserved characters when exporting WebVTT documents to markdown.

Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>

* test: add missing copyright notice

Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>

---------

Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>
committed by GitHub
parent b5628f1227
commit 46efaaefee
51
tests/data/groundtruth/docling_v2/webvtt_example_01.vtt.md
vendored
Normal file
@@ -0,0 +1,51 @@
00:11.000 --> 00:13.000

Roger Bingham: We are in New York City

00:13.000 --> 00:16.000

Roger Bingham: We’re actually at the Lucern Hotel, just down the street

00:16.000 --> 00:18.000

Roger Bingham: from the American Museum of Natural History

00:18.000 --> 00:20.000

Roger Bingham: And with me is Neil deGrasse Tyson

00:20.000 --> 00:22.000

Roger Bingham: Astrophysicist, Director of the Hayden Planetarium

00:22.000 --> 00:24.000

Roger Bingham: at the AMNH.

00:24.000 --> 00:26.000

Roger Bingham: Thank you for walking down here.

00:27.000 --> 00:30.000

Roger Bingham: And I want to do a follow-up on the last conversation we did.

00:30.000 --> 00:31.500

Roger Bingham: When we e-mailed—

00:30.500 --> 00:32.500

Neil deGrasse Tyson: Didn’t we talk about enough in that conversation?

00:32.000 --> 00:35.500

Roger Bingham: No! No no no no; 'cos 'cos obviously 'cos

00:32.500 --> 00:33.500

Neil deGrasse Tyson: *Laughs*

00:35.500 --> 00:38.000

Roger Bingham: You know I’m so excited my glasses are falling off here.
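
For readers who want to see how a WebVTT file could map onto the layout of this ground-truth file (one timing line, a blank line, then `Speaker: text`), the following is a minimal standard-library sketch. It is not the backend added in this commit. The function name `vtt_to_markdown` and the regular expressions are hypothetical, and the sketch assumes the source cues mark speakers with `<v Name>` voice spans and emphasis with `<i>` tags, since the input `.vtt` file is not shown here.

```python
import re

# Hypothetical helpers for illustration only; this is not docling's WebVTT backend.
# A cue timing line: optional hours, then mm:ss.ttt, an arrow, and a second timestamp.
TIMESTAMP = r"(?:\d{2,}:)?\d{2}:\d{2}\.\d{3}"
TIMING = re.compile(rf"^({TIMESTAMP})\s+-->\s+({TIMESTAMP})")
# A voice span start tag such as <v Roger Bingham> or <v.loud Roger Bingham>.
VOICE = re.compile(r"<v(?:\.[^\s>]+)?\s+([^>]+)>")
ANY_TAG = re.compile(r"</?[^>]+>")


def vtt_to_markdown(vtt_text: str) -> str:
    """Render cues as 'start --> end' and 'Speaker: text', separated by blank lines."""
    normalized = vtt_text.replace("\r\n", "\n").strip()
    blocks = re.split(r"\n\s*\n", normalized)
    out = []
    for block in blocks:
        lines = block.split("\n")
        # Locate the timing line; blocks without one (header, NOTE, STYLE) are skipped.
        idx = next((i for i, line in enumerate(lines) if TIMING.match(line)), None)
        if idx is None:
            continue
        timing = TIMING.match(lines[idx])
        payload = " ".join(lines[idx + 1:])
        speaker = VOICE.search(payload)
        # Map <i>...</i> to Markdown emphasis, then drop every remaining tag.
        text = re.sub(r"</?i>", "*", payload)
        text = ANY_TAG.sub("", text).strip()
        if speaker:
            text = f"{speaker.group(1).strip()}: {text}"
        # Cue settings after the second timestamp (e.g. align:left) are not emitted.
        out.append(f"{timing.group(1)} --> {timing.group(2)}")
        out.append(text)
    return "\n\n".join(out)


sample = (
    "WEBVTT\n\n"
    "00:11.000 --> 00:13.000\n"
    "<v Roger Bingham>We are in New York City\n\n"
    "00:32.500 --> 00:33.500 align:left size:50%\n"
    "<v Neil deGrasse Tyson><i>Laughs</i>\n"
)
print(vtt_to_markdown(sample))
```

The sketch leaves characters such as `'` and `>` untouched in the output, in the spirit of the commit note about not escaping HTML reserved characters for WebVTT exports. It makes no attempt to handle overlapping cues, nested voice spans, or cue identifiers beyond skipping them.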