mirror of https://github.com/DS4SD/docling.git (synced 2025-07-26 03:55:00 +00:00)
* Testing fix for docling-core dt
  Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
* fix: Fix code_formula test unit, update test-cases
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* fix: Fix code-formula model for new docling-core
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* fix: Update fixes
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Update test cases for office formats
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Update deps and lockfile
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Clean up imports
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
---------
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Maksym Lysak <mly@zurich.ibm.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
1 line
6.7 KiB
JSON
{"_name": "", "type": "pdf-document", "description": {"title": null, "abstract": null, "authors": null, "affiliations": null, "subjects": null, "keywords": null, "publication_date": null, "languages": null, "license": null, "publishers": null, "url_refs": null, "references": null, "publication": null, "reference_count": null, "citation_count": null, "citation_date": null, "advanced": null, "analytics": null, "logs": [], "collection": null, "acquisition": null}, "file-info": {"filename": "picture_classification.pdf", "filename-prov": null, "document-hash": "959854dff729acaa22404d629a45cefcad8d942e595961185fc03a80d9fcc3a1", "#-pages": 2, "collection-name": null, "description": null, "page-hashes": [{"hash": "d9e3fc1226356b30c66012f05ad14089b00c59ea129195cd6ff8a0c68bda6f39", "model": "default", "page": 1}, {"hash": "9386884e13a97ce9662210a7e4258bbbb4f2e0e00663636160918e55b2806575", "model": "default", "page": 2}]}, "main-text": [{"prov": [{"bbox": [133.76800537109375, 654.4518432617188, 252.35513305664062, 667.1912231445312], "page": 1, "span": [0, 15], "__ref_s3_data": null}], "text": "Figures Example", "type": "subtitle-level-1", "payload": null, "name": "Section-header", "font": null}, {"prov": [{"bbox": [133.76800537109375, 501.97412109375, 477.4827575683594, 642.3280639648438], "page": 1, "span": [0, 887], "__ref_s3_data": null}], "text": "Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.", "type": "paragraph", "payload": null, "name": "Text", "font": null}, {"prov": [{"bbox": [226.89100646972656, 254.0182647705078, 384.35479736328125, 262.86505126953125], "page": 1, "span": [0, 35], "__ref_s3_data": null}], "text": "Figure 1: This is an example image.", "type": "caption", "payload": null, "name": "Caption", "font": null}, {"name": "Picture", "type": "figure", "$ref": "#/figures/0"}, {"prov": [{"bbox": [133.76800537109375, 122.51225280761719, 477.4817199707031, 238.95504760742188], "page": 1, "span": [0, 747], "__ref_s3_data": null}], "text": "Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua.", "type": "paragraph", "payload": null, "name": "Text", "font": null}, {"prov": [{"bbox": [133.76800537109375, 523.7951049804688, 477.4817199707031, 664.1490478515625], "page": 2, "span": [0, 887], "__ref_s3_data": null}], "text": "Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.", "type": "paragraph", "payload": null, "name": "Text", "font": null}, {"prov": [{"bbox": [226.89100646972656, 259.9422607421875, 384.35479736328125, 268.7890319824219], "page": 2, "span": [0, 35], "__ref_s3_data": null}], "text": "Figure 2: This is an example image.", "type": "caption", "payload": null, "name": "Caption", "font": null}, {"name": "Picture", "type": "figure", "$ref": "#/figures/1"}, {"prov": [{"bbox": [133.76800537109375, 117.32023620605469, 477.4817199707031, 245.71804809570312], "page": 2, "span": [0, 804], "__ref_s3_data": null}], "text": "Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum.", "type": "paragraph", "payload": null, "name": "Text", "font": null}], "figures": [{"prov": [{"bbox": [134.9200439453125, 281.78173828125, 475.6635437011719, 487.109375], "page": 1, "span": [0, 35], "__ref_s3_data": null}], "text": "Figure 1: This is an example image.", "type": "figure", "payload": null, "bounding-box": null}, {"prov": [{"bbox": [218.8155517578125, 283.10589599609375, 391.96246337890625, 513.984619140625], "page": 2, "span": [0, 35], "__ref_s3_data": null}], "text": "Figure 2: This is an example image.", "type": "figure", "payload": null, "bounding-box": null}], "tables": [], "bitmaps": null, "equations": [], "footnotes": [], "page-dimensions": [{"height": 792.0, "page": 1, "width": 612.0}, {"height": 792.0, "page": 2, "width": 612.0}], "page-footers": [], "page-headers": [], "_s3_data": null, "identifiers": null}