mirror of
https://github.com/DS4SD/docling.git
synced 2025-12-09 05:08:14 +00:00
fix: restrict click version and update lock file (#1582)
* fix click dependency and update lock file

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* Update test GT

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

---------

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
@@ -340,6 +340,29 @@
"type": "figure",
"$ref": "#/figures/0"
},
{
"prov": [
{
"bbox": [
134.765,
591.77942,
480.59189,
665.66583
],
"page": 2,
"span": [
0,
574
],
"__ref_s3_data": null
}
],
"text": "Fig. 1. Comparison between HTML and OTSL table structure representation: (A) table-example with complex row and column headers, including a 2D empty span, (B) minimal graphical representation of table structure using rectangular layout, (C) HTML representation, (D) OTSL representation. This example demonstrates many of the key-features of OTSL, namely its reduced vocabulary size (12 versus 5 in this case), its reduced sequence length (55 versus 30) and a enhanced internal structure (variable token sequence length per row in HTML versus a fixed length of rows in OTSL).",
"type": "caption",
"payload": null,
"name": "Caption",
"font": null
},
{
"prov": [
{
@@ -644,6 +667,29 @@
"type": "figure",
"$ref": "#/figures/1"
},
{
"prov": [
{
"bbox": [
145.60701,
562.78821,
469.75223000000005,
570.92072
],
"page": 5,
"span": [
0,
73
],
"__ref_s3_data": null
}
],
"text": "Fig. 2. Frequency of tokens in HTML and OTSL as they appear in PubTabNet.",
"type": "caption",
"payload": null,
"name": "Caption",
"font": null
},
{
"prov": [
{
@@ -1017,6 +1063,29 @@
"type": "figure",
"$ref": "#/figures/2"
},
{
"prov": [
{
"bbox": [
134.765,
636.15033,
480.5874,
666.2008100000002
],
"page": 7,
"span": [
0,
207
],
"__ref_s3_data": null
}
],
"text": "Fig. 3. OTSL description of table structure: A - table example; B - graphical representation of table structure; C - mapping structure on a grid; D - OTSL structure encoding; E - explanation on cell encoding",
"type": "caption",
"payload": null,
"name": "Caption",
"font": null
},
{
"prov": [
{
@@ -1390,6 +1459,29 @@
"type": "figure",
"$ref": "#/figures/3"
},
{
"prov": [
{
"bbox": [
134.76501,
288.26035,
480.59082,
307.35187
],
"page": 8,
"span": [
0,
104
],
"__ref_s3_data": null
}
],
"text": "Fig. 4. Architecture sketch of the TableFormer model, which is a representative for the Im2Seq approach.",
"type": "caption",
"payload": null,
"name": "Caption",
"font": null
},
{
"prov": [
{
@@ -1658,6 +1750,29 @@
"type": "figure",
"$ref": "#/figures/4"
},
{
"prov": [
{
"bbox": [
134.765,
352.28284,
480.59106,
394.40988
],
"page": 10,
"span": [
0,
270
],
"__ref_s3_data": null
}
],
"text": "Fig. 5. The OTSL model produces more accurate bounding boxes with less overlap (E) than the HTML model (D), when predicting the structure of a sparse table (A), at twice the inference speed because of shorter sequence length (B),(C). \"PMC2807444_006_00.png\" PubTabNet. \u03bc",
"type": "caption",
"payload": null,
"name": "Caption",
"font": null
},
{
"prov": [
{
@@ -1709,6 +1824,29 @@
"type": "figure",
"$ref": "#/figures/5"
},
{
"prov": [
{
"bbox": [
134.765,
614.23236,
480.58838000000003,
666.2008100000002
],
"page": 11,
"span": [
0,
390
],
"__ref_s3_data": null
}
],
"text": "Fig. 6. Visualization of predicted structure and detected bounding boxes on a complex table with many rows. The OTSL model (B) captured repeating pattern of horizontally merged cells from the GT (A), unlike the HTML model (C). The HTML model also didn't complete the HTML sequence correctly and displayed a lot more of drift and overlap of bounding boxes. \"PMC5406406_003_01.png\" PubTabNet.",
"type": "caption",
"payload": null,
"name": "Caption",
"font": null
},
{
"prov": [
{