mirror of https://github.com/DS4SD/docling.git, synced 2025-07-27 12:34:22 +00:00
Updated test ground-truth (again), bugfix for empty layout
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
parent 731e48ea43
commit fbb28b851d
@@ -236,6 +236,9 @@ class LayoutPostprocessor:
         # Initial cell assignment
         clusters = self._assign_cells_to_clusters(clusters)
 
+        # Remove clusters with no cells
+        clusters = [cluster for cluster in clusters if cluster.cells]
+
         # Handle orphaned cells
         unassigned = self._find_unassigned_cells(clusters)
         if unassigned:
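The filter added in this hunk can be tried in isolation. The sketch below is only an illustration, not the docling implementation: `Cluster` is a hypothetical stand-in with just a `label` and a `cells` list, and `drop_empty_clusters` is a made-up helper name wrapping the same list comprehension that the commit adds.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cluster:
    """Hypothetical stand-in for a layout cluster; only `cells` matters here."""
    label: str
    cells: List[str] = field(default_factory=list)


def drop_empty_clusters(clusters: List[Cluster]) -> List[Cluster]:
    """Keep only clusters that were assigned at least one cell."""
    # An empty list is falsy, so clusters without cells are filtered out.
    return [cluster for cluster in clusters if cluster.cells]


clusters = [
    Cluster("text", ["cell-1", "cell-2"]),
    Cluster("table"),  # no cells were assigned to this cluster
]
kept = drop_empty_clusters(clusters)
print([c.label for c in kept])
```

Because the comprehension also returns an empty list for empty input, later steps that iterate over the clusters need no special case for a page with no layout content, which is presumably the "empty layout" situation the commit message refers to.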
@@ -10,11 +10,11 @@
|
||||
<figure>
|
||||
<location><page_1><loc_52><loc_62><loc_88><loc_71></location>
|
||||
</figure>
|
||||
<subtitle-level-1><location><page_1><loc_52><loc_58><loc_79><loc_60></location>b. Red-annotation of bounding boxes, Blue-predictions by TableFormer</subtitle-level-1>
|
||||
<paragraph><location><page_1><loc_52><loc_58><loc_79><loc_60></location>- b. Red-annotation of bounding boxes, Blue-predictions by TableFormer</paragraph>
|
||||
<figure>
|
||||
<location><page_1><loc_51><loc_48><loc_88><loc_57></location>
|
||||
</figure>
|
||||
<subtitle-level-1><location><page_1><loc_52><loc_46><loc_80><loc_47></location>c. Structure predicted by TableFormer:</subtitle-level-1>
|
||||
<paragraph><location><page_1><loc_52><loc_46><loc_80><loc_47></location>- c. Structure predicted by TableFormer:</paragraph>
|
||||
<caption><location><page_1><loc_50><loc_29><loc_89><loc_35></location>Figure 1: Picture of a table with subtle, complex features such as (1) multi-column headers, (2) cell with multi-row text and (3) cells with no content. Image from PubTabNet evaluation set, filename: 'PMC2944238 004 02'.</caption>
|
||||
<figure>
|
||||
<location><page_1><loc_52><loc_37><loc_88><loc_45></location>
|
||||
@@ -152,7 +152,7 @@
|
||||
<row_6><col_0><row_header>TableFormer</col_0><col_1><body>95.4</col_1><col_2><body>90.1</col_2><col_3><body>93.6</col_3></row_6>
|
||||
</table>
|
||||
<paragraph><location><page_8><loc_9><loc_89><loc_10><loc_90></location>- a.</paragraph>
|
||||
<paragraph><location><page_8><loc_11><loc_89><loc_82><loc_90></location>Red - PDF cells, Green - predicted bounding boxes, Blue - post-processed predictions matched to PDF cells</paragraph>
|
||||
<paragraph><location><page_8><loc_11><loc_89><loc_82><loc_90></location>- Red - PDF cells, Green - predicted bounding boxes, Blue - post-processed predictions matched to PDF cells</paragraph>
|
||||
<paragraph><location><page_8><loc_9><loc_87><loc_46><loc_88></location>Japanese language (previously unseen by TableFormer):</paragraph>
|
||||
<paragraph><location><page_8><loc_50><loc_87><loc_70><loc_88></location>Example table from FinTabNet:</paragraph>
|
||||
<figure>
|
||||
@@ -283,7 +283,7 @@
|
||||
<paragraph><location><page_12><loc_8><loc_13><loc_47><loc_16></location>where c is one of { left, centroid, right } and x$_{c}$ is the xcoordinate for the corresponding point.</paragraph>
|
||||
<paragraph><location><page_12><loc_50><loc_13><loc_89><loc_16></location>- 9d. Intersect the orphan's bounding box with the column bands, and map the cell to the closest grid column.</paragraph>
|
||||
<paragraph><location><page_12><loc_8><loc_10><loc_47><loc_13></location>- 5. Use the alignment computed in step 4, to compute the median x -coordinate for all table columns and the me-</paragraph>
|
||||
<paragraph><location><page_12><loc_50><loc_10><loc_89><loc_13></location>9e. If the table cell under the identified row and column is not empty, extend its content with the content of the or-</paragraph>
|
||||
<paragraph><location><page_12><loc_50><loc_10><loc_89><loc_13></location>- 9e. If the table cell under the identified row and column is not empty, extend its content with the content of the or-</paragraph>
|
||||
<paragraph><location><page_12><loc_50><loc_21><loc_89><loc_23></location>- 9b. Intersect the orphan's bounding box with the row bands, and map the cell to the closest grid row.</paragraph>
|
||||
<paragraph><location><page_12><loc_50><loc_16><loc_89><loc_20></location>- 9c. Compute the left and right boundary of the vertical band for each grid column (min/max x coordinates per column).</paragraph>
|
||||
<paragraph><location><page_12><loc_50><loc_42><loc_89><loc_51></location>- 8. In some rare occasions, we have noticed that TableFormer can confuse a single column as two. When the postprocessing steps are applied, this results with two predicted columns pointing to the same PDF column. In such case we must de-duplicate the columns according to highest total column intersection score.</paragraph>
|
||||
@@ -293,79 +293,22 @@
|
||||
<paragraph><location><page_13><loc_8><loc_89><loc_15><loc_91></location>phan cell.</paragraph>
|
||||
<paragraph><location><page_13><loc_8><loc_86><loc_47><loc_89></location>9f. Otherwise create a new structural cell and match it wit the orphan cell.</paragraph>
|
||||
<paragraph><location><page_13><loc_8><loc_83><loc_47><loc_86></location>Aditional images with examples of TableFormer predictions and post-processing can be found below.</paragraph>
|
||||
<subtitle-level-1><location><page_13><loc_14><loc_81><loc_18><loc_81></location></subtitle-level-1>
|
||||
<table>
|
||||
<location><page_13><loc_14><loc_73><loc_39><loc_80></location>
|
||||
</table>
|
||||
<subtitle-level-1><location><page_13><loc_14><loc_71><loc_30><loc_72></location></subtitle-level-1>
|
||||
<table>
|
||||
<location><page_13><loc_14><loc_63><loc_39><loc_70></location>
|
||||
</table>
|
||||
<subtitle-level-1><location><page_13><loc_14><loc_61><loc_27><loc_62></location></subtitle-level-1>
|
||||
<table>
|
||||
<location><page_13><loc_14><loc_54><loc_39><loc_61></location>
|
||||
</table>
|
||||
<subtitle-level-1><location><page_13><loc_14><loc_50><loc_27><loc_51></location></subtitle-level-1>
|
||||
<caption><location><page_13><loc_10><loc_35><loc_45><loc_37></location>Figure 8: Example of a table with multi-line header.</caption>
|
||||
<table>
|
||||
<location><page_13><loc_14><loc_38><loc_41><loc_50></location>
|
||||
<caption>Figure 8: Example of a table with multi-line header.</caption>
|
||||
</table>
|
||||
<subtitle-level-1><location><page_13><loc_51><loc_87><loc_54><loc_88></location></subtitle-level-1>
|
||||
<table>
|
||||
<location><page_13><loc_51><loc_83><loc_91><loc_87></location>
|
||||
</table>
|
||||
<subtitle-level-1><location><page_13><loc_51><loc_81><loc_62><loc_82></location></subtitle-level-1>
|
||||
<table>
|
||||
<location><page_13><loc_51><loc_77><loc_91><loc_80></location>
|
||||
</table>
|
||||
<subtitle-level-1><location><page_13><loc_51><loc_75><loc_60><loc_76></location></subtitle-level-1>
|
||||
<table>
|
||||
<location><page_13><loc_51><loc_71><loc_91><loc_75></location>
|
||||
</table>
|
||||
<subtitle-level-1><location><page_13><loc_51><loc_68><loc_60><loc_69></location></subtitle-level-1>
|
||||
<paragraph><location><page_13><loc_10><loc_35><loc_45><loc_37></location>Figure 8: Example of a table with multi-line header.</paragraph>
|
||||
<caption><location><page_13><loc_50><loc_59><loc_89><loc_61></location>Figure 9: Example of a table with big empty distance between cells.</caption>
|
||||
<figure>
|
||||
<location><page_13><loc_51><loc_63><loc_70><loc_68></location>
|
||||
<caption>Figure 9: Example of a table with big empty distance between cells.</caption>
|
||||
</figure>
|
||||
<subtitle-level-1><location><page_13><loc_55><loc_51><loc_58><loc_52></location></subtitle-level-1>
|
||||
<table>
|
||||
<location><page_13><loc_55><loc_45><loc_80><loc_51></location>
|
||||
</table>
|
||||
<subtitle-level-1><location><page_13><loc_55><loc_43><loc_69><loc_44></location></subtitle-level-1>
|
||||
<table>
|
||||
<location><page_13><loc_55><loc_37><loc_80><loc_43></location>
|
||||
</table>
|
||||
<subtitle-level-1><location><page_13><loc_55><loc_35><loc_67><loc_36></location></subtitle-level-1>
|
||||
<table>
|
||||
<location><page_13><loc_55><loc_28><loc_80><loc_34></location>
|
||||
</table>
|
||||
<subtitle-level-1><location><page_13><loc_55><loc_25><loc_66><loc_26></location></subtitle-level-1>
|
||||
<caption><location><page_13><loc_51><loc_13><loc_89><loc_14></location>Figure 10: Example of a complex table with empty cells.</caption>
|
||||
<figure>
|
||||
<location><page_13><loc_55><loc_16><loc_85><loc_25></location>
|
||||
<caption>Figure 10: Example of a complex table with empty cells.</caption>
|
||||
</figure>
|
||||
<subtitle-level-1><location><page_14><loc_8><loc_86><loc_12><loc_87></location></subtitle-level-1>
|
||||
<caption><location><page_14><loc_8><loc_52><loc_47><loc_55></location>Figure 11: Simple table with different style and empty cells.</caption>
|
||||
<figure>
|
||||
<location><page_14><loc_8><loc_56><loc_46><loc_87></location>
|
||||
<caption>Figure 11: Simple table with different style and empty cells.</caption>
|
||||
</figure>
|
||||
<subtitle-level-1><location><page_14><loc_8><loc_43><loc_11><loc_44></location></subtitle-level-1>
|
||||
<table>
|
||||
<location><page_14><loc_8><loc_38><loc_51><loc_43></location>
|
||||
</table>
|
||||
<subtitle-level-1><location><page_14><loc_8><loc_37><loc_20><loc_37></location></subtitle-level-1>
|
||||
<table>
|
||||
<location><page_14><loc_8><loc_32><loc_51><loc_36></location>
|
||||
</table>
|
||||
<subtitle-level-1><location><page_14><loc_8><loc_30><loc_18><loc_31></location></subtitle-level-1>
|
||||
<table>
|
||||
<location><page_14><loc_8><loc_25><loc_51><loc_30></location>
|
||||
</table>
|
||||
<subtitle-level-1><location><page_14><loc_8><loc_23><loc_18><loc_24></location></subtitle-level-1>
|
||||
<caption><location><page_14><loc_9><loc_14><loc_46><loc_15></location>Figure 12: Simple table predictions and post processing.</caption>
|
||||
<figure>
|
||||
<location><page_14><loc_8><loc_17><loc_29><loc_23></location>
|
||||
@@ -376,32 +319,13 @@
|
||||
<location><page_14><loc_52><loc_55><loc_87><loc_89></location>
|
||||
<caption>Figure 13: Table predictions example on colorful table.</caption>
|
||||
</figure>
|
||||
<subtitle-level-1><location><page_14><loc_52><loc_46><loc_55><loc_46></location></subtitle-level-1>
|
||||
<table>
|
||||
<location><page_14><loc_52><loc_40><loc_85><loc_46></location>
|
||||
</table>
|
||||
<subtitle-level-1><location><page_14><loc_52><loc_38><loc_63><loc_39></location></subtitle-level-1>
|
||||
<table>
|
||||
<location><page_14><loc_52><loc_32><loc_85><loc_38></location>
|
||||
</table>
|
||||
<subtitle-level-1><location><page_14><loc_52><loc_31><loc_61><loc_32></location></subtitle-level-1>
|
||||
<table>
|
||||
<location><page_14><loc_52><loc_25><loc_85><loc_31></location>
|
||||
</table>
|
||||
<subtitle-level-1><location><page_14><loc_52><loc_23><loc_61><loc_23></location></subtitle-level-1>
|
||||
<caption><location><page_14><loc_56><loc_13><loc_83><loc_14></location>Figure 14: Example with multi-line text.</caption>
|
||||
<table>
|
||||
<location><page_14><loc_52><loc_16><loc_87><loc_23></location>
|
||||
<caption>Figure 14: Example with multi-line text.</caption>
|
||||
</table>
|
||||
<caption><location><page_15><loc_9><loc_67><loc_20><loc_68></location></caption>
|
||||
<paragraph><location><page_14><loc_56><loc_13><loc_83><loc_14></location>Figure 14: Example with multi-line text.</paragraph>
|
||||
<figure>
|
||||
<location><page_15><loc_9><loc_69><loc_46><loc_83></location>
|
||||
</figure>
|
||||
<figure>
|
||||
<location><page_15><loc_9><loc_53><loc_46><loc_67></location>
|
||||
</figure>
|
||||
<subtitle-level-1><location><page_15><loc_9><loc_51><loc_18><loc_52></location></subtitle-level-1>
|
||||
<figure>
|
||||
<location><page_15><loc_9><loc_37><loc_46><loc_51></location>
|
||||
</figure>
|
||||
@@ -410,19 +334,9 @@
|
||||
<location><page_15><loc_8><loc_20><loc_52><loc_36></location>
|
||||
<caption>Figure 15: Example with triangular table.</caption>
|
||||
</figure>
|
||||
<caption><location><page_15><loc_53><loc_85><loc_57><loc_86></location></caption>
|
||||
<table>
|
||||
<location><page_15><loc_53><loc_72><loc_86><loc_85></location>
|
||||
</table>
|
||||
<subtitle-level-1><location><page_15><loc_53><loc_70><loc_70><loc_71></location></subtitle-level-1>
|
||||
<table>
|
||||
<location><page_15><loc_53><loc_57><loc_86><loc_69></location>
|
||||
</table>
|
||||
<subtitle-level-1><location><page_15><loc_53><loc_55><loc_67><loc_56></location></subtitle-level-1>
|
||||
<figure>
|
||||
<location><page_15><loc_53><loc_41><loc_86><loc_54></location>
|
||||
</figure>
|
||||
<subtitle-level-1><location><page_15><loc_58><loc_39><loc_73><loc_39></location></subtitle-level-1>
|
||||
<caption><location><page_15><loc_50><loc_15><loc_89><loc_18></location>Figure 16: Example of how post-processing helps to restore mis-aligned bounding boxes prediction artifact.</caption>
|
||||
<figure>
|
||||
<location><page_15><loc_58><loc_20><loc_81><loc_38></location>
|
||||
|
File diff suppressed because one or more lines are too long
@@ -17,12 +17,12 @@ The occurrence of tables in documents is ubiquitous. They often summarise quanti
|
||||
|
||||
<!-- image -->
|
||||
|
||||
## b. Red-annotation of bounding boxes, Blue-predictions by TableFormer
|
||||
- b. Red-annotation of bounding boxes, Blue-predictions by TableFormer
|
||||
|
||||
|
||||
<!-- image -->
|
||||
|
||||
## c. Structure predicted by TableFormer:
|
||||
- c. Structure predicted by TableFormer:
|
||||
|
||||
Figure 1: Picture of a table with subtle, complex features such as (1) multi-column headers, (2) cell with multi-row text and (3) cells with no content. Image from PubTabNet evaluation set, filename: 'PMC2944238 004 02'.
|
||||
<!-- image -->
|
||||
@@ -217,7 +217,7 @@ Table 4: Results of structure with content retrieved using cell detection on Pub
|
||||
|
||||
- a.
|
||||
|
||||
Red - PDF cells, Green - predicted bounding boxes, Blue - post-processed predictions matched to PDF cells
|
||||
- Red - PDF cells, Green - predicted bounding boxes, Blue - post-processed predictions matched to PDF cells
|
||||
|
||||
Japanese language (previously unseen by TableFormer):
|
||||
|
||||
@@ -420,7 +420,7 @@ where c is one of { left, centroid, right } and x$_{c}$ is the xcoordinate for t
|
||||
|
||||
- 5. Use the alignment computed in step 4, to compute the median x -coordinate for all table columns and the me-
|
||||
|
||||
9e. If the table cell under the identified row and column is not empty, extend its content with the content of the or-
|
||||
- 9e. If the table cell under the identified row and column is not empty, extend its content with the content of the or-
|
||||
|
||||
- 9b. Intersect the orphan's bounding box with the row bands, and map the cell to the closest grid row.
|
||||
|
||||
@@ -440,7 +440,7 @@ phan cell.
|
||||
|
||||
Aditional images with examples of TableFormer predictions and post-processing can be found below.
|
||||
|
||||
##
|
||||
Figure 8: Example of a table with multi-line header.
|
||||
|
||||
Figure 9: Example of a table with big empty distance between cells.
|
||||
<!-- image -->
|
||||
@@ -457,6 +457,8 @@ Figure 12: Simple table predictions and post processing.
|
||||
Figure 13: Table predictions example on colorful table.
|
||||
<!-- image -->
|
||||
|
||||
Figure 14: Example with multi-line text.
|
||||
|
||||
|
||||
<!-- image -->
|
||||
|
||||
|
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
@@ -2,7 +2,8 @@
|
||||
<subtitle-level-1><location><page_1><loc_22><loc_82><loc_79><loc_85></location>Optimized Table Tokenization for Table Structure Recognition</subtitle-level-1>
|
||||
<paragraph><location><page_1><loc_23><loc_75><loc_78><loc_79></location>Maksym Lysak [0000 - 0002 - 3723 - $^{6960]}$, Ahmed Nassar[0000 - 0002 - 9468 - $^{0822]}$, Nikolaos Livathinos [0000 - 0001 - 8513 - $^{3491]}$, Christoph Auer[0000 - 0001 - 5761 - $^{0422]}$, [0000 - 0002 - 8088 - 0823]</paragraph>
|
||||
<paragraph><location><page_1><loc_38><loc_74><loc_49><loc_75></location>and Peter Staar</paragraph>
|
||||
<paragraph><location><page_1><loc_36><loc_70><loc_64><loc_73></location>{mly,ahn,nli,cau,taa}@zurich.ibm.com IBM Research</paragraph>
|
||||
<paragraph><location><page_1><loc_46><loc_72><loc_55><loc_73></location>IBM Research</paragraph>
|
||||
<paragraph><location><page_1><loc_36><loc_70><loc_64><loc_71></location>{mly,ahn,nli,cau,taa}@zurich.ibm.com</paragraph>
|
||||
<paragraph><location><page_1><loc_27><loc_41><loc_74><loc_66></location>Abstract. Extracting tables from documents is a crucial task in any document conversion pipeline. Recently, transformer-based models have demonstrated that table-structure can be recognized with impressive accuracy using Image-to-Markup-Sequence (Im2Seq) approaches. Taking only the image of a table, such models predict a sequence of tokens (e.g. in HTML, LaTeX) which represent the structure of the table. Since the token representation of the table structure has a significant impact on the accuracy and run-time performance of any Im2Seq model, we investigate in this paper how table-structure representation can be optimised. We propose a new, optimised table-structure language (OTSL) with a minimized vocabulary and specific rules. The benefits of OTSL are that it reduces the number of tokens to 5 (HTML needs 28+) and shortens the sequence length to half of HTML on average. Consequently, model accuracy improves significantly, inference time is halved compared to HTML-based models, and the predicted table structures are always syntactically correct. This in turn eliminates most post-processing needs. Popular table structure data-sets will be published in OTSL format to the community.</paragraph>
|
||||
<paragraph><location><page_1><loc_27><loc_37><loc_74><loc_40></location>Keywords: Table Structure Recognition · Data Representation · Transformers · Optimization.</paragraph>
|
||||
<subtitle-level-1><location><page_1><loc_22><loc_33><loc_37><loc_34></location>1 Introduction</subtitle-level-1>
|
||||
@@ -56,7 +57,7 @@
|
||||
<paragraph><location><page_7><loc_22><loc_58><loc_59><loc_59></location>The OTSL representation follows these syntax rules:</paragraph>
|
||||
<paragraph><location><page_7><loc_23><loc_54><loc_79><loc_56></location>- 1. Left-looking cell rule : The left neighbour of an "L" cell must be either another "L" cell or a "C" cell.</paragraph>
|
||||
<paragraph><location><page_7><loc_23><loc_51><loc_79><loc_53></location>- 2. Up-looking cell rule : The upper neighbour of a "U" cell must be either another "U" cell or a "C" cell.</paragraph>
|
||||
<paragraph><location><page_7><loc_23><loc_49><loc_37><loc_50></location>- 3. Cross cell rule :</paragraph>
|
||||
<subtitle-level-1><location><page_7><loc_23><loc_49><loc_37><loc_50></location>3. Cross cell rule :</subtitle-level-1>
|
||||
<paragraph><location><page_7><loc_25><loc_44><loc_79><loc_49></location>- The left neighbour of an "X" cell must be either another "X" cell or a "U" cell, and the upper neighbour of an "X" cell must be either another "X" cell or an "L" cell.</paragraph>
|
||||
<paragraph><location><page_7><loc_23><loc_43><loc_78><loc_44></location>- 4. First row rule : Only "L" cells and "C" cells are allowed in the first row.</paragraph>
|
||||
<paragraph><location><page_7><loc_23><loc_40><loc_79><loc_43></location>- 5. First column rule : Only "U" cells and "C" cells are allowed in the first column.</paragraph>
|
||||
|
File diff suppressed because one or more lines are too long
@@ -4,7 +4,9 @@ Maksym Lysak [0000 - 0002 - 3723 - $^{6960]}$, Ahmed Nassar[0000 - 0002 - 9468 -
|
||||
|
||||
and Peter Staar
|
||||
|
||||
{mly,ahn,nli,cau,taa}@zurich.ibm.com IBM Research
|
||||
IBM Research
|
||||
|
||||
{mly,ahn,nli,cau,taa}@zurich.ibm.com
|
||||
|
||||
Abstract. Extracting tables from documents is a crucial task in any document conversion pipeline. Recently, transformer-based models have demonstrated that table-structure can be recognized with impressive accuracy using Image-to-Markup-Sequence (Im2Seq) approaches. Taking only the image of a table, such models predict a sequence of tokens (e.g. in HTML, LaTeX) which represent the structure of the table. Since the token representation of the table structure has a significant impact on the accuracy and run-time performance of any Im2Seq model, we investigate in this paper how table-structure representation can be optimised. We propose a new, optimised table-structure language (OTSL) with a minimized vocabulary and specific rules. The benefits of OTSL are that it reduces the number of tokens to 5 (HTML needs 28+) and shortens the sequence length to half of HTML on average. Consequently, model accuracy improves significantly, inference time is halved compared to HTML-based models, and the predicted table structures are always syntactically correct. This in turn eliminates most post-processing needs. Popular table structure data-sets will be published in OTSL format to the community.
|
||||
|
||||
@@ -91,7 +93,7 @@ The OTSL representation follows these syntax rules:
|
||||
|
||||
- 2. Up-looking cell rule : The upper neighbour of a "U" cell must be either another "U" cell or a "C" cell.
|
||||
|
||||
- 3. Cross cell rule :
|
||||
## 3. Cross cell rule :
|
||||
|
||||
- The left neighbour of an "X" cell must be either another "X" cell or a "U" cell, and the upper neighbour of an "X" cell must be either another "X" cell or an "L" cell.
|
||||
|
||||
|
File diff suppressed because one or more lines are too long
@@ -256,7 +256,7 @@
|
||||
<paragraph><location><page_13><loc_25><loc_58><loc_66><loc_59></location>- -Employees can see only their own unmasked TAX_ID.</paragraph>
|
||||
<paragraph><location><page_13><loc_25><loc_55><loc_89><loc_57></location>- -Managers see a masked version of TAX_ID with the first five characters replaced with the X character (for example, XXX-XX-1234).</paragraph>
|
||||
<paragraph><location><page_13><loc_25><loc_52><loc_87><loc_54></location>- -Any other person sees the entire TAX_ID as masked, for example, XXX-XX-XXXX.</paragraph>
|
||||
<paragraph><location><page_13><loc_25><loc_50><loc_87><loc_51></location>To implement this column mask, run the SQL statement that is shown in Example 3-9.</paragraph>
|
||||
<paragraph><location><page_13><loc_25><loc_50><loc_87><loc_51></location>- To implement this column mask, run the SQL statement that is shown in Example 3-9.</paragraph>
|
||||
<paragraph><location><page_13><loc_22><loc_48><loc_58><loc_49></location>Example 3-9 Creating a mask on the TAX_ID column</paragraph>
|
||||
<paragraph><location><page_13><loc_22><loc_14><loc_86><loc_47></location>CREATE MASK HR_SCHEMA.MASK_TAX_ID_ON_EMPLOYEES ON HR_SCHEMA.EMPLOYEES AS EMPLOYEES FOR COLUMN TAX_ID RETURN CASE WHEN VERIFY_GROUP_FOR_USER ( SESSION_USER , 'HR' ) = 1 THEN EMPLOYEES . TAX_ID WHEN VERIFY_GROUP_FOR_USER ( SESSION_USER , 'MGR' ) = 1 AND SESSION_USER = EMPLOYEES . USER_ID THEN EMPLOYEES . TAX_ID WHEN VERIFY_GROUP_FOR_USER ( SESSION_USER , 'MGR' ) = 1 AND SESSION_USER <> EMPLOYEES . USER_ID THEN ( 'XXX-XX-' CONCAT QSYS2 . SUBSTR ( EMPLOYEES . TAX_ID , 8 , 4 ) ) WHEN VERIFY_GROUP_FOR_USER ( SESSION_USER , 'EMP' ) = 1 THEN EMPLOYEES . TAX_ID ELSE 'XXX-XX-XXXX' END ENABLE ;</paragraph>
|
||||
<paragraph><location><page_14><loc_22><loc_90><loc_74><loc_91></location>- 3. Figure 3-10 shows the masks that are created in the HR_SCHEMA.</paragraph>
|
||||
@@ -268,7 +268,7 @@
|
||||
<subtitle-level-1><location><page_14><loc_11><loc_73><loc_33><loc_74></location>3.6.6 Activating RCAC</subtitle-level-1>
|
||||
<paragraph><location><page_14><loc_22><loc_67><loc_89><loc_71></location>Now that you have created the row permission and the two column masks, RCAC must be activated. The row permission and the two column masks are enabled (last clause in the scripts), but now you must activate RCAC on the table. To do so, complete the following steps:</paragraph>
|
||||
<paragraph><location><page_14><loc_22><loc_65><loc_67><loc_66></location>- 1. Run the SQL statements that are shown in Example 3-10.</paragraph>
|
||||
<paragraph><location><page_14><loc_22><loc_62><loc_61><loc_63></location>Example 3-10 Activating RCAC on the EMPLOYEES table</paragraph>
|
||||
<subtitle-level-1><location><page_14><loc_22><loc_62><loc_61><loc_63></location>Example 3-10 Activating RCAC on the EMPLOYEES table</subtitle-level-1>
|
||||
<paragraph><location><page_14><loc_22><loc_60><loc_62><loc_61></location>- /* Active Row Access Control (permissions) */</paragraph>
|
||||
<paragraph><location><page_14><loc_22><loc_58><loc_58><loc_60></location>- /* Active Column Access Control (masks)</paragraph>
|
||||
<paragraph><location><page_14><loc_60><loc_58><loc_62><loc_60></location>*/</paragraph>
|
||||
|
File diff suppressed because one or more lines are too long
@@ -368,7 +368,7 @@ WHEN VERIFY_GROUP_FOR_USER ( SESSION_USER , 'HR', 'EMP' ) = 1 THEN EMPLOYEES . D
|
||||
|
||||
- -Any other person sees the entire TAX_ID as masked, for example, XXX-XX-XXXX.
|
||||
|
||||
To implement this column mask, run the SQL statement that is shown in Example 3-9.
|
||||
- To implement this column mask, run the SQL statement that is shown in Example 3-9.
|
||||
|
||||
Example 3-9 Creating a mask on the TAX_ID column
|
||||
|
||||
@@ -385,7 +385,7 @@ Now that you have created the row permission and the two column masks, RCAC must
|
||||
|
||||
- 1. Run the SQL statements that are shown in Example 3-10.
|
||||
|
||||
Example 3-10 Activating RCAC on the EMPLOYEES table
|
||||
## Example 3-10 Activating RCAC on the EMPLOYEES table
|
||||
|
||||
- /* Active Row Access Control (permissions) */
|
||||
|
||||
|
File diff suppressed because one or more lines are too long
@@ -10,11 +10,15 @@
|
||||
<figure>
|
||||
<location><page_1><loc_52><loc_62><loc_88><loc_71></location>
|
||||
</figure>
|
||||
<section_header_level_1><location><page_1><loc_52><loc_58><loc_79><loc_60></location>b. Red-annotation of bounding boxes, Blue-predictions by TableFormer</section_header_level_1>
|
||||
<unordered_list>
|
||||
<list_item><location><page_1><loc_52><loc_58><loc_79><loc_60></location>b. Red-annotation of bounding boxes, Blue-predictions by TableFormer</list_item>
|
||||
</unordered_list>
|
||||
<figure>
|
||||
<location><page_1><loc_51><loc_48><loc_88><loc_57></location>
|
||||
</figure>
|
||||
<section_header_level_1><location><page_1><loc_52><loc_46><loc_80><loc_47></location>c. Structure predicted by TableFormer:</section_header_level_1>
|
||||
<unordered_list>
|
||||
<list_item><location><page_1><loc_52><loc_46><loc_80><loc_47></location>c. Structure predicted by TableFormer:</list_item>
|
||||
</unordered_list>
|
||||
<figure>
|
||||
<location><page_1><loc_52><loc_37><loc_88><loc_45></location>
|
||||
<caption>Figure 1: Picture of a table with subtle, complex features such as (1) multi-column headers, (2) cell with multi-row text and (3) cells with no content. Image from PubTabNet evaluation set, filename: 'PMC2944238 004 02'.</caption>
|
||||
@@ -150,8 +154,8 @@
|
||||
</table>
|
||||
<unordered_list>
|
||||
<list_item><location><page_8><loc_9><loc_89><loc_10><loc_90></location>a.</list_item>
|
||||
<list_item><location><page_8><loc_11><loc_89><loc_82><loc_90></location>Red - PDF cells, Green - predicted bounding boxes, Blue - post-processed predictions matched to PDF cells</list_item>
|
||||
</unordered_list>
|
||||
<text><location><page_8><loc_11><loc_89><loc_82><loc_90></location>Red - PDF cells, Green - predicted bounding boxes, Blue - post-processed predictions matched to PDF cells</text>
|
||||
<text><location><page_8><loc_9><loc_87><loc_46><loc_88></location>Japanese language (previously unseen by TableFormer):</text>
|
||||
<text><location><page_8><loc_50><loc_87><loc_70><loc_88></location>Example table from FinTabNet:</text>
|
||||
<figure>
|
||||
@@ -297,9 +301,7 @@
|
||||
<unordered_list>
|
||||
<list_item><location><page_12><loc_50><loc_13><loc_89><loc_16></location>9d. Intersect the orphan's bounding box with the column bands, and map the cell to the closest grid column.</list_item>
|
||||
<list_item><location><page_12><loc_8><loc_10><loc_47><loc_13></location>5. Use the alignment computed in step 4, to compute the median x -coordinate for all table columns and the me-</list_item>
|
||||
</unordered_list>
|
||||
<text><location><page_12><loc_50><loc_10><loc_89><loc_13></location>9e. If the table cell under the identified row and column is not empty, extend its content with the content of the or-</text>
|
||||
<unordered_list>
|
||||
<list_item><location><page_12><loc_50><loc_10><loc_89><loc_13></location>9e. If the table cell under the identified row and column is not empty, extend its content with the content of the or-</list_item>
|
||||
<list_item><location><page_12><loc_50><loc_21><loc_89><loc_23></location>9b. Intersect the orphan's bounding box with the row bands, and map the cell to the closest grid row.</list_item>
|
||||
<list_item><location><page_12><loc_50><loc_16><loc_89><loc_20></location>9c. Compute the left and right boundary of the vertical band for each grid column (min/max x coordinates per column).</list_item>
|
||||
<list_item><location><page_12><loc_50><loc_42><loc_89><loc_51></location>8. In some rare occasions, we have noticed that TableFormer can confuse a single column as two. When the postprocessing steps are applied, this results with two predicted columns pointing to the same PDF column. In such case we must de-duplicate the columns according to highest total column intersection score.</list_item>
|
||||
@@ -312,75 +314,19 @@
|
||||
<text><location><page_13><loc_8><loc_89><loc_15><loc_91></location>phan cell.</text>
|
||||
<text><location><page_13><loc_8><loc_86><loc_47><loc_89></location>9f. Otherwise create a new structural cell and match it wit the orphan cell.</text>
|
||||
<text><location><page_13><loc_8><loc_83><loc_47><loc_86></location>Aditional images with examples of TableFormer predictions and post-processing can be found below.</text>
|
||||
<section_header_level_1><location><page_13><loc_14><loc_81><loc_18><loc_81></location></section_header_level_1>
|
||||
<table>
|
||||
<location><page_13><loc_14><loc_73><loc_39><loc_80></location>
|
||||
</table>
|
||||
<section_header_level_1><location><page_13><loc_14><loc_71><loc_30><loc_72></location></section_header_level_1>
|
||||
<table>
|
||||
<location><page_13><loc_14><loc_63><loc_39><loc_70></location>
|
||||
</table>
|
||||
<section_header_level_1><location><page_13><loc_14><loc_61><loc_27><loc_62></location></section_header_level_1>
|
||||
<table>
|
||||
<location><page_13><loc_14><loc_54><loc_39><loc_61></location>
|
||||
</table>
|
||||
<section_header_level_1><location><page_13><loc_14><loc_50><loc_27><loc_51></location></section_header_level_1>
|
||||
<table>
|
||||
<location><page_13><loc_14><loc_38><loc_41><loc_50></location>
|
||||
<caption>Figure 8: Example of a table with multi-line header.</caption>
|
||||
</table>
|
||||
<section_header_level_1><location><page_13><loc_51><loc_87><loc_54><loc_88></location></section_header_level_1>
|
||||
<table>
|
||||
<location><page_13><loc_51><loc_83><loc_91><loc_87></location>
|
||||
</table>
|
||||
<section_header_level_1><location><page_13><loc_51><loc_81><loc_62><loc_82></location></section_header_level_1>
|
||||
<table>
|
||||
<location><page_13><loc_51><loc_77><loc_91><loc_80></location>
|
||||
</table>
|
||||
<section_header_level_1><location><page_13><loc_51><loc_75><loc_60><loc_76></location></section_header_level_1>
|
||||
<table>
|
||||
<location><page_13><loc_51><loc_71><loc_91><loc_75></location>
|
||||
</table>
|
||||
<section_header_level_1><location><page_13><loc_51><loc_68><loc_60><loc_69></location></section_header_level_1>
|
||||
<paragraph><location><page_13><loc_10><loc_35><loc_45><loc_37></location>Figure 8: Example of a table with multi-line header.</paragraph>
|
||||
<figure>
|
||||
<location><page_13><loc_51><loc_63><loc_70><loc_68></location>
|
||||
<caption>Figure 9: Example of a table with big empty distance between cells.</caption>
|
||||
</figure>
|
||||
<section_header_level_1><location><page_13><loc_55><loc_51><loc_58><loc_52></location></section_header_level_1>
|
||||
<table>
|
||||
<location><page_13><loc_55><loc_45><loc_80><loc_51></location>
|
||||
</table>
|
||||
<section_header_level_1><location><page_13><loc_55><loc_43><loc_69><loc_44></location></section_header_level_1>
|
||||
<table>
|
||||
<location><page_13><loc_55><loc_37><loc_80><loc_43></location>
|
||||
</table>
|
||||
<section_header_level_1><location><page_13><loc_55><loc_35><loc_67><loc_36></location></section_header_level_1>
|
||||
<table>
|
||||
<location><page_13><loc_55><loc_28><loc_80><loc_34></location>
|
||||
</table>
|
||||
<section_header_level_1><location><page_13><loc_55><loc_25><loc_66><loc_26></location></section_header_level_1>
|
||||
<figure>
|
||||
<location><page_13><loc_55><loc_16><loc_85><loc_25></location>
|
||||
<caption>Figure 10: Example of a complex table with empty cells.</caption>
|
||||
</figure>
|
||||
<section_header_level_1><location><page_14><loc_8><loc_86><loc_12><loc_87></location></section_header_level_1>
|
||||
<figure>
|
||||
<location><page_14><loc_8><loc_56><loc_46><loc_87></location>
|
||||
<caption>Figure 11: Simple table with different style and empty cells.</caption>
|
||||
</figure>
|
||||
<section_header_level_1><location><page_14><loc_8><loc_43><loc_11><loc_44></location></section_header_level_1>
|
||||
<table>
|
||||
<location><page_14><loc_8><loc_38><loc_51><loc_43></location>
|
||||
</table>
|
||||
<section_header_level_1><location><page_14><loc_8><loc_37><loc_20><loc_37></location></section_header_level_1>
|
||||
<table>
|
||||
<location><page_14><loc_8><loc_32><loc_51><loc_36></location>
|
||||
</table>
|
||||
<section_header_level_1><location><page_14><loc_8><loc_30><loc_18><loc_31></location></section_header_level_1>
|
||||
<table>
|
||||
<location><page_14><loc_8><loc_25><loc_51><loc_30></location>
|
||||
</table>
|
||||
<section_header_level_1><location><page_14><loc_8><loc_23><loc_18><loc_24></location></section_header_level_1>
|
||||
<figure>
|
||||
<location><page_14><loc_8><loc_17><loc_29><loc_23></location>
|
||||
<caption>Figure 12: Simple table predictions and post processing.</caption>
|
||||
@ -389,30 +335,13 @@
|
||||
<location><page_14><loc_52><loc_55><loc_87><loc_89></location>
|
||||
<caption>Figure 13: Table predictions example on colorful table.</caption>
|
||||
</figure>
|
||||
<section_header_level_1><location><page_14><loc_52><loc_46><loc_55><loc_46></location></section_header_level_1>
|
||||
<table>
|
||||
<location><page_14><loc_52><loc_40><loc_85><loc_46></location>
|
||||
</table>
|
||||
<section_header_level_1><location><page_14><loc_52><loc_38><loc_63><loc_39></location></section_header_level_1>
|
||||
<table>
|
||||
<location><page_14><loc_52><loc_32><loc_85><loc_38></location>
|
||||
</table>
|
||||
<section_header_level_1><location><page_14><loc_52><loc_31><loc_61><loc_32></location></section_header_level_1>
|
||||
<table>
|
||||
<location><page_14><loc_52><loc_25><loc_85><loc_31></location>
|
||||
</table>
|
||||
<section_header_level_1><location><page_14><loc_52><loc_23><loc_61><loc_23></location></section_header_level_1>
|
||||
<table>
|
||||
<location><page_14><loc_52><loc_16><loc_87><loc_23></location>
|
||||
<caption>Figure 14: Example with multi-line text.</caption>
|
||||
</table>
|
||||
<paragraph><location><page_14><loc_56><loc_13><loc_83><loc_14></location>Figure 14: Example with multi-line text.</paragraph>
|
||||
<figure>
|
||||
<location><page_15><loc_9><loc_69><loc_46><loc_83></location>
|
||||
</figure>
|
||||
<figure>
|
||||
<location><page_15><loc_9><loc_53><loc_46><loc_67></location>
|
||||
</figure>
|
||||
<section_header_level_1><location><page_15><loc_9><loc_51><loc_18><loc_52></location></section_header_level_1>
|
||||
<figure>
|
||||
<location><page_15><loc_9><loc_37><loc_46><loc_51></location>
|
||||
</figure>
|
||||
@ -420,18 +349,9 @@
|
||||
<location><page_15><loc_8><loc_20><loc_52><loc_36></location>
|
||||
<caption>Figure 15: Example with triangular table.</caption>
|
||||
</figure>
|
||||
<table>
|
||||
<location><page_15><loc_53><loc_72><loc_86><loc_85></location>
|
||||
</table>
|
||||
<section_header_level_1><location><page_15><loc_53><loc_70><loc_70><loc_71></location></section_header_level_1>
|
||||
<table>
|
||||
<location><page_15><loc_53><loc_57><loc_86><loc_69></location>
|
||||
</table>
|
||||
<section_header_level_1><location><page_15><loc_53><loc_55><loc_67><loc_56></location></section_header_level_1>
|
||||
<figure>
|
||||
<location><page_15><loc_53><loc_41><loc_86><loc_54></location>
|
||||
</figure>
|
||||
<section_header_level_1><location><page_15><loc_58><loc_39><loc_73><loc_39></location></section_header_level_1>
|
||||
<figure>
|
||||
<location><page_15><loc_58><loc_20><loc_81><loc_38></location>
|
||||
<caption>Figure 16: Example of how post-processing helps to restore mis-aligned bounding boxes prediction artifact.</caption>
|
||||
|
File diff suppressed because one or more lines are too long
@ -16,11 +16,11 @@ The occurrence of tables in documents is ubiquitous. They often summarise quanti
|
||||
|
||||
<!-- image -->
|
||||
|
||||
## b. Red-annotation of bounding boxes, Blue-predictions by TableFormer
|
||||
- b. Red-annotation of bounding boxes, Blue-predictions by TableFormer
|
||||
|
||||
<!-- image -->
|
||||
|
||||
## c. Structure predicted by TableFormer:
|
||||
- c. Structure predicted by TableFormer:
|
||||
|
||||
Figure 1: Picture of a table with subtle, complex features such as (1) multi-column headers, (2) cell with multi-row text and (3) cells with no content. Image from PubTabNet evaluation set, filename: 'PMC2944238 004 02'.
|
||||
|
||||
@ -221,8 +221,7 @@ Table 4: Results of structure with content retrieved using cell detection on Pub
|
||||
| TableFormer | 95.4 | 90.1 | 93.6 |
|
||||
|
||||
- a.
|
||||
|
||||
Red - PDF cells, Green - predicted bounding boxes, Blue - post-processed predictions matched to PDF cells
|
||||
- Red - PDF cells, Green - predicted bounding boxes, Blue - post-processed predictions matched to PDF cells
|
||||
|
||||
Japanese language (previously unseen by TableFormer):
|
||||
|
||||
@ -381,9 +380,7 @@ where c is one of { left, centroid, right } and x$\_{c}$ is the xcoordinate for
|
||||
|
||||
- 9d. Intersect the orphan's bounding box with the column bands, and map the cell to the closest grid column.
|
||||
- 5. Use the alignment computed in step 4, to compute the median x -coordinate for all table columns and the me-
|
||||
|
||||
9e. If the table cell under the identified row and column is not empty, extend its content with the content of the or-
|
||||
|
||||
- 9e. If the table cell under the identified row and column is not empty, extend its content with the content of the or-
|
||||
- 9b. Intersect the orphan's bounding box with the row bands, and map the cell to the closest grid row.
|
||||
- 9c. Compute the left and right boundary of the vertical band for each grid column (min/max x coordinates per column).
|
||||
- 8. In some rare occasions, we have noticed that TableFormer can confuse a single column as two. When the postprocessing steps are applied, this results with two predicted columns pointing to the same PDF column. In such case we must de-duplicate the columns according to highest total column intersection score.
|
||||
@ -399,54 +396,20 @@ phan cell.
|
||||
|
||||
Aditional images with examples of TableFormer predictions and post-processing can be found below.
|
||||
|
||||
##
|
||||
|
||||
##
|
||||
|
||||
##
|
||||
|
||||
##
|
||||
|
||||
Figure 8: Example of a table with multi-line header.
|
||||
|
||||
##
|
||||
|
||||
##
|
||||
|
||||
##
|
||||
|
||||
##
|
||||
|
||||
Figure 9: Example of a table with big empty distance between cells.
|
||||
|
||||
<!-- image -->
|
||||
|
||||
##
|
||||
|
||||
##
|
||||
|
||||
##
|
||||
|
||||
##
|
||||
|
||||
Figure 10: Example of a complex table with empty cells.
|
||||
|
||||
<!-- image -->
|
||||
|
||||
##
|
||||
|
||||
Figure 11: Simple table with different style and empty cells.
|
||||
|
||||
<!-- image -->
|
||||
|
||||
##
|
||||
|
||||
##
|
||||
|
||||
##
|
||||
|
||||
##
|
||||
|
||||
Figure 12: Simple table predictions and post processing.
|
||||
|
||||
<!-- image -->
|
||||
@ -455,36 +418,20 @@ Figure 13: Table predictions example on colorful table.
|
||||
|
||||
<!-- image -->
|
||||
|
||||
##
|
||||
|
||||
##
|
||||
|
||||
##
|
||||
|
||||
##
|
||||
|
||||
Figure 14: Example with multi-line text.
|
||||
|
||||
<!-- image -->
|
||||
|
||||
<!-- image -->
|
||||
|
||||
##
|
||||
|
||||
<!-- image -->
|
||||
|
||||
Figure 15: Example with triangular table.
|
||||
|
||||
<!-- image -->
|
||||
|
||||
##
|
||||
|
||||
##
|
||||
|
||||
<!-- image -->
|
||||
|
||||
##
|
||||
|
||||
Figure 16: Example of how post-processing helps to restore mis-aligned bounding boxes prediction artifact.
|
||||
|
||||
<!-- image -->
|
||||
|
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
@ -2,7 +2,8 @@
|
||||
<section_header_level_1><location><page_1><loc_22><loc_82><loc_79><loc_85></location>Optimized Table Tokenization for Table Structure Recognition</section_header_level_1>
|
||||
<text><location><page_1><loc_23><loc_75><loc_78><loc_79></location>Maksym Lysak [0000 - 0002 - 3723 - $^{6960]}$, Ahmed Nassar[0000 - 0002 - 9468 - $^{0822]}$, Nikolaos Livathinos [0000 - 0001 - 8513 - $^{3491]}$, Christoph Auer[0000 - 0001 - 5761 - $^{0422]}$, [0000 - 0002 - 8088 - 0823]</text>
|
||||
<text><location><page_1><loc_38><loc_74><loc_49><loc_75></location>and Peter Staar</text>
|
||||
<text><location><page_1><loc_36><loc_70><loc_64><loc_73></location>{mly,ahn,nli,cau,taa}@zurich.ibm.com IBM Research</text>
|
||||
<text><location><page_1><loc_46><loc_72><loc_55><loc_73></location>IBM Research</text>
|
||||
<text><location><page_1><loc_36><loc_70><loc_64><loc_71></location>{mly,ahn,nli,cau,taa}@zurich.ibm.com</text>
|
||||
<text><location><page_1><loc_27><loc_41><loc_74><loc_66></location>Abstract. Extracting tables from documents is a crucial task in any document conversion pipeline. Recently, transformer-based models have demonstrated that table-structure can be recognized with impressive accuracy using Image-to-Markup-Sequence (Im2Seq) approaches. Taking only the image of a table, such models predict a sequence of tokens (e.g. in HTML, LaTeX) which represent the structure of the table. Since the token representation of the table structure has a significant impact on the accuracy and run-time performance of any Im2Seq model, we investigate in this paper how table-structure representation can be optimised. We propose a new, optimised table-structure language (OTSL) with a minimized vocabulary and specific rules. The benefits of OTSL are that it reduces the number of tokens to 5 (HTML needs 28+) and shortens the sequence length to half of HTML on average. Consequently, model accuracy improves significantly, inference time is halved compared to HTML-based models, and the predicted table structures are always syntactically correct. This in turn eliminates most post-processing needs. Popular table structure data-sets will be published in OTSL format to the community.</text>
|
||||
<text><location><page_1><loc_27><loc_37><loc_74><loc_40></location>Keywords: Table Structure Recognition · Data Representation · Transformers · Optimization.</text>
|
||||
<section_header_level_1><location><page_1><loc_22><loc_33><loc_37><loc_34></location>1 Introduction</section_header_level_1>
|
||||
@ -56,7 +57,9 @@
|
||||
<unordered_list>
|
||||
<list_item><location><page_7><loc_23><loc_54><loc_79><loc_56></location>1. Left-looking cell rule : The left neighbour of an "L" cell must be either another "L" cell or a "C" cell.</list_item>
|
||||
<list_item><location><page_7><loc_23><loc_51><loc_79><loc_53></location>2. Up-looking cell rule : The upper neighbour of a "U" cell must be either another "U" cell or a "C" cell.</list_item>
|
||||
<list_item><location><page_7><loc_23><loc_49><loc_37><loc_50></location>3. Cross cell rule :</list_item>
|
||||
</unordered_list>
|
||||
<section_header_level_1><location><page_7><loc_23><loc_49><loc_37><loc_50></location>3. Cross cell rule :</section_header_level_1>
|
||||
<unordered_list>
|
||||
<list_item><location><page_7><loc_25><loc_44><loc_79><loc_49></location>The left neighbour of an "X" cell must be either another "X" cell or a "U" cell, and the upper neighbour of an "X" cell must be either another "X" cell or an "L" cell.</list_item>
|
||||
<list_item><location><page_7><loc_23><loc_43><loc_78><loc_44></location>4. First row rule : Only "L" cells and "C" cells are allowed in the first row.</list_item>
|
||||
<list_item><location><page_7><loc_23><loc_40><loc_79><loc_43></location>5. First column rule : Only "U" cells and "C" cells are allowed in the first column.</list_item>
|
||||
|
File diff suppressed because one or more lines are too long
@ -4,7 +4,9 @@ Maksym Lysak [0000 - 0002 - 3723 - $^{6960]}$, Ahmed Nassar[0000 - 0002 - 9468 -
|
||||
|
||||
and Peter Staar
|
||||
|
||||
{mly,ahn,nli,cau,taa}@zurich.ibm.com IBM Research
|
||||
IBM Research
|
||||
|
||||
{mly,ahn,nli,cau,taa}@zurich.ibm.com
|
||||
|
||||
Abstract. Extracting tables from documents is a crucial task in any document conversion pipeline. Recently, transformer-based models have demonstrated that table-structure can be recognized with impressive accuracy using Image-to-Markup-Sequence (Im2Seq) approaches. Taking only the image of a table, such models predict a sequence of tokens (e.g. in HTML, LaTeX) which represent the structure of the table. Since the token representation of the table structure has a significant impact on the accuracy and run-time performance of any Im2Seq model, we investigate in this paper how table-structure representation can be optimised. We propose a new, optimised table-structure language (OTSL) with a minimized vocabulary and specific rules. The benefits of OTSL are that it reduces the number of tokens to 5 (HTML needs 28+) and shortens the sequence length to half of HTML on average. Consequently, model accuracy improves significantly, inference time is halved compared to HTML-based models, and the predicted table structures are always syntactically correct. This in turn eliminates most post-processing needs. Popular table structure data-sets will be published in OTSL format to the community.
|
||||
|
||||
@ -88,7 +90,9 @@ The OTSL representation follows these syntax rules:
|
||||
|
||||
- 1. Left-looking cell rule : The left neighbour of an "L" cell must be either another "L" cell or a "C" cell.
|
||||
- 2. Up-looking cell rule : The upper neighbour of a "U" cell must be either another "U" cell or a "C" cell.
|
||||
- 3. Cross cell rule :
|
||||
|
||||
## 3. Cross cell rule :
|
||||
|
||||
- The left neighbour of an "X" cell must be either another "X" cell or a "U" cell, and the upper neighbour of an "X" cell must be either another "X" cell or an "L" cell.
|
||||
- 4. First row rule : Only "L" cells and "C" cells are allowed in the first row.
|
||||
- 5. First column rule : Only "U" cells and "C" cells are allowed in the first column.
|
||||
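Reviewer note: the five OTSL syntax rules listed in this hunk can be checked mechanically. The sketch below is only an illustration and is not part of docling or of the OTSL authors' code. It assumes a table is given as a rectangular list of rows, each cell holding one of the tokens "C", "L", "U" or "X". The function name `otsl_violations` and the message format are hypothetical.

```python
from typing import List, Optional


def otsl_violations(grid: List[List[str]]) -> List[str]:
    """Return messages for cells that break rules 1-5 as quoted in the hunk above.

    Assumption (not from the source): `grid` is a rectangular list of rows
    whose cells are the structure tokens "C", "L", "U" or "X".
    """
    problems: List[str] = []
    for r, row in enumerate(grid):
        for c, tok in enumerate(row):
            # Neighbours are None at the table border.
            left: Optional[str] = row[c - 1] if c > 0 else None
            up: Optional[str] = (
                grid[r - 1][c] if r > 0 and c < len(grid[r - 1]) else None
            )

            # Rule 4: only "L" and "C" cells are allowed in the first row.
            if r == 0 and tok not in ("L", "C"):
                problems.append(f"rule 4 at ({r},{c}): {tok!r} in first row")
            # Rule 5: only "U" and "C" cells are allowed in the first column.
            if c == 0 and tok not in ("U", "C"):
                problems.append(f"rule 5 at ({r},{c}): {tok!r} in first column")
            # Rule 1: the left neighbour of an "L" cell must be "L" or "C".
            if tok == "L" and left is not None and left not in ("L", "C"):
                problems.append(f"rule 1 at ({r},{c}): left neighbour is {left!r}")
            # Rule 2: the upper neighbour of a "U" cell must be "U" or "C".
            if tok == "U" and up is not None and up not in ("U", "C"):
                problems.append(f"rule 2 at ({r},{c}): upper neighbour is {up!r}")
            # Rule 3: an "X" cell needs "X"/"U" to its left and "X"/"L" above.
            if tok == "X":
                if left is not None and left not in ("X", "U"):
                    problems.append(f"rule 3 at ({r},{c}): left neighbour is {left!r}")
                if up is not None and up not in ("X", "L"):
                    problems.append(f"rule 3 at ({r},{c}): upper neighbour is {up!r}")
    return problems
```

For example, `otsl_violations([["C", "L"], ["U", "X"]])` describes a single 2x2 spanning cell and yields no violations, while a lone "X" surrounded by "C" cells is reported twice under rule 3.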
|
File diff suppressed because one or more lines are too long
@ -265,8 +265,8 @@
|
||||
<list_item><location><page_13><loc_25><loc_58><loc_66><loc_59></location>-Employees can see only their own unmasked TAX_ID.</list_item>
|
||||
<list_item><location><page_13><loc_25><loc_55><loc_89><loc_57></location>-Managers see a masked version of TAX_ID with the first five characters replaced with the X character (for example, XXX-XX-1234).</list_item>
|
||||
<list_item><location><page_13><loc_25><loc_52><loc_87><loc_54></location>-Any other person sees the entire TAX_ID as masked, for example, XXX-XX-XXXX.</list_item>
|
||||
<list_item><location><page_13><loc_25><loc_50><loc_87><loc_51></location>To implement this column mask, run the SQL statement that is shown in Example 3-9.</list_item>
|
||||
</unordered_list>
|
||||
<text><location><page_13><loc_25><loc_50><loc_87><loc_51></location>To implement this column mask, run the SQL statement that is shown in Example 3-9.</text>
|
||||
<paragraph><location><page_13><loc_22><loc_48><loc_58><loc_49></location>Example 3-9 Creating a mask on the TAX_ID column</paragraph>
|
||||
<code><location><page_13><loc_22><loc_14><loc_86><loc_47></location>CREATE MASK HR_SCHEMA.MASK_TAX_ID_ON_EMPLOYEES ON HR_SCHEMA.EMPLOYEES AS EMPLOYEES FOR COLUMN TAX_ID RETURN CASE WHEN VERIFY_GROUP_FOR_USER ( SESSION_USER , 'HR' ) = 1 THEN EMPLOYEES . TAX_ID WHEN VERIFY_GROUP_FOR_USER ( SESSION_USER , 'MGR' ) = 1 AND SESSION_USER = EMPLOYEES . USER_ID THEN EMPLOYEES . TAX_ID WHEN VERIFY_GROUP_FOR_USER ( SESSION_USER , 'MGR' ) = 1 AND SESSION_USER <> EMPLOYEES . USER_ID THEN ( 'XXX-XX-' CONCAT QSYS2 . SUBSTR ( EMPLOYEES . TAX_ID , 8 , 4 ) ) WHEN VERIFY_GROUP_FOR_USER ( SESSION_USER , 'EMP' ) = 1 THEN EMPLOYEES . TAX_ID ELSE 'XXX-XX-XXXX' END ENABLE ;</code>
|
||||
<unordered_list>
|
||||
@ -281,7 +281,7 @@
|
||||
<unordered_list>
|
||||
<list_item><location><page_14><loc_22><loc_65><loc_67><loc_66></location>1. Run the SQL statements that are shown in Example 3-10.</list_item>
|
||||
</unordered_list>
|
||||
<paragraph><location><page_14><loc_22><loc_62><loc_61><loc_63></location>Example 3-10 Activating RCAC on the EMPLOYEES table</paragraph>
|
||||
<section_header_level_1><location><page_14><loc_22><loc_62><loc_61><loc_63></location>Example 3-10 Activating RCAC on the EMPLOYEES table</section_header_level_1>
|
||||
<unordered_list>
|
||||
<list_item><location><page_14><loc_22><loc_60><loc_62><loc_61></location>/* Active Row Access Control (permissions) */</list_item>
|
||||
<list_item><location><page_14><loc_22><loc_58><loc_58><loc_60></location>/* Active Column Access Control (masks)</list_item>
|
||||
|
File diff suppressed because one or more lines are too long
@ -334,8 +334,7 @@ WHEN VERIFY\_GROUP\_FOR\_USER ( SESSION\_USER , 'HR', 'EMP' ) = 1 THEN EMPLOYEES
|
||||
- -Employees can see only their own unmasked TAX\_ID.
|
||||
- -Managers see a masked version of TAX\_ID with the first five characters replaced with the X character (for example, XXX-XX-1234).
|
||||
- -Any other person sees the entire TAX\_ID as masked, for example, XXX-XX-XXXX.
|
||||
|
||||
To implement this column mask, run the SQL statement that is shown in Example 3-9.
|
||||
- To implement this column mask, run the SQL statement that is shown in Example 3-9.
|
||||
|
||||
Example 3-9 Creating a mask on the TAX\_ID column
|
||||
|
||||
@ -355,7 +354,7 @@ Now that you have created the row permission and the two column masks, RCAC must
|
||||
|
||||
- 1. Run the SQL statements that are shown in Example 3-10.
|
||||
|
||||
Example 3-10 Activating RCAC on the EMPLOYEES table
|
||||
## Example 3-10 Activating RCAC on the EMPLOYEES table
|
||||
|
||||
- /* Active Row Access Control (permissions) */
|
||||
- /* Active Column Access Control (masks)
|
||||
|
File diff suppressed because one or more lines are too long