feat: support xlsm files (#1520)

* code for xlsm support

* updated support for xlsm

* updated code for xlsm support

* Update docling_parse_v4_backend.py

Signed-off-by: ShiroYasha18 <85089952+ShiroYasha18@users.noreply.github.com>

* Update docling_parse_v4_backend.py

Signed-off-by: ShiroYasha18 <85089952+ShiroYasha18@users.noreply.github.com>

* Update test_backend_msexcel_xlsm.py

Updated tests/test_backend_msexcel_xlsm.py:

  * has a function whose name starts with test
  * removed all print statements
  * to add: an explicit assert {test}=={pred}

Signed-off-by: ShiroYasha18 <85089952+ShiroYasha18@users.noreply.github.com>

* Update base_models.py

Signed-off-by: ShiroYasha18 <85089952+ShiroYasha18@users.noreply.github.com>

* Update test_backend_msexcel.py

Signed-off-by: ShiroYasha18 <85089952+ShiroYasha18@users.noreply.github.com>

* Update test_backend_msexcel_xlsm.py

Signed-off-by: ShiroYasha18 <85089952+ShiroYasha18@users.noreply.github.com>

* Update document_converter.py

Signed-off-by: ShiroYasha18 <85089952+ShiroYasha18@users.noreply.github.com>

* Delete tests/test_backend_msexcel_xlsm.py

Signed-off-by: ShiroYasha18 <85089952+ShiroYasha18@users.noreply.github.com>

* xlsm file

Signed-off-by: ShiroYasha18 <85089952+ShiroYasha18@users.noreply.github.com>

* run tests

* ran tests

* Fix tests, upgrade XLSM example to a valid file

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

---------

Signed-off-by: ShiroYasha18 <85089952+ShiroYasha18@users.noreply.github.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
Ayraf
2025-06-10 20:25:59 +05:30
committed by GitHub
parent 6613b9e98b
commit df140227c3
19 changed files with 4834 additions and 632 deletions

@@ -5,92 +5,89 @@ item-0 at level 0: unspecified: group _root_
item-4 at level 1: section: group textbox
item-5 at level 2: paragraph: Student falls ill
item-6 at level 2: paragraph:
item-7 at level 2: paragraph:
item-8 at level 2: list: group list
item-9 at level 3: list_item: Suggested Reportable Symptoms:
item-7 at level 2: list: group list
item-8 at level 3: list_item: Suggested Reportable Symptoms:
... sh
Blisters
Headache
Sore throat
item-10 at level 1: list_item:
item-9 at level 1: list_item:
item-10 at level 1: paragraph:
item-11 at level 1: paragraph:
item-12 at level 1: paragraph:
item-13 at level 1: section: group textbox
item-14 at level 2: paragraph: If a caregiver suspects that wit ... the same suggested reportable symptoms
item-12 at level 1: section: group textbox
item-13 at level 2: paragraph: If a caregiver suspects that wit ... the same suggested reportable symptoms
item-14 at level 1: paragraph:
item-15 at level 1: paragraph:
item-16 at level 1: paragraph:
item-17 at level 1: paragraph:
item-18 at level 1: paragraph:
item-19 at level 1: section: group textbox
item-20 at level 2: paragraph: Yes
item-18 at level 1: section: group textbox
item-19 at level 2: paragraph: Yes
item-20 at level 1: paragraph:
item-21 at level 1: paragraph:
item-22 at level 1: paragraph:
item-23 at level 1: section: group textbox
item-24 at level 2: list: group list
item-25 at level 3: list_item: A report must be submitted withi ... saster Prevention Information Network.
item-26 at level 3: list_item: A report must also be submitted ... d Infectious Disease Reporting System.
item-27 at level 2: paragraph:
item-28 at level 2: paragraph:
item-29 at level 1: list: group list
item-30 at level 2: list_item:
item-22 at level 1: section: group textbox
item-23 at level 2: list: group list
item-24 at level 3: list_item: A report must be submitted withi ... saster Prevention Information Network.
item-25 at level 3: list_item: A report must also be submitted ... d Infectious Disease Reporting System.
item-26 at level 2: paragraph:
item-27 at level 1: list: group list
item-28 at level 2: list_item:
item-29 at level 1: paragraph:
item-30 at level 1: paragraph:
item-31 at level 1: paragraph:
item-32 at level 1: paragraph:
item-33 at level 1: paragraph:
item-34 at level 1: paragraph:
item-35 at level 1: paragraph:
item-36 at level 1: section: group textbox
item-37 at level 2: paragraph: Health Bureau:
item-38 at level 2: paragraph: Upon receiving a report from the ... rt to the Centers for Disease Control.
item-39 at level 2: list: group list
item-40 at level 3: list_item: If necessary, provide health edu ... vidual to undergo specimen collection.
item-41 at level 3: list_item: Implement appropriate epidemic p ... the Communicable Disease Control Act.
item-42 at level 2: paragraph:
item-43 at level 2: paragraph:
item-44 at level 1: list: group list
item-45 at level 2: list_item:
item-46 at level 1: paragraph:
item-47 at level 1: section: group textbox
item-48 at level 2: paragraph: Department of Education:
item-34 at level 1: section: group textbox
item-35 at level 2: paragraph: Health Bureau:
item-36 at level 2: paragraph: Upon receiving a report from the ... rt to the Centers for Disease Control.
item-37 at level 2: list: group list
item-38 at level 3: list_item: If necessary, provide health edu ... vidual to undergo specimen collection.
item-39 at level 3: list_item: Implement appropriate epidemic p ... the Communicable Disease Control Act.
item-40 at level 2: paragraph:
item-41 at level 1: list: group list
item-42 at level 2: list_item:
item-43 at level 1: paragraph:
item-44 at level 1: section: group textbox
item-45 at level 2: paragraph: Department of Education:
Collabo ... vention measures at all school levels.
item-46 at level 1: paragraph:
item-47 at level 1: paragraph:
item-48 at level 1: paragraph:
item-49 at level 1: paragraph:
item-50 at level 1: paragraph:
item-51 at level 1: paragraph:
item-52 at level 1: paragraph:
item-53 at level 1: paragraph:
item-54 at level 1: paragraph:
item-55 at level 1: paragraph:
item-56 at level 1: section: group textbox
item-57 at level 2: inline: group group
item-58 at level 3: paragraph: The Health Bureau will handle
item-59 at level 3: paragraph: reporting and specimen collection
item-60 at level 3: paragraph: .
item-61 at level 2: paragraph:
item-62 at level 2: paragraph:
item-63 at level 1: paragraph:
item-64 at level 1: paragraph:
item-53 at level 1: section: group textbox
item-54 at level 2: inline: group group
item-55 at level 3: paragraph: The Health Bureau will handle
item-56 at level 3: paragraph: reporting and specimen collection
item-57 at level 3: paragraph: .
item-58 at level 2: paragraph:
item-59 at level 1: paragraph:
item-60 at level 1: paragraph:
item-61 at level 1: paragraph:
item-62 at level 1: section: group textbox
item-63 at level 2: paragraph: Whether the epidemic has eased.
item-64 at level 2: paragraph:
item-65 at level 1: paragraph:
item-66 at level 1: section: group textbox
item-67 at level 2: paragraph: Whether the epidemic has eased.
item-68 at level 2: paragraph:
item-69 at level 2: paragraph:
item-67 at level 2: paragraph: Whether the test results are pos ... legally designated infectious disease.
item-68 at level 2: paragraph: No
item-69 at level 1: paragraph:
item-70 at level 1: paragraph:
item-71 at level 1: section: group textbox
item-72 at level 2: paragraph: Whether the test results are pos ... legally designated infectious disease.
item-73 at level 2: paragraph: No
item-74 at level 1: paragraph:
item-75 at level 1: paragraph:
item-76 at level 1: section: group textbox
item-72 at level 2: paragraph: Yes
item-73 at level 1: paragraph:
item-74 at level 1: section: group textbox
item-75 at level 2: paragraph: Yes
item-76 at level 1: paragraph:
item-77 at level 1: paragraph:
item-78 at level 1: section: group textbox
item-79 at level 1: paragraph:
item-80 at level 1: paragraph:
item-81 at level 1: section: group textbox
item-82 at level 2: paragraph: Case closed.
item-83 at level 2: paragraph:
item-84 at level 2: paragraph:
item-85 at level 2: paragraph: The Health Bureau will carry out ... ters for Disease Control if necessary.
item-79 at level 2: paragraph: Case closed.
item-80 at level 2: paragraph:
item-81 at level 2: paragraph: The Health Bureau will carry out ... ters for Disease Control if necessary.
item-82 at level 1: paragraph:
item-83 at level 1: section: group textbox
item-84 at level 2: paragraph: No
item-85 at level 1: paragraph:
item-86 at level 1: paragraph:
item-87 at level 1: section: group textbox
item-88 at level 1: paragraph:
item-89 at level 1: paragraph:
item-90 at level 1: paragraph:
item-87 at level 1: paragraph: