fix: Fixes for wordx (#432)

* fixes for referencing drawing blip in wordx

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* Added safety try-except when trying to load pillow image from a docx blob. Added explicit dependency on lxml.

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* Added test for word file with embedded emf images, re-generated full tests for docx, eased up dependency on lxml

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* Updated lxml dependency version

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

---------

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
Co-authored-by: Maksym Lysak <mly@zurich.ibm.com>
This commit is contained in:
Maxim Lysak
2024-11-26 14:44:43 +01:00
committed by GitHub
parent d7072b4b56
commit d0a1180478
13 changed files with 1295 additions and 152 deletions

View File

@@ -48,6 +48,7 @@ beautifulsoup4 = "^4.12.3"
pandas = "^2.1.4"
marko = "^2.1.2"
openpyxl = "^3.1.5"
lxml = ">=4.0.0,<6.0.0"
ocrmac = { version = "^1.0.0", markers = "sys_platform == 'darwin'", optional = true }
[tool.poetry.group.dev.dependencies]