Commit Graph

4 Commits

Author SHA1 Message Date
Christoph Auer
b77f73beec Text fixes, new test data
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2025-03-14 11:44:09 +01:00
Christoph Auer
e00f362405 Update tests, use TextCell.from_ocr property
Some checks failed
Run Docs CI / build-docs (push) Failing after 1m26s
Run CI / code-checks (push) Failing after 6m37s
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2025-03-13 16:04:08 +01:00
Christoph Auer
cf78d5b7b9
feat: Add content_layer property to items to address body, furniture and other roles (#735)
* feat: Pass predicted page-headers and page-footers through to DoclingDocument furniture

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* chore: Update all test GT

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* fix: update all test cases

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* fix: update all test cases again

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Update lock

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Update lock to final docling-core

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

---------

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2025-02-10 12:07:49 +01:00
Maxim Lysak
7a97d7119f
feat: Extracting picture data for raster images found in PPTX (#349)
* Added picture data for pptx pictures

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* Added tests for pptx

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* Inferring image DPI from pptx file

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

---------

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
Co-authored-by: Maksym Lysak <mly@zurich.ibm.com>
2024-11-18 15:22:28 +01:00