mirror of https://github.com/DS4SD/docling.git (synced 2025-12-08 12:48:28 +00:00)
fix(pypdfium2): Fix OCR bounding box misalignment caused by mismatched rotation metadata (#2039)
* Fix OCR bounding box misalignment caused by rotation metadata
  Signed-off-by: AndrewTsai0406 <tsai247365@gmail.com>
* Add rotation-mismatch scanned pdf test case
  Signed-off-by: AndrewTsai0406 <tsai247365@gmail.com>
* add ground truth for ocr_test_rotation_mismatch.pdf
  Signed-off-by: AndrewTsai0406 <tsai247365@gmail.com>
* add ground truth for ocr_test_rotation_mismatch.pdf
  Signed-off-by: AndrewTsai0406 <tsai247365@gmail.com>
* Updated test GT and merged from main
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Fix OCR test by excluding mismatched rotation example
  Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

---------

Signed-off-by: AndrewTsai0406 <tsai247365@gmail.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
BIN
tests/data_scanned/sample_with_rotation_mismatch.pdf
vendored
Normal file
Binary file not shown.