Mirror of https://github.com/DS4SD/docling.git (synced 2025-12-08 12:48:28 +00:00)
fix: Fixes for wordx (#432)
* fixes for referencing drawing blip in wordx

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* Added safety try-except when trying to load pillow image from a docx blob. Added explicit dependency on lxml.

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* Added test for word file with embedded emf images, re-generated full tests for docx, eased up dependency on lxml

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* Updated lxml dependency version

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

---------

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
Co-authored-by: Maksym Lysak <mly@zurich.ibm.com>
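The "safety try-except" around loading a Pillow image from a docx blob could look roughly like the sketch below. This is an illustration only, not the code from this commit: the function name `load_image_from_blob` is hypothetical, and the sketch assumes that an unreadable blob (for example an embedded EMF image that Pillow cannot decode on the current platform) should be skipped rather than abort the conversion.

```python
from io import BytesIO

try:
    from PIL import Image  # Pillow; treated as optional so the sketch degrades gracefully
except ImportError:
    Image = None


def load_image_from_blob(blob):
    """Hypothetical helper: return a PIL image decoded from raw bytes, or None.

    None is returned when Pillow is unavailable or the bytes cannot be decoded.
    """
    if Image is None:
        return None
    try:
        image = Image.open(BytesIO(blob))
        # Image.open is lazy; load() forces decoding so that failures surface here.
        image.load()
        return image
    except (OSError, ValueError):
        # Pillow's UnidentifiedImageError is a subclass of OSError.
        return None
```

A caller would then check for `None` and, for instance, emit a picture placeholder without image data instead of failing on the whole document.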
45
tests/data/groundtruth/docling_v2/word_sample.md
Normal file
@@ -0,0 +1,45 @@
Summer activities

# Swimming in the lake

Duck

<!-- image -->

Figure 1: This is a cute duckling

## Let’s swim!

To get started with swimming, first lay down in a water and try not to drown:

- You can relax and look around
- Paddle about
- Enjoy summer warmth

Also, don’t forget:

- Wear sunglasses
- Don’t forget to drink water
- Use sun cream

Hmm, what else…

### Let’s eat

After we had a good day of swimming in the lake, it’s important to eat something nice

I like to eat leaves

Here are some interesting things a respectful duck could eat:

|         | Food                             | Calories per portion   |
|---------|----------------------------------|------------------------|
| Leaves  | Ash, Elm, Maple                  | 50                     |
| Berries | Blueberry, Strawberry, Cranberry | 150                    |
| Grain   | Corn, Buckwheat, Barley          | 200                    |

And let’s add another list in the end:

- Leaves
- Berries
- Grain