perf: prevent temp file leftovers, reuse core type (#487)

* chore: reuse DocumentStream from docling-core

Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>

* update docling-core version

Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>

* [skip ci] document import line

Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>

* fix: use new resolve_source_to_x functions to avoid tempfile leftovers (#490)

use new resolve_source_to_x functions

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

---------

Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com>
Panos Vagenas
2024-12-03 10:40:28 +01:00
committed by GitHub
parent d3f84b2457
commit 051789d017
5 changed files with 103 additions and 97 deletions
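
The commit message above mentions avoiding temp file leftovers. As a minimal sketch of that general idea (not the actual `resolve_source_to_x` implementation, which is not shown here), the two hypothetical helpers below either keep the payload in memory or write it into a scoped working directory that is removed on exit. The names `load_to_stream` and `process_via_workdir` are assumptions made for illustration only.

```python
import tempfile
from io import BytesIO
from pathlib import Path
from typing import Tuple


def load_to_stream(data: bytes) -> BytesIO:
    """Hypothetical helper: keep the payload in memory, so no file is created."""
    return BytesIO(data)


def process_via_workdir(data: bytes, name: str) -> Tuple[int, Path]:
    """Hypothetical helper: use a scoped directory that is deleted on exit."""
    with tempfile.TemporaryDirectory() as workdir:
        path = Path(workdir) / name
        path.write_bytes(data)
        # Do all file-based work inside the `with` block.
        size = path.stat().st_size
    # The directory and the file inside it no longer exist at this point.
    return size, path


size, stale_path = process_via_workdir(b"abc", "example.pdf")
print(size, stale_path.exists())
```

Either approach leaves nothing behind: the in-memory stream is garbage-collected, and the context manager removes the directory even if the processing step raises.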

@@ -1,5 +1,4 @@
 from enum import Enum, auto
-from io import BytesIO
 from typing import TYPE_CHECKING, Dict, List, Optional, Union
 from docling_core.types.doc import (
@@ -9,6 +8,9 @@ from docling_core.types.doc import (
     Size,
     TableCell,
 )
+from docling_core.types.io import (  # DO NOT REMOVE; explicitly exposed from this location
+    DocumentStream,
+)
 from PIL.Image import Image
 from pydantic import BaseModel, ConfigDict
@@ -207,10 +209,3 @@ class Page(BaseModel):
     @property
     def image(self) -> Optional[Image]:
         return self.get_image(scale=self._default_image_scale)
-class DocumentStream(BaseModel):
-    model_config = ConfigDict(arbitrary_types_allowed=True)
-    name: str
-    stream: BytesIO
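
For readers unfamiliar with the type being moved, the following is a rough sketch of its shape: a name paired with an in-memory byte stream. It uses a standard-library dataclass as a stand-in so that it runs without extra dependencies; the real `DocumentStream` is a pydantic model, and `DocumentStreamSketch` and `make_stream` are hypothetical names, not part of docling or docling-core.

```python
from dataclasses import dataclass
from io import BytesIO


@dataclass
class DocumentStreamSketch:
    """Stand-in with the same two fields as the removed class (not the real model)."""

    name: str
    stream: BytesIO


def make_stream(name: str, payload: bytes) -> DocumentStreamSketch:
    """Wrap raw bytes in an in-memory stream, so no file on disk is needed."""
    return DocumentStreamSketch(name=name, stream=BytesIO(payload))


doc = make_stream("example.pdf", b"%PDF-1.7")
print(doc.name, len(doc.stream.getvalue()))
```

Because the diff keeps `DocumentStream` importable from its old location via the re-export, existing callers that import it from there should presumably keep working unchanged.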