17 Commits

Author SHA1 Message Date
Václav Vančura
b745459a34 Actor: Enhance Dockerfile with additional utilities and env vars
- Add installation of `time` and `procps` packages for better resource monitoring.
- Set environment variables `PYTHONUNBUFFERED`, `MALLOC_ARENA_MAX`, and `EASYOCR_DOWNLOAD_CACHE` for improved performance.
- Create a cache directory for EasyOCR to optimize storage usage.

Signed-off-by: Václav Vančura <commit@vancura.dev>
2025-03-13 10:38:04 +01:00
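
As a rough illustration of the environment setup described in commit b745459a34, the same settings can be expressed as shell exports. The variable names come from the commit message. The values and the cache path are assumptions, since the commit message does not state them, and it is not confirmed here how EasyOCR itself consumes `EASYOCR_DOWNLOAD_CACHE`.

```shell
#!/bin/sh
# Sketch only: the values below are assumptions, not taken from the Dockerfile.

export PYTHONUNBUFFERED=1   # ask Python to flush stdout/stderr immediately
export MALLOC_ARENA_MAX=2   # cap the number of glibc malloc arenas (assumed value)

# Hypothetical cache location; the real path used in the image is not given.
export EASYOCR_DOWNLOAD_CACHE="${TMPDIR:-/tmp}/easyocr-cache"

# Create the cache directory up front so later downloads have a writable target.
mkdir -p "$EASYOCR_DOWNLOAD_CACHE"
```

In a Dockerfile the equivalent would normally be `ENV` instructions plus a `RUN mkdir -p ...` step. The shell form is only meant to make the intent of the three variables concrete.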
Václav Vančura
1b6d4b5c50 Actor: README update
Signed-off-by: Václav Vančura <commit@vancura.dev>
2025-03-13 10:38:00 +01:00
Václav Vančura
e261111daa Actor: Adding README
Signed-off-by: Václav Vančura <commit@vancura.dev>
2025-03-13 10:37:56 +01:00
Václav Vančura
f064f762f5 Actor: Updating Docling to 2.17.0
Signed-off-by: Václav Vančura <commit@vancura.dev>
2025-03-13 10:37:51 +01:00
Václav Vančura
7d651eb61f Actor: Improve script logging and error handling
- Initialize log file at `/tmp/docling.log` and redirect all output to it
- Remove the exit-on-error trap; the script now only logs error line numbers
- Use temporary directory for timestamp file
- Capture Docling exit code and handle errors more gracefully
- Update log file references to use `LOG_FILE` variable
- Remove local log file during cleanup

Signed-off-by: Václav Vančura <commit@vancura.dev>
2025-03-13 10:37:47 +01:00
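
The logging approach described in commit 7d651eb61f could look roughly like the sketch below. Only the `LOG_FILE` variable name and the `/tmp/docling.log` path come from the commit message. The helper `run_and_log` is hypothetical and stands in for however the actual script wraps the Docling call.

```shell
#!/usr/bin/env bash
# Sketch only: run_and_log is a hypothetical helper, not the actor's code.

LOG_FILE="${LOG_FILE:-/tmp/docling.log}"

# Initialize (truncate) the log file.
: > "$LOG_FILE"

# Log the failing line number instead of exiting. ERR traps are a bash
# feature, so the trap is only installed when running under bash.
if [ -n "${BASH_VERSION:-}" ]; then
    trap 'echo "Error on line $LINENO" >> "$LOG_FILE"' ERR
fi

# Run a command, append its stdout and stderr to the log, and hand the
# command's exit code back to the caller instead of aborting the script.
run_and_log() {
    local exit_code=0
    "$@" >> "$LOG_FILE" 2>&1 || exit_code=$?
    if [ "$exit_code" -ne 0 ]; then
        echo "Command failed with exit code $exit_code" >> "$LOG_FILE"
    fi
    return "$exit_code"
}
```

A caller can then decide how to react, for example `run_and_log some_command arg || echo "conversion failed"`, rather than letting the first failure terminate the run.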
Václav Vančura
ff7d64b421 Actor: Improve shell script robustness and error handling
The shell script has been enhanced with better error handling, input validation, and cleanup procedures. Key improvements include:

- Added proper quoting around variables to prevent word splitting.
- Improved error messages and logging functionality.
- Implemented a cleanup trap to ensure temporary files are removed.
- Enhanced validation of input parameters and output formats.
- Added better handling of the log file and its storage.
- Improved command execution with proper evaluation.
- Added comments for better code readability and maintenance.
- Fixed potential security issues with proper variable expansion.

Signed-off-by: Václav Vančura <commit@vancura.dev>
2025-03-13 10:37:36 +01:00
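
Two of the improvements listed in commit ff7d64b421, the cleanup trap and the validation of output formats, can be sketched as follows. The function names and the list of accepted formats are assumptions made for illustration; the commit message does not say which formats the script accepts.

```shell
#!/usr/bin/env bash
# Sketch only: names and the accepted format list are assumptions.

# Scratch directory for temporary files created during a run.
TMP_DIR="$(mktemp -d)"

# Remove temporary files; registered below so it runs on any script exit.
cleanup() {
    rm -rf "$TMP_DIR"
}
trap cleanup EXIT

# Return 0 if the requested output format is on the allow-list, 1 otherwise.
# Quoting "$1" keeps values with spaces or shell metacharacters as one word.
is_valid_format() {
    case "$1" in
        md|json|html|text) return 0 ;;
        *) return 1 ;;
    esac
}
```

Checking input against a fixed allow-list before it reaches any command line is one simple way to address the word-splitting and variable-expansion concerns the commit mentions.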
Václav Vančura
dde401d134 Actor: Update Docker configuration for improved security
- Add `ACTOR_PATH_IN_DOCKER_CONTEXT` argument to suppress the Apify-tooling-related warning.
- Improve readability with consistent formatting and spacing in RUN commands.
- Enhance security by properly setting up appuser home directory and permissions.
- Streamline directory structure and ownership for runtime operations.
- Remove redundant `.apify` directory creation as it's handled by the CLI.

Signed-off-by: Václav Vančura <commit@vancura.dev>
2025-03-13 10:37:31 +01:00
Václav Vančura
b2ac6cc218 Actor: Create Apify user home directory in Docker setup
Add and configure `/home/appuser/.apify` directory with proper permissions for the appuser in the Docker container. This ensures the Apify SDK has a writable home directory for storing its configuration and temporary files.

Signed-off-by: Václav Vančura <commit@vancura.dev>
2025-03-13 10:37:26 +01:00
Václav Vančura
784571f9ce Actor: Fix apify-cli version problem
Signed-off-by: Václav Vančura <commit@vancura.dev>
2025-03-13 10:37:19 +01:00
Václav Vančura
4dce886b17 Actor: Update dependencies with fixed versions
Upgrade pip and npm to latest versions, pin docling to 2.15.1 and apify-cli to 2.7.1 for better stability and reproducibility. This change helps prevent unexpected behavior from dependency updates and ensures consistent builds across environments.

Signed-off-by: Václav Vančura <commit@vancura.dev>
2025-03-13 10:37:14 +01:00
Václav Vančura
ac7c5053f0 Actor: Add Docker image metadata labels
Signed-off-by: Václav Vančura <commit@vancura.dev>
2025-03-13 10:37:10 +01:00
Václav Vančura
e1adc4ee8f Actor: Optimize Dockerfile with security and size improvements
- Combine RUN commands to reduce image layers and overall size.
- Add non-root user `appuser` for improved security.
- Use `--no-install-recommends` flag to minimize installed packages.
- Install only necessary dependencies in a single RUN command.
- Maintain proper cleanup of package lists and caches.

Signed-off-by: Václav Vančura <commit@vancura.dev>
2025-03-13 10:37:04 +01:00
Václav Vančura
19f612c009 Actor: Enhance Docker security with proper user permissions
- Set proper ownership and permissions for runtime directory.
- Switch to non-root user for enhanced security.
- Use `--chown` flag in COPY commands to maintain correct file ownership.
- Ensure all files and directories are owned by `appuser`.

Signed-off-by: Václav Vančura <commit@vancura.dev>
2025-03-13 10:36:57 +01:00
Václav Vančura
ae491b0516 Actor: Switching Docker to python:3.11-slim-bookworm
Signed-off-by: Václav Vančura <commit@vancura.dev>
2025-03-13 10:36:54 +01:00
Václav Vančura
67e1129365 Actor: Documentation update
Signed-off-by: Václav Vančura <commit@vancura.dev>
Signed-off-by: Adam Kliment <adam@netmilk.net>
2025-03-13 10:36:46 +01:00
Václav Vančura
352301b58d Actor: .dockerignore update
Signed-off-by: Václav Vančura <commit@vancura.dev>
2025-03-13 10:36:23 +01:00
Václav Vančura
4d13bb2650 Actor: Initial implementation
Signed-off-by: Václav Vančura <commit@vancura.dev>
Signed-off-by: Adam Kliment <adam@netmilk.net>
2025-03-13 10:36:01 +01:00