mirror of
https://github.com/DS4SD/docling.git
synced 2025-08-01 23:12:20 +00:00
Actor: README update
Signed-off-by: Václav Vančura <commit@vancura.dev>
parent e261111daa
commit 1b6d4b5c50
@@ -50,7 +50,7 @@ This Actor wraps the [Docling project](https://ds4sd.github.io/docling/) to prov

  ```bash
  curl --request POST \
- --url https://api.apify.com/v2/acts/vancura~docling/runs \
+ --url "https://api.apify.com/v2/acts/username~actorname/run" \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer YOUR_API_TOKEN' \
  --data '{
@@ -63,8 +63,7 @@ curl --request POST \
  ### Using Apify CLI

  ```bash
- # Example CLI usage
- apify call vancura/docling --input='{
+ apify call username/actorname --input='{
  "documentUrl": "https://example.com/file.pdf",
  "outputFormat": "md",
  "ocr": true
@@ -75,11 +74,11 @@ apify call vancura/docling --input='{

  The Actor accepts a JSON schema matching the file `.actor/input_schema.json`. Below is a summary of the fields:

  | Field | Type | Required | Default | Description |
- |---------------|---------|----------|----------|-----------------------------------------------------------------------------------------------------------|
- | documentUrl | string | Yes | None | URL of the document (PDF, image, DOCX, etc.) to be processed. Must be directly accessible via public URL. |
- | outputFormat | string | No | "md" | Desired output format. One of `md`, `json`, `html`, `text`, or `doctags`. |
- | ocr | boolean | No | true | If set to true, OCR will be applied to scanned PDFs or images for text recognition. |
+ |----------------|---------|----------|----------|-----------------------------------------------------------------------------------------------------------|
+ | `documentUrl` | string | Yes | None | URL of the document (PDF, image, DOCX, etc.) to be processed. Must be directly accessible via public URL. |
+ | `outputFormat` | string | No | `md` | Desired output format. One of `md`, `json`, `html`, `text`, or `doctags`. |
+ | `ocr` | boolean | No | `true` | If set to true, OCR will be applied to scanned PDFs or images for text recognition. |

  ### Example Input

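The input fields summarized in the table touched by this commit (`documentUrl`, `outputFormat`, `ocr`) can be checked on the client side before the Actor is called. The following is a minimal sketch under stated assumptions: `build_actor_input` and `ALLOWED_FORMATS` are hypothetical names introduced for illustration, they are not part of Docling, the Actor, or the Apify tooling, and the authoritative definition remains `.actor/input_schema.json`.

```python
import json

# Output formats listed in the README table; hypothetical constant name.
ALLOWED_FORMATS = ("md", "json", "html", "text", "doctags")


def build_actor_input(document_url, output_format="md", ocr=True):
    """Build an input dict with the three fields from the README table.

    Hypothetical helper: defaults mirror the table ("md" and true),
    and the checks are a client-side approximation of the schema.
    """
    if not isinstance(document_url, str) or not document_url:
        raise ValueError("documentUrl is required and must be a non-empty string")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"outputFormat must be one of {ALLOWED_FORMATS}")
    if not isinstance(ocr, bool):
        raise TypeError("ocr must be a boolean")
    return {
        "documentUrl": document_url,
        "outputFormat": output_format,
        "ocr": ocr,
    }


if __name__ == "__main__":
    # Serialize the payload so it can be pasted into a --data or --input argument.
    payload = build_actor_input("https://example.com/file.pdf")
    print(json.dumps(payload, indent=2))
```

The resulting JSON string could be passed as the body of the `curl` request or as the `--input` value of `apify call` shown in the diff; rejecting an unknown `outputFormat` locally avoids starting a run that would fail on input validation.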