Available as an add-on on paid plans.
Enabling storage on a task
Storage is configured when you create or update a task. Set storage.enabled to true to capture files. Set storage.extraction to true to also extract structured data from those files.
| Field | Type | Description |
|---|---|---|
| storage.enabled | boolean | Capture files produced during task execution. |
| storage.extraction | boolean | Parse captured files and extract structured data. |
| storage.extraction_schema | object | JSON Schema describing the fields to extract. Required when extraction is true. |
| storage.extraction_prompt | string | Optional natural-language guidance for the extraction model. |
| storage.deduplication | boolean | Skip files that match a previous capture. See Deduplication. |
| storage.deduplication_schema | object | JSON Schema describing the fields used for duplicate detection. Required when deduplication is true. |
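
The two "Required when" rules in the table can be checked on the client before a config is sent. The following is a minimal sketch, not part of Deck's API: the helper name check_storage_config is hypothetical, and it assumes the fields sit inside a storage object without the storage. prefix.

```python
from typing import Any


def check_storage_config(storage: dict[str, Any]) -> list[str]:
    """Return a list of problems found in a task's storage config (empty if none)."""
    problems = []
    # extraction_schema is required whenever extraction is true.
    if storage.get("extraction") and not storage.get("extraction_schema"):
        problems.append("extraction_schema is required when extraction is true")
    # deduplication_schema is required whenever deduplication is true.
    if storage.get("deduplication") and not storage.get("deduplication_schema"):
        problems.append("deduplication_schema is required when deduplication is true")
    return problems


# Illustrative config; the schema contents are made up for this example.
config = {
    "enabled": True,
    "extraction": True,
    "extraction_schema": {
        "type": "object",
        "properties": {"total_due": {"type": "number"}},
    },
    "extraction_prompt": "Amounts are in USD.",
    "deduplication": True,  # no deduplication_schema yet, so one problem is reported
}

print(check_storage_config(config))
```

Running a check like this locally only catches the two dependency rules; the API remains the authority on what it accepts.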
Retrieving storage items
Task run responses include a storage array of lightweight summaries. To get extracted data and a pre-signed download URL, list a task run’s storage items or fetch a single item by ID.
Storage item fields
| Field | Task run summary | List items | Get item |
|---|---|---|---|
| id | ✓ | ✓ | ✓ |
| file_name | ✓ | ✓ | ✓ |
| file_type | ✓ | ✓ | ✓ |
| file_size | ✓ | ✓ | ✓ |
| purpose | ✓ | ✓ | ✓ |
| created_at | ✓ | ✓ | ✓ |
| extraction | | ✓ | ✓ |
| url | | ✓ | ✓ |
| task_run_id | ✓ | | |

| Field | Description |
|---|---|
| id | Unique identifier, prefixed with stor_. |
| file_name | Original file name as it appeared on the source. |
| file_type | MIME type (application/pdf, image/png, video/mp4, text/csv, etc.). |
| file_size | Size in bytes. |
| purpose | output for files the agent captures during a run, attachment for files you provide as task input, or extraction for files Deck processed via direct extraction. |
| created_at | When the storage item was created. |
| extraction | Structured data extracted from the file, if extraction is enabled. |
| url | Signed download URL. |
| task_run_id | The task run that produced this storage item. |
Downloading files
Both list and get-by-id responses include a pre-signed url you can use to download the raw file. URLs are time-limited; if one expires, re-fetch the item to get a fresh one.
Providing files as input
Tasks can accept files as input. The file is uploaded to storage and malware-scanned, then either handed to the agent or extracted directly by Deck, depending on the field’s purpose:
| purpose | What Deck does | Available on |
|---|---|---|
| attachment | The agent receives the file at run time and uses it on the source. | Enterprise plans |
| extraction | Deck extracts structured JSON from the file directly. The agent is skipped. | Enterprise plans with the extraction and storage add-ons |
input_invalid error.
Defining the field
Define a file field in the input schema. The shape is the same for both purposes; only the purpose constant changes.
| Field | Description |
|---|---|
| purpose | Constant "attachment" or "extraction". Marks the field as a file input and selects how Deck handles it. |
| file_name | Original file name, e.g. resume.pdf. |
| content_type | MIME type, e.g. application/pdf. |
| data | The file contents, base64-encoded. |
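
As a rough illustration, a file field value with the four keys above could be assembled as follows. The helper name build_file_field is hypothetical, and the key under which this value sits in the task run input depends on your own input schema.

```python
import base64


def build_file_field(raw: bytes, file_name: str, content_type: str,
                     purpose: str = "attachment") -> dict:
    """Build a file field value: purpose, file_name, content_type, base64 data."""
    if purpose not in ("attachment", "extraction"):
        raise ValueError('purpose must be "attachment" or "extraction"')
    return {
        "purpose": purpose,
        "file_name": file_name,
        "content_type": content_type,
        # b64encode returns bytes; decode to str so the value is JSON-serializable.
        "data": base64.b64encode(raw).decode("ascii"),
    }


# Placeholder bytes stand in for a real file read from disk.
field = build_file_field(b"%PDF-1.7 example bytes", "resume.pdf", "application/pdf")
```

Because the shape is the same for both purposes, switching to direct extraction only means passing purpose="extraction".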
Sending a file
Provide the file inline as base64 in the task run input:
Attachments
With purpose: "attachment", the agent receives the file at run time and uses it on the source: uploading it to a portal, attaching it to a form, or referencing it while completing the task.
The run stays queued until every attachment passes the scan, then transitions to running and dispatches to the agent. If a file is flagged, the run fails with an attachment_invalid error on the task run object. Listen for task_run.failed or fetch the run to handle it.
Deck replaces the base64 data in the stored input with a storage_id reference so the raw bytes aren’t carried through the run. The file becomes a storage item with purpose: "attachment", alongside the output files the agent captures, and appears under the Input tab on the task run in the Console.
Extraction
With purpose: "extraction", Deck processes the file directly against the task’s extraction_schema and returns the structured result on the run. The agent doesn’t execute. Use this when you have a document and want JSON back, with no source interaction.
The run transitions to running as soon as the file is uploaded, and finalizes when extraction completes. It doesn’t sit in queued waiting on the scan.
The task must have storage and extraction enabled, with an extraction_schema defining the result shape:
You can still pass credential_id or source_id to link the extraction to a user or source, even though the agent doesn’t execute.
The run returns a storage item with purpose: "extraction" carrying the extracted data, matching the fields defined in extraction_schema. See Document extraction for guidance on writing extraction schemas.
Reusing a file across runs
Once a file is uploaded, you can reference it on a later run instead of sending the bytes again. Pass the storage_id in place of data, keeping the same purpose:
purpose: "extraction". Deck verifies the file belongs to your organization, then copies it for the new run so each run keeps its own input files.
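
A sketch of swapping the inline bytes for a reference on a later run. It assumes the remaining keys (file_name, content_type) are sent unchanged, which this page does not confirm; the helper name and the stor_ ID are made up for illustration.

```python
def as_storage_reference(file_field: dict, storage_id: str) -> dict:
    """Return a copy of a file field with `data` replaced by `storage_id`."""
    # Copy every key except the raw base64 payload, then add the reference.
    reference = {k: v for k, v in file_field.items() if k != "data"}
    reference["storage_id"] = storage_id
    return reference


first_run_field = {
    "purpose": "attachment",
    "file_name": "resume.pdf",
    "content_type": "application/pdf",
    "data": "JVBERi0xLjc=",  # base64 of b"%PDF-1.7"
}

# Hypothetical ID; a real one comes from the storage item created on the first run.
later_run_field = as_storage_reference(first_run_field, "stor_example123")
```

The original dictionary is left untouched, so the same helper can be reused for several later runs.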
Document extraction
When extraction is enabled, Deck parses the captured files and populates the extraction field on each storage item with structured JSON.
Extraction works with common document types including PDFs, spreadsheets, invoices, receipts, and reports. The extracted data depends on the document. A utility bill produces different fields than a hotel receipt.
Custom extraction schemas
Use the extraction_schema field on the task’s storage config to define exactly what fields you want extracted. Deck uses this schema to guide parsing.
Extraction example
A utility bill extraction might return:
Extraction errors
If extraction fails on a file, the extraction field on that storage item stays null. The raw file is still available for download.
If any file in a task run fails extraction, the task run itself completes with a failure result and an extraction_failed error in the errors array indicating how many files were affected. Successfully extracted files in the same run still return their extraction data.
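
Since a failed run can still carry usable results, it helps to separate the items by whether extraction is null. A minimal sketch, assuming extraction is enabled on the task and that items is the list of storage items already fetched for the run; the sample items are invented.

```python
def split_by_extraction(items: list[dict]) -> tuple[list[dict], list[dict]]:
    """Partition storage items into (extracted, failed) by whether extraction is null."""
    extracted = [item for item in items if item.get("extraction") is not None]
    failed = [item for item in items if item.get("extraction") is None]
    return extracted, failed


# Made-up items for illustration; real IDs and fields come from the API.
items = [
    {"id": "stor_a", "file_name": "bill.pdf", "extraction": {"total_due": 84.12}},
    {"id": "stor_b", "file_name": "scan.png", "extraction": None},
]

extracted, failed = split_by_extraction(items)
print([item["id"] for item in failed])
```

Items in the failed list still have a downloadable raw file, so they can be routed to manual review instead of being discarded.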
Deduplication
Deduplication tells Deck to skip files that match one captured by a previous run for the same task and credential, so recurring tasks only return new documents. You define a set of fields that uniquely identify a document. Deck reads those fields from each captured file and compares them against prior captures. If every field matches, the new file is dropped: it’s not stored, no storage.created event fires, and it’s not extracted even if extraction is enabled.
Configuration
Set deduplication to true and provide a deduplication_schema on the task’s storage config:
| Field | Type | Description |
|---|---|---|
| deduplication | boolean | Turn deduplication on for this task. |
| deduplication_schema | object | JSON Schema with a properties map of field_name → { type, description }. Required when deduplication is true. |
Each property needs a type, one of string, integer, number, or boolean. Nested objects and arrays aren’t supported, so pick top-level scalar fields.
Property names are arbitrary; you make them up, and they’re just keys for the result. The description is what tells Deck where to find the value on each document, so describe each field precisely. For example, "Account number printed at the top of the bill" works better than a vague "account".
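
The type restriction and the description guidance above can be linted locally before a task is saved. This is an illustrative client-side sketch, not Deck's validator: the helper name is hypothetical, and treating a missing description as a problem is a choice made here, not a documented rule.

```python
ALLOWED_TYPES = {"string", "integer", "number", "boolean"}


def check_dedup_schema(schema: dict) -> list[str]:
    """Return a list of problems found in a deduplication_schema (empty if none)."""
    problems = []
    properties = schema.get("properties") or {}
    if not properties:
        problems.append("deduplication_schema needs at least one property")
    for name, spec in properties.items():
        # Only top-level scalar types are supported; objects and arrays are not.
        if spec.get("type") not in ALLOWED_TYPES:
            problems.append(f"{name}: type must be one of {sorted(ALLOWED_TYPES)}")
        # The description tells Deck where to find the value on the document.
        if not spec.get("description"):
            problems.append(f"{name}: add a description saying where the value appears")
    return problems


schema = {
    "type": "object",
    "properties": {
        "account_number": {
            "type": "string",
            "description": "Account number printed at the top of the bill",
        },
        "line_items": {"type": "array", "description": "Charges listed on the bill"},
    },
}

print(check_dedup_schema(schema))
```

Here only line_items is reported, because arrays are not an accepted type for a dedup field.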
Choosing fields
The fields you list together form the dedup key. Two files match only if every field is identical. A few rules of thumb:
- Pick fields that stay stable for the same logical document. A monthly bill should have the same account number and billing period every time it’s fetched.
- Avoid volatile fields. File names, fetch dates, and page numbers will produce false negatives, since the same document looks new every time.
- Pick enough fields to be unique. A single field like vendor_name will collide across unrelated invoices from the same vendor.
- Two or three fields is usually right.
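
The "every field must match" rule, and why volatile fields cause false negatives, can be illustrated with a small comparison function. This models the matching semantics described above only; it is not Deck's implementation, and the field values are invented.

```python
def is_duplicate(candidate: dict, prior_captures: list[dict], key_fields: list[str]) -> bool:
    """True if some prior capture has identical values for every key field."""
    return any(
        all(prior.get(field) == candidate.get(field) for field in key_fields)
        for prior in prior_captures
    )


prior = [{"account_number": "12345", "billing_period": "2024-05", "file_name": "bill (1).pdf"}]
refetched = {"account_number": "12345", "billing_period": "2024-05", "file_name": "bill (2).pdf"}

# Stable fields: the re-fetched bill is recognized as a duplicate.
stable = is_duplicate(refetched, prior, ["account_number", "billing_period"])

# Adding a volatile field (file_name) makes the same bill look new.
volatile = is_duplicate(refetched, prior, ["account_number", "billing_period", "file_name"])

print(stable, volatile)
```

The second call returns False even though it is the same bill, which is the false negative the rules of thumb warn about.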
Field combinations by document type
- Utility bills
- Invoices
- Receipts
- Bank and credit-card statements
Account plus billing period:
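
For the utility-bill case, a storage config along these lines would express "account plus billing period". The property names and the billing_period description are illustrative choices, not values taken from Deck's documentation.

```python
import json

utility_bill_storage = {
    "enabled": True,
    "deduplication": True,
    "deduplication_schema": {
        "type": "object",
        "properties": {
            "account_number": {
                "type": "string",
                "description": "Account number printed at the top of the bill",
            },
            "billing_period": {
                "type": "string",
                "description": "Billing period covered by the bill, as printed on the bill",
            },
        },
    },
}

# Serialize to JSON to see the shape that would go on the task's storage config.
print(json.dumps(utility_bill_storage, indent=2))
```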
Errors
deduplication_schema must be present with at least one property whenever deduplication is true. Starting a task run without it returns:
To run the task without duplicate detection, set deduplication to false (or omit it entirely).
Events
Storage items emit events you can subscribe to through event destinations:
| Event | When it fires |
|---|---|
| storage.created | A new file has been captured and is ready for download |