Available as an add-on on paid plans.
Tasks can produce files like receipts, invoices, statements, reports, images, videos, and spreadsheets. When storage is enabled on a task, Deck captures the files the agent is instructed to collect and makes them available through the API. If extraction is also enabled, Deck parses supported files and returns structured JSON alongside the raw file.

Enabling storage on a task

Storage is configured when you create or update a task. Set storage.enabled to true to capture files. Set storage.extraction to true to also extract structured data from those files.
POST /v2/tasks

{
  "name": "Fetch utility bills",
  "agent_id": "agt_a1b2c3d4...",
  "input_schema": {
    "type": "object",
    "properties": {
      "start_date": { "type": "string" },
      "end_date": { "type": "string" }
    }
  },
  "output_schema": {
    "type": "object",
    "properties": {
      "bill_count": { "type": "integer" }
    }
  },
  "storage": {
    "enabled": true,
    "extraction": true
  }
}
Field | Type | Description
storage.enabled | boolean | Capture files produced during task execution
storage.extraction | boolean | Parse captured files and extract structured data
storage.extraction_schema | object | JSON Schema describing the fields to extract. Required when extraction is true.
storage.extraction_prompt | string | Optional natural-language guidance for the extraction model.
storage.deduplication | boolean | Enable deduplication to skip files that match a previous capture. See Deduplication.
storage.deduplication_schema | object | JSON Schema describing the fields used for duplicate detection. Required when deduplication is true.
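The two "Required when" rules in the table can be checked on the client before a request is sent. The sketch below is one way to do that; build_storage_config is a hypothetical helper, not part of any Deck SDK, and it only assembles the storage object using the field names listed above.

```python
from typing import Any, Optional


def build_storage_config(
    *,
    extraction: bool = False,
    extraction_schema: Optional[dict[str, Any]] = None,
    extraction_prompt: Optional[str] = None,
    deduplication: bool = False,
    deduplication_schema: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Assemble the `storage` object for a task (hypothetical helper)."""
    # Mirror the table: each schema is required when its flag is true.
    if extraction and extraction_schema is None:
        raise ValueError("extraction_schema is required when extraction is true")
    if deduplication and deduplication_schema is None:
        raise ValueError("deduplication_schema is required when deduplication is true")

    config: dict[str, Any] = {"enabled": True, "extraction": extraction}
    if extraction:
        config["extraction_schema"] = extraction_schema
        if extraction_prompt:
            config["extraction_prompt"] = extraction_prompt
    if deduplication:
        config["deduplication"] = True
        config["deduplication_schema"] = deduplication_schema
    return config
```

The returned dictionary can be placed under the "storage" key of the task body before it is serialized to JSON.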

Retrieving storage items

Task run responses include a storage array of lightweight summaries. To get extracted data and a pre-signed download URL, list a task run’s storage items or fetch a single item by ID.
curl https://api.deck.co/v2/task-runs/trun_a1b2c3d4/storage \
  -H "Authorization: Bearer sk_live_your_key_here"
{
  "data": [
    {
      "id": "stor_x1y2z3...",
      "object": "storage",
      "file_name": "statement_jan_2025.pdf",
      "file_type": "application/pdf",
      "file_size": 245678,
      "url": "https://files.deck.co/stor_x1y2z3...?signature=...",
      "extraction": null,
      "created_at": "2025-01-23T14:30:00Z"
    },
    {
      "id": "stor_a4b5c6...",
      "object": "storage",
      "file_name": "statement_dec_2024.pdf",
      "file_type": "application/pdf",
      "file_size": 198432,
      "url": "https://files.deck.co/stor_a4b5c6...?signature=...",
      "extraction": {
        "company_name": "EnergyLink",
        "account_number": "58291-44720",
        "billing_date": "2024-12-22",
        "amount_due": 6925.18,
        "currency": "USD"
      },
      "created_at": "2025-01-22T09:15:00Z"
    }
  ],
  "has_more": false,
  "next_cursor": null,
  "request_id": "req_f5g6h7..."
}
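The has_more and next_cursor fields in the response suggest cursor-based pagination. The following is a minimal sketch of walking every page, assuming the next cursor is passed back as a query parameter named cursor (that parameter name is an assumption, as are the helper names). The page fetcher is injected so the loop can be exercised without network access.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Callable, Iterator, Optional

API_BASE = "https://api.deck.co/v2"


def fetch_storage_page(task_run_id: str, api_key: str, cursor: Optional[str] = None) -> dict[str, Any]:
    """GET one page of storage items. The `cursor` parameter name is an assumption."""
    url = f"{API_BASE}/task-runs/{task_run_id}/storage"
    if cursor:
        url += "?" + urllib.parse.urlencode({"cursor": cursor})
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_storage_items(
    fetch_page: Callable[[Optional[str]], dict[str, Any]],
) -> Iterator[dict[str, Any]]:
    """Yield every storage item, following next_cursor while has_more is true."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page["data"]
        # Stop when the API reports no further pages or gives no cursor to follow.
        if not page.get("has_more") or not page.get("next_cursor"):
            return
        cursor = page["next_cursor"]
```

A caller would typically bind the run ID and key with a lambda, for example iter_storage_items(lambda c: fetch_storage_page("trun_a1b2c3d4", api_key, c)).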

Storage item fields

Field | Task run summary | List items | Get item
id
file_name
file_type
file_size
purpose
created_at
extraction
url
task_run_id
Field | Type | Description
id | string | Unique identifier, prefixed with stor_.
file_name | string | Original file name as it appeared on the source.
file_type | string | MIME type (application/pdf, image/png, video/mp4, text/csv, etc.).
file_size | integer | Size in bytes.
purpose | string | output for files the agent captures during a run, attachment for files you provide as task input, or extraction for files Deck processed via direct extraction.
created_at | datetime | When the storage item was created.
extraction | object or null | Structured data extracted from the file, if extraction is enabled.
url | string | Signed download URL.
task_run_id | string | The task run that produced this storage item.

Downloading files

Both list and get-by-id responses include a pre-signed url you can use to download the raw file. URLs are time-limited; if one expires, re-fetch the item to get a fresh one.
curl https://api.deck.co/v2/storage/stor_x1y2z3 \
  -H "Authorization: Bearer sk_live_your_key_here"
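Because the signed URL can expire between fetching the item and downloading it, a client may want to retry once with a fresh URL. The sketch below illustrates that pattern under one assumption: an expired URL surfaces as an HTTP error on download (the exact status code is not stated here). The function names are hypothetical, and both the item fetcher and the byte fetcher are injected.

```python
import urllib.error
import urllib.request
from typing import Any, Callable


def http_get(url: str) -> bytes:
    """Download the raw bytes behind a signed URL."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def download_with_refresh(
    get_item: Callable[[], dict[str, Any]],
    fetch_bytes: Callable[[str], bytes] = http_get,
    attempts: int = 2,
) -> bytes:
    """Download item["url"]; on an HTTP error, re-fetch the item for a fresh URL."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Exception = RuntimeError("download was not attempted")
    for _ in range(attempts):
        item = get_item()  # each call should return the item with a current URL
        try:
            return fetch_bytes(item["url"])
        except urllib.error.HTTPError as error:
            # Assumption: an expired signature shows up as an HTTP error.
            last_error = error
    raise last_error
```

Here get_item would wrap the GET /v2/storage/{id} call shown above and return the parsed JSON body.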

Providing files as input

Tasks can accept files as input. The file is uploaded to storage and malware-scanned, then either handed to the agent or extracted directly by Deck, depending on the field’s purpose:
purpose | What Deck does | Available on
attachment | The agent receives the file at run time and uses it on the source. | Enterprise plans
extraction | Deck extracts structured JSON from the file directly. The agent is skipped. | Enterprise plans with the extraction and storage add-ons
Both purposes share the same field shape and upload behavior, and differ in what happens after the upload. A single task can declare attachment file inputs or extraction file inputs, not both. Creating a task whose input schema mixes both is rejected with an input_invalid error.
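The no-mixing rule can be caught before the task is created. This is a rough client-side check, assuming file fields sit at the top level of the input schema and declare their purpose as a const; the function names are hypothetical.

```python
from typing import Any


def file_input_purposes(input_schema: dict[str, Any]) -> set[str]:
    """Collect the purpose constants declared by top-level file fields."""
    purposes: set[str] = set()
    for field in input_schema.get("properties", {}).values():
        const = field.get("properties", {}).get("purpose", {}).get("const")
        if const in ("attachment", "extraction"):
            purposes.add(const)
    return purposes


def check_single_purpose(input_schema: dict[str, Any]) -> None:
    """Raise if the schema declares both attachment and extraction inputs."""
    if len(file_input_purposes(input_schema)) > 1:
        raise ValueError("input schema mixes attachment and extraction file inputs")
```

Running the check locally avoids a round trip that would end in input_invalid.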

Defining the field

Define a file field in the input schema. The shape is the same for both purposes; only the purpose constant changes.
"resume": {
  "type": "object",
  "properties": {
    "purpose": { "const": "attachment" },
    "file_name": { "type": "string" },
    "content_type": { "type": "string" },
    "data": { "type": "string", "contentEncoding": "base64" }
  }
}
Field | Description
purpose | Constant "attachment" or "extraction". Marks the field as a file input and selects how Deck handles it.
file_name | Original file name, e.g. resume.pdf.
content_type | MIME type, e.g. application/pdf.
data | The file contents, base64-encoded.

Sending a file

Provide the file inline as base64 in the task run input:
POST /v2/tasks/task_a1b2c3d4.../run

{
  "credential_id": "cred_a1b2c3d4...",
  "input": {
    "applicant_name": "Jordan Lee",
    "resume": {
      "purpose": "attachment",
      "file_name": "resume.pdf",
      "content_type": "application/pdf",
      "data": "JVBERi0xLjQKJ..."
    }
  }
}
Each file can be up to 20 MB. Larger files, or invalid base64, are rejected synchronously with a validation error and the run isn’t created. After upload, every file is malware-scanned asynchronously. The rest of the run lifecycle depends on the purpose.
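Encoding the file and checking its size locally avoids a rejected request. The helper below is a minimal sketch: build_file_input is a hypothetical name, and it assumes the 20 MB limit applies to the raw bytes and means 20 × 1024 × 1024 bytes, which is not stated above.

```python
import base64
from typing import Any

# Assumption: "20 MB" is measured on the raw (pre-base64) bytes, as 20 MiB.
MAX_FILE_BYTES = 20 * 1024 * 1024


def build_file_input(purpose: str, file_name: str, content_type: str, content: bytes) -> dict[str, Any]:
    """Build the file object for a task run input from raw bytes."""
    if purpose not in ("attachment", "extraction"):
        raise ValueError("purpose must be 'attachment' or 'extraction'")
    if len(content) > MAX_FILE_BYTES:
        raise ValueError(f"{file_name} is {len(content)} bytes; the limit is {MAX_FILE_BYTES}")
    return {
        "purpose": purpose,
        "file_name": file_name,
        "content_type": content_type,
        # Standard base64, decoded to str so the result is JSON-serializable.
        "data": base64.b64encode(content).decode("ascii"),
    }
```

The result can be assigned directly to the file field in the run's input, for example input["resume"] = build_file_input("attachment", "resume.pdf", "application/pdf", pdf_bytes).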

Attachments

With purpose: "attachment", the agent receives the file at run time and uses it on the source: uploading it to a portal, attaching it to a form, or referencing it while completing the task.
The run stays queued until every attachment passes the scan, then transitions to running and dispatches to the agent. If a file is flagged, the run fails with an attachment_invalid error on the task run object. Listen for task_run.failed or fetch the run to handle it.
Deck replaces the base64 data in the stored input with a storage_id reference so the raw bytes aren’t carried through the run. The file becomes a storage item with purpose: "attachment", alongside the output files the agent captures, and appears under the Input tab on the task run in the Console.

Extraction

With purpose: "extraction", Deck processes the file directly against the task’s extraction_schema and returns the structured result on the run. The agent doesn’t execute. Use this when you have a document and want JSON back, with no source interaction.
The run transitions to running as soon as the file is uploaded, and finalizes when extraction completes. It doesn’t sit in queued waiting on the scan.
The task must have storage and extraction enabled, with an extraction_schema defining the result shape:
POST /v2/tasks

{
  "name": "Extract utility bill",
  "agent_id": "agt_a1b2c3d4...",
  "input_schema": {
    "type": "object",
    "properties": {
      "bill": {
        "type": "object",
        "properties": {
          "purpose": { "const": "extraction" },
          "file_name": { "type": "string" },
          "content_type": { "type": "string" },
          "data": { "type": "string", "contentEncoding": "base64" }
        }
      }
    },
    "required": ["bill"]
  },
  "storage": {
    "enabled": true,
    "extraction": true,
    "extraction_schema": {
      "type": "object",
      "properties": {
        "vendor_name": { "type": "string" },
        "total_amount": { "type": "number" },
        "invoice_date": { "type": "string", "format": "date" }
      }
    }
  }
}
To run it, send the file in the extraction field. The run still needs a credential_id or source_id to link the extraction to a user or source, even though the agent doesn’t execute.
POST /v2/tasks/task_a1b2c3d4.../run

{
  "credential_id": "cred_a1b2c3d4...",
  "input": {
    "bill": {
      "purpose": "extraction",
      "file_name": "january-bill.pdf",
      "content_type": "application/pdf",
      "data": "JVBERi0xLjQKJ..."
    }
  }
}
A task accepts one extraction input per run. The file becomes a storage item with purpose: "extraction" carrying the extracted data, matching the fields defined in extraction_schema. See Document extraction for guidance on writing extraction schemas.

Reusing a file across runs

Once a file is uploaded, you can reference it on a later run instead of sending the bytes again. Pass the storage_id in place of data, keeping the same purpose:
"resume": {
  "purpose": "attachment",
  "storage_id": "stor_x1y2z3..."
}
The same shape works with purpose: "extraction". Deck verifies the file belongs to your organization, then copies it for the new run so each run keeps its own input files.
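When a later run reuses files from an earlier one, the input can be rewritten mechanically. The sketch below assumes you already know each file's storage_id (for example from the earlier run's stored input); reuse_uploaded_files is a hypothetical helper.

```python
from typing import Any


def reuse_uploaded_files(run_input: dict[str, Any], storage_ids: dict[str, str]) -> dict[str, Any]:
    """Return a copy of run_input with the named file fields replaced by storage_id references."""
    reused = dict(run_input)  # shallow copy; the original input is left untouched
    for field_name, storage_id in storage_ids.items():
        original = run_input[field_name]
        # Keep the same purpose and drop file_name, content_type, and data.
        reused[field_name] = {"purpose": original["purpose"], "storage_id": storage_id}
    return reused
```

Non-file fields such as applicant_name pass through unchanged, so the result can be sent as the input of the next run.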

Document extraction

When extraction is enabled, Deck parses the captured files and populates the extraction field on each storage item with structured JSON. Extraction works with common document types including PDFs, spreadsheets, invoices, receipts, and reports. The extracted data depends on the document. A utility bill produces different fields than a hotel receipt.

Custom extraction schemas

Use the extraction_schema field on the task’s storage config to define exactly what fields you want extracted. Deck uses this schema to guide parsing.
{
  "storage": {
    "enabled": true,
    "extraction": true,
    "extraction_schema": {
      "type": "object",
      "properties": {
        "vendor_name": { "type": "string" },
        "total_amount": { "type": "number" },
        "invoice_date": { "type": "string", "format": "date" },
        "line_items": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "description": { "type": "string" },
              "amount": { "type": "number" }
            }
          }
        }
      }
    }
  }
}

Extraction example

A utility bill extraction might return:
{
  "company_name": "EnergyLink",
  "account_number": "58291-44720",
  "billing_date": "2025-01-22",
  "billing_period": {
    "start_date": "2024-12-18",
    "end_date": "2025-01-20",
    "total_days": 33
  },
  "amount_due": 8247.41,
  "payment_due_date": "2025-02-14",
  "currency": "USD",
  "service_locations": [
    {
      "service_type": "Fuel",
      "service_address": {
        "street": "4421 OAK VIEW LN UNIT 3A",
        "city": "CAMBRIDGE",
        "state": "MA",
        "postal_code": "02140"
      },
      "total_usage": 5348,
      "total_usage_unit": "Therms",
      "total_charges": 8214.67
    }
  ]
}

Extraction errors

If extraction fails on a file, the extraction field on that storage item stays null. The raw file is still available for download. If any file in a task run fails extraction, the task run itself completes with a failure result and an extraction_failed error in the errors array indicating how many files were affected. Successfully extracted files in the same run still return their extraction data.
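Since a failed run can still carry usable results, a client will usually want to separate the two groups. The snippet below is a small sketch that splits a list of storage items by whether extraction is null; the function name is hypothetical, and it assumes extraction was enabled for the task, because a null value otherwise just means nothing was extracted.

```python
from typing import Any


def split_by_extraction(
    items: list[dict[str, Any]],
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Return (extracted, missing) lists of storage items."""
    extracted = [item for item in items if item.get("extraction") is not None]
    missing = [item for item in items if item.get("extraction") is None]
    return extracted, missing
```

Items in the second list still have a download url, so they can be queued for manual review or another attempt.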

Deduplication

Deduplication tells Deck to skip files that match one captured by a previous run for the same task and credential, so recurring tasks only return new documents. You define a set of fields that uniquely identify a document. Deck reads those fields from each captured file and compares them against prior captures. If every field matches, the new file is dropped: it’s not stored, no storage.created event fires, and it’s not extracted even if extraction is enabled.

Configuration

Set deduplication to true and provide a deduplication_schema on the task’s storage config:
{
  "storage": {
    "enabled": true,
    "deduplication": true,
    "deduplication_schema": {
      "type": "object",
      "properties": {
        "account_number": {
          "type": "string",
          "description": "The utility account number"
        },
        "billing_period_start": {
          "type": "string",
          "description": "Start date of the billing period (YYYY-MM-DD)"
        }
      }
    }
  }
}
Field | Type | Description
deduplication | boolean | Turn deduplication on for this task
deduplication_schema | object | JSON Schema with a properties map of field_name → { type, description }. Required when deduplication is true.
Each property must declare a type, one of string, integer, number, or boolean. Nested objects and arrays aren’t supported, so pick top-level scalar fields. Property names are arbitrary; you make them up, and they’re just keys for the result. The description is what tells Deck where to find the value on each document, so describe each field precisely. For example, "Account number printed at the top of the bill" works better than a vague "account".
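These constraints are simple enough to lint locally. The following is a rough sketch of such a check; validate_deduplication_schema is a hypothetical helper, and treating a missing description as a problem is a choice made here based on the guidance above, not a documented server rule.

```python
from typing import Any

ALLOWED_TYPES = {"string", "integer", "number", "boolean"}


def validate_deduplication_schema(schema: dict[str, Any]) -> list[str]:
    """Return a list of problems found in a deduplication_schema (empty if none)."""
    problems: list[str] = []
    properties = schema.get("properties") or {}
    if not properties:
        problems.append("at least one property is required")
    for name, spec in properties.items():
        # Only top-level scalar types are supported.
        if spec.get("type") not in ALLOWED_TYPES:
            problems.append(f"{name}: type must be one of {sorted(ALLOWED_TYPES)}")
        # The description is how the value is located on the document.
        if not spec.get("description"):
            problems.append(f"{name}: add a description of where the value appears")
    return problems
```

An empty list means the schema satisfies the rules described above; anything else can be shown to the author before the task is saved.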

Choosing fields

The fields you list together form the dedup key. Two files match only if every field is identical. A few rules of thumb:
  • Pick fields that stay stable for the same logical document. A monthly bill should have the same account number and billing period every time it’s fetched.
  • Avoid volatile fields. File names, fetch dates, and page numbers will produce false negatives, since the same document looks new every time.
  • Pick enough fields to be unique. A single field like vendor_name will collide across unrelated invoices from the same vendor.
  • Two or three fields is usually right.
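The matching rule can be pictured as comparing tuples of field values. The code below is a local illustration of that rule only, not Deck's implementation; the function names and the sample values are made up for the example.

```python
from typing import Any


def dedup_key(fields: dict[str, Any], schema: dict[str, Any]) -> tuple:
    """Build the key from the schema's property names, in sorted order."""
    return tuple(fields.get(name) for name in sorted(schema["properties"]))


def is_duplicate(
    candidate: dict[str, Any],
    previous: list[dict[str, Any]],
    schema: dict[str, Any],
) -> bool:
    """A candidate is a duplicate only if every key field equals a prior capture's."""
    key = dedup_key(candidate, schema)
    return any(dedup_key(prior, schema) == key for prior in previous)
```

With a key of account_number plus billing_period_start, a bill for the same account but a later period produces a different tuple and is kept, while a re-fetched copy of the same bill is dropped regardless of its file name.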

Field combinations by document type

Account plus billing period:
"deduplication_schema": {
  "type": "object",
  "properties": {
    "account_number": { "type": "string", "description": "Utility account number" },
    "billing_period_start": { "type": "string", "description": "Billing period start date (YYYY-MM-DD)" }
  }
}

Errors

deduplication_schema must be present with at least one property whenever deduplication is true. Starting a task run without it returns:
422 Unprocessable Entity
deduplication_schema must be defined before running a task with deduplication enabled.
To disable deduplication, set deduplication to false (or omit it entirely).

Events

Storage items emit events you can subscribe to through event destinations:
Event | When it fires
storage.created | A new file has been captured and is ready for download

Retention

Retention period varies by plan. All files are deleted after 90 days.