Every generation endpoint (/excel/generate, /word/generate, /slides/generate, /slides/outline, and the /text/* endpoints) can execute in two modes.

Sync (default)

The HTTP call blocks until the document is ready, up to a 300-second (5 min) wall-clock ceiling. If the agent is still running at 300s, the request is automatically promoted to async and returns 202 with a task_id.
POST /excel/generate
{ "prompt": "Create a sales report" }

200 OK (typical: 90s-3min for most agent-driven generations)
{
  "run_id": "run_...",
  "task_id": "task_...",
  "status": "completed",
  "download_url": "...",
  "edit_url": "...",
  "summary": "...",
  "credits_used": 84,
  "duration_ms": 187707
}
These are agent runs, not simple templating. Even a “small” prompt typically takes 90s-3min because the agent plans, writes, verifies, and often refines across multiple turns. A single-sheet Excel or 1–2-page Word doc usually lands inside the 300s sync window; anything beyond that — multi-sheet financial models, 15+ slide decks with charts, long-form reports — should be submitted async from the start.
Use sync when:
  • You’re prototyping and don’t want to build polling yet.
  • The prompt is simple enough that you’re confident it’ll land under 300s (single sheet, short memo, small deck).
  • Your caller (browser, Zapier, etc.) can hold the connection open.
Sync will auto-convert to async when:
  • The job runs past 300s — common for complex multi-sheet models (5-25 min), long decks, long-form reports, or multi-turn refinement workflows.
  • Our server signals an intermediate delay (e.g. pool saturation during peak load).
If a sync call upgrades to async, the response is:
202 Accepted
{
  "run_id": "run_...",
  "task_id": "task_...",
  "status": "queued",
  "message": "Sync ceiling (300s) exceeded; continuing asynchronously."
}
You then poll /tasks/{task_id} exactly as you would for an explicit async call.
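Because a sync call can come back either as 200 with the finished document or as 202 after promotion, the caller has to branch on the status code. The following is a minimal sketch, assuming the response bodies look like the examples above; `next_step` is a hypothetical helper name, not part of the API.

```python
def next_step(status_code: int, body: dict) -> tuple[str, str]:
    """Decide what to do with a generation response.

    Returns ("download", url) when the document is ready, or
    ("poll", task_id) when the work is continuing asynchronously.
    """
    if status_code == 200 and body.get("status") == "completed":
        return ("download", body["download_url"])
    if status_code == 202:
        # Promoted from sync (or submitted async): poll with the task_id.
        return ("poll", body["task_id"])
    raise RuntimeError(f"unexpected response: {status_code} {body!r}")


print(next_step(202, {"task_id": "task_123", "status": "queued"}))
# prints ('poll', 'task_123')
```

Treating every 202 the same way means the promoted path and the explicit async path share one code path on the client.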

Async

Set async: true in the request body. The call returns immediately with a task_id:
POST /excel/generate
{
  "prompt": "Build a 3-statement financial model",
  "async": true
}

202 Accepted
{
  "run_id": "run_...",
  "task_id": "task_...",
  "status": "queued"
}
Use async when:
  • You know the job will be large (full decks, multi-sheet financial models).
  • Your caller is a background job worker.
  • You want a webhook callback instead of polling.
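A small sketch of building the request body from Python. The field names `prompt`, `async`, and `webhook_url` come from the examples in this page; `build_generate_body` is a hypothetical helper. One practical detail: `async` is a reserved word in Python, so it has to be set as a dictionary key rather than passed as a keyword argument.

```python
import json


def build_generate_body(prompt, run_async=False, webhook_url=None):
    """Build the JSON body for a generation request."""
    body = {"prompt": prompt}
    if run_async:
        # dict(prompt=..., async=True) would be a SyntaxError,
        # so the key is assigned explicitly.
        body["async"] = True
    if webhook_url is not None:
        body["webhook_url"] = webhook_url
    return body


print(json.dumps(build_generate_body("Build a 3-statement financial model", run_async=True)))
# prints {"prompt": "Build a 3-statement financial model", "async": true}
```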

Task states

queued → running → completed | failed | cancelled | expired
Status     Terminal  Meaning
queued     no        Accepted, waiting for a worker
running    no        Executing (see phase for sub-state)
completed  yes       Success; result populated
failed     yes       Unrecoverable error; error populated
cancelled  yes       Client cancelled or server shutdown mid-run
expired    yes       Queued >24h without pickup
While running, the phase field describes sub-steps:
preflight → acquiring_pool → loading_doc → (restoring_state) →
  agent_turn → executing_plan → [compacting_context] →
  saving_doc → uploading_result → dispatching_webhook
progress (integer 0-100) approximates overall completion. Use it for progress bars; don’t base application logic on it.
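One way to keep that separation in client code is to drive control flow from status alone and use phase and progress only for display. A minimal sketch, assuming task objects shaped like the polling examples in this page; the helper names are hypothetical.

```python
TERMINAL_STATES = {"completed", "failed", "cancelled", "expired"}


def is_terminal(task: dict) -> bool:
    """Control flow depends only on status, never on phase or progress."""
    return task["status"] in TERMINAL_STATES


def progress_line(task: dict) -> str:
    """Format a one-line status string for display purposes only."""
    parts = [task["status"]]
    phase = task.get("phase")
    if task["status"] == "running" and phase:
        parts.append(f"phase={phase}")
    progress = task.get("progress")
    if progress is not None:
        parts.append(f"{progress}%")
    return " ".join(parts)


print(progress_line({"status": "running", "phase": "agent_turn", "progress": 45}))
# prints running phase=agent_turn 45%
```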

Polling

GET /tasks/{task_id}
Sample progression:
{
  "task_id": "task_...",
  "run_id": "run_...",
  "status": "running",
  "phase": "agent_turn",
  "turn_index": 3,
  "progress": 45
}
Then later:
{
  "task_id": "task_...",
  "status": "completed",
  "result": {
    "download_url": "...",
    "edit_url": "...",
    "summary": "Built a 3-statement model with FY26-FY30 projections.",
    "artifacts": []
  },
  "credits_used": 127,
  "duration_ms": 54321
}
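Once a task is terminal, the status decides which field to read: result for completed, error for failed. A hedged sketch follows; the shape of the error field is not shown in this page, so it is treated as opaque here, and `result_or_raise` is a hypothetical helper name.

```python
def result_or_raise(task: dict) -> dict:
    """Return the result of a completed task, or raise for any other state."""
    status = task["status"]
    if status == "completed":
        return task["result"]
    if status == "failed":
        # The error payload is passed through as-is (its shape is an unknown here).
        raise RuntimeError(f"task failed: {task.get('error')!r}")
    if status in {"cancelled", "expired"}:
        raise RuntimeError(f"task ended without a result: {status}")
    raise ValueError(f"task is not terminal yet: {status}")


done = {"status": "completed", "result": {"download_url": "...", "artifacts": []}}
print(result_or_raise(done)["download_url"])
```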

Polling strategy

  • Initial wait: 2–5 seconds. Most tasks transition from queued to running within a second, but the actual work is the long part.
  • Backoff: linear or exponential, cap at 10s between polls. Polling more aggressively than that doesn’t make the agent finish faster and burns your rate-limit budget.
  • Overall timeout: plan for up to 30 minutes for heavy jobs. Financial models, long-form reports with deep research, and decks with many generated charts regularly run 10–25 minutes end-to-end. A task that’s still non-terminal at 30 minutes is worth a support ticket; before that, it’s almost certainly just legitimate work.
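import time

import requests

# API, headers, and task_id are assumed to be defined by your own setup.
# ~30 min max, 10s cap between polls
for attempt in range(200):
    resp = requests.get(f"{API}/tasks/{task_id}", headers=headers, timeout=30)
    resp.raise_for_status()
    task = resp.json()
    if task["status"] in {"completed", "failed", "cancelled", "expired"}:
        break
    time.sleep(min(2 + attempt * 0.5, 10))
else:
    raise TimeoutError(f"task {task_id} is still not terminal")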
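To check that this schedule matches the 30-minute budget, the sleeps can be summed without making any requests. The sketch below reproduces the same formula (start at 2s, add 0.5s per attempt, cap at 10s); it ignores request latency, so real elapsed time will be slightly longer.

```python
def sleep_schedule(attempts=200, base=2.0, step=0.5, cap=10.0):
    """Return the sleep (in seconds) that follows each poll attempt."""
    return [min(base + attempt * step, cap) for attempt in range(attempts)]


schedule = sleep_schedule()
total_seconds = sum(schedule)

print(schedule.index(10.0))            # first attempt that hits the cap: 16
print(total_seconds)                   # 1932.0
print(round(total_seconds / 60, 1))    # 32.2 (minutes)
```

The ramp from 2s to 10s takes only 17 attempts (102 seconds in total), so almost all of the budget is spent at the 10s cap.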
Production jobs should use webhooks, not polling. Polling for 20 minutes at the 10s cap is roughly 120 HTTP calls per task: fine for a dev script, noisy at scale. Set webhook_url and let us push the result to you (see Webhooks below).

Webhooks

Polling is fine for small integrations, but for production workloads use webhooks and let us push the result to you:
POST /excel/generate
{
  "prompt": "...",
  "async": true,
  "webhook_url": "https://myapp.example.com/webhooks/overten"
}
When the task finishes, we POST to that URL with an HMAC-signed payload. See Webhooks for signature verification and the retry schedule.
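For orientation only, here is a generic sketch of checking an HMAC signature on a received payload with the Python standard library. The hash algorithm (SHA-256), the hex encoding, and how the signature is delivered are all assumptions made for illustration; the Webhooks page is the authority on the actual scheme.

```python
import hashlib
import hmac


def signature_is_valid(secret: bytes, raw_body: bytes, received_hex: str) -> bool:
    """Check a hex HMAC-SHA256 signature over the raw request body.

    Assumption: SHA-256 with hex encoding. Confirm against the Webhooks docs.
    """
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time to avoid timing side channels.
    return hmac.compare_digest(expected.encode(), received_hex.encode())


secret = b"example-secret"               # hypothetical shared secret
body = b'{"status": "completed"}'        # raw bytes exactly as received
good = hmac.new(secret, body, hashlib.sha256).hexdigest()
print(signature_is_valid(secret, body, good))
# prints True
```

Verify against the raw request bytes rather than a re-serialized JSON object, since re-serialization can change whitespace or key order and invalidate the signature.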

Which should I use?

  • Interactive app with a user waiting: Sync. Show a spinner, wait for the response, then redirect to the download_url or embed the edit_url. The call promotes to async automatically if it runs long.
  • Scheduled batch jobs: Async + webhooks. Your cron submits 500 async jobs; our webhook pings your queue when each finishes, and your worker downloads and files them. No HTTP connections are held open.
  • Short memo or one-sheet summary: Sync. These typically land in 90s-3min, fast enough for a human to wait with a spinner.
  • Multi-step work on the same document: Sync each step and reuse run_id. Subsequent calls are cheaper because the agent resumes from state, but each step is still a real agent run (often 60s+), so don’t expect “instant” turns.
  • Financial models, long-form reports, and large decks: Async + webhooks. These jobs commonly take 10–25 minutes. Blocking on them from a browser doesn’t work, and polling for 20+ minutes from a client is wasteful. Submit with async: true + webhook_url, then do whatever else you want until we call you.