.docx report that summarizes everything.
There is no separate “start analysis” call. Uploading a file to a
project triggers ingestion + analysis automatically.
What you get
- Fact extraction per file — entities, numerical values, dates, named parties, claims.
- Cross-file reconciliation — the pipeline compares new facts against the project’s existing ledger and surfaces discrepancies (conflicting values across files) and insights (notable single-file findings).
- Verified decisions — high-confidence escalated items pass through an agentic verification stage with citations and recommendations.
- A structured insights feed — one document per alert, filterable by severity, kind, source file, or resolution state.
- Optional .docx report: a polished, human-readable rollup generated on demand via the same Word agent /word/generate uses.
Lifecycle
1. POST /projects: create a project (name, description, tags).
2. POST /projects/{id}/files: upload each file. Ingestion and analysis kick off automatically.
3. GET /projects/{id}/analysis/status: poll until the pipeline is completed.
4. GET /projects/{id}/insights: read the results. Filter as needed.
5. PATCH /projects/{id}/insights/{insight_id}: mark items resolved.
6. POST /projects/{id}/reports: (optional) generate a .docx insights report.
Quickstart
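A minimal sketch of the lifecycle as an ordered request plan. It only assembles method/path pairs from the endpoints listed under Lifecycle. The helper name lifecycle_plan and its arguments are illustrative, not part of the API, and no base URL, HTTP client, or authentication scheme is assumed. The PATCH step is left out because it needs an insight_id taken from the insights response.

```python
from typing import List, Tuple

Step = Tuple[str, str]  # (HTTP method, path)


def lifecycle_plan(project_id: str, file_count: int, want_report: bool = False) -> List[Step]:
    """Return the calls a client makes, in order, for one project.

    Hypothetical helper: project_id stands for whatever identifier
    POST /projects returns for the new project.
    """
    base = f"/projects/{project_id}"
    steps: List[Step] = [("POST", "/projects")]
    # One upload call per file; each upload triggers ingestion and analysis.
    steps += [("POST", f"{base}/files")] * file_count
    # Poll this endpoint until the pipeline reaches a terminal phase.
    steps.append(("GET", f"{base}/analysis/status"))
    # Read the results once the pipeline has completed.
    steps.append(("GET", f"{base}/insights"))
    if want_report:
        steps.append(("POST", f"{base}/reports"))
    return steps


if __name__ == "__main__":
    for method, path in lifecycle_plan("proj_123", file_count=2, want_report=True):
        print(method, path)
```

The identifier proj_123 is a placeholder; real project ids come from the create call.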
Files and roles
POST /projects/{id}/files takes one file per call (multipart):
| Form field | Required | Values |
|---|---|---|
| file | yes | the binary payload |
| role | no (default context) | primary, context, or reference |
| category | no | one of document, spreadsheet, presentation, image, cv_resume, contract, report, research, other |
| description | no | up to 1000 chars |
| tags | no | JSON array string, e.g. '["audit", "q3"]' |
role is a hint to the analysis pipeline: primary files carry
the authoritative claims; context adds supporting material;
reference is for appendices / lookups.
Allowed MIME types: PDF, DOCX, XLSX, PPTX, CSV, TSV, TXT, JSON, XML,
legacy Office formats, and common image types (PNG, JPEG, WebP).
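The form-field rules above can be checked client-side before uploading. The sketch below validates the non-file fields and encodes tags as a JSON array string, as the table describes. The function name build_upload_fields is hypothetical, and the server remains the authority on what it accepts.

```python
import json
from typing import Dict, List, Optional

ROLES = {"primary", "context", "reference"}
CATEGORIES = {
    "document", "spreadsheet", "presentation", "image", "cv_resume",
    "contract", "report", "research", "other",
}


def build_upload_fields(
    role: str = "context",
    category: Optional[str] = None,
    description: Optional[str] = None,
    tags: Optional[List[str]] = None,
) -> Dict[str, str]:
    """Validate and assemble the non-file form fields for one upload."""
    if role not in ROLES:
        raise ValueError(f"role must be one of {sorted(ROLES)}")
    fields = {"role": role}
    if category is not None:
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category!r}")
        fields["category"] = category
    if description is not None:
        if len(description) > 1000:
            raise ValueError("description is limited to 1000 characters")
        fields["description"] = description
    if tags is not None:
        # The tags field is a JSON array encoded as a string.
        fields["tags"] = json.dumps(tags)
    return fields
```

The returned dict would be sent alongside the binary file part in a single multipart request, one file per call.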
Polling status
Phases
The pipeline moves through these phases (pipeline-level, surfaced in pipeline.phase):
- extracting_facts: per-file fact extraction (LLM + NER).
- reconciling_facts: comparing against the project’s fact ledger.
- verifying_decisions: agentic verification of escalated items.
- completed: terminal success.
- failed: terminal failure (see pipeline.error_message).
Per-file progress is reported in files.phase_counts.
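A polling loop over these phases might look like the sketch below. It assumes the status response is JSON nested as {"pipeline": {"phase": ..., "error_message": ...}}, which is inferred from the dotted names above rather than confirmed. The fetch_status callable, the interval, and the poll limit are all hypothetical; fetch_status stands in for whatever issues GET /projects/{id}/analysis/status.

```python
import time
from typing import Any, Callable, Dict


def wait_for_pipeline(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_s: float = 2.0,
    max_polls: int = 150,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until pipeline.phase is terminal; raise on failure or timeout."""
    for _ in range(max_polls):
        status = fetch_status()
        pipeline = status["pipeline"]  # assumed nesting, see lead-in
        phase = pipeline["phase"]
        if phase == "failed":
            # Terminal failure: surface the pipeline's own error message.
            raise RuntimeError(pipeline.get("error_message") or "pipeline failed")
        if phase == "completed":
            return status
        # Any other phase means work is still in progress.
        sleep(interval_s)
    raise TimeoutError(f"pipeline still running after {max_polls} polls")
```

Passing sleep as a parameter keeps the loop testable without real delays.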
Filtering insights
Query parameters on GET /projects/{id}/insights:
| Param | Type | Notes |
|---|---|---|
| limit | int | 1–200, default 50 |
| cursor | string | opaque, returned as next_cursor |
| severity | string | comma list: critical,high,medium,low |
| alert_kind | string | comma list: disparity,insight,similarity |
| file_id | string | comma list of source file_ids to restrict to |
| is_resolved | bool | true or false |
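As an illustration of the parameters above, the following sketch builds a query string for the insights endpoint. The helper name insights_query is hypothetical, and sending true/false as lowercase strings follows the table. Client-side validation here is a convenience, not a statement of how the server validates.

```python
from typing import Dict, Iterable, Optional
from urllib.parse import urlencode

SEVERITIES = {"critical", "high", "medium", "low"}
ALERT_KINDS = {"disparity", "insight", "similarity"}


def _comma_list(values: Iterable[str], allowed: Optional[set] = None) -> str:
    """Join values with commas, rejecting any not in the allowed set."""
    values = list(values)
    if allowed is not None:
        bad = [v for v in values if v not in allowed]
        if bad:
            raise ValueError(f"unsupported values: {bad}")
    return ",".join(values)


def insights_query(
    limit: int = 50,
    cursor: Optional[str] = None,
    severity: Iterable[str] = (),
    alert_kind: Iterable[str] = (),
    file_id: Iterable[str] = (),
    is_resolved: Optional[bool] = None,
) -> str:
    """Build the query string for GET /projects/{id}/insights."""
    if not 1 <= limit <= 200:
        raise ValueError("limit must be between 1 and 200")
    params: Dict[str, str] = {"limit": str(limit)}
    if cursor:
        params["cursor"] = cursor
    for name, values, allowed in (
        ("severity", severity, SEVERITIES),
        ("alert_kind", alert_kind, ALERT_KINDS),
        ("file_id", file_id, None),
    ):
        joined = _comma_list(values, allowed)
        if joined:
            params[name] = joined
    if is_resolved is not None:
        params["is_resolved"] = "true" if is_resolved else "false"
    # Keep commas literal so comma lists stay readable in the URL.
    return urlencode(params, safe=",")
```

To page through results, a caller would pass each response's next_cursor back as cursor until no cursor is returned.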
Insight shape
Marking insights resolved
Generating a report
POST /projects/{id}/reports produces a .docx built from the
project’s insights, using the same Word agent /word/generate uses.
Default behaviour: run sync, return the download URL inline.
| Field | Default | Notes |
|---|---|---|
| format | "docx" | Only .docx is supported today. |
| title | "<project name> — Insights Report" | Overrides the header shown in the doc. |
| include_severity | all | Array of critical, high, medium, low. |
| include_alert_kind | all | Array of disparity, insight, similarity. |
| include_resolved | false | Whether to include items already marked resolved. |
| file_ids | all | Restrict to insights whose source file is in this list. |
| async | false | Enqueue and return 202 instead of blocking. |
| webhook_url | none | Called on completion when async: true. |
The response has the same shape as POST /word/generate (same run_id,
download_url, edit_url, summary, credits_used, etc.), so you
can reuse your existing Word-generation handling.
409 no_insights is returned if the filter produces zero matching
insights.
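The request fields above can be assembled as in this sketch. It assumes the fields are sent as a JSON body, which the table implies but does not state. The helper name report_request is hypothetical, and rejecting webhook_url without async is a choice made here, based on the note that the webhook is called only when async is true.

```python
from typing import Any, Dict, List, Optional

SEVERITIES = {"critical", "high", "medium", "low"}
ALERT_KINDS = {"disparity", "insight", "similarity"}


def report_request(
    title: Optional[str] = None,
    include_severity: Optional[List[str]] = None,
    include_alert_kind: Optional[List[str]] = None,
    include_resolved: bool = False,
    file_ids: Optional[List[str]] = None,
    run_async: bool = False,
    webhook_url: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a body for POST /projects/{id}/reports (assumed to be JSON)."""
    body: Dict[str, Any] = {
        "format": "docx",  # the only supported format today
        "include_resolved": include_resolved,
        "async": run_async,
    }
    if title is not None:
        body["title"] = title
    # Filters left as None are omitted so the documented default (all) applies.
    if include_severity is not None:
        if not set(include_severity) <= SEVERITIES:
            raise ValueError("include_severity has unsupported values")
        body["include_severity"] = include_severity
    if include_alert_kind is not None:
        if not set(include_alert_kind) <= ALERT_KINDS:
            raise ValueError("include_alert_kind has unsupported values")
        body["include_alert_kind"] = include_alert_kind
    if file_ids is not None:
        body["file_ids"] = file_ids
    if webhook_url is not None:
        # Local design choice: the webhook only fires for async runs.
        if not run_async:
            raise ValueError("webhook_url only applies when async is true")
        body["webhook_url"] = webhook_url
    return body
```

A caller would likely treat a 409 no_insights response as "nothing to report" for the chosen filters rather than as a hard failure.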
Listing past reports
The list is paginated via next_cursor.
Fetching a specific report
Each fetch returns a fresh download_url, so you don’t need
to store it; re-fetch when the URL expires.
Archiving
DELETE /projects/{id} soft-archives the project. Files and insights
are preserved; the project just stops appearing in default listings.
Pass ?include_archived=true on GET /projects to see archived ones.
Scope and visibility
The same scope rules as the rest of the API apply:
- Personal keys see projects whose owner matches the key’s user.
- Workspace keys with canViewAllFiles see every project in the org. Without that permission, members only see projects they own.
Anything outside a key’s scope returns 404 not_found; we don’t leak existence across tenants.
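The visibility rules can be modelled as a small predicate. This is an illustrative model only, not the server's implementation: the ApiKey and Project types and their field names are hypothetical stand-ins for whatever the API tracks internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiKey:
    user_id: str
    org_id: str
    is_workspace_key: bool
    can_view_all_files: bool = False  # models the canViewAllFiles permission


@dataclass(frozen=True)
class Project:
    owner_id: str
    org_id: str


def can_see(key: ApiKey, project: Project) -> bool:
    """Return True if the key may see the project under the rules above."""
    if key.is_workspace_key:
        # Workspace keys never see projects from another org.
        if project.org_id != key.org_id:
            return False
        # With canViewAllFiles: every project in the org; otherwise own projects only.
        return key.can_view_all_files or project.owner_id == key.user_id
    # Personal keys see only projects owned by the key's user.
    return project.owner_id == key.user_id
```

When can_see is false, the caller observes the same 404 not_found it would get for a project that does not exist.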
Credits
Report generation charges Word credits on the org’s balance, identical to a POST /word/generate call. The per-file extraction and
reconciliation stages run on the internal pipeline and are logged
through the same LLM usage tracking the web app uses, so org-level
usage rolls up automatically.
Every attach call and every GET is free.