The Slides endpoint produces professional .pptx output with a cohesive
visual system: one palette across the deck, consistent type scale, visual
pacing between hero / content / section-break slides. The agent thinks
like a senior UI/UX designer.
What the Slides agent can do
Ordered from “used in almost every deck” to specialized:
- Multi-slide decks with a cohesive theme — the agent picks a
color palette, accent color, type scale, and spacing system once and
applies it consistently across every slide.
- Rich layouts — hero / title slides, agenda slides, two-column
and three-column content, KPI / big-number cards, comparison
matrices, quote breaks, thank-you closers. The agent picks the right
layout per slide based on what the content is doing.
- Data visualizations — column / bar / line / pie / combo /
donut / scatter / timeline / roadmap charts with the deck’s accent
color; KPI cards with prominent numbers; process flowcharts and
pyramid / funnel diagrams.
- Tables on slides — headed tables, shaded rows, per-cell borders
and alignment; great for feature comparisons and metrics breakdowns.
- Text styling per shape — font family, size, weight, color,
alignment, line height; bold and italic runs inline; bulleted and
numbered lists.
- Backgrounds — solid color, gradient (linear or radial), or
uploaded image; per-slide override so hero and section-break slides
can differ from content slides.
- Speaker notes — the agent writes presenter notes per slide when
the prompt asks for them; notes travel with the .pptx and appear in
PowerPoint / Keynote / Google Slides presenter view.
- Images and icons — embed uploaded assets sized and positioned to
the layout; re-use the same image across multiple slides.
- Shapes and callouts — rectangles, circles, arrows, connectors,
process-flow shapes — positioned precisely with x/y/width/height.
- Slide-level actions — insert, replace, duplicate, delete, or
reorder specific slides in an existing deck (via follow-up calls
with the same run_id).
- Outline-first drafting — produce just the slide titles and
bullets via POST /slides/outline for preview / approval before
committing to a full deck build.
Describe the deck’s purpose and audience; the agent picks the layouts,
palette, and pacing. You can steer with explicit hints (“dark-navy
theme”, “Series B pitch deck”, “include a timeline on slide 7”) and
the agent respects them.
Quickstart
import requests

# API (the base URL) and KEY (your API key) are assumed to be defined already.
resp = requests.post(
    f"{API}/slides/generate",
    headers={"X-API-Key": KEY},
    json={
        "prompt": (
            "10-slide Q3 investor update for our SaaS company. "
            "Focus: MRR growth, churn reduction, Q4 roadmap. "
            "Visual style: clean, minimal, dark navy accents."
        ),
    },
)
Output: a 10-slide .pptx with hero opener, content slides, section
breaks, and a closing CTA.
Presentation modes
The agent picks the right mode from the prompt, but you can nudge it:
| Mode | Signal words | Typical output |
|---|---|---|
| Visual-first / minimal | “pitch deck”, “narrative”, “storytelling”, “clean” | Big hero images, few bullets, emphasis on hierarchy |
| Balanced | “investor update”, “business review” | Mix of visual and data; moderate text density |
| Information-dense | “technical deep-dive”, “detailed analysis”, “academic” | Tighter layouts, more tables/charts, appendix-friendly |
For pitch decks, keynotes, and executive summaries, the agent defaults
to visual-first. For internal reporting and technical content, it
leans information-dense.
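Since the mode is inferred from prompt wording, one way to nudge it from client code is to append a signal phrase from the table. The sketch below is a hypothetical client-side helper, not part of the API: the `nudge_prompt` name and the mode keys are assumptions made for illustration, and only the signal words come from the table.

```python
# Hypothetical helper: this page documents no "mode" request field, so the
# nudge is done purely through prompt wording.
MODE_SIGNALS = {
    "visual_first": "pitch deck",
    "balanced": "business review",
    "information_dense": "technical deep-dive",
}


def nudge_prompt(prompt: str, mode: str) -> str:
    """Append a style hint containing a signal word for the desired mode."""
    signal = MODE_SIGNALS[mode]  # raises KeyError for an unknown mode
    if signal in prompt.lower():
        return prompt  # the prompt already carries the signal
    return f"{prompt.rstrip()} Style: {signal}."


print(nudge_prompt("10 slides on our caching layer.", "information_dense"))
# prints: 10 slides on our caching layer. Style: technical deep-dive.
```

Prompts that already contain the signal phrase are returned unchanged, so the helper is safe to apply more than once.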
Working with images
Image generation is NOT part of the API. Upload your own and reference
them:
headers = {"X-API-Key": KEY}  # same auth header as in the Quickstart

# Upload each image first; the response carries the image_id used below.
with open("hero.png", "rb") as f:
    hero = requests.post(
        f"{API}/images", headers=headers,
        files={"file": f},
        data={"caption": "Q3 hero", "tags": '["hero"]'},
    ).json()
with open("dashboard.png", "rb") as f:
    product = requests.post(
        f"{API}/images", headers=headers,
        files={"file": f},
        data={"tags": '["product", "screenshot"]'},
    ).json()

# Reference the uploaded images by ID when generating the deck.
resp = requests.post(f"{API}/slides/generate", headers=headers, json={
    "prompt": "10-slide pitch deck with product screenshots",
    "image_assets": [
        {"image_id": hero["image_id"], "tags": ["hero"]},
        {"image_id": product["image_id"], "tags": ["product"]},
    ],
})
If your prompt references imagery but no image_assets are supplied,
preflight rejects the request upfront with 400 preflight_failed:
missing=["images"].
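A client can detect this rejection and report which inputs are missing before retrying. The sketch below is only an illustration: it assumes the error text literally contains `preflight_failed` and a `missing=[...]` list as quoted above, and the actual response body format may differ. The `preflight_missing` name is hypothetical.

```python
import json
import re


def preflight_missing(status_code: int, body: str) -> list:
    """Return the inputs named in a preflight failure, or [] otherwise.

    Assumes the body contains text like: preflight_failed: missing=["images"]
    """
    if status_code != 400 or "preflight_failed" not in body:
        return []
    match = re.search(r"missing=(\[.*?\])", body)
    return json.loads(match.group(1)) if match else []


print(preflight_missing(400, 'preflight_failed: missing=["images"]'))
```

With a `requests` response this would be called as `preflight_missing(resp.status_code, resp.text)`.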
Brand customization
Pass a brand object to enforce your visual identity:
resp = requests.post(f"{API}/slides/generate", headers=headers, json={
    "prompt": "10-slide pitch deck on Series B round",
    "brand": {
        "primary_color": "#2E75B6",
        "font": "Inter",
        "logo_url": "https://cdn.acme.com/logo.svg",
    },
    "image_assets": [...],
})
The agent threads the color through headers, accents, dividers, and the
closing CTA. Font applies to all slide text.
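Because the brand object is plain JSON, it can be worth validating it client-side before sending. The following is a minimal sketch under one assumption: that `primary_color` should be a six-digit `#RRGGBB` hex string, which is inferred from the single example above rather than from a documented rule. The `make_brand` helper is hypothetical.

```python
import re

# Assumption: six-digit hex colors only, matching the "#2E75B6" example.
HEX_COLOR = re.compile(r"#[0-9A-Fa-f]{6}")


def make_brand(primary_color: str, font: str, logo_url: str) -> dict:
    """Build a brand object with the three fields shown in the example."""
    if not HEX_COLOR.fullmatch(primary_color):
        raise ValueError(f"expected a #RRGGBB color, got {primary_color!r}")
    return {"primary_color": primary_color, "font": font, "logo_url": logo_url}


brand = make_brand("#2E75B6", "Inter", "https://cdn.acme.com/logo.svg")
```

The resulting dict can be passed as the `"brand"` value in the request body.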
Editing existing decks
# Upload the existing deck, then reference it as init_file.
with open("deck_draft.pptx", "rb") as f:
    upload = requests.post(
        f"{API}/files", headers=headers,
        files={"file": f},
        data={"purpose": "init"},
    ).json()

resp = requests.post(f"{API}/slides/generate", headers=headers, json={
    "prompt": "Replace slide 3 with a KPI summary showing three metric cards",
    "init_file": upload["file_id"],
})
Outline-only generation
If you want the agent to draft just the slide structure (titles + bullets)
before committing to a full deck:
resp = requests.post(f"{API}/slides/outline", headers=headers, json={
    "prompt": "10-slide Series B pitch deck",
    "slide_count_hint": 10,
})
# → {"outline": [{"title": "...", "bullets": [...]}, ...]}
Use this for preview / approval workflows. Once the outline’s good, pass
the same run_id to /slides/generate with a follow-up prompt like “now
build the full deck from this outline”.
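For an approval step, the outline can be rendered as plain text for a reviewer. The sketch below assumes the response has the shape shown in the comment above (an `outline` list of objects with `title` and `bullets`); the `outline_to_text` helper and the sample data are made up for illustration.

```python
def outline_to_text(response: dict) -> str:
    """Render an outline response as a numbered, human-readable list."""
    lines = []
    for number, slide in enumerate(response["outline"], start=1):
        lines.append(f"{number}. {slide['title']}")
        lines.extend(f"   - {bullet}" for bullet in slide.get("bullets", []))
    return "\n".join(lines)


# Made-up sample in the documented shape, not real API output.
sample = {"outline": [
    {"title": "Problem", "bullets": ["Manual reporting is slow"]},
    {"title": "Solution", "bullets": ["Automated dashboards", "One-click export"]},
]}
print(outline_to_text(sample))
```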
Slide types
The agent builds three canonical types:
- Hero — dark / accent background, bold headline. Opening and closing
slides default to this.
- Content — white/light background, 3–5 bullets max, consistent type.
- Section break — gradient tint, big section title, transitions
between major chapters.
Visual pacing: no more than 2 consecutive slides with the same layout family.
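The pacing rule can be expressed as a simple run-length check. This is an illustrative sketch of the rule as stated, not the agent's actual implementation; the `pacing_violations` name and the layout labels are hypothetical.

```python
def pacing_violations(layouts, max_run=2):
    """Return indices of slides that extend a same-layout run beyond max_run."""
    violations = []
    run = 0
    previous = None
    for index, layout in enumerate(layouts):
        # Extend the current run, or start a new one when the layout changes.
        run = run + 1 if layout == previous else 1
        previous = layout
        if run > max_run:
            violations.append(index)
    return violations


deck = ["hero", "content", "content", "content", "section", "content"]
print(pacing_violations(deck))  # prints: [3]
```

Here the fourth slide (index 3) is the third consecutive content slide, so it is the one flagged.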
Things the agent handles well
- Cohesive palette — one industry preset (finance, tech, healthcare,
etc.) per deck. Inherited by every slide.
- Chart-led proof slides — market size, traction, comparison matrices.
- Split-panel layouts — image left, text right (or vice versa).
- Sticky type scale — headline, body, caption — applied uniformly.
- Contrast rules — text on dark backgrounds is always light, and
vice versa. Never invisible.
Things to be aware of
- Image search is not available in API mode. Upload what you want to
use. The agent won’t search the web for images.
- Max slide count: 30 per call. Larger decks should be built in
chunks via run_id resume; each follow-up call costs only a
fraction of the first because the agent remembers the deck’s style.
- Speaker notes are supported — the agent writes them into the
notes pane of each slide when your prompt asks for them (e.g. “add
speaker notes under each slide for the presenter”). They travel with
the .pptx and show up in PowerPoint / Keynote / Google Slides
presenter view.
- Pattern selection is agent-driven. Describe the layout you want
in your prompt (“use a three-column comparison on slide 4”, “make
slide 7 a big-number KPI card”); the agent matches to the closest
supported pattern.
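For decks above the 30-slide limit noted in the list, the slide ranges for each call can be planned up front. The sketch below only computes the ranges; how each follow-up call carries the run_id is not shown here, and the `slide_chunks` helper is a hypothetical name.

```python
def slide_chunks(total_slides, max_per_call=30):
    """Split slides 1..total_slides into inclusive (first, last) ranges."""
    if total_slides < 1:
        raise ValueError("total_slides must be at least 1")
    return [
        (start, min(start + max_per_call - 1, total_slides))
        for start in range(1, total_slides + 1, max_per_call)
    ]


print(slide_chunks(75))  # prints: [(1, 30), (31, 60), (61, 75)]
```

Each range can then be named in the prompt of its call, for example “build slides 31 to 60”.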