
June 10, 2026 · 6 min · product · api

Synthesia pricing in 2026 — and the alternatives when you need more than rendered video

Synthesia's plans, what the API actually includes, and the alternative paths — D-ID, HeyGen, and realtime avatar APIs — when your use case is interactive rather than rendered.

Synthesia is the default name in AI presenter video, and for produced training content it has earned that. But its pricing model — and one structural gap — decide quickly whether it fits your project. Here are the numbers as of June 2026, and the alternatives for the cases it doesn't cover. (Always confirm against the official pricing page; plans move.)

Synthesia pricing, decoded

  • Basic (free): 10 minutes of video a month, 9 stock avatars, no API
  • Starter $29/mo ($18 annual): 10 min/mo, 125+ avatars, 3 personal avatars, no API
  • Creator $89/mo ($64 annual): 30 min/mo, 180+ avatars, plus API access capped at 360 minutes of video per year, deducted from the plan's own limits
  • Enterprise: custom pricing, unlimited minutes, 240+ avatars

Two things developers should notice. First, there is no standalone, metered API price — API access is bundled into Creator/Enterprise plan limits, so you can't scale API usage independently of a seat plan. Second, Synthesia has no realtime conversational avatar: its "interactivity" is in-video quizzes and links, and its Video Agents are marked "coming soon." Everything it ships today is rendered, not live.
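Because the API is bundled rather than metered, the nearest thing to a per-minute API price is the plan cost divided by the yearly API cap. The sketch below does that arithmetic with the Creator figures quoted above. The interface and function names are illustrative only, and the result is a best case that assumes every capped minute is actually used through the API.

```typescript
// Figures are the June 2026 numbers quoted in this article; confirm them
// against the official pricing page before budgeting.
interface PlanWithApi {
  name: string;
  monthlyUsd: number;        // price per month on the chosen billing cycle
  apiMinutesPerYear: number; // yearly cap on API-generated video
}

// Best-case cost per API minute: a full year of plan fees spread over
// the yearly API cap. Real cost is higher if the cap is not used up.
function effectiveApiCostPerMinute(plan: PlanWithApi): number {
  const yearlyUsd = plan.monthlyUsd * 12;
  return yearlyUsd / plan.apiMinutesPerYear;
}

const creatorMonthly: PlanWithApi = {
  name: "Creator (monthly billing)",
  monthlyUsd: 89,
  apiMinutesPerYear: 360,
};

const creatorAnnual: PlanWithApi = {
  name: "Creator (annual billing)",
  monthlyUsd: 64,
  apiMinutesPerYear: 360,
};

for (const plan of [creatorMonthly, creatorAnnual]) {
  const perMinute = effectiveApiCostPerMinute(plan);
  console.log(`${plan.name}: $${perMinute.toFixed(2)} per API minute`);
}
// Creator (monthly billing): $2.97 per API minute
// Creator (annual billing): $2.13 per API minute
```

Past 360 minutes in a year there is no next increment to buy on Creator, which is the scaling limit described above: the only step up is an Enterprise quote.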

Synthesia alternatives, by job

For produced video at lower entry cost: D-ID's studio plans start at $5.90/mo, and HeyGen's Creator plan at $29/mo undercuts Synthesia's per-minute economics for short clips. Synthesia still wins on enterprise features, template depth, and localization breadth for L&D teams.

For developer APIs with real usage-based pricing: HeyGen's API wallet charges per second of generated video ($0.05/sec for photo avatars); D-ID sells credit plans that include streaming minutes (effectively $0.50–0.56/min for realtime streams).

For live, conversational avatars (tutors that check understanding out loud, trainers that push back, companions that hold a persona), rendered video is the wrong primitive entirely. That's the category TIC Realtime Avatar occupies: audio-clocked live video with sub-second first frames, a typed TypeScript SDK, and an MCP server for agent-driven apps. It is priced purely in realtime minutes at about $5/hour, from $49/mo for ten hours up to $999/mo for 216 hours, with a free 5-minute sandbox.
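The usage-based prices above are quoted in three different units (per second, per minute, per hour), so it helps to normalize them to dollars per minute before comparing. The following is a rough sketch using only the figures quoted in this article. The labels and helper names are made up for illustration, and the comparison is not strictly like for like: the HeyGen rate is for generated video, while the D-ID and TIC rates are for realtime streams.

```typescript
// All rates are the June 2026 figures quoted in this article; verify
// them with each vendor before budgeting.

// Convert a per-second rate to USD per minute.
function perSecondToPerMinute(usdPerSecond: number): number {
  return usdPerSecond * 60;
}

// Effective USD per minute of a flat monthly plan that bundles hours.
function planToPerMinute(monthlyUsd: number, includedHours: number): number {
  return monthlyUsd / (includedHours * 60);
}

// Projected spend for a number of minutes at a flat per-minute rate.
function projectedCost(minutes: number, ratePerMinute: number): number {
  return minutes * ratePerMinute;
}

interface QuotedRate {
  label: string;        // illustrative label, not an official plan name
  usdPerMinute: number;
}

const quotedRates: QuotedRate[] = [
  { label: "HeyGen API, photo avatar (rendered)", usdPerMinute: perSecondToPerMinute(0.05) },
  { label: "D-ID streaming, low end", usdPerMinute: 0.5 },
  { label: "D-ID streaming, high end", usdPerMinute: 0.56 },
  { label: "TIC, $49/mo for 10 hours", usdPerMinute: planToPerMinute(49, 10) },
  { label: "TIC, $999/mo for 216 hours", usdPerMinute: planToPerMinute(999, 216) },
];

// Compare what ten hours (600 minutes) would cost at each rate.
for (const rate of quotedRates) {
  const tenHours = projectedCost(600, rate.usdPerMinute);
  console.log(`${rate.label}: $${rate.usdPerMinute.toFixed(3)}/min, 600 min = $${tenHours.toFixed(2)}`);
}
```

Swapping 600 for your own expected monthly minutes gives a quick first-pass budget, though plan minimums, credit expiry, and overage rules are not modeled here.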

The honest decision rule

If a human will watch it, render it with Synthesia or HeyGen. If a human will talk to it, you need a realtime avatar API — and per-minute pricing you can meter.

Training is the instructive case: Synthesia excels at producing the course video; it cannot roleplay the difficult customer with your trainee. Plenty of teams end up with both — rendered video for the curriculum, a live character for the rehearsal.

Trying the live path

If your shortlist exists because "interactive" crept into the requirements, prototype the conversation before committing anywhere: register a character from a single image, warm it with one API call, and measure time-to-first-frame against your latency budget. Our sandbox's free monthly minutes exist precisely so that test costs nothing.
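One way to run that latency test is to time several cold starts and compare a high percentile against your budget, not just a single lucky run. The sketch below is vendor-neutral: `probe` is a hypothetical stand-in for whatever call in your chosen SDK resolves once the first video frame arrives, and none of these names come from a real avatar API.

```typescript
// `FirstFrameProbe` is a hypothetical hook: wrap your SDK's session start
// so the promise resolves when the first video frame has been received.
type FirstFrameProbe = () => Promise<void>;

interface LatencyReport {
  p50Ms: number;
  p95Ms: number;
  withinBudget: boolean; // true when p95 is at or under the budget
}

// Time a single run from request to first frame, in milliseconds.
async function timeToFirstFrameMs(probe: FirstFrameProbe): Promise<number> {
  const start = Date.now();
  await probe();
  return Date.now() - start;
}

// Run the probe several times in sequence and collect the timings.
async function collectSamples(probe: FirstFrameProbe, runs: number): Promise<number[]> {
  const samples: number[] = [];
  for (let i = 0; i < runs; i++) {
    samples.push(await timeToFirstFrameMs(probe));
  }
  return samples;
}

// Nearest-rank percentile over an ascending-sorted array.
function percentile(sortedAscending: number[], p: number): number {
  if (sortedAscending.length === 0) {
    throw new Error("percentile needs at least one sample");
  }
  const rank = Math.ceil((p / 100) * sortedAscending.length);
  return sortedAscending[Math.max(0, rank - 1)];
}

// Summarize samples and check the 95th percentile against the budget.
function summarizeLatency(samplesMs: number[], budgetMs: number): LatencyReport {
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const p95Ms = percentile(sorted, 95);
  return { p50Ms: percentile(sorted, 50), p95Ms, withinBudget: p95Ms <= budgetMs };
}
```

In use, you would call `collectSamples(yourProbe, 20)` and pass the result to `summarizeLatency` with your own budget. Judging on p95 is a design choice, not a rule: it keeps one slow cold start from being hidden by a good median.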
