Realtime Avatar
Prepare AI avatars, stream realtime lipsync turns, and render audio to talking-head video — from one typed API.
Realtime Avatar turns a portrait or source video into a talking avatar. Drive it two ways:
Realtime turns
Stream audio-clocked talking-avatar video as the model speaks — for live conversation and co-hosting.
Non-streaming lipsync
Pass a pre-recorded audio URL, get back a rendered MP4 URL. Ideal for backend and batch jobs.
Start here
Quickstart
A talking avatar in a few lines, in TypeScript or Python.
Authentication
API keys, environments, and keeping live keys out of the browser.
TypeScript SDK
realtime-avatar — client, streaming player, React hooks.
Python SDK
realtime-avatar — lean httpx client for backend and batch.
How it works
Create an avatar
Register a portrait image or a source video as an avatar (ava_…).
Prepare
Warm the avatar once so the first turn is fast. (Lipsync prepares implicitly.)
Drive it
Either stream a turn (live video) or lipsync an audio URL (MP4 URL back).
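The three steps above can be sketched as a small in-memory model. This is an illustration of the order of operations only, not the SDK: the class `FakeAvatarService`, its method names, the ID numbering after the `ava_` prefix, and the `example.invalid` URLs are all hypothetical, and nothing here calls the real service. See the Quickstart for the actual client calls.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class FakeAvatarService:
    """In-memory stand-in for the service. Every name here is hypothetical."""

    _ids: count = field(default_factory=lambda: count(1))
    prepared: set = field(default_factory=set)

    def create_avatar(self, source_url: str) -> str:
        # Step 1: register a portrait image or source video.
        # Only the "ava_" prefix comes from the docs; the suffix is made up.
        return f"ava_{next(self._ids):04d}"

    def prepare(self, avatar_id: str) -> None:
        # Step 2: warm the avatar once so the first realtime turn is fast.
        self.prepared.add(avatar_id)

    def stream_turn(self, avatar_id: str, audio_chunks):
        # Step 3, realtime path: yield one placeholder frame per audio chunk.
        for index, chunk in enumerate(audio_chunks):
            yield {"avatar": avatar_id, "frame": index, "audio_bytes": len(chunk)}

    def lipsync(self, avatar_id: str, audio_url: str) -> str:
        # Step 3, non-streaming path: prepares implicitly, returns an MP4 URL.
        self.prepared.add(avatar_id)
        return f"https://example.invalid/renders/{avatar_id}.mp4"


service = FakeAvatarService()
avatar_id = service.create_avatar("https://example.invalid/portrait.png")
service.prepare(avatar_id)

# Realtime turn: consume frames as audio chunks arrive.
frames = list(service.stream_turn(avatar_id, [b"\x00" * 320, b"\x00" * 320]))

# Non-streaming lipsync: audio URL in, MP4 URL out.
mp4_url = service.lipsync(avatar_id, "https://example.invalid/speech.wav")
print(avatar_id, len(frames), mp4_url)
```

The model keeps `prepare` as an explicit step for the realtime path and folds it into `lipsync`, mirroring the note that lipsync prepares implicitly.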
For agents and tooling
Machine-readable references: the OpenAPI spec, llms.txt, and a remote MCP server at /api/mcp.