Web playback
Play a streaming turn on a canvas with AvatarPlayer.
AvatarPlayer (from realtime-avatar/browser) decodes the avatar mux stream and
renders it to a <canvas> with an audio-clocked loop, so audio and video stay in
sync. It handles JPEG/I420 decode, buffering, and frame dropping for you.
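
To make "audio-clocked" concrete, the sketch below shows one way such a loop could choose which frame to draw: the audio clock is the source of truth, and frames that are already late are dropped rather than shown out of sync. This is not AvatarPlayer's actual implementation. QueuedFrame, pickFrame, and the millisecond timestamps are hypothetical names chosen for illustration, and the queue is assumed to be sorted by timestamp.

```typescript
// Hypothetical frame record: `ptsMs` is the presentation time in milliseconds.
interface QueuedFrame {
  ptsMs: number;
  label: string; // stand-in for decoded pixel data
}

interface PickResult {
  frame: QueuedFrame | null; // frame to draw now, or null if none is due yet
  dropped: number; // frames skipped because a newer due frame exists
}

// Pick the newest frame whose timestamp is at or before the audio clock.
// All due frames are removed from the queue; older ones count as dropped.
function pickFrame(queue: QueuedFrame[], audioClockMs: number): PickResult {
  let due = 0;
  for (const f of queue) {
    if (f.ptsMs <= audioClockMs) due++;
    else break; // queue is assumed sorted, so later frames are not due either
  }
  if (due === 0) return { frame: null, dropped: 0 };
  const removed = queue.splice(0, due);
  return { frame: removed[removed.length - 1] ?? null, dropped: due - 1 };
}
```

In a browser, a function like this would typically be called from a requestAnimationFrame callback, with audioClockMs derived from AudioContext.currentTime (which is reported in seconds).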
Minimal player
    import { RealtimeAvatarClient } from "realtime-avatar";
    import { AvatarPlayer } from "realtime-avatar/browser";

    const client = RealtimeAvatarClient.webProxy(); // browser-safe; key stays server-side
    const player = new AvatarPlayer();
    player.attach(document.querySelector("canvas")!);

    // Any clickable element works; the "#play" selector is only illustrative.
    const playButton = document.querySelector<HTMLButtonElement>("#play")!;

    // Unlock audio inside a user gesture (browser autoplay policy):
    playButton.addEventListener("click", async () => {
      await player.unlock();
      await client.prepare({ avatar_id: "ava_…" });
      const stream = await client.turn({ avatar_id: "ava_…", mode: "speak_text", text: "Hi!" });
      await player.play(stream);
    });

Browsers start audio suspended until a user gesture. Call player.unlock() from
a click/tap before the first turn, or the avatar will appear to play silently.
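
For background, the browser mechanism behind this is the Web Audio AudioContext, which starts in the "suspended" state and must be resumed from a gesture. The sketch below shows that general pattern. What player.unlock() does internally is not documented here, so treat this as an illustration of the autoplay rule, not of the library. AudioContextLike and ensureAudioRunning are hypothetical names; state and resume() are the standard AudioContext members.

```typescript
// Minimal structural view of a Web Audio AudioContext (only what this sketch uses).
interface AudioContextLike {
  readonly state: string; // "suspended" | "running" | "closed" in browsers
  resume(): Promise<void>;
}

// Resume the context if it is suspended; returns true when audio can play.
// Safe to call on every gesture: a running context is left untouched.
async function ensureAudioRunning(ctx: AudioContextLike): Promise<boolean> {
  if (ctx.state === "suspended") {
    await ctx.resume();
  }
  return ctx.state === "running";
}
```

Calling this outside a click or tap handler would usually leave the context suspended, which matches the silent playback described above.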
React
For React apps, start with useRealtimeAvatar — it hides WebSocket setup, GPU
prepare, audio-clocked canvas playback, optional WHEP/SFU preconnect, keepalives,
and HTTP fallback:
    import { useRealtimeAvatar, RealtimeAvatarView } from "realtime-avatar/react";

    function Avatar() {
      const avatar = useRealtimeAvatar({
        avatarId: "ava_…",
        mode: "interactive", // default: fastest 1:1 path; "livestream" preconnects WHEP/SFU
      });

      return (
        <>
          <RealtimeAvatarView avatar={avatar} style={{ aspectRatio: "9 / 16" }} />
          <button
            disabled={!avatar.ready || avatar.busy}
            onClick={() => void avatar.speak("Hello!")}
          >
            Speak
          </button>
        </>
      );
    }

The controller exposes ready/busy/error for UI state, chat(text, { history })
for LLM turns, stop() to cancel a turn while keeping the warm socket, and
metrics/transport/mediaState for debugging. Set app-wide defaults once with
RealtimeAvatarProvider. The lower-level useAvatarSession + AvatarStage remain
available for advanced control.
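
As one possible way to use ready/busy/error for UI state, the sketch below maps the three flags to a button label and disabled state. AvatarUiFlags is an assumed subset of the controller (the real field types may differ), and speakButtonView and the label strings are hypothetical.

```typescript
// Assumed subset of the controller; the real field types may differ.
interface AvatarUiFlags {
  ready: boolean;
  busy: boolean;
  error?: unknown;
}

interface ButtonView {
  label: string;
  disabled: boolean;
}

// Derive the Speak button's appearance from the controller flags.
// Errors take priority so the user can retry; otherwise wait for ready and idle.
function speakButtonView(flags: AvatarUiFlags): ButtonView {
  if (flags.error != null) return { label: "Retry", disabled: false };
  if (!flags.ready) return { label: "Connecting…", disabled: true };
  if (flags.busy) return { label: "Speaking…", disabled: true };
  return { label: "Speak", disabled: false };
}
```

Keeping this mapping in a pure function makes it easy to test without rendering a component.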
Lifecycle
attach(canvas): bind (or rebind) the canvas.
unlock(): resume the audio context from a gesture; safe to call repeatedly.
play(stream): resolves only after playout fully drains (no end-of-turn freeze).
stop() / dispose(): stop the current turn / release the audio context on unmount.
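
The lifecycle above suggests a simple way to queue several turns: because play(stream) resolves only after playout drains, awaiting it is enough to keep turns from overlapping. The sketch below assumes a structural subset of the player (PlayerLike, with return types guessed as promises or void) and a hypothetical helper, playInOrder. It disposes the player when the sequence ends, which is one design choice among several.

```typescript
// Assumed structural subset of AvatarPlayer, based on the lifecycle list above.
interface PlayerLike<S> {
  unlock(): Promise<void>;
  play(stream: S): Promise<void>;
  dispose(): void;
}

// Play turns one after another and return how many completed.
// Each entry in `turns` starts a turn and resolves to its stream.
async function playInOrder<S>(
  player: PlayerLike<S>,
  turns: Array<() => Promise<S>>,
): Promise<number> {
  // In a browser, call this function from a click/tap handler so unlock() succeeds.
  await player.unlock();
  let played = 0;
  try {
    for (const startTurn of turns) {
      const stream = await startTurn();
      await player.play(stream); // resolves after playout has fully drained
      played++;
    }
  } finally {
    player.dispose(); // release the audio context even if a turn fails
  }
  return played;
}
```

If the player should stay alive between sequences, drop the finally block and call dispose() on unmount instead; stop() remains the way to cancel a turn that is already playing.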