Limits and specifications of the Live API


This page describes various limits and specifications for using the Live API and its models.

Session-related limits

For the Live API, a session refers to a persistent connection where input and output are streamed continuously over the same connection.

If the session exceeds any of the following limits, the connection is terminated.

  • Connection length is limited to around 10 minutes.

  • Session length depends on the input modalities:

    • Audio-only input sessions are limited to 15 minutes.
    • Video + audio input sessions are limited to 2 minutes.
  • Session context window is limited to 128k tokens.
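
Because the connection is terminated when any one of these limits is exceeded, the limit that matters in practice is the smallest one that applies. The following is a minimal sketch of that arithmetic, not part of any SDK: the constant and function names are hypothetical, "around 10 minutes" is treated as exactly 600 seconds, and "128k" is assumed to mean 128,000 tokens.

```python
# Limits quoted from the list above. All names here are hypothetical.
CONNECTION_LIMIT_S = 10 * 60           # "around 10 minutes" (approximate)
AUDIO_ONLY_SESSION_LIMIT_S = 15 * 60   # audio-only input
AUDIO_VIDEO_SESSION_LIMIT_S = 2 * 60   # video + audio input
CONTEXT_WINDOW_TOKENS = 128_000        # assumes "128k" means 128,000


def earliest_limit_seconds(has_video: bool) -> int:
    """Return the smallest time limit that applies to one connection."""
    session_limit = (
        AUDIO_VIDEO_SESSION_LIMIT_S if has_video else AUDIO_ONLY_SESSION_LIMIT_S
    )
    return min(CONNECTION_LIMIT_S, session_limit)


def seconds_remaining(elapsed_s: float, has_video: bool) -> float:
    """Seconds left before the earliest time limit is hit (never negative)."""
    return max(0.0, earliest_limit_seconds(has_video) - elapsed_s)


def fits_context_window(tokens_used: int) -> bool:
    """True while the session is within the context window limit."""
    return tokens_used <= CONTEXT_WINDOW_TOKENS
```

An app could poll seconds_remaining() to warn the user or wind down the conversation shortly before the connection is closed.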

Rate limits

The Live API has rate limits on both the number of concurrent sessions per Firebase project and the number of tokens per minute (TPM).

  • Gemini Developer API:

  • Vertex AI Gemini API:

    • 5,000 concurrent sessions per Firebase project
    • 4M tokens per minute
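
To see how the two limits interact, it can help to work out the average token budget per session. The sketch below is only illustrative arithmetic using the Vertex AI Gemini API figures listed above; the function name is hypothetical, and it assumes the TPM limit is shared across all sessions in the project.

```python
# Figures from the Vertex AI Gemini API list above.
MAX_CONCURRENT_SESSIONS = 5_000
TOKENS_PER_MINUTE_LIMIT = 4_000_000


def average_tpm_per_session(active_sessions: int) -> float:
    """Average tokens per minute available to each session.

    Assumes the TPM limit is shared evenly by all active sessions,
    which is a simplification for illustration only.
    """
    if not 1 <= active_sessions <= MAX_CONCURRENT_SESSIONS:
        raise ValueError("active_sessions must be between 1 and 5,000")
    return TOKENS_PER_MINUTE_LIMIT / active_sessions
```

Under these assumptions, a project running at the full 5,000 concurrent sessions has an average of 800 tokens per minute per session, so the TPM limit can be reached well before the session limit is.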

Audio formats

The Live API supports the following audio formats:

  • Input audio format: Raw 16-bit PCM audio at 16 kHz, little-endian
  • Output audio format: Raw 16-bit PCM audio at 24 kHz, little-endian

To convey the sample rate of input audio, set the MIME type of each audio-containing Blob to a value like audio/pcm;rate=16000.
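
As an illustration of the input format, the following sketch converts floating-point samples to raw 16-bit little-endian PCM and builds the matching MIME type string. It uses only the Python standard library; the function names are hypothetical, the input is assumed to be in the range -1.0 to 1.0, and the default of a single channel is an assumption that this page does not state.

```python
import struct
from typing import Sequence

INPUT_SAMPLE_RATE_HZ = 16_000   # input rate from the list above
OUTPUT_SAMPLE_RATE_HZ = 24_000  # output rate from the list above
BYTES_PER_SAMPLE = 2            # 16-bit samples


def pcm_mime_type(sample_rate_hz: int) -> str:
    """Build a MIME type such as 'audio/pcm;rate=16000'."""
    return f"audio/pcm;rate={sample_rate_hz}"


def floats_to_pcm16le(samples: Sequence[float]) -> bytes:
    """Convert samples in [-1.0, 1.0] to raw 16-bit little-endian PCM."""
    # Scale to the signed 16-bit range and clip anything out of bounds.
    ints = [max(-32768, min(32767, round(s * 32767))) for s in samples]
    # "<" selects little-endian byte order, "h" is a signed 16-bit integer.
    return struct.pack(f"<{len(ints)}h", *ints)


def bytes_per_second(sample_rate_hz: int, channels: int = 1) -> int:
    """Raw PCM data rate; the single-channel default is an assumption."""
    return sample_rate_hz * BYTES_PER_SAMPLE * channels
```

For example, floats_to_pcm16le([0.0, 1.0, -1.0]) yields the six bytes 00 00 ff 7f 01 80, and one second of single-channel input audio at 16 kHz occupies 32,000 bytes.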

Video formats

The Live API expects video input as a sequence of discrete image frames and supports a rate of 1 frame per second (FPS). For best results, send frames at a native resolution of 768x768 at 1 FPS.

Note that this specification makes the Live API unsuitable for use cases that require analyzing fast-changing video, such as play-by-play in high-speed sports.
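
If the video source produces frames faster than 1 FPS (a camera capturing at 30 FPS, for instance), the frames need to be thinned out before they are sent. The following is a minimal sketch of one way to do that, assuming each frame carries a timestamp in seconds; the function name is hypothetical, and resizing frames to 768x768 is left out because it depends on the imaging library in use.

```python
from typing import List, Optional, Sequence


def frames_to_send(
    frame_timestamps_s: Sequence[float], target_fps: float = 1.0
) -> List[int]:
    """Return indices of frames to keep so that kept frames are at
    least 1 / target_fps seconds apart."""
    interval_s = 1.0 / target_fps
    kept: List[int] = []
    next_due_s: Optional[float] = None
    for index, timestamp_s in enumerate(frame_timestamps_s):
        # Keep the first frame, then the first frame at or after each due time.
        if next_due_s is None or timestamp_s >= next_due_s:
            kept.append(index)
            next_due_s = timestamp_s + interval_s
    return kept
```

With timestamps [0.0, 0.5, 1.0, 1.2, 2.5], the function keeps the frames at indices 0, 2, and 4 and drops the rest.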

Response voices

The Live API supports the following response voice options. For demos of what each voice sounds like, see Chirp 3: HD voices.

If you don't specify a response voice, the default is Puck.

Learn how to specify the response voice.

Zephyr -- Bright
Kore -- Firm
Orus -- Firm
Autonoe -- Bright
Umbriel -- Easy-going
Erinome -- Clear
Laomedeia -- Upbeat
Schedar -- Even
Achird -- Friendly
Sadachbia -- Lively
Puck -- Upbeat
Fenrir -- Excitable
Aoede -- Breezy
Enceladus -- Breathy
Algieba -- Smooth
Algenib -- Gravelly
Achernar -- Soft
Gacrux -- Mature
Zubenelgenubi -- Casual
Sadaltager -- Knowledgeable
Charon -- Informative
Leda -- Youthful
Callirrhoe -- Easy-going
Iapetus -- Clear
Despina -- Smooth
Rasalgethi -- Informative
Alnilam -- Firm
Pulcherrima -- Forward
Vindemiatrix -- Gentle
Sulafat -- Warm

Languages

The Live API supports the following languages. Learn how to influence the response language.

Language                    BCP-47 Code
Arabic (Egyptian)           ar-EG
German (Germany)            de-DE
English (US)                en-US
Spanish (US)                es-US
French (France)             fr-FR
Hindi (India)               hi-IN
Indonesian (Indonesia)      id-ID
Italian (Italy)             it-IT
Japanese (Japan)            ja-JP
Korean (Korea)              ko-KR
Portuguese (Brazil)         pt-BR
Russian (Russia)            ru-RU
Dutch (Netherlands)         nl-NL
Polish (Poland)             pl-PL
Thai (Thailand)             th-TH
Turkish (Turkey)            tr-TR
Vietnamese (Vietnam)        vi-VN
Romanian (Romania)          ro-RO
Ukrainian (Ukraine)         uk-UA
Bengali (Bangladesh)        bn-BD
English (India)             en-IN & hi-IN bundle
Marathi (India)             mr-IN
Tamil (India)               ta-IN
Telugu (India)              te-IN