Manage sessions for the Live API

The Gemini Live API processes continuous streams of audio or text called sessions. You can manage the session lifecycle, from the initial handshake to graceful termination.

Session limits

For the Live API, a session is a persistent connection over which input and output are streamed continuously.

If a session exceeds any of the following limits, the connection is terminated. The Live API does, however, provide options (listed after the limits) for managing them.

- The session context window is limited to 128,000 tokens. Because of this limit, the approximate maximum session durations depend on the input modalities:
  - Audio-only input sessions are limited to 15 minutes.
  - Video + audio input sessions are limited to 2 minutes.
- The connection length is limited to around 10 minutes. You'll receive a going-away notice roughly 60 seconds before the connection ends.

Here are some options for managing these session limits:

- Compress the session context window so that the server automatically keeps the context size within the limit.
- Resume a session to avoid losing conversation context during brief network disconnections or after receiving a going-away notice.

Start a session

For a complete snippet showing how to start a session, see the Live API getting started guide.

Update mid-session

Live API models support the following advanced capabilities for mid-session updates:

- Add incremental content updates
- Update system instructions (only for the Vertex AI Gemini API)

Add incremental content updates

You can add incremental updates during an active session. Use this to send text input, establish session context, or restore session context.

- For longer contexts, we recommend providing a single message summary to free up the context window for subsequent interactions.
- For short contexts, you can send turn-by-turn interactions to represent the exact sequence of events, like the snippet below.
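The turn-by-turn option comes down to one rule: send the history with the turn left open, and close the turn only on the final query. Here is a minimal sketch of that rule in plain JavaScript, independent of any SDK. `planSends` and the shape of the objects it returns are hypothetical names used for illustration, not part of the Live API.

```javascript
// Hypothetical helper (not part of any SDK): builds the ordered list of
// sends for a turn-by-turn context update.
function planSends(historyTurns, finalQuery) {
  const plan = [];
  if (historyTurns.length > 0) {
    // History is sent with the turn left open, so the model does not respond yet.
    plan.push({ turns: historyTurns, turnComplete: false });
  }
  // The final query closes the turn, which triggers the model response.
  plan.push({ turns: [finalQuery], turnComplete: true });
  return plan;
}

const plan = planSends(
  [
    { role: "user", text: "What is the capital of France?" },
    { role: "model", text: "Paris" },
  ],
  { role: "user", text: "What is the capital of Germany?" }
);
console.log(plan.map((step) => step.turnComplete)); // [ false, true ]
```

Each step of the plan would then map onto one send call in the platform snippets that follow.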

Swift

// Define initial turns (history/context).
let turns: [ModelContent] = [
  ModelContent(role: "user", parts: [TextPart("What is the capital of France?")]),
  ModelContent(role: "model", parts: [TextPart("Paris")]),
]

// Send history, keeping the conversational turn OPEN (false).
await session.sendContent(turns, turnComplete: false)

// Define the new user query.
let newTurn: [ModelContent] = [
  ModelContent(role: "user", parts: [TextPart("What is the capital of Germany?")]),
]

// Send the final query, CLOSING the turn (true) to trigger the model response.
await session.sendContent(newTurn, turnComplete: true)

Kotlin

Not yet supported for Android apps - check back soon!

Java

Not yet supported for Android apps - check back soon!

Web

const turns = [{ text: "Hello from the user!" }];

await session.send(
  turns,
  false // turnComplete: false
);

console.log("Sent history. Waiting for next input...");

// Define the new user query.
const newTurn = [{ text: "And what is the capital of Germany?" }];

// Send the final query, CLOSING the turn (true) to trigger the model response.
await session.send(
    newTurn,
    true // turnComplete: true
);
console.log("Sent final query. Model response expected now.");

Dart

// Define initial turns (history/context).
final List<Content> turns = [
  Content(
    "user",
    [Part.text("What is the capital of France?")],
  ),
  Content(
    "model",
    [Part.text("Paris")],
  ),
];

// Send history, keeping the conversational turn OPEN (false).
await session.send(
  input: turns,
  turnComplete: false,
);

// Define the new user query.
final List<Content> newTurn = [
  Content(
    "user",
    [Part.text("What is the capital of Germany?")],
  ),
];

// Send the final query, CLOSING the turn (true) to trigger the model response.
await session.send(
  input: newTurn,
  turnComplete: true,
);

Unity

// Define initial turns (history/context).
List<ModelContent> turns = new List<ModelContent> {
    new ModelContent("user", new ModelContent.TextPart("What is the capital of France?") ),
    new ModelContent("model", new ModelContent.TextPart("Paris") ),
};

// Send history, keeping the conversational turn OPEN (false).
foreach (ModelContent turn in turns)
{
    await session.SendAsync(
        content: turn,
        turnComplete: false
    );
}

// Define the new user query.
ModelContent newTurn = ModelContent.Text("What is the capital of Germany?");

// Send the final query, CLOSING the turn (true) to trigger the model response.
await session.SendAsync(
    content: newTurn,
    turnComplete: true
);

Update system instructions mid-session

Available only when using the Vertex AI Gemini API as your API provider.

You can update the system instructions during an active session. Use this to adapt the model's responses, for example to change the response language or tone.

To update the system instructions mid-session, send text content with the system role. The updated system instructions remain in effect for the rest of the session.

Swift

await session.sendContent(
  [ModelContent(
    role: "system",
    parts: [TextPart("new system instruction")]
  )],
  turnComplete: false
)

Kotlin

Not yet supported for Android apps - check back soon!

Java

Not yet supported for Android apps - check back soon!

Web

Not yet supported for Web apps - check back soon!

Dart

try {
  await _session.send(
    input: Content(
      'system',
      [Part.text('new system instruction')],
    ),
    turnComplete: false,
  );
} catch (e) {
  print('Failed to update system instructions: $e');
}

Unity

try
{
    await session.SendAsync(
        content: new ModelContent(
            "system",
            new ModelContent.TextPart("new system instruction")
        ),
        turnComplete: false
    );
}
catch (Exception e)
{
    Debug.LogError($"Failed to update system instructions: {e.Message}");
}

Compress the context window

The Live API session context window stores real-time streamed data (25 tokens per second (TPS) for audio and 258 TPS for video) as well as other content, including text inputs and model outputs. All Live API models have a session context window limit of 128,000 tokens.

By default, because of this context window limit, the approximate maximum session durations depend on the input modalities:

- Audio-only input sessions are limited to 15 minutes.
- Video + audio input sessions are limited to 2 minutes.

In long-running sessions, the history of audio and/or video tokens accumulates as the conversation progresses. If this history exceeds the model's limit, the model might hallucinate or slow down, or the session might be forcibly terminated.
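To get a feel for how quickly streamed input accumulates, the sketch below multiplies the streaming rates quoted in this section (25 TPS for audio, 258 TPS for video) by the streamed duration. The function name is hypothetical, and the estimate deliberately ignores text inputs and model outputs, which also occupy the context window.

```javascript
// Streaming rates quoted in this section, in tokens per second.
const AUDIO_TOKENS_PER_SECOND = 25;
const VIDEO_TOKENS_PER_SECOND = 258;

// Hypothetical helper: tokens added to the context window by streamed input alone.
function streamedInputTokens({ audioSeconds = 0, videoSeconds = 0 }) {
  return (
    audioSeconds * AUDIO_TOKENS_PER_SECOND +
    videoSeconds * VIDEO_TOKENS_PER_SECOND
  );
}

// One minute of combined audio and video input.
console.log(streamedInputTokens({ audioSeconds: 60, videoSeconds: 60 })); // 16980
```

Because only streamed input is counted, treat the result as a lower bound on actual context usage.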

To enable longer sessions, you can turn on context window compression by setting the contextWindowCompression field as part of LiveGenerationConfig. When enabled, the server uses a sliding-window mechanism to automatically discard or summarize the oldest turns, keeping the context size within the default or specified limits. System instructions are never discarded and always remain at the beginning of the context window.

From the user's perspective, this allows theoretically unlimited session durations, because the "memory" is continuously managed.

You can configure the sliding-window mechanism and, optionally, the number of tokens that triggers compression (see the available settings and values below). Here are some high-level considerations for these settings:

- If you set targetTokens to a very low value, you free up more context space for the continuous streams, but the model quickly "forgets" earlier turns of the conversation.
- If you set targetTokens to a value closer to triggerTokens, more memory is retained, but the compression routines are triggered much more often.
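The trade-off between the two settings can be made concrete with a rough estimate of how often compression runs. The sketch below assumes the context grows at a constant rate from targetTokens back up to triggerTokens, which is a simplification (text and model output also add tokens). `secondsBetweenCompressions` is a hypothetical helper, not an SDK function.

```javascript
// Hypothetical helper: seconds of streaming between two compressions, assuming
// the context grows at a constant rate from targetTokens up to triggerTokens.
function secondsBetweenCompressions(triggerTokens, targetTokens, tokensPerSecond) {
  if (targetTokens >= triggerTokens) {
    throw new RangeError("targetTokens must be less than triggerTokens");
  }
  return (triggerTokens - targetTokens) / tokensPerSecond;
}

const AUDIO_TPS = 25; // audio streaming rate quoted in this section

// Low target: compression runs rarely, but little history survives each run.
console.log(secondsBetweenCompressions(10000, 2000, AUDIO_TPS)); // 320
// Target close to trigger: more history survives, but compression runs often.
console.log(secondsBetweenCompressions(10000, 9000, AUDIO_TPS)); // 40
```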

| Setting | Default value for the sliding window if not set in the config | Minimum value | Maximum value |
| --- | --- | --- | --- |
| `triggerTokens`: the context window size before compression is triggered | 80% of the model's context window limit | 5,000 | 128,000 |
| `targetTokens`: the target number of tokens to keep | 50% of the `triggerTokens` value. If `triggerTokens` is not set explicitly, `targetTokens` defaults to 50% of the default `triggerTokens` value. The `targetTokens` value must be less than the `triggerTokens` value. | 0 | 128,000 |
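The defaults and ranges in the table can be checked on the client before a config is sent. The following is a minimal sketch of such a check, not the server's actual validation logic. `resolveCompressionConfig` is a hypothetical helper, and the returned object shape simply mirrors the Web snippet's contextWindowCompression field.

```javascript
const CONTEXT_WINDOW_LIMIT = 128000; // session context window limit, in tokens
const MIN_TRIGGER_TOKENS = 5000; // minimum triggerTokens from the table

// Hypothetical helper (not an SDK function): fills in the documented defaults
// and checks the documented ranges for a compression config.
function resolveCompressionConfig({ triggerTokens, targetTokens } = {}) {
  // Default trigger: 80% of the model's context window limit.
  const trigger = triggerTokens ?? (CONTEXT_WINDOW_LIMIT * 80) / 100;
  // Default target: 50% of the (explicit or default) trigger value.
  const target = targetTokens ?? Math.floor(trigger / 2);

  if (trigger < MIN_TRIGGER_TOKENS || trigger > CONTEXT_WINDOW_LIMIT) {
    throw new RangeError("triggerTokens is outside the documented range");
  }
  if (target < 0 || target > CONTEXT_WINDOW_LIMIT) {
    throw new RangeError("targetTokens is outside the documented range");
  }
  if (target >= trigger) {
    throw new RangeError("targetTokens must be less than triggerTokens");
  }
  return { triggerTokens: trigger, slidingWindow: { targetTokens: target } };
}

console.log(JSON.stringify(resolveCompressionConfig()));
// {"triggerTokens":102400,"slidingWindow":{"targetTokens":51200}}
```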

Swift


// Initialize the Gemini Developer API backend service
let liveModel = FirebaseAI.firebaseAI(backend: .googleAI()).liveModel(
  modelName: "gemini-2.5-flash-native-audio-preview-12-2025",
  // Enable context window compression.
  // (Optional) Configure the number of tokens in the context window that triggers the compression.
  generationConfig: LiveGenerationConfig(
    responseModalities: [.audio],
    contextWindowCompression: ContextWindowCompressionConfig(
      triggerTokens: 10000,
      slidingWindow: SlidingWindow(
        targetTokens: 2000
      )
    )
  )
)

Kotlin


// Initialize the Gemini Developer API backend service
val liveModel = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
    modelName = "gemini-2.5-flash-native-audio-preview-12-2025",
    // Enable context window compression.
    // (Optional) Configure the number of tokens in the context window that triggers the compression.
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.AUDIO
        contextWindowCompression = ContextWindowCompressionConfig(
            triggerTokens = 10000,
            slidingWindow = SlidingWindow(targetTokens = 2000)
        )
    }
)

Java


// Initialize the Gemini Developer API backend service
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.googleAI()).liveModel(
        "gemini-2.5-flash-native-audio-preview-12-2025",
        // Enable context window compression.
        // (Optional) Configure the number of tokens in the context window that triggers the compression.
        new LiveGenerationConfig.Builder()
                .setResponseModality(ResponseModality.AUDIO)
                .setContextWindowCompression(
                        new ContextWindowCompressionConfig(10000, new SlidingWindow(2000))
                )
                .build()
);

Web


const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

const liveModel = getLiveGenerativeModel(ai, {
  model: "gemini-2.5-flash-native-audio-preview-12-2025",
  // Enable context window compression.
  // (Optional) Configure the number of tokens in the context window that triggers the compression.
  generationConfig: {
    responseModalities: [ResponseModality.AUDIO],
    contextWindowCompression: {
      triggerTokens: 10000,
      slidingWindow: {
        targetTokens: 2000,
      },
    },
  },
});

Dart


final _liveModel = FirebaseAI.googleAI().liveGenerativeModel(
  model: 'gemini-2.5-flash-native-audio-preview-12-2025',
  // Enable context window compression.
  // (Optional) Configure the number of tokens in the context window that triggers the compression.
  liveGenerationConfig: LiveGenerationConfig(
    responseModalities: [ResponseModalities.audio],
    contextWindowCompression: ContextWindowCompressionConfig(
      triggerTokens: 10000,
      slidingWindow: SlidingWindow(targetTokens: 2000),
    ),
  ),
);

Unity


var liveModel = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetLiveModel(
    modelName: "gemini-2.5-flash-native-audio-preview-12-2025",
    // Enable context window compression.
    // (Optional) Configure the number of tokens in the context window that triggers the compression.
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Audio },
        contextWindowCompression: new ContextWindowCompressionConfig(
            triggerTokens: 10000,
            slidingWindow: new SlidingWindow(targetTokens: 2000)
        )
    )
);

Detect when a session is about to end

The maximum length of a single continuous WebSocket connection is around 10 minutes. A going-away notice is sent to the client 60 seconds before the connection ends, which gives you time to take further action (for example, resuming the session).
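One way to act on the notice is to schedule a reconnect slightly before the remaining time runs out. The sketch below is plain JavaScript with no SDK calls. It assumes the notice arrives as a message with a `type` of "goingAwayNotice" and a `timeLeft` field expressed as a number of seconds, and the 10-second safety margin is an arbitrary choice. Both function names are hypothetical.

```javascript
// Hypothetical helper: how long to wait before reconnecting after a
// going-away notice. Assumes timeLeftSeconds is a number of seconds.
function reconnectDelaySeconds(timeLeftSeconds, safetyMarginSeconds = 10) {
  // Reconnect before the server closes the connection; never return a negative delay.
  return Math.max(0, timeLeftSeconds - safetyMarginSeconds);
}

// Hypothetical message handler. Returns true if the message was a going-away notice.
function handleMessage(message, scheduleReconnect) {
  if (message.type === "goingAwayNotice") {
    scheduleReconnect(reconnectDelaySeconds(message.timeLeft));
    return true;
  }
  return false; // not a going-away notice; handle it elsewhere
}

handleMessage({ type: "goingAwayNotice", timeLeft: 60 }, (delay) => {
  console.log(`Reconnect in ${delay} seconds`); // Reconnect in 50 seconds
});
```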

The following example shows how to detect an impending connection termination by listening for a going-away notice:

Swift

for try await response in session.responses {
  switch response.payload {

  case .goingAwayNotice(let goingAwayNotice):
    // Prepare for the session to close soon
    if let timeLeft = goingAwayNotice.timeLeft {
        print("Server going away in \(timeLeft) seconds")
    }
  default:
    // Ignore other payload types in this example
    break
  }
}

Kotlin

for (response in session.responses) {
    when (val message = response.payload) {
        is LiveServerGoAway -> {
            // Prepare for the session to close soon
            val remaining = message.timeLeft
            logger.info("Server going away in $remaining")
        }
    }
}

Java

session.getResponses().forEach(response -> {
    if (response.getPayload() instanceof LiveServerResponse.GoingAwayNotice) {
        LiveServerResponse.GoingAwayNotice notice = (LiveServerResponse.GoingAwayNotice) response.getPayload();
        // Prepare for the session to close soon
        Duration timeLeft = notice.getTimeLeft();
    }
});

Web

for await (const message of session.receive()) {
  switch (message.type) {

  // ... handle other message types ...
  case "goingAwayNotice":
    console.log("Server going away. Time left:", message.timeLeft);
    break;
  }
}

Dart

Future<void> _handleLiveServerMessage(LiveServerResponse response) async {
  final message = response.message;
  if (message is GoingAwayNotice) {
     // Prepare for the session to close soon
     developer.log('Server going away. Time left: ${message.timeLeft}');
  }
}

Unity

foreach (var response in session.Responses) {
    if (response.Payload is LiveSessionGoingAway notice) {
        // Prepare for the session to close soon
        TimeSpan timeLeft = notice.TimeLeft;
        Debug.Log($"Server going away notice received. Remaining: {timeLeft}");
    }
}

Resume a session

The Live API supports session resumption to avoid losing conversation context. Each session has a handle that can be used in the following ways:

- Maintain a session before reaching the connection time limit. The maximum length of a single continuous WebSocket connection is around 10 minutes. You can detect when a connection is about to end by listening for a going-away notice, and then extend the session by establishing a new connection using the session handle.
- Resume a session shortly after a connection interruption. If a connection ends or drops before the maximum connection time limit (for example, when switching from Wi-Fi to 5G), the server maintains the session state for around 10 minutes. During this window, you can resume the session by establishing a new connection using the session handle.
- Resume a session after an extended period. After a connection ends, the server maintains the session state for several hours. During this window, you can resume the session by establishing a new connection using the session handle. Note that this window differs between the two Gemini API providers: 2 hours for the Gemini Developer API and 24 hours for the Vertex AI Gemini API.
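Before attempting to resume after a long pause, a client can check whether the saved handle is still likely to be valid. The sketch below encodes the provider windows stated above (2 hours and 24 hours). The provider keys and the function name are hypothetical labels for illustration, and the server remains the final authority on whether a handle is accepted.

```javascript
// Retention windows stated above, in milliseconds.
const RESUMPTION_WINDOW_MS = {
  geminiDeveloperApi: 2 * 60 * 60 * 1000, // 2 hours
  vertexAi: 24 * 60 * 60 * 1000, // 24 hours
};

// Hypothetical helper: whether a saved handle is still likely to be resumable.
function isWithinResumptionWindow(provider, disconnectedAtMs, nowMs) {
  const windowMs = RESUMPTION_WINDOW_MS[provider];
  if (windowMs === undefined) {
    throw new Error(`Unknown provider: ${provider}`);
  }
  return nowMs - disconnectedAtMs < windowMs;
}

const threeHoursMs = 3 * 60 * 60 * 1000;
console.log(isWithinResumptionWindow("geminiDeveloperApi", 0, threeHoursMs)); // false
console.log(isWithinResumptionWindow("vertexAi", 0, threeHoursMs)); // true
```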

By default, session resumption is disabled. To enable it, pass an empty resumption configuration when you establish a new connection. When enabled, the server periodically sends updates containing a session resumption handle. If the session is disconnected, you can reconnect and pass this handle to resume the session with its context intact.
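Because handles arrive periodically, the client needs to keep only the most recent usable one. Here is a minimal sketch of that bookkeeping in plain JavaScript. `ResumptionHandleTracker` is a hypothetical class, not part of any SDK, and the message fields (`type`, `resumable`, `newHandle`) are assumed to match the Web snippet in this section.

```javascript
// Hypothetical tracker (not an SDK class): remembers the most recent usable
// handle from session resumption updates.
class ResumptionHandleTracker {
  constructor() {
    this.handle = null;
  }

  // Updates the saved handle only when the update is resumable and carries a handle.
  observe(message) {
    if (message.type !== "sessionResumptionUpdate") {
      return;
    }
    if (message.resumable && message.newHandle) {
      this.handle = message.newHandle;
    }
  }
}

const tracker = new ResumptionHandleTracker();
tracker.observe({ type: "sessionResumptionUpdate", resumable: true, newHandle: "handle-1" });
// A non-resumable update carries no usable handle, so the saved one is kept.
tracker.observe({ type: "sessionResumptionUpdate", resumable: false, newHandle: null });
console.log(tracker.handle); // handle-1
```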

The following examples show two options for resuming the session:

Swift

// Local variable to save the active session handle
var activeSessionHandle: String?

// Initialize the session. Passing an empty config requests the server to send SessionResumptionUpdate
var session = try await liveModel.connect(
  sessionResumption: SessionResumptionConfig()
)

// Start receiving responses
for try await message in session.responses {
  // Check for new session handles inside your message handling loop
  switch message.payload {
  case let .sessionResumptionUpdate(updateMessage):
    guard let newHandle = updateMessage.newHandle, updateMessage.resumable else {
      continue
    }
    activeSessionHandle = newHandle
    print("SessionResumptionUpdate: handle \(newHandle)")
  // ... handle other LiveServerMessage types ...
  default:
    break
  }
}

// The following are alternative options to resume a session. Choose only one.

// Option 1: Create and connect a session to resume with the saved handle
if let handle = activeSessionHandle {
  session = try await liveModel.connect(
    sessionResumption: SessionResumptionConfig(handle: handle)
  )
}

// Option 2: Resume the session directly on an existing session object
if let handle = activeSessionHandle {
  try await session.resumeSession(
    sessionResumption: SessionResumptionConfig(handle: handle)
  )
}

Kotlin

// Local variable to save the active session handle
var activeSessionHandle: String? = null

// Initialize the session. Passing an empty config requests the server to send SessionResumptionUpdate
var session = liveModel.connect(
    sessionResumption = SessionResumptionConfig()
)

// Start receiving responses
session.receive().collect { message ->
    // Process other received response types...

    // Check for new session handles inside your message handling loop
    if (message is LiveSessionResumptionUpdate) {
        if (message.resumable == true && message.newHandle != null) {
            activeSessionHandle = message.newHandle
            Log.d("TAG", "SessionResumptionUpdate: handle ${message.newHandle}")
        }
    }
}

// The following are alternative options to resume a session. Choose only one.

// Option 1: Create and connect a session to resume with the saved handle
activeSessionHandle?.let { handle ->
    session = liveModel.connect(
        sessionResumption = SessionResumptionConfig(handle = handle)
    )
}

// Option 2: Resume the session directly on an existing session object
activeSessionHandle?.let { handle ->
    session.resumeSession(
        sessionResumption = SessionResumptionConfig(handle = handle)
    )
}

Java

For Java, session resumption is not yet supported. Check back soon!

Web

// Local variable to save the active session handle
let activeSessionHandle = null;

// Initialize the session. Passing an empty object requests the server to send SessionResumptionUpdate
let session = await liveModel.connect({});

// Start receiving responses
for await (const message of session.receive()) {
  // Process other received response types...

  // Check for new session handles inside your message handling loop
  if (message.type === 'sessionResumptionUpdate') {
    if (message.resumable && message.newHandle) {
      activeSessionHandle = message.newHandle;
      console.log(`SessionResumptionUpdate: handle ${activeSessionHandle}`);
    }
  }
}

// The following are alternative options to resume a session. Choose only one.

// Option 1: Create and connect a session to resume with the saved handle
if (activeSessionHandle) {
  session = await liveModel.connect({
    handle: activeSessionHandle
  });
}

// Option 2: Resume the session directly on an existing session object
if (activeSessionHandle) {
  await session.resumeSession({
    handle: activeSessionHandle
  });
}

Dart

// Local variable to save the active session handle
String? _activeSessionHandle;

// Initialize the session. Passing an empty config requests the server to send SessionResumptionUpdate
var _session = await _liveModel.connect(
  sessionResumption: SessionResumptionConfig(),
);

// Start receiving responses
await for (final message in _session.receive()) {
  // Process other received response types...

  // Check for new session handles inside your message handling loop
  if (message is SessionResumptionUpdate &&
      message.resumable != null &&
      message.resumable!) {
    _activeSessionHandle = message.newHandle;
    log('SessionResumptionUpdate: handle ${message.newHandle}');
  }
}

// The following are alternative options to resume a session. Choose only one.

// Option 1: Create and connect a session to resume with the saved handle
if (_activeSessionHandle != null) {
  _session = await _liveModel.connect(
    sessionResumption: SessionResumptionConfig.resume(_activeSessionHandle!),
  );
}

// Option 2: Alternatively, resume the session directly on an existing session object
if (_activeSessionHandle != null) {
  await _session.resumeSession(
    sessionResumption: SessionResumptionConfig.resume(_activeSessionHandle!),
  );
}

Unity

// Local variable to save the active session handle
string activeSessionHandle = null;

// Initialize the session. Passing an empty config requests the server to send SessionResumptionUpdate
var session = await liveModel.ConnectAsync(
    sessionResumption: new SessionResumptionConfig()
);

// Start receiving responses
await foreach (var response in session.ReceiveAsync())
{
  // Process other received response types...

  // Check for new session handles inside your message handling loop
  if (response.Message is LiveSessionResumptionUpdate updateMessage)
  {
    if (updateMessage.Resumable == true && !string.IsNullOrEmpty(updateMessage.NewHandle))
    {
      activeSessionHandle = updateMessage.NewHandle;
      Debug.Log($"SessionResumptionUpdate: handle {activeSessionHandle}");
    }
  }
}

// The following are alternative options to resume a session. Choose only one.

// Option 1: Create and connect a session to resume with the saved handle
if (!string.IsNullOrEmpty(activeSessionHandle)) {
  session = await liveModel.ConnectAsync(
      sessionResumption: new SessionResumptionConfig(activeSessionHandle)
  );
}

// Option 2: Resume the session directly on an existing session object
if (!string.IsNullOrEmpty(activeSessionHandle)) {
  await session.ResumeSessionAsync(
      sessionResumption: new SessionResumptionConfig(activeSessionHandle)
  );
}
