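Manage sessions for the Live API

The Gemini Live API processes continuous streams of audio or text called sessions. You can manage the session lifecycle, from the initial handshake to graceful termination.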

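Session limits

For the Live API, a session is a persistent connection over which input and output are streamed continuously.

If a session exceeds any of the following limits, the connection is terminated. Note, however, that the Live API provides options (described below) for handling these session-related limits.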

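The session context window is limited to 128k tokens.

Because of the context window limit, these are the approximate maximum conversation lengths by input modality:
- Audio-only input sessions are limited to 15 minutes.
- Video-and-audio input sessions are limited to 2 minutes.

The connection length is limited to around 10 minutes.

You receive a going away notification 60 seconds before the connection ends.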

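Here are some ways to handle these session-related limits:

- Compress the session context window so that the server automatically keeps the context size within the limit.
- Resume the session: if the network drops briefly or you receive a going away notification, resume the session so that the conversation context isn't lost.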

Get started

See the getting started guide for the Live API for a complete code snippet that shows how to start a session.

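Update mid-session

Live API models support the following advanced capabilities for mid-session updates:

- Add incremental content updates
- Update system instructions (only for the Vertex AI Gemini API)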

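Add incremental content updates

You can add incremental updates during an active session. Use this capability to send text input, establish session context, or restore session context.

- For longer contexts, we recommend providing a single message summary to free up the context window for subsequent interactions.
- For short contexts, you can send turn-by-turn interactions that represent the exact sequence of events, as in the following snippets.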

Swift

// Define initial turns (history/context).
let turns: [ModelContent] = [
  ModelContent(role: "user", parts: [TextPart("What is the capital of France?")]),
  ModelContent(role: "model", parts: [TextPart("Paris")]),
]

// Send history, keeping the conversational turn OPEN (false).
await session.sendContent(turns, turnComplete: false)

// Define the new user query.
let newTurn: [ModelContent] = [
  ModelContent(role: "user", parts: [TextPart("What is the capital of Germany?")]),
]

// Send the final query, CLOSING the turn (true) to trigger the model response.
await session.sendContent(newTurn, turnComplete: true)

Kotlin

Not yet supported for Android apps - check back soon!

Java

Not yet supported for Android apps - check back soon!

Web

const turns = [{ text: "Hello from the user!" }];

await session.send(
  turns,
  false // turnComplete: false
);

console.log("Sent history. Waiting for next input...");

// Define the new user query.
const newTurn = [{ text: "And what is the capital of Germany?" }];

// Send the final query, CLOSING the turn (true) to trigger the model response.
await session.send(
    newTurn,
    true // turnComplete: true
);
console.log("Sent final query. Model response expected now.");

Dart

// Define initial turns (history/context).
final List<Content> turns = [
  Content(
    "user",
    [Part.text("What is the capital of France?")],
  ),
  Content(
    "model",
    [Part.text("Paris")],
  ),
];

// Send history, keeping the conversational turn OPEN (false).
await session.send(
  input: turns,
  turnComplete: false,
);

// Define the new user query.
final List<Content> newTurn = [
  Content(
    "user",
    [Part.text("What is the capital of Germany?")],
  ),
];

// Send the final query, CLOSING the turn (true) to trigger the model response.
await session.send(
  input: newTurn,
  turnComplete: true,
);

Unity

// Define initial turns (history/context).
List<ModelContent> turns = new List<ModelContent> {
    new ModelContent("user", new ModelContent.TextPart("What is the capital of France?") ),
    new ModelContent("model", new ModelContent.TextPart("Paris") ),
};

// Send history, keeping the conversational turn OPEN (false).
foreach (ModelContent turn in turns)
{
    await session.SendAsync(
        content: turn,
        turnComplete: false
    );
}

// Define the new user query.
ModelContent newTurn = ModelContent.Text("What is the capital of Germany?");

// Send the final query, CLOSING the turn (true) to trigger the model response.
await session.SendAsync(
    content: newTurn,
    turnComplete: true
);

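Update system instructions mid-session

This capability is only available when you use the Vertex AI Gemini API as your API provider.

You can update the system instructions during an active session. Use this capability to adapt the model's responses, for example to change the response language or modify the tone.

To update the system instructions mid-session, send text content with the system role. The updated system instructions remain in effect for the rest of the session.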

Swift

await session.sendContent(
  [ModelContent(
    role: "system",
    parts: [TextPart("new system instruction")]
  )],
  turnComplete: false
)

Kotlin

Not yet supported for Android apps - check back soon!

Java

Not yet supported for Android apps - check back soon!

Web

Not yet supported for Web apps - check back soon!

Dart

try {
  await _session.send(
    input: Content(
      'system',
      [Part.text('new system instruction')],
    ),
    turnComplete: false,
  );
} catch (e) {
  print('Failed to update system instructions: $e');
}

Unity

try
{
    await session.SendAsync(
        content: new ModelContent(
            "system",
            new ModelContent.TextPart("new system instruction")
        ),
        turnComplete: false
    );
}
catch (Exception e)
{
    Debug.LogError($"Failed to update system instructions: {e.Message}");
}

Compress the context window

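The Live API session context window stores real-time streamed data (25 tokens per second for audio and 258 tokens per second for video) along with other content, including text inputs and model outputs. All Live API models have a session context window limit of 128k tokens.

By default, because of the context window limit, these are the approximate maximum session lengths by input modality:

- Audio-only input sessions are limited to 15 minutes.
- Video-and-audio input sessions are limited to 2 minutes.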

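In long conversations, the history of audio and/or video tokens accumulates as the conversation progresses. If this history exceeds the model's limit, the model might hallucinate, slow down, or forcibly terminate the session.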
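
To get a feel for how quickly streamed input accumulates, the rates above can be turned into a rough estimate. The following is a minimal JavaScript sketch, not part of any SDK: `estimateStreamTokens` and `streamShareOfWindow` are hypothetical helper names, and the calculation covers streamed input only. Text inputs and model outputs also consume the window, which is one reason the stated session length limits are more conservative than this arithmetic alone would suggest.

```javascript
// Streaming token rates and window limit quoted in this section.
const AUDIO_TOKENS_PER_SECOND = 25;
const VIDEO_TOKENS_PER_SECOND = 258;
const CONTEXT_WINDOW_LIMIT = 128000;

// Hypothetical helper (not part of any SDK): tokens added by streamed input alone.
function estimateStreamTokens({ audioSeconds = 0, videoSeconds = 0 } = {}) {
  return audioSeconds * AUDIO_TOKENS_PER_SECOND + videoSeconds * VIDEO_TOKENS_PER_SECOND;
}

// Hypothetical helper: fraction of the context window used by streamed input alone.
function streamShareOfWindow(durations) {
  return estimateStreamTokens(durations) / CONTEXT_WINDOW_LIMIT;
}

// Two minutes of combined audio and video input.
const twoMinutes = { audioSeconds: 120, videoSeconds: 120 };
console.log(estimateStreamTokens(twoMinutes)); // 33960
console.log(streamShareOfWindow(twoMinutes).toFixed(3)); // 0.265
```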

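To enable longer sessions, turn on context window compression by setting the contextWindowCompression field in LiveGenerationConfig. When enabled, the server uses a sliding-window mechanism to automatically discard or summarize the oldest turns, keeping the context size within the default or specified limit. System instructions are never discarded and always remain at the beginning of the context window.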

From the user's perspective, the session duration becomes theoretically unlimited, because the "memory" is managed continuously.
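
The sliding-window idea can be illustrated with a small simulation. This is a sketch under stated assumptions, not the server's actual algorithm: it only discards turns (the real mechanism may also summarize them), and the `compressContext` function and the `{ id, tokens }` turn shape are hypothetical names chosen for illustration.

```javascript
// Illustrative simulation only; the server's real algorithm is not documented here.
// Each turn is modeled as { id, tokens }; both field names are hypothetical.
function compressContext(context, { triggerTokens, targetTokens }) {
  const sum = (turns) => turns.reduce((total, turn) => total + turn.tokens, 0);
  const systemTokens = context.systemInstruction.tokens;
  const turns = [...context.turns]; // copy so the caller's array is untouched

  // Compression only starts once the context exceeds the trigger.
  if (systemTokens + sum(turns) > triggerTokens) {
    // Drop the oldest turns until the context fits the target.
    while (turns.length > 0 && systemTokens + sum(turns) > targetTokens) {
      turns.shift();
    }
  }
  // The system instruction always stays at the start of the context.
  return { systemInstruction: context.systemInstruction, turns };
}

const context = {
  systemInstruction: { id: "system", tokens: 100 },
  turns: [
    { id: "a", tokens: 4000 },
    { id: "b", tokens: 4000 },
    { id: "c", tokens: 3000 },
  ],
};
const compressed = compressContext(context, { triggerTokens: 10000, targetTokens: 5000 });
console.log(compressed.turns.map((turn) => turn.id)); // [ 'c' ]
```

In this run the context holds 11,100 tokens, which exceeds the 10,000-token trigger, so the two oldest turns are dropped to get below the 5,000-token target while the system instruction is kept.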

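You can configure the sliding-window mechanism and, optionally, the number of tokens that triggers compression (see the available settings and values below). Here are some high-level considerations when using these settings:

- Setting targetTokens to a very low value frees up more context space for the continuous stream, but the model quickly "forgets" earlier turns.
- Setting targetTokens close to triggerTokens retains more memory, but triggers the compression routine more often.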

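| Setting | Default for the sliding window (if not set in the config) | Minimum | Maximum |
| --- | --- | --- | --- |
| `triggerTokens`: context length at which compression is triggered | 80% of the model's context window limit | 5,000 | 128,000 |
| `targetTokens`: target number of tokens to keep | 50% of the `triggerTokens` value. If `triggerTokens` isn't explicitly set, `targetTokens` defaults to 50% of the default `triggerTokens` value. The `targetTokens` value must be less than the `triggerTokens` value. | 0 | 128,000 |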
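
The defaults and ranges in the table can be checked on the client before connecting. The following is a minimal JavaScript sketch; `resolveCompressionConfig` is a hypothetical helper, not an SDK function, and it assumes the 128,000-token context window limit stated in this section. The returned object mirrors the shape of the `contextWindowCompression` field used in the Web snippet on this page.

```javascript
const CONTEXT_WINDOW_LIMIT = 128000;
const MIN_TRIGGER_TOKENS = 5000;

// Hypothetical helper: applies the documented defaults, then validates the ranges.
function resolveCompressionConfig({ triggerTokens, targetTokens } = {}) {
  // Default trigger: 80% of the context window limit.
  const trigger = triggerTokens ?? (CONTEXT_WINDOW_LIMIT * 80) / 100;
  // Default target: 50% of the (explicit or default) trigger value.
  const target = targetTokens ?? Math.floor(trigger / 2);

  if (trigger < MIN_TRIGGER_TOKENS || trigger > CONTEXT_WINDOW_LIMIT) {
    throw new RangeError(`triggerTokens must be between ${MIN_TRIGGER_TOKENS} and ${CONTEXT_WINDOW_LIMIT}`);
  }
  if (target < 0 || target > CONTEXT_WINDOW_LIMIT) {
    throw new RangeError(`targetTokens must be between 0 and ${CONTEXT_WINDOW_LIMIT}`);
  }
  if (target >= trigger) {
    throw new RangeError("targetTokens must be less than triggerTokens");
  }
  return { triggerTokens: trigger, slidingWindow: { targetTokens: target } };
}

console.log(resolveCompressionConfig());
// { triggerTokens: 102400, slidingWindow: { targetTokens: 51200 } }
console.log(resolveCompressionConfig({ triggerTokens: 10000 }));
// { triggerTokens: 10000, slidingWindow: { targetTokens: 5000 } }
```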

Swift


// Initialize the Gemini Developer API backend service
let liveModel = FirebaseAI.firebaseAI(backend: .googleAI()).liveModel(
  modelName: "gemini-2.5-flash-native-audio-preview-12-2025",
  // Enable context window compression.
  // (Optional) Configure the number of tokens in the context window that triggers the compression.
  generationConfig: LiveGenerationConfig(
    responseModalities: [.audio],
    contextWindowCompression: ContextWindowCompressionConfig(
      triggerTokens: 10000,
      slidingWindow: SlidingWindow(
        targetTokens: 2000
      )
    )
  )
)

Kotlin


// Initialize the Gemini Developer API backend service
val liveModel = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
    modelName = "gemini-2.5-flash-native-audio-preview-12-2025",
    // Enable context window compression.
    // (Optional) Configure the number of tokens in the context window that triggers the compression.
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.AUDIO
        contextWindowCompression = ContextWindowCompressionConfig(
            triggerTokens = 10000,
            slidingWindow = SlidingWindow(targetTokens = 2000)
        )
    }
)

Java


// Initialize the Gemini Developer API backend service
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.googleAI()).liveModel(
        "gemini-2.5-flash-native-audio-preview-12-2025",
        // Enable context window compression.
        // (Optional) Configure the number of tokens in the context window that triggers the compression.
        new LiveGenerationConfig.Builder()
                .setResponseModality(ResponseModality.AUDIO)
                .setContextWindowCompression(
                        new ContextWindowCompressionConfig(10000, new SlidingWindow(2000))
                )
                .build()
);

Web


const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

const liveModel = getLiveGenerativeModel(ai, {
  model: "gemini-2.5-flash-native-audio-preview-12-2025",
  // Enable context window compression.
  // (Optional) Configure the number of tokens in the context window that triggers the compression.
  generationConfig: {
    responseModalities: [ResponseModality.AUDIO],
    contextWindowCompression: {
      triggerTokens: 10000,
      slidingWindow: {
        targetTokens: 2000,
      },
    },
  },
});

Dart


final _liveModel = FirebaseAI.googleAI().liveGenerativeModel(
  model: 'gemini-2.5-flash-native-audio-preview-12-2025',
  // Enable context window compression.
  // (Optional) Configure the number of tokens in the context window that triggers the compression.
  liveGenerationConfig: LiveGenerationConfig(
    responseModalities: [ResponseModalities.audio],
    contextWindowCompression: ContextWindowCompressionConfig(
      triggerTokens: 10000,
      slidingWindow: SlidingWindow(targetTokens: 2000),
    ),
  ),
);

Unity


var liveModel = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetLiveModel(
    modelName: "gemini-2.5-flash-native-audio-preview-12-2025",
    // Enable context window compression.
    // (Optional) Configure the number of tokens in the context window that triggers the compression.
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Audio },
        contextWindowCompression: new ContextWindowCompressionConfig(
            triggerTokens: 10000,
            slidingWindow: new SlidingWindow(targetTokens: 2000)
        )
    )
);

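Detect when a session is about to end

A single continuous WebSocket connection lasts about 10 minutes at most. Sixty seconds before the connection ends, the server sends the client a going away notification so that you can take further action (for example, resume the session).

The following examples show how to listen for the going away notification to detect an impending connection termination: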

Swift

for try await response in session.responses {
  switch response.payload {
  case .goingAwayNotice(let goingAwayNotice):
    // Prepare for the session to close soon
    if let timeLeft = goingAwayNotice.timeLeft {
      print("Server going away in \(timeLeft) seconds")
    }
  // ... handle other payload types ...
  default:
    break
  }
}

Kotlin

for (response in session.responses) {
    when (val message = response.payload) {
        is LiveServerGoAway -> {
            // Prepare for the session to close soon
            val remaining = message.timeLeft
            logger.info("Server going away in $remaining")
        }
    }
}

Java

session.getResponses().forEach(response -> {
    if (response.getPayload() instanceof LiveServerResponse.GoingAwayNotice) {
        LiveServerResponse.GoingAwayNotice notice = (LiveServerResponse.GoingAwayNotice) response.getPayload();
        // Prepare for the session to close soon
        Duration timeLeft = notice.getTimeLeft();
    }
});

Web

for await (const message of session.receive()) {
  switch (message.type) {

  // ... handle other message types ...
  case "goingAwayNotice":
    console.log("Server going away. Time left:", message.timeLeft);
    break;
  }
}

Dart

Future<void> _handleLiveServerMessage(LiveServerResponse response) async {
  final message = response.message;
  if (message is GoingAwayNotice) {
     // Prepare for the session to close soon
     developer.log('Server going away. Time left: ${message.timeLeft}');
  }
}

Unity

foreach (var response in session.Responses) {
    if (response.Payload is LiveSessionGoingAway notice) {
        // Prepare for the session to close soon
        TimeSpan timeLeft = notice.TimeLeft;
        Debug.Log($"Server going away notice received. Remaining: {timeLeft}");
    }
}
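
After a going away notification arrives, the client still has to decide when to reconnect. The following is a minimal JavaScript sketch, assuming the remaining time is available as a plain number of seconds (the SDKs may expose it as a duration type instead). `delayBeforeReconnectMs` and the 10-second default safety margin are hypothetical choices for illustration, not SDK behavior.

```javascript
// Hypothetical helper: how long to wait before reconnecting, in milliseconds,
// so that the new connection is established before the old one closes.
function delayBeforeReconnectMs(timeLeftSeconds, safetyMarginSeconds = 10) {
  // Never return a negative delay; reconnect immediately if time is short.
  const delaySeconds = Math.max(0, timeLeftSeconds - safetyMarginSeconds);
  return delaySeconds * 1000;
}

console.log(delayBeforeReconnectMs(60)); // 50000
console.log(delayBeforeReconnectMs(5)); // 0
```

The result can be passed to `setTimeout` together with whatever reconnect routine the app uses. Reconnecting early rather than at the last moment leaves room for network latency.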

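Resume a session

The Live API supports session resumption so that the conversation context isn't lost. Each session has a handle that you can use for the following:

- Maintaining a session past the connection time limit

  A single continuous WebSocket connection lasts about 10 minutes at most. You can listen for the going away notification to detect that the connection is about to end, and then establish a new connection with the session handle to extend the session.

- Resuming a session immediately after a dropped connection

  If the connection terminates or drops before it reaches the connection time limit (for example, when switching from Wi-Fi to 5G), the server retains the session state for about 10 minutes. During this period, you can establish a new connection with the session handle to resume the session.

- Resuming a session after an extended period

  After the connection ends, the server retains the session state for several hours. During this period, you can establish a new connection with the session handle to resume the session. Note that the retention window differs between the two Gemini API providers: 2 hours for the Gemini Developer API and 24 hours for the Vertex AI Gemini API.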

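Session resumption is disabled by default. To enable it, pass an empty resumption configuration when you establish a new connection. Once enabled, the server periodically sends updates that contain a session resumption handle. If the session disconnects, you can reconnect and pass this handle to resume the session without losing context.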
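
Because a handle is only useful while the server still retains the session state, it can help to track when the latest handle was received. The following is a minimal JavaScript sketch: `SessionHandleStore`, the `googleAI` and `vertexAI` labels, and the injected clock are all hypothetical, and the retention windows are the 2-hour and 24-hour values stated above. The sketch measures age from the last handle update rather than from the moment the connection ended, which is a conservative approximation.

```javascript
// Retention windows stated in this section, keyed by hypothetical provider labels.
const RETENTION_MS = {
  googleAI: 2 * 60 * 60 * 1000, // Gemini Developer API: 2 hours
  vertexAI: 24 * 60 * 60 * 1000, // Vertex AI Gemini API: 24 hours
};

// Hypothetical helper class; not part of any SDK.
class SessionHandleStore {
  constructor(provider, now = () => Date.now()) {
    const retentionMs = RETENTION_MS[provider];
    if (typeof retentionMs !== "number") {
      throw new Error(`Unknown provider: ${provider}`);
    }
    this.retentionMs = retentionMs;
    this.now = now; // injectable clock, which makes the class easy to test
    this.handle = null;
    this.savedAt = null;
  }

  // Keep a handle only when the update is resumable and actually carries one.
  onUpdate({ resumable, newHandle }) {
    if (resumable && newHandle) {
      this.handle = newHandle;
      this.savedAt = this.now();
    }
  }

  // Return the handle while it is inside the retention window, otherwise null.
  usableHandle() {
    if (this.handle === null) return null;
    return this.now() - this.savedAt < this.retentionMs ? this.handle : null;
  }
}

const store = new SessionHandleStore("googleAI");
store.onUpdate({ resumable: true, newHandle: "example-handle" });
console.log(store.usableHandle()); // example-handle
```

If `usableHandle()` returns null, the app would start a fresh session instead of attempting to resume.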

The following examples show two ways to resume a session:

Swift

// Local variable to save the active session handle
var activeSessionHandle: String?

// Initialize the session. Passing an empty config requests the server to send SessionResumptionUpdate
var session = try await liveModel.connect(
  sessionResumption: SessionResumptionConfig()
)

// Start receiving responses
for try await message in session.responses {
  // Check for new session handles inside your message handling loop
  switch message.payload {
  case let .sessionResumptionUpdate(updateMessage):
    guard let newHandle = updateMessage.newHandle, updateMessage.resumable else {
      continue
    }
    activeSessionHandle = newHandle
    print("SessionResumptionUpdate: handle \(newHandle)")
  // ... handle other LiveServerMessage types ...
  default:
    break
  }
}

// The following are alternative options to resume a session. Choose only one.

// Option 1: Create and connect a session to resume with the saved handle
if let handle = activeSessionHandle {
  session = try await liveModel.connect(
    sessionResumption: SessionResumptionConfig(handle: handle)
  )
}

// Option 2: Resume the session directly on an existing session object
if let handle = activeSessionHandle {
  try await session.resumeSession(
    sessionResumption: SessionResumptionConfig(handle: handle)
  )
}

Kotlin

// Local variable to save the active session handle
var activeSessionHandle: String? = null

// Initialize the session. Passing an empty config requests the server to send SessionResumptionUpdate
var session = liveModel.connect(
    sessionResumption = SessionResumptionConfig()
)

// Start receiving responses
session.receive().collect { message ->
    // Process other received response types...

    // Check for new session handles inside your message handling loop
    if (message is LiveSessionResumptionUpdate) {
        if (message.resumable == true && message.newHandle != null) {
            activeSessionHandle = message.newHandle
            Log.d("TAG", "SessionResumptionUpdate: handle ${message.newHandle}")
        }
    }
}

// The following are alternative options to resume a session. Choose only one.

// Option 1: Create and connect a session to resume with the saved handle
activeSessionHandle?.let { handle ->
    session = liveModel.connect(
        sessionResumption = SessionResumptionConfig(handle = handle)
    )
}

// Option 2: Resume the session directly on an existing session object
activeSessionHandle?.let { handle ->
    session.resumeSession(
        sessionResumption = SessionResumptionConfig(handle = handle)
    )
}

Java

For Java, session resumption is not yet supported. Check back soon!

Web

// Local variable to save the active session handle
let activeSessionHandle = null;

// Initialize the session. Passing an empty object requests the server to send SessionResumptionUpdate
let session = await liveModel.connect({});

// Start receiving responses
for await (const message of session.receive()) {
  // Process other received response types...

  // Check for new session handles inside your message handling loop
  if (message.type === 'sessionResumptionUpdate') {
    if (message.resumable && message.newHandle) {
      activeSessionHandle = message.newHandle;
      console.log(`SessionResumptionUpdate: handle ${activeSessionHandle}`);
    }
  }
}

// The following are alternative options to resume a session. Choose only one.

// Option 1: Create and connect a session to resume with the saved handle
if (activeSessionHandle) {
  session = await liveModel.connect({
    handle: activeSessionHandle
  });
}

// Option 2: Resume the session directly on an existing session object
if (activeSessionHandle) {
  await session.resumeSession({
    handle: activeSessionHandle
  });
}

Dart

// Local variable to save the active session handle
String? _activeSessionHandle;

// Initialize the session. Passing an empty config requests the server to send SessionResumptionUpdate
var _session = await _liveModel.connect(
  sessionResumption: SessionResumptionConfig(),
);

// Start receiving responses
await for (final message in _session.receive()) {
  // Process other received response types...

  // Check for new session handles inside your message handling loop
  if (message is SessionResumptionUpdate &&
      message.resumable != null &&
      message.resumable!) {
    _activeSessionHandle = message.newHandle;
    log('SessionResumptionUpdate: handle ${message.newHandle}');
  }
}

// The following are alternative options to resume a session. Choose only one.

// Option 1: Create and connect a session to resume with the saved handle
if (_activeSessionHandle != null) {
  _session = await _liveModel.connect(
    sessionResumption: SessionResumptionConfig.resume(_activeSessionHandle!),
  );
}

// Option 2: Alternatively, resume the session directly on an existing session object
if (_activeSessionHandle != null) {
  await _session.resumeSession(
    sessionResumption: SessionResumptionConfig.resume(_activeSessionHandle!),
  );
}

Unity

// Local variable to save the active session handle
string activeSessionHandle = null;

// Initialize the session. Passing an empty config requests the server to send SessionResumptionUpdate
var session = await liveModel.ConnectAsync(
    sessionResumption: new SessionResumptionConfig()
);

// Start receiving responses
await foreach (var response in session.ReceiveAsync())
{
  // Process other received response types...

  // Check for new session handles inside your message handling loop
  if (response.Message is LiveSessionResumptionUpdate updateMessage)
  {
    if (updateMessage.Resumable == true && !string.IsNullOrEmpty(updateMessage.NewHandle))
    {
      activeSessionHandle = updateMessage.NewHandle;
      Debug.Log($"SessionResumptionUpdate: handle {activeSessionHandle}");
    }
  }
}

// The following are alternative options to resume a session. Choose only one.

// Option 1: Create and connect a session to resume with the saved handle
if (!string.IsNullOrEmpty(activeSessionHandle)) {
  session = await liveModel.ConnectAsync(
      sessionResumption: new SessionResumptionConfig(activeSessionHandle)
  );
}

// Option 2: Resume the session directly on an existing session object
if (!string.IsNullOrEmpty(activeSessionHandle)) {
  await session.ResumeSessionAsync(
      sessionResumption: new SessionResumptionConfig(activeSessionHandle)
  );
}
