管理 Live API 的工作階段

Gemini Live API 會處理稱為「工作階段」的連續音訊或文字串流。您可以管理工作階段生命週期,從初始信號交換到正常終止。

工作階段限制

如果是 Live API工作階段是指持續連線,透過連線持續串流輸入和輸出內容。

如果工作階段超出任何下列限制,連線就會終止。不過,Live API 提供了一些選項 (請參閱下文),可處理這些工作階段相關限制。

  • 工作階段脈絡窗口最多只能有 128,000 個權杖。

    由於有脈絡窗口限制,以下是根據輸入模式估算的對話長度上限:

    • 純語音輸入工作階段的時間上限為 15 分鐘
    • 影片和音訊輸入內容的長度上限為 2 分鐘
  • 連線時間上限為約 10 分鐘

    連線結束前 60 秒,您會收到即將結束通知

以下是處理工作階段相關限制的幾種方式:

開始練習

請參閱入門指南,瞭解如何啟動工作階段的完整程式碼片段。Live API

在工作階段中更新

Live API 模型支援下列進階功能,可進行工作階段中更新

新增增量內容更新

您可以在有效的工作階段中新增遞增更新。可用於傳送文字輸入內容、建立工作階段脈絡或還原工作階段脈絡。

  • 如要提供較長的背景資訊,建議您提供單一訊息摘要,為後續互動釋出背景資訊視窗。

  • 如果是簡短的脈絡,您可以傳送即時路線互動,代表確切的事件順序,如下方程式碼片段所示。

Swift

// Define initial turns (history/context).
let turns: [ModelContent] = [
  ModelContent(role: "user", parts: [TextPart("What is the capital of France?")]),
  ModelContent(role: "model", parts: [TextPart("Paris")]),
]

// Send history, keeping the conversational turn OPEN (false).
await session.sendContent(turns, turnComplete: false)

// Define the new user query.
let newTurn: [ModelContent] = [
  ModelContent(role: "user", parts: [TextPart("What is the capital of Germany?")]),
]

// Send the final query, CLOSING the turn (true) to trigger the model response.
await session.sendContent(newTurn, turnComplete: true)

Kotlin

Not yet supported for Android apps - check back soon!

Java

Not yet supported for Android apps - check back soon!

Web

const turns = [{ text: "Hello from the user!" }];

await session.send(
  turns,
  false // turnComplete: false
);

console.log("Sent history. Waiting for next input...");

// Define the new user query.
const newTurn [{ text: "And what is the capital of Germany?" }];

// Send the final query, CLOSING the turn (true) to trigger the model response.
await session.send(
    newTurn,
    true // turnComplete: true
);
console.log("Sent final query. Model response expected now.");

Dart

// Define initial turns (history/context).
final List turns = [
  Content(
    "user",
    [Part.text("What is the capital of France?")],
  ),
  Content(
    "model",
    [Part.text("Paris")],
  ),
];

// Send history, keeping the conversational turn OPEN (false).
await session.send(
  input: turns,
  turnComplete: false,
);

// Define the new user query.
final List newTurn = [
  Content(
    "user",
    [Part.text("What is the capital of Germany?")],
  ),
];

// Send the final query, CLOSING the turn (true) to trigger the model response.
await session.send(
  input: newTurn,
  turnComplete: true,
);

Unity

// Define initial turns (history/context).
List turns = new List {
    new ModelContent("user", new ModelContent.TextPart("What is the capital of France?") ),
    new ModelContent("model", new ModelContent.TextPart("Paris") ),
};

// Send history, keeping the conversational turn OPEN (false).
foreach (ModelContent turn in turns)
{
    await session.SendAsync(
        content: turn,
        turnComplete: false
    );
}

// Define the new user query.
ModelContent newTurn = ModelContent.Text("What is the capital of Germany?");

// Send the final query, CLOSING the turn (true) to trigger the model response.
await session.SendAsync(
    content: newTurn,
    turnComplete: true
);

在工作階段中更新系統指令

只有在使用 Vertex AI Gemini API 做為 API 供應商時,才能使用這項功能。

在工作階段進行期間,你可以更新系統指示。您可以使用這項功能調整模型的回覆,例如變更回覆語言或修改語氣。

如要在工作階段中更新系統指令,可以傳送 system 角色的文字內容。更新後的系統指令會在工作階段的其餘時間內維持有效。

Swift

await session.sendContent(
  [ModelContent(
    role: "system",
    parts: [TextPart("new system instruction")]
  )],
  turnComplete: false
)

Kotlin

Not yet supported for Android apps - check back soon!

Java

Not yet supported for Android apps - check back soon!

Web

Not yet supported for Web apps - check back soon!

Dart

try {
  await _session.send(
    input: Content(
      'system',
      [Part.text('new system instruction')],
    ),
    turnComplete: false,
  );
} catch (e) {
  print('Failed to update system instructions: $e');
}

Unity

try
{
    await session.SendAsync(
        content: new ModelContent(
            "system",
            new ModelContent.TextPart("new system instruction")
        ),
        turnComplete: false
    );
}
catch (Exception e)
{
    Debug.LogError($"Failed to update system instructions: {e.Message}");
}

壓縮脈絡窗口

按一下 Gemini API 供應商,即可在這個頁面查看供應商專屬內容和程式碼。

Live API 工作階段脈絡窗口會儲存即時串流資料 (音訊每秒 25 個符記,影片每秒 258 個符記) 和其他內容,包括文字輸入內容和模型輸出內容。所有 Live API 模型的工作階段脈絡窗口上限為 128,000 個權杖。

根據預設,由於有脈絡窗口限制,以下是根據輸入模式得出的約略工作階段長度上限:

  • 純語音輸入工作階段的時間上限為 15 分鐘
  • 影片和音訊輸入內容的長度上限為 2 分鐘

在長時間的對話中,隨著對話進行,音訊和/或視訊權杖的記錄會不斷累積。如果這類記錄超出模型限制,模型可能會產生幻覺、速度變慢,或強制終止工作階段。

如要啟用較長的對話,請在 LiveGenerationConfig 中設定 contextWindowCompression 欄位,啟用脈絡視窗壓縮功能。啟用後,伺服器會使用滑動視窗機制,自動捨棄或摘要最舊的輪流對話,將內容大小維持在預設或指定限制內。系統指令不會遭到捨棄,且一律會保留在內容視窗的開頭。

從使用者的角度來看,由於「記憶體」會持續管理,因此理論上工作階段的持續時間無限。

您可以設定滑動視窗機制,以及選擇性設定觸發壓縮的權杖數量 (請參閱下方的可用設定和值)。以下是使用這些設定時需要考量的高層級事項:

  • targetTokens 設為非常低,可為連續串流釋出更多脈絡空間,但模型會迅速「忘記」較早的對話輪次。

  • targetTokens 設為接近 triggerTokens 可保留更多記憶體,但會更頻繁地觸發壓縮常式。

設定 如果未在設定中設定,滑動視窗的預設值 最小值 最大值
triggerTokens
觸發壓縮前的脈絡長度
模型脈絡窗口限制的 80% 5,000 128,000
targetTokens
要保留的權杖目標數量
triggerTokens 值的 50%
  • 如果沒有明確設定 triggerTokenstargetTokens 預設為預設 triggerTokens 值的 50%。
  • targetTokens 值必須小於 triggerTokens 值。
0 128,000

Swift


// Initialize the Gemini Developer API backend service
let liveModel = FirebaseAI.firebaseAI(backend: .googleAI()).liveModel(
  modelName: "gemini-2.5-flash-native-audio-preview-12-2025",
  // Enable context window compression.
  // (Optional) Configure the number of tokens in the context window that triggers the compression.
  generationConfig: LiveGenerationConfig(
    responseModalities: [.audio],
    contextWindowCompression: ContextWindowCompressionConfig(
      triggerTokens: 10000,
      slidingWindow: SlidingWindow(
        targetTokens: 2000,
      )
    )
  )
)

Kotlin


// Initialize the Gemini Developer API backend service
val liveModel = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
    modelName = "gemini-2.5-flash-native-audio-preview-12-2025",
    // Enable context window compression.
    // (Optional) Configure the number of tokens in the context window that triggers the compression.
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.AUDIO,
        contextWindowCompression = ContextWindowCompressionConfig(
            triggerTokens = 10000,
            slidingWindow = SlidingWindow(targetTokens = 2000)
        )
    }
)

Java


// Initialize the Gemini Developer API backend service
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.googleAI()).liveModel(
        "gemini-2.5-flash-native-audio-preview-12-2025",
        // Enable context window compression.
        // (Optional) Configure the number of tokens in the context window that triggers the compression.
        new LiveGenerationConfig.Builder()
                .setResponseModality(ResponseModality.AUDIO)
                .setContextWindowCompression(
                        new ContextWindowCompressionConfig(10000, new SlidingWindow(2000))
                )
                .build()
);

Web


const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

const liveModel = getLiveGenerativeModel(ai, {
  model: "gemini-2.5-flash-native-audio-preview-12-2025",
  // Enable context window compression.
  // (Optional) Configure the number of tokens in the context window that triggers the compression.
  generationConfig: {
    responseModalities: [ResponseModality.AUDIO],
    contextWindowCompression: {
      triggerTokens: 10000,
      slidingWindow: {
        targetTokens: 2000,
      },
    },
  },
});

Dart


final _liveModel = FirebaseAI.googleAI().liveGenerativeModel(
  model: 'gemini-2.5-flash-native-audio-preview-12-2025',
  // Enable context window compression.
  // (Optional) Configure the number of tokens in the context window that triggers the compression.
  liveGenerationConfig: LiveGenerationConfig(
    responseModalities: [ResponseModalities.audio],
    contextWindowCompression: ContextWindowCompressionConfig(
      triggerTokens: 10000,
      slidingWindow: SlidingWindow(targetTokens: 2000),
    ),
  ),
);

Unity


var liveModel = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetLiveModel(
    modelName: "gemini-2.5-flash-native-audio-preview-12-2025",
    // Enable context window compression.
    // (Optional) Configure the number of tokens in the context window that triggers the compression.
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Audio },
        contextWindowCompression: new ContextWindowCompressionConfig(
            triggerTokens: 10000,
            slidingWindow: new SlidingWindow(targetTokens: 2000)
        )
    )
);

偵測工作階段何時結束

單一 WebSocket 連續連線的最長持續時間約為 10 分鐘。連線結束前 60 秒,系統會向用戶端傳送即將結束通知,方便您採取進一步行動 (例如繼續工作階段)。

以下範例說明如何監聽 going away 通知,偵測即將終止的連線:

Swift

for try await response in session.responses {
  switch response.payload {

  case .goingAwayNotice(let goingAwayNotice):
    // Prepare for the session to close soon
    if let timeLeft = goingAwayNotice.timeLeft {
        print("Server going away in \(timeLeft) seconds")
    }
  }
}

Kotlin

for (response in session.responses) {
    when (val message = response.payload) {
        is LiveServerGoAway -> {
            // Prepare for the session to close soon
            val remaining = message.timeLeft
            logger.info("Server going away in $remaining")
        }
    }
}

Java

session.getResponses().forEach(response -> {
    if (response.getPayload() instanceof LiveServerResponse.GoingAwayNotice) {
        LiveServerResponse.GoingAwayNotice notice = (LiveServerResponse.GoingAwayNotice) response.getPayload();
        // Prepare for the session to close soon
        Duration timeLeft = notice.getTimeLeft();
    }
});

Web

for await (const message of session.receive()) {
  switch (message.type) {

  ...
  case "goingAwayNotice":
    console.log("Server going away. Time left:", message.timeLeft);
    break;
  }
}

Dart

Future _handleLiveServerMessage(LiveServerResponse response) async {
  final message = response.message;
  if (message is GoingAwayNotice) {
     // Prepare for the session to close soon
     developer.log('Server going away. Time left: ${message.timeLeft}');
  }
}

Unity

foreach (var response in session.Responses) {
    if (response.Payload is LiveSessionGoingAway notice) {
        // Prepare for the session to close soon
        TimeSpan timeLeft = notice.TimeLeft;
        Debug.Log($"Server going away notice received. Remaining: {timeLeft}");
    }
}

繼續工作階段

Live API支援繼續工作階段,避免遺失對話內容。每個工作階段都有控制代碼,可用於下列用途:

  • 在連線時間限制到期前維持工作階段

    單一 WebSocket 連續連線的最長持續時間約為 10 分鐘。您可以監聽 going away 通知,偵測連線即將結束的時間,然後使用工作階段控制代碼建立新連線,延長工作階段時間。

  • 連線中斷後立即繼續工作階段

    如果連線在達到連線時間上限前終止或中斷 (例如從 Wi-Fi 切換至 5G),伺服器會將工作階段狀態保留約 10 分鐘。在這段期間內,您可以使用工作階段控制代碼建立新連線,以恢復工作階段。

  • 在延長時間範圍後繼續工作階段

    連線結束後,伺服器會將工作階段狀態保留幾小時。 在這段時間內,您可以使用工作階段控制代碼建立新連線,以繼續工作階段。請注意,這兩個Gemini API供應商的回溯期不同:Gemini Developer API2 小時Vertex AI Gemini API 則為 24 小時

根據預設,系統會停用工作階段續傳功能。如要啟用工作階段續傳功能,請在建立新連線時傳遞空白的續傳設定。啟用後,伺服器會定期傳送更新,其中包含工作階段繼續作業控制代碼。如果工作階段中斷,您可以重新連線並傳遞這個控制代碼,以繼續進行工作階段,且內容不會遺失。

以下範例顯示了兩種繼續工作階段的方式:

Swift

// Local variable to save the active session handle
var activeSessionHandle: String?

// Initialize the session. Passing an empty config requests the server to send SessionResumptionUpdate
var session = try await liveModel.connect(
  sessionResumption: SessionResumptionConfig()
)

// Start receiving responses
for try await message in session.responses {
  // Check for new session handles inside your message handling loop
  switch message.payload {
  case let .sessionResumptionUpdate(updateMessage):
    guard let newHandle = updateMessage.newHandle, updateMessage.resumable else {
      continue
    }
    activeSessionHandle = newHandle
    print("SessionResumptionUpdate: handle \(newHandle)")
  // ... handle other LiveServerMessage types ...
  default:
    break
  }
}

// The following are alternative options to resume a session. Choose only one.

// Option 1: Create and connect a session to resume with the saved handle
if let handle = activeSessionHandle {
  session = try await liveModel.connect(
    sessionResumption: SessionResumptionConfig(handle: handle)
  )
}

// Option 2: Resume the session directly on an existing session object
if let handle = activeSessionHandle {
  try await session.resumeSession(
    sessionResumption: SessionResumptionConfig(handle: handle)
  )
}

Kotlin

// Local variable to save the active session handle
var activeSessionHandle: String? = null

// Initialize the session. Passing an empty config requests the server to send SessionResumptionUpdate
var session = liveModel.connect(
    sessionResumption = SessionResumptionConfig()
)

// Start receiving responses
session.receive().collect { message ->
    // Process other received response types...

    // Check for new session handles inside your message handling loop
    if (message is LiveSessionResumptionUpdate) {
        if (message.resumable == true && message.newHandle != null) {
            activeSessionHandle = message.newHandle
            Log.d("TAG", "SessionResumptionUpdate: handle ${message.newHandle}")
        }
    }
}

// The following are alternative options to resume a session. Choose only one.

// Option 1: Create and connect a session to resume with the saved handle
activeSessionHandle?.let { handle ->
    session = liveModel.connect(
        sessionResumption = SessionResumptionConfig(handle = handle)
    )
}

// Option 2: Resume the session directly on an existing session object
activeSessionHandle?.let { handle ->
    session.resumeSession(
        sessionResumption = SessionResumptionConfig(handle = handle)
    )
}

Java

For Java, session resumption is not yet supported. Check back soon!

Web

// Local variable to save the active session handle
let activeSessionHandle = null;

// Initialize the session. Passing an empty object requests the server to send SessionResumptionUpdate
let session = await liveModel.connect({});

// Start receiving responses
for await (const message of session.receive()) {
  // Process other received response types...

  // Check for new session handles inside your message handling loop
  if (message.type === 'sessionResumptionUpdate') {
    if (message.resumable && message.newHandle) {
      activeSessionHandle = message.newHandle;
      console.log(`SessionResumptionUpdate: handle ${activeSessionHandle}`);
    }
  }
}

// The following are alternative options to resume a session. Choose only one.

// Option 1: Create and connect a session to resume with the saved handle
if (activeSessionHandle) {
  session = await liveModel.connect({
    handle: activeSessionHandle
  });
}

// Option 2: Resume the session directly on an existing session object
if (activeSessionHandle) {
  await session.resumeSession({
    handle: activeSessionHandle
  });
}

Dart

// Local variable to save the active session handle
String? _activeSessionHandle;

// Initialize the session. Passing an empty config requests the server to send SessionResumptionUpdate
var _session = await _liveModel.connect(
  sessionResumption: SessionResumptionConfig(),
);

// Start receiving responses
await for (final message in _session.receive()) {
  // Process other received response types...

  // Check for new session handles inside your message handling loop
  if (message is SessionResumptionUpdate &&
      message.resumable != null &&
      message.resumable!) {
    _activeSessionHandle = message.newHandle;
    log('SessionResumptionUpdate: handle ${message.newHandle}');
  }
}

// The following are alternative options to resume a session. Choose only one.

// Option 1: Create and connect a session to resume with the saved handle
if (_activeSessionHandle != null) {
  _session = await _liveModel.connect(
    sessionResumption: SessionResumptionConfig.resume(_activeSessionHandle!),
  );
}

// Option 2: Alternatively, resume the session directly on an existing session object
if (_activeSessionHandle != null) {
  await _session.resumeSession(
    sessionResumption: SessionResumptionConfig.resume(_activeSessionHandle!),
  );
}

Unity

// Local variable to save the active session handle
string activeSessionHandle = null;

// Initialize the session. Passing an empty config requests the server to send SessionResumptionUpdate
var session = await liveModel.ConnectAsync(
    sessionResumption: new SessionResumptionConfig()
);

// Start receiving responses
await foreach (var response in session.ReceiveAsync())
{
  // Process other received response types...

  // Check for new session handles inside your message handling loop
  if (response.Message is LiveSessionResumptionUpdate updateMessage)
  {
    if (updateMessage.Resumable == true && !string.IsNullOrEmpty(updateMessage.NewHandle))
    {
      activeSessionHandle = updateMessage.NewHandle;
      Debug.Log($"SessionResumptionUpdate: handle {activeSessionHandle}");
    }
  }
}

// The following are alternative options to resume a session. Choose only one.

// Option 1: Create and connect a session to resume with the saved handle
if (!string.IsNullOrEmpty(activeSessionHandle)) {
  session = await liveModel.ConnectAsync(
      sessionResumption: new SessionResumptionConfig(activeSessionHandle)
  );
}

// Option 2: Resume the session directly on an existing session object
if (!string.IsNullOrEmpty(activeSessionHandle)) {
  await session.ResumeSessionAsync(
      sessionResumption: new SessionResumptionConfig(activeSessionHandle)
  );
}