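The Gemini Live API processes continuous streams of audio or text called sessions. You can manage the session lifecycle, from the initial handshake to graceful termination.
Session limits
For the Live API, a session is a persistent connection over which input and output are streamed continuously.
If a session exceeds any of the following limits, the connection is terminated. However, the Live API provides options (described later on this page) for handling these session limits.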
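- The session context window is limited to 128,000 tokens.
  Because of the context window limit, the following are approximate maximum conversation lengths based on input modality:
  - Audio-only input sessions are limited to 15 minutes.
  - Video and audio input sessions are limited to 2 minutes.
- The connection duration is limited to about 10 minutes. You receive a going away notification 60 seconds before the connection ends.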
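Here are some ways to handle these session limits:
- Compress the session context window: let the server automatically keep the context size within the limit.
- Resume the session: if the network drops briefly or you receive a going away notification, resume the session so that conversation context isn't lost.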
Get started
See the Live API getting started guide for a complete code snippet that shows how to start a session.
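Update mid-session
Live API models support the following advanced capabilities for mid-session updates: adding incremental content updates and updating system instructions.
Add incremental content updates
You can add incremental updates during an active session. Use them to send text input, establish session context, or restore session context.
For longer contexts, we recommend providing a single message summary to free up the context window for subsequent interactions.
For short contexts, you can send turn-by-turn interactions that represent the exact sequence of events, as shown in the following snippets.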
Swift
// Define initial turns (history/context).
let turns: [ModelContent] = [
ModelContent(role: "user", parts: [TextPart("What is the capital of France?")]),
ModelContent(role: "model", parts: [TextPart("Paris")]),
]
// Send history, keeping the conversational turn OPEN (false).
await session.sendContent(turns, turnComplete: false)
// Define the new user query.
let newTurn: [ModelContent] = [
ModelContent(role: "user", parts: [TextPart("What is the capital of Germany?")]),
]
// Send the final query, CLOSING the turn (true) to trigger the model response.
await session.sendContent(newTurn, turnComplete: true)
Kotlin
Not yet supported for Android apps - check back soon!
Java
Not yet supported for Android apps - check back soon!
Web
const turns = [{ text: "Hello from the user!" }];
await session.send(
turns,
false // turnComplete: false
);
console.log("Sent history. Waiting for next input...");
// Define the new user query.
const newTurn = [{ text: "And what is the capital of Germany?" }];
// Send the final query, CLOSING the turn (true) to trigger the model response.
await session.send(
newTurn,
true // turnComplete: true
);
console.log("Sent final query. Model response expected now.");
Dart
// Define initial turns (history/context).
final List<Content> turns = [
Content(
"user",
[Part.text("What is the capital of France?")],
),
Content(
"model",
[Part.text("Paris")],
),
];
// Send history, keeping the conversational turn OPEN (false).
await session.send(
input: turns,
turnComplete: false,
);
// Define the new user query.
final List<Content> newTurn = [
Content(
"user",
[Part.text("What is the capital of Germany?")],
),
];
// Send the final query, CLOSING the turn (true) to trigger the model response.
await session.send(
input: newTurn,
turnComplete: true,
);
Unity
// Define initial turns (history/context).
List<ModelContent> turns = new List<ModelContent> {
new ModelContent("user", new ModelContent.TextPart("What is the capital of France?") ),
new ModelContent("model", new ModelContent.TextPart("Paris") ),
};
// Send history, keeping the conversational turn OPEN (false).
foreach (ModelContent turn in turns)
{
await session.SendAsync(
content: turn,
turnComplete: false
);
}
// Define the new user query.
ModelContent newTurn = ModelContent.Text("What is the capital of Germany?");
// Send the final query, CLOSING the turn (true) to trigger the model response.
await session.SendAsync(
content: newTurn,
turnComplete: true
);
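The choice between a single summary message and turn-by-turn history, described earlier in this section, can be sketched as a small helper. This is a minimal illustration, not part of any SDK: `buildContextUpdate`, the `maxTurnByTurnChars` threshold, and the plain `{ role, text }` objects are all assumptions made for this example. Map the result to your SDK's content types before sending it.

```javascript
// Hypothetical helper (not an SDK function): decides how to seed a session
// with prior context. `maxTurnByTurnChars` is an illustrative threshold.
function buildContextUpdate(history, summary, maxTurnByTurnChars = 500) {
  const totalChars = history.reduce((sum, turn) => sum + turn.text.length, 0);
  if (totalChars > maxTurnByTurnChars && summary) {
    // Long context: send one summary message instead of every turn.
    return [{ role: "user", text: `Summary of the conversation so far: ${summary}` }];
  }
  // Short context: keep the exact sequence of turns.
  return history.map((turn) => ({ role: turn.role, text: turn.text }));
}

const shortHistory = [
  { role: "user", text: "What is the capital of France?" },
  { role: "model", text: "Paris" },
];
console.log(buildContextUpdate(shortHistory, "The user asked about France.").length); // 2
```

The summary text itself is supplied by the caller. How you produce it (for example, with a separate model request) is outside the scope of this sketch.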
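Update system instructions mid-session
Note: This feature is available only when you use the Vertex AI Gemini API as your API provider.
You can update the system instructions during an active session. Use this to adapt the model's responses, for example to change the response language or adjust the tone.
To update the system instructions mid-session, send text content with the system role. The updated system instructions remain in effect for the rest of the session.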
Swift
await session.sendContent(
[ModelContent(
role: "system",
parts: [TextPart("new system instruction")]
)],
turnComplete: false
)
Kotlin
Not yet supported for Android apps - check back soon!
Java
Not yet supported for Android apps - check back soon!
Web
Not yet supported for Web apps - check back soon!
Dart
try {
await _session.send(
input: Content(
'system',
[Part.text('new system instruction')],
),
turnComplete: false,
);
} catch (e) {
print('Failed to update system instructions: $e');
}
Unity
try
{
await session.SendAsync(
content: new ModelContent(
"system",
new ModelContent.TextPart("new system instruction")
),
turnComplete: false
);
}
catch (Exception e)
{
Debug.LogError($"Failed to update system instructions: {e.Message}");
}
Compress the context window
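The Live API session context window stores real-time streamed data (25 tokens per second for audio and 258 tokens per second for video) as well as other content, including text inputs and model outputs. All Live API models have a session context window limit of 128,000 tokens.
By default, because of the context window limit, the following are approximate maximum session lengths based on input modality:
- Audio-only input sessions are limited to 15 minutes.
- Video and audio input sessions are limited to 2 minutes.
In long conversations, the history of audio and/or video tokens accumulates as the conversation progresses. If this history exceeds the model's limit, the model might hallucinate or slow down, or the session might be forcibly terminated.
To enable longer sessions, turn on context window compression by setting the contextWindowCompression field in LiveGenerationConfig. When it is enabled, the server uses a sliding-window mechanism to automatically discard or summarize the oldest turns, keeping the context size within the default or specified limit. System instructions are never discarded and always remain at the beginning of the context window.
From the user's perspective, the session can theoretically last indefinitely, because the "memory" is continuously managed.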
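You can configure the sliding-window mechanism and, optionally, the number of tokens that triggers compression (see the available settings and values below). Here are some high-level considerations for these settings:
- Setting targetTokens very low frees up more context space for continuous streaming, but the model quickly "forgets" earlier turns.
- Setting targetTokens close to triggerTokens preserves more memory, but triggers the compression routine more often.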
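| Setting | Default for the sliding window if not set in the config | Minimum | Maximum |
|---|---|---|---|
| triggerTokens: context length before compression is triggered | 80% of the model's context window limit | 5,000 | 128,000 |
| targetTokens: target number of tokens to retain | 50% of the triggerTokens value | 0 | 128,000 |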
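The defaults and ranges in the table can be worked through in a few lines. The following is a minimal sketch, assuming the documented defaults apply exactly as percentages of the 128,000-token limit. `resolveCompressionConfig` is a hypothetical helper written for this illustration, not an SDK function, and the server may apply these rules differently.

```javascript
const CONTEXT_WINDOW_LIMIT = 128000; // session context window limit from the table

// Hypothetical helper: fills in the documented defaults and checks the documented ranges.
function resolveCompressionConfig({ triggerTokens, targetTokens } = {}) {
  // Default trigger: 80% of the context window limit (integer arithmetic).
  const trigger = triggerTokens ?? Math.floor((CONTEXT_WINDOW_LIMIT * 80) / 100);
  // Default target: 50% of the trigger value.
  const target = targetTokens ?? Math.floor(trigger / 2);
  if (trigger < 5000 || trigger > CONTEXT_WINDOW_LIMIT) {
    throw new RangeError(`triggerTokens must be between 5000 and ${CONTEXT_WINDOW_LIMIT}, got ${trigger}`);
  }
  if (target < 0 || target > CONTEXT_WINDOW_LIMIT) {
    throw new RangeError(`targetTokens must be between 0 and ${CONTEXT_WINDOW_LIMIT}, got ${target}`);
  }
  return { triggerTokens: trigger, slidingWindow: { targetTokens: target } };
}

const defaults = resolveCompressionConfig();
console.log(defaults.triggerTokens, defaults.slidingWindow.targetTokens); // 102400 51200
```

With no values set, compression would start at 102,400 tokens and retain roughly 51,200 tokens.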
Swift
// Initialize the Gemini Developer API backend service
let liveModel = FirebaseAI.firebaseAI(backend: .googleAI()).liveModel(
modelName: "gemini-2.5-flash-native-audio-preview-12-2025",
// Enable context window compression.
// (Optional) Configure the number of tokens in the context window that triggers the compression.
generationConfig: LiveGenerationConfig(
responseModalities: [.audio],
contextWindowCompression: ContextWindowCompressionConfig(
triggerTokens: 10000,
slidingWindow: SlidingWindow(
targetTokens: 2000
)
)
)
)
Kotlin
// Initialize the Gemini Developer API backend service
val liveModel = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
modelName = "gemini-2.5-flash-native-audio-preview-12-2025",
// Enable context window compression.
// (Optional) Configure the number of tokens in the context window that triggers the compression.
generationConfig = liveGenerationConfig {
responseModality = ResponseModality.AUDIO
contextWindowCompression = ContextWindowCompressionConfig(
triggerTokens = 10000,
slidingWindow = SlidingWindow(targetTokens = 2000)
)
}
)
Java
// Initialize the Gemini Developer API backend service
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.googleAI()).liveModel(
"gemini-2.5-flash-native-audio-preview-12-2025",
// Enable context window compression.
// (Optional) Configure the number of tokens in the context window that triggers the compression.
new LiveGenerationConfig.Builder()
.setResponseModality(ResponseModality.AUDIO)
.setContextWindowCompression(
new ContextWindowCompressionConfig(10000, new SlidingWindow(2000))
)
.build()
);
Web
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });
const liveModel = getLiveGenerativeModel(ai, {
model: "gemini-2.5-flash-native-audio-preview-12-2025",
// Enable context window compression.
// (Optional) Configure the number of tokens in the context window that triggers the compression.
generationConfig: {
responseModalities: [ResponseModality.AUDIO],
contextWindowCompression: {
triggerTokens: 10000,
slidingWindow: {
targetTokens: 2000,
},
},
},
});
Dart
final _liveModel = FirebaseAI.googleAI().liveGenerativeModel(
model: 'gemini-2.5-flash-native-audio-preview-12-2025',
// Enable context window compression.
// (Optional) Configure the number of tokens in the context window that triggers the compression.
liveGenerationConfig: LiveGenerationConfig(
responseModalities: [ResponseModalities.audio],
contextWindowCompression: ContextWindowCompressionConfig(
triggerTokens: 10000,
slidingWindow: SlidingWindow(targetTokens: 2000),
),
),
);
Unity
var liveModel = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetLiveModel(
modelName: "gemini-2.5-flash-native-audio-preview-12-2025",
// Enable context window compression.
// (Optional) Configure the number of tokens in the context window that triggers the compression.
liveGenerationConfig: new LiveGenerationConfig(
responseModalities: new[] { ResponseModality.Audio },
contextWindowCompression: new ContextWindowCompressionConfig(
triggerTokens: 10000,
slidingWindow: new SlidingWindow(targetTokens: 2000)
)
)
);
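The trade-off between targetTokens and triggerTokens can be illustrated with a toy simulation. This sketch rests on a simplified model that is an assumption, not the server's actual algorithm: tokens accumulate at a constant rate (the 25 tokens per second stated for audio), and whenever the total reaches triggerTokens, the context is cut back to exactly targetTokens. `simulateCompression` is a hypothetical function written for this illustration.

```javascript
// Simplified model (an assumption): constant token rate, and a hard cut back
// to targetTokens each time the context reaches triggerTokens.
function simulateCompression({ triggerTokens, targetTokens, tokensPerSecond, seconds }) {
  let contextTokens = 0;
  let compressions = 0;
  for (let s = 0; s < seconds; s++) {
    contextTokens += tokensPerSecond;
    if (contextTokens >= triggerTokens) {
      contextTokens = targetTokens; // oldest content is dropped
      compressions++;
    }
  }
  return { contextTokens, compressions };
}

// One hour of audio-only input at 25 tokens per second.
const base = { triggerTokens: 10000, tokensPerSecond: 25, seconds: 3600 };
const lowTarget = simulateCompression({ ...base, targetTokens: 2000 });
const highTarget = simulateCompression({ ...base, targetTokens: 9000 });
console.log(lowTarget.compressions, highTarget.compressions); // 11 81
```

In this model, the low target compresses rarely but keeps only 2,000 tokens of history after each pass, while the high target keeps 9,000 tokens but compresses about seven times as often.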
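Detect when a session is about to end
The maximum duration of a single continuous WebSocket connection is about 10 minutes. A going away notification is sent 60 seconds before the connection ends.
The following examples show how to detect an impending connection termination by listening for the going away notification: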
Swift
for try await response in session.responses {
switch response.payload {
case .goingAwayNotice(let goingAwayNotice):
// Prepare for the session to close soon
if let timeLeft = goingAwayNotice.timeLeft {
print("Server going away in \(timeLeft) seconds")
}
default:
// Handle other message types here
break
}
}
Kotlin
for (response in session.responses) {
when (val message = response.payload) {
is LiveServerGoAway -> {
// Prepare for the session to close soon
val remaining = message.timeLeft
logger.info("Server going away in $remaining")
}
}
}
Java
session.getResponses().forEach(response -> {
if (response.getPayload() instanceof LiveServerResponse.GoingAwayNotice) {
LiveServerResponse.GoingAwayNotice notice = (LiveServerResponse.GoingAwayNotice) response.getPayload();
// Prepare for the session to close soon
Duration timeLeft = notice.getTimeLeft();
}
});
Web
for await (const message of session.receive()) {
switch (message.type) {
...
case "goingAwayNotice":
console.log("Server going away. Time left:", message.timeLeft);
break;
}
}
Dart
Future<void> _handleLiveServerMessage(LiveServerResponse response) async {
final message = response.message;
if (message is GoingAwayNotice) {
// Prepare for the session to close soon
developer.log('Server going away. Time left: ${message.timeLeft}');
}
}
Unity
foreach (var response in session.Responses) {
if (response.Payload is LiveSessionGoingAway notice) {
// Prepare for the session to close soon
TimeSpan timeLeft = notice.TimeLeft;
Debug.Log($"Server going away notice received. Remaining: {timeLeft}");
}
}
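After a going away notification arrives, an app typically needs to decide when to open a replacement connection. The following is a minimal sketch of that calculation. `reconnectDelayMs` and its safety margin are hypothetical, and the sketch assumes that timeLeft is expressed in seconds, which you should verify for your SDK. The message object is simulated rather than received from a real session.

```javascript
// Hypothetical helper: how long to wait before reconnecting after a going away notice.
// Assumes timeLeft is in seconds; the safety margin is an illustrative choice.
function reconnectDelayMs(timeLeftSeconds, safetyMarginSeconds = 10) {
  const delaySeconds = Math.max(0, timeLeftSeconds - safetyMarginSeconds);
  return delaySeconds * 1000;
}

// Simulated message, shaped like the one handled in the Web snippet above.
const goingAway = { type: "goingAwayNotice", timeLeft: 60 };
if (goingAway.type === "goingAwayNotice") {
  console.log(`Reconnect in ${reconnectDelayMs(goingAway.timeLeft)} ms`); // Reconnect in 50000 ms
}
```

Reconnecting slightly before the deadline leaves room for the new connection's handshake. If less time remains than the margin, the helper returns 0 so that the app reconnects immediately.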
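Resume a session
The Live API supports session resumption so that conversation context isn't lost. Each session has a handle that you can use for the following purposes:
- Keep a session going past the connection time limit
  The maximum duration of a single continuous WebSocket connection is about 10 minutes. You can listen for the going away notification to detect when the connection is about to end, and then use the session handle to establish a new connection and extend the session.
- Resume a session immediately after a disconnection
  If the connection terminates or drops before it reaches the connection time limit (for example, when switching from Wi-Fi to 5G), the server retains the session state for about 10 minutes. During this time, you can use the session handle to establish a new connection and resume the session.
- Resume a session after an extended period
  After a connection ends, the server retains the session state for a few hours. During this time, you can use the session handle to establish a new connection and resume the session. Note that the retention window differs between the two Gemini API providers: 2 hours for the Gemini Developer API and 24 hours for the Vertex AI Gemini API.
Session resumption is disabled by default. To enable it, pass an empty resumption configuration when you establish a new connection. Once it is enabled, the server periodically sends updates that contain a session resumption handle. If the session is interrupted, you can reconnect and pass this handle to resume the session without losing context.
The following examples show two ways to resume a session: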
Swift
// Local variable to save the active session handle
var activeSessionHandle: String?
// Initialize the session. Passing an empty config requests the server to send SessionResumptionUpdate
var session = try await liveModel.connect(
sessionResumption: SessionResumptionConfig()
)
// Start receiving responses
for try await message in session.responses {
// Check for new session handles inside your message handling loop
switch message.payload {
case let .sessionResumptionUpdate(updateMessage):
guard let newHandle = updateMessage.newHandle, updateMessage.resumable else {
continue
}
activeSessionHandle = newHandle
print("SessionResumptionUpdate: handle \(newHandle)")
// ... handle other LiveServerMessage types ...
default:
break
}
}
// The following are alternative options to resume a session. Choose only one.
// Option 1: Create and connect a session to resume with the saved handle
if let handle = activeSessionHandle {
session = try await liveModel.connect(
sessionResumption: SessionResumptionConfig(handle: handle)
)
}
// Option 2: Resume the session directly on an existing session object
if let handle = activeSessionHandle {
try await session.resumeSession(
sessionResumption: SessionResumptionConfig(handle: handle)
)
}
Kotlin
// Local variable to save the active session handle
var activeSessionHandle: String? = null
// Initialize the session. Passing an empty config requests the server to send SessionResumptionUpdate
var session = liveModel.connect(
sessionResumption = SessionResumptionConfig()
)
// Start receiving responses
session.receive().collect { message ->
// Process other received response types...
// Check for new session handles inside your message handling loop
if (message is LiveSessionResumptionUpdate) {
if (message.resumable == true && message.newHandle != null) {
activeSessionHandle = message.newHandle
Log.d("TAG", "SessionResumptionUpdate: handle ${message.newHandle}")
}
}
}
// The following are alternative options to resume a session. Choose only one.
// Option 1: Create and connect a session to resume with the saved handle
activeSessionHandle?.let { handle ->
session = liveModel.connect(
sessionResumption = SessionResumptionConfig(handle = handle)
)
}
// Option 2: Resume the session directly on an existing session object
activeSessionHandle?.let { handle ->
session.resumeSession(
sessionResumption = SessionResumptionConfig(handle = handle)
)
}
Java
For Java, session resumption is not yet supported. Check back soon!
Web
// Local variable to save the active session handle
let activeSessionHandle = null;
// Initialize the session. Passing an empty object requests the server to send SessionResumptionUpdate
let session = await liveModel.connect({});
// Start receiving responses
for await (const message of session.receive()) {
// Process other received response types...
// Check for new session handles inside your message handling loop
if (message.type === 'sessionResumptionUpdate') {
if (message.resumable && message.newHandle) {
activeSessionHandle = message.newHandle;
console.log(`SessionResumptionUpdate: handle ${activeSessionHandle}`);
}
}
}
// The following are alternative options to resume a session. Choose only one.
// Option 1: Create and connect a session to resume with the saved handle
if (activeSessionHandle) {
session = await liveModel.connect({
handle: activeSessionHandle
});
}
// Option 2: Resume the session directly on an existing session object
if (activeSessionHandle) {
await session.resumeSession({
handle: activeSessionHandle
});
}
Dart
// Local variable to save the active session handle
String? _activeSessionHandle;
// Initialize the session. Passing an empty config requests the server to send SessionResumptionUpdate
var _session = await _liveModel.connect(
sessionResumption: SessionResumptionConfig(),
);
// Start receiving responses
await for (final message in _session.receive()) {
// Process other received response types...
// Check for new session handles inside your message handling loop
if (message is SessionResumptionUpdate &&
message.resumable != null &&
message.resumable!) {
_activeSessionHandle = message.newHandle;
log('SessionResumptionUpdate: handle ${message.newHandle}');
}
}
// The following are alternative options to resume a session. Choose only one.
// Option 1: Create and connect a session to resume with the saved handle
if (_activeSessionHandle != null) {
_session = await _liveModel.connect(
sessionResumption: SessionResumptionConfig.resume(_activeSessionHandle!),
);
}
// Option 2: Alternatively, resume the session directly on an existing session object
if (_activeSessionHandle != null) {
await _session.resumeSession(
sessionResumption: SessionResumptionConfig.resume(_activeSessionHandle!),
);
}
Unity
// Local variable to save the active session handle
string activeSessionHandle = null;
// Initialize the session. Passing an empty config requests the server to send SessionResumptionUpdate
var session = await liveModel.ConnectAsync(
sessionResumption: new SessionResumptionConfig()
);
// Start receiving responses
await foreach (var response in session.ReceiveAsync())
{
// Process other received response types...
// Check for new session handles inside your message handling loop
if (response.Message is LiveSessionResumptionUpdate updateMessage)
{
if (updateMessage.Resumable == true && !string.IsNullOrEmpty(updateMessage.NewHandle))
{
activeSessionHandle = updateMessage.NewHandle;
Debug.Log($"SessionResumptionUpdate: handle {activeSessionHandle}");
}
}
}
// The following are alternative options to resume a session. Choose only one.
// Option 1: Create and connect a session to resume with the saved handle
if (!string.IsNullOrEmpty(activeSessionHandle)) {
session = await liveModel.ConnectAsync(
sessionResumption: new SessionResumptionConfig(activeSessionHandle)
);
}
// Option 2: Resume the session directly on an existing session object
if (!string.IsNullOrEmpty(activeSessionHandle)) {
await session.ResumeSessionAsync(
sessionResumption: new SessionResumptionConfig(activeSessionHandle)
);
}
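The handle bookkeeping in the snippets above can be combined with the retention windows described earlier (2 hours for the Gemini Developer API, 24 hours for the Vertex AI Gemini API). The following is a minimal sketch under stated assumptions: `SessionHandleStore` and the provider keys are hypothetical names, not part of any SDK, and the retention check is only a client-side estimate. The server remains the authority on whether a handle can still be resumed.

```javascript
// Retention windows described in this section, in milliseconds.
const RETENTION_MS = new Map([
  ["gemini-developer-api", 2 * 60 * 60 * 1000],
  ["vertex-ai-gemini-api", 24 * 60 * 60 * 1000],
]);

// Hypothetical helper class: tracks the latest resumable handle and
// estimates whether it is still within the provider's retention window.
class SessionHandleStore {
  constructor(provider) {
    if (!RETENTION_MS.has(provider)) throw new Error(`Unknown provider: ${provider}`);
    this.retentionMs = RETENTION_MS.get(provider);
    this.handle = null;
    this.disconnectedAt = null;
  }
  // Call for every session resumption update message.
  onUpdate({ resumable, newHandle }) {
    if (resumable && newHandle) this.handle = newHandle;
  }
  // Call when the connection ends.
  onDisconnect(nowMs = Date.now()) {
    this.disconnectedAt = nowMs;
  }
  // Returns the handle if resumption is still plausible, otherwise null.
  handleForResume(nowMs = Date.now()) {
    if (!this.handle) return null;
    const expired = this.disconnectedAt !== null && nowMs - this.disconnectedAt > this.retentionMs;
    return expired ? null : this.handle;
  }
}

const store = new SessionHandleStore("gemini-developer-api");
store.onUpdate({ resumable: true, newHandle: "handle-1" });
store.onDisconnect(0);
console.log(store.handleForResume(60 * 60 * 1000)); // handle-1 (1 hour after disconnect)
console.log(store.handleForResume(3 * 60 * 60 * 1000)); // null (3 hours after disconnect)
```

When `handleForResume` returns null, the app would start a fresh session instead of attempting to resume.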