Configuration options for the Live API

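Even with a basic implementation of the Live API, you can build engaging and powerful interactions for your users. You can optionally customize the experience further with the following configuration options:

Response voice and language
Transcriptions for audio input and output
Voice activity detection (VAD)
Session management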

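Response voice and language

You can configure the model to respond in a specific voice, and you can influence it to respond in different languages.

Specify the response voice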

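The Live API uses Chirp 3 to support synthesized speech responses with HD voices.

If you don't specify a response voice, the default is Puck.

Response voice options

For demos of what each voice sounds like, see Chirp 3: HD voices.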

Zephyr -- Bright
Kore -- Firm
Orus -- Firm
Autonoe -- Bright
Umbriel -- Easy-going
Erinome -- Clear
Laomedeia -- Upbeat
Schedar -- Even
Achird -- Friendly
Sadachbia -- Lively
Puck -- Upbeat
Fenrir -- Excitable
Aoede -- Breezy
Enceladus -- Breathy
Algieba -- Smooth
Algenib -- Gravelly
Achernar -- Soft
Gacrux -- Mature
Zubenelgenubi -- Casual
Sadaltager -- Knowledgeable
Charon -- Informative
Leda -- Youthful
Callirrhoe -- Easy-going
Iapetus -- Clear
Despina -- Smooth
Rasalgethi -- Informative
Alnilam -- Firm
Pulcherrima -- Forward
Vindemiatrix -- Gentle
Sulafat -- Warm

To specify a response voice, set the voice name in the speechConfig object as part of the model configuration.

Swift


// ...

let liveModel = FirebaseAI.firebaseAI(backend: .googleAI()).liveModel(
  modelName: "gemini-2.5-flash-native-audio-preview-12-2025",
  // Configure the model to use a specific voice for its audio response
  generationConfig: LiveGenerationConfig(
    responseModalities: [.audio],
    speech: SpeechConfig(voiceName: "VOICE_NAME")
  )
)

// ...

Kotlin


// ...

val model = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
    modelName = "gemini-2.5-flash-native-audio-preview-12-2025",
    // Configure the model to use a specific voice for its audio response
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.AUDIO
        speechConfig = SpeechConfig(voice = Voice("VOICE_NAME"))
    }
)

// ...

Java


// ...

LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.googleAI()).liveModel(
    "gemini-2.5-flash-native-audio-preview-12-2025",
    // Configure the model to use a specific voice for its audio response
    new LiveGenerationConfig.Builder()
        .setResponseModality(ResponseModality.AUDIO)
        .setSpeechConfig(new SpeechConfig(new Voice("VOICE_NAME")))
        .build()
);

// ...

Web


// ...

const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

const liveModel = getLiveGenerativeModel(ai, {
  model: "gemini-2.5-flash-native-audio-preview-12-2025",
  // Configure the model to use a specific voice for its audio response
  generationConfig: {
    responseModalities: [ResponseModality.AUDIO],
    speechConfig: {
      voiceConfig: {
        prebuiltVoiceConfig: { voiceName: "VOICE_NAME" },
      },
    },
  },
});

// ...

Dart


// ...

final _liveModel = FirebaseAI.googleAI().liveGenerativeModel(
  model: 'gemini-2.5-flash-native-audio-preview-12-2025',
  // Configure the model to use a specific voice for its audio response
  liveGenerationConfig: LiveGenerationConfig(
    responseModalities: [ResponseModalities.audio],
    speechConfig: SpeechConfig(voiceName: 'VOICE_NAME'),
  ),
);

// ...

Unity


// ...

var liveModel = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetLiveModel(
    modelName: "gemini-2.5-flash-native-audio-preview-12-2025",
    // Configure the model to use a specific voice for its audio response
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Audio },
        speechConfig: SpeechConfig.UsePrebuiltVoice("VOICE_NAME")
    )
);

// ...
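
Because the voice name is passed as a plain string in every sample above, it can be useful to check it before building the model configuration. The following TypeScript sketch is illustrative only: `KNOWN_VOICES`, `DEFAULT_VOICE`, and `resolveVoiceName` are hypothetical names that are not part of any Firebase SDK, and the list simply mirrors the voices named on this page.

```typescript
// Hypothetical helper, not part of the Firebase SDK.
// Voice names copied from the response voice options listed on this page.
const KNOWN_VOICES: readonly string[] = [
  "Zephyr", "Kore", "Orus", "Autonoe", "Umbriel", "Erinome", "Laomedeia",
  "Schedar", "Achird", "Sadachbia", "Puck", "Fenrir", "Aoede", "Enceladus",
  "Algieba", "Algenib", "Achernar", "Gacrux", "Zubenelgenubi", "Sadaltager",
  "Charon", "Leda", "Callirrhoe", "Iapetus", "Despina", "Rasalgethi",
  "Alnilam", "Pulcherrima", "Vindemiatrix", "Sulafat",
];

// The page states that Puck is used when no voice is specified.
const DEFAULT_VOICE = "Puck";

// Returns the canonical voice name for a requested name (case-insensitive),
// falls back to the default when nothing is requested, and throws on an
// unknown name so that a typo is caught before the session is opened.
function resolveVoiceName(requested?: string): string {
  if (requested === undefined || requested.trim() === "") {
    return DEFAULT_VOICE;
  }
  const wanted = requested.trim().toLowerCase();
  const match = KNOWN_VOICES.find((voice) => voice.toLowerCase() === wanted);
  if (match === undefined) {
    throw new Error(`Unknown voice name: ${requested}`);
  }
  return match;
}
```

The returned string could then be used wherever the samples above show the VOICE_NAME placeholder.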

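Influence the response language

Important: Live API models don't support explicitly setting the response language in the configuration. Instead, the models automatically choose an appropriate language for the response.

Supported languages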

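Language	BCP-47 code	Language	BCP-47 code
Arabic (Egyptian)	ar-EG	German (Germany)	de-DE
English (US)	en-US	Spanish (US)	es-US
French (France)	fr-FR	Hindi (India)	hi-IN
Indonesian (Indonesia)	id-ID	Italian (Italy)	it-IT
Japanese (Japan)	ja-JP	Korean (Korea)	ko-KR
Portuguese (Brazil)	pt-BR	Russian (Russia)	ru-RU
Dutch (Netherlands)	nl-NL	Polish (Poland)	pl-PL
Thai (Thailand)	th-TH	Turkish (Turkey)	tr-TR
Vietnamese (Vietnam)	vi-VN	Romanian (Romania)	ro-RO
Ukrainian (Ukraine)	uk-UA	Bengali (Bangladesh)	bn-BD
English (India)	en-IN & hi-IN bundle	Marathi (India)	mr-IN
Tamil (India)	ta-IN	Telugu (India)	te-IN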

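If you want the model to respond in a non-English language, or always in one specific language, you can influence its responses with system instructions, as in the following examples:

Reinforce to the model that a non-English language may be appropriate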

Listen to the speaker carefully. If you detect a non-English language, respond
in the language you hear from the speaker. You must respond unmistakably in the
speaker's language.

Instruct the model to always respond in a specific language

RESPOND IN LANGUAGE. YOU MUST RESPOND UNMISTAKABLY IN LANGUAGE.
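
The two instruction texts above can be produced from one small function, which keeps the wording consistent across an app. This TypeScript sketch is an illustration under assumptions: `buildLanguageInstruction` is a hypothetical helper, not an SDK function, and it only builds the string. How that string is passed as a system instruction depends on the SDK you use.

```typescript
// Hypothetical helper, not part of the Firebase SDK.
// With no language, returns the "mirror the speaker" instruction;
// with a language name, returns the "always respond in X" instruction.
function buildLanguageInstruction(language?: string): string {
  const name = language?.trim();
  if (!name) {
    return [
      "Listen to the speaker carefully. If you detect a non-English language, respond",
      "in the language you hear from the speaker. You must respond unmistakably in the",
      "speaker's language.",
    ].join(" ");
  }
  // Substitute the LANGUAGE placeholder from the example above.
  const upper = name.toUpperCase();
  return `RESPOND IN ${upper}. YOU MUST RESPOND UNMISTAKABLY IN ${upper}.`;
}
```

Because the language is only influenced, not set, the model's actual output language should still be treated as best effort.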

Transcriptions for audio input and output

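As part of the model's response, you can receive transcriptions of the audio input and of the model's audio response. You set this up as part of the model configuration.

To transcribe the audio input, add inputAudioTranscription.
To transcribe the model's audio response, add outputAudioTranscription.

Note the following:

You can configure the model to return transcriptions of both the input and the output (as in the following examples), or of only one of them.
Transcriptions are streamed along with the audio, so it's best to collect them the same way you collect the text parts of each turn.
The transcription language is inferred from the audio input and from the model's audio response.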

Swift


// ...

let liveModel = FirebaseAI.firebaseAI(backend: .googleAI()).liveModel(
  modelName: "gemini-2.5-flash-native-audio-preview-12-2025",
  // Configure the model to return transcriptions of the audio input and output
  generationConfig: LiveGenerationConfig(
    responseModalities: [.audio],
    inputAudioTranscription: AudioTranscriptionConfig(),
    outputAudioTranscription: AudioTranscriptionConfig()
  )
)

var inputTranscript: String = ""
var outputTranscript: String = ""

do {
  let session = try await liveModel.connect()
  for try await response in session.responses {
    if case let .content(content) = response.payload {
      if let inputText = content.inputAudioTranscription?.text {
        // Handle transcription text of the audio input
        inputTranscript += inputText
      }

      if let outputText = content.outputAudioTranscription?.text {
        // Handle transcription text of the audio output
        outputTranscript += outputText
      }

      if content.isTurnComplete {
        // Log the transcripts after the current turn is complete
        print("Input audio: \(inputTranscript)")
        print("Output audio: \(outputTranscript)")

        // Reset the transcripts for the next turn
        inputTranscript = ""
        outputTranscript = ""
      }
    }
  }
} catch {
  // Handle error
}

// ...

Kotlin


// ...

val liveModel = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
    modelName = "gemini-2.5-flash-native-audio-preview-12-2025",
    // Configure the model to return transcriptions of the audio input and output
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.AUDIO
        inputAudioTranscription = AudioTranscriptionConfig()
        outputAudioTranscription = AudioTranscriptionConfig()
    }
)

val liveSession = liveModel.connect()

fun handleTranscription(input: Transcription?, output: Transcription?) {
    input?.text?.let { text ->
        // Handle transcription text of the audio input
        println("Input Transcription: $text")
    }
    output?.text?.let { text ->
        // Handle transcription text of the audio output
        println("Output Transcription: $text")
    }
}

liveSession.startAudioConversation(null, ::handleTranscription)

// ...

Java


// ...

ExecutorService executor = Executors.newFixedThreadPool(1);

LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.googleAI()).liveModel(
    "gemini-2.5-flash-native-audio-preview-12-2025",
    // Configure the model to return transcriptions of the audio input and output
    new LiveGenerationConfig.Builder()
        .setResponseModality(ResponseModality.AUDIO)
        .setInputAudioTranscription(new AudioTranscriptionConfig())
        .setOutputAudioTranscription(new AudioTranscriptionConfig())
        .build()
);

LiveModelFutures liveModel = LiveModelFutures.from(lm);
ListenableFuture<LiveSessionFutures> sessionFuture = liveModel.connect();

Futures.addCallback(sessionFuture, new FutureCallback<LiveSessionFutures>() {
    @Override
    public void onSuccess(LiveSessionFutures ses) {
        LiveSessionFutures session = ses;
        session.startAudioConversation((Transcription input, Transcription output) -> {
            if (input != null) {
                // Handle transcription text of the audio input
                System.out.println("Input Transcription: " + input.getText());
            }
            if (output != null) {
                // Handle transcription text of the audio output
                System.out.println("Output Transcription: " + output.getText());
            }
            return null;
        });
    }

    @Override
    public void onFailure(Throwable t) {
        // Handle exceptions
        t.printStackTrace();
    }
}, executor);

// ...

Web


// ...

const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

const liveModel = getLiveGenerativeModel(ai, {
  model: 'gemini-2.5-flash-native-audio-preview-12-2025',
  // Configure the model to return transcriptions of the audio input and output
  generationConfig: {
    responseModalities: [ResponseModality.AUDIO],
    inputAudioTranscription: {},
    outputAudioTranscription: {},
  },
});

const liveSession = await liveModel.connect();

liveSession.sendAudioRealtime({ data, mimeType: "audio/pcm" });

const messages = liveSession.receive();
for await (const message of messages) {
  switch (message.type) {
    case 'serverContent':
      if (message.inputTranscription) {
        // Handle transcription text of the audio input
        console.log(`Input transcription: ${message.inputTranscription.text}`);
      }
      if (message.outputTranscription) {
        // Handle transcription text of the audio output
        console.log(`Output transcription: ${message.outputTranscription.text}`);
      }
      // Handle other server content (modelTurn, turnComplete, interruption)
      break;
    default:
      // Handle other message types (toolCall, toolCallCancellation)
      break;
  }
}

// ...

Dart


// ...

final _liveModel = FirebaseAI.googleAI().liveGenerativeModel(
  model: 'gemini-2.5-flash-native-audio-preview-12-2025',
  // Configure the model to return transcriptions of the audio input and output
  liveGenerationConfig: LiveGenerationConfig(
    responseModalities: [ResponseModalities.audio],
    inputAudioTranscription: AudioTranscriptionConfig(),
    outputAudioTranscription: AudioTranscriptionConfig(),
  ),
);

final LiveSession _session = await _liveModel.connect();

await for (final response in _session.receive()) {
  final message = response.message;
  // Transcriptions arrive on server content messages
  if (message is LiveServerContent) {
    if (message.inputTranscription?.text case final inputText?) {
      // Handle transcription text of the audio input
      print('Input: $inputText');
    }

    if (message.outputTranscription?.text case final outputText?) {
      // Handle transcription text of the audio output
      print('Output: $outputText');
    }
  }
}

// ...

Unity


// ...

var liveModel = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetLiveModel(
    modelName: "gemini-2.5-flash-native-audio-preview-12-2025",
    // Configure the model to return transcriptions of the audio input and output
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Audio },
        inputAudioTranscription: new AudioTranscriptionConfig(),
        outputAudioTranscription: new AudioTranscriptionConfig()
    )
);

try
{
    var session = await liveModel.ConnectAsync();
    var stream = session.ReceiveAsync();
    await foreach (var response in stream) {
        if (response.Message is LiveSessionContent sessionContent) {
            if (!string.IsNullOrEmpty(sessionContent.InputTranscription?.Text)) {
                // Handle transcription text of the audio input
            }

            if (!string.IsNullOrEmpty(sessionContent.OutputTranscription?.Text)) {
                // Handle transcription text of the audio output
            }
        }
    }
}
catch (Exception e)
{
    // Handle error
}

// ...
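
Because transcriptions arrive in pieces alongside the audio, a small accumulator can gather them per turn, much as the Swift sample above does inline. The TypeScript sketch below is SDK-independent and rests on assumptions: `TranscriptChunk`, `TurnTranscript`, and `TranscriptCollector` are hypothetical names, and you would map each platform's message fields onto `TranscriptChunk` yourself.

```typescript
// Hypothetical shapes for illustration; real SDK message types differ by platform.
interface TranscriptChunk {
  inputText?: string;     // piece of the audio input transcription, if any
  outputText?: string;    // piece of the audio output transcription, if any
  turnComplete?: boolean; // true when the model has finished the current turn
}

interface TurnTranscript {
  input: string;
  output: string;
}

class TranscriptCollector {
  private input = "";
  private output = "";

  // Appends any transcription text carried by the chunk. Returns the full
  // transcript of the turn when the chunk marks the turn as complete (and
  // resets for the next turn); otherwise returns null.
  add(chunk: TranscriptChunk): TurnTranscript | null {
    if (chunk.inputText) this.input += chunk.inputText;
    if (chunk.outputText) this.output += chunk.outputText;
    if (!chunk.turnComplete) return null;

    const turn: TurnTranscript = { input: this.input, output: this.output };
    this.input = "";
    this.output = "";
    return turn;
  }
}
```

Collecting per turn avoids logging partial words, since a single chunk may contain only a fragment of the transcription.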

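Voice activity detection (VAD)

The model automatically performs voice activity detection (VAD) on a continuous audio input stream. VAD is enabled by default.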

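Session management

Learn about the following session-related topics:

Advanced capabilities, including:
- Updating system instructions mid-session
- Adding incremental content updates
Session-related limits, including connection and session length limits, session context window limits, and rate limits.
Options for handling session limits, including:
- Compressing the context window
- Resuming a session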
