Configuration options for the Live API

Even with a basic implementation of the Live API, you can build engaging and powerful interactions for your users. You can optionally customize the experience further with the following configuration options:

- Response voice and language
- Transcriptions for audio input and output
- Voice activity detection (VAD)
- Session management

Response voice and language

You can configure the model to respond in a specific voice and influence the language it responds in.

Specify a response voice

The Live API uses Chirp 3 to support synthesized speech responses in HD voices.

If you don't specify a response voice, the default is Puck.

The following response voices are available. To hear a demo of each voice, see Chirp 3: HD voices.

Zephyr -- Bright
Kore -- Firm
Orus -- Firm
Autonoe -- Bright
Umbriel -- Easy-going
Erinome -- Clear
Laomedeia -- Upbeat
Schedar -- Even
Achird -- Friendly
Sadachbia -- Lively
Puck -- Upbeat
Fenrir -- Excitable
Aoede -- Breezy
Enceladus -- Breathy
Algieba -- Smooth
Algenib -- Gravelly
Achernar -- Soft
Gacrux -- Mature
Zubenelgenubi -- Casual
Sadaltager -- Knowledgeable
Charon -- Informative
Leda -- Youthful
Callirrhoe -- Easy-going
Iapetus -- Clear
Despina -- Smooth
Rasalgethi -- Informative
Alnilam -- Firm
Pulcherrima -- Forward
Vindemiatrix -- Gentle
Sulafat -- Warm

To specify a response voice, set the voice name in the speechConfig object as part of the model configuration.

Swift


// ...

let liveModel = FirebaseAI.firebaseAI(backend: .googleAI()).liveModel(
  modelName: "gemini-2.5-flash-native-audio-preview-12-2025",
  // Configure the model to use a specific voice for its audio response
  generationConfig: LiveGenerationConfig(
    responseModalities: [.audio],
    speech: SpeechConfig(voiceName: "VOICE_NAME")
  )
)

// ...

Kotlin


// ...

val model = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
    modelName = "gemini-2.5-flash-native-audio-preview-12-2025",
    // Configure the model to use a specific voice for its audio response
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.AUDIO
        speechConfig = SpeechConfig(voice = Voice("VOICE_NAME"))
    }
)

// ...

Java


// ...

LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.googleAI()).liveModel(
    "gemini-2.5-flash-native-audio-preview-12-2025",
    // Configure the model to use a specific voice for its audio response
    new LiveGenerationConfig.Builder()
        .setResponseModality(ResponseModality.AUDIO)
        .setSpeechConfig(new SpeechConfig(new Voice("VOICE_NAME")))
        .build()
);

// ...

Web


// ...

const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

const liveModel = getLiveGenerativeModel(ai, {
  model: "gemini-2.5-flash-native-audio-preview-12-2025",
  // Configure the model to use a specific voice for its audio response
  generationConfig: {
    responseModalities: [ResponseModality.AUDIO],
    speechConfig: {
      voiceConfig: {
        prebuiltVoiceConfig: { voiceName: "VOICE_NAME" },
      },
    },
  },
});

// ...

Dart


// ...

final _liveModel = FirebaseAI.googleAI().liveGenerativeModel(
  model: 'gemini-2.5-flash-native-audio-preview-12-2025',
  // Configure the model to use a specific voice for its audio response
  liveGenerationConfig: LiveGenerationConfig(
    responseModalities: [ResponseModalities.audio],
    speechConfig: SpeechConfig(voiceName: 'VOICE_NAME'),
  ),
);

// ...

Unity


// ...

var liveModel = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetLiveModel(
    modelName: "gemini-2.5-flash-native-audio-preview-12-2025",
    // Configure the model to use a specific voice for its audio response
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Audio },
        speechConfig: SpeechConfig.UsePrebuiltVoice("VOICE_NAME")
    )
);

// ...
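
If the voice name comes from user settings or remote configuration, it may help to validate it before building the model configuration. The following is a minimal sketch, not part of the Firebase SDK: `RESPONSE_VOICES`, `DEFAULT_VOICE`, and `buildSpeechConfig` are hypothetical names introduced here. It assumes the voice names listed above and the `speechConfig` shape used in the Web snippet, and it falls back to the documented default (Puck) when the name is unknown.

```javascript
// Voice names copied from the list of response voices on this page.
const RESPONSE_VOICES = new Set([
  "Zephyr", "Kore", "Orus", "Autonoe", "Umbriel", "Erinome", "Laomedeia",
  "Schedar", "Achird", "Sadachbia", "Puck", "Fenrir", "Aoede", "Enceladus",
  "Algieba", "Algenib", "Achernar", "Gacrux", "Zubenelgenubi", "Sadaltager",
  "Charon", "Leda", "Callirrhoe", "Iapetus", "Despina", "Rasalgethi",
  "Alnilam", "Pulcherrima", "Vindemiatrix", "Sulafat",
]);

// The page states that Puck is used when no voice is specified.
const DEFAULT_VOICE = "Puck";

// Hypothetical helper: returns an object shaped like the `speechConfig`
// in the Web snippet, using the default voice for unknown names.
function buildSpeechConfig(voiceName) {
  const chosen = RESPONSE_VOICES.has(voiceName) ? voiceName : DEFAULT_VOICE;
  return {
    voiceConfig: {
      prebuiltVoiceConfig: { voiceName: chosen },
    },
  };
}
```

The returned object is intended to be used as the `speechConfig` value inside `generationConfig`, as in the Web snippet. Falling back silently is one design choice; an app could instead reject unknown names so that typos surface early.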

Influence the response language

The Live API models automatically choose the appropriate language for their responses.

The following languages are supported:

Language	BCP-47 code	Language	BCP-47 code
Arabic (Egyptian)	ar-EG	German (Germany)	de-DE
English (US)	en-US	Spanish (US)	es-US
French (France)	fr-FR	Hindi (India)	hi-IN
Indonesian (Indonesia)	id-ID	Italian (Italy)	it-IT
Japanese (Japan)	ja-JP	Korean (Korea)	ko-KR
Portuguese (Brazil)	pt-BR	Russian (Russia)	ru-RU
Dutch (Netherlands)	nl-NL	Polish (Poland)	pl-PL
Thai (Thailand)	th-TH	Turkish (Turkey)	tr-TR
Vietnamese (Vietnam)	vi-VN	Romanian (Romania)	ro-RO
Ukrainian (Ukraine)	uk-UA	Bengali (Bangladesh)	bn-BD
English (India)	en-IN & hi-IN bundle	Marathi (India)	mr-IN
Tamil (India)	ta-IN	Telugu (India)	te-IN
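
An app that knows the user's locale might check it against the supported languages before deciding whether to steer the model's language. The sketch below is illustrative only: `SUPPORTED_LANGUAGE_CODES` and `isSupportedLanguage` are hypothetical names, and the codes are the standard BCP-47 tags for the languages in the table above (the "en-IN & hi-IN bundle" entry is represented here by `en-IN`, with `hi-IN` already listed for Hindi).

```javascript
// BCP-47 codes for the languages in the supported-languages table.
const SUPPORTED_LANGUAGE_CODES = [
  "ar-EG", "de-DE", "en-US", "es-US", "fr-FR", "hi-IN", "id-ID", "it-IT",
  "ja-JP", "ko-KR", "pt-BR", "ru-RU", "nl-NL", "pl-PL", "th-TH", "tr-TR",
  "vi-VN", "ro-RO", "uk-UA", "bn-BD", "en-IN", "mr-IN", "ta-IN", "te-IN",
];

// Hypothetical helper: case-insensitive membership check, because
// locale strings from different sources vary in casing (for example "ru-ru").
function isSupportedLanguage(code) {
  const normalized = String(code).trim().toLowerCase();
  return SUPPORTED_LANGUAGE_CODES.some(
    (supported) => supported.toLowerCase() === normalized
  );
}
```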

If you want the model to respond in a non-English language, or always in a specific language, you can influence its responses with system instructions like the following examples:

Reinforce to the model that responding in a non-English language may be appropriate.

Listen to the speaker carefully. If you detect a non-English language, respond
in the language you hear from the speaker. You must respond unmistakably in the
speaker's language.

Tell the model to always respond in a specific language.
```
RESPOND IN LANGUAGE. YOU MUST RESPOND UNMISTAKABLY IN LANGUAGE.
```
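
The second template can be filled in programmatically. This is a minimal sketch with a hypothetical helper name (`alwaysRespondInInstruction`); it only builds the instruction text from the template above. How the string is passed to the model as a system instruction is not shown in this section, so that step is left out rather than guessed.

```javascript
// Hypothetical helper: fills the "RESPOND IN LANGUAGE" template shown above.
// The language name is upper-cased to match the template's style.
function alwaysRespondInInstruction(language) {
  const name = String(language).trim();
  if (name === "") {
    throw new Error("language must be a non-empty string");
  }
  const upper = name.toUpperCase();
  return `RESPOND IN ${upper}. YOU MUST RESPOND UNMISTAKABLY IN ${upper}.`;
}
```

For example, `alwaysRespondInInstruction("French")` produces an instruction naming FRENCH in both places where the template says LANGUAGE.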

Transcriptions for audio input and output

As part of the model's response, you can receive transcriptions of the audio input and of the model's audio response. You set this up as part of the model configuration.

- For transcription of the audio input, add inputAudioTranscription.
- For transcription of the model's audio response, add outputAudioTranscription.

Note the following:

- You can configure the model to return transcriptions of both input and output (as shown in the following example), or of only one of them.
- Transcripts are streamed along with the audio, so it's best to collect them the same way you collect text parts, turn by turn.
- The transcription language is inferred from the audio input and the model's audio response.

Swift


// ...

let liveModel = FirebaseAI.firebaseAI(backend: .googleAI()).liveModel(
  modelName: "gemini-2.5-flash-native-audio-preview-12-2025",
  // Configure the model to return transcriptions of the audio input and output
  generationConfig: LiveGenerationConfig(
    responseModalities: [.audio],
    inputAudioTranscription: AudioTranscriptionConfig(),
    outputAudioTranscription: AudioTranscriptionConfig()
  )
)

var inputTranscript: String = ""
var outputTranscript: String = ""

do {
  let session = try await liveModel.connect()
  for try await response in session.responses {
    if case let .content(content) = response.payload {
      if let inputText = content.inputAudioTranscription?.text {
        // Handle transcription text of the audio input
        inputTranscript += inputText
      }

      if let outputText = content.outputAudioTranscription?.text {
        // Handle transcription text of the audio output
        outputTranscript += outputText
      }

      if content.isTurnComplete {
        // Log the transcripts after the current turn is complete
        print("Input audio: \(inputTranscript)")
        print("Output audio: \(outputTranscript)")

        // Reset the transcripts for the next turn
        inputTranscript = ""
        outputTranscript = ""
      }
    }
  }
} catch {
  // Handle error
}

// ...

Kotlin


// ...

val liveModel = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
    modelName = "gemini-2.5-flash-native-audio-preview-12-2025",
    // Configure the model to return transcriptions of the audio input and output
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.AUDIO
        inputAudioTranscription = AudioTranscriptionConfig()
        outputAudioTranscription = AudioTranscriptionConfig()
    }
)

val liveSession = liveModel.connect()

fun handleTranscription(input: Transcription?, output: Transcription?) {
    input?.text?.let { text ->
        // Handle transcription text of the audio input
        println("Input Transcription: $text")
    }
    output?.text?.let { text ->
        // Handle transcription text of the audio output
        println("Output Transcription: $text")
    }
}

liveSession.startAudioConversation(null, ::handleTranscription)

// ...

Java


// ...

ExecutorService executor = Executors.newFixedThreadPool(1);

LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.googleAI()).liveModel(
    "gemini-2.5-flash-native-audio-preview-12-2025",
    // Configure the model to return transcriptions of the audio input and output
    new LiveGenerationConfig.Builder()
        .setResponseModality(ResponseModality.AUDIO)
        .setInputAudioTranscription(new AudioTranscriptionConfig())
        .setOutputAudioTranscription(new AudioTranscriptionConfig())
        .build()
);

LiveModelFutures liveModel = LiveModelFutures.from(lm);
ListenableFuture<LiveSessionFutures> sessionFuture = liveModel.connect();

Futures.addCallback(sessionFuture, new FutureCallback<LiveSessionFutures>() {
    @Override
    public void onSuccess(LiveSessionFutures ses) {
        LiveSessionFutures session = ses;
        session.startAudioConversation((Transcription input, Transcription output) -> {
            if (input != null) {
                // Handle transcription text of the audio input
                System.out.println("Input Transcription: " + input.getText());
            }
            if (output != null) {
                // Handle transcription text of the audio output
                System.out.println("Output Transcription: " + output.getText());
            }
            return null;
        });
    }

    @Override
    public void onFailure(Throwable t) {
        // Handle exceptions
        t.printStackTrace();
    }
}, executor);

// ...

Web


// ...

const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

const liveModel = getLiveGenerativeModel(ai, {
  model: 'gemini-2.5-flash-native-audio-preview-12-2025',
  // Configure the model to return transcriptions of the audio input and output
  generationConfig: {
    responseModalities: [ResponseModality.AUDIO],
    inputAudioTranscription: {},
    outputAudioTranscription: {},
  },
});

const liveSession = await liveModel.connect();

liveSession.sendAudioRealtime({ data, mimeType: "audio/pcm" });

const messages = liveSession.receive();
for await (const message of messages) {
  switch (message.type) {
    case 'serverContent':
      if (message.inputTranscription) {
        // Handle transcription text of the audio input
        console.log(`Input transcription: ${message.inputTranscription.text}`);
      }
      if (message.outputTranscription) {
        // Handle transcription text of the audio output
        console.log(`Output transcription: ${message.outputTranscription.text}`);
      }
      // Handle the rest of the server content (modelTurn, turnComplete, interruption)
      break;
    default:
      // Handle other message types (toolCall, toolCallCancellation)
      break;
  }
}

// ...

Dart


// ...

final _liveModel = FirebaseAI.googleAI().liveGenerativeModel(
  model: 'gemini-2.5-flash-native-audio-preview-12-2025',
  // Configure the model to return transcriptions of the audio input and output
  liveGenerationConfig: LiveGenerationConfig(
    responseModalities: [ResponseModalities.audio],
    inputAudioTranscription: AudioTranscriptionConfig(),
    outputAudioTranscription: AudioTranscriptionConfig(),
  ),
);

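final LiveSession _session = await _liveModel.connect();

await for (final response in _session.receive()) {
  final message = response.message;
  // Transcriptions arrive on server content messages; skip everything else
  if (message is! LiveServerContent) continue;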
  if (message.inputTranscription?.text case final inputText?) {
    // Handle transcription text of the audio input
    print('Input: $inputText');
  }

  if (message.outputTranscription?.text case final outputText?) {
    // Handle transcription text of the audio output
    print('Output: $outputText');
  }
}

// ...

Unity


// ...

var liveModel = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetLiveModel(
    modelName: "gemini-2.5-flash-native-audio-preview-12-2025",
    // Configure the model to return transcriptions of the audio input and output
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Audio },
        inputAudioTranscription: new AudioTranscriptionConfig(),
        outputAudioTranscription: new AudioTranscriptionConfig()
    )
);

try
{
    var session = await liveModel.ConnectAsync();
    var stream = session.ReceiveAsync();
    await foreach (var response in stream) {
        if (response.Message is LiveSessionContent sessionContent) {
            if (!string.IsNullOrEmpty(sessionContent.InputTranscription?.Text)) {
              // handle transcription text of input audio
            }

            if (!string.IsNullOrEmpty(sessionContent.OutputTranscription?.Text)) {
              // handle transcription text of output audio
            }
        }
    }
}
catch (Exception e)
{
    // Handle error
}

// ...
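
Because transcription text arrives in fragments alongside the audio, the Web example's per-message logging can be extended to collect the fragments for each turn. The following is a sketch only: `TurnTranscriptCollector` is a hypothetical helper, and it assumes messages shaped like those in the Web snippet (`type`, `inputTranscription`, `outputTranscription`) plus a `turnComplete` flag on server content, which the snippet mentions only in a comment.

```javascript
// Hypothetical helper: accumulates streamed transcription fragments and
// hands back the full transcripts once a turn is complete.
class TurnTranscriptCollector {
  constructor() {
    this.input = "";
    this.output = "";
  }

  // Feed one message from the receive loop. Returns { input, output } when
  // the message marks the end of a turn, otherwise null.
  handle(message) {
    if (!message || message.type !== "serverContent") {
      return null; // Other message types carry no transcription text
    }
    if (message.inputTranscription?.text) {
      this.input += message.inputTranscription.text;
    }
    if (message.outputTranscription?.text) {
      this.output += message.outputTranscription.text;
    }
    if (!message.turnComplete) {
      return null;
    }
    // Turn finished: return the collected text and reset for the next turn
    const finished = { input: this.input, output: this.output };
    this.input = "";
    this.output = "";
    return finished;
  }
}
```

Inside the `for await` loop, each message could be passed to `collector.handle(message)`, logging or storing the result whenever it is not `null`. This mirrors the reset-per-turn approach in the Swift example.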

Voice activity detection (VAD)

The model automatically performs voice activity detection (VAD) on a continuous audio input stream. VAD is enabled by default.

Session management

Learn more about the following session-related topics:
- Advanced capabilities, including:
  - Updating system instructions mid-session
  - Adding incremental content updates
- Session limits, including limits on connections and session length, limits on the session context window, and rate limits.

Firebase AI Logic does not yet support the following session management features. Check back soon!
- Handling interruptions
- Extending session length
- Session resumption
- Retaining context across sessions and requests
- Context window compression
