Configuration options for the Live API

Even with a basic implementation of the Live API, you can build engaging and powerful interactions for your users. You can customize the experience further with the following configuration options:

- Response voice and language
- Transcriptions for audio input and output
- Voice activity detection (VAD)
- Session management

Response voice and language

You can configure the model to respond in a specific voice and influence it to respond in different languages.

Specify the response voice

The Live API uses Chirp 3 to support synthesized speech responses in HD voices.

If you don't specify a response voice, the default is Puck.

List of response voice options

To hear demos of what each voice sounds like, see Chirp 3: HD voices.

Zephyr -- Bright
Kore -- Firm
Orus -- Firm
Autonoe -- Bright
Umbriel -- Easy-going
Erinome -- Clear
Laomedeia -- Upbeat
Schedar -- Even
Achird -- Friendly
Sadachbia -- Lively
Puck -- Upbeat
Fenrir -- Excitable
Aoede -- Breezy
Enceladus -- Breathy
Algieba -- Smooth
Algenib -- Gravelly
Achernar -- Soft
Gacrux -- Mature
Zubenelgenubi -- Casual
Sadaltager -- Knowledgeable
Charon -- Informative
Leda -- Youthful
Callirrhoe -- Easy-going
Iapetus -- Clear
Despina -- Smooth
Rasalgethi -- Informative
Alnilam -- Firm
Pulcherrima -- Forward
Vindemiatrix -- Gentle
Sulafat -- Warm

To specify the response voice, set the voice name in the speechConfig object as part of the model configuration.

Swift


// ...

let liveModel = FirebaseAI.firebaseAI(backend: .googleAI()).liveModel(
  modelName: "gemini-2.5-flash-native-audio-preview-12-2025",
  // Configure the model to use a specific voice for its audio response
  generationConfig: LiveGenerationConfig(
    responseModalities: [.audio],
    speech: SpeechConfig(voiceName: "VOICE_NAME")
  )
)

// ...

Kotlin


// ...

val model = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
    modelName = "gemini-2.5-flash-native-audio-preview-12-2025",
    // Configure the model to use a specific voice for its audio response
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.AUDIO
        speechConfig = SpeechConfig(voice = Voice("VOICE_NAME"))
    }
)

// ...

Java


// ...

LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.googleAI()).liveModel(
    "gemini-2.5-flash-native-audio-preview-12-2025",
    // Configure the model to use a specific voice for its audio response
    new LiveGenerationConfig.Builder()
        .setResponseModality(ResponseModality.AUDIO)
        .setSpeechConfig(new SpeechConfig(new Voice("VOICE_NAME")))
        .build()
);

// ...

Web


// ...

const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

const liveModel = getLiveGenerativeModel(ai, {
  model: "gemini-2.5-flash-native-audio-preview-12-2025",
  // Configure the model to use a specific voice for its audio response
  generationConfig: {
    responseModalities: [ResponseModality.AUDIO],
    speechConfig: {
      voiceConfig: {
        prebuiltVoiceConfig: { voiceName: "VOICE_NAME" },
      },
    },
  },
});

// ...

Dart


// ...

final _liveModel = FirebaseAI.googleAI().liveGenerativeModel(
  model: 'gemini-2.5-flash-native-audio-preview-12-2025',
  // Configure the model to use a specific voice for its audio response
  liveGenerationConfig: LiveGenerationConfig(
    responseModalities: [ResponseModalities.audio],
    speechConfig: SpeechConfig(voiceName: 'VOICE_NAME'),
  ),
);

// ...

Unity


// ...

var liveModel = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetLiveModel(
    modelName: "gemini-2.5-flash-native-audio-preview-12-2025",
    // Configure the model to use a specific voice for its audio response
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Audio },
        speechConfig: SpeechConfig.UsePrebuiltVoice("VOICE_NAME")
    )
);

// ...
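
Because the voice name is passed as a plain string, a typo is easy to miss. The following is a minimal sketch, not part of the Firebase SDK, that checks a name against the voices listed on this page before building a speechConfig object in the shape used by the Web example. The names RESPONSE_VOICES and buildSpeechConfig are hypothetical, and falling back to "Puck" simply mirrors the documented default.

```javascript
// Voice names taken from the list of response voice options on this page.
const RESPONSE_VOICES = new Set([
  "Zephyr", "Kore", "Orus", "Autonoe", "Umbriel", "Erinome", "Laomedeia",
  "Schedar", "Achird", "Sadachbia", "Puck", "Fenrir", "Aoede", "Enceladus",
  "Algieba", "Algenib", "Achernar", "Gacrux", "Zubenelgenubi", "Sadaltager",
  "Charon", "Leda", "Callirrhoe", "Iapetus", "Despina", "Rasalgethi",
  "Alnilam", "Pulcherrima", "Vindemiatrix", "Sulafat",
]);

// Hypothetical helper (an assumption, not an SDK function): returns a
// speechConfig object shaped like the one in the Web example, or throws
// a RangeError when the voice name is not in the list above.
function buildSpeechConfig(voiceName = "Puck") {
  if (!RESPONSE_VOICES.has(voiceName)) {
    throw new RangeError(`Unknown response voice: ${voiceName}`);
  }
  return { voiceConfig: { prebuiltVoiceConfig: { voiceName } } };
}

console.log(JSON.stringify(buildSpeechConfig("Kore")));
```

The returned object could then be assigned to the speechConfig field of generationConfig in the Web example.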

Influence the response language

Live API models automatically choose the appropriate language for their responses.

List of supported languages

Language	BCP-47 code	Language	BCP-47 code
Arabic (Egyptian)	ar-EG	German (Germany)	de-DE
English (US)	en-US	Spanish (US)	es-US
French (France)	fr-FR	Hindi (India)	hi-IN
Indonesian (Indonesia)	id-ID	Italian (Italy)	it-IT
Japanese (Japan)	ja-JP	Korean (Korea)	ko-KR
Portuguese (Brazil)	pt-BR	Russian (Russia)	ru-RU
Dutch (Netherlands)	nl-NL	Polish (Poland)	pl-PL
Thai (Thailand)	th-TH	Turkish (Turkey)	tr-TR
Vietnamese (Vietnam)	vi-VN	Romanian (Romania)	ro-RO
Ukrainian (Ukraine)	uk-UA	Bengali (Bangladesh)	bn-BD
English (India)	en-IN & hi-IN bundle	Marathi (India)	mr-IN
Tamil (India)	ta-IN	Telugu (India)	te-IN
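
An app that already knows the user's locale may want to check whether that locale is one of the supported languages before deciding to steer the model. The sketch below is one way to do that, using the standard BCP-47 codes for the languages in the table. The names SUPPORTED_LANGUAGE_CODES and findSupportedLanguage are hypothetical and not part of any SDK.

```javascript
// Standard BCP-47 codes for the languages in the supported-languages table.
const SUPPORTED_LANGUAGE_CODES = [
  "ar-EG", "de-DE", "en-US", "es-US", "fr-FR", "hi-IN", "id-ID", "it-IT",
  "ja-JP", "ko-KR", "pt-BR", "ru-RU", "nl-NL", "pl-PL", "th-TH", "tr-TR",
  "vi-VN", "ro-RO", "uk-UA", "bn-BD", "en-IN", "mr-IN", "ta-IN", "te-IN",
];

// Hypothetical helper: returns the canonically cased code for a locale
// string (matched case-insensitively), or null if it is not in the list.
function findSupportedLanguage(locale) {
  const wanted = String(locale).trim().toLowerCase();
  const match = SUPPORTED_LANGUAGE_CODES.find(
    (code) => code.toLowerCase() === wanted
  );
  return match ?? null;
}

console.log(findSupportedLanguage("pt-br"));
```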

If you want the model to respond in a non-English language, or to always respond in a specific language, you can influence the model's responses with system instructions like these examples:

Reinforce to the model that a non-English language may be appropriate

Listen to the speaker carefully. If you detect a non-English language, respond
in the language you hear from the speaker. You must respond unmistakably in the
speaker's language.

Tell the model to always respond in a specific language
```
RESPOND IN LANGUAGE. YOU MUST RESPOND UNMISTAKABLY IN LANGUAGE.
```
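
In the instruction above, LANGUAGE is a placeholder for the language you want. A small sketch of filling it in programmatically follows; buildLanguageInstruction is a hypothetical helper, and upper-casing the language name is only an assumption made to match the style of the template.

```javascript
// Template copied from the example above; LANGUAGE is the placeholder.
const LANGUAGE_INSTRUCTION_TEMPLATE =
  "RESPOND IN LANGUAGE. YOU MUST RESPOND UNMISTAKABLY IN LANGUAGE.";

// Hypothetical helper: replaces every LANGUAGE placeholder with the
// upper-cased language name and rejects an empty name.
function buildLanguageInstruction(languageName) {
  const name = String(languageName).trim().toUpperCase();
  if (name === "") {
    throw new Error("languageName must not be empty");
  }
  // A replacer function avoids special handling of "$" in the name.
  return LANGUAGE_INSTRUCTION_TEMPLATE.replace(/LANGUAGE/g, () => name);
}

console.log(buildLanguageInstruction("Japanese"));
// prints: RESPOND IN JAPANESE. YOU MUST RESPOND UNMISTAKABLY IN JAPANESE.
```

The resulting string would then be supplied as the system instruction when you create the live model.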

Transcriptions for audio input and output

As part of the model's response, you can receive transcriptions of the audio input and of the model's audio response. You set this as part of the model configuration.

- For transcription of the audio input, add inputAudioTranscription.
- For transcription of the model's audio response, add outputAudioTranscription.

Note the following:

- You can configure the model to return transcriptions of both the input and the output (as shown in the following example), or only one of them.
- Transcripts are streamed along with the audio, so it's best to collect them the way you collect text parts, turn by turn.
- The transcription language is inferred from the audio input and from the model's audio response.

Swift


// ...

let liveModel = FirebaseAI.firebaseAI(backend: .googleAI()).liveModel(
  modelName: "gemini-2.5-flash-native-audio-preview-12-2025",
  // Configure the model to return transcriptions of the audio input and output
  generationConfig: LiveGenerationConfig(
    responseModalities: [.audio],
    inputAudioTranscription: AudioTranscriptionConfig(),
    outputAudioTranscription: AudioTranscriptionConfig()
  )
)

var inputTranscript: String = ""
var outputTranscript: String = ""

do {
  let session = try await liveModel.connect()
  for try await response in session.responses {
    if case let .content(content) = response.payload {
      if let inputText = content.inputAudioTranscription?.text {
        // Handle transcription text of the audio input
        inputTranscript += inputText
      }

      if let outputText = content.outputAudioTranscription?.text {
        // Handle transcription text of the audio output
        outputTranscript += outputText
      }

      if content.isTurnComplete {
        // Log the transcripts after the current turn is complete
        print("Input audio: \(inputTranscript)")
        print("Output audio: \(outputTranscript)")

        // Reset the transcripts for the next turn
        inputTranscript = ""
        outputTranscript = ""
      }
    }
  }
} catch {
  // Handle error
}

// ...

Kotlin


// ...

val liveModel = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
    modelName = "gemini-2.5-flash-native-audio-preview-12-2025",
    // Configure the model to return transcriptions of the audio input and output
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.AUDIO
        inputAudioTranscription = AudioTranscriptionConfig()
        outputAudioTranscription = AudioTranscriptionConfig()
    }
)

val liveSession = liveModel.connect()

fun handleTranscription(input: Transcription?, output: Transcription?) {
    input?.text?.let { text ->
        // Handle transcription text of the audio input
        println("Input Transcription: $text")
    }
    output?.text?.let { text ->
        // Handle transcription text of the audio output
        println("Output Transcription: $text")
    }
}

liveSession.startAudioConversation(null, ::handleTranscription)

// ...

Java


// ...

ExecutorService executor = Executors.newFixedThreadPool(1);

LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.googleAI()).liveModel(
    "gemini-2.5-flash-native-audio-preview-12-2025",
    // Configure the model to return transcriptions of the audio input and output
    new LiveGenerationConfig.Builder()
            .setResponseModality(ResponseModality.AUDIO)
            .setInputAudioTranscription(new AudioTranscriptionConfig())
            .setOutputAudioTranscription(new AudioTranscriptionConfig())
            .build()
    );

LiveModelFutures liveModel = LiveModelFutures.from(lm);
ListenableFuture<LiveSessionFutures> sessionFuture = liveModel.connect();

Futures.addCallback(sessionFuture, new FutureCallback<LiveSessionFutures>() {
    @Override
    public void onSuccess(LiveSessionFutures ses) {
        LiveSessionFutures session = ses;
        session.startAudioConversation((Transcription input, Transcription output) -> {
            if (input != null) {
                // Handle transcription text of the audio input
                System.out.println("Input Transcription: " + input.getText());
            }
            if (output != null) {
                // Handle transcription text of the audio output
                System.out.println("Output Transcription: " + output.getText());
            }
            return null;
        });
    }

    @Override
    public void onFailure(Throwable t) {
        // Handle exceptions
        t.printStackTrace();
    }
}, executor);

// ...

Web


// ...

const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

const liveModel = getLiveGenerativeModel(ai, {
  model: 'gemini-2.5-flash-native-audio-preview-12-2025',
  // Configure the model to return transcriptions of the audio input and output
  generationConfig: {
    responseModalities: [ResponseModality.AUDIO],
    inputAudioTranscription: {},
    outputAudioTranscription: {},
  },
});

const liveSession = await liveModel.connect();

liveSession.sendAudioRealtime({ data, mimeType: "audio/pcm" });

const messages = liveSession.receive();
for await (const message of messages) {
  switch (message.type) {
    case 'serverContent':
      if (message.inputTranscription) {
        // Handle transcription text of the audio input
        console.log(`Input transcription: ${message.inputTranscription.text}`);
      }
      if (message.outputTranscription) {
        // Handle transcription text of the audio output
        console.log(`Output transcription: ${message.outputTranscription.text}`);
      } else {
        // Handle other message types (modelTurn, turnComplete, interruption)
      }
      break;
    default:
      // Handle other message types (toolCall, toolCallCancellation)
      break;
  }
}

// ...

Dart


// ...

final _liveModel = FirebaseAI.googleAI().liveGenerativeModel(
  model: 'gemini-2.5-flash-native-audio-preview-12-2025',
  // Configure the model to return transcriptions of the audio input and output
  liveGenerationConfig: LiveGenerationConfig(
    responseModalities: [ResponseModalities.audio],
    inputAudioTranscription: AudioTranscriptionConfig(),
    outputAudioTranscription: AudioTranscriptionConfig(),
  ),
);

final LiveSession _session = await _liveModel.connect();

await for (final response in _session.receive()) {
  final message = response.message;
  // Only server content messages carry transcriptions
  if (message is! LiveServerContent) continue;

  if (message.inputTranscription?.text case final inputText?) {
    // Handle transcription text of the audio input
    print('Input: $inputText');
  }

  if (message.outputTranscription?.text case final outputText?) {
    // Handle transcription text of the audio output
    print('Output: $outputText');
  }
}

// ...

Unity


// ...

var liveModel = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetLiveModel(
    modelName: "gemini-2.5-flash-native-audio-preview-12-2025",
    // Configure the model to return transcriptions of the audio input and output
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Audio },
        inputAudioTranscription: new AudioTranscriptionConfig(),
        outputAudioTranscription: new AudioTranscriptionConfig()
    )
);

try
{
    var session = await liveModel.ConnectAsync();
    var stream = session.ReceiveAsync();
    await foreach (var response in stream) {
        if (response.Message is LiveSessionContent sessionContent) {
            if (!string.IsNullOrEmpty(sessionContent.InputTranscription?.Text)) {
              // handle transcription text of input audio
            }

            if (!string.IsNullOrEmpty(sessionContent.OutputTranscription?.Text)) {
              // handle transcription text of output audio
            }
        }
    }
}
catch (Exception e)
{
    // Handle error
}

// ...
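
Because transcripts arrive in chunks alongside the audio, one way to collect them turn by turn is sketched below. This is a minimal illustration, not SDK code: createTranscriptCollector is a hypothetical helper, and the message shape (type, inputTranscription, outputTranscription, turnComplete) is an assumption modeled on the Web example above. The messages in the usage section are simulated, so no connection is needed to run it.

```javascript
// Hypothetical helper: accumulates streamed transcription text and hands
// the combined transcripts to a callback when the turn completes.
function createTranscriptCollector(onTurnComplete) {
  let input = "";
  let output = "";
  return function handleMessage(message) {
    // Assumption: transcriptions only arrive on 'serverContent' messages.
    if (message.type !== "serverContent") return;
    // Append whichever transcription chunks came with this message.
    input += message.inputTranscription?.text ?? "";
    output += message.outputTranscription?.text ?? "";
    if (message.turnComplete) {
      onTurnComplete({ input, output });
      // Reset the transcripts for the next turn.
      input = "";
      output = "";
    }
  };
}

// Usage with simulated messages.
const turns = [];
const handle = createTranscriptCollector((turn) => turns.push(turn));
handle({ type: "serverContent", inputTranscription: { text: "Hello " } });
handle({ type: "serverContent", inputTranscription: { text: "there" } });
handle({
  type: "serverContent",
  outputTranscription: { text: "Hi!" },
  turnComplete: true,
});
console.log(turns);
```

In a real session, handleMessage would be called for each message yielded by the receive loop.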

Voice activity detection (VAD)

The model automatically performs voice activity detection (VAD) on a continuous audio input stream. VAD is enabled by default.

Session management

Learn about the following session-related topics:
- Advanced capabilities, including:
  - Updating system instructions mid-session
  - Adding incremental content updates
- Session-related limits, including limits on connections and session length, limits on the session context window, and rate limits.

Firebase AI Logic does not yet support the following session management features. Check back soon!
- Handling interruptions
- Extending the session length
- Resuming a session
- Preserving context across sessions and requests
- Context window compression
