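Configuration options for hybrid experiences in Apple apps

This page describes the following configuration options for hybrid and on-device experiences:

Set an "inference mode".
Check whether the on-device model is available.
Determine whether on-device or cloud inference was used.
Use model configuration to control responses (for example, temperature).

Make sure that you've completed the getting started guide for building hybrid experiences.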

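Set an "inference mode"

The examples in the getting started guide show how to attempt on-device inference first and then fall back to a cloud-hosted model. This is only one of the available "inference modes" that you can implement.

Hybrid inference

Prefer on-device inference: Set primary to a "system" model and secondary to a cloud model.

Attempts to use the on-device model if it's available and supports the request type. Otherwise, logs an error on the device and then automatically falls back to the cloud-hosted model.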

// Imports + initialization of Gemini API backend service
// ...

// Initialize a cloud model that supports your use case
let cloudModel = ai.geminiModel(name: "GEMINI_MODEL_NAME")
// Initialize an on-device model that supports your use case
let systemModel = FirebaseAI.SystemLanguageModel.default

// Create a GenerativeModelSession with a hybrid model.
// Provide your preferred model as `primary` and your fallback model as `secondary`
// Attempt to use the on-device model; otherwise, fall back to the cloud-hosted model.
let session = ai.generativeModelSession(
  model: .hybridModel(primary: systemModel, secondary: cloudModel)
)

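Prefer cloud inference: Set primary to a cloud model and secondary to a "system" model.

Attempts to use the cloud-hosted model if the device is online and the model is available. If the device is offline, falls back to the on-device model. In all other failure cases, throws an exception.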

// Imports + initialization of Gemini API backend service
// ...

// Initialize a cloud model that supports your use case
let cloudModel = ai.geminiModel(name: "GEMINI_MODEL_NAME")
// Initialize an on-device model that supports your use case
let systemModel = FirebaseAI.SystemLanguageModel.default

// Create a GenerativeModelSession with a hybrid model.
// Provide your preferred model as `primary` and your fallback model as `secondary`
// Attempt to use the cloud-hosted model; otherwise, fall back to the on-device model.
let session = ai.generativeModelSession(
  model: .hybridModel(primary: cloudModel, secondary: systemModel)
)

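On-device only or cloud only inference

The SDK supports setting a single model, which means that the SDK attempts only on-device inference or only cloud inference. You don't create a HybridModel for this use case. For a hybrid experience, however, you need to create a HybridModel and set the primary and secondary models (as described above).

On-device only inference: Set model to a "system" model. You don't create a HybridModel for this use case.

Attempts to use the on-device model if it's available and supports the request type. Otherwise, throws an exception.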

// Imports + initialization of Gemini API backend service
// ...

// Initialize an on-device model that supports your use case
let systemModel = FirebaseAI.SystemLanguageModel.default

// Create a GenerativeModelSession with the on-device model.
let session = ai.generativeModelSession(
  model: systemModel
)

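Cloud only inference: Set model to a cloud model. You don't create a HybridModel for this use case.

Attempts to use the cloud-hosted model if the device is online and the model is available. Otherwise, throws an exception.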

// Imports + initialization of Gemini API backend service
// ...

// Initialize a cloud model that supports your use case
let cloudModel = ai.geminiModel(name: "GEMINI_MODEL_NAME")

// Create a GenerativeModelSession with a cloud model.
let session = ai.generativeModelSession(
  model: cloudModel
)
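The four inference modes described above can be summarized as a small decision table. The following is a minimal sketch of that documented behavior, not the SDK's implementation: InferenceMode, Route, Conditions, NoUsableModel, and route(for:given:) are hypothetical names introduced only for illustration. It also assumes that a fallback which cannot run results in a thrown error, which this page does not state explicitly.

```swift
// Hypothetical types for illustration; none of these are part of the Firebase SDK.
enum InferenceMode { case preferOnDevice, preferCloud, onlyOnDevice, onlyCloud }
enum Route: Equatable { case onDevice, cloud }
struct NoUsableModel: Error {}

struct Conditions {
  var onDeviceUsable: Bool       // on-device model is available and supports the request type
  var online: Bool               // device has a network connection
  var cloudModelAvailable: Bool  // cloud-hosted model can serve the request
}

/// Returns which model would serve a request, or throws if none can.
func route(for mode: InferenceMode, given c: Conditions) throws -> Route {
  let cloudUsable = c.online && c.cloudModelAvailable
  switch mode {
  case .preferOnDevice:
    if c.onDeviceUsable { return .onDevice }
    // Log the on-device failure, then fall back to the cloud-hosted model.
    print("On-device model unusable; falling back to the cloud-hosted model")
    if cloudUsable { return .cloud }
  case .preferCloud:
    if cloudUsable { return .cloud }
    // Only an offline device triggers the on-device fallback.
    if !c.online && c.onDeviceUsable { return .onDevice }
  case .onlyOnDevice:
    if c.onDeviceUsable { return .onDevice }
  case .onlyCloud:
    if cloudUsable { return .cloud }
  }
  // Assumption: every remaining combination surfaces as a thrown error.
  throw NoUsableModel()
}
```

For example, with an offline device and a usable on-device model, the preferCloud mode resolves to the on-device route in this sketch, while onlyCloud throws.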

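Check whether the on-device model is available

You only need to check on-device availability manually if you want to show that information to the user or ask the end user to take action to download the on-device model. If the on-device model isn't available, and you've set primary to the on-device model and secondary to a cloud model, the SDK automatically falls back to the cloud-hosted model.

To manually check whether the on-device model is actually available, check the isAvailable property: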

if FirebaseAI.SystemLanguageModel.default.isAvailable {
  // The on-device model is ready to use.
} else {
  // The on-device model is unavailable.
}

To find out the specific reason why the on-device model is unavailable, check the availability property:

switch FirebaseAI.SystemLanguageModel.default.availability {
case .available:
  // The on-device model is ready to use.
  break
case .unavailable(.deviceNotEligible):
  // This device does not support Apple Intelligence.
  break
case .unavailable(.appleIntelligenceNotEnabled):
  // The user has not enabled Apple Intelligence in Settings.
  break
case .unavailable(.modelNotReady):
  // The model is still being downloaded.
  break
case let .unavailable(reason):
  // The model is unavailable due to the specified `reason`.
  break
}
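Because a manual availability check is mainly useful for informing the user, each case in the switch above could be mapped to a user-facing message. The following is a rough sketch under stated assumptions: OnDeviceStatus is a hypothetical local mirror of the cases shown above (it is not an SDK type), and the message strings are placeholders.

```swift
// Hypothetical mirror of the availability cases shown above; not an SDK type.
enum OnDeviceStatus {
  case available
  case deviceNotEligible
  case appleIntelligenceNotEnabled
  case modelNotReady
  case unavailableForOtherReason
}

/// Returns a message to show the user, or nil when no action is needed.
func userPrompt(for status: OnDeviceStatus) -> String? {
  switch status {
  case .available:
    return nil
  case .deviceNotEligible:
    return "This device does not support Apple Intelligence."
  case .appleIntelligenceNotEnabled:
    return "Turn on Apple Intelligence in Settings to use on-device features."
  case .modelNotReady:
    return "The on-device model is still downloading. Try again later."
  case .unavailableForOtherReason:
    return "The on-device model is currently unavailable."
  }
}
```

In an app, each case of the real availability switch would translate to one OnDeviceStatus value (or call the equivalent UI code directly), so that only the actionable states interrupt the user.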

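Determine whether on-device or cloud inference was used

If you use a HybridModel (and set primary and secondary models), it can be helpful to know which model was used for a given request. This information is provided by the modelVersion property of the rawResponse in each response.

When you access this property, the returned value is one of the following:

Cloud-hosted model used: the model name, for example gemini-3.1-flash-lite
On-device model used: apple-foundation-models-system-language-model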

// let response = try await session.respond(to: ...

print("You used: \(response.rawResponse.modelVersion)")

print(response.content)
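If the app needs to branch on which model answered (for example, for analytics), the modelVersion string can be classified with a small helper. This is a sketch only: InferenceSource and inferenceSource(fromModelVersion:) are hypothetical names, and the helper accepts an optional string as a precaution because this page does not state whether modelVersion is optional.

```swift
// Identifier reported for the on-device model, as listed above.
let onDeviceModelVersion = "apple-foundation-models-system-language-model"

// Hypothetical type for illustration; not part of the Firebase SDK.
enum InferenceSource: Equatable {
  case onDevice
  case cloud(modelName: String)
  case unknown
}

/// Classifies a modelVersion value as on-device, cloud, or unknown.
func inferenceSource(fromModelVersion modelVersion: String?) -> InferenceSource {
  // Treat a missing or empty value as unknown rather than guessing.
  guard let version = modelVersion, !version.isEmpty else { return .unknown }
  if version == onDeviceModelVersion { return .onDevice }
  // Any other value is assumed to be the name of a cloud-hosted model.
  return .cloud(modelName: version)
}
```

A call such as inferenceSource(fromModelVersion: response.rawResponse.modelVersion) would then return a value that can be switched over instead of comparing strings throughout the app.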

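Use model configuration to control responses

In each request to a model, you can send a model configuration that controls how the model generates a response. Cloud-hosted models and on-device models offer different configuration options (cloud parameters versus on-device parameters).

Cloud-hosted model: Set its configuration in a GenerationConfig.
On-device model: Set its configuration in a FirebaseAI.GenerationOptions.

These options apply per request; they are not set once on the model or session.

The following example shows how to set the configurations for both the cloud-hosted model and the on-device model for hybrid inference: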

// ...

let response = try await session.respond(
  to: "Why is the sky blue?",
  options: .hybrid(
    // Config for cloud-hosted model
    gemini: GenerationConfig(
      temperature: 0.8,
      topP: 0.9,
      thinkingConfig: ThinkingConfig(thinkingLevel: .high)
    ),
    // Config for on-device model
    foundationModels: FirebaseAI.GenerationOptions(
      sampling: .random(probabilityThreshold: 0.9),
      temperature: 0.8
    )
  )
)

// ...
