Build hybrid experiences in Android apps with on-device and cloud-hosted models

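You can use Firebase AI Logic to build AI-powered Android apps and features with hybrid inference. Hybrid inference runs inference with the on-device model when one is available and otherwise falls back seamlessly to a cloud-hosted model (or vice versa).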

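This page describes how to get started with the client SDKs. After you complete this standard setup, check out the additional configuration options and capabilities (like setting temperature).

Note that on-device inference through Firebase AI Logic is supported for Android apps that use Firebase AI Logic SDK v17.10.0 and higher (BoM v34.10.0 and higher) and that run on specific devices.

Supported APIs:
- Cloud inference uses your chosen Gemini API provider (either the Gemini Developer API or the Vertex AI Gemini API).
- On-device inference uses the Prompt API from ML Kit, which is in beta and is only available on specific devices.
  
  Use of the on-device model is subject to the ML Kit terms and to the specific terms related to the generative AI aspects of ML Kit.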

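Supported Android devices and their on-device models

For on-device inference (which uses the Prompt API from ML Kit), you can find the list of supported devices and their on-device models in the ML Kit documentation.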

Get started

These get-started steps describe the general setup required for any supported prompt request that you want to send.

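Step 1: Set up a Firebase project and connect your app to Firebase

Sign in to the Firebase console, and then select your Firebase project.
Don't already have a Firebase project?

If you don't already have a Firebase project, click the button to create a new Firebase project, and then use either of the following options:
- Option 1: Create a wholly new Firebase project by entering a new project name in the first step of the workflow (this automatically creates the underlying Google Cloud project).
- Option 2: Add Firebase to an existing Google Cloud project by clicking Add Firebase to Google Cloud project (at the bottom of the page). In the first step of the workflow, start entering the project name of the existing project, and then select the project from the displayed list.
Complete the remaining steps of the on-screen workflow to create the Firebase project. Note that if you are prompted to set up Google Analytics, you don't need to do so to use the Firebase AI Logic SDKs.
In the Firebase console, go to AI Services > AI Logic.
Click Get started to launch a guided workflow that helps you set up the required APIs and resources for your project.
Set up your project to use a "Gemini API" provider.

We recommend getting started with the Gemini Developer API. At any point, you can also set up the Vertex AI Gemini API (and its requirement for billing).

For the Gemini Developer API, the console enables the required APIs and creates a Gemini API key in your project.
Do not add this Gemini API key to your app's codebase.
If prompted in the console's workflow, follow the on-screen instructions to register your app and connect it to Firebase.
Continue to the next step in this guide to add the SDK to your app.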

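Step 2: Add the required SDKs

The Firebase AI Logic SDK for Android (firebase-ai), together with the Firebase AI Logic On-Device SDK (firebase-ai-ondevice), provides access to the APIs for interacting with generative models.

In your module (app-level) Gradle file (like <project>/<app-module>/build.gradle.kts), add the dependencies for the Firebase AI Logic libraries for Android: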

Kotlin

dependencies {
  // ... other androidx dependencies

  // Add the dependencies for the Firebase AI Logic libraries
  // Note that the on-device SDK is not yet included in the Firebase Android BoM
  implementation("com.google.firebase:firebase-ai:17.12.1")
  implementation("com.google.firebase:firebase-ai-ondevice:16.0.0-beta02")
}

Java

For Java, you need to add two additional libraries.

dependencies {
  // ... other androidx dependencies

  // Add the dependencies for the Firebase AI Logic libraries
  // Note that the on-device SDK is not yet included in the Firebase Android BoM
  implementation("com.google.firebase:firebase-ai:17.12.1")
  implementation("com.google.firebase:firebase-ai-ondevice:16.0.0-beta02")

  // Required for one-shot operations (to use `ListenableFuture` from Guava Android)
  implementation("com.google.guava:guava:31.0.1-android")

  // Required for streaming operations (to use `Publisher` from Reactive Streams)
  implementation("org.reactivestreams:reactive-streams:1.0.4")
}

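Step 3: Check if the on-device model is available

Use FirebaseAIOnDevice to check whether the on-device model is available, and download the model if it isn't.

After the model is downloaded, AICore automatically keeps it up to date. For more details about AICore and managing on-device model downloads, see the notes after the snippet.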

Kotlin

// Check whether the on-device model can be used on this device
val status = FirebaseAIOnDevice.checkStatus()
when (status) {
  OnDeviceModelStatus.UNAVAILABLE -> {
    Log.w(TAG, "On-device model is unavailable")
  }

  OnDeviceModelStatus.DOWNLOADABLE -> {
    // Start the download and observe its progress.
    // collect is a suspending call, so run this code inside a coroutine.
    FirebaseAIOnDevice.download().collect { downloadStatus ->
      when (downloadStatus) {
        is DownloadStatus.DownloadStarted ->
          Log.i(TAG, "Starting download - ${downloadStatus.bytesToDownload}")

        is DownloadStatus.DownloadInProgress ->
          Log.i(TAG, "Download in progress ${downloadStatus.totalBytesDownloaded} bytes downloaded")

        is DownloadStatus.DownloadCompleted ->
          Log.i(TAG, "On-device model download complete")

        is DownloadStatus.DownloadFailed ->
          Log.e(TAG, "Download failed ${downloadStatus}")
      }
    }
  }

  OnDeviceModelStatus.DOWNLOADING -> {
    Log.i(TAG, "On-device model is being downloaded")
  }

  OnDeviceModelStatus.AVAILABLE -> {
    Log.i(TAG, "On-device model is available")
  }
}

Java

Checking for and downloading the model is not yet available for Java.

However, all other APIs and interactions in this guide are available for Java.

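Note the following about downloading the on-device model:

The time it takes to download the on-device model depends on many factors, including your network.
If your code uses an on-device model for primary or fallback inference, make sure that the model is downloaded early in your app's lifecycle so that it is available before your end users reach the code that relies on it.
If the on-device model isn't available when an on-device inference request is made, the SDK does not automatically trigger the download of the on-device model. The SDK either falls back to the cloud-hosted model or throws an exception (see the details about the behavior of inference modes).
AICore (an Android system service) manages which model and version is downloaded, keeps the model up to date, and so on. Note that a device only ever downloads one model, so if another app on the device has already downloaded the on-device model successfully, this check reports the model as available.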

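Latency optimization

To optimize for the first inference call, your app can call warmup(). This loads the on-device model into memory and initializes the runtime components.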

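Step 4: Initialize the service and create a model instance

Before you send a prompt request to the model, set up the following.

Initialize the service for your chosen API provider.
Create a GenerativeModel instance, and set mode to one of the following values. The descriptions here are high-level; you can learn the details of how these modes behave in Set an inference mode.
- PREFER_ON_DEVICE: Attempt to use the on-device model; otherwise, fall back to the cloud-hosted model.
- ONLY_ON_DEVICE: Attempt to use the on-device model; otherwise, throw an exception.
- PREFER_IN_CLOUD: Attempt to use the cloud-hosted model; otherwise, fall back to the on-device model.
- ONLY_IN_CLOUD: Attempt to use the cloud-hosted model; otherwise, throw an exception.
Note the following:
- To use the on-device model, make sure to review the features not yet available for on-device inference, listed at the bottom of this page.
- To use the cloud-hosted model, the device must be online, and you must explicitly specify which cloud-hosted model to use.
- In the response, the SDK tells you whether on-device or cloud inference was used.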
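
To make the four modes above concrete, here is a minimal Java sketch of the selection order they describe. It is not the SDK's implementation: the ModeResolver class, its nested Mode and Target enums, and the two boolean inputs are hypothetical names introduced only for illustration, and the real SDK may take into account conditions that this sketch ignores.

```java
// Illustrative only: nothing in this class is part of the Firebase SDK.
final class ModeResolver {
    // Hypothetical mirror of the four inference modes described in this guide.
    enum Mode { PREFER_ON_DEVICE, ONLY_ON_DEVICE, PREFER_IN_CLOUD, ONLY_IN_CLOUD }

    // Where a request ends up running.
    enum Target { ON_DEVICE, CLOUD }

    private ModeResolver() {}

    /**
     * Returns the target a request would use under the given mode, or throws
     * IllegalStateException when the mode leaves no usable target.
     */
    static Target resolve(Mode mode, boolean onDeviceAvailable, boolean cloudReachable) {
        switch (mode) {
            case PREFER_ON_DEVICE:
                if (onDeviceAvailable) return Target.ON_DEVICE;
                if (cloudReachable) return Target.CLOUD;
                break;
            case ONLY_ON_DEVICE:
                if (onDeviceAvailable) return Target.ON_DEVICE;
                break;
            case PREFER_IN_CLOUD:
                if (cloudReachable) return Target.CLOUD;
                if (onDeviceAvailable) return Target.ON_DEVICE;
                break;
            case ONLY_IN_CLOUD:
                if (cloudReachable) return Target.CLOUD;
                break;
        }
        throw new IllegalStateException("No model available for mode " + mode);
    }

    public static void main(String[] args) {
        // The on-device model is missing but the device is online, so the request falls back.
        System.out.println(resolve(Mode.PREFER_ON_DEVICE, false, true)); // prints CLOUD
    }
}
```

The two "ONLY" modes never fall back, which is why they are the ones that can end in an exception even when the other kind of model would have been usable.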

Kotlin

// Using this SDK to access on-device inference is an Experimental release and requires opt-in
@OptIn(PublicPreviewAPI::class)

// ...

// Initialize the Gemini Developer API backend service
// Create a GenerativeModel instance with a model that supports your use case
// Set the inference mode (like PREFER_ON_DEVICE to use the on-device model if available)
val model = Firebase.ai(backend = GenerativeBackend.googleAI())
    .generativeModel(
        modelName = "MODEL_NAME",
        onDeviceConfig = OnDeviceConfig(mode = InferenceMode.PREFER_ON_DEVICE)
    )

Java

// Initialize the Gemini Developer API backend service
// Create a GenerativeModel instance with a model that supports your use case
// Set the inference mode (like PREFER_ON_DEVICE to use the on-device model if available)
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI())
    .generativeModel(
        "MODEL_NAME",
        new OnDeviceConfig(InferenceMode.PREFER_ON_DEVICE)
    );

// Use the GenerativeModelFutures Java compatibility layer which offers
// support for ListenableFuture and Publisher APIs
GenerativeModelFutures model = GenerativeModelFutures.from(ai);

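Step 5: Send a prompt request to a model

This section shows you how to send various types of input to generate different types of output, including:

Generate text from text-only input
Generate text from text-and-image (multimodal) input

Generate text from text-only input

Before trying this sample, make sure that you've completed the Get started section of this guide.

You can use generateContent() to generate text from a prompt that contains text: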

Kotlin

// Imports + initialization of Gemini API backend service + creation of model instance

// Provide a prompt that contains text
val prompt = "Write a story about a magic backpack."

// To generate text output, call generateContent with the text input
val response = model.generateContent(prompt)
print(response.text)

Java

// Imports + initialization of Gemini API backend service + creation of model instance

// Provide a prompt that contains text
Content prompt = new Content.Builder()
    .addText("Write a story about a magic backpack.")
    .build();

// To generate text output, call generateContent with the text input
ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        String resultText = result.getText();
        System.out.println(resultText);
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);

Note that Firebase AI Logic also supports streaming of text responses by using generateContentStream (instead of generateContent).

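Generate text from text-and-image (multimodal) input

Before trying this sample, make sure that you've completed the Get started section of this guide.

You can use generateContent() to generate text from a prompt that contains text and up to one image file (Bitmap only), providing each input file's mimeType and the file itself.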

Kotlin

// Imports + initialization of Gemini API backend service + creation of model instance

// Loads an image from the app/res/drawable/ directory
val bitmap: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky)

// Provide a prompt that includes the image specified above and text
val prompt = content {
  image(bitmap)
  text("What developer tool is this mascot from?")
}

// To generate text output, call generateContent with the prompt
val response = model.generateContent(prompt)
print(response.text)

Java

// Imports + initialization of Gemini API backend service + creation of model instance

Bitmap bitmap = BitmapFactory.decodeResource(getResources(), R.drawable.sparky);

// Provide a prompt that includes the image specified above and text
Content content = new Content.Builder()
        .addImage(bitmap)
        .addText("What developer tool is this mascot from?")
        .build();

// To generate text output, call generateContent with the prompt
ListenableFuture<GenerateContentResponse> response = model.generateContent(content);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        String resultText = result.getText();
        System.out.println(resultText);
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);

Note that Firebase AI Logic also supports streaming of text responses by using generateContentStream (instead of generateContent).

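What else can you do?

A variety of other configuration options and capabilities are available for your hybrid experiences, such as setting the temperature.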

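Features not yet available for on-device inference

Because this is an experimental release, not all capabilities of cloud models are available for on-device inference.

The features listed in this section are not yet available for on-device inference. If you want to use any of them, we recommend using the ONLY_IN_CLOUD inference mode for a more consistent experience.

Generating structured output (like JSON or enums)
Generating text from image file input types other than Bitmap (an image loaded into memory)
Generating text from more than one image file
Generating text from audio, video, and document (like PDF) inputs
Generating images using Gemini or Imagen models
Providing files using URLs in multimodal requests. You must provide files as inline data to on-device models.
Sending requests that exceed 4,000 tokens (or approximately 3,000 English words)
Multi-turn chat
Providing the model with tools to help it generate its response (like function calling, code execution, URL context, grounding with Google Search, and grounding with Google Maps)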
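
The list above puts the request limit at 4,000 tokens, or approximately 3,000 English words. As a rough pre-check before choosing an inference mode, an app could estimate a prompt's size from its word count using that same 4:3 ratio. The sketch below assumes whitespace-separated English text; the TokenBudget class and its method names are hypothetical, and the result is only an approximation, not the count a real tokenizer would produce.

```java
// Illustrative only: a word-count heuristic, not a tokenizer and not an SDK API.
final class TokenBudget {
    // Limit stated in this guide for an entire on-device inference request.
    static final int ON_DEVICE_TOKEN_LIMIT = 4000;

    private TokenBudget() {}

    /**
     * Estimates tokens as ceil(words * 4 / 3), based on the guide's
     * "4,000 tokens is approximately 3,000 English words" rule of thumb.
     */
    static int estimateTokens(String prompt) {
        String trimmed = prompt.trim();
        if (trimmed.isEmpty()) {
            return 0;
        }
        int words = trimmed.split("\\s+").length;
        // Integer form of ceil(words * 4 / 3), which avoids floating-point rounding.
        return (words * 4 + 2) / 3;
    }

    /** Returns true when the estimate stays within the on-device request limit. */
    static boolean likelyFitsOnDevice(String prompt) {
        return estimateTokens(prompt) <= ON_DEVICE_TOKEN_LIMIT;
    }

    public static void main(String[] args) {
        String prompt = "Write a story about a magic backpack.";
        System.out.println(estimateTokens(prompt));     // prints 10 (7 words)
        System.out.println(likelyFitsOnDevice(prompt)); // prints true
    }
}
```

Because the estimate is coarse, a prompt close to the limit is safer with an inference mode that can use a cloud-hosted model.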

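AI monitoring in the Firebase console does not show any data for on-device inference (including on-device logs). However, any inference that uses a cloud-hosted model can be monitored just like other inference through Firebase AI Logic.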

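Other limitations

In addition to the above, on-device inference has the following limitations (learn more in the ML Kit documentation):

Your app's end users must be using a supported device for on-device inference.
Your app can only run on-device inference while it is in the foreground.
Only English and Korean have been validated for on-device inference.
The maximum token limit for an entire on-device inference request is 4,000 tokens. If your requests might exceed this limit, make sure to configure an inference mode that can use a cloud-hosted model.
We recommend avoiding on-device inference use cases that require long output (more than 256 tokens).
AICore (an Android system service that manages the on-device models) enforces an inference quota per app. Making too many API requests in a short period results in an ErrorCode.BUSY response. If you receive this error, consider retrying the request with exponential backoff. In addition, ErrorCode.PER_APP_BATTERY_USE_QUOTA_EXCEEDED can be returned if an app exceeds a long-duration quota (for example, a daily quota).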
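
The last item above suggests retrying with exponential backoff after a busy response. Here is a minimal, self-contained Java sketch of that retry pattern. It does not call the SDK: BusyException is a hypothetical stand-in for however your app surfaces ErrorCode.BUSY, and the BackoffRetry class, its method names, and the delay values are assumptions chosen for illustration.

```java
import java.util.concurrent.Callable;

// Illustrative only: a generic retry helper, not part of the Firebase SDK or ML Kit.
final class BackoffRetry {
    /** Hypothetical stand-in for a "busy" failure such as an ErrorCode.BUSY response. */
    static final class BusyException extends Exception {
        BusyException(String message) {
            super(message);
        }
    }

    private BackoffRetry() {}

    /** Delay before the retry that follows attempt number `attempt` (0-based): base * 2^attempt, capped. */
    static long delayForAttempt(int attempt, long baseDelayMs, long maxDelayMs) {
        // Cap the shift so the multiplication cannot overflow for reasonable base delays.
        long delay = baseDelayMs << Math.min(attempt, 20);
        return Math.min(delay, maxDelayMs);
    }

    /** Runs the request, retrying only on BusyException, for at most maxAttempts attempts. */
    static <T> T callWithBackoff(Callable<T> request, int maxAttempts,
                                 long baseDelayMs, long maxDelayMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        BusyException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return request.call();
            } catch (BusyException e) {
                last = e;
                // Sleep only if another attempt will follow.
                if (attempt < maxAttempts - 1) {
                    Thread.sleep(delayForAttempt(attempt, baseDelayMs, maxDelayMs));
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated request that is busy twice and then succeeds.
        String result = callWithBackoff(() -> {
            calls[0]++;
            if (calls[0] < 3) {
                throw new BusyException("busy");
            }
            return "ok after " + calls[0] + " calls";
        }, 5, 10, 1000);
        System.out.println(result); // prints "ok after 3 calls"
    }
}
```

Retrying does not help once a long-duration quota is exhausted, so an error like ErrorCode.PER_APP_BATTERY_USE_QUOTA_EXCEEDED is better handled by an inference mode that can use a cloud-hosted model. On Android, run a blocking helper like this off the main thread; many implementations also add random jitter to each delay.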
