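Customize images based on a specified subject using Imagen

This page describes how to use the customization capability of Imagen to edit or generate images based on a specified subject, using the Firebase AI Logic SDKs.

How it works: You provide a text prompt and at least one reference image that shows a specific subject (such as a product, person, or pet). The model uses these inputs to generate a new image based on the subject shown in the reference images.

For example, you can ask the model to apply a cartoon style to a photo of a child, or to change the color of a bicycle in a picture.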

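Before you begin

Only available when using the Vertex AI Gemini API as your API provider.

If you haven't already, complete the getting started guide. It describes how to set up your Firebase project, connect your app to Firebase, add the SDK, initialize the backend service for your chosen API provider, and create an ImagenModel instance.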

Models that support this capability

Imagen offers image editing through its capability model:

imagen-3.0-capability-001

Note that for Imagen models, the global location is not supported.

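Send a subject customization request

The following sample shows a subject customization request that asks the model to generate a new image based on a provided reference image (in this example, a cat). Because a cat is an animal, the sample uses the subject type ImagenSubjectReferenceType.ANIMAL.

If your subject is a person or a product, you can also use this sample with the following changes:

If your subject is a person, use the subject type ImagenSubjectReferenceType.PERSON. You can send this type of request with or without a face mesh control image to further guide image generation.
If your subject is a product, use the subject type ImagenSubjectReferenceType.PRODUCT.

See the prompt templates later on this page to learn how to write prompts and how to refer to reference images in them.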

Swift

Image editing with Imagen models isn't supported for Swift. Check back later this year!

Kotlin

// Using this SDK to access Imagen models is a Preview release and requires opt-in
@OptIn(PublicPreviewAPI::class)
suspend fun customizeImage() {
    // Initialize the Vertex AI Gemini API backend service
    // Optionally specify the location to access the model (for example, `us-central1`)
    val ai = Firebase.ai(backend = GenerativeBackend.vertexAI(location = "us-central1"))

    // Create an `ImagenModel` instance with an Imagen "capability" model
    val model = ai.imagenModel("imagen-3.0-capability-001")

    // This example assumes 'referenceImage' is a pre-loaded Bitmap.
    // In a real app, this might come from the user's device or a URL.
    val referenceImage: Bitmap = TODO("Load your reference image Bitmap here")

    // Define the subject reference using the reference image.
    val subjectReference = ImagenSubjectReference(
        image = referenceImage,
        referenceID = 1,
        description = "cat",
        subjectType = ImagenSubjectReferenceType.ANIMAL
    )

    // Provide a prompt that describes the final image.
    // The "[1]" links the prompt to the subject reference with ID 1.
    val prompt = "A cat[1] flying through outer space"

    // Use the editImage API to perform the subject customization.
    // Pass the list of references, the prompt, and an editing configuration.
    val editedImage = model.editImage(
        referenceImages = listOf(subjectReference),
        prompt = prompt,
        config = ImagenEditingConfig(
            editSteps = 50 // Number of editing steps, a higher value can improve quality
        )
    )

    // Process the result
}

Java

// Initialize the Vertex AI Gemini API backend service
// Optionally specify the location to access the model (for example, `us-central1`)
// Create an `ImagenModel` instance with an Imagen "capability" model
ImagenModel imagenModel = FirebaseAI.getInstance(GenerativeBackend.vertexAI("us-central1"))
        .imagenModel(
                /* modelName */ "imagen-3.0-capability-001");

ImagenModelFutures model = ImagenModelFutures.from(imagenModel);

// This example assumes 'referenceImage' is a pre-loaded Bitmap.
// In a real app, this might come from the user's device or a URL.
Bitmap referenceImage = null; // TODO("Load your image Bitmap here");

// Define the subject reference using the reference image.
ImagenSubjectReference subjectReference = new ImagenSubjectReference.Builder()
        .setImage(referenceImage)
        .setReferenceID(1)
        .setDescription("cat")
        .setSubjectType(ImagenSubjectReferenceType.ANIMAL)
        .build();

// Provide a prompt that describes the final image.
// The "[1]" links the prompt to the subject reference with ID 1.
String prompt = "A cat[1] flying through outer space";

// Define the editing configuration.
ImagenEditingConfig imagenEditingConfig = new ImagenEditingConfig.Builder()
        .setEditSteps(50) // Number of editing steps, a higher value can improve quality
        .build();

// Use the editImage API to perform the subject customization.
// Pass the list of references, the prompt, and an editing configuration.
Futures.addCallback(model.editImage(Collections.singletonList(subjectReference), prompt, imagenEditingConfig), new FutureCallback<ImagenGenerationResponse>() {
    @Override
    public void onSuccess(ImagenGenerationResponse result) {
        if (result.getImages().isEmpty()) {
            Log.d("TAG", "No images generated");
            return; // Nothing to display; avoids calling get(0) on an empty list
        }
        Bitmap bitmap = ((ImagenInlineImage) result.getImages().get(0)).asBitmap();
        // Use the bitmap to display the image in your UI
    }

    @Override
    public void onFailure(Throwable t) {
        // ...
    }
}, Executors.newSingleThreadExecutor());

Web

Image editing with Imagen models isn't supported for Web apps. Check back later this year!

Dart

import 'dart:typed_data';
import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

// Initialize FirebaseApp
await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI Gemini API backend service
// Optionally specify a location to access the model (for example, `us-central1`)
final ai = FirebaseAI.vertexAI(location: 'us-central1');

// Create an `ImagenModel` instance with an Imagen "capability" model
final model = ai.imagenModel(model: 'imagen-3.0-capability-001');

// This example assumes 'referenceImage' is a pre-loaded Uint8List.
// In a real app, this might come from the user's device or a URL.
final Uint8List referenceImage = Uint8List(0); // TODO: Load your reference image data here

// Define the subject reference using the reference image.
final subjectReference = ImagenSubjectReference(
  image: referenceImage,
  referenceId: 1,
  description: 'cat',
  subjectType: ImagenSubjectReferenceType.animal,
);

// Provide a prompt that describes the final image.
// The "[1]" links the prompt to the subject reference with ID 1.
final prompt = "A cat[1] flying through outer space.";

try {
  // Use the editImage API to perform the subject customization.
  // Pass the list of references, the prompt, and an editing configuration.
  final response = await model.editImage(
    [subjectReference],
    prompt,
    config: ImagenEditingConfig(
      editSteps: 50, // Number of editing steps, a higher value can improve quality
    ),
  );

  // Process the result.
  if (response.images.isNotEmpty) {
    final editedImage = response.images.first.bytes;
    // Use the editedImage (a Uint8List) to display the image, save it, etc.
    print('Image successfully generated!');
  } else {
    // Handle the case where no images were generated.
    print('Error: No images were generated.');
  }
} catch (e) {
  // Handle any potential errors during the API call.
  print('An error occurred: $e');
}

Unity

Image editing with Imagen models isn't supported for Unity. Check back later this year!

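Prompt templates

In the request, you provide reference images (up to 4) by defining an ImagenSubjectReference for each one, in which you specify a reference ID for the image (and optionally a subject description). Note that multiple images can share the same reference ID (for example, multiple photos of the same cat).

Then, when you write the prompt, you refer to these IDs. For example, you use [1] in the prompt to refer to images with the reference ID 1. If you provide a subject description, you can also include it in the prompt so that the prompt is easier for a human to read.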
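
Because the link between a prompt and its reference images is just the bracketed ID, it can be worth checking that every marker in a prompt matches a declared reference ID before sending the request. The following is a minimal sketch in plain Java that doesn't use the Firebase SDK. The class `PromptReferenceCheck` and its methods are hypothetical helpers written for illustration, not part of any Firebase API.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the Firebase AI Logic SDK.
final class PromptReferenceCheck {
    // Matches markers such as "[1]" or "[12]" (one to three digits).
    private static final Pattern MARKER = Pattern.compile("\\[(\\d{1,3})\\]");

    private PromptReferenceCheck() {}

    // Returns the reference IDs used in the prompt, in order of first appearance.
    static Set<Integer> referencedIds(String prompt) {
        Set<Integer> ids = new LinkedHashSet<>();
        Matcher matcher = MARKER.matcher(prompt);
        while (matcher.find()) {
            ids.add(Integer.parseInt(matcher.group(1)));
        }
        return ids;
    }

    // Returns the IDs that the prompt uses but that no reference image declares.
    static Set<Integer> unknownIds(String prompt, Set<Integer> declaredIds) {
        Set<Integer> unknown = referencedIds(prompt);
        unknown.removeAll(declaredIds);
        return unknown;
    }
}
```

For the cat sample above, `PromptReferenceCheck.unknownIds("A cat[1] flying through outer space", declaredIds)` returns an empty set when `declaredIds` contains 1. A non-empty result suggests that the prompt mentions a reference that was never attached to the request.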

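The following table provides prompt templates that you can use as a starting point for writing prompts for customization based on a subject (such as a product, person, or pet).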

Use case	Reference images	Prompt template	Example
Product image stylization - advertisement	Subject images (up to 4)	Create an image about `SUBJECT_DESCRIPTION [1]` to match the description: ${PROMPT}	Create an image about `Luxe Elixir hair oil, golden liquid in glass bottle [1]` to match the description: A close-up, high-key image of a woman's hand holding `Luxe Elixir hair oil, golden liquid in glass bottle [1]` against a pure white background. The woman's hand is well-lit and the focus is sharp on the bottle, with a shallow depth of field blurring the background and emphasizing the product. The lighting is soft and diffused, creating a subtle glow around the bottle and hand. The overall composition is simple and elegant, highlighting the product's luxurious appeal.
Product image stylization - attribute change	Subject images (up to 4)	Generate an image of a `SUBJECT_DESCRIPTION` but ${PROMPT}	Generate an image of a `Seiko watch [1]` but in blue.
Person image stylization without face mesh input	Subject images (up to 4)	Create an image about `SUBJECT_DESCRIPTION [1]` to match the description: a portrait of `SUBJECT_DESCRIPTION [1]` ${PROMPT}	Create an image about `a woman with short hair[1]` to match the description: a portrait of `a woman with short hair[1]` in 3d-cartoon style with blurred background. A cute and lovely character, with a smiling face, looking at the camera, pastel color tone ...
Person image stylization without face mesh input	Subject images (up to 4)	Create a `STYLE_DESCRIPTION [2]` image about `SUBJECT_DESCRIPTION [1]` to match the description: a portrait of `SUBJECT_DESCRIPTION [1]` STYLE_PROMPT	Create a `3d-cartoon style [2]` image about `a woman with short hair [1]` to match the description: a portrait of `a woman with short hair [1]` in 3d-cartoon style with blurred background. A cute and lovely character, with a smiling face, looking at the camera, pastel color tone ...
Person image stylization with face mesh input	Subject images (up to 3); face mesh control image (1)	Generate an image of `SUBJECT_DESCRIPTION [1]` with the `Face mesh from the control image [2]`. ${PROMPT}	Generate an image of `the person [1]` with the `face mesh from the control image [2]`. The person should be looking straight ahead with a neutral expression. The background should be a ...
Person image stylization with face mesh input	Subject images (up to 3); face mesh control image (1)	Create an image about `SUBJECT_DESCRIPTION [1]` in the pose of the `CONTROL_IMAGE [2]` to match the description: a portrait of `SUBJECT_DESCRIPTION [1]` ${PROMPT}	Create an image about `a woman with short hair [1]` in the pose of the `control image [2]` to match the description: a portrait of `a woman with short hair [1]` in 3d-cartoon style with blurred background. A cute and lovely character, with a smiling face, looking at the camera, pastel color tone ...
Person image stylization with face mesh input	Subject images (up to 3); face mesh control image (1)	Create a `STYLE_DESCRIPTION [3]` image about `SUBJECT_DESCRIPTION [1]` in the pose of the `CONTROL_IMAGE [2]` to match the description: a portrait of `SUBJECT_DESCRIPTION [1]` ${PROMPT}	Create a `3d-cartoon style [3]` image about `a woman with short hair [1]` in the pose of the `control image [2]` to match the description: a portrait of `a woman with short hair [1]` in 3d-cartoon style with blurred background. A cute and lovely character, with a smiling face, looking at the camera, pastel color tone ...
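
One way to apply a template from the table is plain string substitution. The sketch below, in plain Java with no Firebase dependency, fills the product advertisement template and the first person stylization template. The class `SubjectPromptTemplates` and its method names are hypothetical, chosen only for illustration.

```java
// Hypothetical helper for illustration; not part of the Firebase AI Logic SDK.
final class SubjectPromptTemplates {
    private SubjectPromptTemplates() {}

    // Wraps a subject description and its reference ID in backticks,
    // for example: `Seiko watch [1]`
    static String subjectTag(String description, int referenceId) {
        return "`" + description + " [" + referenceId + "]`";
    }

    // Fills the "Product image stylization - advertisement" template.
    static String productAd(String description, int referenceId, String prompt) {
        return "Create an image about " + subjectTag(description, referenceId)
                + " to match the description: " + prompt;
    }

    // Fills the first "Person image stylization without face mesh input" template.
    static String personPortrait(String description, int referenceId, String prompt) {
        String tag = subjectTag(description, referenceId);
        return "Create an image about " + tag
                + " to match the description: a portrait of " + tag + " " + prompt;
    }
}
```

The resulting string is what you would pass as the prompt argument of `editImage`, alongside a subject reference that uses the same reference ID.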

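Best practices and limitations

If you use a person as the subject, we recommend that the face in the reference image has the following characteristics:

Is centered and takes up at least half of the entire image
Faces the camera in a frontal view on all rotation axes (roll, pitch, and yaw)
Isn't occluded by objects like sunglasses or masks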
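
These recommendations can be turned into a rough check that runs before a reference image is sent. The sketch below is plain Java and assumes that a face bounding box and head rotation angles are already available from some face detector (not shown, and not part of the Firebase SDK). The class name, the reading of "half of the image" as half of the image area, and both tolerance values are assumptions made for illustration, not documented thresholds. Occlusion isn't covered by this sketch.

```java
// Hypothetical pre-flight check; thresholds are illustrative assumptions.
final class FaceReferenceCheck {
    // Assumed tolerance: face center within 10% of the image center on each axis.
    private static final double CENTER_TOLERANCE = 0.10;
    // Assumed tolerance for "frontal": each rotation angle within 15 degrees.
    private static final double MAX_ANGLE_DEGREES = 15.0;

    private FaceReferenceCheck() {}

    // True if the center of the face box is close to the center of the image.
    static boolean isCentered(int imageWidth, int imageHeight,
                              int faceLeft, int faceTop, int faceWidth, int faceHeight) {
        double centerX = faceLeft + faceWidth / 2.0;
        double centerY = faceTop + faceHeight / 2.0;
        return Math.abs(centerX - imageWidth / 2.0) <= CENTER_TOLERANCE * imageWidth
                && Math.abs(centerY - imageHeight / 2.0) <= CENTER_TOLERANCE * imageHeight;
    }

    // True if the face box covers at least half of the image area.
    static boolean coversHalf(int imageWidth, int imageHeight, int faceWidth, int faceHeight) {
        return 2L * faceWidth * faceHeight >= (long) imageWidth * imageHeight;
    }

    // True if roll, pitch, and yaw (in degrees) are all within the assumed tolerance.
    static boolean isFrontal(double rollDegrees, double pitchDegrees, double yawDegrees) {
        return Math.abs(rollDegrees) <= MAX_ANGLE_DEGREES
                && Math.abs(pitchDegrees) <= MAX_ANGLE_DEGREES
                && Math.abs(yawDegrees) <= MAX_ANGLE_DEGREES;
    }
}
```

An app could use checks like these to prompt the user for a better photo rather than to reject images outright, since the recommendations are guidance and not hard requirements.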

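Use cases

The customization capability accepts free-form prompts, which might give the impression that the model can do more than it was trained to do. The following sections describe the intended use cases for customization, along with a non-exhaustive list of examples of unintended use cases.

We recommend using this capability for the intended use cases, because the model was trained on those use cases and is expected to produce good results for them. Conversely, if you ask the model to do things outside of the intended use cases, you should expect poor results.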

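Intended use cases

The following are the intended use cases for customization based on a subject:

Stylize a photo of a person.
Stylize a photo of a person while preserving their facial expression.
(Low success rate) Place products like a sofa or a cookie into different scenes with different product angles.
Generate variations of a product that don't preserve exact details.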

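Examples of unintended use cases

The following is a non-exhaustive list of unintended use cases for customization based on a subject. The model isn't trained for these use cases and will likely produce poor results.

Place two or more people into different scenes while preserving their identities.
Place two or more people into different scenes while preserving their identities, and specify the style of the output image by using a sample image as the style input.
Stylize a photo of two or more people while preserving their identities.
Place a pet into different scenes while preserving its identity.
Stylize a photo of a pet and turn it into a drawing.
Stylize a photo of a pet and turn it into a drawing, while preserving or specifying the style of the image (such as watercolor).
Place a pet and a person into different scenes while preserving the identities of both.
Stylize a photo of a pet and one or more people and turn it into a drawing.
Place two products into different scenes with different product angles.
Place products like a cookie or a sofa into different scenes with different product angles while following a specific image style (such as photorealistic with specific colors, lighting styles, or animation).
Place a product into different scenes while preserving the specific scene composition specified by a control image.
Place two products into different scenes with different product angles, using a specific image as input (such as photorealistic with specific colors, lighting styles, or animation).
Place two products into different scenes while preserving the specific scene composition specified by a control image.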