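Customize images based on a specified subject using Imagen

This page describes how to use the customization capability of Imagen to edit or generate images based on a specified subject, using the Firebase AI Logic SDKs.

How it works: You provide a text prompt and at least one reference image that shows a specific subject (such as a product, person, or pet). The model uses these inputs to generate a new image based on the subject specified in the reference images.

For example, you can ask the model to apply a cartoon style to a photo of a child, or to change the color of a bicycle in a picture.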

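Before you begin

This capability is available only when you use the Vertex AI Gemini API as your API provider.

If you haven't already, complete the getting started guide, which describes how to set up your Firebase project, connect your app to Firebase, add the SDK, initialize the backend service for your chosen API provider, and create an ImagenModel instance.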

Models that support this capability

Imagen offers image editing through its capability model:

imagen-3.0-capability-001

Note that the global location is not supported for Imagen models.

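Send a subject customization request

The following sample shows a subject customization request that asks the model to generate a new image based on the provided reference image (in this example, a cat). Because a cat is an animal, the request uses the subject type ImagenSubjectReferenceType.ANIMAL.

If your subject is a person or a product, you can also use this sample with the following changes:

If your subject is a person, use the subject type ImagenSubjectReferenceType.PERSON. You can send this type of request with or without a face mesh control image to further guide image generation.
If your subject is a product, use the subject type ImagenSubjectReferenceType.PRODUCT.

See the "Prompt templates" section later on this page to learn how to write prompts and how to use reference images within them.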

Swift

Image editing with Imagen models isn't supported for Swift. Check back later this year!

Kotlin

// Using this SDK to access Imagen models is a Preview release and requires opt-in
@OptIn(PublicPreviewAPI::class)
suspend fun customizeImage() {
    // Initialize the Vertex AI Gemini API backend service
    // Optionally specify the location to access the model (for example, `us-central1`)
    val ai = Firebase.ai(backend = GenerativeBackend.vertexAI(location = "us-central1"))

    // Create an `ImagenModel` instance with an Imagen "capability" model
    val model = ai.imagenModel("imagen-3.0-capability-001")

    // This example assumes 'referenceImage' is a pre-loaded Bitmap.
    // In a real app, this might come from the user's device or a URL.
    val referenceImage: Bitmap = TODO("Load your reference image Bitmap here")

    // Define the subject reference using the reference image.
    val subjectReference = ImagenSubjectReference(
        image = referenceImage,
        referenceID = 1,
        description = "cat",
        subjectType = ImagenSubjectReferenceType.ANIMAL
    )

    // Provide a prompt that describes the final image.
    // The "[1]" links the prompt to the subject reference with ID 1.
    val prompt = "A cat[1] flying through outer space"

    // Use the editImage API to perform the subject customization.
    // Pass the list of references, the prompt, and an editing configuration.
    val editedImage = model.editImage(
        referenceImages = listOf(subjectReference),
        prompt = prompt,
        config = ImagenEditingConfig(
            editSteps = 50 // Number of editing steps, a higher value can improve quality
        )
    )

    // Process the result
}

Java

// Initialize the Vertex AI Gemini API backend service
// Optionally specify the location to access the model (for example, `us-central1`)
// Create an `ImagenModel` instance with an Imagen "capability" model
ImagenModel imagenModel = FirebaseAI.getInstance(GenerativeBackend.vertexAI("us-central1"))
        .imagenModel(
                /* modelName */ "imagen-3.0-capability-001");

ImagenModelFutures model = ImagenModelFutures.from(imagenModel);

// This example assumes 'referenceImage' is a pre-loaded Bitmap.
// In a real app, this might come from the user's device or a URL.
Bitmap referenceImage = null; // TODO("Load your image Bitmap here");

// Define the subject reference using the reference image.
ImagenSubjectReference subjectReference = new ImagenSubjectReference.Builder()
        .setImage(referenceImage)
        .setReferenceID(1)
        .setDescription("cat")
        .setSubjectType(ImagenSubjectReferenceType.ANIMAL)
        .build();

// Provide a prompt that describes the final image.
// The "[1]" links the prompt to the subject reference with ID 1.
String prompt = "A cat[1] flying through outer space";

// Define the editing configuration.
ImagenEditingConfig imagenEditingConfig = new ImagenEditingConfig.Builder()
        .setEditSteps(50) // Number of editing steps, a higher value can improve quality
        .build();

// Use the editImage API to perform the subject customization.
// Pass the list of references, the prompt, and an editing configuration.
Futures.addCallback(model.editImage(Collections.singletonList(subjectReference), prompt, imagenEditingConfig), new FutureCallback<ImagenGenerationResponse>() {
    @Override
    public void onSuccess(ImagenGenerationResponse result) {
        if (result.getImages().isEmpty()) {
            Log.d("TAG", "No images generated");
            return; // Nothing to display; avoid indexing into an empty list
        }
        Bitmap bitmap = ((ImagenInlineImage) result.getImages().get(0)).asBitmap();
        // Use the bitmap to display the image in your UI
    }

    @Override
    public void onFailure(Throwable t) {
        // ...
    }
}, Executors.newSingleThreadExecutor());

Web

Image editing with Imagen models isn't supported for Web apps. Check back later this year!

Dart

import 'dart:typed_data';
import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

// Initialize FirebaseApp
await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI Gemini API backend service
// Optionally specify a location to access the model (for example, `us-central1`)
final ai = FirebaseAI.vertexAI(location: 'us-central1');

// Create an `ImagenModel` instance with an Imagen "capability" model
final model = ai.imagenModel(model: 'imagen-3.0-capability-001');

// This example assumes 'referenceImage' is a pre-loaded Uint8List.
// In a real app, this might come from the user's device or a URL.
final Uint8List referenceImage = Uint8List(0); // TODO: Load your reference image data here

// Define the subject reference using the reference image.
final subjectReference = ImagenSubjectReference(
  image: referenceImage,
  referenceId: 1,
  description: 'cat',
  subjectType: ImagenSubjectReferenceType.animal,
);

// Provide a prompt that describes the final image.
// The "[1]" links the prompt to the subject reference with ID 1.
final prompt = "A cat[1] flying through outer space.";

try {
  // Use the editImage API to perform the subject customization.
  // Pass the list of references, the prompt, and an editing configuration.
  final response = await model.editImage(
    [subjectReference],
    prompt,
    config: ImagenEditingConfig(
      editSteps: 50, // Number of editing steps, a higher value can improve quality
    ),
  );

  // Process the result.
  if (response.images.isNotEmpty) {
    final editedImage = response.images.first.bytes;
    // Use the editedImage (a Uint8List) to display the image, save it, etc.
    print('Image successfully generated!');
  } else {
    // Handle the case where no images were generated.
    print('Error: No images were generated.');
  }
} catch (e) {
  // Handle any potential errors during the API call.
  print('An error occurred: $e');
}

Unity

Image editing with Imagen models isn't supported for Unity. Check back later this year!

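Prompt templates

In the request, you provide reference images (up to 4) by defining an ImagenSubjectReference, in which you specify a reference ID for the image (and optionally a subject description). Note that multiple images can share the same reference ID (for example, multiple photos of the same cat).

Then, when you write the prompt, you refer to these IDs. For example, you use [1] in the prompt to refer to the images with reference ID 1. If you provide a subject description, you can also include it in the prompt so that the prompt is easier for a human to read.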
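Because the prompt and the reference images are linked only by these bracketed IDs, a quick check before sending the request can catch a prompt that mentions an ID no reference image declares. The following is a minimal sketch in plain Java, not part of the Firebase AI Logic SDK: the class name `PromptReferenceCheck` and both helper methods are hypothetical, and the sketch assumes markers always take the form `[n]` with a short run of digits.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not an SDK class): checks "[n]" markers in a prompt
// against the reference IDs you plan to send with the request.
public class PromptReferenceCheck {
    // Matches bracketed reference markers such as "[1]" or "[12]".
    private static final Pattern MARKER = Pattern.compile("\\[(\\d{1,9})\\]");

    /** Returns the distinct reference IDs mentioned in the prompt, in order of first appearance. */
    public static Set<Integer> idsInPrompt(String prompt) {
        Set<Integer> ids = new LinkedHashSet<>();
        Matcher m = MARKER.matcher(prompt);
        while (m.find()) {
            ids.add(Integer.parseInt(m.group(1)));
        }
        return ids;
    }

    /** Returns the IDs used in the prompt that none of the supplied references declare. */
    public static Set<Integer> missingIds(String prompt, List<Integer> suppliedIds) {
        Set<Integer> missing = idsInPrompt(prompt);
        missing.removeAll(suppliedIds);
        return missing;
    }

    public static void main(String[] args) {
        String prompt = "A cat[1] flying through outer space";
        System.out.println(idsInPrompt(prompt));            // prints [1]
        System.out.println(missingIds(prompt, List.of(1))); // prints []
        System.out.println(missingIds(prompt, List.of(2))); // prints [1]
    }
}
```

An empty result from `missingIds` means every marker in the prompt has a matching reference ID. Several images sharing one ID (as the text above allows) is fine here, because only the set of declared IDs matters.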

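The following table describes prompt templates that you can use as a starting point for writing prompts for customization based on a subject (such as a product, person, or pet).

Use case	Reference images	Prompt template	Example
Product image stylization - advertisement	Subject image (up to 4)	Create an image about `SUBJECT_DESCRIPTION [1]` to match the description: ${PROMPT}	Create an image about `Luxe Elixir hair oil, golden liquid in glass bottle [1]` to match the description: A close-up, high-key image of a woman's hand holding `Luxe Elixir hair oil, golden liquid in glass bottle [1]` against a pure white background. The woman's hand is well lit and the focus is sharp on the bottle, with a shallow depth of field that blurs the background and emphasizes the product. The lighting is soft and diffused, creating a subtle glow around the bottle and hand. The overall composition is simple and elegant, highlighting the product's luxurious appeal.
Product image stylization - attribute change	Subject image (up to 4)	Generate an image of a `SUBJECT_DESCRIPTION` but ${PROMPT}	Generate an image of a `Seiko watch [1]` but in blue.
Person image stylization without face mesh input	Subject image (up to 4)	Create an image about `SUBJECT_DESCRIPTION [1]` to match the description: a portrait of `SUBJECT_DESCRIPTION [1]` ${PROMPT}	Create an image about `a woman with short hair[1]` to match the description: a portrait of `a woman with short hair[1]` in 3D-cartoon style with blurred background. A cute and lovely character, with a smiling face, looking at the camera, pastel color tone ...
Person image stylization without face mesh input	Subject image (up to 4)	Create a `STYLE_DESCRIPTION [2]` image about `SUBJECT_DESCRIPTION [1]` to match the description: a portrait of `SUBJECT_DESCRIPTION [1]` STYLE_PROMPT	Create a `3d-cartoon style [2]` image about `a woman with short hair [1]` to match the description: a portrait of `a woman with short hair [1]` in 3D-cartoon style with blurred background. A cute character, with a smiling face, looking at the camera, pastel color tone ...
Person image stylization with face mesh input	Subject image (up to 3) Face mesh control image (1)	Generate an image of `SUBJECT_DESCRIPTION [1]` with the `Face mesh from the control image [2]`. ${PROMPT}	Generate an image of `the person [1]` with the `face mesh from the control image [2]`. The person should be looking straight ahead with a neutral expression. The background should be a ...
Person image stylization with face mesh input	Subject image (up to 3) Face mesh control image (1)	Create an image about `SUBJECT_DESCRIPTION [1]` in the pose of the `CONTROL_IMAGE [2]` to match the description: a portrait of `SUBJECT_DESCRIPTION [1]` ${PROMPT}	Create an image about `a woman with short hair [1]` in the pose of the `control image [2]` to match the description: a portrait of `a woman with short hair [1]` in 3D-cartoon style with blurred background. A cute character, with a smiling face, looking at the camera, pastel color tone ...
Person image stylization with face mesh input	Subject image (up to 3) Face mesh control image (1)	Create a `STYLE_DESCRIPTION [3]` image about `SUBJECT_DESCRIPTION [1]` in the pose of the `CONTROL_IMAGE [2]` to match the description: a portrait of `SUBJECT_DESCRIPTION [1]` ${PROMPT}	Create a `3d-cartoon style [3]` image about `a woman with short hair [1]` in the pose of the `control image [2]` to match the description: a portrait of `a woman with short hair [1]` in 3D-cartoon style with blurred background. A cute and lovely character, with a smiling face, looking at the camera, pastel color tone ...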
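The templates in the table are plain strings with placeholders, so filling one in amounts to literal text substitution. The sketch below illustrates this for the attribute-change template, using only the Java standard library. The class `PromptTemplateSketch` and its `fill` method are hypothetical names introduced for illustration and are not part of the Firebase AI Logic SDK; the sketch also assumes that the substituted values never contain placeholder text themselves.

```java
import java.util.Map;

// Hypothetical helper (not an SDK class): fills the placeholders of a prompt template.
public class PromptTemplateSketch {
    /** Replaces each placeholder key with its value; keys are matched as literal text. */
    public static String fill(String template, Map<String, String> values) {
        String result = template;
        for (Map.Entry<String, String> entry : values.entrySet()) {
            // String.replace performs literal (non-regex) replacement of every occurrence.
            result = result.replace(entry.getKey(), entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // Template taken from the "attribute change" row of the table.
        String template = "Generate an image of a SUBJECT_DESCRIPTION but ${PROMPT}";

        // The subject description carries the "[1]" marker that ties it to reference ID 1.
        String prompt = fill(template, Map.of(
                "SUBJECT_DESCRIPTION", "Seiko watch [1]",
                "${PROMPT}", "in blue."));

        System.out.println(prompt); // prints: Generate an image of a Seiko watch [1] but in blue.
    }
}
```

The resulting string is what you would pass as the prompt argument of editImage, alongside a subject reference whose reference ID is 1.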

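Best practices and limitations

If you use a person as the subject, we recommend that the face in the reference images has the following properties:

Is centered and takes up at least half of the entire image
Faces forward in a frontal view along every rotation axis (roll, pitch, and yaw)
Isn't occluded by objects such as sunglasses or masks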

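Use cases

Because the customization capability offers free-form prompting, it might give the impression that the model can do more than it was trained to do. The following sections describe the intended use cases for customization, along with a non-exhaustive list of unintended use cases.

We recommend using this capability for the intended use cases, because the model has been trained on them and is expected to produce good results. Conversely, if you push the model to do things outside the intended use cases, expect poor results.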

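Intended use cases

The following are the intended use cases for customization based on a subject:

Stylize a person's photo.
Stylize a person's photo while preserving the person's facial expression.
(Low success rate) Place a product, such as a sofa or a cookie, in different scenes with different product angles.
Generate product variations that don't preserve exact details.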

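Examples of unintended use cases

The following is a non-exhaustive list of unintended use cases for customization based on a subject. The model isn't trained for these use cases and will likely produce poor results.

Place two or more people in different scenes while preserving their identities.
Place two or more people in different scenes while preserving their identities, and specify the style of the output image by using a sample image as the style input.
Stylize a photo of two or more people while preserving their identities.
Place a pet in different scenes while preserving its identity.
Stylize a pet's photo and turn it into a drawing.
Stylize a pet's photo and turn it into a drawing, while preserving or specifying the style of the image (such as watercolor).
Place a pet and a person in different scenes while preserving the identities of both.
Stylize a photo of a pet and one or more people, and turn it into a drawing.
Place two products in different scenes with different product angles.
Place a product, such as a cookie or a sofa, in different scenes with different product angles while following a specific image style (such as photorealistic with specific colors, lighting styles, or animation).
Place a product in different scenes while preserving the specific scene composition specified by a control image.
Place two products in different scenes with different product angles, using a specific image as input (such as a photorealistic image with specific colors, lighting styles, or animation).
Place two products in different scenes while preserving the specific scene composition specified by a control image.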