

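Customize images based on a control using Imagen

This page describes how to use the customization capability of Imagen to edit or generate images based on a specified control, using the Firebase AI Logic SDKs.

How it works: You provide a text prompt and at least one control reference image (such as a drawing or a Canny edge image). The model uses these inputs to generate a new image based on the control images.

For example, you can provide a drawing of a rocket ship and the moon along with a text prompt, and the model creates a watercolor painting based on the drawing.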


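Types of control reference images

The reference image for controlled customization can be a scribble, a Canny edge image, or a face mesh.

What is a scribble?

A scribble is a hand-drawn sketch or outline. It gives the model the basic structure, spatial arrangement, and layout, while the text prompt supplies the details, colors, and textures of the generated image.

For example, you provide a drawing of a house, a tree, and a sun, along with a text prompt such as "A cozy cottage next to a large oak tree at sunrise, in a whimsical watercolor style." The model generates an image of the described scene while following the general layout of your drawing.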

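What is a Canny edge image?

A Canny edge image is produced by applying an algorithm (specifically, the Canny edge detector) to a source image to map the edges of the objects in it. These edges help the model preserve the precise structure of the objects while it changes the style, colors, or other attributes specified in the text prompt.

For example, you have a photo of a dog sitting on a sofa. You run a Canny edge detector on the photo to get an image that contains only the outlines of the dog and the sofa. You then use this edge map as the control image with a text prompt such as "A photo of a golden retriever puppy sitting on a leather couch." The model generates a new photo in which the dog holds exactly the same pose and the sofa keeps the same composition as in the original, but the subjects are now a golden retriever puppy and a leather couch.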
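To make the idea of an edge map concrete, here is a minimal Java sketch that marks pixels where the brightness changes sharply. It is a simplified stand-in, not the Canny detector itself: it thresholds the Sobel gradient magnitude only, whereas Canny also applies Gaussian smoothing, non-maximum suppression, and hysteresis thresholding. The class name `EdgeMapSketch`, the method `sobelEdges`, and the grayscale `int[][]` input format are assumptions made for illustration. In a real app you would more likely use an imaging library's Canny implementation, and this page does not say whether a simplified map like this one works well as a control image.

```java
final class EdgeMapSketch {
    /**
     * Returns a binary edge map for a grayscale image (values 0..255):
     * 255 where the Sobel gradient magnitude reaches the threshold, else 0.
     * Border pixels are left at 0 because the 3x3 kernel does not fit there.
     */
    static int[][] sobelEdges(int[][] gray, int threshold) {
        int h = gray.length;
        if (h == 0) {
            return new int[0][0];
        }
        int w = gray[0].length;
        int[][] edges = new int[h][w];
        for (int y = 1; y < h - 1; y++) {
            for (int x = 1; x < w - 1; x++) {
                // Horizontal gradient (responds to vertical edges).
                int gx = -gray[y - 1][x - 1] + gray[y - 1][x + 1]
                        - 2 * gray[y][x - 1] + 2 * gray[y][x + 1]
                        - gray[y + 1][x - 1] + gray[y + 1][x + 1];
                // Vertical gradient (responds to horizontal edges).
                int gy = -gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1]
                        + gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1];
                double magnitude = Math.sqrt((double) gx * gx + (double) gy * gy);
                edges[y][x] = magnitude >= threshold ? 255 : 0;
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        // A 5x5 image: dark on the left two columns, bright on the right three.
        int[][] gray = new int[5][5];
        for (int y = 0; y < 5; y++) {
            for (int x = 2; x < 5; x++) {
                gray[y][x] = 255;
            }
        }
        int[][] edges = sobelEdges(gray, 100);
        for (int[] row : edges) {
            StringBuilder line = new StringBuilder();
            for (int v : row) {
                line.append(v == 255 ? '#' : '.');
            }
            System.out.println(line);
        }
    }
}
```

The output keeps only the boundary between the dark and bright regions, which is the kind of structural outline the control image is meant to carry.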

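What is a face mesh?

A face mesh is an image that helps the model understand and replicate a specific face. It's a 3D digital representation of a human face, typically a network of interconnected points (vertices) and triangles that defines the shape and contours of the face. It gives the model key landmarks (such as the eyes, nose, and mouth) and texture.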

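Before you begin

Only available when using the Vertex AI Gemini API as your API provider.

If you haven't already, complete the getting started guide, which describes how to set up your Firebase project, connect your app to Firebase, add the SDK, initialize the backend service for your chosen API provider, and create an ImagenModel instance.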

Models that support this capability

Imagen offers image editing through its capability model:

imagen-3.0-capability-001

Note that Imagen models don't support the global location.
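Because the global location is not supported for Imagen models, an app that reads the location from configuration could reject it early. The following is a minimal Java sketch of such a guard. `ImagenLocationCheck` and `requireSupportedLocation` are hypothetical names used for illustration and are not part of the Firebase AI Logic SDK. The check only encodes the single restriction stated above and does not verify that a location is otherwise valid.

```java
import java.util.Locale;
import java.util.Objects;

final class ImagenLocationCheck {
    /**
     * Hypothetical helper (not an SDK API): returns the normalized location,
     * or throws if it is the unsupported "global" location.
     */
    static String requireSupportedLocation(String location) {
        Objects.requireNonNull(location, "location");
        String normalized = location.trim().toLowerCase(Locale.ROOT);
        if (normalized.equals("global")) {
            throw new IllegalArgumentException(
                    "Imagen models don't support the global location; "
                            + "use a regional location such as us-central1");
        }
        return normalized;
    }

    public static void main(String[] args) {
        // Prints the location that would be passed to the backend initializer.
        System.out.println(requireSupportedLocation("us-central1"));
    }
}
```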

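Send a controlled customization request

The following sample shows a controlled customization request that asks the model to generate a new image based on the provided reference image (in this example, a drawing of space, like a rocket and the moon). Because the reference image is a hand-drawn sketch or outline, it uses the control type CONTROL_TYPE_SCRIBBLE.

If your reference image is a Canny edge image or a face mesh, you can also use this example with the following changes:

If your reference image is a Canny edge image, use the control type CONTROL_TYPE_CANNY.
If your reference image is a face mesh, use the control type CONTROL_TYPE_FACE_MESH. This control can only be used with subject customization of people.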
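The mapping from the kind of reference image to the control type can be summarized in a small lookup. This is a minimal Java sketch under stated assumptions: `ControlTypeChooser` and the `ReferenceKind` enum are hypothetical names introduced for illustration, and the method returns the control type names from this page as plain strings instead of SDK constants so that the snippet stays self-contained.

```java
final class ControlTypeChooser {
    /** Hypothetical classification of the reference image supplied by the app. */
    enum ReferenceKind { SCRIBBLE, CANNY_EDGE, FACE_MESH }

    /** Returns the name of the control type described on this page for each kind. */
    static String controlTypeFor(ReferenceKind kind) {
        switch (kind) {
            case SCRIBBLE:
                return "CONTROL_TYPE_SCRIBBLE";
            case CANNY_EDGE:
                return "CONTROL_TYPE_CANNY";
            case FACE_MESH:
                // Only usable with subject customization of people.
                return "CONTROL_TYPE_FACE_MESH";
            default:
                throw new IllegalArgumentException("Unknown reference kind: " + kind);
        }
    }

    public static void main(String[] args) {
        for (ReferenceKind kind : ReferenceKind.values()) {
            System.out.println(kind + " -> " + controlTypeFor(kind));
        }
    }
}
```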

See the "Prompt templates" section later on this page to learn how to write prompts and how to use reference images in them.

Swift

Image editing with Imagen models isn't supported for Swift. Check back later this year!

Kotlin

// Using this SDK to access Imagen models is a Preview release and requires opt-in
@OptIn(PublicPreviewAPI::class)
suspend fun customizeImage() {
    // Initialize the Vertex AI Gemini API backend service
    // Optionally specify the location to access the model (for example, `us-central1`)
    val ai = Firebase.ai(backend = GenerativeBackend.vertexAI(location = "us-central1"))

    // Create an `ImagenModel` instance with an Imagen "capability" model
    val model = ai.imagenModel("imagen-3.0-capability-001")

    // This example assumes 'referenceImage' is a pre-loaded Bitmap.
    // In a real app, this might come from the user's device or a URL.
    val referenceImage: Bitmap = TODO("Load your reference image Bitmap here")

    // Define the control reference using the reference image.
    val controlReference = ImagenControlReference(
        image = referenceImage,
        referenceID = 1,
        controlType = CONTROL_TYPE_SCRIBBLE
    )

    // Provide a prompt that describes the final image.
    // The "[1]" links the prompt to the control reference with ID 1.
    val prompt = "A cat flying through outer space arranged like the space scribble[1]"

    // Use the editImage API to perform the controlled customization.
    // Pass the list of references, the prompt, and an editing configuration.
    val editedImage = model.editImage(
        referenceImages = listOf(controlReference),
        prompt = prompt,
        config = ImagenEditingConfig(
            editSteps = 50 // Number of editing steps, a higher value can improve quality
        )
    )

    // Process the result
}

Java

// Initialize the Vertex AI Gemini API backend service
// Optionally specify the location to access the model (for example, `us-central1`)
// Create an `ImagenModel` instance with an Imagen "capability" model
ImagenModel imagenModel = FirebaseAI.getInstance(GenerativeBackend.vertexAI("us-central1"))
        .imagenModel(
                /* modelName */ "imagen-3.0-capability-001");

ImagenModelFutures model = ImagenModelFutures.from(imagenModel);

// This example assumes 'referenceImage' is a pre-loaded Bitmap.
// In a real app, this might come from the user's device or a URL.
Bitmap referenceImage = null; // TODO: Load your reference image Bitmap here

// Define the control reference using the reference image.
ImagenControlReference controlReference = new ImagenControlReference.Builder()
        .setImage(referenceImage)
        .setReferenceID(1)
        .setControlType(CONTROL_TYPE_SCRIBBLE)
        .build();

// Provide a prompt that describes the final image.
// The "[1]" links the prompt to the control reference with ID 1.
String prompt = "A cat flying through outer space arranged like the space scribble[1]";

// Define the editing configuration.
ImagenEditingConfig imagenEditingConfig = new ImagenEditingConfig.Builder()
        .setEditSteps(50) // Number of editing steps, a higher value can improve quality
        .build();

// Use the editImage API to perform the controlled customization.
// Pass the list of references, the prompt, and an editing configuration.
Futures.addCallback(model.editImage(Collections.singletonList(controlReference), prompt, imagenEditingConfig), new FutureCallback<ImagenGenerationResponse>() {
    @Override
    public void onSuccess(ImagenGenerationResponse result) {
        if (result.getImages().isEmpty()) {
            Log.d("TAG", "No images generated");
            return; // Nothing to display; avoids calling get(0) on an empty list
        }
        Bitmap bitmap = ((ImagenInlineImage) result.getImages().get(0)).asBitmap();
        // Use the bitmap to display the image in your UI
    }

    @Override
    public void onFailure(Throwable t) {
        // ...
    }
}, Executors.newSingleThreadExecutor());

Web

Image editing with Imagen models isn't supported for Web apps. Check back later this year!

Dart

import 'dart:typed_data';
import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

// Initialize FirebaseApp
await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI Gemini API backend service
// Optionally specify a location to access the model (for example, `us-central1`)
final ai = FirebaseAI.vertexAI(location: 'us-central1');

// Create an `ImagenModel` instance with an Imagen "capability" model
final model = ai.imagenModel(model: 'imagen-3.0-capability-001');

// This example assumes 'referenceImage' is a pre-loaded Uint8List.
// In a real app, this might come from the user's device or a URL.
final Uint8List referenceImage = Uint8List(0); // TODO: Load your reference image data here

// Define the control reference using the reference image.
final controlReference = ImagenControlReference(
  image: referenceImage,
  referenceId: 1,
  controlType: ImagenControlType.scribble,
);

// Provide a prompt that describes the final image.
// The "[1]" links the prompt to the control reference with ID 1.
final prompt = "A cat flying through outer space arranged like the space scribble[1]";

try {
  // Use the editImage API to perform the controlled customization.
  // Pass the list of references, the prompt, and an editing configuration.
  final response = await model.editImage(
    [controlReference],
    prompt,
    config: ImagenEditingConfig(
      editSteps: 50, // Number of editing steps, a higher value can improve quality
    ),
  );

  // Process the result.
  if (response.images.isNotEmpty) {
    final editedImage = response.images.first.bytes;
    // Use the editedImage (a Uint8List) to display the image, save it, etc.
    print('Image successfully generated!');
  } else {
    // Handle the case where no images were generated.
    print('Error: No images were generated.');
  }
} catch (e) {
  // Handle any potential errors during the API call.
  print('An error occurred: $e');
}

Unity

Image editing with Imagen models isn't supported for Unity. Check back later this year!

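Prompt templates

In the request, you provide reference images (up to 4 images) by defining an ImagenControlReference in which you specify a reference ID for the image. Note that multiple images can share the same reference ID (for example, multiple scribbles of the same idea).

Then, when writing the prompt, you refer to these IDs. For example, you use [1] in the prompt to refer to images with the reference ID 1.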
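The relationship between reference IDs and prompt markers can be checked before sending a request. The following is a minimal Java sketch, assuming markers always take the form `[n]` with a decimal ID. `PromptReferenceCheck`, `referencedIds`, and `validate` are hypothetical names for illustration and are not part of the Firebase AI Logic SDK. The sketch checks only the two rules stated above: at most 4 reference images, and every marker in the prompt points at a defined reference ID.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PromptReferenceCheck {
    /** Limit stated on this page: up to 4 reference images per request. */
    static final int MAX_REFERENCE_IMAGES = 4;

    /** Matches markers such as "[1]"; assumes IDs are short decimal numbers. */
    private static final Pattern MARKER = Pattern.compile("\\[(\\d{1,9})\\]");

    /** Returns the distinct reference IDs mentioned in the prompt, sorted. */
    static Set<Integer> referencedIds(String prompt) {
        Set<Integer> ids = new TreeSet<>();
        Matcher matcher = MARKER.matcher(prompt);
        while (matcher.find()) {
            ids.add(Integer.parseInt(matcher.group(1)));
        }
        return ids;
    }

    /**
     * Returns a list of problems; an empty list means the prompt and the
     * reference images are consistent. imageReferenceIds holds one entry per
     * reference image, so repeated IDs are allowed.
     */
    static List<String> validate(String prompt, List<Integer> imageReferenceIds) {
        List<String> problems = new ArrayList<>();
        if (imageReferenceIds.size() > MAX_REFERENCE_IMAGES) {
            problems.add("Too many reference images: " + imageReferenceIds.size());
        }
        Set<Integer> defined = new TreeSet<>(imageReferenceIds);
        for (int id : referencedIds(prompt)) {
            if (!defined.contains(id)) {
                problems.add("Prompt refers to [" + id + "] but no image has that ID");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        String prompt =
                "A cat flying through outer space arranged like the space scribble[1]";
        System.out.println(validate(prompt, List.of(1)));
    }
}
```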

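The following table provides prompt templates that you can use as a starting point for writing prompts for customization based on a control.

Use case	Reference images	Prompt template	Example
Controlled customization	Scribble map (1)	Generate an image that aligns with the `scribble map [1]` to match the description: ${STYLE_PROMPT} ${PROMPT}.	Generate an image that aligns with the `scribble map [1]` to match the description: The image should be in the style of an impressionistic oil painting with relaxed brushstrokes. It possesses a naturally-lit ambience and noticeable brushstrokes. A side-view of a car. The car is parked on a wet, reflective road surface, with city lights reflecting in the puddles.
Controlled customization	Canny control image (1)	Generate an image aligning with the `edge map [1]` to match the description: ${STYLE_PROMPT} ${PROMPT}	Generate an image aligning with the `edge map [1]` to match the description: The image should be in the style of an impressionistic oil painting with relaxed brushstrokes. It possesses a naturally-lit ambience and noticeable brushstrokes. A side-view of a car. The car is parked on a wet, reflective road surface, with city lights reflecting in the puddles.
Person image stylization with FaceMesh input	Subject image (1-3), FaceMesh control image (1)	Create an image about `SUBJECT_DESCRIPTION [1]` in the pose of the `CONTROL_IMAGE [2]` to match the description: a portrait of `SUBJECT_DESCRIPTION [1]` ${PROMPT}	Create an image about `a woman with short hair [1]` in the pose of the `control image [2]` to match the description: a portrait of `a woman with short hair [1]` in 3D-cartoon style with a blurred background. A cute and lovely character, with a smiling face, looking at the camera, pastel color tone ...
Person image stylization with FaceMesh input	Subject image (1-3), FaceMesh control image (1)	Create a ${STYLE_PROMPT} image about `SUBJECT_DESCRIPTION [1]` in the pose of the `CONTROL_IMAGE [2]` to match the description: a portrait of `SUBJECT_DESCRIPTION [1]` ${PROMPT}	Create a 3D-cartoon style image about `a woman with short hair [1]` in the pose of the `control image [2]` to match the description: a portrait of `a woman with short hair [1]` in 3D-cartoon style with a blurred background. A cute and lovely character, with a smiling face, looking at the camera, pastel color tone ...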
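Filling in a template amounts to substituting the `${STYLE_PROMPT}` and `${PROMPT}` placeholders with your own text. The following is a minimal Java sketch of that substitution using the scribble template from the first row of the table, rendered in English. `PromptTemplates` and `fill` are hypothetical names for illustration and are not part of the Firebase AI Logic SDK. The sketch assumes the inserted text does not itself contain a placeholder.

```java
final class PromptTemplates {
    /** English rendering of the scribble template in the first table row. */
    static final String SCRIBBLE_TEMPLATE =
            "Generate an image that aligns with the scribble map [1] "
                    + "to match the description: ${STYLE_PROMPT} ${PROMPT}.";

    /** Replaces both placeholders literally; surrounding whitespace is trimmed. */
    static String fill(String template, String stylePrompt, String prompt) {
        return template
                .replace("${STYLE_PROMPT}", stylePrompt.trim())
                .replace("${PROMPT}", prompt.trim());
    }

    public static void main(String[] args) {
        String prompt = fill(
                SCRIBBLE_TEMPLATE,
                "The image should be in the style of an impressionistic oil painting "
                        + "with relaxed brushstrokes.",
                "A side-view of a car");
        System.out.println(prompt);
    }
}
```

Because this template already ends with a period, the sketch passes the final description without one to avoid a doubled period.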

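Best practices and limitations

Use cases

The customization capability accepts free-form prompts, which can give the impression that the model can do more than it was trained to do. The following sections describe the intended use cases for customization, along with a non-exhaustive list of unintended use cases.

We recommend using this capability for the intended use cases, because the model was trained on them and is expected to produce good results. Conversely, if you push the model to do things outside the intended use cases, you should expect poor results.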

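Intended use cases

The following are intended use cases for customization based on a control:

Generate an image that follows the prompt and the Canny edge control image.
Generate an image that follows the prompt and the scribble.
Stylize a photo of a person while preserving their facial expression.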

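Examples of unintended use cases

The following is a non-exhaustive list of unintended use cases for customization based on a control. The model isn't trained for these use cases and will likely produce poor results.

Generate an image using a style specified in the prompt.
Generate an image from text that follows a specific style provided by a reference image, with some level of control over the image composition using a control image.
Generate an image from text that follows a specific style provided by a reference image, with some level of control over the image composition using a control scribble.
Generate an image from text that follows a specific style provided by a reference image, with some level of control over the image composition using a control image, where the person in the image has a specific facial expression.
Stylize a photo of two or more people while preserving their facial expressions.
Stylize a photo of a pet as a drawing (for example, a watercolor) while preserving or specifying the composition of the image.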