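Customize images based on a specified subject using Imagen

This page describes how to use the customization capability of Imagen to edit or generate images based on a specified subject with the Firebase AI Logic SDKs.

How it works: You provide a text prompt and at least one reference image that shows a specific subject (such as a product, a person, or a pet). The model uses these inputs to generate a new image based on the subject shown in the reference images.

For example, you can apply a cartoon style to a photo of a child, or change the color of a bicycle in a photo.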

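Before you begin

This capability is only available when you use the Vertex AI Gemini API as your API provider.

If you haven't already, complete the getting started guide, which describes how to set up your Firebase project, connect your app to Firebase, add the SDK, initialize the backend service for your chosen API provider, and create an ImagenModel instance.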

Models that support this capability

Imagen offers image editing through its capability model:

imagen-3.0-capability-001

Imagen models don't support the global location.

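Send a subject customization request

The following sample shows a subject customization request that asks the model to generate a new image based on the provided reference image (in this example, a cat). Because a cat is an animal, the request uses the subject type ImagenSubjectReferenceType.ANIMAL.

If your subject is a person or a product, you can use the same example with the following changes:

If your subject is a person, use the subject type ImagenSubjectReferenceType.PERSON. You can send this type of request with or without a face mesh control image, which guides image generation in more detail.
If your subject is a product, use the subject type ImagenSubjectReferenceType.PRODUCT.

Review the prompt templates later on this page to learn how to write prompts and how to refer to reference images within them.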

Swift

Image editing with Imagen models isn't supported for Swift. Support is planned for later this year.

Kotlin

// Using this SDK to access Imagen models is a Preview release and requires opt-in
@OptIn(PublicPreviewAPI::class)
suspend fun customizeImage() {
    // Initialize the Vertex AI Gemini API backend service
    // Optionally specify the location to access the model (for example, `us-central1`)
    val ai = Firebase.ai(backend = GenerativeBackend.vertexAI(location = "us-central1"))

    // Create an `ImagenModel` instance with an Imagen "capability" model
    val model = ai.imagenModel("imagen-3.0-capability-001")

    // This example assumes 'referenceImage' is a pre-loaded Bitmap.
    // In a real app, this might come from the user's device or a URL.
    val referenceImage: Bitmap = TODO("Load your reference image Bitmap here")

    // Define the subject reference using the reference image.
    val subjectReference = ImagenSubjectReference(
        image = referenceImage,
        referenceID = 1,
        description = "cat",
        subjectType = ImagenSubjectReferenceType.ANIMAL
    )

    // Provide a prompt that describes the final image.
    // The "[1]" links the prompt to the subject reference with ID 1.
    val prompt = "A cat[1] flying through outer space"

    // Use the editImage API to perform the subject customization.
    // Pass the list of references, the prompt, and an editing configuration.
    val editedImage = model.editImage(
        referenceImages = listOf(subjectReference),
        prompt = prompt,
        config = ImagenEditingConfig(
            editSteps = 50 // Number of editing steps, a higher value can improve quality
        )
    )

    // Process the result
}

Java

// Initialize the Vertex AI Gemini API backend service
// Optionally specify the location to access the model (for example, `us-central1`)
// Create an `ImagenModel` instance with an Imagen "capability" model
ImagenModel imagenModel = FirebaseAI.getInstance(GenerativeBackend.vertexAI("us-central1"))
        .imagenModel(
                /* modelName */ "imagen-3.0-capability-001");

ImagenModelFutures model = ImagenModelFutures.from(imagenModel);

// This example assumes 'referenceImage' is a pre-loaded Bitmap.
// In a real app, this might come from the user's device or a URL.
Bitmap referenceImage = null; // TODO("Load your image Bitmap here");

// Define the subject reference using the reference image.
ImagenSubjectReference subjectReference = new ImagenSubjectReference.Builder()
        .setImage(referenceImage)
        .setReferenceID(1)
        .setDescription("cat")
        .setSubjectType(ImagenSubjectReferenceType.ANIMAL)
        .build();

// Provide a prompt that describes the final image.
// The "[1]" links the prompt to the subject reference with ID 1.
String prompt = "A cat[1] flying through outer space";

// Define the editing configuration.
ImagenEditingConfig imagenEditingConfig = new ImagenEditingConfig.Builder()
        .setEditSteps(50) // Number of editing steps, a higher value can improve quality
        .build();

// Use the editImage API to perform the subject customization.
// Pass the list of references, the prompt, and an editing configuration.
Futures.addCallback(model.editImage(Collections.singletonList(subjectReference), prompt, imagenEditingConfig), new FutureCallback<ImagenGenerationResponse>() {
    @Override
    public void onSuccess(ImagenGenerationResponse result) {
        if (result.getImages().isEmpty()) {
            Log.d("TAG", "No images generated");
            return; // Nothing to display; avoid reading index 0 of an empty list
        }
        Bitmap bitmap = ((ImagenInlineImage) result.getImages().get(0)).asBitmap();
        // Use the bitmap to display the image in your UI
    }

    @Override
    public void onFailure(Throwable t) {
        // ...
    }
}, Executors.newSingleThreadExecutor());

Web

Image editing with Imagen models isn't supported for Web apps. Support is planned for later this year.

Dart

import 'dart:typed_data';
import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

// Initialize FirebaseApp
await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI Gemini API backend service
// Optionally specify a location to access the model (for example, `us-central1`)
final ai = FirebaseAI.vertexAI(location: 'us-central1');

// Create an `ImagenModel` instance with an Imagen "capability" model
final model = ai.imagenModel(model: 'imagen-3.0-capability-001');

// This example assumes 'referenceImage' is a pre-loaded Uint8List.
// In a real app, this might come from the user's device or a URL.
final Uint8List referenceImage = Uint8List(0); // TODO: Load your reference image data here

// Define the subject reference using the reference image.
final subjectReference = ImagenSubjectReference(
  image: referenceImage,
  referenceId: 1,
  description: 'cat',
  subjectType: ImagenSubjectReferenceType.animal,
);

// Provide a prompt that describes the final image.
// The "[1]" links the prompt to the subject reference with ID 1.
final prompt = "A cat[1] flying through outer space.";

try {
  // Use the editImage API to perform the subject customization.
  // Pass the list of references, the prompt, and an editing configuration.
  final response = await model.editImage(
    [subjectReference],
    prompt,
    config: ImagenEditingConfig(
      editSteps: 50, // Number of editing steps, a higher value can improve quality
    ),
  );

  // Process the result.
  if (response.images.isNotEmpty) {
    final editedImage = response.images.first.bytes;
    // Use the editedImage (a Uint8List) to display the image, save it, etc.
    print('Image successfully generated!');
  } else {
    // Handle the case where no images were generated.
    print('Error: No images were generated.');
  }
} catch (e) {
  // Handle any potential errors during the API call.
  print('An error occurred: $e');
}

Unity

Image editing with Imagen models isn't supported for Unity. Support is planned for later this year.

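Prompt templates

In the request, you provide reference images (up to 4) by defining an ImagenSubjectReference for each one. In each ImagenSubjectReference, you specify a reference ID for the image and, optionally, a description of the subject. Multiple images can share the same reference ID (for example, several photos of the same cat).

When you write the prompt, you refer to these IDs. For example, you use [1] in the prompt to refer to the images with reference ID 1. If you provided a subject description, you can also include it in the prompt, which makes the prompt easier for a human to read.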
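Because the prompt and the reference images are linked only by these numeric markers, it can help to check that every marker in a prompt has a matching reference before sending the request. The following is a minimal sketch of such a check in plain Java. The class PromptReferenceCheck and its methods are hypothetical helpers written for this illustration; they are not part of the Firebase AI Logic SDK, and the sketch assumes markers always take the form [n].

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of the Firebase SDK) for checking [n] markers in a prompt. */
public final class PromptReferenceCheck {
    // Assumption: a marker is one to nine digits inside square brackets, such as [1].
    private static final Pattern MARKER = Pattern.compile("\\[(\\d{1,9})\\]");

    private PromptReferenceCheck() {}

    /** Collects the reference IDs that the prompt mentions, in ascending order. */
    public static Set<Integer> idsInPrompt(String prompt) {
        Set<Integer> ids = new TreeSet<>();
        Matcher matcher = MARKER.matcher(prompt);
        while (matcher.find()) {
            ids.add(Integer.parseInt(matcher.group(1)));
        }
        return ids;
    }

    /** Returns the IDs used in the prompt that none of the declared references provide. */
    public static Set<Integer> undeclaredIds(String prompt, Set<Integer> declaredIds) {
        Set<Integer> missing = idsInPrompt(prompt);
        missing.removeAll(declaredIds);
        return missing;
    }
}
```

In this sketch, an empty result from undeclaredIds means every marker in the prompt points at a reference ID that you declared.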

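The following table provides prompt templates that you can use as a starting point when writing prompts for customization based on a subject (such as a product, a person, or an animal).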

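Use case	Reference images	Prompt template	Example
Product image stylization - advertisement	Subject image (up to 4)	Create an image about `SUBJECT_DESCRIPTION [1]` to match the description: ${PROMPT}	Create an image about `Luxe Elixir hair oil, golden liquid in glass bottle [1]` to match the description: A close-up, high-key image of a woman's hand holding `Luxe Elixir hair oil, golden liquid in glass bottle [1]` against a pure white background. The woman's hand is well lit and the focus is sharp on the bottle, with a shallow depth of field blurring the background and emphasizing the product. The lighting is soft and diffused, creating a subtle glow around the bottle and hand. The overall composition is simple and elegant, highlighting the product's luxurious appeal.
Product image stylization - attribute change	Subject image (up to 4)	Generate an image of a `SUBJECT_DESCRIPTION` but ${PROMPT}	Generate an image of a `Seiko watch [1]` but in blue.
Person image stylization without face mesh input	Subject image (up to 4)	Create an image about `SUBJECT_DESCRIPTION [1]` to match the description: a portrait of `SUBJECT_DESCRIPTION [1]` ${PROMPT}	Create an image about `a woman with short hair[1]` to match the description: a portrait of `a woman with short hair[1]` in 3D cartoon style with blurred background. A cute and lovely character, with a smiling face, looking at the camera, pastel color tone ...
Person image stylization without face mesh input	Subject image (up to 4)	Create a `STYLE_DESCRIPTION [2]` image about `SUBJECT_DESCRIPTION [1]` to match the description: a portrait of `SUBJECT_DESCRIPTION [1]` STYLE_PROMPT	Create a `3d-cartoon style [2]` image about `a woman with short hair [1]` to match the description: a portrait of `a woman with short hair [1]` in 3D cartoon style with blurred background. A cute and lovely character, with a smiling face, looking at the camera, pastel color tone ...
Person image stylization with face mesh input	Subject image (up to 3), face mesh control image (1)	Generate an image of `SUBJECT_DESCRIPTION [1]` with the `Face mesh from the control image [2]`. ${PROMPT}	Generate an image of `the person [1]` with the `face mesh from the control image [2]`. The person is looking straight ahead with a neutral expression. The background is ...
Person image stylization with face mesh input	Subject image (up to 3), face mesh control image (1)	Create an image about `SUBJECT_DESCRIPTION [1]` in the pose of the `CONTROL_IMAGE [2]` to match the description: a portrait of `SUBJECT_DESCRIPTION [1]` ${PROMPT}	Create an image about `a woman with short hair [1]` in the pose of the `control image [2]` to match the description: a portrait of `a woman with short hair [1]` in 3D cartoon style with blurred background. A cute and lovely character, with a smiling face, looking at the camera, pastel color tone ...
Person image stylization with face mesh input	Subject image (up to 3), face mesh control image (1)	Create a `STYLE_DESCRIPTION [3]` image about `SUBJECT_DESCRIPTION [1]` in the pose of the `CONTROL_IMAGE [2]` to match the description: a portrait of `SUBJECT_DESCRIPTION [1]` ${PROMPT}	Create a `3d-cartoon style [3]` image about `a woman with short hair [1]` in the pose of the `control image [2]` to match the description: a portrait of `a woman with short hair [1]` in 3D cartoon style with blurred background. A cute and lovely character, with a smiling face, looking at the camera, pastel color tone ...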
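To illustrate how a template from the table becomes a concrete prompt, here is a minimal sketch in plain Java that substitutes the two placeholders. The class SubjectPromptTemplate, its constant, and its fill method are hypothetical names introduced for this example and are not part of the Firebase AI Logic SDK. The template string follows the product advertisement row, and the sample values in the usage note are made up.

```java
/** Hypothetical helper (not part of the Firebase SDK) that fills a prompt template. */
public final class SubjectPromptTemplate {
    /** Template modeled on the "Product image stylization - advertisement" row. */
    public static final String PRODUCT_AD =
            "Create an image about SUBJECT_DESCRIPTION [1] to match the description: ${PROMPT}";

    private SubjectPromptTemplate() {}

    /**
     * Replaces SUBJECT_DESCRIPTION with the subject description, then ${PROMPT} with the
     * free-form description of the final image. The [n] markers in the template are kept.
     */
    public static String fill(String template, String subjectDescription, String prompt) {
        return template
                .replace("SUBJECT_DESCRIPTION", subjectDescription)
                .replace("${PROMPT}", prompt);
    }
}
```

For example, fill(PRODUCT_AD, "Seiko watch", "on a marble table") yields a prompt that still carries the [1] marker, so it continues to refer to the reference images declared with ID 1.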

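Best practices and limitations

If you use a person as the subject, we recommend that the face in the reference image has the following properties:

Is centered and takes up at least half of the entire image
Is in a frontal view, that is, facing the camera along every rotation axis (roll, pitch, and yaw)
Isn't occluded by objects such as sunglasses or masks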
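If your app already knows where the face is in a reference image (for example, from a face detector of your choice), the first recommendation can be checked before sending a request. The following is a rough sketch in plain Java under several assumptions: the class FaceReferenceCheck is hypothetical and not part of any Firebase SDK, "at least half of the image" is interpreted as bounding-box area, and the centering tolerance is an arbitrary value chosen by the caller.

```java
/** Hypothetical pre-check (not part of the Firebase SDK) for a face bounding box. */
public final class FaceReferenceCheck {
    private FaceReferenceCheck() {}

    /** Assumption: "half of the image" means the face box covers at least 50% of the image area. */
    public static boolean coversHalf(int imageWidth, int imageHeight, int faceWidth, int faceHeight) {
        return 2L * faceWidth * faceHeight >= (long) imageWidth * imageHeight;
    }

    /**
     * Returns true when the center of the face box lies within the given tolerance of the
     * image center. The tolerance is a fraction of the image width and height, such as 0.05.
     */
    public static boolean isCentered(int imageWidth, int imageHeight,
                                     int faceLeft, int faceTop, int faceWidth, int faceHeight,
                                     double tolerance) {
        double offsetX = Math.abs((faceLeft + faceWidth / 2.0) - imageWidth / 2.0) / imageWidth;
        double offsetY = Math.abs((faceTop + faceHeight / 2.0) - imageHeight / 2.0) / imageHeight;
        return offsetX <= tolerance && offsetY <= tolerance;
    }
}
```

This sketch does not cover rotation or occlusion; those depend on what your face detector reports.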

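Use cases

The customization capability accepts free-style prompts, which can give the impression that the model can do more than it was trained to do. The following sections describe the intended use cases for customization, along with examples of unintended use cases.

We recommend using this capability for the intended use cases, because the model was trained on them and is expected to produce good results. Conversely, if you push the model to do things outside the intended use cases, expect poor results.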

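Intended use cases

The following are the intended use cases for customization based on a subject:

Stylize a photo of a person.
Stylize a photo of a person while preserving their facial expression.
(Low success rate) Place a product, such as a sofa or a cookie, in different scenes at different product angles.
Generate variations of a product that don't preserve exact details.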

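Examples of unintended use cases

The following is a non-exhaustive list of unintended use cases for customization based on a subject. The model isn't trained for these use cases and is likely to produce poor results.

Place two or more people in different scenes while preserving their identities.
Place two or more people in different scenes while preserving their identities, and specify the style of the output image by using a sample image as the style input.
Stylize a photo of two or more people while preserving their identities.
Place a pet in different scenes while preserving its identity.
Stylize a photo of a pet and turn it into a drawing.
Stylize a photo of a pet and turn it into a drawing, while preserving or specifying the style of the image (such as watercolor).
Place a pet and a person in different scenes while preserving the identities of both.
Stylize a photo of a pet and one or more people and turn it into a drawing.
Place two products in different scenes at different product angles.
Place a product, such as a cookie or a sofa, in different scenes at different product angles while following a specific image style (such as photorealistic with specific colors, lighting styles, or animation).
Place a product in a different scene while preserving the specific composition of the scene as specified by a control image.
Place two products in different scenes at different product angles, using a specific image as input (such as photorealistic with specific colors, lighting styles, or animation).
Place two products in different scenes while preserving the specific composition of the scene as specified by a control image.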