你可以要求 Gemini 模型使用純文字提示,以及文字和圖片提示生成及編輯圖片。使用 Firebase AI Logic 時,您可以直接從應用程式提出這項要求。
這項功能可讓您執行下列操作:
- 透過自然語言對話反覆生成圖像,並在調整圖像時維持一致性和情境脈絡。 
- 生成圖片時,可算繪高品質文字,包括長字串。 
- 生成文字和圖片夾雜的內容。舉例來說,單一輪次中包含文字和圖片的網誌文章。過去,這需要將多個模型串連在一起。 
- 運用 Gemini 的世界知識和推理能力生成圖片。 
如需支援的模式和功能完整清單 (以及範例提示),請參閱本頁稍後內容。
| 請參閱其他指南,瞭解更多處理圖片的選項 分析圖片 在裝置上分析圖片 產生結構化輸出內容 | 
選擇 Gemini 和 Imagen 模型
Firebase AI Logic SDK 支援使用 Gemini 模型或 Imagen 模型生成及編輯圖片。
在大多數用途中,請先使用 Gemini,然後只在需要高圖片品質的特殊工作時選擇 Imagen。
如要執行下列操作,請選擇 Gemini:
- 運用世界知識和推理能力,生成與內容相關的圖片。
- 流暢混合文字和圖片,或交錯顯示文字和圖片輸出內容。
- 在長篇文字序列中嵌入準確的圖像。
- 以對話方式編輯圖片,同時保留情境。
如要執行下列操作,請選擇 Imagen:
- 優先考量圖片品質、擬真度、藝術細節或特定風格 (例如印象派或動漫)。
- 融入品牌、風格,或生成標誌和產品設計。
- 明確指定生成圖像的顯示比例或格式。
事前準備
| 按一下 Gemini API 供應商,即可在這個頁面查看供應商專屬內容和程式碼。 | 
如果尚未完成,請參閱入門指南,瞭解如何設定 Firebase 專案、將應用程式連結至 Firebase、新增 SDK、為所選Gemini API供應商初始化後端服務,以及建立 GenerativeModel 執行個體。
如要測試及疊代提示,建議使用 Google AI Studio。
支援這項功能的模型
- gemini-2.5-flash-image(又稱「奈米香蕉」)。
請注意,SDK 也支援使用 Imagen 模型生成圖片。
生成及編輯圖像
你可以使用 Gemini 模型生成及編輯圖片。
生成圖像 (僅輸入文字)
| 試用這個範例前,請先完成本指南的「事前準備」一節,設定專案和應用程式。 在該節中,您也會點選所選Gemini API供應商的按鈕,以便在本頁面查看供應商專屬內容。 | 
你可以使用文字提示,要求 Gemini 模型生成圖片。
請務必建立 GenerativeModel 例項、在模型設定中加入 responseModalities: ["TEXT", "IMAGE"]generateContent。
Swift
import FirebaseAILogic
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let generativeModel = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)
// Provide a text prompt instructing the model to generate an image
let prompt = "Generate an image of the Eiffel tower with fireworks in the background."
// To generate an image, call `generateContent` with the text input
let response = try await model.generateContent(prompt)
// Handle the generated image
guard let inlineDataPart = response.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let uiImage = UIImage(data: inlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}
Kotlin
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig {
responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE) }
)
// Provide a text prompt instructing the model to generate an image
val prompt = "Generate an image of the Eiffel tower with fireworks in the background."
// To generate image output, call `generateContent` with the text input
val generatedImageAsBitmap = model.generateContent(prompt)
    // Handle the generated image
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image
Java
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);
GenerativeModelFutures model = GenerativeModelFutures.from(ai);
// Provide a text prompt instructing the model to generate an image
Content prompt = new Content.Builder()
        .addText("Generate an image of the Eiffel Tower with fireworks in the background.")
        .build();
// To generate an image, call `generateContent` with the text input
ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) { 
        // iterate over all the parts in the first candidate in the result object
        for (Part part : result.getCandidates().get(0).getContent().getParts()) {
            if (part instanceof ImagePart) {
                ImagePart imagePart = (ImagePart) part;
                // The returned image as a bitmap
                Bitmap generatedImageAsBitmap = imagePart.getImage();
                break;
            }
        }
    }
    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);
Web
import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";
// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};
// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);
// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });
// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});
// Provide a text prompt instructing the model to generate an image
const prompt = 'Generate an image of the Eiffel Tower with fireworks in the background.';
// To generate an image, call `generateContent` with the text input
const result = model.generateContent(prompt);
// Handle the generated image
try {
  const inlineDataParts = result.response.inlineDataParts();
  if (inlineDataParts?.[0]) {
    const image = inlineDataParts[0].inlineData;
    console.log(image.mimeType, image.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}
Dart
import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';
await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [ResponseModalities.text, ResponseModalities.image]),
);
// Provide a text prompt instructing the model to generate an image
final prompt = [Content.text('Generate an image of the Eiffel Tower with fireworks in the background.')];
// To generate an image, call `generateContent` with the text input
final response = await model.generateContent(prompt);
if (response.inlineDataParts.isNotEmpty) {
  final imageBytes = response.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}
Unity
using Firebase;
using Firebase.AI;
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new[] { ResponseModality.Text, ResponseModality.Image })
);
// Provide a text prompt instructing the model to generate an image
var prompt = "Generate an image of the Eiffel Tower with fireworks in the background.";
// To generate an image, call `GenerateContentAsync` with the text input
var response = await model.GenerateContentAsync(prompt);
var text = response.Text;
if (!string.IsNullOrWhiteSpace(text)) {
  // Do something with the text
}
// Handle the generated image
var imageParts = response.Candidates.First().Content.Parts
                         .OfType<ModelContent.InlineDataPart>()
                         .Where(part => part.MimeType == "image/png");
foreach (var imagePart in imageParts) {
  // Load the Image into a Unity Texture2D object
  UnityEngine.Texture2D texture2D = new(2, 2);
  if (texture2D.LoadImage(imagePart.Data.ToArray())) {
    // Do something with the image
  }
}
生成圖像與文字交雜的內容
| 試用這個範例前,請先完成本指南的「事前準備」一節,設定專案和應用程式。 在該節中,您也會點選所選Gemini API供應商的按鈕,以便在本頁面查看供應商專屬內容。 | 
你可以要求 Gemini 模型在文字回覆中穿插圖片。舉例來說,你可以生成食譜中每個步驟的圖片和說明,不必向模型或不同模型提出個別要求。
請務必建立 GenerativeModel 例項、在模型設定中加入 responseModalities: ["TEXT", "IMAGE"]generateContent。
Swift
import FirebaseAILogic
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let generativeModel = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)
// Provide a text prompt instructing the model to generate interleaved text and images
let prompt = """
Generate an illustrated recipe for a paella.
Create images to go alongside the text as you generate the recipe
"""
// To generate interleaved text and images, call `generateContent` with the text input
let response = try await model.generateContent(prompt)
// Handle the generated text and image
guard let candidate = response.candidates.first else {
  fatalError("No candidates in response.")
}
for part in candidate.content.parts {
  switch part {
  case let textPart as TextPart:
    // Do something with the generated text
    let text = textPart.text
  case let inlineDataPart as InlineDataPart:
    // Do something with the generated image
    guard let uiImage = UIImage(data: inlineDataPart.data) else {
      fatalError("Failed to convert data to UIImage.")
    }
  default:
    fatalError("Unsupported part type: \(part)")
  }
}
Kotlin
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig {
responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE) }
)
// Provide a text prompt instructing the model to generate interleaved text and images
val prompt = """
    Generate an illustrated recipe for a paella.
    Create images to go alongside the text as you generate the recipe
    """.trimIndent()
// To generate interleaved text and images, call `generateContent` with the text input
val responseContent = model.generateContent(prompt).candidates.first().content
// The response will contain image and text parts interleaved
for (part in responseContent.parts) {
    when (part) {
        is ImagePart -> {
            // ImagePart as a bitmap
            val generatedImageAsBitmap: Bitmap? = part.asImageOrNull()
        }
        is TextPart -> {
            // Text content from the TextPart
            val text = part.text
        }
    }
}
Java
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);
GenerativeModelFutures model = GenerativeModelFutures.from(ai);
// Provide a text prompt instructing the model to generate interleaved text and images
Content prompt = new Content.Builder()
        .addText("Generate an illustrated recipe for a paella.\n" +
                 "Create images to go alongside the text as you generate the recipe")
        .build();
// To generate interleaved text and images, call `generateContent` with the text input
ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        Content responseContent = result.getCandidates().get(0).getContent();
        // The response will contain image and text parts interleaved
        for (Part part : responseContent.getParts()) {
            if (part instanceof ImagePart) {
                // ImagePart as a bitmap
                Bitmap generatedImageAsBitmap = ((ImagePart) part).getImage();
            } else if (part instanceof TextPart){
                // Text content from the TextPart
                String text = ((TextPart) part).getText();
            }
        }
    }
    @Override
    public void onFailure(Throwable t) {
        System.err.println(t);
    }
}, executor);
Web
import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";
// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};
// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);
// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });
// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});
// Provide a text prompt instructing the model to generate interleaved text and images
const prompt = 'Generate an illustrated recipe for a paella.\n.' +
  'Create images to go alongside the text as you generate the recipe';
// To generate interleaved text and images, call `generateContent` with the text input
const result = await model.generateContent(prompt);
// Handle the generated text and image
try {
  const response = result.response;
  if (response.candidates?.[0].content?.parts) {
    for (const part of response.candidates?.[0].content?.parts) {
      if (part.text) {
        // Do something with the text
        console.log(part.text)
      }
      if (part.inlineData) {
        // Do something with the image
        const image = part.inlineData;
        console.log(image.mimeType, image.data);
      }
    }
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}
Dart
import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';
await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [ResponseModalities.text, ResponseModalities.image]),
);
// Provide a text prompt instructing the model to generate interleaved text and images
final prompt = [Content.text(
  'Generate an illustrated recipe for a paella\n ' +
  'Create images to go alongside the text as you generate the recipe'
)];
// To generate interleaved text and images, call `generateContent` with the text input
final response = await model.generateContent(prompt);
// Handle the generated text and image
final parts = response.candidates.firstOrNull?.content.parts
if (parts.isNotEmpty) {
  for (final part in parts) {
    if (part is TextPart) {
      // Do something with text part
      final text = part.text
    }
    if (part is InlineDataPart) {
      // Process image
      final imageBytes = part.bytes
    }
  }
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}
Unity
using Firebase;
using Firebase.AI;
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new[] { ResponseModality.Text, ResponseModality.Image })
);
// Provide a text prompt instructing the model to generate interleaved text and images
var prompt = "Generate an illustrated recipe for a paella \n" +
  "Create images to go alongside the text as you generate the recipe";
// To generate interleaved text and images, call `GenerateContentAsync` with the text input
var response = await model.GenerateContentAsync(prompt);
// Handle the generated text and image
foreach (var part in response.Candidates.First().Content.Parts) {
  if (part is ModelContent.TextPart textPart) {
    if (!string.IsNullOrWhiteSpace(textPart.Text)) {
      // Do something with the text
    }
  } else if (part is ModelContent.InlineDataPart dataPart) {
    if (dataPart.MimeType == "image/png") {
      // Load the Image into a Unity Texture2D object
      UnityEngine.Texture2D texture2D = new(2, 2);
      if (texture2D.LoadImage(dataPart.Data.ToArray())) {
        // Do something with the image
      }
    }
  }
}
編輯圖片 (輸入文字和圖片)
| 試用這個範例前,請先完成本指南的「事前準備」一節,設定專案和應用程式。 在該節中,您也會點選所選Gemini API供應商的按鈕,以便在本頁面查看供應商專屬內容。 | 
你可以使用文字提示和一或多張圖片,要求 Gemini 模型編輯圖片。
請務必建立 GenerativeModel 例項、在模型設定中加入 responseModalities: ["TEXT", "IMAGE"]generateContent。
Swift
import FirebaseAILogic
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let generativeModel = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)
// Provide an image for the model to edit
guard let image = UIImage(named: "scones") else { fatalError("Image file not found.") }
// Provide a text prompt instructing the model to edit the image
let prompt = "Edit this image to make it look like a cartoon"
// To edit the image, call `generateContent` with the image and text input
let response = try await model.generateContent(image, prompt)
// Handle the generated image
guard let inlineDataPart = response.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let uiImage = UIImage(data: inlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}
Kotlin
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig {
responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE) }
)
// Provide an image for the model to edit
val bitmap = BitmapFactory.decodeResource(context.resources, R.drawable.scones)
// Provide a text prompt instructing the model to edit the image
val prompt = content {
    image(bitmap)
    text("Edit this image to make it look like a cartoon")
}
// To edit the image, call `generateContent` with the prompt (image and text input)
val generatedImageAsBitmap = model.generateContent(prompt)
    // Handle the generated text and image
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image
Java
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);
GenerativeModelFutures model = GenerativeModelFutures.from(ai);
// Provide an image for the model to edit
Bitmap bitmap = BitmapFactory.decodeResource(resources, R.drawable.scones);
// Provide a text prompt instructing the model to edit the image
Content promptcontent = new Content.Builder()
        .addImage(bitmap)
        .addText("Edit this image to make it look like a cartoon")
        .build();
// To edit the image, call `generateContent` with the prompt (image and text input)
ListenableFuture<GenerateContentResponse> response = model.generateContent(promptcontent);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        // iterate over all the parts in the first candidate in the result object
        for (Part part : result.getCandidates().get(0).getContent().getParts()) {
            if (part instanceof ImagePart) {
                ImagePart imagePart = (ImagePart) part;
                Bitmap generatedImageAsBitmap = imagePart.getImage();
                break;
            }
        }
    }
    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);
Web
import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";
// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};
// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);
// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });
// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});
// Prepare an image for the model to edit
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}
// Provide a text prompt instructing the model to edit the image
const prompt = "Edit this image to make it look like a cartoon";
const fileInputEl = document.querySelector("input[type=file]");
const imagePart = await fileToGenerativePart(fileInputEl.files[0]);
// To edit the image, call `generateContent` with the image and text input
const result = await model.generateContent([prompt, imagePart]);
// Handle the generated image
try {
  const inlineDataParts = result.response.inlineDataParts();
  if (inlineDataParts?.[0]) {
    const image = inlineDataParts[0].inlineData;
    console.log(image.mimeType, image.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}
Dart
import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';
await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [ResponseModalities.text, ResponseModalities.image]),
);
// Prepare an image for the model to edit
final image = await File('scones.jpg').readAsBytes();
final imagePart = InlineDataPart('image/jpeg', image);
// Provide a text prompt instructing the model to edit the image
final prompt = TextPart("Edit this image to make it look like a cartoon");
// To edit the image, call `generateContent` with the image and text input
final response = await model.generateContent([
  Content.multi([prompt,imagePart])
]);
// Handle the generated image
if (response.inlineDataParts.isNotEmpty) {
  final imageBytes = response.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}
Unity
using Firebase;
using Firebase.AI;
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new[] { ResponseModality.Text, ResponseModality.Image })
);
// Prepare an image for the model to edit
var imageFile = System.IO.File.ReadAllBytes(System.IO.Path.Combine(
  UnityEngine.Application.streamingAssetsPath, "scones.jpg"));
var image = ModelContent.InlineData("image/jpeg", imageFile);
// Provide a text prompt instructing the model to edit the image
var prompt = ModelContent.Text("Edit this image to make it look like a cartoon.");
// To edit the image, call `GenerateContent` with the image and text input
var response = await model.GenerateContentAsync(new [] { prompt, image });
var text = response.Text;
if (!string.IsNullOrWhiteSpace(text)) {
  // Do something with the text
}
// Handle the generated image
var imageParts = response.Candidates.First().Content.Parts
                         .OfType<ModelContent.InlineDataPart>()
                         .Where(part => part.MimeType == "image/png");
foreach (var imagePart in imageParts) {
  // Load the Image into a Unity Texture2D object
  Texture2D texture2D = new Texture2D(2, 2);
  if (texture2D.LoadImage(imagePart.Data.ToArray())) {
    // Do something with the image
  }
}
透過多輪對話功能反覆編輯圖片
| 試用這個範例前,請先完成本指南的「事前準備」一節,設定專案和應用程式。 在該節中,您也會點選所選Gemini API供應商的按鈕,以便在本頁面查看供應商專屬內容。 | 
透過多輪對話,您可以與 Gemini 模型反覆互動,生成或提供圖片。
請務必建立 GenerativeModel 執行個體,在模型設定中加入 responseModalities: ["TEXT", "IMAGE"]startChat() 和 sendMessage() 傳送新的使用者訊息。
Swift
import FirebaseAILogic
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let generativeModel = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)
// Initialize the chat
let chat = model.startChat()
guard let image = UIImage(named: "scones") else { fatalError("Image file not found.") }
// Provide an initial text prompt instructing the model to edit the image
let prompt = "Edit this image to make it look like a cartoon"
// To generate an initial response, send a user message with the image and text prompt
let response = try await chat.sendMessage(image, prompt)
// Inspect the generated image
guard let inlineDataPart = response.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let uiImage = UIImage(data: inlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}
// Follow up requests do not need to specify the image again
let followUpResponse = try await chat.sendMessage("But make it old-school line drawing style")
// Inspect the edited image after the follow up request
guard let followUpInlineDataPart = followUpResponse.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let followUpUIImage = UIImage(data: followUpInlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}
Kotlin
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig {
responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE) }
)
// Provide an image for the model to edit
val bitmap = BitmapFactory.decodeResource(context.resources, R.drawable.scones)
// Create the initial prompt instructing the model to edit the image
val prompt = content {
    image(bitmap)
    text("Edit this image to make it look like a cartoon")
}
// Initialize the chat
val chat = model.startChat()
// To generate an initial response, send a user message with the image and text prompt
var response = chat.sendMessage(prompt)
// Inspect the returned image
var generatedImageAsBitmap = response
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image
// Follow up requests do not need to specify the image again
response = chat.sendMessage("But make it old-school line drawing style")
generatedImageAsBitmap = response
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image
Java
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);
GenerativeModelFutures model = GenerativeModelFutures.from(ai);
// Provide an image for the model to edit
Bitmap bitmap = BitmapFactory.decodeResource(resources, R.drawable.scones);
// Initialize the chat
ChatFutures chat = model.startChat();
// Create the initial prompt instructing the model to edit the image
Content prompt = new Content.Builder()
        .setRole("user")
        .addImage(bitmap)
        .addText("Edit this image to make it look like a cartoon")
        .build();
// To generate an initial response, send a user message with the image and text prompt
ListenableFuture<GenerateContentResponse> response = chat.sendMessage(prompt);
// Extract the image from the initial response
ListenableFuture<@Nullable Bitmap> initialRequest = Futures.transform(response, result -> {
    for (Part part : result.getCandidates().get(0).getContent().getParts()) {
        if (part instanceof ImagePart) {
            ImagePart imagePart = (ImagePart) part;
            return imagePart.getImage();
        }
    }
    return null;
}, executor);
// Follow up requests do not need to specify the image again
ListenableFuture<GenerateContentResponse> modelResponseFuture = Futures.transformAsync(
        initialRequest,
        generatedImage -> {
            Content followUpPrompt = new Content.Builder()
                    .addText("But make it old-school line drawing style")
                    .build();
            return chat.sendMessage(followUpPrompt);
        },
        executor);
// Add a final callback to check the reworked image
Futures.addCallback(modelResponseFuture, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        for (Part part : result.getCandidates().get(0).getContent().getParts()) {
            if (part instanceof ImagePart) {
                ImagePart imagePart = (ImagePart) part;
                Bitmap generatedImageAsBitmap = imagePart.getImage();
                break;
            }
        }
    }
    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);
Web
import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";
// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};
// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);
// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });
// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});
// Prepare an image for the model to edit
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}
const fileInputEl = document.querySelector("input[type=file]");
const imagePart = await fileToGenerativePart(fileInputEl.files[0]);
// Provide an initial text prompt instructing the model to edit the image
const prompt = "Edit this image to make it look like a cartoon";
// Initialize the chat
const chat = model.startChat();
// To generate an initial response, send a user message with the image and text prompt
const result = await chat.sendMessage([prompt, imagePart]);
// Request and inspect the generated image
try {
  const inlineDataParts = result.response.inlineDataParts();
  if (inlineDataParts?.[0]) {
    // Inspect the generated image
    const image = inlineDataParts[0].inlineData;
    console.log(image.mimeType, image.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}
// Follow up requests do not need to specify the image again
const followUpResult = await chat.sendMessage("But make it old-school line drawing style");
// Request and inspect the returned image
try {
  const followUpInlineDataParts = followUpResult.response.inlineDataParts();
  if (followUpInlineDataParts?.[0]) {
    // Inspect the generated image
    const followUpImage = followUpInlineDataParts[0].inlineData;
    console.log(followUpImage.mimeType, followUpImage.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}
Dart
import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';
await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [ResponseModalities.text, ResponseModalities.image]),
);
// Prepare an image for the model to edit
final image = await File('scones.jpg').readAsBytes();
final imagePart = InlineDataPart('image/jpeg', image);
// Provide an initial text prompt instructing the model to edit the image
final prompt = TextPart("Edit this image to make it look like a cartoon");
// Initialize the chat
final chat = model.startChat();
// To generate an initial response, send a user message with the image and text prompt
final response = await chat.sendMessage([
  Content.multi([prompt,imagePart])
]);
// Inspect the returned image
if (response.inlineDataParts.isNotEmpty) {
  final imageBytes = response.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}
// Follow up requests do not need to specify the image again
final followUpResponse = await chat.sendMessage([
  Content.text("But make it old-school line drawing style")
]);
// Inspect the returned image
if (followUpResponse.inlineDataParts.isNotEmpty) {
  final followUpImageBytes = response.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}
Unity
using Firebase;
using Firebase.AI;
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new[] { ResponseModality.Text, ResponseModality.Image })
);
// Prepare an image for the model to edit
var imageFile = System.IO.File.ReadAllBytes(System.IO.Path.Combine(
  UnityEngine.Application.streamingAssetsPath, "scones.jpg"));
var image = ModelContent.InlineData("image/jpeg", imageFile);
// Provide an initial text prompt instructing the model to edit the image
var prompt = ModelContent.Text("Edit this image to make it look like a cartoon.");
// Initialize the chat
var chat = model.StartChat();
// To generate an initial response, send a user message with the image and text prompt
var response = await chat.SendMessageAsync(new [] { prompt, image });
// Inspect the returned image
var imageParts = response.Candidates.First().Content.Parts
                         .OfType<ModelContent.InlineDataPart>()
                         .Where(part => part.MimeType == "image/png");
// Load the image into a Unity Texture2D object
UnityEngine.Texture2D texture2D = new(2, 2);
if (texture2D.LoadImage(imageParts.First().Data.ToArray())) {
  // Do something with the image
}
// Follow up requests do not need to specify the image again
var followUpResponse = await chat.SendMessageAsync("But make it old-school line drawing style");
// Inspect the returned image
var followUpImageParts = followUpResponse.Candidates.First().Content.Parts
                         .OfType<ModelContent.InlineDataPart>()
                         .Where(part => part.MimeType == "image/png");
// Load the image into a Unity Texture2D object
UnityEngine.Texture2D followUpTexture2D = new(2, 2);
if (followUpTexture2D.LoadImage(followUpImageParts.First().Data.ToArray())) {
  // Do something with the image
}
支援的功能、限制和最佳做法
支援的模態和功能
以下是 Gemini 模型支援的模式和功能,可輸出圖片。每項功能都會顯示範例提示,上方也有範例程式碼片段。
- 文字 圖片 (文字轉圖片) - 生成以煙火為背景的艾菲爾鐵塔圖片。
 
- 文字 圖片 (圖片中的文字算繪) - 生成大型建築的電影動態效果相片,並在建築物正面投影巨型文字。
 
- 文字 圖片和文字 (交錯) - 生成西班牙海鮮飯的插圖食譜。在生成食譜的同時建立圖片。 
- 以 3D 卡通動畫風格生成狗狗的故事。 為每個場景生成圖片。 
 
- 圖片和文字 圖片和文字 (交錯) - [image of a furnished room] + What other color sofas would work in my space? 可以更新圖片嗎?
 
- 圖像編輯 (文字和圖像生成圖像) - [司康餅圖片] + 將這張圖片編輯成卡通風格 
- [貓咪圖片] + [枕頭圖片] + Create a cross stitch of my cat on this pillow. (在這顆枕頭上製作我貓咪的十字繡。) 
 
- 多輪圖片編輯 (即時通訊) - [藍色汽車圖片] + 將這輛車變成敞篷車。,然後現在將顏色改成黃色。
 
限制與最佳做法
以下是使用 Gemini 模型生成圖片時的限制和最佳做法。
- 圖片生成 Gemini 模型支援下列功能: - 生成 PNG 圖片,最大尺寸為 1024 像素。
- 生成及編輯人物圖像。
- 使用安全篩選器,提供彈性且限制較少的使用者體驗。
 
- 圖片生成Gemini模型「不」支援下列項目: - 包括音訊或視訊輸入內容。
- 僅生成圖片。
 模型一律會同時傳回文字和圖片,且您必須在模型設定中加入responseModalities: ["TEXT", "IMAGE"]
 
- 為獲得最佳效能,請使用下列語言: - en、- es-mx、- ja-jp、- zh-cn、- hi-in。
- 系統不一定會生成圖像。以下是已知問題: - 模型可能只會輸出文字。 
 請明確要求生成圖片 (例如「生成圖片」、「在過程中提供圖片」、「更新圖片」)。
- 模型可能會中途停止生成內容。 
 請再試一次或改用其他提示。
- 模型可能會以圖片形式生成文字。 
 請明確要求輸出文字內容。例如「生成敘事文字和插圖」。
 
- 生成圖片的文字時,建議先生成文字,然後要求生成含有該文字的圖片,這樣 Gemini 的效果最好。