Overview of editing images with Imagen

Note: Only available when using the Vertex AI Gemini API as your API provider.

The Firebase AI Logic SDKs give you access to the Imagen models (via the Imagen API) so that you can edit images using either:

  • Mask-based editing, like inserting and removing objects, expanding image content beyond original borders, and replacing backgrounds.

  • Customization options based on style (like pattern, texture, or artist style), subject (like product, person, or animal), or control (like a hand-drawn sketch).

This page describes each editing option at a high level. Each option has its own page with more details and code samples.

Models that support this capability

Imagen offers image editing through its capability model:

  • imagen-3.0-capability-001

Note that for Imagen models, the global location is not supported.
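
For illustration, here is a minimal Kotlin (Android) sketch of creating a model instance for editing with the Firebase AI Logic SDK. The `Firebase.ai` entry point and `GenerativeBackend.vertexAI()` backend selector are part of the SDK; the Imagen editing surface is a public-preview API, so treat the opt-in annotation and any editing names below as assumptions and confirm them against the current SDK reference.

```kotlin
import com.google.firebase.Firebase
import com.google.firebase.ai.ai
import com.google.firebase.ai.type.GenerativeBackend
import com.google.firebase.ai.type.PublicPreviewAPI

// Imagen editing is only available through the Vertex AI Gemini API backend,
// and Imagen models don't support the global location, so pick a region.
@OptIn(PublicPreviewAPI::class)
val model = Firebase.ai(backend = GenerativeBackend.vertexAI(location = "us-central1"))
    .imagenModel("imagen-3.0-capability-001")
```

The sketches in the sections below assume this `model` instance.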

Mask-based editing

Mask-based editing lets you make localized, precise changes to an image. The model makes changes exclusively within a defined masked area of the image. A mask is a digital overlay defining the specific area you want to edit. The masked area can either be auto-detected and created by the model or be defined in a masked image that you provide. Depending on the use case, the model may require a text prompt to know what changes to make.

Here are the common use cases for mask-based editing:

Insert objects (inpainting)

You can use inpainting to insert objects into an image.

How it works: You provide an original image and a corresponding masked image (either auto-generated by the model or supplied by you) that defines a mask over the area where you want to add new content. You also provide a text prompt describing what you want to add. The model then generates and adds new content within the masked area.

For example, you can mask a table and prompt the model to add a vase of flowers.
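
As a rough sketch (not the definitive API), an insertion edit with a mask you supply could look like the following. It assumes the `model` instance from the earlier sketch; the `editImage` method and the `ImagenRawImage`, `ImagenRawMask`, `ImagenEditingConfig`, and `ImagenEditMode` types are public-preview names that may differ in your SDK version.

```kotlin
import android.graphics.Bitmap
import com.google.firebase.ai.type.ImagenEditMode
import com.google.firebase.ai.type.ImagenEditingConfig
import com.google.firebase.ai.type.ImagenRawImage
import com.google.firebase.ai.type.ImagenRawMask
import com.google.firebase.ai.type.toImagenInlineImage

suspend fun insertVaseOfFlowers(original: Bitmap, tableMask: Bitmap): Bitmap {
    val response = model.editImage(
        referenceImages = listOf(
            ImagenRawImage(original.toImagenInlineImage()),  // the photo to edit
            ImagenRawMask(tableMask.toImagenInlineImage()),  // mask over the table area to fill
        ),
        prompt = "a vase of flowers on the table",
        config = ImagenEditingConfig(editMode = ImagenEditMode.INPAINT_INSERTION),
    )
    return response.images.first().asBitmap()
}
```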

Remove objects (inpainting)

You can use inpainting to remove objects from an image.

How it works: You provide an original image and a corresponding masked image (either auto-generated by the model or supplied by you) that defines a mask over the object or subject that you want to remove. You can optionally provide a text prompt describing what to remove; otherwise, the model intelligently detects which object to remove. The model then removes the object and fills in the area with new, contextually appropriate content.

For example, you can mask a ball and replace it with a blank wall or a grassy field.
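
A removal sketch under the same naming assumptions as above; since a prompt is optional for removal, here it just hints at the desired fill.

```kotlin
import android.graphics.Bitmap
import com.google.firebase.ai.type.ImagenEditMode
import com.google.firebase.ai.type.ImagenEditingConfig
import com.google.firebase.ai.type.ImagenRawImage
import com.google.firebase.ai.type.ImagenRawMask
import com.google.firebase.ai.type.toImagenInlineImage

suspend fun removeBall(original: Bitmap, ballMask: Bitmap): Bitmap {
    val response = model.editImage(
        referenceImages = listOf(
            ImagenRawImage(original.toImagenInlineImage()),
            ImagenRawMask(ballMask.toImagenInlineImage()),  // mask over the ball to remove
        ),
        // A prompt is optional for removal; a short hint can steer the fill.
        prompt = "a grassy field",
        config = ImagenEditingConfig(editMode = ImagenEditMode.INPAINT_REMOVAL),
    )
    return response.images.first().asBitmap()
}
```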

Expand an image beyond its original borders (outpainting)

You can use outpainting to expand an image beyond its original borders.

How it works: You provide an original image and a corresponding masked image (either auto-generated by the model or supplied by you) that defines a mask over the new, expanded area. You can optionally provide a text prompt describing what you want in the expanded area; otherwise, the model intelligently generates content that logically continues the existing scene. The model generates the new content and fills in the masked area.

For example, you can change an image's aspect ratio or add more background context.
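
One way to sketch outpainting is to place the original image on a larger canvas yourself and mask the empty border; the SDK may also offer helpers that generate the padded image and mask for you, so check the reference docs. Same naming assumptions as the earlier sketches.

```kotlin
import android.graphics.Bitmap
import com.google.firebase.ai.type.ImagenEditMode
import com.google.firebase.ai.type.ImagenEditingConfig
import com.google.firebase.ai.type.ImagenRawImage
import com.google.firebase.ai.type.ImagenRawMask
import com.google.firebase.ai.type.toImagenInlineImage

// `padded` is the original photo centered on a larger canvas (for example,
// a wider aspect ratio); `borderMask` marks the empty border region to fill.
suspend fun expandImage(padded: Bitmap, borderMask: Bitmap): Bitmap {
    val response = model.editImage(
        referenceImages = listOf(
            ImagenRawImage(padded.toImagenInlineImage()),
            ImagenRawMask(borderMask.toImagenInlineImage()),
        ),
        prompt = "",  // optional: describe what should appear in the expanded area
        config = ImagenEditingConfig(editMode = ImagenEditMode.OUTPAINT),
    )
    return response.images.first().asBitmap()
}
```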

Replace the background

You can replace the background of an image.

How it works: You provide an original image and a corresponding masked image that defines a mask over the background (either detected automatically by the model or supplied by you). You also provide a text prompt describing the new background that you want. The model then generates and applies a new background.

For example, you can change the setting around a subject or object without affecting the foreground (for example, in a product image).
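
With automatic background detection, the sketch gets simpler because no mask image is needed; `ImagenBackgroundMask` (an assumed public-preview name) asks the model to detect the background itself.

```kotlin
import android.graphics.Bitmap
import com.google.firebase.ai.type.ImagenBackgroundMask
import com.google.firebase.ai.type.ImagenEditMode
import com.google.firebase.ai.type.ImagenEditingConfig
import com.google.firebase.ai.type.ImagenRawImage
import com.google.firebase.ai.type.toImagenInlineImage

suspend fun replaceBackground(product: Bitmap): Bitmap {
    val response = model.editImage(
        referenceImages = listOf(
            ImagenRawImage(product.toImagenInlineImage()),
            ImagenBackgroundMask(),  // the model detects the background automatically
        ),
        prompt = "a marble countertop in a sunlit kitchen",
        config = ImagenEditingConfig(editMode = ImagenEditMode.INPAINT_INSERTION),
    )
    return response.images.first().asBitmap()
}
```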

Customization

Customization lets you edit or generate images using text prompts and reference images. The reference images guide the model to generate a new image based on a specified style, subject (like a product, person, or animal), or control (like a hand-drawn sketch or Canny edge image).

Customize based on a style

You can edit or generate images based on a specified style.

How it works: You provide a text prompt and at least one reference image that shows a specific style (like a pattern, texture, or design style). The model uses these inputs to generate a new image based on the specified style in the reference images.

For example, you can generate a new image of a kitchen based on an image from a popular retail catalog that you provide.
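
A style-customization sketch for that example; `ImagenStyleReference`, the `[1]` reference-ID convention in the prompt, and the `editSteps` tuning knob are assumptions based on the public-preview customization API.

```kotlin
import android.graphics.Bitmap
import com.google.firebase.ai.type.ImagenEditingConfig
import com.google.firebase.ai.type.ImagenStyleReference
import com.google.firebase.ai.type.toImagenInlineImage

suspend fun kitchenInCatalogStyle(catalogPage: Bitmap): Bitmap {
    val response = model.editImage(
        referenceImages = listOf(
            ImagenStyleReference(
                image = catalogPage.toImagenInlineImage(),
                referenceId = 1,  // referenced as [1] in the prompt
                description = "retail catalog style",
            ),
        ),
        prompt = "Generate an image of a kitchen in retail catalog style [1]",
        config = ImagenEditingConfig(editSteps = 50),
    )
    return response.images.first().asBitmap()
}
```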

Customize based on a subject

You can edit or generate images based on a specified subject.

How it works: You provide a text prompt and at least one reference image that shows a specific subject (like a product, person, or animal companion). The model uses these inputs to generate a new image based on the specified subject in the reference images.

For example, you can ask the model to apply a cartoon style to a photo of a child or change the color of a bicycle in a picture.
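
A subject-customization sketch for the bicycle recolor example; `ImagenSubjectReference` and `ImagenSubjectReferenceType` are assumed public-preview names, so confirm them against the SDK reference.

```kotlin
import android.graphics.Bitmap
import com.google.firebase.ai.type.ImagenEditingConfig
import com.google.firebase.ai.type.ImagenSubjectReference
import com.google.firebase.ai.type.ImagenSubjectReferenceType
import com.google.firebase.ai.type.toImagenInlineImage

suspend fun recolorBicycle(bikePhoto: Bitmap): Bitmap {
    val response = model.editImage(
        referenceImages = listOf(
            ImagenSubjectReference(
                image = bikePhoto.toImagenInlineImage(),
                referenceId = 1,
                description = "red bicycle",
                subjectType = ImagenSubjectReferenceType.PRODUCT,  // assumed enum name
            ),
        ),
        prompt = "Generate an image of the bicycle [1], but make it blue",
        config = ImagenEditingConfig(editSteps = 50),
    )
    return response.images.first().asBitmap()
}
```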

Customize based on a control

You can edit or generate images based on a specified control.

How it works: You provide a text prompt and at least one control reference image (like a drawing or a Canny edge image). The model uses these inputs to generate a new image based on the control images.

For example, you can provide the model with a drawing of a rocket ship and the moon along with a text prompt to create a watercolor painting based on the drawing.
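
And a control-customization sketch for that rocket-ship example; `ImagenControlReference` and `ImagenControlType.SCRIBBLE` are assumed public-preview names (a Canny edge image would use a corresponding control type).

```kotlin
import android.graphics.Bitmap
import com.google.firebase.ai.type.ImagenControlReference
import com.google.firebase.ai.type.ImagenControlType
import com.google.firebase.ai.type.ImagenEditingConfig
import com.google.firebase.ai.type.toImagenInlineImage

suspend fun watercolorFromDrawing(drawing: Bitmap): Bitmap {
    val response = model.editImage(
        referenceImages = listOf(
            ImagenControlReference(
                image = drawing.toImagenInlineImage(),
                referenceId = 1,
                type = ImagenControlType.SCRIBBLE,  // a hand-drawn sketch as the control
            ),
        ),
        prompt = "A watercolor painting of a rocket ship flying to the moon, " +
            "following the layout of the drawing [1]",
        config = ImagenEditingConfig(editSteps = 50),
    )
    return response.images.first().asBitmap()
}
```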
