Learn about the Gemini models

The Gemini models are multimodal: they can process information from multiple modalities, including images, video, and text. For example, you can send a Gemini model a photo of a plate of cookies and ask it to give you a recipe for those cookies.

Prompt:

Give me a recipe for these cookies.

[Image: photograph of several chocolate chip cookies]

Response:

**INGREDIENTS**
- 1 c. (2 sticks) unsalted butter, softened
- 3/4 c. granulated sugar
- 3/4 c. brown sugar, packed
- 1 tsp. vanilla extract
- 2 large eggs
- 2 1/4 c. all-purpose flour
- 1 tsp. baking soda
- 1 tsp. salt
...

You can interact with the Gemini family of models using the Gemini API provided by Vertex AI on Google Cloud. For mobile and web apps, you can use the Vertex AI for Firebase SDKs to call the Gemini API and interact with the Gemini models directly from your app.

This page provides the following information about the Gemini models:

  • High-level comparison of the use cases for the various Gemini models, including their supported input types.

  • Comparison of the details for each model, for example maximum input tokens or maximum length of video.

  • Description of how Gemini models are versioned, specifically their stable, auto-updated, and preview versions.

  • Lists of available model names to include in your code during initialization.

  • List of supported languages for the Gemini models.

Available models

You can use any of the following Gemini models with Vertex AI for Firebase:

  • Gemini 1.5 Flash: Multimodal model that supports the same input types, output types, and total token limit as 1.5 Pro, but is designed specifically for high-volume, cost-effective applications.

  • Gemini 1.5 Pro: Multimodal model that supports adding image, audio, video, and PDF files in text or chat prompts for a text or code response. Also, it supports long-context understanding with up to 1 million tokens.

  • Gemini 1.0 Pro Vision: Multimodal model designed to handle text plus images and video for a text or code response. Cannot be used for chat.

  • Gemini 1.0 Pro: Model designed to handle natural language tasks, multiturn chat with text and code, and code generation.


Use cases for each model

[Table comparing Gemini 1.5 Flash / Gemini 1.5 Pro, Gemini 1.0 Pro Vision, and Gemini 1.0 Pro across the following rows]

  • Input types: Text, Code, Image, PDF, Video (frames only), Video (frames and audio), Audio

  • Output types: Text, Code

  • General use cases: Multimodal requests, Multi-turn chat

Learn more about the use cases for the Gemini models in the Google Cloud documentation.

Detailed information about each model

For all Gemini models, a token is equivalent to about 4 characters. 100 tokens are about 60-80 English words. You can determine the total count of tokens in your requests using countTokens.
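The rule of thumb above can be turned into a quick local estimate. The following is a minimal sketch, not a call to the Gemini API: `estimate_tokens` and `CHARS_PER_TOKEN` are hypothetical names introduced here, and the result is only an approximation. For an exact count, use countTokens.

```python
import math

# Rule of thumb from the text above: one token is about 4 characters.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate for `text` (hypothetical helper)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


prompt = "Give me a recipe for these cookies."
print(estimate_tokens(prompt))  # 35 characters -> 9
```

An estimate like this is useful for a cheap pre-check before sending a request; it does not account for images, audio, or video, which are tokenized differently.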

| Property | Gemini 1.5 Flash / Gemini 1.5 Pro | Gemini 1.0 Pro Vision | Gemini 1.0 Pro |
| --- | --- | --- | --- |
| Total token limit (combined input and output) | 1 million tokens | 16,384 tokens | 32,760 tokens |
| Output token limit | 8,192 tokens | 2,048 tokens | 8,192 tokens |
| Maximum number of images per request | 3,000 images | 16 images | N/A |
| Maximum base64-encoded image size | 7 MB | 7 MB | N/A |
| Maximum PDF size | 30 MB | 30 MB | N/A |
| Maximum number of video files per request | 10 video files | 1 video file | N/A |
| Maximum video length (frames only) | 60 minutes of video | 2 minutes | N/A |
| Maximum video length (frames and audio) | ~45 minutes of video | N/A | N/A |
| Maximum number of audio files per request | 1 audio file | N/A | N/A |
| Maximum audio length | ~8.4 hours of audio | N/A | N/A |
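As an illustration of how the limits above might be applied, here is a minimal sketch of a client-side pre-check. The names `ModelLimits`, `LIMITS`, and `check_request` are hypothetical, only three of the properties are modeled, and "1 million tokens" is taken literally as 1,000,000. The service enforces the real limits; this only catches obvious problems early.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ModelLimits:
    total_tokens: int           # combined input and output
    output_tokens: int
    max_images: Optional[int]   # None means N/A (images not accepted)


# Values copied from the table above; the keys are the table's column labels.
LIMITS = {
    "Gemini 1.5 Flash / Gemini 1.5 Pro": ModelLimits(1_000_000, 8_192, 3_000),
    "Gemini 1.0 Pro Vision": ModelLimits(16_384, 2_048, 16),
    "Gemini 1.0 Pro": ModelLimits(32_760, 8_192, None),
}


def check_request(model: str, input_tokens: int, max_output_tokens: int,
                  image_count: int = 0) -> List[str]:
    """Return a list of limit violations; an empty list means none were found."""
    limits = LIMITS[model]
    problems = []
    if max_output_tokens > limits.output_tokens:
        problems.append("requested output exceeds the output token limit")
    if input_tokens + max_output_tokens > limits.total_tokens:
        problems.append("input plus output exceeds the total token limit")
    if image_count > 0:
        if limits.max_images is None:
            problems.append("images are not supported by this model")
        elif image_count > limits.max_images:
            problems.append("too many images in one request")
    return problems


print(check_request("Gemini 1.0 Pro Vision", 15_000, 2_048, image_count=4))
```

Returning a list of problems rather than a single boolean makes it easier to show the caller every limit that a request exceeds at once.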

Here's where you can find even more detailed information about the models and input files:

Versioning of the models

The Gemini models are offered in stable, auto-updated, and preview versions.

  • Stable versions are considered Generally Available.

    • Stable versions have model names appended with a specific three-digit version number, for example gemini-1.0-pro-001.
  • Auto-updated versions always point to the latest stable version of that model; if a new stable version is released, the auto-updated version automatically starts pointing to that new stable version.

    • Auto-updated versions have model names with no appendage, for example gemini-1.0-pro.
  • Preview versions have new capabilities and are not considered stable. Preview versions always point to the latest preview version of that model; if a new preview version is released, any existing preview version automatically starts pointing to that new preview version.

    • Preview versions have model names appended with -preview along with the model's initial release date (-MMDD), for example gemini-1.5-pro-preview-0409 (released on April 9, 2024).
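The naming conventions above can be checked mechanically. The sketch below is an illustration only: `classify_model_name` and the two regular expressions are hypothetical and are derived solely from the three naming patterns described in this section, so names that follow some other convention would be misclassified.

```python
import re

# Patterns inferred from the naming rules above (assumptions, not an official spec).
PREVIEW_SUFFIX = re.compile(r"-preview-(\d{2})(\d{2})$")  # -preview-MMDD
STABLE_SUFFIX = re.compile(r"-(\d{3})$")                  # three-digit version


def classify_model_name(name: str) -> str:
    """Return 'preview', 'stable', or 'auto-updated' for a Gemini model name."""
    if PREVIEW_SUFFIX.search(name):
        return "preview"
    if STABLE_SUFFIX.search(name):
        return "stable"
    # No suffix: the name points to the latest stable version.
    return "auto-updated"


for model_name in ("gemini-1.0-pro-001", "gemini-1.0-pro", "gemini-1.5-pro-preview-0409"):
    print(model_name, "->", classify_model_name(model_name))
```

The preview pattern is tested first so that the four-digit date suffix is never confused with a three-digit stable version number.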

Learn more about the available Gemini model versions and their lifecycle in the Google Cloud documentation.

Available model names

Model names are the explicit values that you include in your code during initialization of the generative model (which is a required step to call the Gemini API). For initialization examples for your language, see the getting started guide.

Gemini 1.5 Flash model names

| Model name | Description | Release stage | Initial release date | Discontinuation date |
| --- | --- | --- | --- | --- |
| gemini-1.5-flash-preview-0514 | Latest preview version of Gemini 1.5 Flash | Public Preview | 2024-05-14 | --- |

Gemini 1.5 Pro model names

| Model name | Description | Release stage | Initial release date | Discontinuation date |
| --- | --- | --- | --- | --- |
| gemini-1.5-pro-preview-0514 | Latest preview version of Gemini 1.5 Pro | Public Preview | 2024-05-14 | --- |
| gemini-1.5-pro-preview-0409 | Points to gemini-1.5-pro-preview-0514 (which is the latest preview version) | Public Preview | 2024-04-09 | 2024-06-14 |

Gemini 1.0 Pro Vision model names

| Model name | Description | Release stage | Initial release date | Discontinuation date |
| --- | --- | --- | --- | --- |
| gemini-1.0-pro-vision-001 | Latest stable version of Gemini 1.0 Pro Vision | General Availability | 2024-02-15 | 2025-02-15 |
| gemini-1.0-pro-vision | Points to gemini-1.0-pro-vision-001 (which is the latest stable version) | General Availability | 2024-01-04 | --- |

Gemini 1.0 Pro model names

| Model name | Description | Release stage | Initial release date | Discontinuation date |
| --- | --- | --- | --- | --- |
| gemini-1.0-pro-002 | Latest stable version of Gemini 1.0 Pro | General Availability | 2024-04-09 | No earlier than 2025-04-09 |
| gemini-1.0-pro-001 | Stable version of Gemini 1.0 Pro | General Availability | 2024-02-15 | 2025-02-15 |
| gemini-1.0-pro | Points to gemini-1.0-pro-002 (which is the latest stable version) | General Availability | 2024-02-15 | --- |
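Because some model names carry a discontinuation date, an app may want to flag a pinned name before it stops working. The following is a minimal sketch using only the fixed dates listed in the tables above. `DISCONTINUATION` and `is_discontinued` are hypothetical names, and treating the listed date itself as already discontinued is an assumption; gemini-1.0-pro-002 is left out because its entry ("No earlier than 2025-04-09") is not a fixed date.

```python
from datetime import date

# Fixed discontinuation dates copied from the tables above.
DISCONTINUATION = {
    "gemini-1.5-pro-preview-0409": date(2024, 6, 14),
    "gemini-1.0-pro-vision-001": date(2025, 2, 15),
    "gemini-1.0-pro-001": date(2025, 2, 15),
}


def is_discontinued(model_name: str, on: date) -> bool:
    """True if `model_name` has a listed discontinuation date on or before `on`."""
    cutoff = DISCONTINUATION.get(model_name)
    # Names without a listed date (shown as --- in the tables) return False.
    return cutoff is not None and on >= cutoff


print(is_discontinued("gemini-1.0-pro-001", date(2025, 3, 1)))  # True
print(is_discontinued("gemini-1.0-pro", date(2025, 3, 1)))      # False
```

A check like this only reflects the dates hard-coded into it, so the table would need to be refreshed whenever the published lifecycle changes.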

Supported languages

Gemini models support the following languages:

Arabic (ar), Bengali (bn), Bulgarian (bg), Chinese simplified and traditional (zh), Croatian (hr), Czech (cs), Danish (da), Dutch (nl), English (en), Estonian (et), Finnish (fi), French (fr), German (de), Greek (el), Hebrew (iw), Hindi (hi), Hungarian (hu), Indonesian (id), Italian (it), Japanese (ja), Korean (ko), Latvian (lv), Lithuanian (lt), Norwegian (no), Polish (pl), Portuguese (pt), Romanian (ro), Russian (ru), Serbian (sr), Slovak (sk), Slovenian (sl), Spanish (es), Swahili (sw), Swedish (sv), Thai (th), Turkish (tr), Ukrainian (uk), Vietnamese (vi).
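If an app accepts user-selected languages, the list above can be checked locally. This is a small sketch under stated assumptions: `SUPPORTED_LANGUAGES`, `ALIASES`, and `is_supported_language` are hypothetical names, only the primary subtag of a tag such as `pt-BR` is compared, and the alias maps the current ISO 639-1 code for Hebrew (`he`) to the legacy code `iw` used in the list.

```python
# Language codes copied from the list above.
SUPPORTED_LANGUAGES = frozenset(
    "ar bn bg zh hr cs da nl en et fi fr de el iw hi hu id it ja ko lv lt "
    "no pl pt ro ru sr sk sl es sw sv th tr uk vi".split()
)

# The list uses the legacy code "iw" for Hebrew; "he" is the current ISO 639-1 code.
ALIASES = {"he": "iw"}


def is_supported_language(tag: str) -> bool:
    """Check a language tag such as 'en', 'pt-BR', or 'zh_TW' against the list."""
    primary = tag.replace("_", "-").split("-")[0].lower()
    primary = ALIASES.get(primary, primary)
    return primary in SUPPORTED_LANGUAGES


print(is_supported_language("pt-BR"))  # True
print(is_supported_language("he"))     # True
print(is_supported_language("ga"))     # False
```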

Next steps

Try out the capabilities of the Gemini API