Gemini 2.0 Flash and Flash-Lite models will be retired on June 1, 2026. To avoid service disruption, update to a newer model like gemini-2.5-flash-lite. Also, Gemini 3 Pro Preview (gemini-3-pro-preview) will be retired on March 9, 2026 (update to Gemini 3.1 Pro Preview: gemini-3.1-pro-preview).

Thinking

Gemini 2.5 and later models can use an internal "thinking process" that significantly improves their reasoning and multi-step planning abilities, making them highly effective for complex tasks such as coding, advanced mathematics, and data analysis.

Thinking models offer the following configurations and options:

Control the amount of thinking
You can configure how much "thinking" a model can do. This configuration is particularly important if reducing latency or cost is a priority. Also, review the comparison of task difficulties to decide how much a model might need its thinking capability.

Control this configuration with thinking levels (Gemini 3 and later models) or with thinking budgets (Gemini 2.5 models).
Get thought summaries
You can enable thought summaries to include with the generated response. These summaries are synthesized versions of the model's raw thoughts and offer insights into the model's internal reasoning process.
Handle thought signatures
The Firebase AI Logic SDKs automatically handle thought signatures for you, which ensures that the model has access to thought context from previous turns, particularly when using function calling.

Make sure to review the best practices and prompting guidance for using thinking models.

Use a thinking model

Use a thinking model just like you would use any other Gemini model.

To get the most out of thinking models, review the Best practices and prompting guidance for using thinking models later on this page.

Models that support this capability

Only Gemini 3 and Gemini 2.5 models support this capability.

gemini-3.1-pro-preview
gemini-3-pro-image-preview (also known as "Nano Banana Pro")
gemini-3.1-flash-image-preview (also known as "Nano Banana 2")
gemini-3-flash-preview
gemini-3.1-flash-lite-preview
gemini-2.5-pro
gemini-2.5-flash
gemini-2.5-flash-lite

Best practices and prompting guidance for using thinking models

We recommend testing your prompt in Google AI Studio or Vertex AI Studio, where you can view the full thinking process. You can identify areas where the model may have gone astray, then refine your prompts to get more consistent and accurate responses.

Begin with a general prompt that describes the desired outcome, and observe the model's initial thoughts about how it determines its response. If the response isn't what you expected, help the model generate a better response by using any of the following prompting techniques:

Provide detailed instructions
Provide several examples of input-output pairs
Provide guidance on how the output and responses should be phrased and formatted
Provide specific verification steps

In addition to prompting, consider using these recommendations:

Set system instructions, which are like a "preamble" that you add before the model is exposed to any further instructions from the prompt or end user. They let you steer the behavior of the model based on your specific needs and use cases.
Set a thinking level (or a thinking budget for Gemini 2.5 models) to control how much thinking the model can do. If you set a high level, the model can think more if needed. If you set a lower value, the model won't "overthink" its response, and it also reserves more of the total output token limit for the actual response, which can help reduce latency and cost.
Enable AI monitoring in the Firebase console to monitor the count of thinking tokens and the latency of your requests that have thinking enabled. If you have thought summaries enabled, they appear in the console, where you can inspect the model's detailed reasoning to help you debug and refine your prompts.

Control the amount of thinking

You can configure how much "thinking" and reasoning a model can do before it returns a response. This configuration is particularly important if reducing latency or cost is a priority.

Review the comparison of task difficulties to decide how much a model might need its thinking capability. Here's some high-level guidance:

Set a lower thinking value for less complex tasks, or if reducing latency or cost is a priority for you.
Set a higher thinking value for more complex tasks.

Control this configuration with thinking levels (Gemini 3 and later models) or with thinking budgets (Gemini 2.5 models).

Thinking levels (Gemini 3 and later models)

To control how much thinking a Gemini 3 or later model can do to generate its response, specify a thinking level, which determines the amount of thinking tokens that the model is allowed to use.

Set the thinking level

Set the thinking level in a GenerationConfig when you create the GenerativeModel instance. The configuration is maintained for the lifetime of the instance. If you want to use different thinking levels for different requests, create separate GenerativeModel instances, each configured with its own level.

Learn about the supported thinking level values later in this section.
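
Because the configuration is fixed for the lifetime of an instance, one way to work with several thinking levels is to keep one instance per level and reuse it. The following JavaScript sketch illustrates only that idea. `createModelCache`, `createModel`, and the plain-string level names are hypothetical stand-ins and not part of the Firebase AI Logic SDK; in an app, the factory would presumably wrap the `getGenerativeModel` call shown in the Web sample.

```javascript
// Sketch: lazily create and reuse one model instance per thinking level.
// `createModel` is a hypothetical factory supplied by the caller.
function createModelCache(createModel) {
  const cache = new Map();
  return function modelFor(thinkingLevel) {
    if (!cache.has(thinkingLevel)) {
      // Same config shape as the Web sample on this page.
      cache.set(thinkingLevel, createModel({ thinkingConfig: { thinkingLevel } }));
    }
    return cache.get(thinkingLevel);
  };
}

// Stand-in factory so the sketch runs without the Firebase SDK.
let created = 0;
const modelFor = createModelCache((generationConfig) => {
  created += 1;
  return { id: created, generationConfig };
});

const quick = modelFor("LOW");
const deep = modelFor("HIGH");
console.log(modelFor("LOW") === quick); // true: the LOW instance is reused
console.log(created); // 2
```

Keeping the cache keyed by level means each request only has to pick a level; it never rebuilds a model.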

Swift

Set the thinking level in a GenerationConfig as part of creating a GenerativeModel instance.


// ...

// Set the thinking configuration
// Use a thinking level value appropriate for your model (example value shown here)
let generationConfig = GenerationConfig(
  thinkingConfig: ThinkingConfig(thinkingLevel: .low)
)

// Specify the config as part of creating the `GenerativeModel` instance
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "GEMINI_3_MODEL_NAME",
  generationConfig: generationConfig
)

// ...

Kotlin

Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.


// ...

// Set the thinking configuration
// Use a thinking level value appropriate for your model (example value shown here)
val generationConfig = generationConfig {
  thinkingConfig = thinkingConfig {
      thinkingLevel = ThinkingLevel.LOW
  }
}

// Specify the config as part of creating the `GenerativeModel` instance
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
  modelName = "GEMINI_3_MODEL_NAME",
  generationConfig,
)

// ...

Java

Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.


// ...

// Set the thinking configuration
// Use a thinking level value appropriate for your model (example value shown here)
ThinkingConfig thinkingConfig = new ThinkingConfig.Builder()
    .setThinkingLevel(ThinkingLevel.LOW)
    .build();

GenerationConfig generationConfig = GenerationConfig.builder()
    .setThinkingConfig(thinkingConfig)
    .build();

// Specify the config as part of creating the `GenerativeModel` instance
GenerativeModelFutures model = GenerativeModelFutures.from(
        FirebaseAI.getInstance(GenerativeBackend.googleAI())
                .generativeModel(
                  /* modelName */ "GEMINI_3_MODEL_NAME",
                  /* generationConfig */ generationConfig
                )
);

// ...

Web

Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.


// ...

const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Set the thinking configuration
// Use a thinking level value appropriate for your model (example value shown here)
const generationConfig = {
  thinkingConfig: {
    thinkingLevel: ThinkingLevel.LOW
  }
};

// Specify the config as part of creating the `GenerativeModel` instance
const model = getGenerativeModel(ai, { model: "GEMINI_3_MODEL_NAME", generationConfig });

// ...

Dart

Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.


// ...

// Set the thinking configuration
// Use a thinking level value appropriate for your model (example value shown here)
final thinkingConfig = ThinkingConfig.withThinkingLevel(ThinkingLevel.low);

final generationConfig = GenerationConfig(
  thinkingConfig: thinkingConfig
);

// Specify the config as part of creating the `GenerativeModel` instance
final model = FirebaseAI.googleAI().generativeModel(
  model: 'GEMINI_3_MODEL_NAME',
  generationConfig: generationConfig,
);

// ...

Unity

Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.


// ...

// Set the thinking configuration
// Use a thinking level value appropriate for your model (example value shown here)
var thinkingConfig = new ThinkingConfig(thinkingLevel: ThinkingLevel.Low);

var generationConfig = new GenerationConfig(
  thinkingConfig: thinkingConfig
);

// Specify the config as part of creating the `GenerativeModel` instance
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "GEMINI_3_MODEL_NAME",
  generationConfig: generationConfig
);

// ...

Supported thinking level values

The following table lists the thinking level values that you can set for each model by configuring the model's thinkingLevel.

Model	`MINIMAL`	`LOW`	`MEDIUM`	`HIGH`
Description	The model uses as few tokens as possible, with almost no thinking	The model uses fewer tokens, which minimizes latency and cost	The model uses a balanced approach	The model uses tokens up to its maximum level
Best for	Low-complexity tasks	Simple, high-throughput tasks	Moderate-complexity tasks	Complex prompts that require deep reasoning
`gemini-3.1-pro-preview`				(default)
`gemini-3-flash-preview`				(default)
`gemini-3.1-flash-lite-preview`	(default)
`gemini-3-pro-image-preview` ("Nano Banana Pro")				(default)
`gemini-3.1-flash-image-preview` ("Nano Banana 2")				(default)

Thinking budgets (Gemini 2.5 models)

To control how much thinking a Gemini 2.5 model can do to generate its response, specify a thinking budget, which sets the number of thinking tokens that the model is allowed to use.

Set the thinking budget

Set the thinking budget in a GenerationConfig when you create the GenerativeModel instance for a Gemini 2.5 model. The configuration is maintained for the lifetime of the instance. If you want to use different thinking budgets for different requests, create separate GenerativeModel instances, each configured with its own budget.

Learn about the supported thinking budget values later in this section.

Swift

Set the thinking budget in a GenerationConfig as part of creating a GenerativeModel instance.


// ...

// Set the thinking configuration
// Use a thinking budget value appropriate for your model (example value shown here)
let generationConfig = GenerationConfig(
  thinkingConfig: ThinkingConfig(thinkingBudget: 1024)
)

// Specify the config as part of creating the `GenerativeModel` instance
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "GEMINI_2.5_MODEL_NAME",
  generationConfig: generationConfig
)

// ...

Kotlin

Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.


// ...

// Set the thinking configuration
// Use a thinking budget value appropriate for your model (example value shown here)
val generationConfig = generationConfig {
  thinkingConfig = thinkingConfig {
      thinkingBudget = 1024
  }
}

// Specify the config as part of creating the `GenerativeModel` instance
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
  modelName = "GEMINI_2.5_MODEL_NAME",
  generationConfig,
)

// ...

Java

Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.


// ...

// Set the thinking configuration
// Use a thinking budget value appropriate for your model (example value shown here)
ThinkingConfig thinkingConfig = new ThinkingConfig.Builder()
    .setThinkingBudget(1024)
    .build();

GenerationConfig generationConfig = GenerationConfig.builder()
    .setThinkingConfig(thinkingConfig)
    .build();

// Specify the config as part of creating the `GenerativeModel` instance
GenerativeModelFutures model = GenerativeModelFutures.from(
        FirebaseAI.getInstance(GenerativeBackend.googleAI())
                .generativeModel(
                  /* modelName */ "GEMINI_2.5_MODEL_NAME",
                  /* generationConfig */ generationConfig
                )
);

// ...

Web

Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.


// ...

const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Set the thinking configuration
// Use a thinking budget value appropriate for your model (example value shown here)
const generationConfig = {
  thinkingConfig: {
    thinkingBudget: 1024
  }
};

// Specify the config as part of creating the `GenerativeModel` instance
const model = getGenerativeModel(ai, { model: "GEMINI_2.5_MODEL_NAME", generationConfig });

// ...

Dart

Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.


// ...

// Set the thinking configuration
// Use a thinking budget value appropriate for your model (example value shown here)
final thinkingConfig = ThinkingConfig.withThinkingBudget(1024);

final generationConfig = GenerationConfig(
  thinkingConfig: thinkingConfig
);

// Specify the config as part of creating the `GenerativeModel` instance
final model = FirebaseAI.googleAI().generativeModel(
  model: 'GEMINI_2.5_MODEL_NAME',
  generationConfig: generationConfig,
);

// ...

Unity

Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.


// ...

// Set the thinking configuration
// Use a thinking budget value appropriate for your model (example value shown here)
var thinkingConfig = new ThinkingConfig(thinkingBudget: 1024);

var generationConfig = new GenerationConfig(
  thinkingConfig: thinkingConfig
);

// Specify the config as part of creating the `GenerativeModel` instance
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "GEMINI_2.5_MODEL_NAME",
  generationConfig: generationConfig
);

// ...

Supported thinking budget values

The following table lists the thinking budget values that you can set for each model by configuring the model's thinkingBudget.

Model	Default value	Minimum value	Maximum value	Value to disable thinking	Value to enable dynamic thinking
Gemini 2.5 Pro	`8,192`	`128`	`32,768`	cannot be disabled	`-1`
Gemini 2.5 Flash	`8,192`	`1`	`24,576`	`0`	`-1`
Gemini 2.5 Flash‑Lite	`0` (thinking is disabled by default)	`512`	`24,576`	`0` (or don't configure the thinking budget at all)	`-1`
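
The ranges in the table can be checked on the client before a request is sent. The sketch below is one possible check, assuming the limits listed above. `THINKING_BUDGETS` and `checkThinkingBudget` are hypothetical names and not SDK APIs, and the backend remains the source of truth for what it accepts.

```javascript
// Limits copied from the table above (Gemini 2.5 models only).
const THINKING_BUDGETS = new Map([
  ["gemini-2.5-pro", { min: 128, max: 32768, canDisable: false }],
  ["gemini-2.5-flash", { min: 1, max: 24576, canDisable: true }],
  ["gemini-2.5-flash-lite", { min: 512, max: 24576, canDisable: true }],
]);

// Hypothetical helper: returns null if the budget looks valid, else a message.
function checkThinkingBudget(modelName, budget) {
  const limits = THINKING_BUDGETS.get(modelName);
  if (!limits) return `unknown model: ${modelName}`;
  if (!Number.isInteger(budget)) return "budget must be an integer";
  if (budget === -1) return null; // -1 enables dynamic thinking
  if (budget === 0) {
    // 0 disables thinking, which some models don't allow.
    return limits.canDisable ? null : `${modelName} cannot disable thinking`;
  }
  if (budget < limits.min || budget > limits.max) {
    return `budget must be between ${limits.min} and ${limits.max}`;
  }
  return null;
}

console.log(checkThinkingBudget("gemini-2.5-flash", 1024)); // null
console.log(checkThinkingBudget("gemini-2.5-pro", 0)); // gemini-2.5-pro cannot disable thinking
```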

Disable thinking for Gemini 2.5 models

For some easier tasks, thinking isn't as necessary, and traditional inference is sufficient. Also, if reducing latency or cost is a priority, you might not want the model to take more time or cost more than necessary to generate a response.

In these situations, you can disable (or turn off) thinking for some models:

Gemini 2.5 Pro: thinking cannot be disabled.
Gemini 2.5 Flash: disable thinking by setting thinkingBudget to 0 tokens.
Gemini 2.5 Flash‑Lite: thinking is disabled by default, so either leave thinkingBudget unset or set it to 0.

For all Gemini 3 models, thinking cannot be disabled.

Enable dynamic thinking for Gemini 2.5 models

With dynamic thinking, the model decides when and how much it thinks (up to a maximum thinking budget, as described below).

Enable dynamic thinking by setting thinkingBudget to -1.
When dynamic thinking is enabled, the maximum number of thinking tokens is always 8,192.

All Gemini 3 models use dynamic thinking.
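
As a rough illustration of the two special values described above (0 to disable thinking, -1 for dynamic thinking), the following sketch builds the plain configuration object used in the Web sample. `buildThinkingConfig` and its `mode` argument are hypothetical. Only the `thinkingConfig` and `thinkingBudget` field names come from this page.

```javascript
// Special thinkingBudget values described on this page.
const DISABLE_THINKING = 0;
const DYNAMIC_THINKING = -1;

// Hypothetical helper for Gemini 2.5 models.
// `mode` is "off", "dynamic", or an explicit token budget (a number).
function buildThinkingConfig(modelName, mode) {
  if (mode === "off") {
    if (modelName === "gemini-2.5-pro") {
      // Per this page, Gemini 2.5 Pro cannot disable thinking.
      throw new Error("Thinking cannot be disabled for gemini-2.5-pro");
    }
    return { thinkingConfig: { thinkingBudget: DISABLE_THINKING } };
  }
  if (mode === "dynamic") {
    return { thinkingConfig: { thinkingBudget: DYNAMIC_THINKING } };
  }
  return { thinkingConfig: { thinkingBudget: mode } };
}

console.log(buildThinkingConfig("gemini-2.5-flash", "off").thinkingConfig.thinkingBudget); // 0
console.log(buildThinkingConfig("gemini-2.5-pro", "dynamic").thinkingConfig.thinkingBudget); // -1
```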

Task complexity for all thinking models

Easy tasks: thinking isn't as necessary
Straightforward requests where complex reasoning isn't required, such as fact retrieval or classification. Examples:
- "Where was DeepMind founded?"
- "Is this email asking for a meeting or just providing information?"
Medium tasks: some thinking is likely necessary
Common requests that benefit from a degree of step-by-step processing or deeper understanding. Examples:
- "Create an analogy between photosynthesis and growing up."
- "Compare and contrast electric cars and hybrid cars."
Hard tasks: maximum thinking might be necessary
Truly complex challenges, such as solving complex math problems or coding tasks. These types of tasks require the model to engage its full reasoning and planning capabilities, often involving many internal steps before providing an answer. Examples:
- "Solve problem 1 in AIME 2025: Find the sum of all integer bases b > 9 for which 17_b is a divisor of 97_b."
- "Write Python code for a web application that visualizes real-time stock market data, including user authentication. Make it as efficient as possible."
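
One way to apply these categories in code is a small lookup from task difficulty to a thinking level. The mapping below is an assumption made for illustration: the difficulty labels, `LEVEL_BY_DIFFICULTY`, and `suggestThinkingLevel` are all hypothetical, and the chosen level still needs to be one that your model supports.

```javascript
// Assumed mapping from the task categories above to thinking level names.
const LEVEL_BY_DIFFICULTY = new Map([
  ["easy", "LOW"],
  ["medium", "MEDIUM"],
  ["hard", "HIGH"],
]);

// Hypothetical helper: returns a level name, or throws for unknown labels.
function suggestThinkingLevel(difficulty) {
  const level = LEVEL_BY_DIFFICULTY.get(difficulty);
  if (level === undefined) {
    throw new Error(`Unknown difficulty: ${difficulty}`);
  }
  return level;
}

console.log(suggestThinkingLevel("easy")); // LOW
console.log(suggestThinkingLevel("hard")); // HIGH
```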

Thought summaries

Thought summaries are synthesized versions of the model's raw thoughts and offer insights into the model's internal reasoning process.

Here are some reasons to include thought summaries in responses:

You can display the thought summary in your app's UI or make it accessible to your users. The thought summary is returned as a separate part in the response so that you have more control over how it's used in your app.
If you also enable AI monitoring in the Firebase console, thought summaries appear in the console, where you can inspect the model's detailed reasoning to help you debug and refine your prompts.

Here are some key notes about thought summaries:

Thought summaries aren't controlled by thinking budgets (budgets apply only to the model's raw thoughts). However, if thinking is disabled, the model won't return a thought summary.
Thought summaries are considered part of the model's regular generated-text response and count as output tokens.

Enable thought summaries

To enable thought summaries, set includeThoughts to true in your model configuration. You can then access the summary by checking the thoughtSummary field of the response.

Here's an example that shows how to enable and retrieve thought summaries with the response:

Swift

Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.


// ...

// Set the thinking configuration
// Optionally enable thought summaries in the generated response (default is false)
let generationConfig = GenerationConfig(
  thinkingConfig: ThinkingConfig(includeThoughts: true)
)

// Specify the config as part of creating the `GenerativeModel` instance
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "GEMINI_MODEL_NAME",
  generationConfig: generationConfig
)

let response = try await model.generateContent("solve x^2 + 4x + 4 = 0")

// Handle the response that includes thought summaries
if let thoughtSummary = response.thoughtSummary {
  print("Thought Summary: \(thoughtSummary)")
}
guard let text = response.text else {
  fatalError("No text in response.")
}
print("Answer: \(text)")

Kotlin

Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.


// ...

// Set the thinking configuration
// Optionally enable thought summaries in the generated response (default is false)
val generationConfig = generationConfig {
  thinkingConfig = thinkingConfig {
      includeThoughts = true
  }
}

// Specify the config as part of creating the `GenerativeModel` instance
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
  modelName = "GEMINI_MODEL_NAME",
  generationConfig,
)

val response = model.generateContent("solve x^2 + 4x + 4 = 0")

// Handle the response that includes thought summaries
response.thoughtSummary?.let {
    println("Thought Summary: $it")
}
response.text?.let {
    println("Answer: $it")
}

Java

Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.


// ...

// Set the thinking configuration
// Optionally enable thought summaries in the generated response (default is false)
ThinkingConfig thinkingConfig = new ThinkingConfig.Builder()
    .setIncludeThoughts(true)
    .build();

GenerationConfig generationConfig = GenerationConfig.builder()
    .setThinkingConfig(thinkingConfig)
    .build();

// Specify the config as part of creating the `GenerativeModel` instance
GenerativeModelFutures model = GenerativeModelFutures.from(
        FirebaseAI.getInstance(GenerativeBackend.googleAI())
                .generativeModel(
                  /* modelName */ "GEMINI_MODEL_NAME",
                  /* generationConfig */ generationConfig
                )
);

// Handle the response that includes thought summaries
ListenableFuture<GenerateContentResponse> responseFuture = model.generateContent("solve x^2 + 4x + 4 = 0");
Futures.addCallback(responseFuture, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse response) {
        if (response.getThoughtSummary() != null) {
            System.out.println("Thought Summary: " + response.getThoughtSummary());
        }
        if (response.getText() != null) {
            System.out.println("Answer: " + response.getText());
        }
    }

    @Override
    public void onFailure(Throwable t) {
        // Handle error
    }
}, MoreExecutors.directExecutor());

Web

Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.


// ...

const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Set the thinking configuration
// Optionally enable thought summaries in the generated response (default is false)
const generationConfig = {
  thinkingConfig: {
    includeThoughts: true
  }
};

// Specify the config as part of creating the `GenerativeModel` instance
const model = getGenerativeModel(ai, { model: "GEMINI_MODEL_NAME", generationConfig });

const result = await model.generateContent("solve x^2 + 4x + 4 = 0");
const response = result.response;

// Handle the response that includes thought summaries
if (response.thoughtSummary()) {
    console.log(`Thought Summary: ${response.thoughtSummary()}`);
}
const text = response.text();
console.log(`Answer: ${text}`);

Dart

Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.


// ...

// Set the thinking configuration
// Optionally enable thought summaries in the generated response (default is false)
final thinkingConfig = ThinkingConfig(includeThoughts: true);

final generationConfig = GenerationConfig(
  thinkingConfig: thinkingConfig
);

// Specify the config as part of creating the `GenerativeModel` instance
final model = FirebaseAI.googleAI().generativeModel(
  model: 'GEMINI_MODEL_NAME',
  generationConfig: generationConfig,
);

final response = await model.generateContent('solve x^2 + 4x + 4 = 0');

// Handle the response that includes thought summaries
if (response.thoughtSummary != null) {
  print('Thought Summary: ${response.thoughtSummary}');
}
if (response.text != null) {
  print('Answer: ${response.text}');
}

Unity

Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.


// ...

// Set the thinking configuration
// Optionally enable thought summaries in the generated response (default is false)
var thinkingConfig = new ThinkingConfig(includeThoughts: true);

var generationConfig = new GenerationConfig(
  thinkingConfig: thinkingConfig
);

// Specify the config as part of creating the `GenerativeModel` instance
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "GEMINI_MODEL_NAME",
  generationConfig: generationConfig
);

var response = await model.GenerateContentAsync("solve x^2 + 4x + 4 = 0");

// Handle the response that includes thought summaries
if (response.ThoughtSummary != null) {
    Debug.Log($"Thought Summary: {response.ThoughtSummary}");
}
if (response.Text != null) {
    Debug.Log($"Answer: {response.Text}");
}

View the response and thought summary

# Example Response:
#     Okay, let's solve the quadratic equation x² + 4x + 4 = 0.
#     ...
#     **Answer:**
#     The solution to the equation x² + 4x + 4 = 0 is x = -2. This is a repeated root (or a root with multiplicity 2).

# Example Thought Summary:
#     **My Thought Process for Solving the Quadratic Equation**
#
#     Alright, let's break down this quadratic, x² + 4x + 4 = 0. First things first:
#     it's a quadratic; the x² term gives it away, and we know the general form is
#     ax² + bx + c = 0.
#
#     So, let's identify the coefficients: a = 1, b = 4, and c = 4. Now, what's the
#     most efficient path to the solution? My gut tells me to try factoring; it's
#     often the fastest route if it works. If that fails, I'll default to the quadratic
#     formula, which is foolproof. Completing the square? It's good for deriving the
#     formula or when factoring is difficult, but not usually my first choice for
#     direct solving, but it can't hurt to keep it as an option.
#
#     Factoring, then. I need to find two numbers that multiply to 'c' (4) and add
#     up to 'b' (4). Let's see... 1 and 4 don't work (add up to 5). 2 and 2? Bingo!
#     They multiply to 4 and add up to 4. This means I can rewrite the equation as
#     (x + 2)(x + 2) = 0, or more concisely, (x + 2)² = 0. Solving for x is now
#     trivial: x + 2 = 0, thus x = -2.
#
#     Okay, just to be absolutely certain, I'll run the quadratic formula just to
#     double-check. x = [-b ± √(b² - 4ac)] / 2a. Plugging in the values, x = [-4 ±
#     √(4² - 4 * 1 * 4)] / (2 * 1). That simplifies to x = [-4 ± √0] / 2. So, x =
#     -2 again - a repeated root. Nice.
#
#     Now, let's check via completing the square. Starting from the same equation,
#     (x² + 4x) = -4. Take half of the b-value (4/2 = 2), square it (2² = 4), and
#     add it to both sides, so x² + 4x + 4 = -4 + 4. Which simplifies into (x + 2)²
#     = 0. The square root on both sides gives us x + 2 = 0, therefore x = -2, as
#      expected.
#
#     Always, *always* confirm! Let's substitute x = -2 back into the original
#     equation: (-2)² + 4(-2) + 4 = 0. That's 4 - 8 + 4 = 0. It checks out.
#
#     Conclusion: the solution is x = -2. Confirmed.

Stream thought summaries

You can also view thought summaries if you choose to stream a response using generateContentStream. This returns rolling, incremental summaries during response generation.
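
The streaming samples on this page share one pattern: append each chunk's thought summary and text to two running strings. Here is a minimal sketch of that accumulation step, using stand-in chunk objects instead of a real stream so that it runs without the SDK. `accumulate` and `makeChunk` are hypothetical helpers; the `thoughtSummary()` and `text()` accessors mirror the Web sample.

```javascript
// Folds one streamed chunk into the running totals.
// A chunk is assumed to expose thoughtSummary() and text(), as in the Web sample.
function accumulate(state, chunk) {
  const thought = chunk.thoughtSummary();
  const text = chunk.text();
  return {
    thoughts: state.thoughts + (thought || ""),
    answer: state.answer + (text || ""),
  };
}

// Stand-in chunks: thought summaries first, then the answer text.
const makeChunk = (thought, text) => ({
  thoughtSummary: () => thought,
  text: () => text,
});
const chunks = [
  makeChunk("Factor the quadratic. ", ""),
  makeChunk("Check the root.", ""),
  makeChunk(undefined, "x = "),
  makeChunk(undefined, "-2"),
];

const result = chunks.reduce(accumulate, { thoughts: "", answer: "" });
console.log(result.thoughts); // Factor the quadratic. Check the root.
console.log(result.answer); // x = -2
```

With a real stream, the same function could be called inside the `for await` loop, once per chunk.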

Swift

Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.


// ...

// Set the thinking configuration
// Optionally enable thought summaries in the generated response (default is false)
let generationConfig = GenerationConfig(
  thinkingConfig: ThinkingConfig(includeThoughts: true)
)

// Specify the config as part of creating the `GenerativeModel` instance
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "GEMINI_MODEL_NAME",
  generationConfig: generationConfig
)

let stream = try model.generateContentStream("solve x^2 + 4x + 4 = 0")

// Handle the streamed response that includes thought summaries
var thoughts = ""
var answer = ""
for try await response in stream {
  if let thought = response.thoughtSummary {
    if thoughts.isEmpty {
      print("--- Thoughts Summary ---")
    }
    print(thought)
    thoughts += thought
  }

  if let text = response.text {
    if answer.isEmpty {
      print("--- Answer ---")
    }
    print(text)
    answer += text
  }
}

Kotlin

Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.


// ...

// Set the thinking configuration
// Optionally enable thought summaries in the generated response (default is false)
val generationConfig = generationConfig {
  thinkingConfig = thinkingConfig {
      includeThoughts = true
  }
}

// Specify the config as part of creating the `GenerativeModel` instance
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
  modelName = "GEMINI_MODEL_NAME",
  generationConfig,
)

// Handle the streamed response that includes thought summaries
var thoughts = ""
var answer = ""
model.generateContentStream("solve x^2 + 4x + 4 = 0").collect { response ->
    response.thoughtSummary?.let {
        if (thoughts.isEmpty()) {
            println("--- Thoughts Summary ---")
        }
        print(it)
        thoughts += it
    }
    response.text?.let {
        if (answer.isEmpty()) {
            println("--- Answer ---")
        }
        print(it)
        answer += it
    }
}

Java

Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.


// ...

// Set the thinking configuration
// Optionally enable thought summaries in the generated response (default is false)
ThinkingConfig thinkingConfig = new ThinkingConfig.Builder()
    .setIncludeThoughts(true)
    .build();

GenerationConfig generationConfig = GenerationConfig.builder()
    .setThinkingConfig(thinkingConfig)
    .build();

// Specify the config as part of creating the `GenerativeModel` instance
GenerativeModelFutures model = GenerativeModelFutures.from(
        FirebaseAI.getInstance(GenerativeBackend.googleAI())
                .generativeModel(
                  /* modelName */ "GEMINI_MODEL_NAME",
                  /* generationConfig */ generationConfig
                )
);

// Streaming with Java is complex and depends on the async library used.
// This is a conceptual example using a reactive stream.
Flowable<GenerateContentResponse> responseStream = model.generateContentStream("solve x^2 + 4x + 4 = 0");

// Handle the streamed response that includes thought summaries
StringBuilder thoughts = new StringBuilder();
StringBuilder answer = new StringBuilder();

responseStream.subscribe(response -> {
    if (response.getThoughtSummary() != null) {
        if (thoughts.length() == 0) {
            System.out.println("--- Thoughts Summary ---");
        }
        System.out.print(response.getThoughtSummary());
        thoughts.append(response.getThoughtSummary());
    }
    if (response.getText() != null) {
        if (answer.length() == 0) {
            System.out.println("--- Answer ---");
        }
        System.out.print(response.getText());
        answer.append(response.getText());
    }
}, throwable -> {
    // Handle error
});

Web

Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.


// ...

const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Set the thinking configuration
// Optionally enable thought summaries in the generated response (default is false)
const generationConfig = {
  thinkingConfig: {
    includeThoughts: true
  }
};

// Specify the config as part of creating the `GenerativeModel` instance
const model = getGenerativeModel(ai, { model: "GEMINI_MODEL_NAME", generationConfig });

const result = await model.generateContentStream("solve x^2 + 4x + 4 = 0");

// Handle the streamed response that includes thought summaries
let thoughts = "";
let answer = "";
for await (const chunk of result.stream) {
  if (chunk.thoughtSummary()) {
    if (thoughts === "") {
      console.log("--- Thoughts Summary ---");
    }
    // In Node.js, process.stdout.write(chunk.thoughtSummary()) could be used
    // to avoid extra newlines.
    console.log(chunk.thoughtSummary());
    thoughts += chunk.thoughtSummary();
  }

  const text = chunk.text();
  if (text) {
    if (answer === "") {
      console.log("--- Answer ---");
    }
    // In Node.js, process.stdout.write(text) could be used.
    console.log(text);
    answer += text;
  }
}

Dart

Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.


// ...

// Set the thinking configuration
// Optionally enable thought summaries in the generated response (default is false)
final thinkingConfig = ThinkingConfig(includeThoughts: true);

final generationConfig = GenerationConfig(
  thinkingConfig: thinkingConfig
);

// Specify the config as part of creating the `GenerativeModel` instance
final model = FirebaseAI.googleAI().generativeModel(
  model: 'GEMINI_MODEL_NAME',
  generationConfig: generationConfig,
);

// generateContentStream expects a list of Content objects, not a raw string
final responses = model.generateContentStream(
  [Content.text('solve x^2 + 4x + 4 = 0')],
);

// Handle the streamed response that includes thought summaries
var thoughts = '';
var answer = '';
await for (final response in responses) {
  if (response.thoughtSummary != null) {
    if (thoughts.isEmpty) {
      print('--- Thoughts Summary ---');
    }
    print(response.thoughtSummary);
    thoughts += response.thoughtSummary!;
  }
  if (response.text != null) {
    if (answer.isEmpty) {
      print('--- Answer ---');
    }
    print(response.text);
    answer += response.text!;
  }
}

Unity

Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.


// ...

// Set the thinking configuration
// Optionally enable thought summaries in the generated response (default is false)
var thinkingConfig = new ThinkingConfig(includeThoughts: true);

var generationConfig = new GenerationConfig(
  thinkingConfig: thinkingConfig
);

// Specify the config as part of creating the `GenerativeModel` instance
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "GEMINI_MODEL_NAME",
  generationConfig: generationConfig
);

var stream = model.GenerateContentStreamAsync("solve x^2 + 4x + 4 = 0");

// Handle the streamed response that includes thought summaries
var thoughts = "";
var answer = "";
await foreach (var response in stream)
{
    if (response.ThoughtSummary != null)
    {
        if (string.IsNullOrEmpty(thoughts))
        {
            Debug.Log("--- Thoughts Summary ---");
        }
        Debug.Log(response.ThoughtSummary);
        thoughts += response.ThoughtSummary;
    }
    if (response.Text != null)
    {
        if (string.IsNullOrEmpty(answer))
        {
            Debug.Log("--- Answer ---");
        }
        Debug.Log(response.Text);
        answer += response.Text;
    }
}

Thought signatures

When using thinking in multi-turn interactions, the model doesn't have access to thought context from previous turns. However, if you're using function calling, you can take advantage of thought signatures to maintain thought context across turns. Thought signatures are encrypted representations of the model's internal thought process, and they're available when you use thinking together with function calling. Specifically, thought signatures are generated when:

Thinking is enabled and thoughts are generated.
The request includes function declarations.

To take advantage of thought signatures, use function calling as normal. The Firebase AI Logic SDKs simplify the process by managing state and handling thought signatures for you automatically. The SDKs automatically pass any generated thought signatures between subsequent sendMessage or sendMessageStream calls in a Chat session.
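As a rough illustration of "using function calling as normal", the sketch below runs one user turn on a single chat session and answers any function calls the model requests. The helper name `sendWithTools`, the `handlers` map, and the `maxRounds` limit are hypothetical and not part of the SDK. The `chat` argument is assumed to behave like the session returned by `model.startChat()` in the Web SDK: `sendMessage()` resolves to a result whose `response` exposes `functionCalls()` and `text()`. The code never touches thought signatures itself. It only keeps every call on the same session, which is the part the SDK relies on.

```javascript
// Hypothetical helper (not part of the Firebase AI Logic SDK).
// Runs one user turn and resolves function calls until the model answers in text.
// `chat`     : assumed to expose sendMessage(), like a session from model.startChat()
// `handlers` : hypothetical map of function name -> local async implementation
async function sendWithTools(chat, prompt, handlers, maxRounds = 5) {
  let result = await chat.sendMessage(prompt);

  for (let round = 0; round < maxRounds; round++) {
    // functionCalls() may return undefined when the model requested nothing
    const calls = result.response.functionCalls() ?? [];
    if (calls.length === 0) {
      return result.response.text();
    }

    // Run each requested function locally and wrap its output
    // as a functionResponse part.
    const parts = [];
    for (const call of calls) {
      const handler = handlers[call.name];
      if (!handler) {
        throw new Error(`No handler for function "${call.name}"`);
      }
      const output = await handler(call.args);
      parts.push({ functionResponse: { name: call.name, response: output } });
    }

    // Reply on the SAME chat session so the session history, including any
    // thought signatures the SDK tracks, carries over to the next call.
    result = await chat.sendMessage(parts);
  }

  throw new Error(`Model still requested functions after ${maxRounds} rounds`);
}
```

With a real model, the session would come from something like `const chat = model.startChat();` on a model configured with function declarations and thinking enabled. Creating a fresh session for the function response instead of reusing `chat` would drop the history that the signatures depend on.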

Pricing and counting thinking tokens

Thinking tokens use the same pricing as text-output tokens. If you enable thought summaries, they are counted as thinking tokens and priced accordingly.

You can enable AI monitoring in the Firebase console to monitor the count of thinking tokens for requests that have thinking enabled.

You can get the total number of thinking tokens from the thoughtsTokenCount field in the usageMetadata attribute of the response:

Swift

// ...

let response = try await model.generateContent("Why is the sky blue?")

if let usageMetadata = response.usageMetadata {
  print("Thoughts Token Count: \(usageMetadata.thoughtsTokenCount)")
}

Kotlin

// ...

val response = model.generateContent("Why is the sky blue?")

response.usageMetadata?.let { usageMetadata ->
    println("Thoughts Token Count: ${usageMetadata.thoughtsTokenCount}")
}

Java

// ...

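// The Java API takes a Content object rather than a raw string
Content prompt = new Content.Builder()
    .addText("Why is the sky blue?")
    .build();

ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);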

Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        UsageMetadata usageMetadata = result.getUsageMetadata();
        if (usageMetadata != null) {
            System.out.println("Thoughts Token Count: " +
                usageMetadata.getThoughtsTokenCount());
        }
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);

Web

// ...

const result = await model.generateContent("Why is the sky blue?");

// The usage metadata is on the `response` property of the result
const usageMetadata = result.response.usageMetadata;
if (usageMetadata?.thoughtsTokenCount != null) {
    console.log(`Thoughts Token Count: ${usageMetadata.thoughtsTokenCount}`);
}

Dart

// ...

final response = await model.generateContent([
  Content.text("Why is the sky blue?"),
]);

if (response.usageMetadata case final usageMetadata?) {
  print("Thoughts Token Count: ${usageMetadata.thoughtsTokenCount}");
}

Unity

// ...

var response = await model.GenerateContentAsync("Why is the sky blue?");

if (response.UsageMetadata != null)
{
    UnityEngine.Debug.Log($"Thoughts Token Count: {response.UsageMetadata?.ThoughtsTokenCount}");
}

Learn more about tokens in the count tokens guide.
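Because thinking tokens are billed like text-output tokens, it can help to see how much of a response's output budget went to thinking. The sketch below is one way to do that from a usageMetadata object. The helper name `summarizeThinkingUsage` and the `pricePerMillionOutputTokens` parameter are hypothetical, and the caller must supply the current rate from the pricing page. It also assumes that `candidatesTokenCount` does not already include the thinking tokens, which is worth verifying against your own responses.

```javascript
// Hypothetical helper (not part of the Firebase AI Logic SDK).
// Summarizes how much of a response's billed output went to thinking.
// `usageMetadata`               : the usageMetadata object from a response (may be undefined)
// `pricePerMillionOutputTokens` : caller-supplied rate; no real price is assumed here
function summarizeThinkingUsage(usageMetadata, pricePerMillionOutputTokens) {
  const thoughtsTokenCount = usageMetadata?.thoughtsTokenCount ?? 0;
  const candidatesTokenCount = usageMetadata?.candidatesTokenCount ?? 0;

  // Assumption: candidatesTokenCount excludes thinking tokens, so the
  // billed output is the sum of both counts.
  const billedOutputTokens = thoughtsTokenCount + candidatesTokenCount;

  return {
    thoughtsTokenCount,
    billedOutputTokens,
    // Fraction of billed output tokens spent on thinking (0 when nothing was generated)
    thinkingShare: billedOutputTokens === 0 ? 0 : thoughtsTokenCount / billedOutputTokens,
    // Thinking and answer tokens share the same output rate
    estimatedOutputCost: (billedOutputTokens / 1e6) * pricePerMillionOutputTokens,
  };
}
```

A consistently high `thinkingShare` on simple prompts may suggest lowering the thinking level or thinking budget for that workload.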
