This page describes the following configuration options for hybrid experiences: setting an inference mode, determining whether on-device or in-cloud inference was used, specifying a model to use, and controlling responses with model configuration.
Make sure that you've completed the getting started guide for building hybrid experiences.
Set an inference mode
The examples in the getting started guide use the PREFER_ON_DEVICE mode, but
this is only one of the four available
inference modes.
Here are the available inference modes:
PREFER_ON_DEVICE: Attempt to use the on-device model if it's available and supports the type of request. Otherwise, log an error on the device and then automatically fall back to the cloud-hosted model.

Kotlin
val config = OnDeviceConfig(mode = InferenceMode.PREFER_ON_DEVICE)

Java
InferenceMode mode = InferenceMode.PREFER_ON_DEVICE;
OnDeviceConfig config = new OnDeviceConfig(mode);

ONLY_ON_DEVICE: Attempt to use the on-device model if it's available and supports the type of request. Otherwise, throw an exception.

Kotlin
val config = OnDeviceConfig(mode = InferenceMode.ONLY_ON_DEVICE)

Java
InferenceMode mode = InferenceMode.ONLY_ON_DEVICE;
OnDeviceConfig config = new OnDeviceConfig(mode);

PREFER_IN_CLOUD: Attempt to use the cloud-hosted model if the device is online and if the model is available. If the device is offline, fall back to the on-device model. In all other failure cases, throw an exception.

Kotlin
val config = OnDeviceConfig(mode = InferenceMode.PREFER_IN_CLOUD)

Java
InferenceMode mode = InferenceMode.PREFER_IN_CLOUD;
OnDeviceConfig config = new OnDeviceConfig(mode);

ONLY_IN_CLOUD: Attempt to use the cloud-hosted model if the device is online and if the model is available. Otherwise, throw an exception.

Kotlin
val config = OnDeviceConfig(mode = InferenceMode.ONLY_IN_CLOUD)

Java
InferenceMode mode = InferenceMode.ONLY_IN_CLOUD;
OnDeviceConfig config = new OnDeviceConfig(mode);
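The config only takes effect when you pass it to generativeModel as you create the model, which is shown in the sections that follow. As a minimal sketch (using the same Kotlin API as the examples later on this page, with CLOUD_HOSTED_MODEL_NAME as a placeholder):

Kotlin
val model = Firebase.ai(backend = GenerativeBackend.googleAI())
    .generativeModel(
        // Placeholder: substitute a supported cloud-hosted Gemini model name.
        modelName = "CLOUD_HOSTED_MODEL_NAME",
        onDeviceConfig = OnDeviceConfig(mode = InferenceMode.PREFER_ON_DEVICE)
    )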
Determine whether on-device or in-cloud inference was used
If your inference mode is PREFER_ON_DEVICE or PREFER_IN_CLOUD, it can be helpful to know whether a given request was handled on-device or in the cloud. This information is provided by the inferenceSource property of each response.
When you access this property, the returned value will be either ON_DEVICE or
IN_CLOUD.
Kotlin
// ...
print("You used: ${result.response.inferenceSource}")
print(result.response.text)
Java
// ...
System.out.println("You used: " + result.getResponse().getInferenceSource());
System.out.println(result.getResponse().getText());
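For example, you might want to record when requests are served by the cloud-hosted model. Here's a minimal Kotlin sketch, assuming result was obtained as in the snippet above, that the property's values are exposed as an InferenceSource enum (check the SDK reference for the exact type), and that trackCloudInference() is a hypothetical analytics helper of your own, not part of the SDK:

Kotlin
// Branch on where the request was actually served.
// `InferenceSource` is an assumed type name; `trackCloudInference()` is a hypothetical helper.
if (result.response.inferenceSource == InferenceSource.ON_DEVICE) {
    // Served by the on-device model; no network round trip was needed.
} else {
    // Served by (or fell back to) the cloud-hosted model.
    trackCloudInference()
}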
Specify a model to use
You can specify a model to use when you create the generativeModel instance (Kotlin | Java).
Specify a cloud-hosted model:
If your inference mode is PREFER_ON_DEVICE, PREFER_IN_CLOUD, or ONLY_IN_CLOUD, then you must explicitly specify a cloud-hosted model to use. The SDK does not have a default cloud-hosted model. Find model names for all supported cloud-hosted Gemini models.
Specify an on-device model:
If your inference mode is PREFER_ON_DEVICE, PREFER_IN_CLOUD, or ONLY_ON_DEVICE, then you can optionally specify in the onDeviceConfig a "category" of on-device model to use. Categories are a combination of release stage and performance characteristics. Supported category values are listed below.

AICore auto-selects the on-device model that meets the conditions of the specified category and is supported by the device. For example, if you specify PREVIEW and the device is a Pixel 9, then Gemini Nano 4 Full [Preview] (nano-v4-full) would likely be auto-selected.

STABLE: The latest stable on-device model. Fully tested and on consumer devices. For example, Gemini Nano 3 (nano-v3) or Gemini Nano 2 (nano-v2). This is the default setting for the on-device model if no OnDeviceModelOption is specified.

PREVIEW: The latest preview on-device model with full performance capabilities. Designed for higher reasoning power and complex tasks. For example, Gemini Nano 4 Full [Preview] (nano-v4-full), which is based on Gemma 4 E4B.

PREVIEW_FAST: The latest preview on-device model optimized for maximum speed and lower latency. For example, Gemini Nano 4 Fast [Preview] (nano-v4-fast), which is based on Gemma 4 E2B.
Kotlin
val model = Firebase.ai(backend = GenerativeBackend.googleAI())
.generativeModel(
// Specify a cloud-hosted model.
// Required for `PREFER_ON_DEVICE`, `PREFER_IN_CLOUD`, and `ONLY_IN_CLOUD` inference modes.
modelName = "CLOUD_HOSTED_MODEL_NAME",
onDeviceConfig = OnDeviceConfig(
mode = InferenceMode.INFERENCE_MODE,
// (Optional) Specify an on-device model category.
// AICore will auto-select an on-device model based on this category.
// If not specified, AICore will auto-select the default stable on-device model.
modelOption = OnDeviceModelOption.ON_DEVICE_MODEL_CATEGORY)
)
Java
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI())
.generativeModel(
// Specify a cloud-hosted model.
// Required for `PREFER_ON_DEVICE`, `PREFER_IN_CLOUD`, and `ONLY_IN_CLOUD` inference modes.
"CLOUD_HOSTED_MODEL_NAME",
/* config = */ null,
/* safetySettings = */ null,
/* tools = */ null,
/* toolConfig = */ null,
/* systemInstruction = */ null,
/* requestOptions = */ new RequestOptions(),
new OnDeviceConfig(
/* mode = */ InferenceMode.INFERENCE_MODE,
/* maxOutputTokens = */ null,
/* temperature = */ null,
/* topK = */ null,
/* seed = */ null,
/* candidateCount = */ 1,
// (Optional) Specify an on-device model category.
// AICore will auto-select an on-device model based on this category.
// If not specified, AICore will auto-select the default stable on-device model.
/* modelOption = */ OnDeviceModelOption.ON_DEVICE_MODEL_CATEGORY)
);
GenerativeModelFutures model = GenerativeModelFutures.from(ai);
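For reference, here's the same Kotlin pattern with the placeholders filled in. This is a sketch with assumed values: "gemini-2.5-flash" stands in for whichever supported cloud-hosted Gemini model you choose, and PREVIEW requests the preview on-device model category described above:

Kotlin
val model = Firebase.ai(backend = GenerativeBackend.googleAI())
    .generativeModel(
        // Assumed cloud-hosted model name; substitute any supported Gemini model.
        modelName = "gemini-2.5-flash",
        onDeviceConfig = OnDeviceConfig(
            mode = InferenceMode.PREFER_ON_DEVICE,
            // AICore auto-selects the latest preview on-device model for this category.
            modelOption = OnDeviceModelOption.PREVIEW
        )
    )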
Use model configuration to control responses
In each request to a model, you can send along a model configuration to control how the model generates a response. Cloud-hosted models and on-device models offer different configuration options (cloud vs. on-device parameters).

For cloud-hosted models, set the configuration in the model's generationConfig. For on-device models, set the configuration within the onDeviceConfig.
The configuration is maintained for the lifetime of the instance. If you want to
use a different config, create a new GenerativeModel instance with that
config.
Here's an example that sets configurations for both the cloud-hosted model and the on-device model, which could be used when the PREFER_ON_DEVICE inference mode is set:
Kotlin
val model = Firebase.ai(backend = GenerativeBackend.googleAI())
.generativeModel("MODEL_NAME",
// Config for cloud-hosted model
generationConfig = generationConfig {
temperature = 0.8f
topK = 10
},
// Config for on-device model
onDeviceConfig = onDeviceConfig {
mode = InferenceMode.PREFER_ON_DEVICE
temperature = 0.8f
topK = 5
})
Java
// Config for cloud-hosted model
GenerationConfig generationConfig = new GenerationConfig.Builder()
.setTemperature(0.8f)
.setTopK(10)
.build();
// Config for on-device model
OnDeviceConfig onDeviceConfig = new OnDeviceConfig.Builder()
.setMode(InferenceMode.PREFER_ON_DEVICE)
.setTemperature(0.8f)
.setTopK(5)
.build();
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI())
.generativeModel(
"MODEL_NAME",
generationConfig,
/* safetySettings = */ null,
/* tools = */ null,
/* toolConfig = */ null,
/* systemInstruction = */ null,
/* requestOptions = */ new RequestOptions(),
onDeviceConfig
);
GenerativeModelFutures model = GenerativeModelFutures.from(ai);
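As noted above, a configuration is fixed for the lifetime of its GenerativeModel instance, so if you need different settings for different tasks, create separate instances. A minimal Kotlin sketch, using the same builders as the example above (the lower temperature here is just an assumed value for more deterministic output):

Kotlin
// A second instance with its own configuration; the first instance is unaffected.
val deterministicModel = Firebase.ai(backend = GenerativeBackend.googleAI())
    .generativeModel("MODEL_NAME",
        generationConfig = generationConfig {
            temperature = 0.2f
        },
        onDeviceConfig = onDeviceConfig {
            mode = InferenceMode.PREFER_ON_DEVICE
            temperature = 0.2f
        })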