This page describes the following configuration options:

- Use model configuration to control responses, like temperature.
- Generate structured output, including JSON and enums.
Before you begin
Make sure that you've completed the getting started guide for building hybrid experiences.
Set an inference mode
The examples in the getting started guide use the PREFER_ON_DEVICE mode, but
this is only one of the four available
inference modes.
- PREFER_ON_DEVICE: Use the on-device model if it's available; otherwise, fall back to the cloud-hosted model.

  ```javascript
  const model = getGenerativeModel(ai, { mode: InferenceMode.PREFER_ON_DEVICE });
  ```

- ONLY_ON_DEVICE: Use the on-device model if it's available; otherwise, throw an exception.

  ```javascript
  const model = getGenerativeModel(ai, { mode: InferenceMode.ONLY_ON_DEVICE });
  ```

- PREFER_IN_CLOUD: Use the cloud-hosted model if it's available; otherwise, fall back to the on-device model.

  ```javascript
  const model = getGenerativeModel(ai, { mode: InferenceMode.PREFER_IN_CLOUD });
  ```

- ONLY_IN_CLOUD: Use the cloud-hosted model if it's available; otherwise, throw an exception.

  ```javascript
  const model = getGenerativeModel(ai, { mode: InferenceMode.ONLY_IN_CLOUD });
  ```
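If the mode depends on runtime conditions (for example, a user setting that requires prompts to stay on-device), you can pick the mode before creating the model instance. This is a minimal sketch; the `chooseInferenceMode` helper and the `requireLocal` / `userPrefersLocal` flags are hypothetical, while the `InferenceMode` values are the ones listed above:

```javascript
// Pick an inference mode from hypothetical app-level flags.
// requireLocal: never send prompts to the cloud (throws if no local model).
// userPrefersLocal: try on-device first, but allow cloud fallback.
function chooseInferenceMode(InferenceMode, { requireLocal, userPrefersLocal }) {
  if (requireLocal) return InferenceMode.ONLY_ON_DEVICE;
  if (userPrefersLocal) return InferenceMode.PREFER_ON_DEVICE;
  return InferenceMode.PREFER_IN_CLOUD;
}

// Usage with the SDK (sketch):
// const model = getGenerativeModel(ai, {
//   mode: chooseInferenceMode(InferenceMode, { requireLocal: false, userPrefersLocal: true }),
// });
```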
Determine whether on-device or in-cloud inference was used
If you use the PREFER_ON_DEVICE or PREFER_IN_CLOUD inference modes, it
might be helpful to know which source served a given request. This
information is provided by the inferenceSource property of each response
(available starting with JS SDK v12.5.0).
When you access this property, the returned value will be either
ON_DEVICE or IN_CLOUD.
```javascript
// ...
console.log('You used: ' + result.response.inferenceSource);
console.log(result.response.text());
```
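For example, you might show different UI messaging depending on where inference ran. A small sketch: the `describeSource` helper is hypothetical, and it assumes the property's string values match the ON_DEVICE / IN_CLOUD names documented above (compare against the SDK's own constants in real code):

```javascript
// Map a response's inferenceSource to a user-facing label.
// Assumes the value is the string 'ON_DEVICE' or 'IN_CLOUD'.
function describeSource(response) {
  switch (response.inferenceSource) {
    case 'ON_DEVICE':
      return 'Generated locally on this device';
    case 'IN_CLOUD':
      return 'Generated by the cloud-hosted model';
    default:
      return 'Unknown inference source';
  }
}
```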
Override the default fallback model
The default cloud-hosted model is gemini-2.5-flash-lite. This model serves
as the fallback when you use the PREFER_ON_DEVICE mode, and as the default
model when you use the ONLY_IN_CLOUD or PREFER_IN_CLOUD modes.
You can use the
inCloudParams
configuration option to specify an alternative default cloud-hosted model.
```javascript
const model = getGenerativeModel(ai, {
  mode: InferenceMode.INFERENCE_MODE,
  inCloudParams: {
    model: "GEMINI_MODEL_NAME"
  }
});
```
Find model names for all supported Gemini models.
Use model configuration to control responses
In each request to a model, you can send along a model configuration to control how the model generates a response. Cloud-hosted models and on-device models offer different configuration options.
The configuration is maintained for the lifetime of the instance. If you want to
use a different config, create a new GenerativeModel instance with that
config.
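For example, to use two different sampling configurations in the same session, create two instances. This is an illustrative sketch; the instance names and temperature values are arbitrary, and the model name is a placeholder as elsewhere on this page:

```javascript
// Each GenerativeModel instance keeps its configuration for its lifetime,
// so create separate instances rather than mutating one.
const preciseModel = getGenerativeModel(ai, {
  mode: InferenceMode.PREFER_ON_DEVICE,
  inCloudParams: { model: "GEMINI_MODEL_NAME", temperature: 0.2 }
});

const creativeModel = getGenerativeModel(ai, {
  mode: InferenceMode.PREFER_ON_DEVICE,
  inCloudParams: { model: "GEMINI_MODEL_NAME", temperature: 0.9 }
});
```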
Configure cloud-hosted model
Use the
inCloudParams
option to configure a cloud-hosted Gemini model. Learn about
available parameters.
```javascript
const model = getGenerativeModel(ai, {
  mode: InferenceMode.INFERENCE_MODE,
  inCloudParams: {
    model: "GEMINI_MODEL_NAME",
    temperature: 0.8,
    topK: 10
  }
});
```
Configure on-device model
Note that inference using an on-device model uses the Prompt API from Chrome.
Use the
onDeviceParams
option to configure an on-device model. Learn about
available parameters.
```javascript
const model = getGenerativeModel(ai, {
  mode: InferenceMode.INFERENCE_MODE,
  onDeviceParams: {
    createOptions: {
      temperature: 0.8,
      topK: 8
    }
  }
});
```
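Because a PREFER_* mode can route a request to either model, it can make sense to pass both inCloudParams and onDeviceParams in the same call so that whichever model serves the request is configured. A sketch, assuming the SDK accepts both options together (the parameter values are illustrative):

```javascript
const model = getGenerativeModel(ai, {
  mode: InferenceMode.PREFER_ON_DEVICE,
  // Used when the request is served by the cloud-hosted model.
  inCloudParams: {
    model: "GEMINI_MODEL_NAME",
    temperature: 0.8,
    topK: 10
  },
  // Used when the request is served by the on-device model.
  onDeviceParams: {
    createOptions: {
      temperature: 0.8,
      topK: 8
    }
  }
});
```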