This page describes configuration options for hybrid and on-device experiences.
Make sure that you've completed the getting started guide for building hybrid experiences.
Set an "inference mode"
The examples in the getting started guide attempt on-device inference first and then fall back to the cloud-hosted model. This is only one of the available "inference modes" that you can implement.
Hybrid inference
Prefer on-device inference: set primary to a "system" model and secondary to a cloud model.
Attempt to use the on-device model if it's available and supports the type of request. Otherwise, log an error on the device and then automatically fall back to the cloud-hosted model.

// Imports + initialization of Gemini API backend service
// ...

// Initialize a cloud model that supports your use case
let cloudModel = ai.geminiModel(name: "GEMINI_MODEL_NAME")

// Initialize an on-device model that supports your use case
let systemModel = FirebaseAI.SystemLanguageModel.default

// Create a GenerativeModelSession with a hybrid model.
// Provide your preferred model as `primary` and your fallback model as `secondary`
// Attempt to use the on-device model; otherwise, fall back to the cloud-hosted model.
let session = ai.generativeModelSession(
    model: .hybridModel(primary: systemModel, secondary: cloudModel)
)

Prefer in-cloud inference: set primary to a cloud model and secondary to a "system" model.
Attempt to use the cloud-hosted model if the device is online and if the model is available. If the device is offline, fall back to the on-device model. In all other failure cases, throw an exception.

// Imports + initialization of Gemini API backend service
// ...

// Initialize a cloud model that supports your use case
let cloudModel = ai.geminiModel(name: "GEMINI_MODEL_NAME")

// Initialize an on-device model that supports your use case
let systemModel = FirebaseAI.SystemLanguageModel.default

// Create a GenerativeModelSession with a hybrid model.
// Provide your preferred model as `primary` and your fallback model as `secondary`
// Attempt to use the cloud-hosted model; otherwise, fall back to the on-device model.
let session = ai.generativeModelSession(
    model: .hybridModel(primary: cloudModel, secondary: systemModel)
)
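To make the primary/secondary ordering concrete, here is a standalone sketch of the fallback idea. It does not use the Firebase AI Logic SDK: `TextModel`, `StubModel`, `ModelUnavailable`, and `HybridSketch` are hypothetical stand-ins invented for illustration. The sketch also simplifies the behavior by falling back on any error, whereas the SDK applies the mode-specific rules described above (for example, falling back from the cloud only when the device is offline).

```swift
// Hypothetical stand-in types; none of these are Firebase AI Logic APIs.
protocol TextModel {
    var name: String { get }
    func respond(to prompt: String) throws -> String
}

struct ModelUnavailable: Error {}

// A fake model that either answers or throws, depending on `isUsable`.
struct StubModel: TextModel {
    let name: String
    let isUsable: Bool

    func respond(to prompt: String) throws -> String {
        guard isUsable else { throw ModelUnavailable() }
        return "[\(name)] answer to: \(prompt)"
    }
}

// Tries `primary` first; if it throws, tries `secondary`.
// Assumption: fall back on ANY error (a simplification of the SDK's rules).
struct HybridSketch {
    let primary: any TextModel
    let secondary: any TextModel

    func respond(to prompt: String) throws -> String {
        do {
            return try primary.respond(to: prompt)
        } catch {
            return try secondary.respond(to: prompt)
        }
    }
}

// Simulate a device where the on-device model is not usable.
let onDevice = StubModel(name: "on-device", isUsable: false)
let cloud = StubModel(name: "cloud", isUsable: true)

let preferOnDevice = HybridSketch(primary: onDevice, secondary: cloud)
let preferCloud = HybridSketch(primary: cloud, secondary: onDevice)

if let answer = try? preferOnDevice.respond(to: "Why is the sky blue?") {
    print(answer) // prints "[cloud] answer to: Why is the sky blue?"
}
```

In this simulation both orderings end up answering from the cloud stub, but for different reasons: `preferOnDevice` falls back after the primary throws, while `preferCloud` succeeds on its first attempt.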
Only on-device or only in-cloud inference
The SDK also supports setting a single model, in which case it attempts only
on-device or only in-cloud inference, with no fallback. You don't create a
HybridModel for this use case. For a hybrid experience, however, you do need
to create a HybridModel and set both primary and secondary models
(as described above).
Only on-device inference: set model to a "system" model. You don't create a HybridModel for this use case.
Attempt to use the on-device model if it's available and supports the type of request. Otherwise, throw an exception.

// Imports + initialization of Gemini API backend service
// ...

// Initialize an on-device model that supports your use case
let systemModel = FirebaseAI.SystemLanguageModel.default

// Create a GenerativeModelSession with the on-device model.
let session = ai.generativeModelSession(
    model: systemModel
)

Only in-cloud inference: set model to a cloud model. You don't create a HybridModel for this use case.
Attempt to use the cloud-hosted model if the device is online and if the model is available. Otherwise, throw an exception.

// Imports + initialization of Gemini API backend service
// ...

// Initialize a cloud model that supports your use case
let cloudModel = ai.geminiModel(name: "GEMINI_MODEL_NAME")

// Create a GenerativeModelSession with a cloud model.
let session = ai.generativeModelSession(
    model: cloudModel
)
Check if the on-device model is available
Manual checks for on-device availability are only necessary if you want to
surface that information to the user or ask end users to take action to
download the on-device model. If the on-device model is unavailable, and
you've set primary to an on-device model and secondary to a cloud model,
then the SDK automatically falls back to the cloud-hosted model.
To manually check whether the on-device model is actually usable, inspect the
isAvailable property:
if FirebaseAI.SystemLanguageModel.default.isAvailable {
    // The on-device model is ready to use.
} else {
    // The on-device model is unavailable.
}
To check for specific on-device model availability reasons, inspect the
availability property:
switch FirebaseAI.SystemLanguageModel.default.availability {
case .available:
    // The on-device model is ready to use.
    break
case .unavailable(.deviceNotEligible):
    // This device does not support Apple Intelligence.
    break
case .unavailable(.appleIntelligenceNotEnabled):
    // The user has not enabled Apple Intelligence in Settings.
    break
case .unavailable(.modelNotReady):
    // The model is still being downloaded.
    break
case let .unavailable(reason):
    // The model is unavailable due to the specified `reason`.
    break
}
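If the goal of the manual check is to tell the user what to do next, one option is to map each unavailability reason to a short message. The sketch below is standalone and does not import the SDK: `AvailabilitySketch` and `UnavailableReasonSketch` are hypothetical enums that only mirror the cases shown above, and the message strings are placeholders, not recommended copy.

```swift
// Hypothetical stand-ins that mirror the availability cases shown above.
// They are NOT the SDK's types; swap in the real ones in an app.
enum UnavailableReasonSketch {
    case deviceNotEligible
    case appleIntelligenceNotEnabled
    case modelNotReady
}

enum AvailabilitySketch {
    case available
    case unavailable(UnavailableReasonSketch)
}

// Returns a message to show the user, or nil when no action is needed.
func userMessage(for availability: AvailabilitySketch) -> String? {
    switch availability {
    case .available:
        return nil
    case .unavailable(.deviceNotEligible):
        return "On-device AI isn't supported on this device."
    case .unavailable(.appleIntelligenceNotEnabled):
        return "Turn on Apple Intelligence in Settings to use on-device AI."
    case .unavailable(.modelNotReady):
        return "The on-device model is still downloading. Try again later."
    }
}

if let message = userMessage(for: .unavailable(.modelNotReady)) {
    print(message) // prints "The on-device model is still downloading. Try again later."
}
```

Returning `nil` for the available case lets the caller skip showing any banner. With the real SDK type, a catch-all case like the `case let .unavailable(reason)` branch above would still be needed for reasons not listed here.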
Determine whether on-device or in-cloud inference was used
If you use a HybridModel (and set both primary and secondary models),
then it might be helpful to know which model was used for a given request.
This information is provided by the modelVersion property of rawResponse in
each response.
When you access this property, the returned value will be one of the following:
- Cloud-hosted model used: the model name, for example gemini-3.1-flash-lite
- On-device model used: apple-foundation-models-system-language-model
// let response = try await session.respond(to: ...
print("You used: \(response.rawResponse.modelVersion)")
print(response.content)
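For logging or analytics, it can be convenient to turn the raw modelVersion string into a typed value. The following standalone sketch does that by comparing against the on-device identifier listed above. `InferenceSource` and `inferenceSource(forModelVersion:)` are hypothetical names introduced here, and accepting an optional string is an assumption made so the helper also handles a missing value.

```swift
// Hypothetical helper types; not part of the Firebase AI Logic SDK.
enum InferenceSource: Equatable {
    case onDevice
    case cloud(modelName: String)
    case unknown
}

// Identifier reported when the on-device model handled the request
// (taken from the list above).
let onDeviceModelVersion = "apple-foundation-models-system-language-model"

// Classifies a modelVersion string. Assumption: any non-empty value other
// than the on-device identifier is treated as a cloud model name.
func inferenceSource(forModelVersion modelVersion: String?) -> InferenceSource {
    guard let version = modelVersion, !version.isEmpty else {
        return .unknown
    }
    if version == onDeviceModelVersion {
        return .onDevice
    }
    return .cloud(modelName: version)
}

switch inferenceSource(forModelVersion: "gemini-3.1-flash-lite") {
case .onDevice:
    print("Answered on device")
case .cloud(let modelName):
    print("Answered in the cloud by \(modelName)") // prints "Answered in the cloud by gemini-3.1-flash-lite"
case .unknown:
    print("Model version not reported")
}
```

In an app, the argument would be the `response.rawResponse.modelVersion` value shown in the snippet above.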
Use model configuration to control responses
In each request to a model, you can send a model configuration to control how the model generates a response. Cloud-hosted models and on-device models offer different configuration options.
- Cloud-hosted models: set their configuration in a GenerationConfig.
- On-device models: set their configuration within FirebaseAI.GenerationOptions.
These options are configured for each request to the model.
Here's an example that sets the configurations for the cloud-hosted and on-device models for hybrid inference:
// ...

let response = try await session.respond(
    to: "Why is the sky blue?",
    options: .hybrid(
        // Config for cloud-hosted model
        gemini: GenerationConfig(
            temperature: 0.8,
            topP: 0.9,
            thinkingConfig: ThinkingConfig(thinkingLevel: .high)
        ),
        // Config for on-device model
        foundationModels: FirebaseAI.GenerationOptions(
            sampling: .random(probabilityThreshold: 0.9),
            temperature: 0.8
        )
    )
)

// ...