Create Messaging Experiments with A/B Testing

When you are reaching out to your users or starting a new marketing campaign, you want to make sure that you get it right. A/B testing can help you find the optimal wording and presentation by testing message variants on selected portions of your user base. Whether your goal is better retention or conversion on an offer, A/B Testing performs statistical analysis to determine whether a message variant is outperforming the baseline for your selected objective.

To A/B test message variants against a baseline, do the following:

  1. Create your experiment.
  2. Validate your experiment on a test device.
  3. Manage your experiment.

Create an experiment

An experiment that uses the Notifications composer lets you evaluate multiple variants of a single notification message.

  1. Sign in to the Firebase console and verify that Google Analytics is enabled in your project so that the experiment has access to Analytics data.

    If you did not enable Google Analytics when creating your project, you can enable it on the Integrations tab, which you can access using Settings > Project settings in the Firebase console.

  2. In the Engage section of the Firebase console navigation bar, click A/B Testing.

  3. Click Create experiment, and then select Notifications when prompted for the service you want to experiment with.

  4. Enter a Name and optional Description for your experiment, and click Next.

  5. Fill out the Targeting fields, first choosing the app that uses your experiment. You can also target a subset of your users to participate in your experiment by choosing options that include the following:

    • Version: One or more versions of your app
    • User audience: Analytics audiences used to target users who might be included in the experiment
    • User property: One or more Analytics user properties for selecting users who might be included in the experiment
    • Country/Region: One or more countries or regions for selecting users who might be included in the experiment
    • Device language: One or more languages and locales used to select users who might be included in the experiment
    • First open: Target users based on the first time they ever opened your app
    • Last app engagement: Target users based on the last time they engaged with your app
  6. Set the Percentage of target users: Select the percentage of your app's user base matching the criteria set under Target users that you want to evenly divide between the baseline and one or more variants in your experiment. This can be any percentage between 0.01% and 100%. Users are randomly reassigned to the experiment groups for each experiment, including duplicated experiments.

  7. In the Variants section, type a message to send to the baseline group in the Enter message text field. To send no message to the baseline group, leave this field blank.

  8. (optional) To add more than one variant to your experiment, click Add Variant. By default, experiments have one baseline and one variant.

  9. (optional) Enter a name for each variant in your experiment to replace the names Variant A, Variant B, etc.

  10. Define a goal metric for your experiment to use when evaluating experiment variants, and select any additional metrics you want to track from the dropdown list. These metrics include built-in objectives (engagement, purchases, revenue, retention, etc.), Analytics conversion events, and other Analytics events.

  11. Choose options for your message:

    • Delivery date: Choose Send Now to launch your experiment immediately when you save it, or choose Scheduled to specify a future time to launch your experiment.
    • Advanced options: To choose advanced options for all notifications included in your experiment, expand Advanced options, and then change any of the listed message options.
  12. Click Review to save your experiment.

You are allowed up to 300 experiments per project, which could consist of up to 24 running experiments, with the rest as draft or completed.

Validate your experiment on a test device

You can retrieve the FCM registration token associated with each Firebase installation, and use this token to test specific experiment variants on a test device with your app installed. To validate your experiment on a test device, do the following:

  1. Get the FCM registration token as follows:

    Swift

    Messaging.messaging().token { token, error in
      if let error = error {
        print("Error fetching FCM registration token: \(error)")
      } else if let token = token {
        print("FCM registration token: \(token)")
        self.fcmRegTokenMessage.text = "Remote FCM registration token: \(token)"
      }
    }
    

    Objective-C

    [[FIRMessaging messaging] tokenWithCompletion:^(NSString *token, NSError *error) {
      if (error != nil) {
        NSLog(@"Error getting FCM registration token: %@", error);
      } else {
        NSLog(@"FCM registration token: %@", token);
        self.fcmRegTokenMessage.text = token;
      }
    }];
    

    Java

    FirebaseMessaging.getInstance().getToken()
        .addOnCompleteListener(new OnCompleteListener<String>() {
            @Override
            public void onComplete(@NonNull Task<String> task) {
              if (!task.isSuccessful()) {
                Log.w(TAG, "Fetching FCM registration token failed", task.getException());
                return;
              }
    
              // Get new FCM registration token
              String token = task.getResult();
    
              // Log and toast
              String msg = getString(R.string.msg_token_fmt, token);
              Log.d(TAG, msg);
              Toast.makeText(MainActivity.this, msg, Toast.LENGTH_SHORT).show();
            }
        });

    Kotlin+KTX

    FirebaseMessaging.getInstance().token.addOnCompleteListener(OnCompleteListener { task ->
        if (!task.isSuccessful) {
            Log.w(TAG, "Fetching FCM registration token failed", task.exception)
            return@OnCompleteListener
        }
    
        // Get new FCM registration token
        val token = task.result
    
        // Log and toast
        val msg = getString(R.string.msg_token_fmt, token)
        Log.d(TAG, msg)
        Toast.makeText(baseContext, msg, Toast.LENGTH_SHORT).show()
    })

    C++

    firebase::InitResult init_result;
    auto* installations_object = firebase::installations::Installations::GetInstance(
        firebase::App::GetInstance(), &init_result);
    installations_object->GetToken(false).OnCompletion(
        [](const firebase::Future<std::string>& future) {
          if (future.status() == firebase::kFutureStatusComplete &&
              future.error() == firebase::installations::kErrorNone) {
            printf("Installations Auth Token %s\n", future.result()->c_str());
          }
        });

    Unity

    Firebase.Messaging.FirebaseMessaging.DefaultInstance.GetTokenAsync().ContinueWith(
      task => {
        if (!(task.IsCanceled || task.IsFaulted) && task.IsCompleted) {
          UnityEngine.Debug.Log(System.String.Format("FCM registration token {0}", task.Result));
        }
      });
    
  2. On the Firebase console navigation bar, click A/B Testing.
  3. Click Draft, hover over your experiment, click the context menu, and then click Manage test devices.
  4. Enter the FCM token for a test device and choose the experiment variant to send to that test device.
  5. Run the app and confirm that the selected variant is being received on the test device.
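
To make receipt easier to confirm, you can log incoming messages on an Android test device while the app is in the foreground. The following minimal Kotlin sketch assumes a hypothetical service named MyFirebaseMessagingService registered in your app manifest; note that notification messages arriving while the app is in the background are delivered to the system tray rather than to onMessageReceived.

    Kotlin

    import android.util.Log
    import com.google.firebase.messaging.FirebaseMessagingService
    import com.google.firebase.messaging.RemoteMessage

    // Hypothetical service used only to verify experiment delivery; register
    // it in AndroidManifest.xml with the MESSAGING_EVENT intent filter.
    class MyFirebaseMessagingService : FirebaseMessagingService() {
        override fun onMessageReceived(remoteMessage: RemoteMessage) {
            // Log the sender and notification body of each foreground message
            // so you can confirm which experiment variant reached the device.
            Log.d("ABTestValidation", "From: ${remoteMessage.from}")
            remoteMessage.notification?.let {
                Log.d("ABTestValidation", "Notification body: ${it.body}")
            }
        }
    }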

Manage your experiment

Whether you create an experiment with Remote Config, the Notifications composer, or Firebase In-App Messaging, you can then validate and start your experiment, monitor your experiment while it is running, and increase the number of users included in your running experiment.

When your experiment is done, you can take note of the settings used by the winning variant, and then roll out those settings to all users. Or, you can run another experiment.

Start an experiment

  1. In the Engage section of the Firebase console navigation menu, click A/B Testing.
  2. Click Draft, and then click the title of your experiment.
  3. To validate that your app has users who would be included in your experiment, expand the draft details and check for a number greater than 0% in the Targeting and distribution section (for example, 1% of users matching the criteria).
  4. To change your experiment, click Edit.
  5. To start your experiment, click Start Experiment. You can run up to 24 experiments per project at a time.

Monitor an experiment

Once an experiment has been running for a while, you can check in on its progress and see what your results look like for the users who have participated in your experiment so far.

  1. In the Engage section of the Firebase console navigation menu, click A/B Testing.
  2. Click Running, and then click on, or search for, the title of your experiment. On this page, you can view various observed and modeled statistics about your running experiment, including the following:

    • % difference from baseline: A measure of the improvement of a metric for a given variant as compared to the baseline. Calculated by comparing the value range for the variant to the value range for the baseline (see the simplified illustration following these steps).
    • Probability to beat baseline: The estimated probability that a given variant beats the baseline for the selected metric.
    • observed_metric per user: Based on experiment results, this is the predicted range that the metric value will fall into over time.
    • Total observed_metric: The observed cumulative value for the baseline or variant. The value is used to measure how well each experiment variant performs, and is used to calculate Improvement, Value range, Probability to beat baseline, and Probability to be the best variant. Depending on the metric being measured, this column may be labeled "Duration per user," "Revenue per user," "Retention rate," or "Conversion rate."
  3. After your experiment has run for a while (at least 7 days for FCM and In-App Messaging or 14 days for Remote Config), data on this page indicates which variant, if any, is the "leader." Some measurements are accompanied by a bar chart that presents the data in a visual format.
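
As a simplified illustration of how to read the first of these statistics (the console's actual computation compares value ranges rather than single point estimates):

    % difference from baseline = (variant metric − baseline metric) / baseline metric × 100

For example, if the baseline converts at 2.0% and a variant converts at 2.3%, the variant shows a +15% difference from baseline.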

Roll out an experiment to all users

After an experiment has run long enough that you have a "leader," or winning variant, for your goal metric, you can release the experiment to 100% of users. This lets you select a variant to publish to all users moving forward. Even if your experiment has not created a clear winner, you can still choose to release a variant to all of your users.

  1. In the Engage section of the Firebase console navigation menu, click A/B Testing.
  2. Click Completed or Running, click an experiment that you want to release to all users, click the context menu, and then click Roll out variant.
  3. Roll out your experiment to all users by doing one of the following:

    • For an experiment that uses the Notifications composer, use the Roll out message dialog to send the message to the remaining targeted users who were not part of the experiment.
    • For a Remote Config experiment, select a variant to determine which Remote Config parameter values to update. The targeting criteria defined when creating the experiment are added as a new condition in your template, to ensure that the rollout only affects users targeted by the experiment. After clicking Review in Remote Config to review the changes, click Publish changes to complete the rollout.
    • For an In-App Messaging experiment, use the dialog to determine which variant needs to be rolled out as a standalone In-App Messaging campaign. Once selected, you are redirected to the FIAM compose screen to make any changes (if required) before publishing.

Expand an experiment

If you find that an experiment isn't bringing in enough users for A/B Testing to declare a leader, you can increase distribution of your experiment to reach a larger percentage of the app's user base.

  1. In the Engage section of the Firebase console navigation menu, click A/B Testing.
  2. Select the running experiment that you want to edit.
  3. In the Experiment overview, click the context menu, and then click Edit running experiment.
  4. The Targeting dialog displays an option to increase the percentage of users who are in the running experiment. Select a number greater than the current percentage and click Publish. The experiment will be pushed out to the percentage of users you have specified.

Duplicate or stop an experiment

  1. In the Engage section of the Firebase console navigation menu, click A/B Testing.
  2. Click Completed or Running, hold the pointer over your experiment, click the context menu, and then click Duplicate experiment or Stop experiment.

User targeting

You can target the users to include in your experiment using the following user-targeting criteria.

Targeting criterion: Version
Operators: contains, does not contain, matches exactly, contains regex
Values: Enter a value for one or more app versions that you want to include in the experiment.
Notes: When using any of the contains, does not contain, or matches exactly operators, you can provide a comma-separated list of values. When using the contains regex operator, you can create regular expressions in RE2 format. Your regular expression can match all or part of the target version string. You can also use the ^ and $ anchors to match the beginning, end, or entirety of a target string.

Targeting criterion: User audience(s)
Operators: includes all of, includes at least one of, does not include all of, does not include at least one of
Values: Select one or more Analytics audiences to target users who might be included in your experiment.
Notes: Some experiments that target Google Analytics audiences may require a few days to accumulate data because they are subject to Analytics data processing latency. You are most likely to encounter this delay with new users, who are typically enrolled in qualifying audiences 24-48 hours after creation, or with recently created audiences.

Targeting criterion: User property
Operators: For text: contains, does not contain, exactly matches, contains regex. For numbers: <, ≤, =, ≥, >
Values: An Analytics user property used to select users who might be included in an experiment, with a range of options for selecting user property values.
Notes: On the client, you can set only string values for user properties. For conditions that use numeric operators, the Remote Config service converts the value of the corresponding user property into an integer or float. When using the contains regex operator, you can create regular expressions in RE2 format. Your regular expression can match all or part of the target string, and you can use the ^ and $ anchors to match the beginning, end, or entirety of it.

Targeting criterion: Country/Region
Operators: N/A
Values: One or more countries or regions used to select users who might be included in the experiment.

Targeting criterion: Languages
Operators: N/A
Values: One or more languages and locales used to select users who might be included in the experiment.

Targeting criterion: First open
Operators: More than, Less than, Between
Values: Target users based on the first time they ever opened your app, specified in days.

Targeting criterion: Last app engagement
Operators: More than, Less than, Between
Values: Target users based on the last time they engaged with your app, specified in days.
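
Because only string values can be set on the client, any user property you target, including one compared with numeric operators, is recorded as a string in your app code. A minimal Kotlin+KTX sketch of setting a user property follows; the property name plan_tier and its value are hypothetical, and the property must also be registered in Analytics before you can target it:

    Kotlin+KTX

    import com.google.firebase.analytics.ktx.analytics
    import com.google.firebase.ktx.Firebase

    // User properties are always set as strings on the client; the Remote
    // Config service converts them for numeric targeting conditions.
    Firebase.analytics.setUserProperty("plan_tier", "premium")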

A/B Testing metrics

When you create your experiment, you choose a primary metric, known as the goal metric, that is used to determine the winning variant. You should also track other metrics to help you better understand each experiment variant's performance and to track important trends that may differ for each variant, such as user retention, app stability, and in-app purchase revenue. You can track up to five non-goal metrics in your experiment.

For example, say you've added new in-app purchases to your app and want to compare the effectiveness of two different "nudge" messages. In this case, you might set Purchase revenue as your goal metric, because you want the winning variant to represent the notification that resulted in the highest in-app purchase revenue. And because you also want to track which variant resulted in more future conversions and retained users, you might add the following in Other metrics to track:

  • Estimated total revenue to see how your combined in-app purchase and ad revenue differs between the two variants
  • Retention (1 day), Retention (2-3 days), Retention (4-7 days) to track your daily/weekly user retention

The following tables provide details on how goal metrics and other metrics are calculated.

Goal metrics

Crash-free users: The percentage of users who have not encountered errors in your app that were detected by the Firebase Crashlytics SDK during the experiment.
Estimated ad revenue: Estimated ad earnings.
Estimated total revenue: Combined value for purchase and estimated ad revenues.
Purchase revenue: Combined value for all purchase and in_app_purchase events.
Retention (1 day): The number of users who return to your app on a daily basis.
Retention (2-3 days): The number of users who return to your app within 2-3 days.
Retention (4-7 days): The number of users who return to your app within 4-7 days.
Retention (8-14 days): The number of users who return to your app within 8-14 days.
Retention (15+ days): The number of users who return to your app 15 or more days after they last used it.
first_open: An Analytics event that triggers when a user first opens an app after installing or reinstalling it. Used as part of a conversion funnel.
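
The Purchase revenue metric aggregates purchase and in_app_purchase events recorded by Analytics for each variant. The in_app_purchase event is collected automatically on supported platforms; as a sketch, manually logging a purchase event with the Kotlin+KTX Analytics API looks like the following (the currency and amount are placeholders):

    Kotlin+KTX

    import com.google.firebase.analytics.FirebaseAnalytics
    import com.google.firebase.analytics.ktx.analytics
    import com.google.firebase.analytics.ktx.logEvent
    import com.google.firebase.ktx.Firebase

    // Logs a purchase event that rolls up into the Purchase revenue metric.
    Firebase.analytics.logEvent(FirebaseAnalytics.Event.PURCHASE) {
        param(FirebaseAnalytics.Param.CURRENCY, "USD") // placeholder currency
        param(FirebaseAnalytics.Param.VALUE, 9.99)     // placeholder amount
    }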

Other metrics

notification_dismiss: An Analytics event that triggers when a notification sent by the Notifications composer is dismissed (Android only).
notification_receive: An Analytics event that triggers when a notification sent by the Notifications composer is received while the app is in the background (Android only).
os_update: An Analytics event that tracks when the device operating system is updated to a new version. To learn more, see Automatically collected events.
screen_view: An Analytics event that tracks screens viewed within your app. To learn more, see Track Screenviews.
session_start: An Analytics event that counts user sessions in your app. To learn more, see Automatically collected events.
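
screen_view is collected automatically in most apps, but you can also log it manually when automatic collection misses a screen you care about, such as in a single-activity app. A minimal Kotlin+KTX sketch; the screen name and class shown are placeholders:

    Kotlin+KTX

    import com.google.firebase.analytics.FirebaseAnalytics
    import com.google.firebase.analytics.ktx.analytics
    import com.google.firebase.analytics.ktx.logEvent
    import com.google.firebase.ktx.Firebase

    // Manually logs a screen_view event for a screen that automatic
    // collection would otherwise miss.
    Firebase.analytics.logEvent(FirebaseAnalytics.Event.SCREEN_VIEW) {
        param(FirebaseAnalytics.Param.SCREEN_NAME, "offer_details")  // placeholder
        param(FirebaseAnalytics.Param.SCREEN_CLASS, "OfferFragment") // placeholder
    }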

BigQuery data export

You can access all analytics data related to your A/B tests in BigQuery. BigQuery lets you analyze the data using BigQuery SQL, export it to another cloud provider, or use the data for your custom ML models. See Link BigQuery to Firebase for more information.

To take full advantage of BigQuery data export, Firebase projects should adopt the "Blaze" pay-as-you-go pricing plan. BigQuery charges for storing data, streaming inserts, and querying data. Loading and exporting data are no-cost. See BigQuery Pricing, or the BigQuery sandbox for more information.

To get started, make sure that your Firebase project is linked to BigQuery. Select Settings > Project Settings from the left navigation menu, then select Integrations > BigQuery > Link. This page displays options to perform BigQuery analytics data export for all apps in the project.

To query analytics data for an experiment:

  1. From your active experiments list, select the experiment to open the experiment results page.
  2. From the context menu in the Experiment overview pane, select Query experiment data (this option is not available for projects on the no-cost tier).

    This opens the BigQuery console's query composer with an auto-generated example query of experiment data preloaded for your review. In this query, your experiment is encoded as a user property with the experiment name in the key and the experiment variant in the value. A programmatic version of such a query is sketched after these steps.

  3. In the query composer, select Run query. Results are displayed in the lower pane.
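
You can also run a similar query programmatically. The sketch below uses the google-cloud-bigquery Java client from Kotlin; the project ID (my-project), dataset name (analytics_123456789), and experiment user-property key (firebase_exp_12) are placeholders, so copy the real identifiers from the auto-generated query in the console:

    Kotlin

    import com.google.cloud.bigquery.BigQueryOptions
    import com.google.cloud.bigquery.QueryJobConfiguration

    fun main() {
        val bigquery = BigQueryOptions.getDefaultInstance().service

        // Count distinct users per experiment variant; the experiment is
        // encoded as a user property (experiment name in the key, variant
        // in the value).
        val query = """
            SELECT up.value.string_value AS variant,
                   COUNT(DISTINCT user_pseudo_id) AS users
            FROM `my-project.analytics_123456789.events_*`,
                 UNNEST(user_properties) AS up
            WHERE up.key = 'firebase_exp_12'
            GROUP BY variant
        """.trimIndent()

        val result = bigquery.query(QueryJobConfiguration.newBuilder(query).build())
        for (row in result.iterateAll()) {
            println("${row.get("variant").stringValue}: ${row.get("users").longValue} users")
        }
    }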

Note that, because Firebase data in BigQuery is updated only once daily, the data available in the experiment page may be more up to date than the data available in the BigQuery console.