You can export your prediction data into BigQuery for further analysis. BigQuery allows you to analyze the data using BigQuery SQL, export it to another cloud provider, or use the data for your custom ML models.
Enabling BigQuery Export
To get started, visit the Predictions panel of the Firebase console, and select "Link BigQuery" from the notification at the bottom of the page. You can also click Settings > Project Settings from the left navigation bar, then select Integrations > BigQuery > Link to get started.
What data is exported to BigQuery?
For each app in the project, a table is created that includes the prediction data for each user. If the prediction record was used as part of the evaluation set, the evaluation label and score are also exported. The evaluation label can be either 0 if the prediction was negative or 1 if the prediction was positive.
What is holdout/evaluation data?
Not all of the data is used directly for training. As is typical for supervised learning tasks, Predictions sets aside 20% of the data as holdout data and uses only the remaining 80% of the data to train the model. Then, to evaluate the model's performance, predictions are generated for the users in the holdout set, based on the data in the training window, and compared to the actual outcomes for each user, based on the labels generated from the label window.
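The 80/20 split described above can be illustrated with a short Python sketch. It does not show how Predictions partitions users internally. The function name `split_holdout`, the seed, and the user IDs are hypothetical and only demonstrate the idea of setting aside a holdout set before training.

```python
import random


def split_holdout(user_ids, holdout_fraction=0.2, seed=0):
    """Return (training, holdout) lists. Illustrative only, not the Predictions algorithm."""
    shuffled = sorted(user_ids)  # sort first so the result does not depend on input order
    random.Random(seed).shuffle(shuffled)  # seeded shuffle keeps the split reproducible
    holdout_size = round(len(shuffled) * holdout_fraction)
    return shuffled[holdout_size:], shuffled[:holdout_size]


# Hypothetical user IDs
users = [f"user-{i}" for i in range(100)]
training, holdout = split_holdout(users)
print(len(training), len(holdout))
```

The model would be trained on `training` only, and its predictions for the users in `holdout` would then be compared with their actual outcomes.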
What can you do with the exported data?
BigQuery export contains the raw prediction data at every risk profile along with the score and labeled holdout data.
Access raw Predictions data
In addition to the computed prediction result at every risk profile, you can also get the raw score for every user as well as the set of labeled holdout data. You can use this data to evaluate the performance of your predictions or to create user groups beyond the three risk tolerance profiles defined in the UI.
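As a rough sketch of both uses, the Python below computes precision and recall from labeled holdout rows and assigns users to custom score-based groups. It assumes that a higher score means a more likely positive outcome and that scores fall between 0 and 1; neither is confirmed here. The function names, cutoffs, and sample values are hypothetical.

```python
def precision_recall(rows, threshold):
    """rows: list of (score, label) pairs from the holdout set, with label 0 or 1."""
    tp = sum(1 for score, label in rows if score >= threshold and label == 1)
    fp = sum(1 for score, label in rows if score >= threshold and label == 0)
    fn = sum(1 for score, label in rows if score < threshold and label == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def score_bucket(score, cutoffs=(0.25, 0.5, 0.75)):
    """Return a custom group index from 0 to len(cutoffs), based on assumed cutoffs."""
    return sum(score >= cutoff for cutoff in cutoffs)


# Hypothetical holdout rows: (score, evaluation label)
holdout_rows = [(0.9, 1), (0.8, 1), (0.7, 0), (0.4, 1), (0.2, 0), (0.1, 0)]
precision, recall = precision_recall(holdout_rows, threshold=0.5)
print(precision, recall)
print([score_bucket(score) for score, _ in holdout_rows])
```

Trying several thresholds on the holdout rows shows how precision trades off against recall before you commit to a custom grouping.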
Transfer data into BigQuery
Using BigQuery Data Transfer services, you can automatically transfer data from Google Analytics for Firebase, Crashlytics, Google Marketing Platform, Google Ads, and YouTube into BigQuery on a scheduled and fully managed basis. You can then combine that data with the Predictions data for more sophisticated analysis, such as finding which acquisition channel brings in the largest number of users predicted to spend.
Predictions export contains the instance_id field that identifies a unique instance of the app install. You can use this value with other data
Take your Predictions everywhere
While Predictions integrates with Firebase Remote Config, Notifications, and A/B Testing, we understand you might want to access your Predictions results server-side or push them to a third-party solution. You can export your data by using the BigQuery web UI, by using the bq extract CLI command, or by submitting an extract job via the API or Client Libraries. There is currently no charge for exporting data.
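As a minimal sketch of the CLI route, the Python below assembles a `bq extract` command line without running it. The dataset, table, and bucket names are hypothetical placeholders. Newline-delimited JSON is chosen because the Predictions table holds nested, repeated fields, which BigQuery cannot export directly as CSV.

```python
import shlex


def build_bq_extract_command(dataset, table, gcs_uri):
    """Build (but do not run) a bq extract command as an argument list."""
    return [
        "bq",
        "extract",
        "--destination_format=NEWLINE_DELIMITED_JSON",
        f"{dataset}.{table}",
        gcs_uri,
    ]


# Hypothetical dataset, table, and Cloud Storage bucket
command = build_bq_extract_command(
    "my_dataset", "my_predictions_table", "gs://my-bucket/predictions-*.json"
)
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run` on a machine where the `bq` tool is installed and authenticated.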
Get all the prediction data for CSV export
The sample query below flattens the prediction data so that it can be exported as CSV or into Google Sheets.
SELECT
  FORMAT_TIMESTAMP("%Y%m%d", u.prediction_time) AS prediction_date,
  u.instance_id,
  u.app_id,
  u.project_id,
  u.prediction_time,
  p.prediction_name,
  p.score,
  p.id,
  p.evaluation_info.label,
  p.evaluation_info.score AS label_score,
  r.risk_profile_name,
  r.prediction_result
FROM
  Table_Name u,
  u.predictions p,
  p.risk_profiles r
ORDER BY u.prediction_time ASC
LIMIT 10
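If you export the nested rows as JSON instead, the same flattening can be sketched in Python. The field names are taken from the query above; the sample values, the risk profile names, and the `NEGATIVE` result value are assumptions made for illustration.

```python
def flatten_prediction_row(row):
    """Yield one flat dict per (prediction, risk profile) pair in a nested row."""
    for prediction in row.get("predictions", []):
        # evaluation_info is only present for records in the evaluation (holdout) set
        evaluation = prediction.get("evaluation_info") or {}
        for profile in prediction.get("risk_profiles", []):
            yield {
                "instance_id": row["instance_id"],
                "prediction_name": prediction["prediction_name"],
                "score": prediction["score"],
                "label": evaluation.get("label"),
                "label_score": evaluation.get("score"),
                "risk_profile_name": profile["risk_profile_name"],
                "prediction_result": profile["prediction_result"],
            }


# Hypothetical nested row shaped like the exported table
sample_row = {
    "instance_id": "example-instance",
    "predictions": [
        {
            "prediction_name": "churn",
            "score": 0.8,
            "evaluation_info": {"label": 1, "score": 0.8},
            "risk_profiles": [
                {"risk_profile_name": "profile_a", "prediction_result": "POSITIVE"},
                {"risk_profile_name": "profile_b", "prediction_result": "NEGATIVE"},
            ],
        }
    ],
}
flat_rows = list(flatten_prediction_row(sample_row))
print(len(flat_rows))
```

Each flat dict can then be written out with `csv.DictWriter`. Rows with a non-null `label` correspond to the holdout records.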
Get all the prediction details used for model evaluation (holdout)
SELECT
  FORMAT_TIMESTAMP("%Y%m%d", u.prediction_time) AS prediction_date,
  u.instance_id,
  u.app_id,
  u.project_id,
  u.prediction_time,
  p.prediction_name,
  p.score,
  p.id,
  p.evaluation_info.label,
  p.evaluation_info.score AS label_score,
  r.risk_profile_name,
  r.prediction_result
FROM
  Table_Name u,
  u.predictions p,
  p.risk_profiles r
WHERE p.evaluation_info.label IS NOT NULL
ORDER BY u.prediction_time ASC
LIMIT 10
Combine Predictions data with Analytics data for more powerful analytics
You can also enable BigQuery export for Analytics and join the Analytics event data with your Predictions data for even more powerful analysis.
Analyze which country has the highest predicted churn
SELECT
  predictions.prediction_name,
  risk_profiles.risk_profile_name,
  events.geo.country,
  COUNT(1) AS count
FROM
  Analytics_Table_Name events,
  Predictions_Table_Name predictions_data,
  predictions_data.predictions predictions,
  predictions.risk_profiles risk_profiles
WHERE predictions.prediction_name = 'churn'
  AND risk_profiles.prediction_result = 'POSITIVE'
  AND predictions_data.user_id = events.user_id
GROUP BY
  predictions.prediction_name,
  risk_profiles.risk_profile_name,
  events.geo.country
ORDER BY risk_profiles.risk_profile_name
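Because the query above joins every Analytics event row to the matching prediction, COUNT(1) counts joined rows, so a user with many events contributes many rows. If you would rather count each user once per country, the idea can be sketched in Python as follows. The function name, the flat `country` key, and the sample data are hypothetical.

```python
from collections import Counter


def positive_users_by_country(events, predicted_positive_user_ids):
    """Count distinct users per country among users with a positive prediction."""
    seen = set()
    counts = Counter()
    for event in events:
        key = (event["user_id"], event["country"])
        if event["user_id"] in predicted_positive_user_ids and key not in seen:
            seen.add(key)  # count each (user, country) pair only once
            counts[event["country"]] += 1
    return counts


# Hypothetical events and predicted-to-churn user IDs
events = [
    {"user_id": "u1", "country": "US"},
    {"user_id": "u1", "country": "US"},
    {"user_id": "u2", "country": "DE"},
    {"user_id": "u3", "country": "US"},
]
churn_counts = positive_users_by_country(events, {"u1", "u2"})
print(churn_counts)
```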
Use the acquisition channel data in Analytics to see which channels had the most users predicted to spend
SELECT
  predictions.prediction_name,
  risk_profiles.prediction_result,
  risk_profiles.risk_profile_name,
  events.traffic_source.source,
  COUNT(1) AS user_count
FROM
  Analytics_Table_Name events,
  Predictions_Table_Name predictions_data,
  predictions_data.predictions predictions,
  predictions.risk_profiles risk_profiles
WHERE predictions.prediction_name = 'spend'
  AND risk_profiles.prediction_result = 'POSITIVE'
  AND predictions_data.user_id = events.user_id
GROUP BY
  predictions.prediction_name,
  risk_profiles.prediction_result,
  risk_profiles.risk_profile_name,
  events.traffic_source.source
ORDER BY user_count DESC, risk_profiles.risk_profile_name