Schedule data exports

This page describes how to schedule exports of your Cloud Firestore data. To run exports on a schedule, we recommend deploying an App Engine service that calls the Cloud Firestore managed export feature. Once deployed, you can schedule calls to this service using the App Engine Cron Service.

Before you begin

Before you schedule data exports with the managed export feature, you must complete the following tasks:

  1. Enable billing for your Google Cloud Platform project. Only GCP projects with billing enabled can use the export and import feature.
  2. Create a Cloud Storage bucket for your project in a location near your Cloud Firestore database location. You cannot use a Requester Pays bucket for export and import operations.
  3. Install the Google Cloud SDK to grant access permissions and deploy the application.

Configure access permissions

The app uses the App Engine default service account to authenticate and authorize its export operations. When you create a project, the default service account is created for you with the following format:

YOUR_PROJECT_ID@appspot.gserviceaccount.com

The service account requires permission to start an export operation and to write to your Cloud Storage bucket. To grant these permissions, assign the following IAM roles to the default service account:

  • Cloud Datastore Import Export Admin
  • Owner or Storage Admin role on the bucket

You can use the gcloud and gsutil command-line tools from the Google Cloud SDK to assign these roles:

  1. Assign the Cloud Datastore Import Export Admin role:

    gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
        --member serviceAccount:YOUR_PROJECT_ID@appspot.gserviceaccount.com \
        --role roles/datastore.importExportAdmin
    
  2. Assign the Storage Admin role on your bucket:

    gsutil iam ch serviceAccount:YOUR_PROJECT_ID@appspot.gserviceaccount.com:storage.admin \
        gs://BUCKET_NAME
    

Application files

In a new folder, create the following application files using the code below:

  • app.yaml: Configures the App Engine runtime. The app uses the standard environment Node.js runtime.
  • app.js: The main app code. This app sets up a web service at https://YOUR_PROJECT_ID.appspot.com that starts export operations.
  • package.json: Includes information about the app and its dependencies.
  • cron.yaml: Configures a cron job that calls the web service.

app.yaml

runtime: nodejs8

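The configuration above deploys the app as the default service. In that case, remove the target line from cron.yaml. To deploy the app as a separate service instead, add the following line so that the service name matches the target in cron.yaml:

service: cloud-firestore-admin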

app.js

const axios = require('axios');
const dateformat = require('dateformat');
const express = require('express');
const { google } = require('googleapis');

const app = express();

// Trigger a backup
app.get('/cloud-firestore-export', async (req, res) => {
  const auth = await google.auth.getClient({
    scopes: ['https://www.googleapis.com/auth/datastore']
  });

  const accessTokenResponse = await auth.getAccessToken();
  const accessToken = accessTokenResponse.token;

  const headers = {
    'Content-Type': 'application/json',
    Authorization: 'Bearer ' + accessToken
  };

  // Reject the request early if the bucket prefix is missing or malformed
  const outputUriPrefix = req.query.outputUriPrefix;
  if (!(typeof outputUriPrefix === 'string' && outputUriPrefix.startsWith('gs://'))) {
    res.status(400).send(`Malformed outputUriPrefix: ${outputUriPrefix}`);
    return;
  }

  // Construct a backup path folder based on the timestamp
  const timestamp = dateformat(Date.now(), 'yyyy-mm-dd-HH-MM-ss');
  let path = outputUriPrefix;
  if (path.endsWith('/')) {
    path += timestamp;
  } else {
    path += '/' + timestamp;
  }

  const body = {
    outputUriPrefix: path
  };

  // If specified, mark specific collections for backup
  const collectionParam = req.query.collections;
  if (collectionParam) {
    body.collectionIds = collectionParam.split(',');
  }

  const projectId = process.env.GOOGLE_CLOUD_PROJECT;
  const url = `https://firestore.googleapis.com/v1beta1/projects/${projectId}/databases/(default):exportDocuments`;

  try {
    const response = await axios.post(url, body, { headers: headers });
    res
      .status(200)
      .send(response.data)
      .end();
  } catch (e) {
    if (e.response) {
      console.warn(e.response.data);
    }

    res
      .status(500)
      .send('Could not start backup: ' + e)
      .end();
  }
});

// Index page, just to make it easy to see if the app is working.
app.get('/', (req, res) => {
  res
    .status(200)
    .send('[scheduled-backups]: Hello, world!')
    .end();
});

// Start the server
const PORT = process.env.PORT || 6060;
app.listen(PORT, () => {
  console.log(`App listening on port ${PORT}`);
  console.log('Press Ctrl+C to quit.');
});
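
The validation and path-building steps in the handler can be factored into a pure function, which makes them easy to check locally without deploying. The following is a minimal sketch and not part of the sample app: the name buildExportPath is hypothetical, and it formats the timestamp in UTC with plain Date methods instead of the dateformat package.

```javascript
// Hypothetical helper (not part of the sample app): validates the bucket
// prefix and appends a timestamped folder, mirroring the logic in app.js.
function buildExportPath(outputUriPrefix, date = new Date()) {
  if (typeof outputUriPrefix !== 'string' || !outputUriPrefix.startsWith('gs://')) {
    throw new Error(`Malformed outputUriPrefix: ${outputUriPrefix}`);
  }

  // Zero-pad each component so folder names sort chronologically.
  const pad = n => String(n).padStart(2, '0');
  const timestamp = [
    date.getUTCFullYear(),
    pad(date.getUTCMonth() + 1),
    pad(date.getUTCDate()),
    pad(date.getUTCHours()),
    pad(date.getUTCMinutes()),
    pad(date.getUTCSeconds())
  ].join('-');

  // Avoid a double slash when the prefix already ends with one.
  const separator = outputUriPrefix.endsWith('/') ? '' : '/';
  return outputUriPrefix + separator + timestamp;
}

// Example: a fixed date makes the result reproducible.
console.log(
  buildExportPath('gs://BUCKET_NAME/backups', new Date(Date.UTC(2019, 0, 15, 8, 30, 0)))
);
```

Because each export gets its own timestamped folder, repeated runs of the cron job should not overwrite earlier exports under the same prefix.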

package.json

{
  "name": "solution-scheduled-backups",
  "version": "1.0.0",
  "description": "Scheduled Cloud Firestore backups via AppEngine cron",
  "main": "app.js",
  "engines": {
    "node": "8.x.x"
  },
  "scripts": {
    "deploy": "gcloud app deploy --quiet app.yaml cron.yaml",
    "start": "node app.js"
  },
  "author": "Google, Inc.",
  "license": "Apache-2.0",
  "dependencies": {
    "axios": "^0.18.0",
    "dateformat": "^3.0.3",
    "express": "^4.16.4",
    "googleapis": "^34.0.0"
  },
  "devDependencies": {
    "prettier": "^1.14.3"
  }
}

cron.yaml

cron:
- description: "Daily Cloud Firestore Export"
  url: /cloud-firestore-export?outputUriPrefix=gs://BUCKET_NAME[/PATH]&collections=test1,test2
  target: cloud-firestore-admin
  schedule: every 24 hours

Modify the url line to configure the export operation. The app sets up a service at https://YOUR_PROJECT_ID.appspot.com/cloud-firestore-export that accepts the following URL parameters:

  • outputUriPrefix: Required. The location of your Cloud Storage bucket, in the format gs://BUCKET_NAME[/PATH].
  • collections: A comma-separated list of collection IDs to export. If not specified, the operation exports all collections.

For example, to export only the collections with the collection ID Songs or Albums, use the following:

url: /cloud-firestore-export?outputUriPrefix=gs://BUCKET_NAME&collections=Songs,Albums
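
To see how the service interprets these parameters, the following sketch parses a cron url value and builds the request body that would be sent to the export endpoint, before the timestamp folder is appended. The helper name exportBodyFromCronUrl and the placeholder base host are hypothetical and used only for illustration.

```javascript
const { URL } = require('url');

// Hypothetical helper: turns a cron url value into the JSON body the service
// would send to the export endpoint (before the timestamp is added).
function exportBodyFromCronUrl(cronUrl) {
  // The base host is a placeholder; it is only needed to parse a relative URL.
  const parsed = new URL(cronUrl, 'https://example.invalid');
  const body = { outputUriPrefix: parsed.searchParams.get('outputUriPrefix') };

  // Only restrict the export when collection IDs were supplied.
  const collections = parsed.searchParams.get('collections');
  if (collections) {
    body.collectionIds = collections.split(',');
  }
  return body;
}

console.log(
  exportBodyFromCronUrl(
    '/cloud-firestore-export?outputUriPrefix=gs://BUCKET_NAME&collections=Songs,Albums'
  )
);
```

When the collections parameter is omitted, the body carries no collectionIds field, which corresponds to exporting all collections.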

The example cron.yaml runs an export every 24 hours. For different schedule options, see the schedule format.

Deploy the app and cron job

Using gcloud, deploy the app and the cron job:

gcloud app deploy app.yaml cron.yaml

Test your cron job

You can test your deployed cron job by starting it from the Cron Jobs page of the GCP Console.

  1. Open the Cron Jobs page in the GCP Console.

  2. For the cron job with a description of Daily Cloud Firestore Export, click Run now.

  3. After the job completes, see the status message under Status. Click View to see the job log. The status message and job log will provide information on job success or failure.

View your exports

After an export operation completes, you can view the exports in your Cloud Storage bucket:

Open the Cloud Storage browser in the GCP Console.
