Build leaderboards with Firestore

1. Introduction

Last Updated: 2023-01-27

What does it take to build a leaderboard?

At their core, leaderboards are just tables of scores with one complicating factor: reading a rank for any given score requires knowledge of all the other scores in some kind of order. Also, if your game takes off, your leaderboards will grow large and be read from and written to frequently. To build a successful leaderboard, it needs to be able to handle this ranking operation quickly.

What you'll build

In this codelab, you will implement various different leaderboards, each suitable for a different scenario.

What you'll learn

You'll learn how to implement four different leaderboards:

  • A naive implementation using simple record-counting to determine rank
  • A cheap, periodically-updating leaderboard
  • A real-time leaderboard with some tree nonsense
  • A stochastic (probabilistic) leaderboard for approximate ranking of very large player bases

What you'll need

  • A recent version of Chrome (107 or later)
  • Node.js 16 or higher (run nvm --version to see your version number if you're using nvm)
  • A paid Firebase Blaze plan (optional)
  • The Firebase CLI v11.16.0 or higher
    To install the CLI, you can run npm install -g firebase-tools or refer to the CLI documentation for more installation options.
  • Knowledge of JavaScript, Cloud Firestore, Cloud Functions, and Chrome DevTools

2. Getting set up

Get the code

We've put everything you need for this project into a Git repo. To get started, you'll need to grab the code and open it in your favorite dev environment. For this codelab, we used VS Code, but any text editor will do.

and unpack the downloaded zip file.

Or, clone into your directory of choice:

git clone https://github.com/FirebaseExtended/firestore-leaderboards-codelab.git

What's our starting point?

Our project is currently a blank slate with some empty functions:

  • index.html contains some glue scripts that let us invoke functions from the dev console and see their outputs. We'll use this to interface with our backend and see the results of our function invocations. In a real-world scenario, you'd make these backend calls from your game directly—we're not using a game in this codelab because it would take too long to play a game every time you want to add a score to the leaderboard.
  • functions/index.js contains all of our Cloud Functions. You'll see some utility functions, like addScores and deleteScores, as well as the functions we'll implement in this codelab, which call out to helper functions in another file.
  • functions/functions-helpers.js contains the empty functions we'll implement. For each leaderboard, we'll implement read, create, and update functions, and you'll see how our choice of implementation affects both the complexity of our implementation and its scaling performance.
  • functions/utils.js contains more utility functions. We won't touch this file in this codelab.

Create and configure a Firebase project

  1. In the Firebase console, click Add project.
  2. To create a new project, enter the desired project name.
    This will also set the project ID (displayed below the project name) to something based on the project name. You can optionally click the edit icon on the project ID to further customize it.
  3. If prompted, review and accept the Firebase terms.
  4. Click Continue.
  5. Select the Enable Google Analytics for this project option, and then click Continue.
  6. Select an existing Google Analytics account to use or select Create a new account to create a new account.
  7. Click Create project.
  8. When the project has been created, click Continue.
  9. From the Build menu, click Functions, and if prompted, upgrade your project to use the Blaze billing plan.
  10. From the Build menu, click Firestore database.
  11. In the Create database dialog that appears, select Start in test mode, then click Next.
  12. Choose a region from the Cloud Firestore location drop-down, then click Enable.

Configure and run your leaderboard

  1. In a terminal, navigate to the project root and run firebase use --add. Pick the Firebase project you just created.
  2. In the root of the project, run firebase emulators:start --only hosting.
  3. In your browser, navigate to localhost:5000.
  4. Open Chrome DevTools's JavaScript console and import leaderboard.js:
    const leaderboard = await import("http://localhost:5000/scripts/leaderboard.js");
    
  5. Run leaderboard.codelab(); in console. If you see a welcome message, you're all set! If not, shut down the emulator and re-run steps 2-4.

Let's jump into the first leaderboard implementation.

3. Implement a simple leaderboard

By the end of this section, we'll be able to add a score to the leaderboard and have it tell us our rank.

Before we jump in, let's explain how this leaderboard implementation works: All players are stored in a single collection, and fetching a player's rank is done by retrieving the collection and counting how many players are ahead of them. This makes inserting and updating a score easy. To insert a new score, we just append it to the collection, and to update it, we filter for our current user and then update the resulting document. Let's see what that looks like in code.

In functions/functions-helper.js, implement the createScore function, which is about as straightforward as it gets:

async function createScore(score, playerID, firestore) {
  return firestore.collection("scores").doc().create({
    user: playerID,
    score: score,
  });
}

For updating scores, we just need to add an error check to make sure the score being updated already exists:

async function updateScore(playerID, newScore, firestore) {
  const playerSnapshot = await firestore.collection("scores")
      .where("user", "==", playerID).get();
  if (playerSnapshot.size !== 1) {
    throw Error(`User not found in leaderboard: ${playerID}`);
  }
  const player = playerSnapshot.docs[0];
  const doc = firestore.doc(player.id);
  return doc.update({
    score: newScore,
  });
}

And finally, our simple but less-scalable rank function:

async function readRank(playerID, firestore) {
  const scores = await firestore.collection("scores")
      .orderBy("score", "desc").get();
  const player = `${playerID}`;
  let rank = 1;
  for (const doc of scores.docs) {
    const user = `${doc.get("user")}`;
    if (user === player) {
      return {
        user: player,
        rank: rank,
        score: doc.get("score"),
      };
    }
    rank++;
  }
  // No user found
  throw Error(`User not found in leaderboard: ${playerID}`);
}

Let's put it to the test! Deploy your functions by running the following in terminal:

firebase deploy --only functions

And then, in Chrome's JS console, add some other scores so we can see our ranking among other players.

leaderboard.addScores(); // Results may take some time to appear.

Now we can add our own score to the mix:

leaderboard.addScore(999, 11); // You can make up a score (second argument) here.

When the write completes, you should see a response in the console saying "Score created." Seeing an error instead? Open up the Functions logs via Firebase console to see what went wrong.

And, finally, we can fetch and update our score.

leaderboard.getRank(999);
leaderboard.updateScore(999, 0);
leaderboard.getRank(999); // we should be last place now (11)

However, this implementation gives us undesirable linear time and memory requirements for fetching a given score's rank. Since function execution time and memory are both limited, not only will this mean our fetches become increasingly slow, but after enough scores are added to the leaderboard, our functions will time out or crash before they can return a result. Clearly, we'll need something better if we're going to scale beyond a handful of players.

If you're a Firestore aficionado, you may be aware of COUNT aggregation queries, which would make this leaderboard much more performant. And you'd be right! With COUNT queries, this scales nicely below a million or so users, though its performance is still linear.

But wait, you may be thinking to yourself, if we're going to enumerate all of the documents in the collection anyway, we can assign every document a rank and then when we need to fetch it, our fetches will be O(1) time and memory! This leads us to our next approach, the periodically-updating leaderboard.

4. Implement a periodically-updating leaderboard

The key to this approach is to store the rank in the document itself, so fetching it gives us the rank with no added work. To achieve this, we'll need a new kind of function.

In index.js, add the following:

// Also add this to the top of your file
const admin = require("firebase-admin");

exports.scheduledFunctionCrontab = functions.pubsub.schedule("0 2 * * *")
    // Schedule this when most of your users are offline to avoid
    // database spikiness.
    .timeZone("America/Los_Angeles")
    .onRun((context) => {
      const scores = admin.firestore().collection("scores");
      scores.orderBy("score", "desc").get().then((snapshot) => {
        let rank = 1;
        const writes = [];
        for (const docSnapshot of snapshot.docs) {
          const docReference = scores.doc(docSnapshot.id);
          writes.push(docReference.set({rank: rank}, admin.firestore.SetOptions.merge()));
          rank++;
        }
        Promise.all(writes).then((result) => {
          console.log(`Writes completed with results: ${result}`);
        });
      });
      return null;
    });

Now our read, update, and write operations are all nice and simple. Write and update are both unchanged, but read becomes (in functions-helpers.js):

async function readRank(playerID, firestore) {
  const scores = firestore.collection("scores");
  const playerSnapshot = await scores
      .where("user", "==", playerID).get();
  if (playerSnapshot.size === 0) {
    throw Error(`User not found in leaderboard: ${playerID}`);
  }

  const player = playerSnapshot.docs[0];
  if (player.get("rank") === undefined) {
    // This score was added before our scheduled function could run,
    // but this shouldn't be treated as an error
    return {
    user: playerID,
    rank: null,
    score: player.get("score"),
  };
  }

  return {
    user: playerID,
    rank: player.get("rank"),
    score: player.get("score"),
  };
}

Unfortunately, you won't be able to deploy and test this without adding a billing account to your project. If you do have a billing account, shorten the interval on the scheduled function and watch your function magically assign ranks to your leaderboard scores.

If not, delete the scheduled function and skip ahead to the next implementation.

Go ahead and delete the scores in your Firestore database by clicking on the 3 dots next to the scores collection to prepare for the next section.

Firestore scores document page with\nDelete Collection activated

5. Implement a real-time tree leaderboard

This approach works by storing search data in the database collection itself. Instead of having a uniform collection, our goal is to store everything in a tree that we can traverse by moving through documents. This allows us to perform a binary (or n-ary) search for a given score's rank. What might that look like?

To start out, we'll want to be able to distribute our scores into roughly even buckets, which will require some knowledge of the values of the scores our users are logging; for example, if you're building a leaderboard for skill rating in a competitive game, your users' skill ratings will almost always end up normally distributed. Our random score generating function uses JavaScript's Math.random(), which results in an approximately even distribution, so we'll divide our buckets evenly.

In this example we'll use 3 buckets for simplicity, but you'll likely find that if you use this implementation in a real app more buckets will yield faster results–a shallower tree means on average fewer collection fetches and less lock contention.

The rank of a player is given by the sum of the number of players with higher scores, plus one for the player themself. Each collection under scores will store three documents, each with a range, the number of documents under each range, and then three corresponding subcollections. To read a rank we'll traverse this tree searching for a score and keeping track of the sum of the greater scores. When we find our score, we'll also have the correct sum.

Writing is significantly more complicated. First, we'll need to make all of our writes within a transaction to prevent data inconsistencies when multiple writes or reads occur at the same time. We'll also need to maintain all of the conditions we've described above as we traverse the tree to write our new documents. And, finally, since we have all the tree complexity of this new approach combined with the need to store all of our original documents, our storage cost will increase slightly (but it's still linear).

In functions-helpers.js:

async function createScore(playerID, score, firestore) {
  /**
   * This function assumes a minimum score of 0 and that value
   * is between min and max.
   * Returns the expected size of a bucket for a given score
   * so that bucket sizes stay constant, to avoid expensive
   * re-bucketing.
   * @param {number} value The new score.
   * @param {number} min The min of the previous range.
   * @param {number} max The max of the previous range. Must be greater than
   *     min.
   * @return {Object<string, number>} Returns an object containing the new min
   *     and max.
   */
  function bucket(value, min, max) {
    const bucketSize = (max - min) / 3;
    const bucketMin = Math.floor(value / bucketSize) * bucketSize;
    const bucketMax = bucketMin + bucketSize;
    return {min: bucketMin, max: bucketMax};
  }

  /**
   * A function used to store pending writes until all reads within a
   * transaction have completed.
   *
   * @callback PendingWrite
   * @param {admin.firestore.Transaction} transaction The transaction
   *     to be used for writes.
   * @return {void}
   */

  /**
   * Recursively searches for the node to write the score to,
   * then writes the score and updates any counters along the way.
   * @param {number} id The user associated with the score.
   * @param {number} value The new score.
   * @param {admin.firestore.CollectionReference} coll The collection this
   *     value should be written to.
   * @param {Object<string, number>} range An object with properties min and
   *     max defining the range this score should be in. Ranges cannot overlap
   *     without causing problems. Use the bucket function above to determine a
   *     root range from constant values to ensure consistency.
   * @param {admin.firestore.Transaction} transaction The transaction used to
   *     ensure consistency during tree updates.
   * @param {Array<PendingWrite>} pendingWrites A series of writes that should
   *     occur once all reads within a transaction have completed.
   * @return {void} Write error/success is handled via the transaction object.
   */
  async function writeScoreToCollection(
      id, value, coll, range, transaction, pendingWrites) {
    const snapshot = await transaction.get(coll);
    if (snapshot.empty) {
      // This is the first score to be inserted into this node.
      for (const write of pendingWrites) {
        write(transaction);
      }
      const docRef = coll.doc();
      transaction.create(docRef, {exact: {score: value, user: id}});
      return;
    }

    const min = range.min;
    const max = range.max;

    for (const node of snapshot.docs) {
      const data = node.data();
      if (data.exact !== undefined) {
        // This node held an exact score.
        const newRange = bucket(value, min, max);
        const tempRange = bucket(data.exact.score, min, max);

        if (newRange.min === tempRange.min &&
          newRange.max === tempRange.max) {
          // The scores belong in the same range, so we need to "demote" both
          // to a lower level of the tree and convert this node to a range.
          const rangeData = {
            range: newRange,
            count: 2,
          };
          for (const write of pendingWrites) {
            write(transaction);
          }
          const docReference = node.ref;
          transaction.set(docReference, rangeData);
          transaction.create(docReference.collection("scores").doc(), data);
          transaction.create(
              docReference.collection("scores").doc(),
              {exact: {score: value, user: id}},
          );
          return;
        } else {
          // The scores are in different ranges. Continue and try to find a
          // range that fits this score.
          continue;
        }
      }

      if (data.range.min <= value && data.range.max > value) {
        // The score belongs to this range that may have subvalues.
        // Increment the range's count in pendingWrites, since
        // subsequent recursion may incur more reads.
        const docReference = node.ref;
        const newCount = node.get("count") + 1;
        pendingWrites.push((t) => {
          t.update(docReference, {count: newCount});
        });
        const newRange = bucket(value, min, max);
        return writeScoreToCollection(
            id,
            value,
            docReference.collection("scores"),
            newRange,
            transaction,
            pendingWrites,
        );
      }
    }

    // No appropriate range was found, create an `exact` value.
    transaction.create(coll.doc(), {exact: {score: value, user: id}});
  }

  const scores = firestore.collection("scores");
  const players = firestore.collection("players");
  return firestore.runTransaction((transaction) => {
    return writeScoreToCollection(
        playerID, score, scores, {min: 0, max: 1000}, transaction, [],
    ).then(() => {
      transaction.create(players.doc(), {
        user: playerID,
        score: score,
      });
    });
  });
}

This is certainly more complicated than our last implementation, which was a single method call and just six lines of code. Once you've implemented this method, try adding a few scores to the database and observing the structure of the resulting tree. In your JS console:

leaderboard.addScores();

The resulting database structure should look something like this, with the tree structure clearly visible and the leaves of the tree representing individual scores.

scores
  - document
    range: 0-333.33
    count: 2
    scores:
      - document
        exact:
          score: 18
          user: 1
      - document
        exact:
          score: 22
          user: 2

Now that we have the hard part out of the way, we can read scores by traversing the tree as described previously.

async function readRank(playerID, firestore) {
  const players = await firestore.collection("players")
      .where("user", "==", playerID).get();
  if (players.empty) {
    throw Error(`Player not found in leaderboard: ${playerID}`);
  }
  if (players.size > 1) {
    console.info(`Multiple scores with player ${playerID}, fetching first`);
  }
  const player = players.docs[0].data();
  const score = player.score;

  const scores = firestore.collection("scores");

  /**
   * Recursively finds a player score in a collection.
   * @param {string} id The player's ID, since some players may be tied.
   * @param {number} value The player's score.
   * @param {admin.firestore.CollectionReference} coll The collection to
   *     search.
   * @param {number} currentCount The current count of players ahead of the
   *     player.
   * @return {Promise<number>} The rank of the player (the number of players
   *     ahead of them plus one).
   */
  async function findPlayerScoreInCollection(id, value, coll, currentCount) {
    const snapshot = await coll.get();
    for (const doc of snapshot.docs) {
      if (doc.get("exact") !== undefined) {
        // This is an exact score. If it matches the score we're looking
        // for, return. Otherwise, check if it should be counted.
        const exact = doc.data().exact;
        if (exact.score === value) {
          if (exact.user === id) {
            // Score found.
            return currentCount + 1;
          } else {
            // The player is tied with another. In this case, don't increment
            // the count.
            continue;
          }
        } else if (exact.score > value) {
          // Increment count
          currentCount++;
          continue;
        } else {
          // Do nothing
          continue;
        }
      } else {
        // This is a range. If it matches the score we're looking for,
        // search the range recursively, otherwise, check if it should be
        // counted.
        const range = doc.data().range;
        const count = doc.get("count");
        if (range.min > value) {
          // The range is greater than the score, so add it to the rank
          // count.
          currentCount += count;
          continue;
        } else if (range.max <= value) {
          // do nothing
          continue;
        } else {
          const subcollection = doc.ref.collection("scores");
          return findPlayerScoreInCollection(
              id,
              value,
              subcollection,
              currentCount,
          );
        }
      }
    }
    // There was no range containing the score.
    throw Error(`Range not found for score: ${value}`);
  }

  const rank = await findPlayerScoreInCollection(playerID, score, scores, 0);
  return {
    user: playerID,
    rank: rank,
    score: score,
  };
}

Updates are left as an extra exercise. Try adding and fetching scores in your JS console with the leaderboard.addScore(id, score) and leaderboard.getRank(id) methods and see how your leaderboard changes in the Firebase console.

With this implementation, however, the complexity we've added to achieve logarithmic performance comes at a cost.

  • First, this leaderboard implementation can run into lock contention issues, since transactions require locking reads and writes to documents to make sure they stay consistent.
  • Second, Firestore imposes a subcollection depth limit of 100, meaning you'll need to avoid creating subtrees after 100 tied scores, which this implementation does not.
  • And finally, this leaderboard scales logarithmically only in the ideal case where the tree is balanced–if it's unbalanced, the worst case performance of this leaderboard is once again linear.

Once you're done, delete the scores and players collections via Firebase console and we'll move on to our last leaderboard implementation.

6. Implement a stochastic (probabilistic) leaderboard

When running the insertion code, you may notice that if you run it too many times in parallel your functions will start failing with an error message related to transaction lock contention. There are ways around this that we won't explore in this codelab, but if you don't need exact ranking, you can drop all of the complexity of the previous approach for something both simpler and faster. Let's take a look at how we might return an estimated rank for our players' scores instead of an exact ranking, and how that changes our database logic.

For this approach, we'll divide our leaderboard into 100 buckets, each representing approximately one percent of the scores we expect to be receiving. This approach works even without knowledge of our score distribution, in which case we have no way of guaranteeing a roughly even distribution of scores throughout the bucket, but we'll achieve greater precision in our approximations if we do know how our scores will be distributed.

Our approach is as follows: like before, each bucket stores the count of the number of scores within and the range of the scores. When inserting a new score, we'll find the bucket for the score and increment its count. When fetching a rank, we'll just sum the buckets ahead of it and then approximate within our bucket instead of searching further. This gives us very nice constant time lookups and insertions, and requires much less code.

First, insertion:

// Add this line to the top of your file.
const admin = require("firebase-admin");

// Implement this method (again).
async function createScore(playerID, score, firestore) {
  const scores = await firestore.collection("scores").get();
  if (scores.empty) {
    // Create the buckets since they don't exist yet.
    // In a real app, don't do this in your write function. Do it once
    // manually and then keep the buckets in your database forever.
    for (let i = 0; i < 10; i++) {
      const min = i * 100;
      const max = (i + 1) * 100;
      const data = {
        range: {
          min: min,
          max: max,
        },
        count: 0,
      };
      await firestore.collection("scores").doc().create(data);
    }
    throw Error("Database not initialized");
  }

  const buckets = await firestore.collection("scores")
      .where("range.min", "<=", score).get();
  for (const bucket of buckets.docs) {
    const range = bucket.get("range");
    if (score < range.max) {
      const writeBatch = firestore.batch();
      const playerDoc = firestore.collection("players").doc();
      writeBatch.create(playerDoc, {
        user: playerID,
        score: score,
      });
      writeBatch.update(
          bucket.ref,
          {count: admin.firestore.FieldValue.increment(1)},
      );
      const scoreDoc = bucket.ref.collection("scores").doc();
      writeBatch.create(scoreDoc, {
        user: playerID,
        score: score,
      });
      return writeBatch.commit();
    }
  }
}

You'll notice this insertion code has some logic for initializing your database state at the top with a warning to not do something like this in production. The code for initialization isn't protected at all against race conditions, so if you were to do this, multiple concurrent writes would corrupt your database by giving you a bunch of duplicate buckets.

Go ahead and deploy your functions and then run an insertion to initialize all of the buckets with a count of zero. It'll return an error, which you can safely ignore.

leaderboard.addScore(999, 0); // The params aren't important here.

Now that the database is correctly initialized, we can run addScores and see the structure of our data in Firebase console. The resulting structure is much flatter than our last implementation, though they're superficially similar.

leaderboard.addScores();

And, now, to read scores:

async function readRank(playerID, firestore) {
  const players = await firestore.collection("players")
      .where("user", "==", playerID).get();
  if (players.empty) {
    throw Error(`Player not found in leaderboard: ${playerID}`);
  }
  if (players.size > 1) {
    console.info(`Multiple scores with player ${playerID}, fetching first`);
  }
  const player = players.docs[0].data();
  const score = player.score;

  const scores = await firestore.collection("scores").get();
  let currentCount = 1; // Player is rank 1 if there's 0 better players.
  let interp = -1;
  for (const bucket of scores.docs) {
    const range = bucket.get("range");
    const count = bucket.get("count");
    if (score < range.min) {
      currentCount += count;
    } else if (score >= range.max) {
      // do nothing
    } else {
      // interpolate where the user is in this bucket based on their score.
      const relativePosition = (score - range.min) / (range.max - range.min);
      interp = Math.round(count - (count * relativePosition));
    }
  }

  if (interp === -1) {
    // Didn't find a correct bucket
    throw Error(`Score out of bounds: ${score}`);
  }

  return {
    user: playerID,
    rank: currentCount + interp,
    score: score,
  };
}

Since we've made the addScores function generate a uniform distribution of scores and we're using linear interpolation within the buckets, we'll get very accurate results, the performance of our leaderboard won't degrade as we increase the number of users, and we don't have to worry about lock contention (as much) when updating counts.

7. Addendum: Cheating

Hang on, you might be thinking, if I'm writing values to my codelab via the JS console of a browser tab, can't any of my players just lie to the leaderboard and say they got a high score that they didn't achieve fairly?

Yes, they can. If you want to prevent cheating, the most robust way to do so is to disable client writes to your database via security rules, secure access to your Cloud Functions so clients cannot call them directly, and then validate in-game actions on your server before sending score updates to the leaderboard.

It's important to note this strategy is not a panacea against cheating–with a large enough incentive, cheaters can find ways to circumvent server-side validations, and many large, successful video games are constantly playing cat-and-mouse with their cheaters to identify new cheats and stop them from proliferating. A difficult consequence of this phenomenon is that server-side validation for every game is inherently bespoke; though Firebase provides anti-abuse tools like App Check that will prevent a user from copying your game via a simple scripted client, Firebase does not provide any service that amounts to a holistic anti-cheat.

Anything short of server-side validation will, for a popular enough game or a low enough barrier to cheating, result in a leaderboard where the top values are all cheaters.

8. Congratulations

Congratulations, you've successfully built four different leaderboards on Firebase! Depending on your game's needs for exactness and speed, you'll be able to pick one that works for you at a reasonable cost.

Next up, check out the learning pathways for games.