Sample

Description

Returns a non-deterministic sample from the results of the previous stage.

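Two modes are supported:

  • DOCUMENTS mode samples a fixed number of documents.
    • This mode is similar to GoogleSQL.RESERVOIR: it outputs a sample of size n, and every possible sample of size n is equally likely.
  • PERCENT mode samples a given proportion of the documents.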
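    • This mode is similar to GoogleSQL.BERNOULLI: each document is selected independently with the same probability, percent. On average, the stage returns #documents * percent documents, where percent is a fraction between 0.0 and 1.0.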

Syntax

Node.js

  const sampled = await db.pipeline()
    .database()
    .sample(50)
    .execute();

  const sampled = await db.pipeline()
    .database()
    .sample({ percent: 0.5 })
    .execute();

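Behavior

Documents mode

Documents mode retrieves the specified number of documents in random order. The specified number must be a non-negative INT64 value.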
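The reservoir-style semantics described above can be modeled locally. The following is a minimal sketch in plain Python that runs on an in-memory list, not on Firestore; the function name reservoir_sample and the rng parameter are hypothetical, and nothing here is meant to describe how the service implements the stage.

```python
import random


def reservoir_sample(docs, n, rng=random):
    """Return up to n items drawn uniformly from an iterable, in random order."""
    if n < 0:
        raise ValueError("n must be non-negative")
    reservoir = []
    for index, doc in enumerate(docs):
        if index < n:
            # Fill the reservoir with the first n items.
            reservoir.append(doc)
        else:
            # Keep the new item with probability n / (index + 1),
            # overwriting a uniformly chosen slot.
            slot = rng.randint(0, index)
            if slot < n:
                reservoir[slot] = doc
    # Classic reservoir sampling does not randomize order, so shuffle here
    # to mirror the "random order" behavior described for this mode.
    rng.shuffle(reservoir)
    return reservoir


cities = ["SF", "NYC", "CHI"]
print(reservoir_sample(cities, 1))  # one city, chosen uniformly at random
```

Because the input is consumed in a single pass, the sketch never needs to know the total number of documents in advance.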

For example, for the following collection:

Node.js

await db.collection('cities').doc('SF').set({name: 'San Francisco', state: 'California'});
await db.collection('cities').doc('NYC').set({name: 'New York City', state: 'New York'});
await db.collection('cities').doc('CHI').set({name: 'Chicago', state: 'Illinois'});

In documents mode, the sample stage can be used to retrieve a non-deterministic subset of results from this collection.

Node.js

const sampled = await db.pipeline()
    .collection("/cities")
    .sample(1)
    .execute();

In this example, exactly 1 document is returned, chosen at random. One possible result:

  {name: 'New York City', state: 'New York'}

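If the provided number is greater than the total number of documents produced by the previous stage, all of those documents are returned in random order.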

Node.js

const sampled = await db.pipeline()
    .collection("/cities")
    .sample(5)
    .execute();

This produces the following documents, in some random order:

  {name: 'New York City', state: 'New York'}
  {name: 'Chicago', state: 'Illinois'}
  {name: 'San Francisco', state: 'California'}
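The "more requested than available" case can be checked with a small local model. This is only a sketch using Python's standard random.sample on an in-memory list; sample_documents is a hypothetical helper, not part of any Firestore client library.

```python
import random


def sample_documents(docs, n, rng=random):
    """Local model of documents mode: at most n items, in random order."""
    docs = list(docs)
    # Cap the sample size at the number of available documents.
    return rng.sample(docs, min(n, len(docs)))


cities = ["SF", "NYC", "CHI"]
result = sample_documents(cities, 5)
print(len(result))                       # 3
print(sorted(result) == sorted(cities))  # True
```

Asking for 5 documents from a 3-document input yields all 3, and only their order changes between runs.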

Client examples

Web

let results;

// Get a sample of 100 documents in a database
results = await execute(db.pipeline()
  .database()
  .sample(100)
);

// Randomly shuffle a list of 3 documents
results = await execute(db.pipeline()
  .documents([
    doc(db, "cities", "SF"),
    doc(db, "cities", "NY"),
    doc(db, "cities", "DC"),
  ])
  .sample(3)
);

Swift

var results: Pipeline.Snapshot

// Get a sample of 100 documents in a database
results = try await db.pipeline()
  .database()
  .sample(count: 100)
  .execute()

// Randomly shuffle a list of 3 documents
results = try await db.pipeline()
  .documents([
    db.collection("cities").document("SF"),
    db.collection("cities").document("NY"),
    db.collection("cities").document("DC"),
  ])
  .sample(count: 3)
  .execute()

Kotlin

var results: Task<Pipeline.Snapshot>

// Get a sample of 100 documents in a database
results = db.pipeline()
    .database()
    .sample(100)
    .execute()

// Randomly shuffle a list of 3 documents
results = db.pipeline()
    .documents(
        db.collection("cities").document("SF"),
        db.collection("cities").document("NY"),
        db.collection("cities").document("DC")
    )
    .sample(3)
    .execute()

Java

Task<Pipeline.Snapshot> results;

// Get a sample of 100 documents in a database
results = db.pipeline()
    .database()
    .sample(100)
    .execute();

// Randomly shuffle a list of 3 documents
results = db.pipeline()
    .documents(
        db.collection("cities").document("SF"),
        db.collection("cities").document("NY"),
        db.collection("cities").document("DC")
    )
    .sample(3)
    .execute();

Python

# Get a sample of 100 documents in a database
results = client.pipeline().database().sample(100).execute()

# Randomly shuffle a list of 3 documents
results = (
    client.pipeline()
    .documents(
        client.collection("cities").document("SF"),
        client.collection("cities").document("NY"),
        client.collection("cities").document("DC"),
    )
    .sample(3)
    .execute()
)

Java

// Get a sample of 100 documents in a database
Pipeline.Snapshot results1 = firestore.pipeline().database().sample(100).execute().get();

// Randomly shuffle a list of 3 documents
Pipeline.Snapshot results2 =
    firestore
        .pipeline()
        .documents(
            firestore.collection("cities").document("SF"),
            firestore.collection("cities").document("NY"),
            firestore.collection("cities").document("DC"))
        .sample(3)
        .execute()
        .get();

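Percent mode

In percent mode, each document is returned with the specified percent probability. Unlike documents mode, the output order is not random; the order of the incoming documents is preserved. The percent input must be a double value between 0.0 and 1.0.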

Because each document is selected independently, the output is non-deterministic; on average, #documents * percent documents are returned.
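Independent per-document selection can be modeled with one random draw per document. The sketch below is plain Python on an in-memory list, with percent given as a fraction between 0.0 and 1.0; bernoulli_sample is a hypothetical name, and the code illustrates the semantics only, not the service's implementation.

```python
import random


def bernoulli_sample(docs, percent, rng=random):
    """Keep each item independently with probability percent, preserving order."""
    if not 0.0 <= percent <= 1.0:
        raise ValueError("percent must be between 0.0 and 1.0")
    # rng.random() is uniform on [0.0, 1.0), so 1.0 keeps everything
    # and 0.0 keeps nothing.
    return [doc for doc in docs if rng.random() < percent]


docs = list(range(1000))
rng = random.Random(42)
sizes = [len(bernoulli_sample(docs, 0.5, rng)) for _ in range(200)]
# The sample size varies from run to run; its mean is near 1000 * 0.5.
print(sum(sizes) / len(sizes))
```

Individual sample sizes differ on every run, while the mean over many runs settles near the number of input documents times percent.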

For example, for the following collection:

Node.js

await db.collection('cities').doc('SF').set({name: 'San Francisco', state: 'California'});
await db.collection('cities').doc('NYC').set({name: 'New York City', state: 'New York'});
await db.collection('cities').doc('CHI').set({name: 'Chicago', state: 'Illinois'});
await db.collection('cities').doc('ATL').set({name: 'Atlanta', state: 'Georgia'});

The sample stage in percent mode can be used to retrieve (on average) 50% of the documents from the collection stage.

Node.js

  const sampled = await db.pipeline()
    .collection("/cities")
    .sample({ percent: 0.5 })
    .execute();

This randomly selects (on average) 50% of the documents from the cities collection. The following is one possible output.

  {name: 'New York City', state: 'New York'}
  {name: 'Chicago', state: 'Illinois'}

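In percent mode, because each document is selected independently with the same probability, the stage may return no documents at all, or it may return every document.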
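To give a sense of how likely those extremes are, the sample size under independent selection follows a binomial distribution. The sketch below computes it for the 4-document cities collection with percent set to 0.5; size_distribution is a hypothetical helper written for illustration only.

```python
from math import comb


def size_distribution(num_docs, percent):
    """Probability of each possible sample size under independent selection."""
    return [
        comb(num_docs, k) * percent**k * (1 - percent) ** (num_docs - k)
        for k in range(num_docs + 1)
    ]


probs = size_distribution(4, 0.5)
print(probs[0])  # 0.0625: no documents returned
print(probs[4])  # 0.0625: all four documents returned
print(probs[2])  # 0.375: exactly two documents returned
```

Under these assumptions, an empty result and a full result each occur about 6% of the time, so callers should not rely on receiving exactly half of the documents.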

Client examples

Web

// Get a sample of on average 50% of the documents in the database
const results = await execute(db.pipeline()
  .database()
  .sample({ percentage: 0.5 })
);

Swift

// Get a sample of on average 50% of the documents in the database
let results = try await db.pipeline()
  .database()
  .sample(percentage: 0.5)
  .execute()

Kotlin

// Get a sample of on average 50% of the documents in the database
val results = db.pipeline()
    .database()
    .sample(SampleStage.withPercentage(0.5))
    .execute()

Java

// Get a sample of on average 50% of the documents in the database
Task<Pipeline.Snapshot> results = db.pipeline()
    .database()
    .sample(SampleStage.withPercentage(0.5))
    .execute();

Python

from google.cloud.firestore_v1.pipeline_stages import SampleOptions

# Get a sample of on average 50% of the documents in the database
results = (
    client.pipeline().database().sample(SampleOptions.percentage(0.5)).execute()
)

Java

// Get a sample of on average 50% of the documents in the database
Pipeline.Snapshot results =
    firestore.pipeline().database().sample(Sample.withPercentage(0.5)).execute().get();