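Sample

Description

Returns a non-deterministic sample from the results of the previous stage.

Two modes are supported:

  • documents: randomly selects n documents.
  • percent: randomly selects n% of the documents.

Examples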

Web

let results;

// Get a sample of 100 documents in a database
results = await execute(db.pipeline()
  .database()
  .sample(100)
);

// Randomly shuffle a list of 3 documents
results = await execute(db.pipeline()
  .documents([
    doc(db, "cities", "SF"),
    doc(db, "cities", "NY"),
    doc(db, "cities", "DC"),
  ])
  .sample(3)
);

Swift

var results: Pipeline.Snapshot

// Get a sample of 100 documents in a database
results = try await db.pipeline()
  .database()
  .sample(count: 100)
  .execute()

// Randomly shuffle a list of 3 documents
results = try await db.pipeline()
  .documents([
    db.collection("cities").document("SF"),
    db.collection("cities").document("NY"),
    db.collection("cities").document("DC"),
  ])
  .sample(count: 3)
  .execute()

Kotlin

var results: Task<Pipeline.Snapshot>

// Get a sample of 100 documents in a database
results = db.pipeline()
    .database()
    .sample(100)
    .execute()

// Randomly shuffle a list of 3 documents
results = db.pipeline()
    .documents(
        db.collection("cities").document("SF"),
        db.collection("cities").document("NY"),
        db.collection("cities").document("DC")
    )
    .sample(3)
    .execute()

Java

Task<Pipeline.Snapshot> results;

// Get a sample of 100 documents in a database
results = db.pipeline()
    .database()
    .sample(100)
    .execute();

// Randomly shuffle a list of 3 documents
results = db.pipeline()
    .documents(
        db.collection("cities").document("SF"),
        db.collection("cities").document("NY"),
        db.collection("cities").document("DC")
    )
    .sample(3)
    .execute();

Python

# Get a sample of 100 documents in a database
results = client.pipeline().database().sample(100).execute()

# Randomly shuffle a list of 3 documents
results = (
    client.pipeline()
    .documents(
        client.collection("cities").document("SF"),
        client.collection("cities").document("NY"),
        client.collection("cities").document("DC"),
    )
    .sample(3)
    .execute()
)

Java

// Get a sample of 100 documents in a database
Pipeline.Snapshot results1 = firestore.pipeline().database().sample(100).execute().get();

// Randomly shuffle a list of 3 documents
Pipeline.Snapshot results2 =
    firestore
        .pipeline()
        .documents(
            firestore.collection("cities").document("SF"),
            firestore.collection("cities").document("NY"),
            firestore.collection("cities").document("DC"))
        .sample(3)
        .execute()
        .get();

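Modes

Documents mode

The documents mode randomly selects up to n documents from the input. Every document, and every ordering of the selected documents, is equally likely. To do this, Cloud Firestore still has to scan and process all of the input documents, so the operation can still be expensive.

For example, given the following collection: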

Node.js

await db.collection("cities").doc("SF").set({name: "San Francisco", state: "California"});
await db.collection("cities").doc("NYC").set({name: "New York City", state: "New York"});
await db.collection("cities").doc("CHI").set({name: "Chicago", state: "Illinois"});

The sample stage in documents mode can be used to retrieve a non-deterministic subset of results from this collection.

Node.js

const sampled = await db.pipeline()
    .collection("/cities")
    .sample(1)
    .execute();

In this example, only 1 document is returned, chosen at random.

  { name: "New York City", state: "New York" }

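If the number provided is greater than the total number of documents available from the previous stage, all of the documents are returned in a random order.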

Node.js

const sampled = await db.pipeline()
    .collection("/cities")
    .sample(5)
    .execute();

This produces the following documents:

  { name: "New York City", state: "New York" }
  { name: "Chicago", state: "Illinois" }
  { name: "San Francisco", state: "California" }
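
The documents-mode behavior described above can be mimicked locally to build intuition. The following is a minimal sketch in plain Python, not the Cloud Firestore implementation or API: `sample_documents` is a hypothetical helper, and the in-memory `cities` list stands in for the collection.

```python
import random


def sample_documents(docs, n, rng=random):
    """Local stand-in for documents mode: up to n docs, in random order."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # random.sample returns items in selection order, so the order of the
    # result is random too. min() caps n at the number of input documents.
    return rng.sample(docs, min(n, len(docs)))


# Hypothetical in-memory copy of the example collection.
cities = [
    {"name": "San Francisco", "state": "California"},
    {"name": "New York City", "state": "New York"},
    {"name": "Chicago", "state": "Illinois"},
]

one = sample_documents(cities, 1)        # a single city, chosen at random
all_three = sample_documents(cities, 5)  # n exceeds the input: all 3, shuffled
```

Note that the sketch, like the stage itself, needs the whole input before it can choose, which is why a small n does not make the operation cheap.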

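Percent mode

The percent mode attempts to select n% of the documents from the input, so the stage produces approximately (number of input documents) * n / 100 documents. As in documents mode, Cloud Firestore ensures that every document has the same probability of being returned. However, Cloud Firestore must scan and process all of the input documents, so the operation can be expensive even when the result set is small.

Unlike documents mode, the order of the results is not random; the order of the incoming documents is preserved. The percent input must be a double between 0.0 and 1.0, so 0.5 corresponds to 50%.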
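
As a rough local illustration of these semantics, the sketch below keeps each item with the given probability and preserves input order. It is plain Python, not the Cloud Firestore API. The helper `sample_percent`, the `doc_ids` list, and the choice to decide each document independently are all assumptions made for illustration.

```python
import random


def sample_percent(docs, percent, rng=random):
    """Local stand-in for percent mode: order-preserving random filter."""
    if not 0.0 <= percent <= 1.0:
        raise ValueError("percent must be between 0.0 and 1.0")
    # Assumption: each document is kept independently with probability
    # `percent`. Filtering in a single pass preserves the input order.
    return [doc for doc in docs if rng.random() < percent]


# Hypothetical document IDs, already in the order the previous stage emits.
doc_ids = ["ATL", "CHI", "NYC", "SF"]

picked = sample_percent(doc_ids, 0.5, random.Random(7))
expected_count = len(doc_ids) * 0.5  # 2.0 on average; a single run may differ
```

Whatever subset ends up in `picked`, its elements appear in the same relative order as in `doc_ids`.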

For example, given the following collection:

Node.js

await db.collection("cities").doc("SF").set({name: "San Francisco", state: "California"});
await db.collection("cities").doc("NYC").set({name: "New York City", state: "New York"});
await db.collection("cities").doc("CHI").set({name: "Chicago", state: "Illinois"});
await db.collection("cities").doc("ATL").set({name: "Atlanta", state: "Georgia"});

The sample stage in percent mode can be used to retrieve (on average) 50% of the documents from the collection stage.

Node.js

const sampled = await db.pipeline()
    .collection("/cities")
    .sample({ percent: 0.5 })
    .execute();

This randomly selects (on average) 50% of the documents from the cities collection. The following is one possible output.

  { name: "New York City", state: "New York" }
  { name: "Chicago", state: "Illinois" }

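In percent mode, because each document has the same probability of being selected, it is possible that no documents are returned, or that all of them are.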
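
To get a feel for how often these extremes occur, the sketch below computes the chance of each result size for the four-document example. It assumes each document is selected independently, which the text above does not state explicitly, and `prob_exactly` is a hypothetical helper, not part of any SDK.

```python
from math import comb


def prob_exactly(k, num_docs, percent):
    """Binomial probability that exactly k of num_docs are selected."""
    # Assumption: documents are selected independently of each other.
    return comb(num_docs, k) * percent**k * (1.0 - percent) ** (num_docs - k)


p_none = prob_exactly(0, 4, 0.5)  # 0.0625: an empty result
p_two = prob_exactly(2, 4, 0.5)   # 0.375: exactly the average of 2
p_all = prob_exactly(4, 4, 0.5)   # 0.0625: every document returned
```

Under that assumption, a four-document input sampled at 0.5 comes back empty about 1 time in 16, and the extremes become rarer as the input grows.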

Client examples

Web

// Get a sample of on average 50% of the documents in the database
const results = await execute(db.pipeline()
  .database()
  .sample({ percentage: 0.5 })
);

Swift

// Get a sample of on average 50% of the documents in the database
let results = try await db.pipeline()
  .database()
  .sample(percentage: 0.5)
  .execute()

Kotlin

// Get a sample of on average 50% of the documents in the database
val results = db.pipeline()
    .database()
    .sample(SampleStage.withPercentage(0.5))
    .execute()

Java

// Get a sample of on average 50% of the documents in the database
Task<Pipeline.Snapshot> results = db.pipeline()
    .database()
    .sample(SampleStage.withPercentage(0.5))
    .execute();

Python

from google.cloud.firestore_v1.pipeline_stages import SampleOptions

# Get a sample of on average 50% of the documents in the database
results = (
    client.pipeline().database().sample(SampleOptions.percentage(0.5)).execute()
)

Java

// Get a sample of on average 50% of the documents in the database
Pipeline.Snapshot results =
    firestore.pipeline().database().sample(Sample.withPercentage(0.5)).execute().get();