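Search with vector embeddings

This page shows how to use Cloud Firestore to perform K-nearest neighbor (KNN) vector searches using the following techniques:

Store vector values
Create and manage KNN vector indexes
Run a K-nearest neighbor (KNN) query using one of the supported vector distance measures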

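Before you begin

Before you store embeddings in Cloud Firestore, you must generate the vector embeddings. Cloud Firestore does not generate embeddings. You can use a service such as Vertex AI to create vector values, for example text embeddings, from your Cloud Firestore data. You can then store these embeddings in Cloud Firestore documents.

To learn more about embeddings, see What are embeddings?

To learn how to get text embeddings with Vertex AI, see Get text embeddings.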

Store vector embeddings

The following examples demonstrate how to store vector embeddings in Cloud Firestore.

Write operation with a vector embedding

The following example demonstrates how to store a vector embedding in a Cloud Firestore document:

Python

from google.cloud import firestore
from google.cloud.firestore_v1.vector import Vector

firestore_client = firestore.Client()
collection = firestore_client.collection("coffee-beans")
doc = {
    "name": "Kahawa coffee beans",
    "description": "Information about the Kahawa coffee beans.",
    "embedding_field": Vector([0.18332680, 0.24160706, 0.3416704]),
}

collection.add(doc)

Node.js

import {
  Firestore,
  FieldValue,
} from "@google-cloud/firestore";

const db = new Firestore();
const coll = db.collection('coffee-beans');
await coll.add({
  name: "Kahawa coffee beans",
  description: "Information about the Kahawa coffee beans.",
  embedding_field: FieldValue.vector([1.0 , 2.0, 3.0])
});

Go

import (
	"context"
	"fmt"
	"io"

	"cloud.google.com/go/firestore"
)

type CoffeeBean struct {
	Name           string             `firestore:"name,omitempty"`
	Description    string             `firestore:"description,omitempty"`
	EmbeddingField firestore.Vector32 `firestore:"embedding_field,omitempty"`
	Color          string             `firestore:"color,omitempty"`
}

func storeVectors(w io.Writer, projectID string) error {
	ctx := context.Background()

	// Create client
	client, err := firestore.NewClient(ctx, projectID)
	if err != nil {
		return fmt.Errorf("firestore.NewClient: %w", err)
	}
	defer client.Close()

	// Vector can be represented by Vector32 or Vector64
	doc := CoffeeBean{
		Name:           "Kahawa coffee beans",
		Description:    "Information about the Kahawa coffee beans.",
		EmbeddingField: []float32{1.0, 2.0, 3.0},
		Color:          "red",
	}
	ref := client.Collection("coffee-beans").NewDoc()
	if _, err = ref.Set(ctx, doc); err != nil {
		fmt.Fprintf(w, "failed to upsert: %v", err)
		return err
	}

	return nil
}

Java

import com.google.api.core.ApiFuture;
import com.google.cloud.firestore.CollectionReference;
import com.google.cloud.firestore.DocumentReference;
import com.google.cloud.firestore.FieldValue;
import java.util.HashMap;
import java.util.Map;

CollectionReference coll = firestore.collection("coffee-beans");

Map<String, Object> docData = new HashMap<>();
docData.put("name", "Kahawa coffee beans");
docData.put("description", "Information about the Kahawa coffee beans.");
docData.put("embedding_field", FieldValue.vector(new double[] {1.0, 2.0, 3.0}));

ApiFuture<DocumentReference> future = coll.add(docData);
DocumentReference documentReference = future.get();

Compute vector embeddings with a Cloud Function

To calculate and store vector embeddings whenever a document is updated or created, you can set up a Cloud Function:

Python

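@functions_framework.cloud_event
def store_embedding(cloud_event) -> None:
  """Triggered by a change to a Firestore document.
  """
  # Assumes `firestore` is the event-types module (google.events.cloud.firestore)
  # and that from_payload, calculate_embedding, and firestore_client are
  # defined elsewhere.
  firestore_payload = firestore.DocumentEventData()
  # ParseFromString() fills the message in place; its return value is not
  # the parsed payload, so keep using firestore_payload afterwards.
  firestore_payload._pb.ParseFromString(cloud_event.data)

  collection_id, doc_id = from_payload(firestore_payload)
  # Call a function to calculate the embedding
  embedding = calculate_embedding(firestore_payload)
  # Update the document
  doc = firestore_client.collection(collection_id).document(doc_id)
  doc.set({"embedding_field": embedding}, merge=True)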

Node.js

/**
 * A vector embedding will be computed from the
 * value of the `content` field. The vector value
 * will be stored in the `embedding` field. The
 * field names `content` and `embedding` are arbitrary
 * field names chosen for this example.
 */
async function storeEmbedding(event: FirestoreEvent<any>): Promise<void> {
  // Get the previous value of the document's `content` field.
  const previousDocumentSnapshot = event.data.before as QueryDocumentSnapshot;
  const previousContent = previousDocumentSnapshot.get("content");

  // Get the current value of the document's `content` field.
  const currentDocumentSnapshot = event.data.after as QueryDocumentSnapshot;
  const currentContent = currentDocumentSnapshot.get("content");

  // Don't update the embedding if the content field did not change
  if (previousContent === currentContent) {
    return;
  }

  // Call a function to calculate the embedding for the value
  // of the `content` field.
  const embeddingVector = calculateEmbedding(currentContent);

  // Update the `embedding` field on the document.
  await currentDocumentSnapshot.ref.update({
    embedding: embeddingVector,
  });
}

Go

  // Not yet supported in the Go client library

Java

  // Not yet supported in the Java client library

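Create and manage vector indexes

Before you can perform a nearest neighbor search with your vector embeddings, you must create a corresponding index. The following examples demonstrate how to create and manage vector indexes with the Google Cloud CLI or the Google Cloud console. You can also manage vector indexes with the Firebase CLI and Terraform.

Create a vector index

Google Cloud console

To manually create a new index from the Google Cloud console:

In the Google Cloud console, go to the Databases page.
Go to Databases
Select the required database from the list of databases.
In the navigation menu, click Indexes, and then click the Manual tab.
Click Create Index.
To index a vector field for vector search, select Create vector index.
Enter a collection ID. Enter the vector field path and the number of vector embedding dimensions. Add the names of any additional fields to index and an index mode for each field.

Click Save Index.

The new index appears in the list of manual indexes, and Cloud Firestore begins creating it. When the index is ready, a green check mark appears next to it.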

gcloud

Before you create a vector index, upgrade to the latest version of the Google Cloud CLI:

gcloud components update

To create a vector index, use gcloud firestore indexes composite create:

gcloud firestore indexes composite create \
--collection-group=collection-group \
--query-scope=COLLECTION \
--field-config field-path=vector-field,vector-config='vector-configuration' \
--database=database-id

where:

collection-group is the ID of the collection group.
vector-field is the name of the field that contains the vector embedding.
database-id is the ID of the database.
vector-configuration includes the vector dimension and the index type. The dimension is an integer up to 2048. The index type must be flat. Format the index configuration as follows: {"dimension":"DIMENSION", "flat": "{}"}.

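The following example creates a composite index that includes a vector index for the field vector-field and an ascending index for the field color. You can use this type of index to pre-filter data before a nearest neighbor search.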

gcloud firestore indexes composite create \
--collection-group=collection-group \
--query-scope=COLLECTION \
--field-config=order=ASCENDING,field-path="color" \
--field-config field-path=vector-field,vector-config='{"dimension":"1024", "flat": "{}"}' \
--database=database-id

List all vector indexes

Google Cloud console

In the Google Cloud console, go to the Databases page.
Go to Databases
Select the required database from the list of databases.
In the navigation menu, click Indexes, and then click the Manual tab.

The index table lists every index in the database. Vector indexes contain a vector field marked with an icon.

gcloud

To list all indexes and get their index IDs:

gcloud firestore indexes composite list --database=database-id

Replace database-id with the ID of the database.

You can use an index ID to view the details of that index:

gcloud firestore indexes composite describe index-id --database=database-id

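where:

index-id is the ID of the index to describe.
database-id is the ID of the database.

Delete a vector index

Google Cloud console

In the Google Cloud console, go to the Databases page.
Go to Databases
Select the required database from the list of databases.
In the navigation menu, click Indexes, and then click the Manual tab.
In the list of manual indexes, click the More button for the index that you want to delete, and then click Delete.
In the alert, click Delete Index to confirm the deletion.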

gcloud

gcloud firestore indexes composite delete index-id --database=database-id

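where:

index-id is the ID of the index to delete. Use indexes composite list to retrieve the index ID.
database-id is the ID of the database.

Make a nearest-neighbor query

You can perform a similarity search to find the nearest neighbors of a vector embedding. Similarity searches require vector indexes. If an index doesn't exist, Cloud Firestore suggests an index to create using the gcloud CLI.

The following examples find the nearest neighbors of the query vector. The limit argument caps the number of results; the samples below use 5 or 10.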

Python

from google.cloud.firestore_v1.base_vector_query import DistanceMeasure
from google.cloud.firestore_v1.vector import Vector

collection = db.collection("coffee-beans")

# Requires a single-field vector index
vector_query = collection.find_nearest(
    vector_field="embedding_field",
    query_vector=Vector([0.3416704, 0.18332680, 0.24160706]),
    distance_measure=DistanceMeasure.EUCLIDEAN,
    limit=5,
)

Node.js

import {
  Firestore,
  FieldValue,
  VectorQuery,
  VectorQuerySnapshot,
} from "@google-cloud/firestore";

// Requires a single-field vector index
const vectorQuery: VectorQuery = coll.findNearest({
  vectorField: 'embedding_field',
  queryVector: [3.0, 1.0, 2.0],
  limit: 10,
  distanceMeasure: 'EUCLIDEAN'
});

const vectorQuerySnapshot: VectorQuerySnapshot = await vectorQuery.get();

Go

import (
	"context"
	"fmt"
	"io"

	"cloud.google.com/go/firestore"
)

func vectorSearchBasic(w io.Writer, projectID string) error {
	ctx := context.Background()

	// Create client
	client, err := firestore.NewClient(ctx, projectID)
	if err != nil {
		return fmt.Errorf("firestore.NewClient: %w", err)
	}
	defer client.Close()

	collection := client.Collection("coffee-beans")

	// Requires a vector index
	// https://firebase.google.com/docs/firestore/vector-search#create_and_manage_vector_indexes
	vectorQuery := collection.FindNearest("embedding_field",
		[]float32{3.0, 1.0, 2.0},
		5,
		// More info: https://firebase.google.com/docs/firestore/vector-search#vector_distances
		firestore.DistanceMeasureEuclidean,
		nil)

	docs, err := vectorQuery.Documents(ctx).GetAll()
	if err != nil {
		fmt.Fprintf(w, "failed to get vector query results: %v", err)
		return err
	}

	for _, doc := range docs {
		fmt.Fprintln(w, doc.Data()["name"])
	}
	return nil
}

Java

import com.google.cloud.firestore.VectorQuery;
import com.google.cloud.firestore.VectorQuerySnapshot;

VectorQuery vectorQuery = coll.findNearest(
        "embedding_field",
        new double[] {3.0, 1.0, 2.0},
        /* limit */ 10,
        VectorQuery.DistanceMeasure.EUCLIDEAN);

ApiFuture<VectorQuerySnapshot> future = vectorQuery.get();
VectorQuerySnapshot vectorQuerySnapshot = future.get();

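Vector distances

Nearest-neighbor queries support the following options for vector distance:

EUCLIDEAN: Measures the Euclidean distance between the vectors. To learn more, see Euclidean.
COSINE: Compares vectors based on the angle between them, which lets you measure similarity that isn't based on the vectors' magnitude. We recommend using DOT_PRODUCT with unit-normalized vectors instead of COSINE distance; the two are mathematically equivalent, and DOT_PRODUCT performs better. To learn more, see Cosine similarity.
DOT_PRODUCT: Similar to COSINE, but affected by the magnitude of the vectors. To learn more, see Dot product.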
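
To make the three options concrete, the following Python sketch computes each measure for a pair of small vectors. It uses the textbook definitions and takes cosine distance to be 1 minus cosine similarity, which is an assumption consistent with the 0 to 2 range this page gives for COSINE; it is not Cloud Firestore's internal implementation.

```python
import math
from typing import Sequence


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of element-wise products; grows with both alignment and magnitude."""
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between the two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity (assumed convention), so 0 means same direction."""
    norms = math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    return 1.0 - dot_product(a, b) / norms


query = [3.0, 1.0, 2.0]
stored = [1.0, 2.0, 3.0]

print("EUCLIDEAN:", euclidean_distance(query, stored))
print("COSINE:", cosine_distance(query, stored))
print("DOT_PRODUCT:", dot_product(query, stored))
```

Scaling either vector changes the EUCLIDEAN and DOT_PRODUCT values but leaves the COSINE value unchanged, which is the magnitude independence described above.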

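Choose the distance measure

Which distance measure to use depends on whether all of your vector embeddings are normalized. A normalized vector embedding has a magnitude (length) of exactly 1.0.

In addition, if you know which distance measure your model was trained with, use that distance measure to compute the distance between your vector embeddings.

Normalized data

If you have a dataset where all vector embeddings are normalized, then all three distance measures provide the same semantic search results. In essence, although each distance measure returns a different value, those values sort the same way. When embeddings are normalized, DOT_PRODUCT is usually the most computationally efficient, but the difference is negligible in most cases. However, if your application is highly performance sensitive, DOT_PRODUCT might help with performance tuning.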
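
The claim that all three measures sort normalized data the same way can be checked with a short Python sketch. The document IDs and vectors below are made up for illustration, and the ranking is done in memory rather than by Cloud Firestore.

```python
import math


def normalize(vector):
    """Scale a vector to unit length."""
    length = math.sqrt(sum(x * x for x in vector))
    return [x / length for x in vector]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


query = normalize([3.0, 1.0, 2.0])
docs = {
    "a": normalize([1.0, 2.0, 3.0]),
    "b": normalize([3.0, 1.0, 2.5]),
    "c": normalize([-1.0, 0.5, 0.0]),
}

# Smaller is closer for Euclidean and cosine distance.
by_euclidean = sorted(docs, key=lambda k: math.dist(query, docs[k]))
# With unit vectors, cosine distance reduces to 1 - dot product.
by_cosine = sorted(docs, key=lambda k: 1.0 - dot(query, docs[k]))
# Larger is closer for dot product, so sort in descending order.
by_dot_product = sorted(docs, key=lambda k: dot(query, docs[k]), reverse=True)

print(by_euclidean, by_cosine, by_dot_product)  # each is ['b', 'a', 'c']
```

The orderings agree because, for unit vectors, the squared Euclidean distance equals 2 minus twice the dot product, so each measure is a monotonic function of the others.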

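Non-normalized data

If you have a dataset where vector embeddings aren't normalized, then it's not mathematically correct to use DOT_PRODUCT as a distance measure, because dot product doesn't measure distance. Depending on how the embeddings were generated and what type of search you prefer, either the COSINE or the EUCLIDEAN distance measure produces search results that are subjectively better than the other. You might need to experiment with both COSINE and EUCLIDEAN to determine which is best for your use case.

Unsure if data is normalized or non-normalized

If you're unsure whether your data is normalized and you want to use DOT_PRODUCT, we recommend that you use COSINE instead. COSINE is like DOT_PRODUCT with normalization built in. Distance measured using COSINE ranges from 0 to 2. A result that is close to 0 indicates that the vectors are very similar.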
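
When it's unclear whether a dataset is normalized, one option is to check a sample of embeddings before picking a measure. The Python sketch below is one way to do that; the function names and the tolerance value are assumptions made for illustration (floating-point lengths are rarely exactly 1.0), not part of any Cloud Firestore API.

```python
import math


def is_normalized(vector, tolerance: float = 1e-6) -> bool:
    """True when the vector's length is 1.0 within the assumed tolerance."""
    length = math.sqrt(sum(x * x for x in vector))
    return math.isclose(length, 1.0, abs_tol=tolerance)


def suggest_distance_measure(embeddings) -> str:
    """Suggest DOT_PRODUCT only when every sampled embedding is unit length."""
    embeddings = list(embeddings)
    if not embeddings:
        raise ValueError("need at least one embedding to inspect")
    if all(is_normalized(v) for v in embeddings):
        return "DOT_PRODUCT"
    # Falls back to COSINE, which has normalization built in.
    return "COSINE"


print(suggest_distance_measure([[1.0, 0.0, 0.0], [0.6, 0.8, 0.0]]))
print(suggest_distance_measure([[3.0, 1.0, 2.0]]))
```

A check like this only covers the vectors it inspects, so COSINE remains the safer choice when new, unchecked embeddings keep arriving.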

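Pre-filter documents

To pre-filter documents before finding the nearest neighbors, you can combine a similarity search with other query operators. The and and or composite filters are supported. For more information about supported field filters, see Query operators.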
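
As a conceptual illustration of what pre-filtering means, the following Python sketch applies a filter first and only then ranks the remaining documents by distance. It is a brute-force, in-memory stand-in with hypothetical names and sample data; it does not use an index and does not reflect how Cloud Firestore executes the query.

```python
import math


def find_nearest(documents, query_vector, limit, where=None):
    """Filter first, then return the `limit` closest documents by Euclidean distance."""
    candidates = [d for d in documents if where is None or where(d)]
    candidates.sort(key=lambda d: math.dist(d["embedding_field"], query_vector))
    return candidates[:limit]


# Hypothetical in-memory documents.
beans = [
    {"name": "Kahawa", "color": "red", "embedding_field": [1.0, 2.0, 3.0]},
    {"name": "Sumatra", "color": "green", "embedding_field": [3.0, 1.0, 2.0]},
    {"name": "Arabica", "color": "red", "embedding_field": [3.0, 1.5, 2.0]},
]

red_neighbors = find_nearest(
    beans,
    query_vector=[3.0, 1.0, 2.0],
    limit=2,
    where=lambda d: d["color"] == "red",
)
print([d["name"] for d in red_neighbors])  # ['Arabica', 'Kahawa']
```

Note that the green document is an exact match for the query vector, yet it never appears in the results because the filter removes it before any distance is compared.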

Python

from google.cloud.firestore_v1.base_vector_query import DistanceMeasure
from google.cloud.firestore_v1.vector import Vector

collection = db.collection("coffee-beans")

# Similarity search with pre-filter
# Requires a composite vector index
vector_query = collection.where("color", "==", "red").find_nearest(
    vector_field="embedding_field",
    query_vector=Vector([0.3416704, 0.18332680, 0.24160706]),
    distance_measure=DistanceMeasure.EUCLIDEAN,
    limit=5,
)

Node.js

// Similarity search with pre-filter
// Requires composite vector index
const preFilteredVectorQuery: VectorQuery = coll
    .where("color", "==", "red")
    .findNearest({
      vectorField: "embedding_field",
      queryVector: [3.0, 1.0, 2.0],
      limit: 5,
      distanceMeasure: "EUCLIDEAN",
    });

const vectorQueryResults = await preFilteredVectorQuery.get();

Go

import (
	"context"
	"fmt"
	"io"

	"cloud.google.com/go/firestore"
)

func vectorSearchPrefilter(w io.Writer, projectID string) error {
	ctx := context.Background()

	// Create client
	client, err := firestore.NewClient(ctx, projectID)
	if err != nil {
		return fmt.Errorf("firestore.NewClient: %w", err)
	}
	defer client.Close()

	collection := client.Collection("coffee-beans")

	// Similarity search with pre-filter
	// Requires a composite vector index
	vectorQuery := collection.Where("color", "==", "red").
		FindNearest("embedding_field",
			[]float32{3.0, 1.0, 2.0},
			5,
			// More info: https://firebase.google.com/docs/firestore/vector-search#vector_distances
			firestore.DistanceMeasureEuclidean,
			nil)

	docs, err := vectorQuery.Documents(ctx).GetAll()
	if err != nil {
		fmt.Fprintf(w, "failed to get vector query results: %v", err)
		return err
	}

	for _, doc := range docs {
		fmt.Fprintln(w, doc.Data()["name"])
	}
	return nil
}

Java

import com.google.cloud.firestore.VectorQuery;
import com.google.cloud.firestore.VectorQuerySnapshot;

VectorQuery preFilteredVectorQuery = coll
        .whereEqualTo("color", "red")
        .findNearest(
                "embedding_field",
                new double[] {3.0, 1.0, 2.0},
                /* limit */ 10,
                VectorQuery.DistanceMeasure.EUCLIDEAN);

ApiFuture<VectorQuerySnapshot> future = preFilteredVectorQuery.get();
VectorQuerySnapshot vectorQuerySnapshot = future.get();

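Retrieve the calculated vector distance

You can retrieve the calculated vector distance by assigning a distance_result_field output property name on the FindNearest query, as shown in the following example: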

Python

from google.cloud.firestore_v1.base_vector_query import DistanceMeasure
from google.cloud.firestore_v1.vector import Vector

collection = db.collection("coffee-beans")

vector_query = collection.find_nearest(
    vector_field="embedding_field",
    query_vector=Vector([0.3416704, 0.18332680, 0.24160706]),
    distance_measure=DistanceMeasure.EUCLIDEAN,
    limit=10,
    distance_result_field="vector_distance",
)

docs = vector_query.stream()

for doc in docs:
    print(f"{doc.id}, Distance: {doc.get('vector_distance')}")

Node.js

const vectorQuery: VectorQuery = coll.findNearest(
    {
      vectorField: 'embedding_field',
      queryVector: [3.0, 1.0, 2.0],
      limit: 10,
      distanceMeasure: 'EUCLIDEAN',
      distanceResultField: 'vector_distance'
    });

const snapshot: VectorQuerySnapshot = await vectorQuery.get();

snapshot.forEach((doc) => {
  console.log(doc.id, ' Distance: ', doc.get('vector_distance'));
});

Go

import (
	"context"
	"fmt"
	"io"

	"cloud.google.com/go/firestore"
)

func vectorSearchDistanceResultField(w io.Writer, projectID string) error {
	ctx := context.Background()

	client, err := firestore.NewClient(ctx, projectID)
	if err != nil {
		return fmt.Errorf("firestore.NewClient: %w", err)
	}
	defer client.Close()

	collection := client.Collection("coffee-beans")

	// Requires a vector index
	// https://firebase.google.com/docs/firestore/vector-search#create_and_manage_vector_indexes
	vectorQuery := collection.FindNearest("embedding_field",
		[]float32{3.0, 1.0, 2.0},
		10,
		firestore.DistanceMeasureEuclidean,
		&firestore.FindNearestOptions{
			DistanceResultField: "vector_distance",
		})

	docs, err := vectorQuery.Documents(ctx).GetAll()
	if err != nil {
		fmt.Fprintf(w, "failed to get vector query results: %v", err)
		return err
	}

	for _, doc := range docs {
		fmt.Fprintf(w, "%v, Distance: %v\n", doc.Data()["name"], doc.Data()["vector_distance"])
	}
	return nil
}

Java

import com.google.cloud.firestore.VectorQuery;
import com.google.cloud.firestore.VectorQueryOptions;
import com.google.cloud.firestore.VectorQuerySnapshot;

VectorQuery vectorQuery = coll.findNearest(
        "embedding_field",
        new double[] {3.0, 1.0, 2.0},
        /* limit */ 10,
        VectorQuery.DistanceMeasure.EUCLIDEAN,
        VectorQueryOptions.newBuilder().setDistanceResultField("vector_distance").build());

ApiFuture<VectorQuerySnapshot> future = vectorQuery.get();
VectorQuerySnapshot vectorQuerySnapshot = future.get();

for (DocumentSnapshot document : vectorQuerySnapshot.getDocuments()) {
    System.out.println(document.getId() + " Distance: " + document.get("vector_distance"));
}

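If you want to apply a field mask to return a subset of document fields along with a distanceResultField, then you must also include the value of distanceResultField in the field mask, as shown in the following example: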

Python

vector_query = collection.select(["color", "vector_distance"]).find_nearest(
    vector_field="embedding_field",
    query_vector=Vector([0.3416704, 0.18332680, 0.24160706]),
    distance_measure=DistanceMeasure.EUCLIDEAN,
    limit=10,
    distance_result_field="vector_distance",
)

Node.js

const vectorQuery: VectorQuery = coll
    .select('name', 'description', 'vector_distance')
    .findNearest({
      vectorField: 'embedding_field',
      queryVector: [3.0, 1.0, 2.0],
      limit: 10,
      distanceMeasure: 'EUCLIDEAN',
      distanceResultField: 'vector_distance'
    });

Go

import (
	"context"
	"fmt"
	"io"

	"cloud.google.com/go/firestore"
)

func vectorSearchDistanceResultFieldMasked(w io.Writer, projectID string) error {
	ctx := context.Background()

	client, err := firestore.NewClient(ctx, projectID)
	if err != nil {
		return fmt.Errorf("firestore.NewClient: %w", err)
	}
	defer client.Close()

	collection := client.Collection("coffee-beans")

	// Requires a vector index
	// https://firebase.google.com/docs/firestore/vector-search#create_and_manage_vector_indexes
	vectorQuery := collection.Select("color", "vector_distance").
		FindNearest("embedding_field",
			[]float32{3.0, 1.0, 2.0},
			10,
			firestore.DistanceMeasureEuclidean,
			&firestore.FindNearestOptions{
				DistanceResultField: "vector_distance",
			})

	docs, err := vectorQuery.Documents(ctx).GetAll()
	if err != nil {
		fmt.Fprintf(w, "failed to get vector query results: %v", err)
		return err
	}

	for _, doc := range docs {
		fmt.Fprintf(w, "%v, Distance: %v\n", doc.Data()["color"], doc.Data()["vector_distance"])
	}
	return nil
}

Java

import com.google.cloud.firestore.VectorQuery;
import com.google.cloud.firestore.VectorQueryOptions;
import com.google.cloud.firestore.VectorQuerySnapshot;

VectorQuery vectorQuery = coll
        .select("name", "description", "vector_distance")
        .findNearest(
          "embedding_field",
          new double[] {3.0, 1.0, 2.0},
          /* limit */ 10,
          VectorQuery.DistanceMeasure.EUCLIDEAN,
          VectorQueryOptions.newBuilder()
            .setDistanceResultField("vector_distance")
            .build());

ApiFuture<VectorQuerySnapshot> future = vectorQuery.get();
VectorQuerySnapshot vectorQuerySnapshot = future.get();

for (DocumentSnapshot document : vectorQuerySnapshot.getDocuments()) {
    System.out.println(document.getId() + " Distance: " + document.get("vector_distance"));
}

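Specify a distance threshold

You can specify a similarity threshold so that the query returns only documents within the threshold. The behavior of the threshold field depends on the distance measure you choose:

EUCLIDEAN and COSINE distances return only documents whose distance is less than or equal to the specified threshold. These distance measures decrease as the vectors become more similar.
DOT_PRODUCT distance returns only documents whose distance is greater than or equal to the specified threshold. Dot product distances increase as the vectors become more similar.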
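
The direction of the comparison is the part that is easy to mix up, so here is a small Python sketch of the rule described above. The function name is hypothetical and the check runs client-side purely for illustration; in a real query the threshold is applied by Cloud Firestore.

```python
def within_threshold(distance_measure: str, distance: float, threshold: float) -> bool:
    """Mirror the threshold rule: <= for EUCLIDEAN and COSINE, >= for DOT_PRODUCT."""
    if distance_measure in ("EUCLIDEAN", "COSINE"):
        # Smaller values mean more similar vectors.
        return distance <= threshold
    if distance_measure == "DOT_PRODUCT":
        # Larger values mean more similar vectors.
        return distance >= threshold
    raise ValueError(f"unknown distance measure: {distance_measure}")


print(within_threshold("EUCLIDEAN", 2.4, 4.5))    # True: 2.4 <= 4.5
print(within_threshold("DOT_PRODUCT", 2.4, 4.5))  # False: 2.4 < 4.5
```

The same numeric threshold therefore selects opposite sides of the result set depending on the measure, so revisit the threshold whenever you switch measures.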

The following example shows how to specify a distance threshold to return up to 10 of the nearest documents that are at most 4.5 units away, using the EUCLIDEAN distance measure:

Python

from google.cloud.firestore_v1.base_vector_query import DistanceMeasure
from google.cloud.firestore_v1.vector import Vector

collection = db.collection("coffee-beans")

vector_query = collection.find_nearest(
    vector_field="embedding_field",
    query_vector=Vector([0.3416704, 0.18332680, 0.24160706]),
    distance_measure=DistanceMeasure.EUCLIDEAN,
    limit=10,
    distance_threshold=4.5,
)

docs = vector_query.stream()

for doc in docs:
    print(f"{doc.id}")

Node.js

const vectorQuery: VectorQuery = coll.findNearest({
  vectorField: 'embedding_field',
  queryVector: [3.0, 1.0, 2.0],
  limit: 10,
  distanceMeasure: 'EUCLIDEAN',
  distanceThreshold: 4.5
});

const snapshot: VectorQuerySnapshot = await vectorQuery.get();

snapshot.forEach((doc) => {
  console.log(doc.id);
});

Go

import (
	"context"
	"fmt"
	"io"

	"cloud.google.com/go/firestore"
)

func vectorSearchDistanceThreshold(w io.Writer, projectID string) error {
	ctx := context.Background()

	client, err := firestore.NewClient(ctx, projectID)
	if err != nil {
		return fmt.Errorf("firestore.NewClient: %w", err)
	}
	defer client.Close()

	collection := client.Collection("coffee-beans")

	// Requires a vector index
	// https://firebase.google.com/docs/firestore/vector-search#create_and_manage_vector_indexes
	vectorQuery := collection.FindNearest("embedding_field",
		[]float32{3.0, 1.0, 2.0},
		10,
		firestore.DistanceMeasureEuclidean,
		&firestore.FindNearestOptions{
			DistanceThreshold: firestore.Ptr[float64](4.5),
		})

	docs, err := vectorQuery.Documents(ctx).GetAll()
	if err != nil {
		fmt.Fprintf(w, "failed to get vector query results: %v", err)
		return err
	}

	for _, doc := range docs {
		fmt.Fprintln(w, doc.Data()["name"])
	}
	return nil
}

Java

import com.google.cloud.firestore.VectorQuery;
import com.google.cloud.firestore.VectorQueryOptions;
import com.google.cloud.firestore.VectorQuerySnapshot;

VectorQuery vectorQuery = coll.findNearest(
        "embedding_field",
        new double[] {3.0, 1.0, 2.0},
        /* limit */ 10,
        VectorQuery.DistanceMeasure.EUCLIDEAN,
        VectorQueryOptions.newBuilder()
          .setDistanceThreshold(4.5)
          .build());

ApiFuture<VectorQuerySnapshot> future = vectorQuery.get();
VectorQuerySnapshot vectorQuerySnapshot = future.get();

for (DocumentSnapshot document : vectorQuerySnapshot.getDocuments()) {
    System.out.println(document.getId());
}

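Limitations

As you work with vector embeddings, note the following limitations:

The maximum supported embedding dimension is 2048. To store larger embeddings, use dimensionality reduction.
The maximum number of documents to return from a nearest-neighbor query is 1000.
Vector search does not support real-time snapshot listeners.
Only the Python, Node.js, Go, and Java client libraries support vector search.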
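
The two numeric limits above can be checked before a query is sent. The following Python sketch shows one way to do that; the function and constant names are hypothetical, and the check is a client-side convenience rather than a replacement for the service's own validation.

```python
MAX_EMBEDDING_DIMENSION = 2048  # maximum embedding dimension stated above
MAX_RESULTS = 1000              # maximum documents per nearest-neighbor query


def validate_find_nearest_args(query_vector, limit: int) -> None:
    """Raise ValueError when the arguments exceed the documented limits."""
    if not 1 <= len(query_vector) <= MAX_EMBEDDING_DIMENSION:
        raise ValueError(
            f"query vector has {len(query_vector)} dimensions; "
            f"expected 1 to {MAX_EMBEDDING_DIMENSION}"
        )
    if not 1 <= limit <= MAX_RESULTS:
        raise ValueError(f"limit is {limit}; expected 1 to {MAX_RESULTS}")


validate_find_nearest_args([3.0, 1.0, 2.0], limit=10)  # passes silently
```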

What's next

Read about best practices for Cloud Firestore.
Understand reads and writes at scale.
