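Detect and track objects with ML Kit on Android

You can use ML Kit to detect and track objects across frames of video.

When you pass images to ML Kit, it returns, for each image, a list of up to five detected objects and their positions in the image. When detecting objects in a video stream, every object has an ID that you can use to track the object across images. You can also optionally enable coarse object classification, which labels objects with broad category descriptions.

Before you begin

If you haven't already, add Firebase to your Android project.

Add the dependencies for the ML Kit Android libraries to your module (app-level) Gradle file (usually app/build.gradle):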

apply plugin: 'com.android.application'
apply plugin: 'com.google.gms.google-services'

dependencies {
  // ...

  implementation 'com.google.firebase:firebase-ml-vision:24.0.3'
  implementation 'com.google.firebase:firebase-ml-vision-object-detection-model:19.0.6'
}

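1. Configure the object detector

To start detecting and tracking objects, first create an instance of FirebaseVisionObjectDetector, optionally specifying any detector settings you want to change from their defaults.

Configure the object detector for your use case with a FirebaseVisionObjectDetectorOptions object. You can change the following settings: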

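Object detector settings
Detection mode	`STREAM_MODE` (default) | `SINGLE_IMAGE_MODE`. In `STREAM_MODE` (default), the object detector runs with low latency, but might produce incomplete results (such as unspecified bounding boxes or category labels) on the first few invocations of the detector. Also, in `STREAM_MODE`, the detector assigns tracking IDs to objects, which you can use to track objects across frames. Use this mode when you want to track objects, or when low latency is important, such as when processing video streams in real time. In `SINGLE_IMAGE_MODE`, the object detector waits until a detected object's bounding box and (if you enabled classification) category label are available before returning a result. As a consequence, detection latency is potentially higher. Also, in `SINGLE_IMAGE_MODE`, tracking IDs are not assigned. Use this mode if latency isn't critical and you don't want to deal with partial results.
Detect and track multiple objects	`false` (default) | `true`. Whether to detect and track up to five objects or only the most prominent object (default).
Classify objects	`false` (default) | `true`. Whether to classify detected objects into coarse categories. When enabled, the object detector classifies objects into the following categories: fashion goods, food, home goods, places, plants, and unknown.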


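The object detection and tracking API is optimized for these two core use cases:

Live detection and tracking of the most prominent object in the camera viewfinder
Detection of multiple objects from a static image

To configure the API for these use cases: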

Java

// Live detection and tracking
FirebaseVisionObjectDetectorOptions options =
        new FirebaseVisionObjectDetectorOptions.Builder()
                .setDetectorMode(FirebaseVisionObjectDetectorOptions.STREAM_MODE)
                .enableClassification()  // Optional
                .build();

// Multiple object detection in static images
FirebaseVisionObjectDetectorOptions options =
        new FirebaseVisionObjectDetectorOptions.Builder()
                .setDetectorMode(FirebaseVisionObjectDetectorOptions.SINGLE_IMAGE_MODE)
                .enableMultipleObjects()
                .enableClassification()  // Optional
                .build();

Kotlin

// Live detection and tracking
val options = FirebaseVisionObjectDetectorOptions.Builder()
        .setDetectorMode(FirebaseVisionObjectDetectorOptions.STREAM_MODE)
        .enableClassification()  // Optional
        .build()

// Multiple object detection in static images
val options = FirebaseVisionObjectDetectorOptions.Builder()
        .setDetectorMode(FirebaseVisionObjectDetectorOptions.SINGLE_IMAGE_MODE)
        .enableMultipleObjects()
        .enableClassification()  // Optional
        .build()

Get an instance of FirebaseVisionObjectDetector:

Java

FirebaseVisionObjectDetector objectDetector =
        FirebaseVision.getInstance().getOnDeviceObjectDetector();

// Or, to change the default settings:
FirebaseVisionObjectDetector objectDetector =
        FirebaseVision.getInstance().getOnDeviceObjectDetector(options);

Kotlin

val objectDetector = FirebaseVision.getInstance().getOnDeviceObjectDetector()

// Or, to change the default settings:
val objectDetector = FirebaseVision.getInstance().getOnDeviceObjectDetector(options)

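2. Run the object detector

To detect and track objects, pass images to the FirebaseVisionObjectDetector instance's processImage() method.

For each frame of video or image in a sequence, do the following:

Create a FirebaseVisionImage object from your image.

To create a FirebaseVisionImage object from a media.Image object, such as when capturing an image from a device's camera, pass the media.Image object and the image's rotation to FirebaseVisionImage.fromMediaImage().

If you use the CameraX library, the OnImageCapturedListener and ImageAnalysis.Analyzer classes calculate the rotation value for you, so you only need to convert the rotation to one of ML Kit's ROTATION_ constants before calling FirebaseVisionImage.fromMediaImage():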

Java

private class YourAnalyzer implements ImageAnalysis.Analyzer {

    private int degreesToFirebaseRotation(int degrees) {
        switch (degrees) {
            case 0:
                return FirebaseVisionImageMetadata.ROTATION_0;
            case 90:
                return FirebaseVisionImageMetadata.ROTATION_90;
            case 180:
                return FirebaseVisionImageMetadata.ROTATION_180;
            case 270:
                return FirebaseVisionImageMetadata.ROTATION_270;
            default:
                throw new IllegalArgumentException(
                        "Rotation must be 0, 90, 180, or 270.");
        }
    }

    @Override
    public void analyze(ImageProxy imageProxy, int degrees) {
        if (imageProxy == null || imageProxy.getImage() == null) {
            return;
        }
        Image mediaImage = imageProxy.getImage();
        int rotation = degreesToFirebaseRotation(degrees);
        FirebaseVisionImage image =
                FirebaseVisionImage.fromMediaImage(mediaImage, rotation);
        // Pass image to an ML Kit Vision API
        // ...
    }
}

Kotlin

private class YourImageAnalyzer : ImageAnalysis.Analyzer {
    private fun degreesToFirebaseRotation(degrees: Int): Int = when(degrees) {
        0 -> FirebaseVisionImageMetadata.ROTATION_0
        90 -> FirebaseVisionImageMetadata.ROTATION_90
        180 -> FirebaseVisionImageMetadata.ROTATION_180
        270 -> FirebaseVisionImageMetadata.ROTATION_270
        else -> throw Exception("Rotation must be 0, 90, 180, or 270.")
    }

    override fun analyze(imageProxy: ImageProxy?, degrees: Int) {
        val mediaImage = imageProxy?.image
        val imageRotation = degreesToFirebaseRotation(degrees)
        if (mediaImage != null) {
            val image = FirebaseVisionImage.fromMediaImage(mediaImage, imageRotation)
            // Pass image to an ML Kit Vision API
            // ...
        }
    }
}

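If your camera library doesn't give you the image's rotation, you can calculate it from the device's rotation and the orientation of the camera sensor in the device: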

Java

private static final SparseIntArray ORIENTATIONS = new SparseIntArray();
static {
    ORIENTATIONS.append(Surface.ROTATION_0, 90);
    ORIENTATIONS.append(Surface.ROTATION_90, 0);
    ORIENTATIONS.append(Surface.ROTATION_180, 270);
    ORIENTATIONS.append(Surface.ROTATION_270, 180);
}

/**
 * Get the angle by which an image must be rotated given the device's current
 * orientation.
 */
@RequiresApi(api = Build.VERSION_CODES.LOLLIPOP)
private int getRotationCompensation(String cameraId, Activity activity, Context context)
        throws CameraAccessException {
    // Get the device's current rotation relative to its "native" orientation.
    // Then, from the ORIENTATIONS table, look up the angle the image must be
    // rotated to compensate for the device's rotation.
    int deviceRotation = activity.getWindowManager().getDefaultDisplay().getRotation();
    int rotationCompensation = ORIENTATIONS.get(deviceRotation);

    // On most devices, the sensor orientation is 90 degrees, but for some
    // devices it is 270 degrees. For devices with a sensor orientation of
    // 270, rotate the image an additional 180 ((270 + 270) % 360) degrees.
    CameraManager cameraManager = (CameraManager) context.getSystemService(CAMERA_SERVICE);
    int sensorOrientation = cameraManager
            .getCameraCharacteristics(cameraId)
            .get(CameraCharacteristics.SENSOR_ORIENTATION);
    rotationCompensation = (rotationCompensation + sensorOrientation + 270) % 360;

    // Return the corresponding FirebaseVisionImageMetadata rotation value.
    int result;
    switch (rotationCompensation) {
        case 0:
            result = FirebaseVisionImageMetadata.ROTATION_0;
            break;
        case 90:
            result = FirebaseVisionImageMetadata.ROTATION_90;
            break;
        case 180:
            result = FirebaseVisionImageMetadata.ROTATION_180;
            break;
        case 270:
            result = FirebaseVisionImageMetadata.ROTATION_270;
            break;
        default:
            result = FirebaseVisionImageMetadata.ROTATION_0;
            Log.e(TAG, "Bad rotation value: " + rotationCompensation);
    }
    return result;
}

Kotlin

private val ORIENTATIONS = SparseIntArray()

init {
    ORIENTATIONS.append(Surface.ROTATION_0, 90)
    ORIENTATIONS.append(Surface.ROTATION_90, 0)
    ORIENTATIONS.append(Surface.ROTATION_180, 270)
    ORIENTATIONS.append(Surface.ROTATION_270, 180)
}
/**
 * Get the angle by which an image must be rotated given the device's current
 * orientation.
 */
@RequiresApi(api = Build.VERSION_CODES.LOLLIPOP)
@Throws(CameraAccessException::class)
private fun getRotationCompensation(cameraId: String, activity: Activity, context: Context): Int {
    // Get the device's current rotation relative to its "native" orientation.
    // Then, from the ORIENTATIONS table, look up the angle the image must be
    // rotated to compensate for the device's rotation.
    val deviceRotation = activity.windowManager.defaultDisplay.rotation
    var rotationCompensation = ORIENTATIONS.get(deviceRotation)

    // On most devices, the sensor orientation is 90 degrees, but for some
    // devices it is 270 degrees. For devices with a sensor orientation of
    // 270, rotate the image an additional 180 ((270 + 270) % 360) degrees.
    val cameraManager = context.getSystemService(CAMERA_SERVICE) as CameraManager
    val sensorOrientation = cameraManager
            .getCameraCharacteristics(cameraId)
            .get(CameraCharacteristics.SENSOR_ORIENTATION)!!
    rotationCompensation = (rotationCompensation + sensorOrientation + 270) % 360

    // Return the corresponding FirebaseVisionImageMetadata rotation value.
    val result: Int
    when (rotationCompensation) {
        0 -> result = FirebaseVisionImageMetadata.ROTATION_0
        90 -> result = FirebaseVisionImageMetadata.ROTATION_90
        180 -> result = FirebaseVisionImageMetadata.ROTATION_180
        270 -> result = FirebaseVisionImageMetadata.ROTATION_270
        else -> {
            result = FirebaseVisionImageMetadata.ROTATION_0
            Log.e(TAG, "Bad rotation value: $rotationCompensation")
        }
    }
    return result
}
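
To make the arithmetic in getRotationCompensation() concrete, the following is a minimal plain-Java sketch of the same formula with no Android dependencies. The RotationMath class and the integer display-rotation indices (0 to 3, mirroring Surface.ROTATION_0 through Surface.ROTATION_270) are assumptions made for illustration; they are not part of the ML Kit API.

```java
// Illustrative sketch only: RotationMath is a hypothetical helper, not an ML Kit class.
final class RotationMath {
    // Index = display rotation (0..3, mirroring Surface.ROTATION_0..ROTATION_270).
    // Value = base compensation angle, copied from the ORIENTATIONS table above.
    private static final int[] ORIENTATIONS = {90, 0, 270, 180};

    /** Returns the image rotation in degrees (0, 90, 180, or 270). */
    static int compensation(int displayRotation, int sensorOrientation) {
        if (displayRotation < 0 || displayRotation > 3) {
            throw new IllegalArgumentException("displayRotation must be 0..3");
        }
        return (ORIENTATIONS[displayRotation] + sensorOrientation + 270) % 360;
    }

    public static void main(String[] args) {
        // Display at ROTATION_0, 90-degree sensor: (90 + 90 + 270) % 360 = 90
        System.out.println(compensation(0, 90));
        // Display at ROTATION_0, 270-degree sensor: (90 + 270 + 270) % 360 = 270
        System.out.println(compensation(0, 270));
        // Display at ROTATION_90, 90-degree sensor: (0 + 90 + 270) % 360 = 0
        System.out.println(compensation(1, 90));
    }
}
```

The result is always a multiple of 90 as long as the sensor orientation is, which is why the switch statement above only needs four cases plus a logged fallback.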

Then, pass the media.Image object and the rotation value to FirebaseVisionImage.fromMediaImage():

Java

FirebaseVisionImage image = FirebaseVisionImage.fromMediaImage(mediaImage, rotation);

Kotlin

val image = FirebaseVisionImage.fromMediaImage(mediaImage, rotation)

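To create a FirebaseVisionImage object from a file URI, pass the app context and the file URI to FirebaseVisionImage.fromFilePath(). This is useful when you use an ACTION_GET_CONTENT intent to prompt the user to select an image from their gallery app.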

Java

FirebaseVisionImage image;
try {
    image = FirebaseVisionImage.fromFilePath(context, uri);
} catch (IOException e) {
    e.printStackTrace();
}

Kotlin

val image: FirebaseVisionImage
try {
    image = FirebaseVisionImage.fromFilePath(context, uri)
} catch (e: IOException) {
    e.printStackTrace()
}

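To create a FirebaseVisionImage object from a ByteBuffer or a byte array, first calculate the image rotation as described above for media.Image input.

Then, create a FirebaseVisionImageMetadata object that contains the image's height, width, color encoding format, and rotation: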

Java

FirebaseVisionImageMetadata metadata = new FirebaseVisionImageMetadata.Builder()
        .setWidth(480)   // 480x360 is typically sufficient for
        .setHeight(360)  // image recognition
        .setFormat(FirebaseVisionImageMetadata.IMAGE_FORMAT_NV21)
        .setRotation(rotation)
        .build();

Kotlin

val metadata = FirebaseVisionImageMetadata.Builder()
        .setWidth(480) // 480x360 is typically sufficient for
        .setHeight(360) // image recognition
        .setFormat(FirebaseVisionImageMetadata.IMAGE_FORMAT_NV21)
        .setRotation(rotation)
        .build()

Use the buffer or array, together with the metadata object, to create a FirebaseVisionImage object:

Java

FirebaseVisionImage image = FirebaseVisionImage.fromByteBuffer(buffer, metadata);
// Or: FirebaseVisionImage image = FirebaseVisionImage.fromByteArray(byteArray, metadata);

Kotlin

val image = FirebaseVisionImage.fromByteBuffer(buffer, metadata)
// Or: val image = FirebaseVisionImage.fromByteArray(byteArray, metadata)
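
Because the metadata has to describe the buffer accurately, it can help to sanity-check the buffer size before building the image. The sketch below assumes an NV21 frame with even dimensions, which stores a full-resolution luma plane followed by interleaved chroma samples at a quarter of that resolution (12 bits per pixel overall). Nv21Size is a hypothetical helper written for this illustration, not an ML Kit class.

```java
import java.nio.ByteBuffer;

// Illustrative sketch only: Nv21Size is a hypothetical helper, not an ML Kit class.
final class Nv21Size {
    /** Returns the number of bytes an NV21 frame of the given size occupies. */
    static int expectedBytes(int width, int height) {
        // Assumption: even dimensions, so the subsampled chroma plane divides cleanly.
        if (width <= 0 || height <= 0 || width % 2 != 0 || height % 2 != 0) {
            throw new IllegalArgumentException("width and height must be positive and even");
        }
        int luma = width * height;   // one Y byte per pixel
        int chroma = luma / 2;       // one V byte and one U byte per 2x2 pixel block
        return luma + chroma;
    }

    /** Returns true if the readable bytes in the buffer match the declared frame size. */
    static boolean matches(ByteBuffer buffer, int width, int height) {
        return buffer.remaining() == expectedBytes(width, height);
    }
}
```

For the 480x360 size used in the metadata above, expectedBytes(480, 360) returns 259200, so a buffer of any other length suggests the width, height, or format passed to the builder is wrong.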

To create a FirebaseVisionImage object from a Bitmap object:

Java

FirebaseVisionImage image = FirebaseVisionImage.fromBitmap(bitmap);

Kotlin

val image = FirebaseVisionImage.fromBitmap(bitmap)

The image represented by the Bitmap object must be upright, with no additional rotation required.

Pass the image to the processImage() method:

Java

objectDetector.processImage(image)
        .addOnSuccessListener(
                new OnSuccessListener<List<FirebaseVisionObject>>() {
                    @Override
                    public void onSuccess(List<FirebaseVisionObject> detectedObjects) {
                        // Task completed successfully
                        // ...
                    }
                })
        .addOnFailureListener(
                new OnFailureListener() {
                    @Override
                    public void onFailure(@NonNull Exception e) {
                        // Task failed with an exception
                        // ...
                    }
                });

Kotlin

objectDetector.processImage(image)
        .addOnSuccessListener { detectedObjects ->
            // Task completed successfully
            // ...
        }
        .addOnFailureListener { e ->
            // Task failed with an exception
            // ...
        }

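If the call to processImage() succeeds, a list of FirebaseVisionObject instances is passed to the success listener.

Each FirebaseVisionObject contains the following properties:

Bounding box	A `Rect` indicating the position of the object in the image.
Tracking ID	An integer that identifies the object across images. Null in SINGLE_IMAGE_MODE.
Category	The coarse category of the object. If the object detector doesn't have classification enabled, this is always `FirebaseVisionObject.CATEGORY_UNKNOWN`.
Confidence	The confidence value of the object classification. If the object detector doesn't have classification enabled, or the object is classified as unknown, this is `null`.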

Java

// The list of detected objects contains one item if multiple object detection wasn't enabled.
for (FirebaseVisionObject obj : detectedObjects) {
    Integer id = obj.getTrackingId();
    Rect bounds = obj.getBoundingBox();

    // If classification was enabled:
    int category = obj.getClassificationCategory();
    Float confidence = obj.getClassificationConfidence();
}

Kotlin

// The list of detected objects contains one item if multiple object detection wasn't enabled.
for (obj in detectedObjects) {
    val id = obj.trackingId       // A number that identifies the object across images
    val bounds = obj.boundingBox  // The object's position in the image

    // If classification was enabled:
    val category = obj.classificationCategory
    val confidence = obj.classificationConfidence
}
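
Since tracking IDs persist across frames in STREAM_MODE, one way to use them is to keep a small map keyed by ID. The following plain-Java sketch counts how many frames each object has appeared in and reports IDs seen for the first time. The Detection class is a hypothetical stand-in for the tracking ID read from FirebaseVisionObject, and TrackCounter is likewise an illustrative name, not part of ML Kit.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the tracking ID read from FirebaseVisionObject.
final class Detection {
    final Integer trackingId;  // null in SINGLE_IMAGE_MODE

    Detection(Integer trackingId) {
        this.trackingId = trackingId;
    }
}

// Illustrative sketch only: counts the frames in which each tracked object appears.
final class TrackCounter {
    private final Map<Integer, Integer> framesById = new HashMap<>();

    /** Records one frame's detections and returns the IDs first seen in this frame. */
    List<Integer> update(List<Detection> detections) {
        List<Integer> newIds = new ArrayList<>();
        for (Detection d : detections) {
            if (d.trackingId == null) {
                continue;  // no ID assigned, so the object cannot be tracked
            }
            int count = framesById.merge(d.trackingId, 1, Integer::sum);
            if (count == 1) {
                newIds.add(d.trackingId);
            }
        }
        return newIds;
    }

    /** Returns how many frames the given ID has appeared in, or 0 if never seen. */
    int frameCount(int trackingId) {
        return framesById.getOrDefault(trackingId, 0);
    }
}
```

In an app, update() would be called from the success listener with values taken from each detected object. Detections without an ID are skipped rather than treated as errors, because a null ID is the expected result in SINGLE_IMAGE_MODE.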

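Improving usability and performance

For the best user experience, follow these guidelines in your app:

Successful object detection depends on the object's visual complexity. Objects with few visual features might need to take up a larger part of the image to be detected. Give users guidance on capturing input that works well with the kinds of objects you want to detect.
When using classification, if you want to detect objects that don't fall cleanly into the supported categories, implement special handling for unknown objects.

Also, check out the ML Kit Material Design showcase app and the Material Design "Patterns for machine learning-powered features" collection.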
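
As one possible shape for the special handling of unknown objects mentioned above, the sketch below turns a category and a nullable confidence into a display string. Everything in it is an assumption made for illustration: CategoryLabeler is a hypothetical class, its CATEGORY_UNKNOWN constant is a placeholder for FirebaseVisionObject.CATEGORY_UNKNOWN, and the 0.5 threshold is arbitrary.

```java
// Illustrative sketch only: CategoryLabeler and its constants are hypothetical.
final class CategoryLabeler {
    // Placeholder value; in app code, compare against
    // FirebaseVisionObject.CATEGORY_UNKNOWN instead.
    static final int CATEGORY_UNKNOWN = 0;
    // Assumed threshold for illustration; tune it for your app.
    static final float MIN_CONFIDENCE = 0.5f;

    /** Returns the text to show for an object, given its category and confidence. */
    static String describe(int category, Float confidence, String categoryName) {
        // Confidence is null when classification is disabled or the object is unknown.
        if (category == CATEGORY_UNKNOWN || confidence == null) {
            return "Unrecognized object";
        }
        if (confidence < MIN_CONFIDENCE) {
            return categoryName + " (low confidence)";
        }
        return categoryName;
    }
}
```

Checking for null before comparing the confidence avoids a NullPointerException from unboxing, which is easy to overlook when the Java API returns a Float rather than a float.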

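When using streaming mode in a real-time application, follow these guidelines to achieve the best frame rates:

Don't use multiple object detection in streaming mode, as most devices can't produce adequate frame rates.
Disable classification if you don't need it.
Throttle calls to the detector. If a new video frame becomes available while the detector is running, drop the frame.
If you use the detector's output to overlay graphics on the input image, first get the result from ML Kit, then render the image and the overlay in a single step. That way, each input frame is rendered to the display surface only once.
If you use the Camera2 API, capture images in ImageFormat.YUV_420_888 format.
If you use the older Camera API, capture images in ImageFormat.NV21 format.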
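
The frame-dropping guideline above can be sketched with a single atomic flag. This is a minimal illustration rather than a prescribed implementation; FrameThrottler is a hypothetical class name, not part of ML Kit.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative sketch only: FrameThrottler is a hypothetical helper, not an ML Kit class.
final class FrameThrottler {
    private final AtomicBoolean busy = new AtomicBoolean(false);

    /** Returns true if the caller may submit this frame; false means drop it. */
    boolean tryAcquire() {
        // Atomically flips the flag from "idle" to "busy"; fails if already busy.
        return busy.compareAndSet(false, true);
    }

    /** Marks the detector as idle again once the current frame has finished. */
    void release() {
        busy.set(false);
    }
}
```

In an analyzer, you would return early when tryAcquire() is false, and call release() from both the success and the failure listener of processImage() so that a failed frame doesn't block all later ones.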
