ML Kit を使用して画像にラベルを付ける（Android）

ML Kit を使用すると、端末モデルまたはクラウドモデルを使って、画像内で認識されたオブジェクトにラベルを付けることができます。それぞれの手法の利点については、概要をご覧ください。

始める前に

まだ Firebase を Android プロジェクトに追加していない場合は追加します。

ML Kit Android ライブラリの依存関係をモジュール（アプリレベル）の Gradle ファイル（通常は app/build.gradle）に追加します。

apply plugin: 'com.android.application'
apply plugin: 'com.google.gms.google-services'

dependencies {
  // ...

  implementation 'com.google.firebase:firebase-ml-vision:24.0.3'
  implementation 'com.google.firebase:firebase-ml-vision-image-label-model:20.0.1'
}

省略可能、ただし推奨: デバイス用 API を使用する場合は、アプリが Play ストアからインストールされたら自動で ML モデルをデバイスにダウンロードするようにアプリを構成します。
この構成を行うには、アプリの AndroidManifest.xml ファイルに次の宣言を追加します。
```
<application ...>
  ...
  <meta-data
      android:name="com.google.firebase.ml.vision.DEPENDENCIES"
      android:value="label" />
  
</application>
```
インストール時点でのモデルのダウンロードを有効にしない場合は、デバイス上の検出器の初回実行時にモデルがダウンロードされます。ダウンロードが完了する前にリクエストしても結果は生成されません。
Cloud ベースモデルを使用する場合に、まだプロジェクトで Cloud ベースの API を有効にしていないときは、ここで有効にします。
1. Firebase コンソールの ML Kit API ページを開きます。
2. まだプロジェクトを Blaze 料金プランにアップグレードしていない場合は、[アップグレード] をクリックしてアップグレードします（プロジェクトをアップグレードするよう求められるのは、プロジェクトが Blaze プランでない場合のみです）。
  
  Blaze レベルのプロジェクトだけが Cloud ベースの API を使用できます。
3. Cloud ベースの API がまだ有効になっていない場合は、[Cloud ベースの API を有効化] をクリックします。
Cloud APIs を使用するアプリを本番環境にデプロイする前に、不正な API アクセスを防いでその影響を軽減するため、いくつかの追加手順が必要になります。
デバイスモデルのみを使用する場合は、この手順を省略できます。

これで、デバイスモデルまたはクラウドベースモデルを使用して画像にラベルを付ける準備ができました。

1. 入力画像を準備する

画像から FirebaseVisionImage オブジェクトを作成します。Bitmap を使用するか、Camera2 API（JPEG フォーマットの media.Image）を使用すると、画像ラベラーの処理が速くなります。可能であれば、このフォーマットの使用をおすすめします。

FirebaseVisionImage オブジェクトを media.Image オブジェクトから作成するには（デバイスのカメラから画像をキャプチャする場合など）、media.Image オブジェクトと画像の回転を FirebaseVisionImage.fromMediaImage() に渡します。

CameraX ライブラリを使用する場合は、OnImageCapturedListener クラスと ImageAnalysis.Analyzer クラスによって回転値が計算されるので、FirebaseVisionImage.fromMediaImage() を呼び出す前に、その回転を ML Kit の ROTATION_ 定数のいずれかに変換するだけで済みます。

Java

private class YourAnalyzer implements ImageAnalysis.Analyzer {

    private int degreesToFirebaseRotation(int degrees) {
        switch (degrees) {
            case 0:
                return FirebaseVisionImageMetadata.ROTATION_0;
            case 90:
                return FirebaseVisionImageMetadata.ROTATION_90;
            case 180:
                return FirebaseVisionImageMetadata.ROTATION_180;
            case 270:
                return FirebaseVisionImageMetadata.ROTATION_270;
            default:
                throw new IllegalArgumentException(
                        "Rotation must be 0, 90, 180, or 270.");
        }
    }

    @Override
    public void analyze(ImageProxy imageProxy, int degrees) {
        if (imageProxy == null || imageProxy.getImage() == null) {
            return;
        }
        Image mediaImage = imageProxy.getImage();
        int rotation = degreesToFirebaseRotation(degrees);
        FirebaseVisionImage image =
                FirebaseVisionImage.fromMediaImage(mediaImage, rotation);
        // Pass image to an ML Kit Vision API
        // ...
    }
}

Kotlin+KTX

private class YourImageAnalyzer : ImageAnalysis.Analyzer {
    private fun degreesToFirebaseRotation(degrees: Int): Int = when(degrees) {
        0 -> FirebaseVisionImageMetadata.ROTATION_0
        90 -> FirebaseVisionImageMetadata.ROTATION_90
        180 -> FirebaseVisionImageMetadata.ROTATION_180
        270 -> FirebaseVisionImageMetadata.ROTATION_270
        else -> throw Exception("Rotation must be 0, 90, 180, or 270.")
    }

    override fun analyze(imageProxy: ImageProxy?, degrees: Int) {
        val mediaImage = imageProxy?.image
        val imageRotation = degreesToFirebaseRotation(degrees)
        if (mediaImage != null) {
            val image = FirebaseVisionImage.fromMediaImage(mediaImage, imageRotation)
            // Pass image to an ML Kit Vision API
            // ...
        }
    }
}

画像の回転を取得するカメラライブラリを使用しない場合は、デバイスの回転とデバイス内のカメラセンサーの向きから計算できます。

Java

private static final SparseIntArray ORIENTATIONS = new SparseIntArray();
static {
    ORIENTATIONS.append(Surface.ROTATION_0, 90);
    ORIENTATIONS.append(Surface.ROTATION_90, 0);
    ORIENTATIONS.append(Surface.ROTATION_180, 270);
    ORIENTATIONS.append(Surface.ROTATION_270, 180);
}

/**
 * Get the angle by which an image must be rotated given the device's current
 * orientation.
 */
@RequiresApi(api = Build.VERSION_CODES.LOLLIPOP)
private int getRotationCompensation(String cameraId, Activity activity, Context context)
        throws CameraAccessException {
    // Get the device's current rotation relative to its "native" orientation.
    // Then, from the ORIENTATIONS table, look up the angle the image must be
    // rotated to compensate for the device's rotation.
    int deviceRotation = activity.getWindowManager().getDefaultDisplay().getRotation();
    int rotationCompensation = ORIENTATIONS.get(deviceRotation);

    // On most devices, the sensor orientation is 90 degrees, but for some
    // devices it is 270 degrees. For devices with a sensor orientation of
    // 270, rotate the image an additional 180 ((270 + 270) % 360) degrees.
    CameraManager cameraManager = (CameraManager) context.getSystemService(CAMERA_SERVICE);
    int sensorOrientation = cameraManager
            .getCameraCharacteristics(cameraId)
            .get(CameraCharacteristics.SENSOR_ORIENTATION);
    rotationCompensation = (rotationCompensation + sensorOrientation + 270) % 360;

    // Return the corresponding FirebaseVisionImageMetadata rotation value.
    int result;
    switch (rotationCompensation) {
        case 0:
            result = FirebaseVisionImageMetadata.ROTATION_0;
            break;
        case 90:
            result = FirebaseVisionImageMetadata.ROTATION_90;
            break;
        case 180:
            result = FirebaseVisionImageMetadata.ROTATION_180;
            break;
        case 270:
            result = FirebaseVisionImageMetadata.ROTATION_270;
            break;
        default:
            result = FirebaseVisionImageMetadata.ROTATION_0;
            Log.e(TAG, "Bad rotation value: " + rotationCompensation);
    }
    return result;
}VisionImage.java

Kotlin+KTX

private val ORIENTATIONS = SparseIntArray()

init {
    ORIENTATIONS.append(Surface.ROTATION_0, 90)
    ORIENTATIONS.append(Surface.ROTATION_90, 0)
    ORIENTATIONS.append(Surface.ROTATION_180, 270)
    ORIENTATIONS.append(Surface.ROTATION_270, 180)
}
/**
 * Get the angle by which an image must be rotated given the device's current
 * orientation.
 */
@RequiresApi(api = Build.VERSION_CODES.LOLLIPOP)
@Throws(CameraAccessException::class)
private fun getRotationCompensation(cameraId: String, activity: Activity, context: Context): Int {
    // Get the device's current rotation relative to its "native" orientation.
    // Then, from the ORIENTATIONS table, look up the angle the image must be
    // rotated to compensate for the device's rotation.
    val deviceRotation = activity.windowManager.defaultDisplay.rotation
    var rotationCompensation = ORIENTATIONS.get(deviceRotation)

    // On most devices, the sensor orientation is 90 degrees, but for some
    // devices it is 270 degrees. For devices with a sensor orientation of
    // 270, rotate the image an additional 180 ((270 + 270) % 360) degrees.
    val cameraManager = context.getSystemService(CAMERA_SERVICE) as CameraManager
    val sensorOrientation = cameraManager
            .getCameraCharacteristics(cameraId)
            .get(CameraCharacteristics.SENSOR_ORIENTATION)!!
    rotationCompensation = (rotationCompensation + sensorOrientation + 270) % 360

    // Return the corresponding FirebaseVisionImageMetadata rotation value.
    val result: Int
    when (rotationCompensation) {
        0 -> result = FirebaseVisionImageMetadata.ROTATION_0
        90 -> result = FirebaseVisionImageMetadata.ROTATION_90
        180 -> result = FirebaseVisionImageMetadata.ROTATION_180
        270 -> result = FirebaseVisionImageMetadata.ROTATION_270
        else -> {
            result = FirebaseVisionImageMetadata.ROTATION_0
            Log.e(TAG, "Bad rotation value: $rotationCompensation")
        }
    }
    return result
}VisionImage.kt

次に、media.Image オブジェクトと回転値を FirebaseVisionImage.fromMediaImage() に渡します。

Java

FirebaseVisionImage image = FirebaseVisionImage.fromMediaImage(mediaImage, rotation);VisionImage.java

Kotlin+KTX

val image = FirebaseVisionImage.fromMediaImage(mediaImage, rotation)VisionImage.kt

FirebaseVisionImage オブジェクトをファイルの URI から作成するには、アプリコンテキストとファイルの URI を FirebaseVisionImage.fromFilePath() に渡します。これは、ACTION_GET_CONTENT インテントを使用して、ギャラリーアプリから画像を選択するようにユーザーに促すときに便利です。
Java
```
FirebaseVisionImage image;
try {
    image = FirebaseVisionImage.fromFilePath(context, uri);
} catch (IOException e) {
    e.printStackTrace();
}VisionImage.java
```
Kotlin+KTX
```
val image: FirebaseVisionImage
try {
    image = FirebaseVisionImage.fromFilePath(context, uri)
} catch (e: IOException) {
    e.printStackTrace()
}VisionImage.kt
```

FirebaseVisionImage オブジェクトを ByteBuffer またはバイト配列から作成するには、media.Image 入力について上記のように、まず画像の回転を計算します。

次に、画像の高さ、幅、カラーエンコード形式、回転を含む FirebaseVisionImageMetadata オブジェクトを作成します。

Java

FirebaseVisionImageMetadata metadata = new FirebaseVisionImageMetadata.Builder()
        .setWidth(480)   // 480x360 is typically sufficient for
        .setHeight(360)  // image recognition
        .setFormat(FirebaseVisionImageMetadata.IMAGE_FORMAT_NV21)
        .setRotation(rotation)
        .build();VisionImage.java

Kotlin+KTX

val metadata = FirebaseVisionImageMetadata.Builder()
        .setWidth(480) // 480x360 is typically sufficient for
        .setHeight(360) // image recognition
        .setFormat(FirebaseVisionImageMetadata.IMAGE_FORMAT_NV21)
        .setRotation(rotation)
        .build()VisionImage.kt

メタデータオブジェクトと、バッファまたは配列を使用して、FirebaseVisionImage オブジェクトを作成します。

Java

FirebaseVisionImage image = FirebaseVisionImage.fromByteBuffer(buffer, metadata);
// Or: FirebaseVisionImage image = FirebaseVisionImage.fromByteArray(byteArray, metadata);VisionImage.java

Kotlin+KTX

val image = FirebaseVisionImage.fromByteBuffer(buffer, metadata)
// Or: val image = FirebaseVisionImage.fromByteArray(byteArray, metadata)VisionImage.kt

FirebaseVisionImage オブジェクトを Bitmap オブジェクトから作成するコードは、以下のとおりです。
Java
```
FirebaseVisionImage image = FirebaseVisionImage.fromBitmap(bitmap);VisionImage.java
```
Kotlin+KTX
```
val image = FirebaseVisionImage.fromBitmap(bitmap)VisionImage.kt
```
Bitmap オブジェクトによって表される画像は、これ以上回転させる必要がないように、正しい向きになっている必要があります。

2. 画像ラベラーを構成して実行する

画像内のオブジェクトにラベルを付けるには、FirebaseVisionImage オブジェクトを FirebaseVisionImageLabeler の processImage メソッドに渡します。

まず、FirebaseVisionImageLabeler のインスタンスを取得します。

デバイスの画像ラベラーを使用する場合:

Java

FirebaseVisionImageLabeler labeler = FirebaseVision.getInstance()
    .getOnDeviceImageLabeler();

// Or, to set the minimum confidence required:
// FirebaseVisionOnDeviceImageLabelerOptions options =
//     new FirebaseVisionOnDeviceImageLabelerOptions.Builder()
//         .setConfidenceThreshold(0.7f)
//         .build();
// FirebaseVisionImageLabeler labeler = FirebaseVision.getInstance()
//     .getOnDeviceImageLabeler(options);

Kotlin+KTX

val labeler = FirebaseVision.getInstance().getOnDeviceImageLabeler()

// Or, to set the minimum confidence required:
// val options = FirebaseVisionOnDeviceImageLabelerOptions.Builder()
//     .setConfidenceThreshold(0.7f)
//     .build()
// val labeler = FirebaseVision.getInstance().getOnDeviceImageLabeler(options)

クラウドの画像ラベラーを使用する場合:

Java

FirebaseVisionImageLabeler labeler = FirebaseVision.getInstance()
    .getCloudImageLabeler();

// Or, to set the minimum confidence required:
// FirebaseVisionCloudImageLabelerOptions options =
//     new FirebaseVisionCloudImageLabelerOptions.Builder()
//         .setConfidenceThreshold(0.7f)
//         .build();
// FirebaseVisionImageLabeler labeler = FirebaseVision.getInstance()
//     .getCloudImageLabeler(options);

Kotlin+KTX

val labeler = FirebaseVision.getInstance().getCloudImageLabeler()

// Or, to set the minimum confidence required:
// val options = FirebaseVisionCloudImageLabelerOptions.Builder()
//     .setConfidenceThreshold(0.7f)
//     .build()
// val labeler = FirebaseVision.getInstance().getCloudImageLabeler(options)

次に、画像を processImage() メソッドに渡します。

Java

labeler.processImage(image)
    .addOnSuccessListener(new OnSuccessListener<List<FirebaseVisionImageLabel>>() {
      @Override
      public void onSuccess(List<FirebaseVisionImageLabel> labels) {
        // Task completed successfully
        // ...
      }
    })
    .addOnFailureListener(new OnFailureListener() {
      @Override
      public void onFailure(@NonNull Exception e) {
        // Task failed with an exception
        // ...
      }
    });

Kotlin+KTX

labeler.processImage(image)
    .addOnSuccessListener { labels ->
      // Task completed successfully
      // ...
    }
    .addOnFailureListener { e ->
      // Task failed with an exception
      // ...
    }

3. ラベル付きオブジェクトに関する情報を取得する

画像のラベル付けオペレーションが成功すると、FirebaseVisionImageLabel オブジェクトのリストが成功リスナーに渡されます。各 FirebaseVisionImageLabel オブジェクトは画像内でラベル付けされたものを表します。ラベルごとに、ラベルのテキストの説明、ラベルのナレッジグラフエンティティの ID（使用できる場合）、マッチの信頼スコアを取得できます。次に例を示します。

Java

for (FirebaseVisionImageLabel label: labels) {
  String text = label.getText();
  String entityId = label.getEntityId();
  float confidence = label.getConfidence();
}

Kotlin+KTX

for (label in labels) {
  val text = label.text
  val entityId = label.entityId
  val confidence = label.confidence
}

リアルタイムのパフォーマンスを改善するためのヒント

リアルタイムのアプリケーションでラベルイメージを使用する場合は、適切なフレームレートを得るために次のガイドラインに従ってください。

画像ラベラーの呼び出しのスロットル調整を行います。画像ラベラーの実行中に新しい動画フレームが使用可能になった場合は、そのフレームをドロップします。
画像ラベラーの出力を使用して入力画像の上にグラフィックスをオーバーレイする場合は、まず ML Kit から検出結果を取得し、画像とオーバーレイを 1 つのステップでレンダリングします。これにより、ディスプレイサーフェスへのレンダリングは入力フレームごとに 1 回で済みます。
Camera2 API を使用する場合は、ImageFormat.YUV_420_888 形式で画像をキャプチャします。

古い Camera API を使用する場合は、ImageFormat.NV21 形式で画像をキャプチャします。

次のステップ

Cloud APIs を使用するアプリを本番環境にデプロイする前に、不正な API アクセスを防いでその影響を軽減するため、いくつかの追加手順が必要になります。