Label Images with ML Kit on Android

You can use ML Kit to label objects recognized in an image, using either an on-device model or a cloud model. See the overview to learn about the benefits of each approach.

See the ML Kit quickstart sample on GitHub for an example of this API in use.

Before you begin

  1. If you have not already added Firebase to your app, do so by following the steps in the getting started guide.
  2. Include the dependencies for ML Kit in your app-level build.gradle file:
    dependencies {
      // ...
    
      implementation 'com.google.firebase:firebase-ml-vision:18.0.1'
      implementation 'com.google.firebase:firebase-ml-vision-image-label-model:17.0.2'
    }
    
  3. Optional but recommended: If you use the on-device API, configure your app to automatically download the ML model to the device after your app is installed from the Play Store.

    To do so, add the following declaration to your app's AndroidManifest.xml file:

    <application ...>
      ...
      <meta-data
          android:name="com.google.firebase.ml.vision.DEPENDENCIES"
          android:value="label" />
      <!-- To use multiple models: android:value="label,model2,model3" -->
    </application>
    
    If you do not enable install-time model downloads, the model will be downloaded the first time you run the on-device detector. Requests you make before the download has completed will produce no results.
  4. If you want to use the Cloud-based model, and you have not already enabled the Cloud-based APIs for your project, do so now:

    1. Open the ML Kit APIs page of the Firebase console.
    2. If you have not already upgraded your project to a Blaze plan, click Upgrade to do so. (You will be prompted to upgrade only if your project isn't on the Blaze plan.)

      Only Blaze-level projects can use Cloud-based APIs.

    3. If Cloud-based APIs aren't already enabled, click Enable Cloud-based APIs.

    If you want to use only the on-device model, you can skip this step.

Now you are ready to label images using either an on-device model or a cloud-based model.


On-device image labeling

To use the on-device image labeling model, configure and run the image labeler as described below.

1. Configure the image labeler

By default, the on-device image labeler returns at most 10 labels for an image. If you also want to require a minimum confidence for the labels it returns, create a FirebaseVisionLabelDetectorOptions object as in the following example:

FirebaseVisionLabelDetectorOptions options =
        new FirebaseVisionLabelDetectorOptions.Builder()
                .setConfidenceThreshold(0.8f)
                .build();
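
To make the effect of a confidence threshold concrete, here is a minimal plain-Java sketch that mimics it on hard-coded data. It does not call ML Kit: the ScoredLabel and ThresholdSketch classes and the sample values are hypothetical, and treating the threshold as inclusive is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a labeler result; not an ML Kit class.
final class ScoredLabel {
    final String text;
    final float confidence;

    ScoredLabel(String text, float confidence) {
        this.text = text;
        this.confidence = confidence;
    }
}

final class ThresholdSketch {
    // Keep labels at or above minConfidence (assumed inclusive), preserving order.
    static List<ScoredLabel> keepConfident(List<ScoredLabel> labels, float minConfidence) {
        List<ScoredLabel> kept = new ArrayList<>();
        for (ScoredLabel label : labels) {
            if (label.confidence >= minConfidence) {
                kept.add(label);
            }
        }
        return kept;
    }
}
```

With a threshold of 0.8f, a list containing labels scored 0.95, 0.60, and 0.80 would keep the first and the last under these assumptions.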

2. Run the image labeler

To label objects in an image, create a FirebaseVisionImage object from either a Bitmap, media.Image, ByteBuffer, byte array, or a file on the device. Then, pass the FirebaseVisionImage object to the FirebaseVisionLabelDetector's detectInImage method.

  1. Create a FirebaseVisionImage object from your image. The image labeler runs fastest when you use a Bitmap or, if you use the camera2 API, a JPEG-formatted media.Image; use one of these formats when possible.

    • To create a FirebaseVisionImage object from a Bitmap object:
      FirebaseVisionImage image = FirebaseVisionImage.fromBitmap(bitmap);
      The image represented by the Bitmap object must be upright, with no additional rotation required.
    • To create a FirebaseVisionImage object from a media.Image object, such as when capturing an image from a device's camera, first determine the angle by which the image must be rotated to compensate for both the device's rotation and the orientation of the camera sensor in the device:
      private static final SparseIntArray ORIENTATIONS = new SparseIntArray();
      static {
          ORIENTATIONS.append(Surface.ROTATION_0, 90);
          ORIENTATIONS.append(Surface.ROTATION_90, 0);
          ORIENTATIONS.append(Surface.ROTATION_180, 270);
          ORIENTATIONS.append(Surface.ROTATION_270, 180);
      }
      
      /**
       * Get the angle by which an image must be rotated given the device's current
       * orientation.
       */
      @RequiresApi(api = Build.VERSION_CODES.LOLLIPOP)
      private int getRotationCompensation(String cameraId, Activity activity, Context context)
              throws CameraAccessException {
          // Get the device's current rotation relative to its "native" orientation.
          // Then, from the ORIENTATIONS table, look up the angle the image must be
          // rotated to compensate for the device's rotation.
          int deviceRotation = activity.getWindowManager().getDefaultDisplay().getRotation();
          int rotationCompensation = ORIENTATIONS.get(deviceRotation);
      
          // On most devices, the sensor orientation is 90 degrees, but for some
          // devices it is 270 degrees. For devices with a sensor orientation of
          // 270, rotate the image an additional 180 ((270 + 270) % 360) degrees.
          CameraManager cameraManager = (CameraManager) context.getSystemService(Context.CAMERA_SERVICE);
          int sensorOrientation = cameraManager
                  .getCameraCharacteristics(cameraId)
                  .get(CameraCharacteristics.SENSOR_ORIENTATION);
          rotationCompensation = (rotationCompensation + sensorOrientation + 270) % 360;
      
          // Return the corresponding FirebaseVisionImageMetadata rotation value.
          int result;
          switch (rotationCompensation) {
              case 0:
                  result = FirebaseVisionImageMetadata.ROTATION_0;
                  break;
              case 90:
                  result = FirebaseVisionImageMetadata.ROTATION_90;
                  break;
              case 180:
                  result = FirebaseVisionImageMetadata.ROTATION_180;
                  break;
              case 270:
                  result = FirebaseVisionImageMetadata.ROTATION_270;
                  break;
              default:
                  result = FirebaseVisionImageMetadata.ROTATION_0;
                  Log.e(TAG, "Bad rotation value: " + rotationCompensation);
          }
          return result;
      }

      Then, pass the media.Image object and the rotation value to FirebaseVisionImage.fromMediaImage():

      FirebaseVisionImage image = FirebaseVisionImage.fromMediaImage(mediaImage, rotation);
    • To create a FirebaseVisionImage object from a ByteBuffer or a byte array, first calculate the image rotation as described above.

      Then, create a FirebaseVisionImageMetadata object that contains the image's height, width, color encoding format, and rotation:

      FirebaseVisionImageMetadata metadata = new FirebaseVisionImageMetadata.Builder()
              .setWidth(480)   // 480x360 is typically sufficient for
              .setHeight(360)  // image recognition
              .setFormat(FirebaseVisionImageMetadata.IMAGE_FORMAT_NV21)
              .setRotation(rotation)
              .build();

      Use the buffer or array, and the metadata object, to create a FirebaseVisionImage object:

      FirebaseVisionImage image = FirebaseVisionImage.fromByteBuffer(buffer, metadata);
      // Or: FirebaseVisionImage image = FirebaseVisionImage.fromByteArray(byteArray, metadata);
      
    • To create a FirebaseVisionImage object from a file, pass the app context and file URI to FirebaseVisionImage.fromFilePath():
      FirebaseVisionImage image;
      try {
          image = FirebaseVisionImage.fromFilePath(context, uri);
      } catch (IOException e) {
          e.printStackTrace();
      }

  2. Get an instance of FirebaseVisionLabelDetector:

    FirebaseVisionLabelDetector detector = FirebaseVision.getInstance()
            .getVisionLabelDetector();
    // Or, to set the minimum confidence required:
    // FirebaseVisionLabelDetector detector = FirebaseVision.getInstance()
    //         .getVisionLabelDetector(options);

  3. Finally, pass the image to the detectInImage method:

    Task<List<FirebaseVisionLabel>> result =
            detector.detectInImage(image)
                    .addOnSuccessListener(
                            new OnSuccessListener<List<FirebaseVisionLabel>>() {
                                @Override
                                public void onSuccess(List<FirebaseVisionLabel> labels) {
                                    // Task completed successfully
                                    // ...
                                }
                            })
                    .addOnFailureListener(
                            new OnFailureListener() {
                                @Override
                                public void onFailure(@NonNull Exception e) {
                                    // Task failed with an exception
                                    // ...
                                }
                            });
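
The rotation arithmetic from step 1 can be checked in isolation. The sketch below reproduces only the formula from getRotationCompensation() in plain Java, without any Android classes. RotationSketch is a hypothetical name, and the array is indexed by the integer values of Surface.ROTATION_0 through Surface.ROTATION_270 (0 through 3).

```java
final class RotationSketch {
    // Degrees from the ORIENTATIONS table in step 1, indexed by display
    // rotation (0 = ROTATION_0, 1 = ROTATION_90, 2 = ROTATION_180, 3 = ROTATION_270).
    private static final int[] ORIENTATION_DEGREES = {90, 0, 270, 180};

    // Same formula as getRotationCompensation(), returned in degrees.
    static int rotationCompensation(int displayRotationIndex, int sensorOrientation) {
        int deviceDegrees = ORIENTATION_DEGREES[displayRotationIndex];
        return (deviceDegrees + sensorOrientation + 270) % 360;
    }
}
```

For a device in its native orientation (index 0), a sensor orientation of 90 gives (90 + 90 + 270) % 360 = 90, while a sensor orientation of 270 gives (90 + 270 + 270) % 360 = 270, which is the additional 180 degrees mentioned in the code comment.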

3. Get information about labeled objects

If the image labeling operation succeeds, a list of FirebaseVisionLabel objects will be passed to the success listener. Each FirebaseVisionLabel object represents something that was labeled in the image. For each label, you can get the label's text description, its Knowledge Graph entity ID (if available), and the confidence score of the match. For example:
for (FirebaseVisionLabel label: labels) {
    String text = label.getLabel();
    String entityId = label.getEntityId();
    float confidence = label.getConfidence();
}
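
A common next step is to pick the single most confident label. The following plain-Java sketch shows one way to do that. It uses a hypothetical LabelInfo class in place of FirebaseVisionLabel so that it runs without ML Kit; in an app you would read the same values through getLabel() and getConfidence().

```java
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for FirebaseVisionLabel; not an ML Kit class.
final class LabelInfo {
    final String text;
    final float confidence;

    LabelInfo(String text, float confidence) {
        this.text = text;
        this.confidence = confidence;
    }
}

final class TopLabelSketch {
    // Returns the label with the highest confidence, or empty for an empty list.
    // On ties, the label that appears first in the list wins.
    static Optional<LabelInfo> mostConfident(List<LabelInfo> labels) {
        LabelInfo best = null;
        for (LabelInfo label : labels) {
            if (best == null || label.confidence > best.confidence) {
                best = label;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

Returning an Optional keeps the empty-result case explicit, which matters because the labeler can return an empty list.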

Tips to improve real-time performance

If you want to label images in a real-time application, follow these guidelines to achieve the best framerates:

  • Throttle calls to the image labeler. If a new video frame becomes available while the image labeler is running, drop the frame.
  • If you are using the output of the image labeler to overlay graphics on the input image, first get the result from ML Kit, then render the image and overlay in a single step. By doing so, you render to the display surface only once for each input frame. See the CameraSourcePreview and GraphicOverlay classes in the quickstart sample app for an example.
  • If you use the Camera2 API, capture images in ImageFormat.YUV_420_888 format.

    If you use the older Camera API, capture images in ImageFormat.NV21 format.

  • Capture images at a lower resolution. A 480x360 pixel image is typically sufficient.
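
The first guideline, dropping frames while the labeler is busy, can be sketched with a simple gate. FrameGate is a hypothetical helper, not part of ML Kit. The idea is to call tryAcquire() when a frame arrives and release() from both the success and failure listeners.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper that allows at most one frame in flight at a time.
final class FrameGate {
    private final AtomicBoolean busy = new AtomicBoolean(false);

    // Returns true if the caller may process this frame; false means drop it.
    boolean tryAcquire() {
        return busy.compareAndSet(false, true);
    }

    // Call when the labeler finishes, whether it succeeded or failed.
    void release() {
        busy.set(false);
    }
}
```

Using compareAndSet makes the check and the update a single atomic operation, so two camera callbacks arriving at nearly the same time cannot both pass the gate.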

Cloud image labeling

To use the Cloud-based image labeling model, configure and run the image labeler as described below.

1. Configure the image labeler

By default, the Cloud detector uses the STABLE version of the model and returns up to 10 results. If you want to change either of these settings, specify them with a FirebaseVisionCloudDetectorOptions object.

For example, to change both of the default settings, build a FirebaseVisionCloudDetectorOptions object as follows:

FirebaseVisionCloudDetectorOptions options =
        new FirebaseVisionCloudDetectorOptions.Builder()
                .setModelType(FirebaseVisionCloudDetectorOptions.LATEST_MODEL)
                .setMaxResults(15)
                .build();

To use the default settings, you can use FirebaseVisionCloudDetectorOptions.DEFAULT in the next step.

2. Run the image labeler

To label objects in an image, create a FirebaseVisionImage object from either a Bitmap, media.Image, ByteBuffer, byte array, or a file on the device. Then, pass the FirebaseVisionImage object to the FirebaseVisionCloudLabelDetector's detectInImage method.

  1. Create a FirebaseVisionImage object from your image. The image labeler runs fastest when you use a Bitmap or, if you use the camera2 API, a JPEG-formatted media.Image; use one of these formats when possible.

    • To create a FirebaseVisionImage object from a Bitmap object:
      FirebaseVisionImage image = FirebaseVisionImage.fromBitmap(bitmap);
      The image represented by the Bitmap object must be upright, with no additional rotation required.
    • To create a FirebaseVisionImage object from a media.Image object, such as when capturing an image from a device's camera, first determine the angle by which the image must be rotated to compensate for both the device's rotation and the orientation of the camera sensor in the device:
      private static final SparseIntArray ORIENTATIONS = new SparseIntArray();
      static {
          ORIENTATIONS.append(Surface.ROTATION_0, 90);
          ORIENTATIONS.append(Surface.ROTATION_90, 0);
          ORIENTATIONS.append(Surface.ROTATION_180, 270);
          ORIENTATIONS.append(Surface.ROTATION_270, 180);
      }
      
      /**
       * Get the angle by which an image must be rotated given the device's current
       * orientation.
       */
      @RequiresApi(api = Build.VERSION_CODES.LOLLIPOP)
      private int getRotationCompensation(String cameraId, Activity activity, Context context)
              throws CameraAccessException {
          // Get the device's current rotation relative to its "native" orientation.
          // Then, from the ORIENTATIONS table, look up the angle the image must be
          // rotated to compensate for the device's rotation.
          int deviceRotation = activity.getWindowManager().getDefaultDisplay().getRotation();
          int rotationCompensation = ORIENTATIONS.get(deviceRotation);
      
          // On most devices, the sensor orientation is 90 degrees, but for some
          // devices it is 270 degrees. For devices with a sensor orientation of
          // 270, rotate the image an additional 180 ((270 + 270) % 360) degrees.
          CameraManager cameraManager = (CameraManager) context.getSystemService(Context.CAMERA_SERVICE);
          int sensorOrientation = cameraManager
                  .getCameraCharacteristics(cameraId)
                  .get(CameraCharacteristics.SENSOR_ORIENTATION);
          rotationCompensation = (rotationCompensation + sensorOrientation + 270) % 360;
      
          // Return the corresponding FirebaseVisionImageMetadata rotation value.
          int result;
          switch (rotationCompensation) {
              case 0:
                  result = FirebaseVisionImageMetadata.ROTATION_0;
                  break;
              case 90:
                  result = FirebaseVisionImageMetadata.ROTATION_90;
                  break;
              case 180:
                  result = FirebaseVisionImageMetadata.ROTATION_180;
                  break;
              case 270:
                  result = FirebaseVisionImageMetadata.ROTATION_270;
                  break;
              default:
                  result = FirebaseVisionImageMetadata.ROTATION_0;
                  Log.e(TAG, "Bad rotation value: " + rotationCompensation);
          }
          return result;
      }

      Then, pass the media.Image object and the rotation value to FirebaseVisionImage.fromMediaImage():

      FirebaseVisionImage image = FirebaseVisionImage.fromMediaImage(mediaImage, rotation);
    • To create a FirebaseVisionImage object from a ByteBuffer or a byte array, first calculate the image rotation as described above.

      Then, create a FirebaseVisionImageMetadata object that contains the image's height, width, color encoding format, and rotation:

      FirebaseVisionImageMetadata metadata = new FirebaseVisionImageMetadata.Builder()
              .setWidth(480)   // 480x360 is typically sufficient for
              .setHeight(360)  // image recognition
              .setFormat(FirebaseVisionImageMetadata.IMAGE_FORMAT_NV21)
              .setRotation(rotation)
              .build();

      Use the buffer or array, and the metadata object, to create a FirebaseVisionImage object:

      FirebaseVisionImage image = FirebaseVisionImage.fromByteBuffer(buffer, metadata);
      // Or: FirebaseVisionImage image = FirebaseVisionImage.fromByteArray(byteArray, metadata);
      
    • To create a FirebaseVisionImage object from a file, pass the app context and file URI to FirebaseVisionImage.fromFilePath():
      FirebaseVisionImage image;
      try {
          image = FirebaseVisionImage.fromFilePath(context, uri);
      } catch (IOException e) {
          e.printStackTrace();
      }

  2. Get an instance of FirebaseVisionCloudLabelDetector:

    FirebaseVisionCloudLabelDetector detector = FirebaseVision.getInstance()
            .getVisionCloudLabelDetector();
    // Or, to change the default settings:
    // FirebaseVisionCloudLabelDetector detector = FirebaseVision.getInstance()
    //         .getVisionCloudLabelDetector(options);

  3. Finally, pass the image to the detectInImage method:

    Task<List<FirebaseVisionCloudLabel>> result =
            detector.detectInImage(image)
                    .addOnSuccessListener(
                            new OnSuccessListener<List<FirebaseVisionCloudLabel>>() {
                                @Override
                                public void onSuccess(List<FirebaseVisionCloudLabel> labels) {
                                    // Task completed successfully
                                    // ...
                                }
                            })
                    .addOnFailureListener(
                            new OnFailureListener() {
                                @Override
                                public void onFailure(@NonNull Exception e) {
                                    // Task failed with an exception
                                    // ...
                                }
                            });
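
When you build the image from a ByteBuffer or byte array in step 1, it can help to confirm that the buffer length matches the width and height declared in the metadata. NV21 stores a full-resolution luma plane followed by interleaved chroma samples at quarter resolution, which works out to 12 bits per pixel. The sketch below is plain Java. Nv21SizeSketch is a hypothetical name, and rejecting odd dimensions is a simplifying assumption.

```java
final class Nv21SizeSketch {
    // Expected byte count of an NV21 frame with even dimensions.
    static int expectedBytes(int width, int height) {
        if (width <= 0 || height <= 0 || width % 2 != 0 || height % 2 != 0) {
            throw new IllegalArgumentException("width and height must be positive and even");
        }
        int lumaBytes = width * height;        // one Y byte per pixel
        int chromaBytes = width * height / 2;  // one V and one U byte per 2x2 block
        return lumaBytes + chromaBytes;
    }
}
```

For the 480x360 frame used in the metadata example, this gives 172800 + 86400 = 259200 bytes.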

3. Get information about labeled objects

If the image labeling operation succeeds, a list of FirebaseVisionCloudLabel objects will be passed to the success listener. Each FirebaseVisionCloudLabel object represents something that was labeled in the image. For each label, you can get its text description, its Knowledge Graph entity ID (if available), and the confidence score of the match. For example:

for (FirebaseVisionCloudLabel label: labels) {
    String text = label.getLabel();
    String entityId = label.getEntityId();
    float confidence = label.getConfidence();
}
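
Because the entity ID is only present for some labels, display code should handle its absence. The sketch below formats a label with or without an ID. CloudLabelInfo is a hypothetical stand-in for FirebaseVisionCloudLabel, the sample ID is a placeholder, and the assumption that a missing ID shows up as null or an empty string has not been checked against the library.

```java
import java.util.Locale;

// Hypothetical stand-in for FirebaseVisionCloudLabel; not an ML Kit class.
final class CloudLabelInfo {
    final String text;
    final String entityId; // assumed to be null or empty when unavailable
    final float confidence;

    CloudLabelInfo(String text, String entityId, float confidence) {
        this.text = text;
        this.entityId = entityId;
        this.confidence = confidence;
    }
}

final class LabelFormatSketch {
    // Builds "text (confidence)" or "text (confidence, entityId)".
    static String describe(CloudLabelInfo label) {
        String base = String.format(Locale.ROOT, "%s (%.2f", label.text, label.confidence);
        boolean hasId = label.entityId != null && !label.entityId.isEmpty();
        return hasId ? base + ", " + label.entityId + ")" : base + ")";
    }
}
```

Passing Locale.ROOT keeps the decimal separator stable regardless of the device's language settings.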

Next steps

Before you deploy to production an app that uses a Cloud API, you should take some additional steps to prevent and mitigate the effect of unauthorized API access.
