Google is committed to advancing racial equity for Black communities. See how.

Label Images with Firebase ML on iOS

You can use Firebase ML to label objects recognized in an image. See the overview for information about this API's features.

Before you begin

  1. If you have not already added Firebase to your app, do so by following the steps in the getting started guide.
  2. Include Firebase in your Podfile:
    pod 'Firebase/MLVision'
    After you install or update your project's Pods, be sure to open your Xcode project using its .xcworkspace.
  3. In your app, import Firebase:


    import Firebase


    @import Firebase;
  4. If you have not already enabled Cloud-based APIs for your project, do so now:

    1. Open the Firebase ML APIs page of the Firebase console.
    2. If you have not already upgraded your project to the Blaze plan, click Upgrade to do so. (You will be prompted to upgrade only if your project isn't on the Blaze plan.)

      Only Blaze-level projects can use Cloud-based APIs.

    3. If Cloud-based APIs aren't already enabled, click Enable Cloud-based APIs.

Now you are ready to label images.

1. Prepare the input image

Create a VisionImage object using a UIImage or a CMSampleBufferRef.

To use a UIImage:

  1. If necessary, rotate the image so that its imageOrientation property is .up.
  2. Create a VisionImage object using the correctly-rotated UIImage. Do not specify any rotation metadata—the default value, .topLeft, must be used.


    let image = VisionImage(image: uiImage)


    FIRVisionImage *image = [[FIRVisionImage alloc] initWithImage:uiImage];

To use a CMSampleBufferRef:

  1. Create a VisionImageMetadata object that specifies the orientation of the image data contained in the CMSampleBufferRef buffer.

    To get the image orientation:


    func imageOrientation(
        deviceOrientation: UIDeviceOrientation,
        cameraPosition: AVCaptureDevice.Position
        ) -> VisionDetectorImageOrientation {
        switch deviceOrientation {
        case .portrait:
            return cameraPosition == .front ? .leftTop : .rightTop
        case .landscapeLeft:
            return cameraPosition == .front ? .bottomLeft : .topLeft
        case .portraitUpsideDown:
            return cameraPosition == .front ? .rightBottom : .leftBottom
        case .landscapeRight:
            return cameraPosition == .front ? .topRight : .bottomRight
        case .faceDown, .faceUp, .unknown:
            return .leftTop


    - (FIRVisionDetectorImageOrientation)
                               cameraPosition:(AVCaptureDevicePosition)cameraPosition {
      switch (deviceOrientation) {
        case UIDeviceOrientationPortrait:
          if (cameraPosition == AVCaptureDevicePositionFront) {
            return FIRVisionDetectorImageOrientationLeftTop;
          } else {
            return FIRVisionDetectorImageOrientationRightTop;
        case UIDeviceOrientationLandscapeLeft:
          if (cameraPosition == AVCaptureDevicePositionFront) {
            return FIRVisionDetectorImageOrientationBottomLeft;
          } else {
            return FIRVisionDetectorImageOrientationTopLeft;
        case UIDeviceOrientationPortraitUpsideDown:
          if (cameraPosition == AVCaptureDevicePositionFront) {
            return FIRVisionDetectorImageOrientationRightBottom;
          } else {
            return FIRVisionDetectorImageOrientationLeftBottom;
        case UIDeviceOrientationLandscapeRight:
          if (cameraPosition == AVCaptureDevicePositionFront) {
            return FIRVisionDetectorImageOrientationTopRight;
          } else {
            return FIRVisionDetectorImageOrientationBottomRight;
          return FIRVisionDetectorImageOrientationTopLeft;

    Then, create the metadata object:


    let cameraPosition = AVCaptureDevice.Position.back  // Set to the capture device you used.
    let metadata = VisionImageMetadata()
    metadata.orientation = imageOrientation(
        deviceOrientation: UIDevice.current.orientation,
        cameraPosition: cameraPosition


    FIRVisionImageMetadata *metadata = [[FIRVisionImageMetadata alloc] init];
    AVCaptureDevicePosition cameraPosition =
        AVCaptureDevicePositionBack;  // Set to the capture device you used.
    metadata.orientation =
        [self imageOrientationFromDeviceOrientation:UIDevice.currentDevice.orientation
  2. Create a VisionImage object using the CMSampleBufferRef object and the rotation metadata:


    let image = VisionImage(buffer: sampleBuffer)
    image.metadata = metadata


    FIRVisionImage *image = [[FIRVisionImage alloc] initWithBuffer:sampleBuffer];
    image.metadata = metadata;

2. Configure and run the image labeler

To label objects in an image, pass the VisionImage object to the VisionImageLabeler's processImage() method.

  1. First, get an instance of VisionImageLabeler:


    let labeler =
    // Or, to set the minimum confidence required:
    // let options = VisionCloudImageLabelerOptions()
    // options.confidenceThreshold = 0.7
    // let labeler = options)


    FIRVisionImageLabeler *labeler = [[FIRVision vision] cloudImageLabeler];
    // Or, to set the minimum confidence required:
    // FIRVisionCloudImageLabelerOptions *options =
    //         [[FIRVisionCloudImageLabelerOptions alloc] init];
    // options.confidenceThreshold = 0.7;
    // FIRVisionImageLabeler *labeler =
    //         [[FIRVision vision] cloudImageLabelerWithOptions:options];
  2. Then, pass the image to the processImage() method:


    labeler.process(image) { labels, error in
        guard error == nil, let labels = labels else { return }
        // Task succeeded.
        // ...


    [labeler processImage:image
               completion:^(NSArray<FIRVisionImageLabel *> *_Nullable labels,
                            NSError *_Nullable error) {
                   if (error != nil) { return; }
                   // Task succeeded.
                   // ...

3. Get information about labeled objects

If image labeling succeeds, an array of VisionImageLabel objects will be passed to the completion handler. From each object, you can get information about a feature recognized in the image.

For example:


for label in labels {
    let labelText = label.text
    let entityId = label.entityID
    let confidence = label.confidence


for (FIRVisionImageLabel *label in labels) {
   NSString *labelText = label.text;
   NSString *entityId = label.entityID;
   NSNumber *confidence = label.confidence;

Next steps