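Detect Faces with ML Kit on iOS

You can use ML Kit to detect faces in images and video.

Before you begin

If you have not already added Firebase to your app, do so by following the steps in the getting started guide.

Include the ML Kit libraries in your Podfile: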

pod 'Firebase/MLVision', '6.25.0'
# If you want to detect face contours (landmark detection and classification
# don't require this additional model):
pod 'Firebase/MLVisionFaceModel', '6.25.0'

After you install or update your project's Pods, be sure to open your Xcode project using its .xcworkspace.

In your app, import Firebase:
Swift
```
import Firebase
```
Objective-C
```
@import Firebase;
```

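Input image guidelines

For ML Kit to detect faces accurately, input images must contain faces that are represented by sufficient pixel data. In general, each face you want to detect in an image should be at least 100x100 pixels. Detecting face contours requires higher-resolution input: each face should be at least 200x200 pixels.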
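The size guideline above can be turned into a quick pre-check. The following is only a sketch: the helper names `faceIsLargeEnough` and `minimumCaptureWidth` are hypothetical and not part of ML Kit, and the code assumes you already have an estimate of how many pixels a face occupies.

```swift
// Minimum face sizes quoted in the guideline, in pixels per side.
let minFaceSide = 100         // landmark detection and classification
let minContourFaceSide = 200  // contour detection

/// Hypothetical helper: true if a face of the given pixel size meets the guideline.
func faceIsLargeEnough(faceWidth: Int, faceHeight: Int, wantsContours: Bool) -> Bool {
    let required = wantsContours ? minContourFaceSide : minFaceSide
    return faceWidth >= required && faceHeight >= required
}

/// Hypothetical helper: the smallest capture width (in pixels) at which a face
/// that fills `faceFraction` of the frame width still meets the guideline.
func minimumCaptureWidth(faceFraction: Double, wantsContours: Bool) -> Int {
    precondition(faceFraction > 0 && faceFraction <= 1, "faceFraction must be in (0, 1]")
    let required = Double(wantsContours ? minContourFaceSide : minFaceSide)
    return Int((required / faceFraction).rounded(.up))
}
```

For example, if a face is expected to fill about a quarter of the frame width, `minimumCaptureWidth(faceFraction: 0.25, wantsContours: false)` suggests capturing at least 400 pixels wide.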

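If you are detecting faces in a real-time application, you might also want to consider the overall dimensions of the input images. Smaller images can be processed faster, so to reduce latency, capture images at the lowest resolution that still meets the accuracy requirements above, and make sure the subject's face occupies as much of the image as possible. Also see the tips to improve real-time performance.

Poor image focus can hurt accuracy. If you aren't getting acceptable results, ask the user to recapture the image.

The orientation of a face relative to the camera can also affect which facial features ML Kit detects. See Face Detection Concepts.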

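1. Configure the face detector

Before you apply face detection to an image, if you want to change any of the face detector's default settings, specify those settings with a VisionFaceDetectorOptions object. You can change the following settings: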

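| Setting | Values | Description |
| --- | --- | --- |
| `performanceMode` | `fast` (default) or `accurate` | Favor speed or accuracy when detecting faces. |
| `landmarkMode` | `none` (default) or `all` | Whether to identify facial "landmarks" (eyes, ears, nose, cheeks, mouth) on all detected faces. |
| `contourMode` | `none` (default) or `all` | Whether to detect the contours of facial features. Contours are detected only for the most prominent face in an image. |
| `classificationMode` | `none` (default) or `all` | Whether to classify faces into categories such as "smiling" and "eyes open". |
| `minFaceSize` | `CGFloat` (default: `0.1`) | The minimum size, relative to the image, of faces to detect. |
| `isTrackingEnabled` | `false` (default) or `true` | Whether to assign faces an ID, which can be used to track faces across images. When contour detection is enabled, only one face is detected, so face tracking doesn't produce useful results. For this reason, and to improve detection speed, don't enable contour detection and face tracking together. |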

Build a VisionFaceDetectorOptions object as in the following examples:

Swift

// Option A: high-accuracy landmark detection and face classification
let options = VisionFaceDetectorOptions()
options.performanceMode = .accurate
options.landmarkMode = .all
options.classificationMode = .all

// Option B: real-time contour detection (contours are returned only for the
// most prominent face). Named differently so both snippets can coexist.
let contourOptions = VisionFaceDetectorOptions()
contourOptions.contourMode = .all

Objective-C

// Option A: high-accuracy landmark detection and face classification
FIRVisionFaceDetectorOptions *options = [[FIRVisionFaceDetectorOptions alloc] init];
options.performanceMode = FIRVisionFaceDetectorPerformanceModeAccurate;
options.landmarkMode = FIRVisionFaceDetectorLandmarkModeAll;
options.classificationMode = FIRVisionFaceDetectorClassificationModeAll;

// Option B: real-time contour detection (contours are returned only for the
// most prominent face). Named differently so both snippets can coexist.
FIRVisionFaceDetectorOptions *contourOptions = [[FIRVisionFaceDetectorOptions alloc] init];
contourOptions.contourMode = FIRVisionFaceDetectorContourModeAll;
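The interaction rules between these settings can be checked before a detector is created. The sketch below does not use the Firebase SDK: `DetectorConfig` and `configurationWarnings(for:)` are hypothetical names that only mirror the settings discussed on this page, and the two rules encoded are the ones this page states (contours with tracking, and contours with landmarks or classification in real-time use).

```swift
/// Hypothetical stand-in for the subset of VisionFaceDetectorOptions
/// discussed on this page. It is not an ML Kit type.
struct DetectorConfig {
    var contours = false
    var landmarks = false
    var classification = false
    var tracking = false
}

/// Returns one warning per rule from this page that the configuration breaks.
func configurationWarnings(for config: DetectorConfig) -> [String] {
    var warnings: [String] = []
    if config.contours && config.tracking {
        // With contours on, only one face is detected, so IDs add cost without value.
        warnings.append("Contour detection and face tracking should not be enabled together.")
    }
    if config.contours && (config.landmarks || config.classification) {
        // Real-time guidance: contours OR landmarks/classification, not both.
        warnings.append("For real-time use, do not combine contours with landmarks or classification.")
    }
    return warnings
}
```

A caller might log the returned warnings in debug builds and only then copy the flags into a real VisionFaceDetectorOptions object.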

2. Run the face detector

To detect faces in an image, pass the image as a UIImage or a CMSampleBufferRef to the VisionFaceDetector's process(_:completion:) method:

Get an instance of VisionFaceDetector:

Swift

lazy var vision = Vision.vision()

let faceDetector = vision.faceDetector(options: options)

Objective-C

FIRVision *vision = [FIRVision vision];
FIRVisionFaceDetector *faceDetector = [vision faceDetector];
// Or, to change the default settings:
// FIRVisionFaceDetector *faceDetector =
//     [vision faceDetectorWithOptions:options];

Create a VisionImage object using a UIImage or a CMSampleBufferRef.

To use a UIImage:

- If necessary, rotate the image so that its imageOrientation property is .up.
- Create a VisionImage object using the correctly rotated UIImage. Do not specify any rotation metadata; the default value, .topLeft, must be used.
Swift
```
let image = VisionImage(image: uiImage)
```
Objective-C
```
FIRVisionImage *image = [[FIRVisionImage alloc] initWithImage:uiImage];
```

To use a CMSampleBufferRef:

Create a VisionImageMetadata object that specifies the orientation of the image data contained in the CMSampleBufferRef buffer.

To get the image orientation:

Swift

func imageOrientation(
    deviceOrientation: UIDeviceOrientation,
    cameraPosition: AVCaptureDevice.Position
    ) -> VisionDetectorImageOrientation {
    switch deviceOrientation {
    case .portrait:
        return cameraPosition == .front ? .leftTop : .rightTop
    case .landscapeLeft:
        return cameraPosition == .front ? .bottomLeft : .topLeft
    case .portraitUpsideDown:
        return cameraPosition == .front ? .rightBottom : .leftBottom
    case .landscapeRight:
        return cameraPosition == .front ? .topRight : .bottomRight
    case .faceDown, .faceUp, .unknown:
        return .leftTop
    }
}

Objective-C

- (FIRVisionDetectorImageOrientation)
    imageOrientationFromDeviceOrientation:(UIDeviceOrientation)deviceOrientation
                           cameraPosition:(AVCaptureDevicePosition)cameraPosition {
  switch (deviceOrientation) {
    case UIDeviceOrientationPortrait:
      if (cameraPosition == AVCaptureDevicePositionFront) {
        return FIRVisionDetectorImageOrientationLeftTop;
      } else {
        return FIRVisionDetectorImageOrientationRightTop;
      }
    case UIDeviceOrientationLandscapeLeft:
      if (cameraPosition == AVCaptureDevicePositionFront) {
        return FIRVisionDetectorImageOrientationBottomLeft;
      } else {
        return FIRVisionDetectorImageOrientationTopLeft;
      }
    case UIDeviceOrientationPortraitUpsideDown:
      if (cameraPosition == AVCaptureDevicePositionFront) {
        return FIRVisionDetectorImageOrientationRightBottom;
      } else {
        return FIRVisionDetectorImageOrientationLeftBottom;
      }
    case UIDeviceOrientationLandscapeRight:
      if (cameraPosition == AVCaptureDevicePositionFront) {
        return FIRVisionDetectorImageOrientationTopRight;
      } else {
        return FIRVisionDetectorImageOrientationBottomRight;
      }
    default:
      return FIRVisionDetectorImageOrientationTopLeft;
  }
}

Then, create the metadata object:

Swift

let cameraPosition = AVCaptureDevice.Position.back  // Set to the capture device you used.
let metadata = VisionImageMetadata()
metadata.orientation = imageOrientation(
    deviceOrientation: UIDevice.current.orientation,
    cameraPosition: cameraPosition
)

Objective-C

FIRVisionImageMetadata *metadata = [[FIRVisionImageMetadata alloc] init];
AVCaptureDevicePosition cameraPosition =
    AVCaptureDevicePositionBack;  // Set to the capture device you used.
metadata.orientation =
    [self imageOrientationFromDeviceOrientation:UIDevice.currentDevice.orientation
                                 cameraPosition:cameraPosition];

Create a VisionImage object using the CMSampleBufferRef object and the rotation metadata:

Swift

let image = VisionImage(buffer: sampleBuffer)
image.metadata = metadata

Objective-C

FIRVisionImage *image = [[FIRVisionImage alloc] initWithBuffer:sampleBuffer];
image.metadata = metadata;

Then, pass the image to the process(_:completion:) method:

Swift

faceDetector.process(image) { faces, error in
  guard error == nil, let faces = faces, !faces.isEmpty else {
    // Detection failed or no faces were found
    // ...
    return
  }

  // Faces detected
  // ...
}

Objective-C

[faceDetector processImage:image
                completion:^(NSArray<FIRVisionFace *> *faces,
                             NSError *error) {
  if (error != nil) {
    // Detection failed
    return;
  } else if (faces != nil) {
    // Recognized faces
  }
}];

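3. Get information about detected faces

If the face detection operation succeeds, the face detector passes an array of VisionFace objects to the completion handler. Each VisionFace object represents a face that was detected in the image. For each face, you can get its bounding coordinates in the input image, as well as any other information you configured the face detector to find. For example: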

Swift

for face in faces {
  let frame = face.frame  // Boundaries of face in image
  if face.hasHeadEulerAngleY {
    let rotY = face.headEulerAngleY  // Head is rotated to the right rotY degrees
  }
  if face.hasHeadEulerAngleZ {
    let rotZ = face.headEulerAngleZ  // Head is tilted sideways rotZ degrees
  }

  // If landmark detection was enabled (mouth, ears, eyes, cheeks, and
  // nose available):
  if let leftEye = face.landmark(ofType: .leftEye) {
    let leftEyePosition = leftEye.position
  }

  // If contour detection was enabled:
  if let leftEyeContour = face.contour(ofType: .leftEye) {
    let leftEyePoints = leftEyeContour.points
  }
  if let upperLipBottomContour = face.contour(ofType: .upperLipBottom) {
    let upperLipBottomPoints = upperLipBottomContour.points
  }

  // If classification was enabled:
  if face.hasSmilingProbability {
    let smileProb = face.smilingProbability
  }
  if face.hasRightEyeOpenProbability {
    let rightEyeOpenProb = face.rightEyeOpenProbability
  }

  // If face tracking was enabled:
  if face.hasTrackingID {
    let trackingId = face.trackingID
  }
}

Objective-C

for (FIRVisionFace *face in faces) {
  // Boundaries of face in image
  CGRect frame = face.frame;

  if (face.hasHeadEulerAngleY) {
    CGFloat rotY = face.headEulerAngleY;  // Head is rotated to the right rotY degrees
  }
  if (face.hasHeadEulerAngleZ) {
    CGFloat rotZ = face.headEulerAngleZ;  // Head is tilted sideways rotZ degrees
  }

  // If landmark detection was enabled (mouth, ears, eyes, cheeks, and
  // nose available):
  FIRVisionFaceLandmark *leftEar = [face landmarkOfType:FIRFaceLandmarkTypeLeftEar];
  if (leftEar != nil) {
    FIRVisionPoint *leftEarPosition = leftEar.position;
  }

  // If contour detection was enabled:
  FIRVisionFaceContour *upperLipBottomContour = [face contourOfType:FIRFaceContourTypeUpperLipBottom];
  if (upperLipBottomContour != nil) {
    NSArray<FIRVisionPoint *> *upperLipBottomPoints = upperLipBottomContour.points;
    if (upperLipBottomPoints.count > 0) {
      NSLog(@"Detected the bottom contour of the subject's upper lip.");
    }
  }

  // If classification was enabled:
  if (face.hasSmilingProbability) {
    CGFloat smileProb = face.smilingProbability;
  }
  if (face.hasRightEyeOpenProbability) {
    CGFloat rightEyeOpenProb = face.rightEyeOpenProbability;
  }

  // If face tracking was enabled:
  if (face.hasTrackingID) {
    NSInteger trackingID = face.trackingID;
  }
}
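Classification results are probabilities, so an app usually has to pick a cutoff before acting on them. The sketch below illustrates one way to do that without the Firebase SDK: `FaceClassification` and `labels(for:threshold:)` are hypothetical names, a nil value stands in for a false `hasSmilingProbability` or `hasRightEyeOpenProbability`, and the 0.5 threshold is an arbitrary illustration rather than a recommended value.

```swift
/// Hypothetical stand-in for the classification fields of a detected face.
/// nil plays the role of the corresponding has... flag being false.
struct FaceClassification {
    var smilingProbability: Double?
    var rightEyeOpenProbability: Double?
}

/// Returns a label for each available probability at or above `threshold`.
func labels(for face: FaceClassification, threshold: Double = 0.5) -> [String] {
    var result: [String] = []
    if let smile = face.smilingProbability, smile >= threshold {
        result.append("smiling")
    }
    if let rightEye = face.rightEyeOpenProbability, rightEye >= threshold {
        result.append("right eye open")
    }
    return result
}
```

Treating a missing probability as "no label" rather than as zero keeps the case where classification was disabled distinct from the case where the face simply is not smiling.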

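Example of face contours

When face contour detection is enabled, you get a list of points for each facial feature that was detected. These points represent the shape of the feature. See the Face Detection Concepts Overview for details about how contours are represented.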
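Because a contour is just a list of points, ordinary geometry can be applied to it. As a minimal sketch, the code below computes the bounding box of a contour. `Point`, `Box`, and `boundingBox(of:)` are hypothetical names; the code assumes the contour's points have already been converted to plain `Double` coordinates.

```swift
/// Hypothetical plain-value point; assumes contour coordinates were
/// converted to Double beforehand.
struct Point {
    var x: Double
    var y: Double
}

/// Axis-aligned bounding box in image coordinates.
struct Box: Equatable {
    var minX: Double
    var minY: Double
    var maxX: Double
    var maxY: Double
}

/// Returns the smallest box containing every point, or nil for an empty contour.
func boundingBox(of points: [Point]) -> Box? {
    guard let first = points.first else { return nil }
    var box = Box(minX: first.x, minY: first.y, maxX: first.x, maxY: first.y)
    for p in points.dropFirst() {
        box.minX = min(box.minX, p.x)
        box.minY = min(box.minY, p.y)
        box.maxX = max(box.maxX, p.x)
        box.maxY = max(box.maxY, p.y)
    }
    return box
}
```

A box like this could be used, for instance, to crop the region around an eye or to position an overlay.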

The following image illustrates how these points map to a face:

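Real-time face detection

If you want to use face detection in a real-time application, follow these guidelines to achieve the best frame rates:

- Configure the face detector to use either face contour detection or classification and landmark detection, but not both. By that rule, the combinations fall into two groups.

  Suitable for real-time use:
  - Contour detection
  - Landmark detection
  - Classification
  - Landmark detection and classification

  Not suitable for real-time use:
  - Contour detection and landmark detection
  - Contour detection and classification
  - Contour detection, landmark detection, and classification
- Enable fast mode (enabled by default).
- Consider capturing images at a lower resolution. However, also keep in mind this API's image dimension requirements.
- Throttle calls to the detector. If a new video frame becomes available while the detector is running, drop the frame.
- If you are using the output of the detector to overlay graphics on the input image, first get the result from ML Kit, then render the image and the overlay in a single step. By doing so, you render to the display surface only once for each input frame. See the previewOverlayView and FIRDetectionOverlayView classes in the showcase sample app for an example.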
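The throttling advice above (drop frames that arrive while the detector is busy) can be implemented with a small busy flag. The following is a sketch under stated assumptions rather than code from the sample app: `FrameGate` is a hypothetical helper, and the commented usage assumes frames arrive in an AVCaptureVideoDataOutputSampleBufferDelegate callback.

```swift
import Foundation

/// Hypothetical helper that lets at most one frame be processed at a time.
final class FrameGate {
    private let lock = NSLock()
    private var busy = false

    /// Returns true if the caller may process this frame.
    /// Returns false if a previous frame is still in flight, so this one should be dropped.
    func tryBegin() -> Bool {
        lock.lock()
        defer { lock.unlock() }
        if busy { return false }
        busy = true
        return true
    }

    /// Call from the detector's completion handler, on success or failure.
    func end() {
        lock.lock()
        defer { lock.unlock() }
        busy = false
    }
}

// Intended use inside a capture callback (not executed here):
//   guard gate.tryBegin() else { return }   // drop the frame
//   faceDetector.process(image) { faces, error in
//       defer { gate.end() }
//       // handle faces
//   }
```

The lock is there because capture callbacks and completion handlers may run on different queues; if both are known to run on the same serial queue, a plain Boolean would be enough.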