Google is committed to advancing racial equity for Black communities. See how.
本頁面由 Cloud Translation API 翻譯而成。
Switch to English

在Android上使用Firebase ML識別圖像中的文本

您可以使用Firebase ML識別圖像中的文本。 Firebase ML既具有適用於識別圖像中的文本(例如路牌的文本)的通用API,也具有針對識別文檔文本進行了優化的API。

在你開始之前

  1. 如果尚未將Firebase添加到您的Android項目中
  2. 使用Firebase Android BoM ,在模塊(應用程序級)Gradle文件(通常為app/build.gradle )中聲明Firebase ML Vision Android庫的依賴app/build.gradle
    dependencies {
        // Import the BoM for the Firebase platform
        implementation platform('com.google.firebase:firebase-bom:26.3.0')
    
        // Declare the dependency for the Firebase ML Vision library
        // When using the BoM, you don't specify versions in Firebase library dependencies
        implementation 'com.google.firebase:firebase-ml-vision'
    }
    

    通過使用Firebase Android BoM ,您的應用將始終使用Firebase Android庫的兼容版本。

    (可選)使用BoM聲明Firebase庫依賴關係

    如果選擇不使用Firebase BoM,則必須在其依賴關係行中指定每個Firebase庫版本。

    請注意,如果您在應用中使用多個Firebase庫,我們強烈建議您使用BoM來管理庫版本,以確保所有版本兼容。

    dependencies {
        // Declare the dependency for the Firebase ML Vision library
        // When NOT using the BoM, you must specify versions in Firebase library dependencies
        implementation 'com.google.firebase:firebase-ml-vision:24.1.0'
    }
    
  3. 如果您尚未為項目啟用基於雲的API,請立即執行以下操作:

    1. 打開Firebase控制台的Firebase ML API頁面
    2. 如果尚未將項目升級到Blaze計劃,請單擊“升級” 。 (僅當您的項目不在Blaze計劃中時,系統才會提示您升級。)

      只有Blaze級別的項目才能使用基於雲的API。

    3. 如果尚未啟用基於雲的API,請點擊啟用基於雲的API

現在您可以開始識別圖像中的文本了。

輸入圖像準則

  • 為了使Firebase ML能夠準確識別文本,輸入圖像必須包含由足夠的像素數據表示的文本。理想情況下,對於拉丁文字,每個字符應至少為16x16像素。對於中文,日文和韓文文本,每個字符應為24x24像素。對於所有語言,大於24x24像素的字符通常沒有準確性。

    因此,例如,一張640x480的圖像可能會很好地掃描佔據該圖像整個寬度的名片。要掃描打印在letter尺寸紙張上的文檔,可能需要720x1280像素的圖像。

  • 圖像聚焦不良會損害文本識別的準確性。如果沒有得到滿意的結果,請嘗試要求用戶重新捕獲圖像。


識別圖像中的文字

要識別圖像中的文本,請按如下所述運行文本識別器。

1.運行文本識別器

要識別圖像中的文本,請從Bitmapmedia.ImageByteBuffer ,字節數組或設備上的文件創建FirebaseVisionImage對象。然後,將FirebaseVisionImage對像傳遞給FirebaseVisionTextRecognizerprocessImage方法。

  1. 從您的圖像創建一個FirebaseVisionImage對象。

    • 要從media.Image對象創建FirebaseVisionImage對象(例如,從設備的相機捕獲圖像時),請將media.Image對象和圖像的旋轉傳遞給FirebaseVisionImage.fromMediaImage()

      如果使用CameraX庫,則OnImageCapturedListenerImageAnalysis.Analyzer類將為您計算旋轉值,因此您只需要在調用FirebaseVisionImage.fromMediaImage()之前將旋轉值轉換為Firebase ML的ROTATION_常量之一即可:

      爪哇

      private class YourAnalyzer implements ImageAnalysis.Analyzer {
      
          private int degreesToFirebaseRotation(int degrees) {
              switch (degrees) {
                  case 0:
                      return FirebaseVisionImageMetadata.ROTATION_0;
                  case 90:
                      return FirebaseVisionImageMetadata.ROTATION_90;
                  case 180:
                      return FirebaseVisionImageMetadata.ROTATION_180;
                  case 270:
                      return FirebaseVisionImageMetadata.ROTATION_270;
                  default:
                      throw new IllegalArgumentException(
                              "Rotation must be 0, 90, 180, or 270.");
              }
          }
      
          @Override
          public void analyze(ImageProxy imageProxy, int degrees) {
              if (imageProxy == null || imageProxy.getImage() == null) {
                  return;
              }
              Image mediaImage = imageProxy.getImage();
              int rotation = degreesToFirebaseRotation(degrees);
              FirebaseVisionImage image =
                      FirebaseVisionImage.fromMediaImage(mediaImage, rotation);
              // Pass image to an ML Vision API
              // ...
          }
      }
      

      Kotlin + KTX

      private class YourImageAnalyzer : ImageAnalysis.Analyzer {
          private fun degreesToFirebaseRotation(degrees: Int): Int = when(degrees) {
              0 -> FirebaseVisionImageMetadata.ROTATION_0
              90 -> FirebaseVisionImageMetadata.ROTATION_90
              180 -> FirebaseVisionImageMetadata.ROTATION_180
              270 -> FirebaseVisionImageMetadata.ROTATION_270
              else -> throw Exception("Rotation must be 0, 90, 180, or 270.")
          }
      
          override fun analyze(imageProxy: ImageProxy?, degrees: Int) {
              val mediaImage = imageProxy?.image
              val imageRotation = degreesToFirebaseRotation(degrees)
              if (mediaImage != null) {
                  val image = FirebaseVisionImage.fromMediaImage(mediaImage, imageRotation)
                  // Pass image to an ML Vision API
                  // ...
              }
          }
      }
      

      如果不使用提供圖像旋轉角度的相機庫,則可以根據設備的旋轉角度和設備中相機傳感器的方向來計算圖像:

      爪哇

      private static final SparseIntArray ORIENTATIONS = new SparseIntArray();
      static {
          ORIENTATIONS.append(Surface.ROTATION_0, 90);
          ORIENTATIONS.append(Surface.ROTATION_90, 0);
          ORIENTATIONS.append(Surface.ROTATION_180, 270);
          ORIENTATIONS.append(Surface.ROTATION_270, 180);
      }
      
      /**
       * Get the angle by which an image must be rotated given the device's current
       * orientation.
       */
      @RequiresApi(api = Build.VERSION_CODES.LOLLIPOP)
      private int getRotationCompensation(String cameraId, Activity activity, Context context)
              throws CameraAccessException {
          // Get the device's current rotation relative to its "native" orientation.
          // Then, from the ORIENTATIONS table, look up the angle the image must be
          // rotated to compensate for the device's rotation.
          int deviceRotation = activity.getWindowManager().getDefaultDisplay().getRotation();
          int rotationCompensation = ORIENTATIONS.get(deviceRotation);
      
          // On most devices, the sensor orientation is 90 degrees, but for some
          // devices it is 270 degrees. For devices with a sensor orientation of
          // 270, rotate the image an additional 180 ((270 + 270) % 360) degrees.
          CameraManager cameraManager = (CameraManager) context.getSystemService(CAMERA_SERVICE);
          int sensorOrientation = cameraManager
                  .getCameraCharacteristics(cameraId)
                  .get(CameraCharacteristics.SENSOR_ORIENTATION);
          rotationCompensation = (rotationCompensation + sensorOrientation + 270) % 360;
      
          // Return the corresponding FirebaseVisionImageMetadata rotation value.
          int result;
          switch (rotationCompensation) {
              case 0:
                  result = FirebaseVisionImageMetadata.ROTATION_0;
                  break;
              case 90:
                  result = FirebaseVisionImageMetadata.ROTATION_90;
                  break;
              case 180:
                  result = FirebaseVisionImageMetadata.ROTATION_180;
                  break;
              case 270:
                  result = FirebaseVisionImageMetadata.ROTATION_270;
                  break;
              default:
                  result = FirebaseVisionImageMetadata.ROTATION_0;
                  Log.e(TAG, "Bad rotation value: " + rotationCompensation);
          }
          return result;
      }

      Kotlin + KTX

      private val ORIENTATIONS = SparseIntArray()
      
      init {
          ORIENTATIONS.append(Surface.ROTATION_0, 90)
          ORIENTATIONS.append(Surface.ROTATION_90, 0)
          ORIENTATIONS.append(Surface.ROTATION_180, 270)
          ORIENTATIONS.append(Surface.ROTATION_270, 180)
      }
      /**
       * Get the angle by which an image must be rotated given the device's current
       * orientation.
       */
      @RequiresApi(api = Build.VERSION_CODES.LOLLIPOP)
      @Throws(CameraAccessException::class)
      private fun getRotationCompensation(cameraId: String, activity: Activity, context: Context): Int {
          // Get the device's current rotation relative to its "native" orientation.
          // Then, from the ORIENTATIONS table, look up the angle the image must be
          // rotated to compensate for the device's rotation.
          val deviceRotation = activity.windowManager.defaultDisplay.rotation
          var rotationCompensation = ORIENTATIONS.get(deviceRotation)
      
          // On most devices, the sensor orientation is 90 degrees, but for some
          // devices it is 270 degrees. For devices with a sensor orientation of
          // 270, rotate the image an additional 180 ((270 + 270) % 360) degrees.
          val cameraManager = context.getSystemService(CAMERA_SERVICE) as CameraManager
          val sensorOrientation = cameraManager
                  .getCameraCharacteristics(cameraId)
                  .get(CameraCharacteristics.SENSOR_ORIENTATION)!!
          rotationCompensation = (rotationCompensation + sensorOrientation + 270) % 360
      
          // Return the corresponding FirebaseVisionImageMetadata rotation value.
          val result: Int
          when (rotationCompensation) {
              0 -> result = FirebaseVisionImageMetadata.ROTATION_0
              90 -> result = FirebaseVisionImageMetadata.ROTATION_90
              180 -> result = FirebaseVisionImageMetadata.ROTATION_180
              270 -> result = FirebaseVisionImageMetadata.ROTATION_270
              else -> {
                  result = FirebaseVisionImageMetadata.ROTATION_0
                  Log.e(TAG, "Bad rotation value: $rotationCompensation")
              }
          }
          return result
      }

      然後,將media.Image對象和旋轉值傳遞給FirebaseVisionImage.fromMediaImage()

      爪哇

      FirebaseVisionImage image = FirebaseVisionImage.fromMediaImage(mediaImage, rotation);

      Kotlin + KTX

      val image = FirebaseVisionImage.fromMediaImage(mediaImage, rotation)
    • 要從文件URI創建FirebaseVisionImage對象,請將應用上下文和文件URI傳遞給FirebaseVisionImage.fromFilePath() 。當您使用ACTION_GET_CONTENT意圖提示用戶從其圖庫應用中選擇圖像時,此功能很有用。

      爪哇

      FirebaseVisionImage image;
      try {
          image = FirebaseVisionImage.fromFilePath(context, uri);
      } catch (IOException e) {
          e.printStackTrace();
      }

      Kotlin + KTX

      val image: FirebaseVisionImage
      try {
          image = FirebaseVisionImage.fromFilePath(context, uri)
      } catch (e: IOException) {
          e.printStackTrace()
      }
    • 要從ByteBuffer或字節數組創建FirebaseVisionImage對象,請首先按照上述media.Image輸入計算圖像旋轉media.Image

      然後,創建一個FirebaseVisionImageMetadata對象,該對象包含圖像的高度,寬度,顏色編碼格式和旋轉:

      爪哇

      FirebaseVisionImageMetadata metadata = new FirebaseVisionImageMetadata.Builder()
              .setWidth(480)   // 480x360 is typically sufficient for
              .setHeight(360)  // image recognition
              .setFormat(FirebaseVisionImageMetadata.IMAGE_FORMAT_NV21)
              .setRotation(rotation)
              .build();

      Kotlin + KTX

      val metadata = FirebaseVisionImageMetadata.Builder()
              .setWidth(480) // 480x360 is typically sufficient for
              .setHeight(360) // image recognition
              .setFormat(FirebaseVisionImageMetadata.IMAGE_FORMAT_NV21)
              .setRotation(rotation)
              .build()

      使用緩衝區或數組以及元數據對象創建FirebaseVisionImage對象:

      爪哇

      FirebaseVisionImage image = FirebaseVisionImage.fromByteBuffer(buffer, metadata);
      // Or: FirebaseVisionImage image = FirebaseVisionImage.fromByteArray(byteArray, metadata);

      Kotlin + KTX

      val image = FirebaseVisionImage.fromByteBuffer(buffer, metadata)
      // Or: val image = FirebaseVisionImage.fromByteArray(byteArray, metadata)
    • 要從Bitmap對象創建FirebaseVisionImage對象,請執行以下操作:

      爪哇

      FirebaseVisionImage image = FirebaseVisionImage.fromBitmap(bitmap);

      Kotlin + KTX

      val image = FirebaseVisionImage.fromBitmap(bitmap)
      Bitmap對象表示的圖像必須是直立的,不需要額外旋轉。

  2. 獲取FirebaseVisionTextRecognizer的實例。

    爪哇

    FirebaseVisionTextRecognizer detector = FirebaseVision.getInstance()
            .getCloudTextRecognizer();
    // Or, to change the default settings:
    //   FirebaseVisionTextRecognizer detector = FirebaseVision.getInstance()
    //          .getCloudTextRecognizer(options);
    // Or, to provide language hints to assist with language detection:
    // See https://cloud.google.com/vision/docs/languages for supported languages
    FirebaseVisionCloudTextRecognizerOptions options = new FirebaseVisionCloudTextRecognizerOptions.Builder()
            .setLanguageHints(Arrays.asList("en", "hi"))
            .build();
    

    Kotlin + KTX

    val detector = FirebaseVision.getInstance().cloudTextRecognizer
    // Or, to change the default settings:
    // val detector = FirebaseVision.getInstance().getCloudTextRecognizer(options)
    // Or, to provide language hints to assist with language detection:
    // See https://cloud.google.com/vision/docs/languages for supported languages
    val options = FirebaseVisionCloudTextRecognizerOptions.Builder()
            .setLanguageHints(listOf("en", "hi"))
            .build()
    
  3. 最後,將圖像傳遞給processImage方法:

    爪哇

    Task<FirebaseVisionText> result =
            detector.processImage(image)
                    .addOnSuccessListener(new OnSuccessListener<FirebaseVisionText>() {
                        @Override
                        public void onSuccess(FirebaseVisionText firebaseVisionText) {
                            // Task completed successfully
                            // ...
                        }
                    })
                    .addOnFailureListener(
                            new OnFailureListener() {
                                @Override
                                public void onFailure(@NonNull Exception e) {
                                    // Task failed with an exception
                                    // ...
                                }
                            });

    Kotlin + KTX

    val result = detector.processImage(image)
            .addOnSuccessListener { firebaseVisionText ->
                // Task completed successfully
                // ...
            }
            .addOnFailureListener { e ->
                // Task failed with an exception
                // ...
            }

2.從識別的文本塊中提取文本

如果文本識別操作成功,則將FirebaseVisionText像傳遞給成功偵聽器。 FirebaseVisionText對象包含圖像中識別的全文以及零個或多個TextBlock對象。

每個TextBlock代表一個矩形文本塊,其中包含零個或多個Line對象。每個Line對象包含零個或多個Element對象,它們表示單詞和類似單詞的實體(日期,數字等)。

對於每個TextBlockLineElement對象,您可以獲得在區域中識別的文本以及該區域的邊界坐標。

例如:

爪哇

String resultText = result.getText();
for (FirebaseVisionText.TextBlock block: result.getTextBlocks()) {
    String blockText = block.getText();
    Float blockConfidence = block.getConfidence();
    List<RecognizedLanguage> blockLanguages = block.getRecognizedLanguages();
    Point[] blockCornerPoints = block.getCornerPoints();
    Rect blockFrame = block.getBoundingBox();
    for (FirebaseVisionText.Line line: block.getLines()) {
        String lineText = line.getText();
        Float lineConfidence = line.getConfidence();
        List<RecognizedLanguage> lineLanguages = line.getRecognizedLanguages();
        Point[] lineCornerPoints = line.getCornerPoints();
        Rect lineFrame = line.getBoundingBox();
        for (FirebaseVisionText.Element element: line.getElements()) {
            String elementText = element.getText();
            Float elementConfidence = element.getConfidence();
            List<RecognizedLanguage> elementLanguages = element.getRecognizedLanguages();
            Point[] elementCornerPoints = element.getCornerPoints();
            Rect elementFrame = element.getBoundingBox();
        }
    }
}

Kotlin + KTX

val resultText = result.text
for (block in result.textBlocks) {
    val blockText = block.text
    val blockConfidence = block.confidence
    val blockLanguages = block.recognizedLanguages
    val blockCornerPoints = block.cornerPoints
    val blockFrame = block.boundingBox
    for (line in block.lines) {
        val lineText = line.text
        val lineConfidence = line.confidence
        val lineLanguages = line.recognizedLanguages
        val lineCornerPoints = line.cornerPoints
        val lineFrame = line.boundingBox
        for (element in line.elements) {
            val elementText = element.text
            val elementConfidence = element.confidence
            val elementLanguages = element.recognizedLanguages
            val elementCornerPoints = element.cornerPoints
            val elementFrame = element.boundingBox
        }
    }
}

下一步


識別文檔圖像中的文本

要識別文檔的文本,請按如下所述配置並運行文檔文本識別器。

如下所述,文檔文本識別API提供了一個旨在更方便地處理文檔圖像的接口。但是,如果您喜歡FirebaseVisionTextRecognizer API提供的接口,則可以通過將雲文本識別器配置為使用密集文本模型來使用它來掃描文檔。

要使用文檔文本識別API,請執行以下操作:

1.運行文本識別器

要識別圖像中的文本,請從Bitmapmedia.ImageByteBuffer ,字節數組或設備上的文件創建FirebaseVisionImage對象。然後,將FirebaseVisionImage對像傳遞給FirebaseVisionDocumentTextRecognizerprocessImage方法。

  1. 從您的圖像創建一個FirebaseVisionImage對象。

    • 要從media.Image對象創建FirebaseVisionImage對象(例如,從設備的相機捕獲圖像時),請將media.Image對象和圖像的旋轉傳遞給FirebaseVisionImage.fromMediaImage()

      如果您使用CameraX庫,則OnImageCapturedListenerImageAnalysis.Analyzer類將為您計算旋轉值,因此您只需在調用FirebaseVisionImage.fromMediaImage()之前將旋轉值轉換為Firebase ML的ROTATION_常量之一即可:

      爪哇

      private class YourAnalyzer implements ImageAnalysis.Analyzer {
      
          private int degreesToFirebaseRotation(int degrees) {
              switch (degrees) {
                  case 0:
                      return FirebaseVisionImageMetadata.ROTATION_0;
                  case 90:
                      return FirebaseVisionImageMetadata.ROTATION_90;
                  case 180:
                      return FirebaseVisionImageMetadata.ROTATION_180;
                  case 270:
                      return FirebaseVisionImageMetadata.ROTATION_270;
                  default:
                      throw new IllegalArgumentException(
                              "Rotation must be 0, 90, 180, or 270.");
              }
          }
      
          @Override
          public void analyze(ImageProxy imageProxy, int degrees) {
              if (imageProxy == null || imageProxy.getImage() == null) {
                  return;
              }
              Image mediaImage = imageProxy.getImage();
              int rotation = degreesToFirebaseRotation(degrees);
              FirebaseVisionImage image =
                      FirebaseVisionImage.fromMediaImage(mediaImage, rotation);
              // Pass image to an ML Vision API
              // ...
          }
      }
      

      Kotlin + KTX

      private class YourImageAnalyzer : ImageAnalysis.Analyzer {
          private fun degreesToFirebaseRotation(degrees: Int): Int = when(degrees) {
              0 -> FirebaseVisionImageMetadata.ROTATION_0
              90 -> FirebaseVisionImageMetadata.ROTATION_90
              180 -> FirebaseVisionImageMetadata.ROTATION_180
              270 -> FirebaseVisionImageMetadata.ROTATION_270
              else -> throw Exception("Rotation must be 0, 90, 180, or 270.")
          }
      
          override fun analyze(imageProxy: ImageProxy?, degrees: Int) {
              val mediaImage = imageProxy?.image
              val imageRotation = degreesToFirebaseRotation(degrees)
              if (mediaImage != null) {
                  val image = FirebaseVisionImage.fromMediaImage(mediaImage, imageRotation)
                  // Pass image to an ML Vision API
                  // ...
              }
          }
      }
      

      如果不使用提供圖像旋轉角度的相機庫,則可以根據設備的旋轉角度和設備中相機傳感器的方向來計算圖像:

      爪哇

      private static final SparseIntArray ORIENTATIONS = new SparseIntArray();
      static {
          ORIENTATIONS.append(Surface.ROTATION_0, 90);
          ORIENTATIONS.append(Surface.ROTATION_90, 0);
          ORIENTATIONS.append(Surface.ROTATION_180, 270);
          ORIENTATIONS.append(Surface.ROTATION_270, 180);
      }
      
      /**
       * Get the angle by which an image must be rotated given the device's current
       * orientation.
       */
      @RequiresApi(api = Build.VERSION_CODES.LOLLIPOP)
      private int getRotationCompensation(String cameraId, Activity activity, Context context)
              throws CameraAccessException {
          // Get the device's current rotation relative to its "native" orientation.
          // Then, from the ORIENTATIONS table, look up the angle the image must be
          // rotated to compensate for the device's rotation.
          int deviceRotation = activity.getWindowManager().getDefaultDisplay().getRotation();
          int rotationCompensation = ORIENTATIONS.get(deviceRotation);
      
          // On most devices, the sensor orientation is 90 degrees, but for some
          // devices it is 270 degrees. For devices with a sensor orientation of
          // 270, rotate the image an additional 180 ((270 + 270) % 360) degrees.
          CameraManager cameraManager = (CameraManager) context.getSystemService(CAMERA_SERVICE);
          int sensorOrientation = cameraManager
                  .getCameraCharacteristics(cameraId)
                  .get(CameraCharacteristics.SENSOR_ORIENTATION);
          rotationCompensation = (rotationCompensation + sensorOrientation + 270) % 360;
      
          // Return the corresponding FirebaseVisionImageMetadata rotation value.
          int result;
          switch (rotationCompensation) {
              case 0:
                  result = FirebaseVisionImageMetadata.ROTATION_0;
                  break;
              case 90:
                  result = FirebaseVisionImageMetadata.ROTATION_90;
                  break;
              case 180:
                  result = FirebaseVisionImageMetadata.ROTATION_180;
                  break;
              case 270:
                  result = FirebaseVisionImageMetadata.ROTATION_270;
                  break;
              default:
                  result = FirebaseVisionImageMetadata.ROTATION_0;
                  Log.e(TAG, "Bad rotation value: " + rotationCompensation);
          }
          return result;
      }

      Kotlin + KTX

      private val ORIENTATIONS = SparseIntArray()
      
      init {
          ORIENTATIONS.append(Surface.ROTATION_0, 90)
          ORIENTATIONS.append(Surface.ROTATION_90, 0)
          ORIENTATIONS.append(Surface.ROTATION_180, 270)
          ORIENTATIONS.append(Surface.ROTATION_270, 180)
      }
      /**
       * Get the angle by which an image must be rotated given the device's current
       * orientation.
       */
      @RequiresApi(api = Build.VERSION_CODES.LOLLIPOP)
      @Throws(CameraAccessException::class)
      private fun getRotationCompensation(cameraId: String, activity: Activity, context: Context): Int {
          // Get the device's current rotation relative to its "native" orientation.
          // Then, from the ORIENTATIONS table, look up the angle the image must be
          // rotated to compensate for the device's rotation.
          val deviceRotation = activity.windowManager.defaultDisplay.rotation
          var rotationCompensation = ORIENTATIONS.get(deviceRotation)
      
          // On most devices, the sensor orientation is 90 degrees, but for some
          // devices it is 270 degrees. For devices with a sensor orientation of
          // 270, rotate the image an additional 180 ((270 + 270) % 360) degrees.
          val cameraManager = context.getSystemService(CAMERA_SERVICE) as CameraManager
          val sensorOrientation = cameraManager
                  .getCameraCharacteristics(cameraId)
                  .get(CameraCharacteristics.SENSOR_ORIENTATION)!!
          rotationCompensation = (rotationCompensation + sensorOrientation + 270) % 360
      
          // Return the corresponding FirebaseVisionImageMetadata rotation value.
          val result: Int
          when (rotationCompensation) {
              0 -> result = FirebaseVisionImageMetadata.ROTATION_0
              90 -> result = FirebaseVisionImageMetadata.ROTATION_90
              180 -> result = FirebaseVisionImageMetadata.ROTATION_180
              270 -> result = FirebaseVisionImageMetadata.ROTATION_270
              else -> {
                  result = FirebaseVisionImageMetadata.ROTATION_0
                  Log.e(TAG, "Bad rotation value: $rotationCompensation")
              }
          }
          return result
      }

      然後,將media.Image對象和旋轉值傳遞給FirebaseVisionImage.fromMediaImage()

      爪哇

      FirebaseVisionImage image = FirebaseVisionImage.fromMediaImage(mediaImage, rotation);

      Kotlin + KTX

      val image = FirebaseVisionImage.fromMediaImage(mediaImage, rotation)
    • 要從文件URI創建FirebaseVisionImage對象,請將應用上下文和文件URI傳遞給FirebaseVisionImage.fromFilePath() 。當您使用ACTION_GET_CONTENT意圖提示用戶從其圖庫應用中選擇圖像時,此功能很有用。

      爪哇

      FirebaseVisionImage image;
      try {
          image = FirebaseVisionImage.fromFilePath(context, uri);
      } catch (IOException e) {
          e.printStackTrace();
      }

      Kotlin + KTX

      val image: FirebaseVisionImage
      try {
          image = FirebaseVisionImage.fromFilePath(context, uri)
      } catch (e: IOException) {
          e.printStackTrace()
      }
    • 要從ByteBuffer或字節數組創建FirebaseVisionImage對象,請首先按照上述media.Image輸入計算圖像旋轉media.Image

      然後,創建一個FirebaseVisionImageMetadata對象,該對象包含圖像的高度,寬度,顏色編碼格式和旋轉:

      爪哇

      FirebaseVisionImageMetadata metadata = new FirebaseVisionImageMetadata.Builder()
              .setWidth(480)   // 480x360 is typically sufficient for
              .setHeight(360)  // image recognition
              .setFormat(FirebaseVisionImageMetadata.IMAGE_FORMAT_NV21)
              .setRotation(rotation)
              .build();

      Kotlin + KTX

      val metadata = FirebaseVisionImageMetadata.Builder()
              .setWidth(480) // 480x360 is typically sufficient for
              .setHeight(360) // image recognition
              .setFormat(FirebaseVisionImageMetadata.IMAGE_FORMAT_NV21)
              .setRotation(rotation)
              .build()

      使用緩衝區或數組以及元數據對象創建FirebaseVisionImage對象:

      爪哇

      FirebaseVisionImage image = FirebaseVisionImage.fromByteBuffer(buffer, metadata);
      // Or: FirebaseVisionImage image = FirebaseVisionImage.fromByteArray(byteArray, metadata);

      Kotlin + KTX

      val image = FirebaseVisionImage.fromByteBuffer(buffer, metadata)
      // Or: val image = FirebaseVisionImage.fromByteArray(byteArray, metadata)
    • 要從Bitmap對象創建FirebaseVisionImage對象,請執行以下操作:

      爪哇

      FirebaseVisionImage image = FirebaseVisionImage.fromBitmap(bitmap);

      Kotlin + KTX

      val image = FirebaseVisionImage.fromBitmap(bitmap)
      Bitmap對象表示的圖像必須是直立的,不需要額外旋轉。

  2. 獲取FirebaseVisionDocumentTextRecognizer的實例:

    爪哇

    FirebaseVisionDocumentTextRecognizer detector = FirebaseVision.getInstance()
            .getCloudDocumentTextRecognizer();
    // Or, to provide language hints to assist with language detection:
    // See https://cloud.google.com/vision/docs/languages for supported languages
    FirebaseVisionCloudDocumentRecognizerOptions options =
            new FirebaseVisionCloudDocumentRecognizerOptions.Builder()
                    .setLanguageHints(Arrays.asList("en", "hi"))
                    .build();
    FirebaseVisionDocumentTextRecognizer detector = FirebaseVision.getInstance()
            .getCloudDocumentTextRecognizer(options);

    Kotlin + KTX

    val detector = FirebaseVision.getInstance()
            .cloudDocumentTextRecognizer
    // Or, to provide language hints to assist with language detection:
    // See https://cloud.google.com/vision/docs/languages for supported languages
    val options = FirebaseVisionCloudDocumentRecognizerOptions.Builder()
            .setLanguageHints(listOf("en", "hi"))
            .build()
    val detector = FirebaseVision.getInstance()
            .getCloudDocumentTextRecognizer(options)

  3. 最後,將圖像傳遞給processImage方法:

    爪哇

    detector.processImage(myImage)
            .addOnSuccessListener(new OnSuccessListener<FirebaseVisionDocumentText>() {
                @Override
                public void onSuccess(FirebaseVisionDocumentText result) {
                    // Task completed successfully
                    // ...
                }
            })
            .addOnFailureListener(new OnFailureListener() {
                @Override
                public void onFailure(@NonNull Exception e) {
                    // Task failed with an exception
                    // ...
                }
            });

    Kotlin + KTX

    detector.processImage(myImage)
            .addOnSuccessListener { firebaseVisionDocumentText ->
                // Task completed successfully
                // ...
            }
            .addOnFailureListener { e ->
                // Task failed with an exception
                // ...
            }

2.從識別的文本塊中提取文本

如果文本識別操作成功,它將返回FirebaseVisionDocumentText對象。 FirebaseVisionDocumentText對象包含圖像中識別的全文本以及反映已識別文檔結構的對象層次結構:

對於每個BlockParagraphWordSymbol對象,您可以獲取在區域中識別的文本以及該區域的邊界坐標。

例如:

爪哇

String resultText = result.getText();
for (FirebaseVisionDocumentText.Block block: result.getBlocks()) {
    String blockText = block.getText();
    Float blockConfidence = block.getConfidence();
    List<RecognizedLanguage> blockRecognizedLanguages = block.getRecognizedLanguages();
    Rect blockFrame = block.getBoundingBox();
    for (FirebaseVisionDocumentText.Paragraph paragraph: block.getParagraphs()) {
        String paragraphText = paragraph.getText();
        Float paragraphConfidence = paragraph.getConfidence();
        List<RecognizedLanguage> paragraphRecognizedLanguages = paragraph.getRecognizedLanguages();
        Rect paragraphFrame = paragraph.getBoundingBox();
        for (FirebaseVisionDocumentText.Word word: paragraph.getWords()) {
            String wordText = word.getText();
            Float wordConfidence = word.getConfidence();
            List<RecognizedLanguage> wordRecognizedLanguages = word.getRecognizedLanguages();
            Rect wordFrame = word.getBoundingBox();
            for (FirebaseVisionDocumentText.Symbol symbol: word.getSymbols()) {
                String symbolText = symbol.getText();
                Float symbolConfidence = symbol.getConfidence();
                List<RecognizedLanguage> symbolRecognizedLanguages = symbol.getRecognizedLanguages();
                Rect symbolFrame = symbol.getBoundingBox();
            }
        }
    }
}

Kotlin + KTX

val resultText = result.text
for (block in result.blocks) {
    val blockText = block.text
    val blockConfidence = block.confidence
    val blockRecognizedLanguages = block.recognizedLanguages
    val blockFrame = block.boundingBox
    for (paragraph in block.paragraphs) {
        val paragraphText = paragraph.text
        val paragraphConfidence = paragraph.confidence
        val paragraphRecognizedLanguages = paragraph.recognizedLanguages
        val paragraphFrame = paragraph.boundingBox
        for (word in paragraph.words) {
            val wordText = word.text
            val wordConfidence = word.confidence
            val wordRecognizedLanguages = word.recognizedLanguages
            val wordFrame = word.boundingBox
            for (symbol in word.symbols) {
                val symbolText = symbol.text
                val symbolConfidence = symbol.confidence
                val symbolRecognizedLanguages = symbol.recognizedLanguages
                val symbolFrame = symbol.boundingBox
            }
        }
    }
}

下一步