在 Android 上使用 ML Kit 辨識圖片中的文字

您可以使用 ML Kit 辨識圖片中的文字，ML Kit 提供一般用途的 API，適合辨識圖片中的文字 (例如路標上的文字)，也提供專為辨識文件文字而最佳化的 API。一般用途 API 包含裝置端和雲端模型。文件文字辨識功能僅提供雲端模型。如要比較雲端和裝置端模型，請參閱總覽。

事前準備

如果您尚未將 Firebase 新增至 Android 專案，請先新增。

將 ML Kit Android 程式庫的依附元件新增至模組 (應用程式層級) Gradle 檔案 (通常為 app/build.gradle)：

apply plugin: 'com.android.application'
apply plugin: 'com.google.gms.google-services'

dependencies {
  // ...

  implementation 'com.google.firebase:firebase-ml-vision:24.0.3'
}

選用但建議採用：如果您使用裝置端 API，請將應用程式設為在從 Play 商店安裝後，自動將 ML 模型下載至裝置。
如要這麼做，請在應用程式的 AndroidManifest.xml 檔案中新增下列宣告：
```
<application ...>
  ...
  <meta-data
      android:name="com.google.firebase.ml.vision.DEPENDENCIES"
      android:value="ocr" />
  
</application>
```
如果您未啟用安裝時模型下載功能，系統會在您首次執行裝置端偵測器時下載模型。如果下載尚未完成，您提出的要求將不會產生任何結果。
如要使用雲端型模型，但尚未為專案啟用雲端型 API，請立即啟用：
1. 開啟 Firebase 控制台的 ML Kit API 頁面。
2. 如果尚未將專案升級至 Blaze 定價方案，請按一下「升級」。系統只會在專案未採用 Blaze 方案時，提示您升級。
  
  只有 Blaze 級別的專案才能使用雲端 API。
3. 如果尚未啟用雲端 API，請按一下「啟用雲端 API」。
在正式環境中部署使用 Cloud API 的應用程式之前，請先採取幾個額外步驟，防範未經授權的 API 存取活動，並減輕其影響。

如要只使用裝置端模型，可以略過這個步驟。

現在可以開始辨識圖片中的文字。

輸入圖片規範

如要讓 ML Kit 準確辨識文字，輸入圖片必須包含以足夠像素資料呈現的文字。在理想情況下，拉丁文字的每個字元至少應為 16x16 像素。如果是中文、日文和韓文文字 (僅雲端 API 支援)，每個字元應為 24x24 像素。一般而言，無論使用哪種語言，字元大於 24x24 像素對準確度沒有幫助。

舉例來說，如果名片佔滿圖片寬度，640x480 的圖片可能就非常適合掃描名片。如要掃描印在 Letter 尺寸紙張上的文件，可能需要 720x1280 像素的圖片。
如果圖片對焦不佳，可能會影響文字辨識準確度。如果結果不盡理想，請要求使用者重新拍攝圖片。
如果您要在即時應用程式中辨識文字，可能也需要考量輸入圖片的整體尺寸。系統處理較小的圖片時速度較快，因此為了減少延遲，請以較低的解析度擷取圖片 (請注意上述準確度規定)，並確保文字盡可能占滿圖片。另請參閱「改善即時成效的訣竅」。

辨識圖片中的文字

如要使用裝置端或雲端模型辨識圖片中的文字，請按照下列說明執行文字辨識器。

1. 執行文字辨識器

如要辨識圖片中的文字，請從 Bitmap、media.Image、ByteBuffer、位元組陣列或裝置上的檔案建立 FirebaseVisionImage 物件。然後，將 FirebaseVisionImage 物件傳遞至 FirebaseVisionTextRecognizer 的 processImage 方法。

從圖片建立 FirebaseVisionImage 物件。

如要從 media.Image 物件建立 FirebaseVisionImage 物件 (例如從裝置的相機擷取圖片時)，請將 media.Image 物件和圖片的旋轉角度傳遞至 FirebaseVisionImage.fromMediaImage()。

如果您使用 CameraX 程式庫，OnImageCapturedListener 和 ImageAnalysis.Analyzer 類別會為您計算旋轉值，因此您只需在呼叫 FirebaseVisionImage.fromMediaImage() 之前，將旋轉值轉換為 ML Kit 的 ROTATION_ 常數之一：

Java

private class YourAnalyzer implements ImageAnalysis.Analyzer {

    private int degreesToFirebaseRotation(int degrees) {
        switch (degrees) {
            case 0:
                return FirebaseVisionImageMetadata.ROTATION_0;
            case 90:
                return FirebaseVisionImageMetadata.ROTATION_90;
            case 180:
                return FirebaseVisionImageMetadata.ROTATION_180;
            case 270:
                return FirebaseVisionImageMetadata.ROTATION_270;
            default:
                throw new IllegalArgumentException(
                        "Rotation must be 0, 90, 180, or 270.");
        }
    }

    @Override
    public void analyze(ImageProxy imageProxy, int degrees) {
        if (imageProxy == null || imageProxy.getImage() == null) {
            return;
        }
        Image mediaImage = imageProxy.getImage();
        int rotation = degreesToFirebaseRotation(degrees);
        FirebaseVisionImage image =
                FirebaseVisionImage.fromMediaImage(mediaImage, rotation);
        // Pass image to an ML Kit Vision API
        // ...
    }
}

Kotlin

private class YourImageAnalyzer : ImageAnalysis.Analyzer {
    private fun degreesToFirebaseRotation(degrees: Int): Int = when(degrees) {
        0 -> FirebaseVisionImageMetadata.ROTATION_0
        90 -> FirebaseVisionImageMetadata.ROTATION_90
        180 -> FirebaseVisionImageMetadata.ROTATION_180
        270 -> FirebaseVisionImageMetadata.ROTATION_270
        else -> throw Exception("Rotation must be 0, 90, 180, or 270.")
    }

    override fun analyze(imageProxy: ImageProxy?, degrees: Int) {
        val mediaImage = imageProxy?.image
        val imageRotation = degreesToFirebaseRotation(degrees)
        if (mediaImage != null) {
            val image = FirebaseVisionImage.fromMediaImage(mediaImage, imageRotation)
            // Pass image to an ML Kit Vision API
            // ...
        }
    }
}

如果您使用的相機程式庫未提供圖片的旋轉角度，可以根據裝置的旋轉角度和裝置中相機感應器的方向計算：

Java

private static final SparseIntArray ORIENTATIONS = new SparseIntArray();
static {
    ORIENTATIONS.append(Surface.ROTATION_0, 90);
    ORIENTATIONS.append(Surface.ROTATION_90, 0);
    ORIENTATIONS.append(Surface.ROTATION_180, 270);
    ORIENTATIONS.append(Surface.ROTATION_270, 180);
}

/**
 * Get the angle by which an image must be rotated given the device's current
 * orientation.
 */
@RequiresApi(api = Build.VERSION_CODES.LOLLIPOP)
private int getRotationCompensation(String cameraId, Activity activity, Context context)
        throws CameraAccessException {
    // Get the device's current rotation relative to its "native" orientation.
    // Then, from the ORIENTATIONS table, look up the angle the image must be
    // rotated to compensate for the device's rotation.
    int deviceRotation = activity.getWindowManager().getDefaultDisplay().getRotation();
    int rotationCompensation = ORIENTATIONS.get(deviceRotation);

    // On most devices, the sensor orientation is 90 degrees, but for some
    // devices it is 270 degrees. For devices with a sensor orientation of
    // 270, rotate the image an additional 180 ((270 + 270) % 360) degrees.
    CameraManager cameraManager = (CameraManager) context.getSystemService(CAMERA_SERVICE);
    int sensorOrientation = cameraManager
            .getCameraCharacteristics(cameraId)
            .get(CameraCharacteristics.SENSOR_ORIENTATION);
    rotationCompensation = (rotationCompensation + sensorOrientation + 270) % 360;

    // Return the corresponding FirebaseVisionImageMetadata rotation value.
    int result;
    switch (rotationCompensation) {
        case 0:
            result = FirebaseVisionImageMetadata.ROTATION_0;
            break;
        case 90:
            result = FirebaseVisionImageMetadata.ROTATION_90;
            break;
        case 180:
            result = FirebaseVisionImageMetadata.ROTATION_180;
            break;
        case 270:
            result = FirebaseVisionImageMetadata.ROTATION_270;
            break;
        default:
            result = FirebaseVisionImageMetadata.ROTATION_0;
            Log.e(TAG, "Bad rotation value: " + rotationCompensation);
    }
    return result;
}VisionImage.java

Kotlin

private val ORIENTATIONS = SparseIntArray()

init {
    ORIENTATIONS.append(Surface.ROTATION_0, 90)
    ORIENTATIONS.append(Surface.ROTATION_90, 0)
    ORIENTATIONS.append(Surface.ROTATION_180, 270)
    ORIENTATIONS.append(Surface.ROTATION_270, 180)
}
/**
 * Get the angle by which an image must be rotated given the device's current
 * orientation.
 */
@RequiresApi(api = Build.VERSION_CODES.LOLLIPOP)
@Throws(CameraAccessException::class)
private fun getRotationCompensation(cameraId: String, activity: Activity, context: Context): Int {
    // Get the device's current rotation relative to its "native" orientation.
    // Then, from the ORIENTATIONS table, look up the angle the image must be
    // rotated to compensate for the device's rotation.
    val deviceRotation = activity.windowManager.defaultDisplay.rotation
    var rotationCompensation = ORIENTATIONS.get(deviceRotation)

    // On most devices, the sensor orientation is 90 degrees, but for some
    // devices it is 270 degrees. For devices with a sensor orientation of
    // 270, rotate the image an additional 180 ((270 + 270) % 360) degrees.
    val cameraManager = context.getSystemService(CAMERA_SERVICE) as CameraManager
    val sensorOrientation = cameraManager
            .getCameraCharacteristics(cameraId)
            .get(CameraCharacteristics.SENSOR_ORIENTATION)!!
    rotationCompensation = (rotationCompensation + sensorOrientation + 270) % 360

    // Return the corresponding FirebaseVisionImageMetadata rotation value.
    val result: Int
    when (rotationCompensation) {
        0 -> result = FirebaseVisionImageMetadata.ROTATION_0
        90 -> result = FirebaseVisionImageMetadata.ROTATION_90
        180 -> result = FirebaseVisionImageMetadata.ROTATION_180
        270 -> result = FirebaseVisionImageMetadata.ROTATION_270
        else -> {
            result = FirebaseVisionImageMetadata.ROTATION_0
            Log.e(TAG, "Bad rotation value: $rotationCompensation")
        }
    }
    return result
}VisionImage.kt

接著，將 media.Image 物件和旋轉值傳遞至 FirebaseVisionImage.fromMediaImage()：

Java

FirebaseVisionImage image = FirebaseVisionImage.fromMediaImage(mediaImage, rotation);VisionImage.java

Kotlin

val image = FirebaseVisionImage.fromMediaImage(mediaImage, rotation)VisionImage.kt

如要從檔案 URI 建立 FirebaseVisionImage 物件，請將應用程式內容和檔案 URI 傳遞至 FirebaseVisionImage.fromFilePath()。當您使用 ACTION_GET_CONTENT 意圖提示使用者從相簿應用程式選取圖片時，這項功能就非常實用。

Java

FirebaseVisionImage image;
try {
    image = FirebaseVisionImage.fromFilePath(context, uri);
} catch (IOException e) {
    e.printStackTrace();
}VisionImage.java

Kotlin

val image: FirebaseVisionImage
try {
    image = FirebaseVisionImage.fromFilePath(context, uri)
} catch (e: IOException) {
    e.printStackTrace()
}VisionImage.kt

如要從 ByteBuffer 或位元組陣列建立 FirebaseVisionImage 物件，請先計算圖片旋轉角度，如上文所述，以做為 media.Image 輸入內容。

接著，建立 FirebaseVisionImageMetadata 物件，其中包含圖片的高度、寬度、色彩編碼格式和旋轉角度：

Java

FirebaseVisionImageMetadata metadata = new FirebaseVisionImageMetadata.Builder()
        .setWidth(480)   // 480x360 is typically sufficient for
        .setHeight(360)  // image recognition
        .setFormat(FirebaseVisionImageMetadata.IMAGE_FORMAT_NV21)
        .setRotation(rotation)
        .build();VisionImage.java

Kotlin

val metadata = FirebaseVisionImageMetadata.Builder()
        .setWidth(480) // 480x360 is typically sufficient for
        .setHeight(360) // image recognition
        .setFormat(FirebaseVisionImageMetadata.IMAGE_FORMAT_NV21)
        .setRotation(rotation)
        .build()VisionImage.kt

使用緩衝區或陣列，以及中繼資料物件，建立 FirebaseVisionImage 物件：

Java

FirebaseVisionImage image = FirebaseVisionImage.fromByteBuffer(buffer, metadata);
// Or: FirebaseVisionImage image = FirebaseVisionImage.fromByteArray(byteArray, metadata);VisionImage.java

Kotlin

val image = FirebaseVisionImage.fromByteBuffer(buffer, metadata)
// Or: val image = FirebaseVisionImage.fromByteArray(byteArray, metadata)VisionImage.kt

如要從 Bitmap 物件建立 FirebaseVisionImage 物件，請執行下列操作：

Java

FirebaseVisionImage image = FirebaseVisionImage.fromBitmap(bitmap);VisionImage.java

Kotlin

val image = FirebaseVisionImage.fromBitmap(bitmap)VisionImage.kt

Bitmap 物件代表的圖片必須直立，不需額外旋轉。

取得 FirebaseVisionTextRecognizer 的例項。

如要使用裝置端模型，請按照下列步驟操作：

Java

FirebaseVisionTextRecognizer detector = FirebaseVision.getInstance()
        .getOnDeviceTextRecognizer();

Kotlin

val detector = FirebaseVision.getInstance()
        .onDeviceTextRecognizer

如要使用雲端式模型，請按照下列步驟操作：

Java

FirebaseVisionTextRecognizer detector = FirebaseVision.getInstance()
        .getCloudTextRecognizer();
// Or, to change the default settings:
//   FirebaseVisionTextRecognizer detector = FirebaseVision.getInstance()
//          .getCloudTextRecognizer(options);

// Or, to provide language hints to assist with language detection:
// See https://cloud.google.com/vision/docs/languages for supported languages
FirebaseVisionCloudTextRecognizerOptions options = new FirebaseVisionCloudTextRecognizerOptions.Builder()
        .setLanguageHints(Arrays.asList("en", "hi"))
        .build();

Kotlin

val detector = FirebaseVision.getInstance().cloudTextRecognizer
// Or, to change the default settings:
// val detector = FirebaseVision.getInstance().getCloudTextRecognizer(options)

// Or, to provide language hints to assist with language detection:
// See https://cloud.google.com/vision/docs/languages for supported languages
val options = FirebaseVisionCloudTextRecognizerOptions.Builder()
        .setLanguageHints(listOf("en", "hi"))
        .build()

最後，將圖片傳遞至 processImage 方法：

Java

Task<FirebaseVisionText> result =
        detector.processImage(image)
                .addOnSuccessListener(new OnSuccessListener<FirebaseVisionText>() {
                    @Override
                    public void onSuccess(FirebaseVisionText firebaseVisionText) {
                        // Task completed successfully
                        // ...
                    }
                })
                .addOnFailureListener(
                        new OnFailureListener() {
                            @Override
                            public void onFailure(@NonNull Exception e) {
                                // Task failed with an exception
                                // ...
                            }
                        });

Kotlin

val result = detector.processImage(image)
        .addOnSuccessListener { firebaseVisionText ->
            // Task completed successfully
            // ...
        }
        .addOnFailureListener { e ->
            // Task failed with an exception
            // ...
        }

2. 從辨識出的文字區塊擷取文字

如果文字辨識作業成功，系統會將 FirebaseVisionText 物件傳遞至成功事件監聽器。FirebaseVisionText 物件包含圖片中辨識到的完整文字，以及零或多個 TextBlock 物件。

每個 TextBlock 代表一個矩形文字區塊，其中包含零或多個 Line 物件。每個 Line 物件都包含零個或多個 Element 物件，代表字詞和類似字詞的實體 (日期、數字等)。

針對每個 TextBlock、Line 和 Element 物件，您可以取得該區域中辨識的文字，以及該區域的邊界座標。

例如：

Java

String resultText = result.getText();
for (FirebaseVisionText.TextBlock block: result.getTextBlocks()) {
    String blockText = block.getText();
    Float blockConfidence = block.getConfidence();
    List<RecognizedLanguage> blockLanguages = block.getRecognizedLanguages();
    Point[] blockCornerPoints = block.getCornerPoints();
    Rect blockFrame = block.getBoundingBox();
    for (FirebaseVisionText.Line line: block.getLines()) {
        String lineText = line.getText();
        Float lineConfidence = line.getConfidence();
        List<RecognizedLanguage> lineLanguages = line.getRecognizedLanguages();
        Point[] lineCornerPoints = line.getCornerPoints();
        Rect lineFrame = line.getBoundingBox();
        for (FirebaseVisionText.Element element: line.getElements()) {
            String elementText = element.getText();
            Float elementConfidence = element.getConfidence();
            List<RecognizedLanguage> elementLanguages = element.getRecognizedLanguages();
            Point[] elementCornerPoints = element.getCornerPoints();
            Rect elementFrame = element.getBoundingBox();
        }
    }
}

Kotlin

val resultText = result.text
for (block in result.textBlocks) {
    val blockText = block.text
    val blockConfidence = block.confidence
    val blockLanguages = block.recognizedLanguages
    val blockCornerPoints = block.cornerPoints
    val blockFrame = block.boundingBox
    for (line in block.lines) {
        val lineText = line.text
        val lineConfidence = line.confidence
        val lineLanguages = line.recognizedLanguages
        val lineCornerPoints = line.cornerPoints
        val lineFrame = line.boundingBox
        for (element in line.elements) {
            val elementText = element.text
            val elementConfidence = element.confidence
            val elementLanguages = element.recognizedLanguages
            val elementCornerPoints = element.cornerPoints
            val elementFrame = element.boundingBox
        }
    }
}

提升即時成效的訣竅

如要在即時應用程式中使用裝置端模型辨識文字，請按照下列準則操作，以達到最佳影格速率：

限制對文字辨識工具的呼叫次數。如果文字辨識器執行時有新的視訊影格可用，請捨棄該影格。
如果您要使用文字辨識器的輸出內容，在輸入圖片上疊加圖像，請先從 ML Kit 取得結果，然後在單一步驟中算繪圖片並疊加圖像。這樣做的話，每個輸入影格只會轉譯到顯示表面一次。
如果您使用 Camera2 API，請以 ImageFormat.YUV_420_888 格式擷取圖片。

如果使用舊版 Camera API，請以 ImageFormat.NV21 格式擷取圖片。
建議您以較低的解析度拍攝圖片。但請注意，這個 API 的圖片尺寸也有相關規定。

後續步驟

在正式環境中部署使用 Cloud API 的應用程式之前，請先採取幾個額外步驟，防範未經授權的 API 存取活動，並減輕其影響。

辨識文件圖片中的文字

如要辨識文件中的文字，請按照下列說明設定及執行雲端文件文字辨識器。

下文所述的文件文字辨識 API 提供介面，可更輕鬆地處理文件圖片。不過，如果您偏好 FirebaseVisionTextRecognizer API 提供的介面，可以改用該介面掃描文件，方法是將雲端文字辨識器設定為使用密集文字模型。

如要使用文件文字辨識 API，請按照下列指示操作：

1. 執行文字辨識器

如要辨識圖片中的文字，請從 Bitmap、media.Image、ByteBuffer、位元組陣列或裝置上的檔案建立 FirebaseVisionImage 物件。然後，將 FirebaseVisionImage 物件傳遞至 FirebaseVisionDocumentTextRecognizer 的 processImage 方法。

從圖片建立 FirebaseVisionImage 物件。

Java

private class YourAnalyzer implements ImageAnalysis.Analyzer {

    private int degreesToFirebaseRotation(int degrees) {
        switch (degrees) {
            case 0:
                return FirebaseVisionImageMetadata.ROTATION_0;
            case 90:
                return FirebaseVisionImageMetadata.ROTATION_90;
            case 180:
                return FirebaseVisionImageMetadata.ROTATION_180;
            case 270:
                return FirebaseVisionImageMetadata.ROTATION_270;
            default:
                throw new IllegalArgumentException(
                        "Rotation must be 0, 90, 180, or 270.");
        }
    }

    @Override
    public void analyze(ImageProxy imageProxy, int degrees) {
        if (imageProxy == null || imageProxy.getImage() == null) {
            return;
        }
        Image mediaImage = imageProxy.getImage();
        int rotation = degreesToFirebaseRotation(degrees);
        FirebaseVisionImage image =
                FirebaseVisionImage.fromMediaImage(mediaImage, rotation);
        // Pass image to an ML Kit Vision API
        // ...
    }
}

Kotlin

private class YourImageAnalyzer : ImageAnalysis.Analyzer {
    private fun degreesToFirebaseRotation(degrees: Int): Int = when(degrees) {
        0 -> FirebaseVisionImageMetadata.ROTATION_0
        90 -> FirebaseVisionImageMetadata.ROTATION_90
        180 -> FirebaseVisionImageMetadata.ROTATION_180
        270 -> FirebaseVisionImageMetadata.ROTATION_270
        else -> throw Exception("Rotation must be 0, 90, 180, or 270.")
    }

    override fun analyze(imageProxy: ImageProxy?, degrees: Int) {
        val mediaImage = imageProxy?.image
        val imageRotation = degreesToFirebaseRotation(degrees)
        if (mediaImage != null) {
            val image = FirebaseVisionImage.fromMediaImage(mediaImage, imageRotation)
            // Pass image to an ML Kit Vision API
            // ...
        }
    }
}

如果您使用的相機程式庫未提供圖片的旋轉角度，可以根據裝置的旋轉角度和裝置中相機感應器的方向計算：

Java

private static final SparseIntArray ORIENTATIONS = new SparseIntArray();
static {
    ORIENTATIONS.append(Surface.ROTATION_0, 90);
    ORIENTATIONS.append(Surface.ROTATION_90, 0);
    ORIENTATIONS.append(Surface.ROTATION_180, 270);
    ORIENTATIONS.append(Surface.ROTATION_270, 180);
}

/**
 * Get the angle by which an image must be rotated given the device's current
 * orientation.
 */
@RequiresApi(api = Build.VERSION_CODES.LOLLIPOP)
private int getRotationCompensation(String cameraId, Activity activity, Context context)
        throws CameraAccessException {
    // Get the device's current rotation relative to its "native" orientation.
    // Then, from the ORIENTATIONS table, look up the angle the image must be
    // rotated to compensate for the device's rotation.
    int deviceRotation = activity.getWindowManager().getDefaultDisplay().getRotation();
    int rotationCompensation = ORIENTATIONS.get(deviceRotation);

    // On most devices, the sensor orientation is 90 degrees, but for some
    // devices it is 270 degrees. For devices with a sensor orientation of
    // 270, rotate the image an additional 180 ((270 + 270) % 360) degrees.
    CameraManager cameraManager = (CameraManager) context.getSystemService(CAMERA_SERVICE);
    int sensorOrientation = cameraManager
            .getCameraCharacteristics(cameraId)
            .get(CameraCharacteristics.SENSOR_ORIENTATION);
    rotationCompensation = (rotationCompensation + sensorOrientation + 270) % 360;

    // Return the corresponding FirebaseVisionImageMetadata rotation value.
    int result;
    switch (rotationCompensation) {
        case 0:
            result = FirebaseVisionImageMetadata.ROTATION_0;
            break;
        case 90:
            result = FirebaseVisionImageMetadata.ROTATION_90;
            break;
        case 180:
            result = FirebaseVisionImageMetadata.ROTATION_180;
            break;
        case 270:
            result = FirebaseVisionImageMetadata.ROTATION_270;
            break;
        default:
            result = FirebaseVisionImageMetadata.ROTATION_0;
            Log.e(TAG, "Bad rotation value: " + rotationCompensation);
    }
    return result;
}VisionImage.java

Kotlin

private val ORIENTATIONS = SparseIntArray()

init {
    ORIENTATIONS.append(Surface.ROTATION_0, 90)
    ORIENTATIONS.append(Surface.ROTATION_90, 0)
    ORIENTATIONS.append(Surface.ROTATION_180, 270)
    ORIENTATIONS.append(Surface.ROTATION_270, 180)
}
/**
 * Get the angle by which an image must be rotated given the device's current
 * orientation.
 */
@RequiresApi(api = Build.VERSION_CODES.LOLLIPOP)
@Throws(CameraAccessException::class)
private fun getRotationCompensation(cameraId: String, activity: Activity, context: Context): Int {
    // Get the device's current rotation relative to its "native" orientation.
    // Then, from the ORIENTATIONS table, look up the angle the image must be
    // rotated to compensate for the device's rotation.
    val deviceRotation = activity.windowManager.defaultDisplay.rotation
    var rotationCompensation = ORIENTATIONS.get(deviceRotation)

    // On most devices, the sensor orientation is 90 degrees, but for some
    // devices it is 270 degrees. For devices with a sensor orientation of
    // 270, rotate the image an additional 180 ((270 + 270) % 360) degrees.
    val cameraManager = context.getSystemService(CAMERA_SERVICE) as CameraManager
    val sensorOrientation = cameraManager
            .getCameraCharacteristics(cameraId)
            .get(CameraCharacteristics.SENSOR_ORIENTATION)!!
    rotationCompensation = (rotationCompensation + sensorOrientation + 270) % 360

    // Return the corresponding FirebaseVisionImageMetadata rotation value.
    val result: Int
    when (rotationCompensation) {
        0 -> result = FirebaseVisionImageMetadata.ROTATION_0
        90 -> result = FirebaseVisionImageMetadata.ROTATION_90
        180 -> result = FirebaseVisionImageMetadata.ROTATION_180
        270 -> result = FirebaseVisionImageMetadata.ROTATION_270
        else -> {
            result = FirebaseVisionImageMetadata.ROTATION_0
            Log.e(TAG, "Bad rotation value: $rotationCompensation")
        }
    }
    return result
}VisionImage.kt

接著，將 media.Image 物件和旋轉值傳遞至 FirebaseVisionImage.fromMediaImage()：

Java

FirebaseVisionImage image = FirebaseVisionImage.fromMediaImage(mediaImage, rotation);VisionImage.java

Kotlin

val image = FirebaseVisionImage.fromMediaImage(mediaImage, rotation)VisionImage.kt

Java

FirebaseVisionImage image;
try {
    image = FirebaseVisionImage.fromFilePath(context, uri);
} catch (IOException e) {
    e.printStackTrace();
}VisionImage.java

Kotlin

val image: FirebaseVisionImage
try {
    image = FirebaseVisionImage.fromFilePath(context, uri)
} catch (e: IOException) {
    e.printStackTrace()
}VisionImage.kt

如要從 ByteBuffer 或位元組陣列建立 FirebaseVisionImage 物件，請先計算圖片旋轉角度，如上文所述，以做為 media.Image 輸入內容。

接著，建立 FirebaseVisionImageMetadata 物件，其中包含圖片的高度、寬度、色彩編碼格式和旋轉角度：

Java

FirebaseVisionImageMetadata metadata = new FirebaseVisionImageMetadata.Builder()
        .setWidth(480)   // 480x360 is typically sufficient for
        .setHeight(360)  // image recognition
        .setFormat(FirebaseVisionImageMetadata.IMAGE_FORMAT_NV21)
        .setRotation(rotation)
        .build();VisionImage.java

Kotlin

val metadata = FirebaseVisionImageMetadata.Builder()
        .setWidth(480) // 480x360 is typically sufficient for
        .setHeight(360) // image recognition
        .setFormat(FirebaseVisionImageMetadata.IMAGE_FORMAT_NV21)
        .setRotation(rotation)
        .build()VisionImage.kt

使用緩衝區或陣列，以及中繼資料物件，建立 FirebaseVisionImage 物件：

Java

FirebaseVisionImage image = FirebaseVisionImage.fromByteBuffer(buffer, metadata);
// Or: FirebaseVisionImage image = FirebaseVisionImage.fromByteArray(byteArray, metadata);VisionImage.java

Kotlin

val image = FirebaseVisionImage.fromByteBuffer(buffer, metadata)
// Or: val image = FirebaseVisionImage.fromByteArray(byteArray, metadata)VisionImage.kt

如要從 Bitmap 物件建立 FirebaseVisionImage 物件，請執行下列操作：

Java

FirebaseVisionImage image = FirebaseVisionImage.fromBitmap(bitmap);VisionImage.java

Kotlin

val image = FirebaseVisionImage.fromBitmap(bitmap)VisionImage.kt

Bitmap 物件代表的圖片必須直立，不需額外旋轉。

取得 FirebaseVisionDocumentTextRecognizer 的例項：

Java

FirebaseVisionDocumentTextRecognizer detector = FirebaseVision.getInstance()
        .getCloudDocumentTextRecognizer();

// Or, to provide language hints to assist with language detection:
// See https://cloud.google.com/vision/docs/languages for supported languages
FirebaseVisionCloudDocumentRecognizerOptions options =
        new FirebaseVisionCloudDocumentRecognizerOptions.Builder()
                .setLanguageHints(Arrays.asList("en", "hi"))
                .build();
FirebaseVisionDocumentTextRecognizer detector = FirebaseVision.getInstance()
        .getCloudDocumentTextRecognizer(options);

Kotlin

val detector = FirebaseVision.getInstance()
        .cloudDocumentTextRecognizer

// Or, to provide language hints to assist with language detection:
// See https://cloud.google.com/vision/docs/languages for supported languages
val options = FirebaseVisionCloudDocumentRecognizerOptions.Builder()
        .setLanguageHints(listOf("en", "hi"))
        .build()
val detector = FirebaseVision.getInstance()
        .getCloudDocumentTextRecognizer(options)

最後，將圖片傳遞至 processImage 方法：

Java

detector.processImage(myImage)
        .addOnSuccessListener(new OnSuccessListener<FirebaseVisionDocumentText>() {
            @Override
            public void onSuccess(FirebaseVisionDocumentText result) {
                // Task completed successfully
                // ...
            }
        })
        .addOnFailureListener(new OnFailureListener() {
            @Override
            public void onFailure(@NonNull Exception e) {
                // Task failed with an exception
                // ...
            }
        });

Kotlin

detector.processImage(myImage)
        .addOnSuccessListener { firebaseVisionDocumentText ->
            // Task completed successfully
            // ...
        }
        .addOnFailureListener { e ->
            // Task failed with an exception
            // ...
        }

2. 從辨識出的文字區塊擷取文字

如果文字辨識作業成功，系統會傳回 FirebaseVisionDocumentText 物件。FirebaseVisionDocumentText 物件包含圖片中辨識的完整文字，以及反映辨識文件結構的物件階層：

針對每個 Block、Paragraph、Word 和 Symbol 物件，您可以取得該區域中辨識到的文字，以及該區域的邊界座標。

例如：

Java

String resultText = result.getText();
for (FirebaseVisionDocumentText.Block block: result.getBlocks()) {
    String blockText = block.getText();
    Float blockConfidence = block.getConfidence();
    List<RecognizedLanguage> blockRecognizedLanguages = block.getRecognizedLanguages();
    Rect blockFrame = block.getBoundingBox();
    for (FirebaseVisionDocumentText.Paragraph paragraph: block.getParagraphs()) {
        String paragraphText = paragraph.getText();
        Float paragraphConfidence = paragraph.getConfidence();
        List<RecognizedLanguage> paragraphRecognizedLanguages = paragraph.getRecognizedLanguages();
        Rect paragraphFrame = paragraph.getBoundingBox();
        for (FirebaseVisionDocumentText.Word word: paragraph.getWords()) {
            String wordText = word.getText();
            Float wordConfidence = word.getConfidence();
            List<RecognizedLanguage> wordRecognizedLanguages = word.getRecognizedLanguages();
            Rect wordFrame = word.getBoundingBox();
            for (FirebaseVisionDocumentText.Symbol symbol: word.getSymbols()) {
                String symbolText = symbol.getText();
                Float symbolConfidence = symbol.getConfidence();
                List<RecognizedLanguage> symbolRecognizedLanguages = symbol.getRecognizedLanguages();
                Rect symbolFrame = symbol.getBoundingBox();
            }
        }
    }
}

Kotlin

val resultText = result.text
for (block in result.blocks) {
    val blockText = block.text
    val blockConfidence = block.confidence
    val blockRecognizedLanguages = block.recognizedLanguages
    val blockFrame = block.boundingBox
    for (paragraph in block.paragraphs) {
        val paragraphText = paragraph.text
        val paragraphConfidence = paragraph.confidence
        val paragraphRecognizedLanguages = paragraph.recognizedLanguages
        val paragraphFrame = paragraph.boundingBox
        for (word in paragraph.words) {
            val wordText = word.text
            val wordConfidence = word.confidence
            val wordRecognizedLanguages = word.recognizedLanguages
            val wordFrame = word.boundingBox
            for (symbol in word.symbols) {
                val symbolText = symbol.text
                val symbolConfidence = symbol.confidence
                val symbolRecognizedLanguages = symbol.recognizedLanguages
                val symbolFrame = symbol.boundingBox
            }
        }
    }
}

後續步驟

在正式環境中部署使用 Cloud API 的應用程式之前，請先採取幾個額外步驟，防範未經授權的 API 存取活動，並減輕其影響。

在 Android 上使用 ML Kit 辨識圖片中的文字 透過集合功能整理內容 你可以依據偏好儲存及分類內容。

事前準備

輸入圖片規範

辨識圖片中的文字

1. 執行文字辨識器

Java

Kotlin

Java

Kotlin

Java

Kotlin

Java

Kotlin

Java

Kotlin

Java

Kotlin

Java

Kotlin

Java

Kotlin

Java

Kotlin

Java

Kotlin

2. 從辨識出的文字區塊擷取文字

Java

Kotlin

提升即時成效的訣竅

後續步驟

辨識文件圖片中的文字

1. 執行文字辨識器

Java

Kotlin

Java

Kotlin

Java

Kotlin

Java

Kotlin

Java

Kotlin

Java

Kotlin

Java

Kotlin

Java

Kotlin

Java

Kotlin

2. 從辨識出的文字區塊擷取文字

Java

Kotlin

後續步驟

在 Android 上使用 ML Kit 辨識圖片中的文字