On-Device AI Scanning (OCR)

The Vision SDK allows for on-device AI scanning (OCR), enabling offline extraction of structured information from documents such as shipping labels. This is ideal when low latency and offline functionality are critical, such as in warehouse and logistics environments.

🛠️ Step 1: Preparing On-Device OCR

Before using on-device OCR, you must prepare the model using one of the following options. This ensures the necessary AI models are downloaded and ready to use.

Option 1: Auto Model Size Configuration

To let the Vision SDK automatically determine the best model size based on your PackageX subscription:

        suspend fun prepare(context: Context, apiKey: String) {
            val onDeviceOCRManager = OnDeviceOCRManager(
                context = context,
                platformType = PlatformType.Native,
                ocrModule = OCRModule.ShippingLabel(),
            )
            // Prepares the model; progress is reported to the trailing lambda.
            onDeviceOCRManager.configure(
                apiKey = apiKey
            ) { progress ->
                Log.d(TAG, "Configure Progress: $progress")
            }
        }

      

Option 2: Explicit Model Size Configuration

You can also specify the model size manually, for example ModelSize.Micro (faster, smaller) or ModelSize.Large (slower, more accurate):

        suspend fun prepare(context: Context, apiKey: String) {
            val onDeviceOCRManager = OnDeviceOCRManager(
                context = context,
                platformType = PlatformType.Native,
                ocrModule = OCRModule.ShippingLabel(
                    modelSize = ModelSize.Large
                ),
            )
            // Prepares the model; progress is reported to the trailing lambda.
            onDeviceOCRManager.configure(
                apiKey = apiKey
            ) { progress ->
                Log.d(TAG, "Configure Progress: $progress")
            }
        }

      

⚠️ You must wait for the configure process to complete before proceeding to OCR extraction.
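
One way to enforce this ordering in app code is a small state holder that rejects extraction attempts until configuration has finished. The sketch below is plain Kotlin and does not call the Vision SDK; `OcrState`, `OcrReadiness`, and their methods are hypothetical names introduced only for illustration.

```kotlin
// Hypothetical helper, not part of the Vision SDK: tracks whether the
// on-device model has finished configuring.
enum class OcrState { NOT_CONFIGURED, CONFIGURING, READY }

class OcrReadiness {
    var state: OcrState = OcrState.NOT_CONFIGURED
        private set

    // Call right before invoking configure() on the SDK manager.
    fun onConfigureStarted() {
        state = OcrState.CONFIGURING
    }

    // Call after configure() has returned successfully.
    fun onConfigureFinished() {
        state = OcrState.READY
    }

    // Call before requesting predictions; throws IllegalStateException
    // if the model is not ready yet.
    fun requireReady() {
        check(state == OcrState.READY) {
            "OCR model is not ready (state=$state); wait for configure to complete"
        }
    }
}
```

Calling `requireReady()` at the top of your own prediction wrapper turns an ordering mistake into an immediate, descriptive error instead of a failure inside the SDK.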

🧩 Selecting the Model Type

When preparing for on-device AI scanning, you can choose which document model to use by specifying the ocrModule parameter. This tells the SDK which type of document you're scanning so it can load the appropriate AI model.

Here are the supported options:

📦 For Shipping Labels

Use this to scan and extract structured data from shipping labels.

        ocrModule = OCRModule.ShippingLabel()

      

📄 For Bill of Lading (BOL)

Use this to extract data from Bill of Lading documents.

        ocrModule = OCRModule.BillOfLading()

      

🏷️ For Item Labels

Use this to process and extract structured details from product/item labels.

        ocrModule = OCRModule.ItemLabel()

      

📝 Make sure the model has been prepared successfully before scanning. Model size and complexity vary by document type.

🧠 Step 2: Extracting Data from Image

Once the model is prepared successfully, use the following method to perform OCR on a given image.

        suspend fun getPredictions(bitmap: Bitmap, barcodes: List<String>): String? {
            return onDeviceOCRManager.getPredictions(bitmap, barcodes)
        }

      

Parameters:

  • bitmap: The Bitmap you want to process.
  • barcodes: A list of String values representing barcodes detected in the image (if any).

The returned data follows the same structure as the PackageX Cloud OCR API response.
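
Because the method shown above returns a nullable `String?`, callers need to handle the case where no data comes back. The following is a minimal sketch of one way to make that explicit. `OcrOutcome` and `toOutcome` are hypothetical names that are not part of the Vision SDK, and treating a blank string the same as `null` is an assumption of this sketch.

```kotlin
// Hypothetical result type for illustration; not part of the Vision SDK.
sealed class OcrOutcome {
    // Holds the raw response text exactly as returned by the SDK.
    data class Success(val rawResponse: String) : OcrOutcome()

    // Represents a null or blank result.
    object Empty : OcrOutcome()
}

// Maps the nullable String returned by getPredictions to an explicit outcome.
// Assumption: a blank string is treated the same as null.
fun toOutcome(raw: String?): OcrOutcome =
    if (raw.isNullOrBlank()) OcrOutcome.Empty else OcrOutcome.Success(raw)
```

A `when` over `OcrOutcome` then forces each call site to decide what to do when nothing was extracted.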

♲ Step 3: Release Resources

When you are done with predictions and do not plan to run more on-device, or when you want to configure a different AI model, release the memory held by the current model:

        fun releaseResources() {
            // destroy() frees the memory held by the AI model.
            onDeviceOCRManager.destroy()
        }
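
To make sure the release step runs even when prediction code throws, the release call can be wrapped in a `Closeable` and used with Kotlin's `use`. This is only a sketch: `ReleasableOcr` is a hypothetical wrapper, not an SDK class, and the lambda stands in for a call to `destroy()` on the manager.

```kotlin
import java.io.Closeable

// Hypothetical wrapper, not part of the Vision SDK: adapts a release action
// (for example, a lambda that calls destroy() on the manager) to Closeable.
class ReleasableOcr(private val release: () -> Unit) : Closeable {
    var released = false
        private set

    // Runs the release action at most once, even if close() is called again.
    override fun close() {
        if (!released) {
            released = true
            release()
        }
    }
}

fun main() {
    var destroyCalls = 0
    // In an app, the lambda would call onDeviceOCRManager.destroy().
    ReleasableOcr { destroyCalls++ }.use {
        // Run predictions here; close() is invoked when this block exits,
        // including when it exits with an exception.
    }
    println("destroy calls: $destroyCalls") // prints "destroy calls: 1"
}
```

The at-most-once guard is a defensive choice, since the source does not say whether calling `destroy()` twice is safe.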

      

Available On-Device AI Scanning

The following table lists the on-device AI scanning options currently available (✓ = available, X = not available).

Model Class             Nano  Micro  Small  Medium  Large  XLarge
ShippingLabel           X     ✓      X      X       ✓      X
BillOfLading            X     X      X      X       ✓      X
ItemLabel               X     X      X      X       ✓      X
DocumentClassification  X     ✓      X      X       ✓      X
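
The availability table can also be encoded as a lookup, so an app can validate a model and size combination before attempting to configure it. The sketch below mirrors the table as it stands here. `Size`, `DocModel`, `availableSizes`, and `isAvailable` are hypothetical names for illustration and are not the SDK's own `ModelSize` or `OCRModule` types.

```kotlin
// Hypothetical names for illustration; these enums mirror the table and are
// not the SDK's own ModelSize or OCRModule types.
enum class Size { NANO, MICRO, SMALL, MEDIUM, LARGE, XLARGE }
enum class DocModel { SHIPPING_LABEL, BILL_OF_LADING, ITEM_LABEL, DOCUMENT_CLASSIFICATION }

// Sizes marked as available for each model class in the table.
val availableSizes: Map<DocModel, Set<Size>> = mapOf(
    DocModel.SHIPPING_LABEL to setOf(Size.MICRO, Size.LARGE),
    DocModel.BILL_OF_LADING to setOf(Size.LARGE),
    DocModel.ITEM_LABEL to setOf(Size.LARGE),
    DocModel.DOCUMENT_CLASSIFICATION to setOf(Size.MICRO, Size.LARGE),
)

// Returns true when the table lists the given size for the given model class.
fun isAvailable(model: DocModel, size: Size): Boolean =
    availableSizes[model]?.contains(size) ?: false
```

For example, `isAvailable(DocModel.BILL_OF_LADING, Size.MICRO)` is `false`, which matches the X in the table's Micro column for BillOfLading.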

✅ Full Sample Flow

        class MyOCRClass(
            context: Context,
            ocrModule: OCRModule,
            val apiKey: String,
        ) {
            private val onDeviceOCRManager = OnDeviceOCRManager(
                context = context,
                ocrModule = ocrModule,
            )

            // Step 1: prepare the model; this must complete before getPrediction is called.
            suspend fun configure(configureProgress: (Float) -> Unit) {
                onDeviceOCRManager.configure(
                    apiKey = apiKey,
                    progressListener = configureProgress
                )
            }

            // Step 2: run OCR on an image, passing any barcodes detected in it.
            suspend fun getPrediction(bitmap: Bitmap, barcodes: List<String>): String? {
                return onDeviceOCRManager.getPredictions(bitmap, barcodes)
            }

            // Step 3: free the memory held by the AI model.
            fun release() {
                onDeviceOCRManager.destroy()
            }
        }