Vision SDK
Android
Barcode and QR code scanner framework for Android. VisionSDK detects barcodes and QR codes. It also extracts information from different kinds of logistics documents, such as shipping labels, inventory labels, bills of lading, receipts, and invoices.
Installation
Vision SDK is hosted on JitPack.io.
First, add the JitPack repository to your root project.
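JitPack is normally registered as a Maven repository. The fragment below is a sketch that assumes a project declaring its repositories in settings.gradle.kts with the Gradle Kotlin DSL; the google() and mavenCentral() entries are assumptions about a typical Android project rather than requirements stated here. In a Groovy build file the equivalent entry is maven { url 'https://jitpack.io' }.

```kotlin
// settings.gradle.kts (assumed location; older projects may use
// allprojects { repositories { ... } } in the root build file instead)
dependencyResolutionManagement {
    repositories {
        google()
        mavenCentral()
        // JitPack serves artifacts built from Git repositories
        maven { url = uri("https://jitpack.io") }
    }
}
```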
Then add the following dependency to your project's build.gradle file:
Usage
Initialization
To use the OCR API for information extraction, set the following parameters:
- Constants.apiKey to your API key.
- Constants.apiEnvironment to the API environment that your API key belongs to (sandbox or production).
Both must be set before making an API call. You can generate your own API key at cloud.packagex.io. You can find the instruction guide here.
Initialise the SDK first:
There are two ways to authenticate:
- Via API key
- Via Token
Basic Usage
You can find sample code for using the vision-sdk on our GitHub.
To start scanning for barcodes and QR codes, use the startScanning method and specify the view type:
Configuration
Set the initial camera configuration:
To start scanning for barcodes, QR codes, text, or documents, call the startCamera method after initialization. See the code below for an example:
Note that once the onBarcodesDetected or onImageCaptured callback is called, VisionSDK stops analyzing the camera feed for text or barcodes/QR codes. This prevents extra processing and battery consumption. To start analyzing the camera feed again after consuming the results of the previous scan, the client needs to call the following function:
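To illustrate the pause-and-resume behaviour described above, here is a minimal self-contained model. It does not use VisionSDK: ScanSession, onFrameResult and resume are hypothetical names chosen for this sketch, and results are plain strings.

```kotlin
// Hypothetical model of "stop analysing after a result, resume on request".
// None of these names are VisionSDK APIs.
class ScanSession(private val onResult: (String) -> Unit) {
    var isAnalyzing = true
        private set

    /** Offers one detection from the camera feed. Returns true if it was delivered. */
    fun onFrameResult(result: String): Boolean {
        if (!isAnalyzing) return false   // paused: ignore the frame, saving work
        isAnalyzing = false              // pause as soon as a result is delivered
        onResult(result)
        return true
    }

    /** Called by the client once it has consumed the previous result. */
    fun resume() {
        isAnalyzing = true
    }
}

fun main() {
    val delivered = mutableListOf<String>()
    val session = ScanSession { delivered += it }
    session.onFrameResult("first")    // delivered, analysis pauses
    session.onFrameResult("second")   // dropped, still paused
    session.resume()
    session.onFrameResult("third")    // delivered after resuming
    println(delivered)                // prints [first, third]
}
```

The point of the model is that a result arriving while the session is paused is dropped, so the client decides when the next scan starts.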
Scanning Modes
There are two scanning modes:
- Auto mode automatically detects any barcode or QR code, based on the detection mode.
- Manual mode detects a barcode or QR code when capture is called.
Detection Modes
The detection mode specifies what the user is trying to detect:
- Barcode detects only barcodes.
- QRCode detects only QR codes.
- OCR is for OCR detection. This mode captures an image for the user, who can then extract logistics data from it using either an API call or the on-device OCR capabilities.
- PriceTag is in progress.
- Photo takes images without any processing.
Trigger Manual Capture
In manual mode, call visionCameraView.capture() to trigger a capture. The callback you receive depends on the detection mode:
- DetectionMode.Barcode triggers the onBarcodeDetected callback if a barcode is detected, or throws an exception.
- DetectionMode.QRCode returns the QR code or throws an exception.
- DetectionMode.OCR captures an image along with the current barcode and returns both in onImageCaptured.
Make sure that the scan mode is manual when calling capture.
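The mode-dependent behaviour of a manual capture can be sketched as a single dispatch. The enums and result types below are local stand-ins written for this example (only the Barcode, QRCode and OCR mode names come from the text above), barcodes are modelled as strings, and rejecting a capture outside manual mode with an exception is an assumption of the sketch.

```kotlin
// Local stand-ins for illustration only; the real SDK types and callbacks may differ.
enum class ScanMode { Auto, Manual }
enum class DetectionMode { Barcode, QRCode, OCR }

// What a manual capture hands back in this sketch.
sealed interface CaptureResult
data class CodesDetected(val codes: List<String>) : CaptureResult
data class ImageCaptured(val imageLabel: String, val codes: List<String>) : CaptureResult

fun capture(scanMode: ScanMode, detectionMode: DetectionMode, codesInFrame: List<String>): CaptureResult {
    // Assumption: calling capture outside manual mode is treated as a programming error.
    check(scanMode == ScanMode.Manual) { "capture expects the scan mode to be Manual" }
    return when (detectionMode) {
        DetectionMode.Barcode, DetectionMode.QRCode -> {
            if (codesInFrame.isEmpty()) throw NoSuchElementException("no code detected in the current frame")
            CodesDetected(codesInFrame)
        }
        // OCR mode always yields an image, plus whatever codes were visible.
        DetectionMode.OCR -> ImageCaptured("current-frame", codesInFrame)
    }
}

fun main() {
    println(capture(ScanMode.Manual, DetectionMode.Barcode, listOf("0123456789")))
    println(capture(ScanMode.Manual, DetectionMode.OCR, emptyList()))
}
```

Because the when expression covers every enum value without an else branch, adding a mode to the enum forces the dispatch to be updated.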
Capturing image:
You can capture an image when the detection mode is OCR. When capture is called in OCR mode, the callback returns the image bitmap along with the list of barcodes in the current frame.
Making OCR Call:
Before making an API call, make sure that VisionSDK is initialized. See Initialization for details.
Once VisionSDK has been initialized successfully, use the ApiManager class to make API calls.
The following APIs are available through VisionSDK:
- Shipping Label (both online and on-device)
- Bill of Lading (only online)
Logistics information can be extracted from images in these two contexts. Post the images using the following suspend functions of the ApiManager class.
Shipping Label
Bill of Lading
Making On-Device Shipping Label Call (usage without internet):
An internet connection is needed the first time, to download the files used for processing and extracting data from images.
To download them, use the configure() function of the OnDeviceOCRManager class.
PlatformType is an enum with the following values:
ModelClass is an enum that tells VisionSDK which kind of OCR capabilities your app requires. Its options are:
PriceTag is not currently supported.
ModelSize is another enum (an optional parameter here) that tells VisionSDK which type of files to download. The model sizes are:
Currently, only Micro and Large are supported.
After creating the instance of OnDeviceOCRManager, call its configure() function. This suspend function downloads the required files, if needed, and then loads them into memory.
The ExecutionProvider parameter is optional, and the default value usually works fine.
The progressListener parameter (also optional) can be used to track the progress of downloading and loading the files. Its value ranges from 0.0 to 1.0.
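Since the progress value is a fraction between 0.0 and 1.0, a UI will usually want it as a percentage. The helper below is a small sketch of that conversion; progressToPercent is a hypothetical name, and modelling the listener as a lambda that receives a Float is an assumption, since the listener's actual type is not shown here.

```kotlin
import kotlin.math.roundToInt

/** Converts a progress fraction (expected in 0.0..1.0) to a whole percentage, clamping out-of-range values. */
fun progressToPercent(progress: Float): Int =
    (progress.coerceIn(0f, 1f) * 100).roundToInt()

fun main() {
    // Assumed listener shape: a lambda that receives the fraction.
    val progressListener: (Float) -> Unit = { p ->
        println("Preparing on-device models: ${progressToPercent(p)}%")
    }
    listOf(0.0f, 0.25f, 1.0f).forEach(progressListener)
}
```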
After configuring it successfully, call the getPredictions() function and pass it the bitmap of the image you want to run OCR on, together with the list of barcodes from which it tries to extract information using regex.
The result of the getPredictions() function is a JSON response in String form.
Once you are done with the OnDeviceOCRManager, make sure that you destroy its instance by calling the following method:
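One way to make sure the cleanup always runs, even when an earlier step throws, is a try/finally wrapper. The sketch below is self-contained and does not use VisionSDK: Destroyable and useThenDestroy are hypothetical names, and the only thing assumed about the real manager is that it exposes a destroy-style method.

```kotlin
// Hypothetical stand-in for any object that must be destroyed after use.
interface Destroyable {
    fun destroy()
}

/** Runs [block] with the receiver and always calls destroy() afterwards, even if [block] throws. */
inline fun <R> Destroyable.useThenDestroy(block: (Destroyable) -> R): R =
    try {
        block(this)
    } finally {
        destroy()
    }

fun main() {
    var destroyCalls = 0
    val manager = object : Destroyable {
        override fun destroy() { destroyCalls++ }
    }
    val result = manager.useThenDestroy { "prediction placeholder" }
    println("$result, destroy calls: $destroyCalls")
}
```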
Report an issue
VisionSDK has an internal error-reporting mechanism for any issues it encounters. In addition, if you get a response from the on-device models that you consider incorrect, you can report it using the following function: