This project contains code for an OCR RESTful service that takes a bank check image as input and returns the translated MICR data (routing, account, and check numbers) in JSON format.
Ensure you have the following installed on your system:
- Git
- Node.js (v20.x or higher, which includes npm)
- npm (comes with Node.js)
Clone the fin-ocr-rest repository:
git clone https://github.com/discoverfinancial/fin-ocr-rest.git

Next, navigate to the fin-ocr-rest directory and run the following commands to install the necessary dependencies and build the project:

cd fin-ocr-rest
npm run build

To start the server on the default port 3000:

npm run start

To start the server with debug:

OCR_LOG_LEVEL=debug npm run start

To start the server on a different port (e.g. port 3001):

PORT=3001 npm run start

After starting the server as described above, you can see the swagger doc at http://localhost:3000/swagger.
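The /check/scan endpoint accepts a POST request with a JSON body containing a base64-encoded check image and returns the check's routing, account, and check numbers. An example request body is shown first, followed by an example response.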
{
"id": "ANY ID",
"image": {
"format": "tif | jpg | png | gif | bmp | x_ms_bmp",
"buffer": "BASE64 ENCODING OF CHECK IMAGE"
}
}

{
"id": "ANY ID",
"translators": {
"tesseract": {
"result": {
"routingNumber": "123456789",
"accountNumber": "123456890",
"checkNumber": "12345678"
}
},
"opencv": {
"result": {
"routingNumber": "123456789",
"accountNumber": "123456890",
"checkNumber": "123456789"
}
}
}
}

You can use the following script to send a JSON request to the /check/scan endpoint:
#!/bin/bash
IMAGE_FILE="check_sample_02.png"
# Encode the image as a single-line base64 string (-w 0 disables line wrapping in GNU base64)
BASE64_IMAGE=$(base64 -w 0 "$IMAGE_FILE")

# Stream the heredoc payload to curl on stdin to avoid an "argument list too long" error
cat <<EOF | curl -X POST http://localhost:3000/check/scan \
-H "Content-Type: application/json" \
--data-binary @-
{
"id": "check_sample_02.png",
"image": {
"buffer": "$BASE64_IMAGE",
"format": "image/png"
},
"translators": ["tesseract", "opencv"]
}
EOF

The /check/scanFile endpoint scans a check image uploaded as a multipart form file and returns the check's routing, account, and check numbers.
curl http://localhost:3000/check/scanFile -F "image=@check_sample_02.png"

A successful response will return a JSON object structured as follows:
{
"id": "default-id",
"translators": {
"tesseract": {
"result": {
"micrLine": "T011300142T12345678U01012\n",
"routingNumber": "011300142",
"accountNumber": "12345678",
"checkNumber": "1012"
}
},
"opencv": {
"result": {
"micrLine": "011300142T312345678U010111133357",
"routingNumber": "312345678",
"accountNumber": "010111133357",
"checkNumber": ""
}
}
},
"overlap": true
}

The fields of the request body are as follows:
| Field name | type | Optional | Default | Description |
|---|---|---|---|---|
| id | string | false | none | The check identifier which is returned with the response |
| image.buffer | string | false | none | A base64 encoded image of the front of the check |
| image.format | string | true | "tiff" | The image format of the image.buffer field after being base64 decoded |
| translators | string array | true | ["tesseract","opencv"] | The translators whose responses are to be returned |
| debug | string array | true | false | Debug category names including one or more of the following: "images", "check-details", "all-details", "*" |
| correct | boolean | true | false | Whether to correct the opencv response; a correction also trains the opencv translator for future translations |
| actual | string | true | none | The actual MICR line as a string, used when correct is true |
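The request fields above can be sketched as a TypeScript type plus a small builder. This is a minimal illustration, not code from the service: `CheckScanRequest` and `buildCheckScanRequest` are hypothetical names, the base64 string is a placeholder, and applying the translators default on the client side is an assumption made for clarity (the table suggests the server applies it when the field is omitted).

```typescript
// Request body for POST /check/scan, mirroring the field table above.
interface CheckScanRequest {
  id: string;
  image: { buffer: string; format?: string };
  translators?: string[];
  debug?: string[];
  correct?: boolean;
  actual?: string;
}

// Hypothetical helper (not part of the service): validates the required
// fields and fills in the documented default for translators.
function buildCheckScanRequest(
  id: string,
  base64Image: string,
  options: {
    format?: string;
    translators?: string[];
    debug?: string[];
    actual?: string;
  } = {}
): CheckScanRequest {
  if (id.length === 0) {
    throw new Error("id is required");
  }
  if (base64Image.length === 0) {
    throw new Error("image.buffer is required");
  }
  const request: CheckScanRequest = {
    id,
    image: { buffer: base64Image },
    translators: options.translators ?? ["tesseract", "opencv"],
  };
  if (options.format !== undefined) {
    request.image.format = options.format;
  }
  if (options.debug !== undefined) {
    request.debug = options.debug;
  }
  // The table says actual is used when correct is true, so set both together.
  if (options.actual !== undefined) {
    request.correct = true;
    request.actual = options.actual;
  }
  return request;
}

// "aGVsbG8=" is a placeholder, not a real check image.
const exampleRequest = buildCheckScanRequest("check_sample_02.png", "aGVsbG8=", {
  format: "png",
});
console.log(JSON.stringify(exampleRequest, null, 2));
```

The resulting object can be serialized with `JSON.stringify` and sent as the body of a POST to /check/scan with a `Content-Type: application/json` header.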
The required fields of the response body are as follows:
| Field name | type | Optional | Default | Description |
|---|---|---|---|---|
| id | string | false | none | The check identifier from the request body |
| translators | map | false | none | The per-translator values (depending on the value of the request's translators field) for the check's routing, account, and check numbers |
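Because the response carries one result per translator, a client may want to know where the translators agree. The sketch below is one possible way to do that, assuming the response shape shown in the examples in this document; `MicrResult`, `CheckScanResponse`, and `agreedFields` are hypothetical names introduced for illustration.

```typescript
// Shape of the required response fields, based on the examples in this document.
interface MicrResult {
  routingNumber: string;
  accountNumber: string;
  checkNumber: string;
  micrLine?: string;
}

interface CheckScanResponse {
  id: string;
  translators: { [translator: string]: { result: MicrResult } };
}

type MicrField = "routingNumber" | "accountNumber" | "checkNumber";

// Hypothetical helper: returns the fields on which every translator in the
// response reports the same value, so callers can flag disagreements.
function agreedFields(response: CheckScanResponse): MicrField[] {
  const fields: MicrField[] = ["routingNumber", "accountNumber", "checkNumber"];
  const results = Object.keys(response.translators).map(
    (key) => response.translators[key].result
  );
  if (results.length === 0) {
    return [];
  }
  return fields.filter((field) =>
    results.every((r) => r[field] === results[0][field])
  );
}

// Sample values copied from the example /check/scan response in this document.
const sampleResponse: CheckScanResponse = {
  id: "ANY ID",
  translators: {
    tesseract: {
      result: { routingNumber: "123456789", accountNumber: "123456890", checkNumber: "12345678" },
    },
    opencv: {
      result: { routingNumber: "123456789", accountNumber: "123456890", checkNumber: "123456789" },
    },
  },
};

// The two translators differ only on checkNumber, so this logs
// [ 'routingNumber', 'accountNumber' ].
console.log(agreedFields(sampleResponse));
```

A field missing from the returned list is a reasonable trigger for manual review, since the translators disagree about its value.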
The following additional fields may appear in the response body when the request's debug field is set:
| Field name | type | Optional | Default | Description |
|---|---|---|---|---|
| match | boolean | true | none | True if any of the translators' results matched the value passed in the request's actual field |
| overlap | boolean | true | none | True if signature overlap of the MICR line was detected in the image |
| images | map array | true | none | An array of name, format, base64 encoding, width, and height of each image generated during OCR processing |
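The optional match and overlap fields can be used to decide whether a scan deserves a second look. The following is a rough sketch under stated assumptions: `ScanDiagnostics` and `reviewReasons` are hypothetical names, and the images field is left out because this document does not give its exact property names.

```typescript
// Optional response fields from the table above; images is omitted here
// because this document does not give its exact property names.
interface ScanDiagnostics {
  id: string;
  match?: boolean;
  overlap?: boolean;
}

// Hypothetical helper: lists reasons a scanned check may deserve manual review.
function reviewReasons(response: ScanDiagnostics): string[] {
  const reasons: string[] = [];
  if (response.overlap === true) {
    reasons.push("signature overlaps the MICR line");
  }
  // match is only meaningful when the request supplied the actual field,
  // so an absent match is not treated as a failure.
  if (response.match === false) {
    reasons.push("no translator result matched the actual MICR line");
  }
  return reasons;
}

// Logs [ 'signature overlaps the MICR line' ].
console.log(reviewReasons({ id: "default-id", overlap: true }));
```

An empty array means neither optional field raised a concern; it does not by itself confirm that the translators' results are correct.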
TBD
This document provides guidance for how YOU can collaborate with our project community to improve this technology.
To generate a report containing any vulnerabilities in any dependency, use:

npm run scan

To run the license scan, use:

npm run scan-license

Note: Each of these scans should be run, and any problems addressed, by a developer prior to submitting code that uses new packages.
Copyright 2024 Capital One
Distributed under the Apache License, Version 2.0.
SPDX-License-Identifier: Apache-2.0