This project contains a CLI for performing OCR (Optical Character Recognition) using the SDK at fin-ocr-sdk.
This CLI supports the following for bank checks:
- debugging, testing, and measuring the accuracy of the SDK at fin-ocr-sdk;
- bundling new binary files to be included in the SDK.
NOTE: This CLI is designed to be extended easily to support other use cases in the future.
Ensure you have the following installed on your system:
- Git
- Node.js (v20.x or higher, which includes npm)
- npm (comes with Node.js)
Clone the CLI repository:
git clone https://github.com/finos/fin-ocr-cli.gitNavigate to the CLI directory, install dependencies and build the project:
cd ../fin-ocr-cli
npm run buildOnce the build is complete, you can use the CLI by running the following command:
./scripts/ocrThis command directly invokes the ocr script located in the scripts directory of the project.
To run the CLI locally, you simply execute the command directly from the project directory as shown above (./scripts/ocr). This method does not add the CLI to your system's global PATH, so the ocr command will only be available when run from within the fin-ocr-cli directory.
To make the ocr command available "globally" on your system, you can install the CLI using the following command from within the fin-ocr-cli directory:
npm install -g .After global installation, you can run the ocr command from any directory on your system without needing to specify the script path. This is convenient if you need to use the CLI across multiple projects or from anywhere on your system.
You should now be able to execute the ocr command as follows:
$ ocr
Usage: ocr check scan <path-to-check-image>
check test <checkNum> [<numChecks>]
check debug <comma-separated-list-of-check-nums>
check preprocess <output-dir> <checkNum> [<numChecks>]
buildFiles [<dir>]This section describes how to use the CLI to perform various tasks as follows:
- How to scan a single check image from a local file
- How to prepare check data for scanning multiple images from an X9 file
- How to measure OCR accuracy
- How to debug check mismatches
- How to use the CLI as a client for the REST service
Run the following command to perform OCR on a check image that is stored in a local file:
ocr check scan <path-to-image-file>
The output produced by this command is similar to the following:
{
"id": "check-1.tif",
"translators": {
"tesseract": {
"result": {
"micrLine": "U1234567UT123456789T1234567890U12345678\n",
"routingNumber": "123456789",
"accountNumber": "1234567890",
"checkNumber": "1234567"
}
},
"opencv": {
"result": {
"micrLine": "U1234567UT123456789T1234567890U12345678",
"routingNumber": "123456789",
"accountNumber": "1234567890",
"checkNumber": "1234567"
}
}
},
"overlap": false
}
There are two translators: tesseract and opencv. Each translator returns results including the routing, account, and check numbers. The entire translated MICR line is also returned for each translator.
In order to use only a single translator, you may set the OCR_TRANSLATORS environment variable as follows:
-
to use only the tesseract translator,
OCR_TRANSLATORS=tesseract; -
to use only the opencv translator,
OCR_TRANSLATORS=opencv.
The default setting is OCR_TRANSLATORS=tesseract,opencv.
The ocr commands described below, ocr check test and ocr check debug, support processing of a large number of checks images. This section describes the format of the check data expected by these commands and how to prepare this data.
For each check, there must be two files with the same file prefix but different suffixes: a TIFF file and a JSON file. The TIFF file contains the check image in TIFF format and the JSON file contains the following fields:
{
"auxiliaryOnUs": "1234567",
"payorBankRoutingNumber": "12345678",
"payorBankCheckDigit": "1",
"onUs": "1234567890/"
}
For example, check-1.tiff contains the TIFF image for check number 1 while check-1.json contains what is known as the ground truth for what is on the image.
Preparing this data manually can be very time consuming. If you have one or more X9 files, see the x9-extract tool for how to prepare this check data automatically.
This section assumes that you have prepared the check data over which you will measure the OCR accuracy. The default location for this check data is $HOME/.fin-ocr/checks but can be set explicitly via CHECKS_DIR environment variable.
The following command measures the accuracy:
ocr check test START COUNT
where START is the starting check number and COUNT is the number of checks.
For example, the following measures the accuracy over checks 1 through 100.
ocr check test 1 100
The output is similar to the following (minus the line numbers )
1) 2024-07-29T13:44:57.186Z inf ocr-sdk Initializing tesseract: {"font":"micr_e13b","pageSegmentationMode":"13"}
2) 2024-07-29T13:44:57.215Z inf ocr-sdk Added tesseract MICR translator
2024-07-29T13:44:57.215Z inf ocr-sdk Initializing tesseract: {"font":"eng","pageSegmentationMode":"3"}
3) 2024-07-29T13:44:57.245Z inf ocr-sdk Added tesseract full page translator
4) 2024-07-29T13:44:57.245Z inf ocr-sdk Added opencv translator
5) 2024-07-29T13:45:02.722Z inf cli Check 1: match=true (100.00%)
...
6) 2024-07-29T13:45:15.778Z inf cli Mismatches: [23,83]
7) 2024-07-29T13:45:15.778Z inf cli Mismatches to evaluate: [83]
8) 2024-07-29T13:45:15.778Z inf cli Matches to reevaluate: []
9) 2024-07-29T13:45:15.778Z inf cli Counts: match=98, x9Wrong=0, total=100
10) 2024-07-29T13:45:15.778Z inf cli Percentage: match=98.00%, x9Wrong=0.00%
11) Execution time: 0 minutes, 21 seconds
Note the following in the sample output:
- line 5 - a running percentage is logged after each check which specifies the total accuracy thus far;
- line 6 - checks 23 and 83 mismatched; that is, the results of OCR'ing the tiff image file mismatched the values found in the associated JSON file containing the ground truth;
- line 7 - check 83 mismatched and still needs to be evaluated and categorized with the reason for the mismatch; see evaluating a mismatching check for more information.
- line 8 - there are no checks which need to be re-evaluated; a check needs reevaluating if it previously mismatched and was added to the list of mismatching checks with a reason, but now the OCR results match the ground truth;
- line 9 - the number of checks which matched is 98, the number of checks which had the X9 wrong in the JSON was 0, and the total number of checks processed was 100;
- line 10 - the percentage of checks whose OCR results matched the ground truth was 98.00% and the percentage of checks with incorrect OCR results was 0.00%;
- line 11 - the total execution time to OCR these 100 checks was 21 seconds.
The values in the JSON files which are extracted from X9 files may be inaccurate. In this case, you may use the CHECK_EVAL_DATA environment variable to correct these values used by the ocr check commands.
For example, suppose that the data in the file $HOME/.fin-ocr/checks/check-10.json says the check number is 123 but the real check number is 124. You confirmed this by looking at the check image in file $HOME/.fin-ocr/checks/check-10.tiff. In this case, you can create a file named check-eval-data.json in the current directory with the following contents:
{
"mismatchesByReason": {
},
"correctX9": {
"10": "T123456789T123456789012U124"
}
}
The "10" is for check number 10. The characters representing the MICR symbols in the value is as follows:
- "T" - transit symbol
- "U" - on-us symbol
- "A" - amount symbol
- "D" - dash symbol
In the example above, the routing number occurs between the 2 T's (transit symbols) and is "123456789". The account number occurs between the 2nd T (transit symbol) and the U (on-us symbol) and is "123456789012". The check number is to the right of the U (on-us symbol) and is corrected to be "124" rather than "123".
You may now run the following command to use this file:
CHECK_EVAL_DATA=./check-eval-data.json ocr check test 1 100
The ocr check debug command is intended for developers and requires more indepth knowledge. In particular, it is useful for determining the reason for a mismatch. See the Developer's Guide for more information.
If you set the URL environment variable to point to the REST service endpoint, each of the ocr check commands (ocr check scan, ocr check test, and ocr check debug) will send requests remotely to the REST service rather than servicing them locally.
For example, assuming the REST service is running on port 3000 of localhost, the following will measure the accuracy of the REST service for checks 1 through 100:
URL=http://localhost:3000 ocr check test 1 100
The following table is being used to track OCR accuracy progress and contains the results of running the ocr check test 1 20000 command:
| Date | Statistics | Number of checks | Description of changes | Vendor |
|---|---|---|---|---|
| Jul 11, 2024 | matches=97.81%/19561, x9Wrong=1.51%/303 | 20000 | Initial version | Capital One |
TBD
This document provides guidance for how YOU can collaborate with our project community to improve this technology.
To generate a report containing any vulnerabilities in any dependency please use:
$npm run scannpm run scan-licenseNote: Each of these scans should be run and problems addressed by a developer prior to submitting code that uses new packages.
Copyright 2024 Capital One
Distributed under the Apache License, Version 2.0.
SPDX-License-Identifier: Apache-2.0