ocrd_tesserocr

Segment region, line, recognize with tesserocr

Installation

Required ubuntu packages:

Tesseract headers (libtesseract-dev)
Some tesseract language models (tesseract-ocr-{eng,deu,frk,...} or script models (tesseract-ocr-script-{latn,frak,...})
Leptonica headers (libleptonica-dev)

pip install -r requirements
pip install .

If tesserocr fails to compile with an error::

$PREFIX/include/tesseract/unicharset.h:241:10: error: ‘string’ does not name a type; did you mean ‘stdin’?
       static string CleanupString(const char* utf8_str) {
              ^~~~~~
              stdin

This is due to some inconsistencies in the installed tesseract C headers (fix expected for next Ubuntu upgrade, already fixed for Debian). Replace string with std::string in $PREFIX/include/tesseract/unicharset.h:265:5: and $PREFIX/include/tesseract/unichar.h:164:10: ff.

If tesserocr fails with an error about LSTM/CUBE, you have a mismatch between tesseract header/data/pkg-config versions. apt policy libtesseract-dev lists the apt-installable versions, keep it consistent. Make sure there are no spurious pkg-config artifacts, e.g. in /usr/local/lib/pkgconfig/tesseract.pc. The same goes for language models.

Name		Name	Last commit message	Last commit date
Latest commit History 72 Commits
ocrd_tesserocr		ocrd_tesserocr
repo		repo
test		test
.dockerignore		.dockerignore
.gitignore		.gitignore
.pylintrc		.pylintrc
.travis.yml		.travis.yml
Dockerfile		Dockerfile
LICENSE		LICENSE
Makefile		Makefile
README.rst		README.rst
ocrd-tool.json		ocrd-tool.json
requirements.txt		requirements.txt
requirements_dev.txt		requirements_dev.txt
requirements_test.txt		requirements_test.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

ocrd_tesserocr

Installation

About

Uh oh!

Releases

Packages

Languages

License

noahmetzger/ocrd_tesserocr

Folders and files

Latest commit

History

Repository files navigation

ocrd_tesserocr

Installation

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages