mrrtmob/kiri-ocr

Kiri OCR 📄


Kiri OCR is a lightweight OCR library for English and Khmer documents. It provides document-level text detection, recognition, and rendering capabilities.

🚀 Try the Live Demo | 📚 Full Documentation


✨ Key Features

  • High Accuracy: Transformer model with hybrid CTC + attention decoder
  • Bilingual: Native support for English and Khmer (and mixed text)
  • Document Processing: Automatic text line and word detection
  • Streaming: Real-time character-by-character output (like LLM streaming)
  • Easy to Use: Simple Python API and CLI

📦 Installation

pip install kiri-ocr

💻 Quick Start

CLI Tool

kiri-ocr document.jpg

Python API

from kiri_ocr import OCR

# Initialize (auto-downloads from Hugging Face)
ocr = OCR()

# Extract text from document
text, results = ocr.extract_text('document.jpg')
print(text)

# Get detailed box-by-box results
for line in results:
    print(f"{line['text']} (confidence: {line['confidence']:.1%})")
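The per-line dictionaries can be post-processed with ordinary Python. The following is a minimal sketch, assuming each entry carries the `text` and `confidence` keys used above. The helper `filter_confident` and the 0.8 threshold are hypothetical and not part of Kiri OCR.

```python
def filter_confident(results, threshold=0.8):
    """Keep only entries whose confidence is at least `threshold`."""
    return [line for line in results if line["confidence"] >= threshold]


# Stand-in data shaped like the `results` list above (values are made up).
sample = [
    {"text": "Hello", "confidence": 0.97},
    {"text": "w0rld", "confidence": 0.41},
]
kept = filter_confident(sample)
print("\n".join(line["text"] for line in kept))
```

In real use, pass the `results` value returned by `ocr.extract_text(...)` instead of `sample`.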

Decoding Methods

Choose the decoding method based on your speed/quality tradeoff:

# Fast (CTC) - Fastest, good for batch processing
ocr = OCR(decode_method="fast")

# Accurate (Decoder) - Balanced speed and quality (default)
ocr = OCR(decode_method="accurate")

# Beam Search - Best quality, slowest
ocr = OCR(decode_method="beam")
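To see the tradeoff on your own documents, you could time each method on the same image. This is a rough sketch rather than a proper benchmark: `time_decode_methods` and its `make_ocr` factory argument are hypothetical names, and a single run per method may include one-off warm-up costs.

```python
import time


def time_decode_methods(make_ocr, image_path, methods=("fast", "accurate", "beam")):
    """Return {method: seconds} for one extract_text call per method."""
    timings = {}
    for method in methods:
        ocr = make_ocr(method)  # model construction is not timed
        start = time.perf_counter()
        ocr.extract_text(image_path)
        timings[method] = time.perf_counter() - start
    return timings
```

With Kiri OCR installed, a call might look like `time_decode_methods(lambda m: OCR(decode_method=m), "document.jpg")`.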

Streaming Recognition

Get character-by-character output like LLM streaming:

from kiri_ocr import OCR

ocr = OCR(decode_method="accurate")

# Stream characters as they're decoded
for chunk in ocr.extract_text_stream_chars('document.jpg'):
    print(chunk['token'], end='', flush=True)
    if chunk['document_finished']:
        print()  # Done!
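If you also want the full text once streaming ends, the chunks can be accumulated as they arrive. This is a minimal sketch, assuming every chunk has the `token` and `document_finished` keys shown above. `collect_stream` is a hypothetical helper, not part of Kiri OCR.

```python
def collect_stream(chunks):
    """Join streamed tokens into one string, stopping when the document is done."""
    parts = []
    for chunk in chunks:
        parts.append(chunk["token"])
        if chunk["document_finished"]:
            break
    return "".join(parts)


# Stand-in chunks shaped like the ones in the loop above (values are made up).
fake_chunks = [
    {"token": "O", "document_finished": False},
    {"token": "K", "document_finished": True},
]
print(collect_stream(fake_chunks))  # prints "OK"
```

In real use, pass the generator returned by `ocr.extract_text_stream_chars('document.jpg')` in place of `fake_chunks`.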

📚 Documentation

Full documentation is available on the Wiki.

📊 Benchmark

Results on synthetic test images (10 popular fonts):

[Figure: Benchmark Graph]

📁 Project Structure

kiri_ocr/
├── core.py               # OCR class
├── model.py              # Transformer model
├── training.py           # Training code
├── cli.py                # Command-line interface
└── detector/             # Text detection
    ├── db/               # DB detector
    └── craft/            # CRAFT detector

☕ Support

If you find this project useful:

Join our Discord Community: https://discord.gg/Vcrw274RVC

⚖️ License

Apache License 2.0
