This code converts a scanned Farsi PDF document into images and then extracts the text into a text file using the open source Tesseract OCR engine (originally developed at Hewlett-Packard, with development later sponsored by Google). You can use it for any other language by changing the "lang='fas'" parameter of the pytesseract.image_to_string function.
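The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual script: it assumes the pdf2image package (which needs Poppler installed) is used to render the PDF pages, and the function names `ocr_pages` and `pdf_to_text`, the `dpi` value, and the injectable `ocr` argument are hypothetical choices made for this example. Only `pytesseract.image_to_string` with `lang='fas'` comes from the description.

```python
from pathlib import Path


def ocr_pages(images, lang="fas", ocr=None):
    """Run OCR on each page image and join the page texts with blank lines.

    `ocr` is a hypothetical hook for swapping in another OCR callable;
    by default pytesseract.image_to_string is used.
    """
    if ocr is None:
        # Imported lazily: needs the pytesseract package, the Tesseract
        # binary, and the trained data for the requested language.
        import pytesseract
        ocr = pytesseract.image_to_string
    return "\n\n".join(ocr(image, lang=lang).strip() for image in images)


def pdf_to_text(pdf_path, out_path, lang="fas", dpi=300):
    """Render a scanned PDF to page images, OCR them, and save the text."""
    # Assumption: pdf2image does the PDF-to-image step (requires Poppler).
    from pdf2image import convert_from_path

    images = convert_from_path(pdf_path, dpi=dpi)  # one PIL image per page
    text = ocr_pages(images, lang=lang)
    # Write as UTF-8 so Farsi characters are preserved.
    Path(out_path).write_text(text, encoding="utf-8")
    return text
```

With these assumptions, a call such as `pdf_to_text("scan.pdf", "scan.txt")` (hypothetical file names) would process a Farsi document, and passing `lang="eng"` would switch to English, provided the matching Tesseract language data is installed.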
shahmohamadi/PDF_TEXT_OCR