Comparative analysis of optical character recognition methods for Sámi texts from the National Library of Norway
… Sámi alphabet, and our OCR models could improve upon NLN’s transcription for Inari Sámi.
… and are good candidates for a re-OCR process. If transcription accuracy is the main focus, …
Making Old Kurdish publications processable by augmenting available optical character recognition engines
B Yaseen, H Hassani - arXiv preprint arXiv:2404.06101, 2024 - arxiv.org
… The GT4HistOCR dataset is particularly well-suited for training advanced recognition models
in OCR software that utilize recurrent neural networks, specifically the LSTM architecture, …
Challenges in OCR today: Report on experiences from INEL
N Partanen - Электронная Письменность Народов Российской …, 2017 - elibrary.ru
… Optical Character Recognition (henceforth OCR) tools are … more systematic comparisons of
different OCR tools available; see… This is also a very particular case for OCR, since the scripts …
Developing technologies for the documentation and description of the low-resource Uralic languages Zyrian Komi and North Saami
… In our own projects, we have developed language technologies focusing on low-resource
scenarios, specifically for the two Uralic languages Zyrian Komi and North Saami. In addition …
An unsupervised method for OCR post-correction and spelling normalisation for Finnish
Q Duong, M Hämäläinen… - Proceedings of the 23rd …, 2021 - aclanthology.org
… to contain errors introduced by OCR (optical character recognition) methods used in the …
to-sequence NMT (neural machine translation) model to conduct OCR error correction designed …
Proceedings of the Fifth International Workshop on Computational Linguistics for Uralic Languages
… , neural models, language documentation, tokenisation, corpora and lexicons, optical character
recognition, … , Hungarian and Estonian as well as North Sámi, Livonian and Votic, and …
The world's first South Sámi TTS – a revitalisation effort of an endangered language by reviving a legacy voice
K Hiovain-Asikainen, TB Kjærstad… - Proceedings of the …, 2025 - aclanthology.org
… To reach an end-user suitable quality of the TTS, we have used a neural, end-to-end … ),
which was scanned and OCR-processed for the project. Additional usable material came …
Endangered languages are not low-resourced!
M Hämäläinen - arXiv preprint arXiv:2103.09567, 2021 - arxiv.org
… versus neural networks. Why would you write rules for an endangered language if neural …
Multilingual dependency parsing for low-resource languages: Case studies on North Saami …
An OCR system for the unified northern alphabet
N Partanen, M Rießler - … of the Fifth International Workshop on …, 2019 - aclanthology.org
… indeed, has had only one fourth of the exposure to the Kildin Saami special characters that
the monolingual Kildin Saami model received. Nevertheless, the results are very close. Even …
Automatic Correction of Text Using Probabilistic Error Approach
P Nagaraj, V Muneeswaran, N Ghous… - 2023 International …, 2023 - ieeexplore.ieee.org
… workplace software, and optical character recognition (OCR) … —Estonian, North Sámi, and
South Sámi— spelling … test set, this work shows that neurological embeddings can be used to …