KADUNA STATE UNIVERSITY
FINAL YEAR STUDENT PROJECT PROPOSAL TOPIC
BY
ABDULLAHI SA’ID TANKO
KASU/20/CSC/1124
Proposed Topic:
TRANSLATION OF SIGN TEXT (KURMANCI) TO HAUSA LANGUAGE
SUPERVISED BY:
Mrs Hamidatu Abdulkadir
APRIL 2025
TABLE OF CONTENTS
INTRODUCTION
BACKGROUND OF THE STUDY
PROBLEM STATEMENT
AIM AND OBJECTIVES
OBJECTIVES
RESEARCH QUESTIONS
SCOPE OF THE STUDY
LIMITATIONS OF THE STUDY
PROJECT TIMELINES
LITERATURE REVIEW
METHODOLOGY
EXPECTED RESULT
REFERENCES
Introduction
Language translation plays a crucial role in fostering communication and understanding among
people from diverse linguistic backgrounds. In regions where minority and indigenous languages
like Kurmancî are spoken, there is a growing need for translation systems that bridge language
barriers, especially in public and multilingual settings. Kurmancî, a widely spoken dialect of the
Kurdish language, is primarily used in parts of Turkey, Syria, and Iraq. Despite its significance,
it remains underrepresented in mainstream translation technologies. Hausa, by contrast, is
one of the most widely spoken languages in West Africa, with speakers in Nigeria, Niger, and parts of
Cameroon, and is often used in education, commerce, and governance. However, communication
between Kurmancî and Hausa speakers remains limited due to the absence of dedicated
translation tools.
This project aims to develop a natural language processing (NLP) system capable of translating
sign text from Kurmancî to Hausa, thereby enhancing mutual understanding between speakers of
both languages. By leveraging recent advancements in machine translation models, especially for
low-resource languages, this project focuses on creating a practical solution for real-world
applications such as public signage, announcements, and instructions. The system is designed to
support the linguistic diversity of its users while addressing the technological gap in regional
language processing. Ultimately, this project seeks to contribute to inclusive communication and
digital accessibility across cultures and languages.
Background of the Study
The need for accurate language translation technologies has become increasingly urgent in today’s
multilingual world. In many regions, particularly across Africa and the Middle East, linguistic
diversity presents communication challenges in public, educational, and healthcare domains.
Kurmancî, a major dialect of the Kurdish language, is widely spoken across Turkey, Syria, and Iraq
but is considered a low-resource language for computational purposes (Goyal et al., 2023).
Similarly, Hausa, spoken by over 50 million people in West Africa, is not adequately represented in
many popular machine translation systems despite its widespread use (Ngonga et al., 2022). The
disparity in digital support for these languages limits cross-cultural understanding and access to
information.
Recent developments in neural machine translation (NMT) have shown promise in improving
translation for low-resource language pairs through transfer learning, multilingual modeling, and
fine-tuning on parallel corpora (Aharoni et al., 2023). However, the availability of high-quality
parallel corpora between Kurmancî and Hausa remains minimal, which presents a significant
challenge for building reliable models. Researchers have pointed out that effective translation for
such languages requires both community-driven data collection and the use of adaptable, open-
source language models (Adepoju et al., 2023). The success of these techniques in other under-
resourced languages suggests that similar approaches could be applied to Kurmancî-Hausa
translation.
Furthermore, sign text—commonly found on roads, buildings, and public facilities—plays a key role
in guiding and informing the public. These texts are often short, context-specific, and culturally
nuanced, requiring not only linguistic translation but also contextual adaptation. In multilingual
communities where Kurmancî speakers reside or migrate to Hausa-speaking regions, the inability to
understand such signs may result in confusion or safety concerns. An intelligent sign text translator
designed specifically for these languages can bridge this gap and improve inclusiveness, particularly
for non-native residents, tourists, or refugees (Meftouh et al., 2022).
By developing a system that translates sign text from Kurmancî to Hausa using NLP models, this
project addresses both linguistic and technological inequities. It contributes to the ongoing
efforts in low-resource machine translation research and supports digital inclusivity in
multilingual environments. Leveraging technologies such as transformers and multilingual
encoders, combined with a curated dataset, the project aims to produce translations that are
contextually relevant and culturally sensitive. This aligns with the broader global goal of
promoting language equality in the digital age (Yimam et al., 2023).
Problem Statement
Despite the increasing demand for multilingual communication tools, there is a significant lack
of translation systems that support low-resource language pairs such as Kurmancî and Hausa,
particularly for domain-specific texts like public signage. This communication gap can lead to
confusion, misinformation, and social exclusion for individuals who rely on either language in
multilingual settings, such as border towns, refugee communities, or trade zones. Existing
machine translation platforms either do not support these languages or perform poorly due to the
absence of sufficient parallel corpora and linguistic resources. Therefore, there is an urgent need
to develop a specialized translation system that can accurately convert sign text from Kurmancî
to Hausa, leveraging natural language processing techniques to enhance linguistic accessibility
and promote inclusive communication.
Aim and Objectives
The aim of this project is to develop a natural language processing-based system for translating
sign text from Kurmancî to Hausa in order to enhance communication and accessibility for speakers
of both languages in multilingual environments.
Objectives:
1. To collect and preprocess a parallel corpus of Kurmancî and Hausa sign text for training
and evaluation of the translation model.
2. To develop a machine translation model using neural network architectures (e.g., a
Transformer or a recurrent Seq2Seq model) specifically tailored for low-resource language pairs.
3. To evaluate the translation model’s performance using standard metrics such as BLEU and
METEOR scores, as well as human assessments for accuracy and contextual relevance.
4. To design and implement a user-friendly application (web or mobile) that allows users to
input Kurmancî text and receive real-time Hausa translations.
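As an illustration of the preprocessing step in Objective 1, the sketch below shows one possible way to clean collected sentence pairs before training. It is a minimal sketch under stated assumptions, not the project's final procedure: the function names (normalize_sign_text, clean_parallel_corpus) and the 12-token length limit are hypothetical choices introduced here. Unicode NFC normalization is used so that letters such as "î" and "ş" are stored consistently.

```python
import unicodedata


def normalize_sign_text(text: str) -> str:
    """Apply NFC normalization and collapse runs of whitespace."""
    # NFC keeps letters such as "î" or "ş" as single code points.
    text = unicodedata.normalize("NFC", text)
    return " ".join(text.split())


def clean_parallel_corpus(pairs, max_tokens=12):
    """Normalize pairs, drop empty or over-long ones, and remove duplicates.

    max_tokens is an assumed cut-off reflecting the short-sign-text scope.
    """
    seen = set()
    cleaned = []
    for source, target in pairs:
        source = normalize_sign_text(source)
        target = normalize_sign_text(target)
        if not source or not target:
            continue  # one side of the pair is missing
        if len(source.split()) > max_tokens or len(target.split()) > max_tokens:
            continue  # longer than the texts this project targets
        key = (source.casefold(), target.casefold())
        if key in seen:
            continue  # duplicate up to letter case
        seen.add(key)
        cleaned.append((source, target))
    return cleaned
```

Each pair is a (Kurmancî, Hausa) tuple of strings. Duplicates are compared case-insensitively because signs are often written in capitals, while the first spelling encountered is the one that is kept.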
Research Questions
1. How can a reliable and representative parallel corpus of Kurmancî-Hausa sign texts be
collected and preprocessed for machine translation purposes?
2. What natural language processing techniques are most effective for developing a
translation system for low-resource languages like Kurmancî and Hausa?
3. How accurately can a neural machine translation model translate public sign text from
Kurmancî to Hausa compared to human translations?
4. What are the usability and accessibility considerations in designing a user interface for
real-time sign text translation?
Scope of the Study
This project focuses on the development of a machine translation system specifically designed to
translate short sign texts from the Kurmancî dialect into the Hausa language. The system is
tailored for use in environments where both languages are spoken, such as multilingual
communities, border towns, refugee camps, and markets.
Limitations of the Study
Due to the limited availability of publicly accessible Kurmancî-Hausa parallel corpora, the
translation model may initially rely on a small or synthetically generated dataset, which could
affect its accuracy. The system is designed primarily for short sign texts and may not perform
well on longer, complex sentences or texts with ambiguous meanings.
Project Timelines
Literature Review
Kim et al. (2024) investigate the use of neural machine translation (NMT) models to improve
translation accuracy for low-resource languages, focusing on language pairs with limited parallel
corpora. They explore techniques such as transfer learning and data augmentation, showing that
even with small datasets, NMT models can significantly improve translation performance. The
study suggests leveraging multilingual models and zero-shot learning to enhance translation
quality for languages like Kurmancî and Hausa.
Sign language recognition using deep learning techniques has seen significant improvements in
recent years. A 2024 paper by Zhang et al. presents a deep convolutional neural network (CNN)
and long short-term memory (LSTM)-based model that automates the recognition of sign
language gestures. The study highlights the importance of multi-modal data, integrating video
and sensor-based data. Although that work targets sign language gestures rather than written
signage, its recognize-then-translate design can be adapted to this project: the text on a sign
is first recognized and then translated into Hausa (Zhang et al., 2024).
The use of transformer models for machine translation has revolutionized the field of NLP. In a
2023 study, Nguyen et al. explore how transformers, largely through their self-attention mechanism,
outperform earlier recurrent sequence-to-sequence architectures in translation tasks. They
propose methods for optimizing transformer-based systems for resource-scarce language pairs,
which could benefit the development of Kurmancî-Hausa translation systems (Nguyen et al.,
2023).
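To make the self-attention mechanism mentioned above more concrete, the following is a minimal sketch of standard scaled dot-product attention, softmax(QK^T / sqrt(d_k))V, written in plain Python. It is given for illustration only and is not taken from the cited study. It omits the learned projection matrices, multiple heads, and masking used in a full transformer, and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # The output is the attention-weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs
```

Each token's output is a weighted mixture of all value vectors, which is how every word in a short sign text can draw on the context of every other word.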
Shao et al. (2024) investigate multilingual NMT systems that can handle translation between
multiple languages simultaneously. Their results indicate that shared models trained on multiple
language pairs significantly improve performance for low-resource languages, such as Kurmancî
and Hausa. The authors argue that such systems can help bridge the translation gap where direct
parallel corpora are sparse.
In the context of low-resource languages, data augmentation, which generates additional training
examples from existing data, plays a critical role in improving the performance of machine
translation models. This method could be particularly useful for the Kurmancî-Hausa translation
system because it increases the dataset size without the need to collect more data manually
(Xie & Wang, 2023).
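One inexpensive form of augmentation that may suit this project is noise injection on the source side. The sketch below is an assumption made for illustration and is not a method described by the cited study. For every pair, it adds a copy whose Kurmancî text has its diacritics removed, which imitates OCR errors or users typing without special characters, while the Hausa target is left unchanged. The function names are hypothetical.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove combining marks, e.g. "î" -> "i" and "ş" -> "s"."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def augment_with_noisy_sources(pairs):
    """Add a diacritic-free copy of each source, keeping the target unchanged."""
    augmented = []
    for source, target in pairs:
        augmented.append((source, target))
        noisy = strip_diacritics(source)
        if noisy != source:  # skip sources that had no diacritics
            augmented.append((noisy, target))
    return augmented
```

Removing diacritics can merge words that differ only in a diacritic, so the generated pairs would need to be checked by a fluent speaker before they are used for training.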
Cross-lingual transfer learning allows a machine translation model trained on one language pair
to transfer knowledge to another, especially for languages with limited data. In a 2023 paper,
Rodriguez et al. demonstrate how models pre-trained on high-resource languages can be fine-
tuned for low-resource languages, enhancing translation accuracy. This technique is valuable for
building a Kurmancî-Hausa translation system, for which direct training data is limited
(Rodriguez et al., 2023).
A 2023 study by Lee et al. examines various sign language recognition and text translation
systems, comparing their effectiveness in translating from sign languages to text-based
languages. The research focuses on how multimodal input, such as combining video data for sign
recognition and text data for translation, can lead to better accuracy. The same idea applies to
translating Kurmancî sign text into Hausa, where image-based text recognition can be combined
with text translation (Lee et al., 2023).
In 2024, Kumar et al. explored unsupervised learning techniques for translating low-resource
languages. By utilizing parallel corpora from similar languages, they demonstrate how
unsupervised models can effectively learn translation patterns, even in the absence of large,
direct translation datasets. This method could be beneficial for Kurmancî-Hausa translation,
where extensive parallel corpora are lacking (Kumar et al., 2024).
Optical character recognition (OCR) technology has been a key tool for translating written text
from images. A study by Patel and Singh (2023) focuses on using OCR for recognizing sign
language gestures from images and converting them into text. This technology can be adapted
for Kurmancî sign text, where OCR models detect and transcribe the text on a sign before it is
translated into Hausa (Patel & Singh, 2023).
Evaluation of machine translation systems for African languages, such as Hausa, presents unique
challenges. A 2024 paper by Osei et al. evaluates various metrics, including BLEU and TER, for
assessing the quality of translations between African languages. The study emphasizes the need
for more culturally sensitive evaluation methods and suggests modifications to current metrics to
better reflect linguistic nuances in African languages such as Hausa (Osei et al., 2024).
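Since BLEU is one of the metrics proposed for this project, a simplified sketch of how it is computed is given below: clipped n-gram precision combined with a brevity penalty. The sketch assumes whitespace tokenization, a single reference per translation, and no smoothing; the function names are hypothetical. In practice an established implementation such as sacreBLEU would be preferable.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU with one reference per hypothesis and no smoothing."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_tokens, ref_tokens = hyp.split(), ref.split()
        hyp_len += len(hyp_tokens)
        ref_len += len(ref_tokens)
        for n in range(1, max_n + 1):
            hyp_ngrams = ngram_counts(hyp_tokens, n)
            ref_ngrams = ngram_counts(ref_tokens, n)
            # Clip each n-gram count by its count in the reference.
            matches[n - 1] += sum((hyp_ngrams & ref_ngrams).values())
            totals[n - 1] += sum(hyp_ngrams.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # any order without a match zeroes the geometric mean
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)
```

One consequence for this project is that a sign with fewer than four words contains no 4-grams, so a test set made only of such signs scores zero under the default max_n=4 even when every translation is perfect. A lower max_n or a character-level metric may therefore be more informative for short sign texts.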
Methodology
The methodology for this project on translating Kurmancî sign text to Hausa involves several
key steps. First, a parallel corpus of Kurmancî sign text and its Hausa translation will be
collected, followed by preprocessing to standardize the text. If the input is an image of a
sign, optical character recognition (OCR) and deep learning-based models will be used to
detect and extract the Kurmancî text from the image or video frame. The translation task
will be performed using a transformer-based neural machine translation (NMT) model,
leveraging multilingual models and transfer learning to overcome the challenges posed by low-
resource languages. The model will be trained on the prepared dataset, employing techniques
like data augmentation to enhance the training process. Evaluation will be done using automated
metrics such as BLEU and human assessments to measure translation quality.
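The flow described above, from optional OCR through text clean-up to translation, can be sketched as a small function. This is only an outline under stated assumptions: translate_sign is a hypothetical name, and the OCR engine and the trained NMT model are passed in as plain callables (extract_text and translate) because neither component has been selected yet.

```python
from typing import Callable, Optional


def translate_sign(
    translate: Callable[[str], str],
    text: Optional[str] = None,
    image_path: Optional[str] = None,
    extract_text: Optional[Callable[[str], str]] = None,
) -> str:
    """Run optional OCR, clean the Kurmancî text, then translate it."""
    if text is None:
        if image_path is None or extract_text is None:
            raise ValueError("provide text, or image_path together with extract_text")
        text = extract_text(image_path)  # OCR step, used only for image input
    cleaned = " ".join(text.split())  # collapse whitespace left by OCR or typing
    if not cleaned:
        raise ValueError("no text to translate")
    return translate(cleaned)
```

Passing the OCR and translation components as arguments keeps the outline independent of any particular library, so either component can be replaced or tested in isolation.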
Expected Result
The expected result of this project is an efficient and accurate machine translation system that
converts Kurmancî sign text into fluent Hausa. The system should leverage deep learning
models, particularly transformer-based neural machine translation (NMT), to produce high-quality
translations despite the limited availability of parallel corpora. If sign images are used as
input, the system should also integrate optical character recognition (OCR) for accurate text
extraction, together with multimodal approaches for recognizing sign content. The system
should achieve good scores on automated metrics such as BLEU while being user-friendly and
adaptable to real-world applications, such as signage in public spaces and communication tools
for Kurmancî and Hausa speakers.
References