ADDIS ABABA UNIVERSITY
COLLEGE OF NATURAL AND COMPUTATIONAL SCIENCES
DEPARTMENT OF COMPUTER SCIENCE
(Stream of Data & Web Engineering)
Name: Aga Chimdesa        ID: GSE/5565/16
April, 2024
Addis Ababa, Ethiopia
Title: Big Data Governance Framework for Ethio Telecom
Year: 2022
Statement of the problem
Owing to the large number of subscribers connected to its networks on a daily basis, Ethio Telecom
is in a position to collect a large amount of data. However, how a big data governance framework
should be implemented within Ethio Telecom is still unclear. Due to the lack of a proper data
management strategy and big data governance framework, the company has trouble with network
performance monitoring, customer segmentation, customer churn detection, fraud management and
prevention, call drop analysis, network analytics, predictive campaigns, and credit risk analysis.
Currently, no framework is in place to govern big data, and there is considerable confusion within
the organization about who is responsible for big data, who is aware of large datasets, and what
should be done with them. The existing big data governance frameworks are contextual and are not
customized for Ethio Telecom.
The Proposed Big Data Governance Framework
Based on Ethio Telecom's big data practice, a big data governance framework with three domains and
nineteen elements was proposed. These comprise a drive domain (identifying the organizational
structure, big data scope determination, data risk management, strategy development and activity
planning, policies and standards setting, stakeholder selection, and working mechanism
clarification), a governance domain (data quality management, data security and privacy setting,
business case evaluation, data identification, big data collection, big data storage, big data
processing and analysis, and big data visualization), and guideline principles (accountability,
transparency, integrity, and availability). The framework helps Ethio Telecom efficiently achieve
the desired results in innovative data utilization, which frequently leads to an organizational
culture change toward deeply data-driven processes.
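
To make the framework's structure concrete, the following is a minimal sketch that encodes the
three domains and nineteen elements as a plain Python dictionary, for example as the backbone of a
governance checklist; the variable name and checklist framing are illustrative and not part of the
original thesis.

    # Sketch: the proposed three-domain, nineteen-element framework as a
    # Python data structure (naming and usage are illustrative).
    BIG_DATA_GOVERNANCE_FRAMEWORK = {
        "drive": [
            "identify organizational structure",
            "big data scope determination",
            "data risk management",
            "strategy development and activity planning",
            "policies and standards setting",
            "stakeholder selection",
            "working mechanism clarification",
        ],
        "governance": [
            "data quality management",
            "data security and privacy setting",
            "business case evaluation",
            "data identification",
            "big data collection",
            "big data storage",
            "big data processing and analysis",
            "big data visualization",
        ],
        "guideline_principles": [
            "accountability",
            "transparency",
            "integrity",
            "availability",
        ],
    }

    # Sanity check: the framework comprises nineteen elements in total.
    assert sum(len(v) for v in BIG_DATA_GOVERNANCE_FRAMEWORK.values()) == 19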
Evaluation/Justification of the Proposed Framework
An expert validation was performed to evaluate the proposed big data governance framework. The
evaluation results show that the content of the proposed framework is complete, relevant, clear,
and scalable, with means of 4.36, 4.57, 4.14, and 4.57 respectively, so we can conclude that the
evaluators strongly agreed on the completeness, relevance, clarity, and scalability of the proposed
framework. The evaluators also agreed that applying the framework can improve resource utilization,
scalability, and data management, and yield data management benefits (mean 4.64), and that its
implementation fits the organization's problems (mean 4.71), which indicates the validity of
implementing the framework at Ethio Telecom. A descriptive analysis (mean and standard deviation)
of the survey results was computed, and the mean of every evaluation variable was found to be
greater than 3, indicating that the respondents agreed on the clarity, completeness, usefulness,
and correctness of the proposed framework.
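
For illustration, the descriptive analysis described above amounts to computing a mean and standard
deviation per evaluation variable and checking the mean against the neutral point of 3. A minimal
sketch with hypothetical expert ratings follows; only the 5-point scale and the "mean greater than
3 means agreement" interpretation come from the study.

    import statistics

    # Hypothetical 5-point Likert ratings from seven experts; the real
    # survey responses are not reproduced in the thesis summary above.
    ratings = {
        "completeness": [5, 4, 4, 5, 4, 4, 4],
        "relevance":    [5, 5, 4, 5, 4, 5, 4],
        "clarity":      [4, 4, 4, 5, 4, 4, 4],
        "scalability":  [5, 5, 4, 5, 4, 5, 4],
    }

    for variable, scores in ratings.items():
        mean = statistics.mean(scores)
        stdev = statistics.stdev(scores)
        verdict = "agreed" if mean > 3 else "did not agree"
        print(f"{variable}: mean={mean:.2f}, stdev={stdev:.2f} "
              f"-> evaluators {verdict}")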
Title: Web Element Locator Algorithm for Dynamic Web
Application Testing Using Machine Learning
Year: 2021
Statement of the problem
The rapid evolution of web applications causes test cases to break. The main reason for test case
breakage is the failure of element locators in the web application, such as non-selection,
mis-selection, form data problems, and obsolete content problems. The fragility of GUI tests has
long been a problem that troubles those who use test automation. To repair these fragile automated
tests, test engineers must debug and rewrite the broken test cases.
The Proposed Artifact/Solution
The designed system encompasses several components working in collaboration to perform different
tasks, such as scraping, the Machine Learning (ML) model, locating web elements, identifying web
elements, and executing web application tests. The Web Element Locator Algorithm (WELA) is divided
into three phases. The first phase obtains the address of the Application Under Test (AUT). The
second phase identifies changes and locates the web element; change identification is based on the
notion of features, which reflect the degree of similarity between the changed web element and the
previous web element. A linear Support Vector Machine (SVM) classifier is used in the proposed ML
algorithm to generate a new locator for the changed web element, and a well-chosen set of web
element features allows useful ML models to be built. The last phase performs the web application
testing after the target web elements have been identified.
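
A minimal sketch of the kind of linear SVM classification step described above, assuming
scikit-learn is available; the similarity features, training data, and locator strings here are
illustrative, not the features actually used by WELA.

    from sklearn.svm import LinearSVC

    # Each row describes a candidate element by similarity features against
    # the previously recorded element (feature choice is illustrative):
    # [tag_match, id_sim, class_sim, text_sim, xpath_sim]
    X_train = [
        [1, 0.9, 0.8, 0.95, 0.7],   # same element after a logical change
        [1, 0.2, 0.9, 0.85, 0.6],   # same element after a structural change
        [0, 0.1, 0.2, 0.10, 0.2],   # unrelated element
        [1, 0.0, 0.1, 0.05, 0.3],   # unrelated element with the same tag
    ]
    y_train = [1, 1, 0, 0]           # 1 = candidate matches the old element

    clf = LinearSVC()
    clf.fit(X_train, y_train)

    def pick_new_locator(candidates):
        """Return the locator of the first candidate the classifier judges
        to be the changed version of the original element."""
        for features, locator in candidates:
            if clf.predict([features])[0] == 1:
                return locator
        return None

    # Hypothetical usage: candidate features paired with CSS locators.
    print(pick_new_locator([
        ([0, 0.1, 0.3, 0.2, 0.1], "#sidebar a"),
        ([1, 0.8, 0.9, 0.9, 0.8], "#login-btn"),
    ]))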
Evaluation of the Proposed Algorithm
An experiment was performed to test the accuracy and performance of the developed algorithm, WELA,
using Python as the development tool and Selenium as the web application testing tool. The
experiment was run over a sample of ten open-source web applications, where WELA provided a
significant improvement in the time and effort required to locate dynamic web elements during web
application testing. The experiments cover the three kinds of web element changes (logical,
structural, and presentation changes of a web application) and address the web element locator
problems. The performance and accuracy of the proposed algorithm were also compared against
previous works. The results are promising: WELA effectively repairs 97% of broken web test scripts
and generates the test suites with the minimum execution time on the developed versions of a
Dynamic Web Application (DWA).
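
A minimal sketch of how such an experiment could be harnessed in Python with Selenium: replay each
recorded locator on the new version of the application, fall back to a repaired locator when the
original fails, and report the fraction of broken steps that were repaired. The repair_locator
callable stands in for WELA's ML-based locator generation and is hypothetical.

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.common.exceptions import NoSuchElementException

    def run_step(driver, locator):
        """Try to locate a web element with a CSS selector."""
        return driver.find_element(By.CSS_SELECTOR, locator)

    def evaluate(driver, test_steps, repair_locator):
        """Count how many broken locators the repair function fixes."""
        broken = repaired = 0
        for locator in test_steps:
            try:
                run_step(driver, locator)
            except NoSuchElementException:
                broken += 1
                try:
                    run_step(driver, repair_locator(driver, locator))
                    repaired += 1
                except NoSuchElementException:
                    pass
        return repaired / broken if broken else 1.0

    # Hypothetical usage against a locally served application under test:
    # driver = webdriver.Chrome()
    # driver.get("http://localhost:8080")
    # rate = evaluate(driver, ["#login", ".nav-item", "#submit"], my_repairer)
    # print(f"repair rate: {rate:.0%}")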
Title: Amharic Text to Ethiopian Sign Language Translation Model
using Factored Phrase-Based Statistical Machine Translation Approach
Year: 2021
Statement of the problem
While substantial progress has already been made on text-to-sign-language translation systems for
other countries, Amharic text to Ethiopian Sign Language machine translation has received less
attention from researchers. Data-driven machine translation approaches such as statistical machine
translation and neural machine translation have become more dependable and efficient as computing
power has increased exponentially. To address this research gap, this work uses a factored
phrase-based statistical machine translation approach to develop a model that automatically
translates Amharic text to Ethiopian Sign Language (EthSL). Building a translation system that can
render statements in real time via a signing avatar is helpful for the Deaf community.
The Proposed Artifact
The architecture of the proposed system is based on three main modules that were trained on
different corpora. The first module, an Amharic PoS tagger, was trained on a relatively large PoS
corpus using a state-of-the-art deep learning network, BiLSTM. The second module is a factored
statistical machine translation model composed of three essential components: a translation model,
a language model, and a decoder. The translation model assigns a translation probability, that is,
the faithfulness of the EthSL output as a translation of the Amharic input. The language model
measures the fluency of the translated Ethiopian Sign Language sentences. The decoder searches for
the highest-scoring Ethiopian Sign Language sentence given the corresponding Amharic sentence. The
third module is a video mapping module, which takes the translated text as input and finds a match
for each word in the video corpus using a simple string-matching algorithm between the transformed
text and the labels of the videos.
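
A minimal sketch of the video mapping module's string matching, under the assumption that the video
corpus is stored as a dictionary keyed by word labels; the words and file names are illustrative.

    # Illustrative video corpus: each sign video is labeled with the word
    # it renders (labels and paths are placeholders, not the real corpus).
    VIDEO_CORPUS = {
        "ሰላም": "videos/selam.mp4",
        "ትምህርት": "videos/timhirt.mp4",
        "ቤት": "videos/bet.mp4",
    }

    def map_to_videos(translated_text):
        """Match each word of the translated EthSL text against the video
        labels by simple string comparison; unmatched words are returned
        unchanged."""
        return [VIDEO_CORPUS.get(word, word)
                for word in translated_text.split()]

    print(map_to_videos("ሰላም ቤት"))  # ['videos/selam.mp4', 'videos/bet.mp4']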
Evaluation of the Proposed Model
First, to conduct the experiment, the Amharic tagger was trained on the Google Colab cloud service
from Google Research, which allows a sizeable deep-learning-based PoS tagger to be trained on a
free GPU. Three experiments were conducted on the collected parallel corpus to evaluate and compare
the system under three different approaches. The first experiment used a standard phrase-based
statistical approach as the baseline model and achieved a BLEU score of 35.2 and a NIST score of
6.13. The second experiment used the factored phrase-based approach and achieved a BLEU score of
35.28 and a NIST score of 6.16. The third experiment used a neural machine translation approach and
achieved a BLEU score of 26.38 and a NIST score of 5.64. All the developed models were evaluated on
the same test set, which contains 519 sentences. The evaluation scores clearly demonstrate that
factored statistical translation for Amharic-to-EthSL MT improves over both the standard
phrase-based statistical machine translation and the neural machine translation trained on a plain
corpus.
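
For reference, a BLEU score like those above can be computed over a tokenized test set with NLTK; a
minimal sketch with toy sentences (not drawn from the thesis corpus) follows.

    from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

    # Each hypothesis is scored against a list of tokenized reference
    # translations; the sentences here are toy placeholders.
    references = [
        [["ሰላም", "ትምህርት", "ቤት"]],
        [["እኔ", "ቤት", "እሄዳለሁ"]],
    ]
    hypotheses = [
        ["ሰላም", "ትምህርት", "ቤት"],
        ["እኔ", "ቤት", "ሄድኩ"],
    ]

    smooth = SmoothingFunction().method1  # avoids zero scores on short sentences
    bleu = corpus_bleu(references, hypotheses, smoothing_function=smooth)
    print(f"BLEU: {bleu * 100:.2f}")      # reported on a 0-100 scale, as above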