D Project
April 2025
DECLARATIONS
By student;
I hereby declare that this thesis is my original work and has not been presented for the award of a degree or any other award in any other university.
Name……………………………………..
Date…………………………………………………
By supervisor;
I confirm that the work reported in this thesis has been carried out by the student under my supervision.
Name……………………………………………….
Date……………………………………………………
DEDICATIONS
I dedicate this thesis to my dear parents, my friends and my beloved course mates, who have supported me both socially and physically.
I also dedicate it to my esteemed school, Maasai Mara University, for enabling me to acquire the skills and knowledge I used in this thesis. I am very grateful for its contribution.
ACKNOWLEDGEMENTS
First, I thank the Almighty God for the support and the gift of life that He has granted me throughout my period of study at Maasai Mara University.
Secondly, I am grateful to my beloved school, Maasai Mara University, for the opportunity to be one of her students.
Thirdly, I thank my supervisor, Mr. Herbert Imboga, who was in charge of our research project, for his full support and step-by-step guidance until my research was complete.
Lastly, I would like to thank my family, friends, relatives and classmates for the full support they offered me socially, financially and physically.
ABSTRACT
Infant mortality is the death of infants before they attain one year of age. It remains a critical public health issue, with significant implications for social, economic and health systems worldwide. It is influenced by factors such as maternal health, neonatal care, socioeconomic status and cultural belief systems.
The study employed primary data, collected through interviews with eligible mothers, to predict infant mortality.
Machine learning algorithms such as logistic regression, random forest and k-nearest neighbor have demonstrated promising results in various predictive modelling applications across health systems.
The study employed supervised machine learning approaches to predict infant mortality in Narok County. The classification methods used were Logistic Regression, K-Nearest Neighbor, Gradient Boosting and Random Forest.
Determinant factors contributing to infant deaths were also identified and categorized as social, economic, cultural and environmental factors.
The chi-square test was used to test for independence between the key predictors and infant mortality, and it was found that all the predictors identified contributed significantly to infant deaths.
Table of Contents
DECLARATIONS
DEDICATIONS
ACKNOWLEDGEMENTS
ABSTRACT
CHAPTER ONE
INTRODUCTION
1.2 Problem Statement
1.3. Machine learning
1.4. Objectives of Study.
1.4.1 General Objective of study.
1.4.2 Specific Objective of Study.
1.5 Research Questions.
1.6 Significance of study.
1.7 Scope and limitations of study.
Limitations of study.
CHAPTER TWO
REVIEW OF LITERATURE.
2.1 Introductions.
2.2 Concept of Infant Mortality.
2.3 Social Determinants of Infant Mortality.
2.4 Economic Determinants of Infant Mortality
2.5 Cultural Determinants of Infant Mortality.
2.6 Environmental Determinants of Infant Mortality.
2.7 Literature Gap
2.8 Conceptual framework.
CHAPTER THREE
RESEARCH DESIGN AND METHODOLOGY
3.1 Research Design
3.2 Study variable and measurements.
3.3 Supervised Machine Learning methods.
3.3.1 Logistic Regression
3.3.2 Random Forest
3.4 Training and testing data.
3.5 Performance Evaluation.
3.5.1 Confusion Matrix.
3.6 Target population.
3.7 Sampling techniques and sample sizes.
3.7.1 Sampling Techniques
3.7.2 Sample size.
3.8 Research Instruments.
3.9 Data collection Procedure.
CHAPTER FOUR
FINDINGS AND DISCUSSION OF STUDY
4.1 Response Return Rate
4.2. Demographic Information.
4.2.1 Age of the respondents
4.2.2 Infant Mortality
Table 4.3: Gender
4.3. Factors of Determinants of Infant Mortality
4.3.2 Cultural factors influencing infant mortality rates.
4.3.3 Environmental factors influencing infant mortality rates.
4.3.4 Economic Determinants influencing infant mortality rates.
CHAPTER FIVE
DISCUSSION AND CONCLUSION.
5.2 CONCLUSION.
5.3. Recommendations
REFERENCES
APPENDIX II: HOUSEHOLD QUESTIONNAIRE
CHAPTER ONE
INTRODUCTION
Infant mortality is the death of children under one year of age. It is assessed by the infant mortality rate (IMR).
Infant mortality is one of the most sensitive indicators of population health and reflects a country's social, economic and environmental conditions.
The World Health Organization (2017) reported that the risk of an infant dying before one year of age is highest in the African Region. In the Kenyan context, the 2009 Population and Housing Census enumerated the population at 38.6 million. According to the National Council for Population and Development (NCPD), the data trend indicated that the total population tripled between 1969 and 1999, with about 1 million infants born yearly.
Rural-to-urban migration in some parts of Kenya, Narok County included, has been stirred by the search for employment and settlement and has resulted in urban growth. This rapid migration has to some extent created disparities in the economic, social, cultural and environmental status of people in the county, as acknowledged by boundary changes and the classification of society.
Rapid growth of the urban population in a region has been found to go together with high rates of poverty and poor health conditions. This has contributed to higher infant mortality rates in slums and rural areas than in more urban areas.
According to World Health Organization (WHO) data of 2017, the IMR decreased from 65 deaths per 1000 live births in 1990 to 29 deaths per 1000 live births in 2017.
Nevertheless, infant mortality remains a critical challenge in Africa, where high mortality levels, especially among infants, have been frequently documented.
In Kenya, the Kenya Service Provision Assessment (KSPA) revealed that most health care providers are not caring for sick children holistically, but are treating children only for the presenting illness. The assessment looked at factors associated with promoting child health through a holistic approach, such as the Integrated Management of Childhood Illness (IMCI) strategy for managing sick children. The IMCI aims to reduce morbidity and infant mortality by implementing three main components:
a) Improving health workers' skills in case management
b) Improving health systems
c) Improving community child health practices.
Researchers have found that infant mortality is affected by environmental, political, economic and cultural factors.
According to the World Health Organization (WHO), infant mortality remains a major concern in many parts of the world. In developing countries, high levels of infant mortality have been a serious problem.
Previous studies have investigated several predictors of infant mortality in Kenya. For example, [Montel et al., 2016] reported socioeconomic determinants of infant mortality using the KNHS dataset of 2003.
Although many studies have been carried out to identify factors contributing to infant mortality in Kenya, few have employed machine learning techniques to predict risk factors of infant mortality. According to [Montel 2018], machine learning provides solutions to a wide range of problems in vision, speech and health.
This study aims to determine risk factors of infant mortality using the best-performing supervised machine learning models, based on primary data collected from eligible women of reproductive age in Narok County.
1.3. Machine learning
Machine learning (ML) is a subset of artificial intelligence that focuses on the development of algorithms and statistical models that allow computers to perform specific tasks without explicit instructions. Instead, the system learns from data patterns and makes predictions or decisions based on that data. In the context of data analysis, machine learning plays a crucial role by enabling the extraction of insights, the identification of patterns and the making of predictions from large datasets. The main types of machine learning are:
1. Supervised Learning:
In supervised learning, the model is trained on labeled data, meaning that the input data is
paired with the correct output.
Functionality: The model learns the relationship between the input features and the output
labels. Once trained, it can make predictions on new, unseen data.
Common algorithms include linear regression, logistic regression, decision trees and gradient
boosting.
2. Unsupervised Learning:
Unlike supervised learning, unsupervised learning deals with unlabeled data. The model must
identify underlying patterns and group structures in the data without pre-existing labels.
Functionality: The system tries to learn the structure and distribution of the data, often for the
purpose of clustering or association.
3. Semi-Supervised Learning:
This approach combines both labeled and unlabeled data for training. It is particularly useful
when obtaining labeled data is costly or time-consuming.
Functionality: The model leverages the available labeled data to better structure the
understanding of the unlabeled data.
Common Algorithms: Variations of supervised algorithms that can incorporate unlabeled data
(e.g., semi-supervised SVM).
4. Reinforcement Learning:
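In reinforcement learning, an agent learns by interacting with an environment and receiving rewards or penalties for its actions.
Functionality: The model learns to maximize some notion of cumulative reward through exploration and exploitation.
Common algorithms include Q-learning, Deep Q-Networks and policy gradient methods.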
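To make the supervised learning setting described above concrete, the sketch below implements a minimal k-nearest neighbor classifier, one of the methods named in this study. The function name `knn_predict`, the toy feature values and the choice of k = 3 are illustrative assumptions and are not the study's data or code.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every labeled training sample
    distances = [
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    ]
    # Keep the k closest samples and count their labels
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two numeric features per sample, label 1 = alive, 0 = dead
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
train_y = [1, 1, 1, 0, 0, 0]

print(knn_predict(train_X, train_y, (1.1, 0.9)))  # query near the first cluster
print(knn_predict(train_X, train_y, (4.0, 4.0)))  # query near the second cluster
```

The labeled pairs play the role of the training data, and each new query is classified from the labels of the samples it most resembles.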
1.4. Objectives of Study.
1.4.1 General Objective of study.
To develop and compare various machine learning models for predicting infant mortality rates.
1.5 Research Questions.
• Which machine learning model is best for predicting infant mortality rates?
• What are the key determinants influencing infant mortality rates in Narok County?
1.6 Significance of study.
Mortality is a key driver of population change, and understanding the effect of infant mortality is crucial for policy makers and for establishing programs to reduce it.
The identified determinants of infant mortality will also help governmental and non-governmental institutions within Narok County implement health programs aimed at reducing infant mortality rates. This could lead to affordable infant programs such as vaccination and intensive care for newborns and their mothers.
1.7 Scope and limitations of study.
• The study employed primary data to develop a predictive model of infant mortality and to identify key determinants contributing to high infant mortality rates.
• Machine learning approaches such as Logistic Regression, Gradient Boosting, K-Nearest Neighbor and Random Forest were employed for model building and evaluation.
• The effectiveness of the models was assessed with metrics such as accuracy, precision and area under the ROC curve (AUC-ROC).
Limitations of study.
• It was not possible to cover the whole county due to limited resources and time.
• Some questionnaires were never recovered from respondents.
• The cost of conducting the research was high.
CHAPTER TWO
REVIEW OF LITERATURE.
2.1 Introductions.
In this chapter, relevant literature is reviewed. It covers the concept of IMR and the social, cultural, economic and environmental determinants of infant mortality.
2.2 Concept of Infant Mortality.
Infant mortality has long been regarded as an unavoidable aspect of human experience. Its control has been one of society's most remarkable accomplishments in the effort to manage the environment for human benefit.
Fedele and Stefaneli (2017) found that most infant deaths are caused by infections that are easily preventable with proven, affordable and high-quality interventions. They noted that infections and neonatal complications account for the vast majority of infant deaths.
Lan and Tavros (2017) observed that high infant mortality is not perfectly correlated with average income in each country.
McGough (2017), in a study of disease transmission and community health, was of the view that the contribution of hereditary anomalies to infant mortality had decreased. However, this differed in some regions, such as Central and Latin America and Eastern Europe, where abnormalities of the heart and the central nervous system were associated with high infant deaths, which was attributed to natural differences.
Miller (2017) observed that high infant mortality explains why most governments, mainly in developing countries, seek solutions to reduce the IMR to the lowest level they can afford. To do this, they need to know exactly which changes have contributed to the high IMR and which factors, if any, must be addressed.
Osuorah (2017) found that low birth weight accounts for 60-80 percent of infant mortality in developing countries. The lowest mortality rates were observed among infants weighing between 3 kg and 3.5 kg, while those weighing less had a higher IMR. In the African situation, although the level of IMR has decreased considerably, Paget (2019) argued that the level of mortality in Africa is still among the highest in the world.
Using WHO (2018) information, the IMR was estimated at 42 per 1000 live births worldwide, while Kenya's IMR was 39 per 1000 live births. The IMR in Kenya decreased from 2009 to 2019: it was 40 per 1000 live births in 2009, 39 in 2010, 35 in 2015, 32 in 2018 and 31 in 2019. This motivated research into the factors attributed to high infant mortality in Narok County.
2.3 Social Determinants of Infant Mortality.
Empirical studies have shown that literate mothers have healthier infants, and that their children are more likely to live longer than those of illiterate mothers. A study by Taramsari (2018) found that the IMR was highest in countries with greater economic inequalities, which makes social class a major factor. The study showed that lower household income was associated with high infant mortality. Working-class mothers had low infant mortality since they could provide for their young ones. Women working in industries, however, had less time to take care of their young ones, which exposed their infants to a greater risk of infection.
A woman's level of education has been observed to be an important factor behind a nation's IMR. Hossain (2018) demonstrated that infants born to illiterate mothers had higher chances of dying during their first month than those born to literate mothers. The post-neonatal mortality rate was also higher among illiterate mothers than among literate ones. The study concluded that there was an inverse relationship between mothers' level of education and infant mortality.
In Kenya, a study by Salawu (2020) found that youthful mothers were exposed to complications during delivery, hence a higher IMR. Young mothers had a higher rate of infant mortality than older mothers, who had experience in motherhood and proper feeding. These factors assisted in evaluating the causes of infant mortality in Narok County.
2.4 Economic Determinants of Infant Mortality
The relationship between IMR and level of income was examined with a specific focus in research by Pabayo (2019). According to a study by Tang (2019), the connection between the infant mortality rate and income level is an important indicator of a country's level of income, so that infant survival can be viewed as an element of a country's level of development. An examination of the relationship between parents' socioeconomic status and infant mortality revealed that the education and socioeconomic status of parents had a significant role in the decline of IMR.
A study by Haldar (2019) looked into the causes of infant mortality and found that an increase in Gross Domestic Product (GDP) causes a decrease in IMR. Infant mortality associated with birth defects was inversely proportional to GDP. There was therefore significant evidence that congenital abnormalities accounted for a growing share of infant deaths in both developing and developed nations. However, infant deaths were higher in poor nations than in wealthier nations.
Most of the empirical studies reviewed focus on socio-economic factors affecting infant mortality, while the current study looks at economic factors to establish their relationship with infant mortality in Narok County.
2.5 Cultural Determinants of Infant Mortality.
Several studies have addressed cultural factors affecting IMR in sub-Saharan Africa. Among them is the work of Harkness and Super (2020) on the factors behind higher infant mortality in society. Most of the studies indicated that cultural factors are embedded in the wide range of ethnic and religious beliefs that exist in sub-Saharan Africa. Most cultural beliefs and practices have some bearing on whether mothers seek medical care during pregnancy and when their newborns are sick.
According to a study by Dhingra and Pingali (2021), in most societies in sub-Saharan Africa a woman's social status is culturally tied to whether she can conceive and give birth, which is also seen as improving her prospects for marriage. High fertility results in short intervals between pregnancies, which puts both infants and mothers at risk and in many cases leads to death. They also found that in cultures where young people under 18 years old engage in unsafe sex before marriage, there are complications during delivery because most of them do not seek health care, which could lead to high infant mortality.
2.6 Environmental Determinants of Infant Mortality.
Research has shown that environmental and social barriers prevent access to medical resources, which contributes to a higher IMR. According to Domnaru (2015), infant mortality in developing countries was due to infections, premature births, complications during delivery and birth injuries. However, Domnaru (2015) observed that many of these causes can be prevented at very low cost.
The WHO (2018) report indicated that the ten leading mortality risks contributing to higher infant mortality in developing countries include:
• Dirty water
• Poor sanitation
• Smoke from carbon fuels.
A study by Vakili (2015) reported that declines in the diseases behind infant mortality are linked to several common trends, including scientific development and social programs. The scholars felt that the drivers of this decline could include improved sanitation and, especially, access to safe drinking water, which would dramatically reduce fatal infant diseases. Pasteurization of milk and improved living standards in urban settings would also help, as would increased education and awareness regarding infant mortality in the region.
The reviewed empirical literature indicates that mothers with formal education experience a lower risk of infant mortality than those with little education. Findings on economic status were inconsistent: in some of the studies reviewed, families with accumulated wealth had fewer cases of infant mortality because they could provide better healthcare and diet, while other studies indicated that mothers from such households had higher chances of infant mortality due to improper breastfeeding methods.
Findings on environmental factors were consistent: areas with cleaner water sources had lower rates of infant mortality. Kenya and countries in Asia are still struggling to improve sanitation, and poor sanitation threatens health, especially that of vulnerable infants. On cultural factors, the reviewed literature indicated that the work status, education and exposure of mothers influenced the rate of infant mortality.
Though many studies have been carried out to identify factors contributing to infant mortality rates, very few have employed machine learning techniques to predict them. From the reviewed literature, the following gaps were identified.
a. Under-utilization of machine learning. Few studies have effectively harnessed machine learning techniques, which could more accurately capture non-linear relationships and complex interactions among key predictors of infant deaths.
b. Regional disparities. Existing literature tends to emphasize developed regions, thereby neglecting low- and middle-income countries where infant mortality is high.
With these gaps identified, the project aims to fill them by developing a predictive model of infant mortality using machine learning methods and by identifying the key determinants contributing to infant mortality rates. This will help healthcare practitioners implement the policies necessary to reduce infant deaths.
2.8 Conceptual framework.
This study adopted a modified conceptual framework. The conceptual framework helps to demonstrate the relationship between the independent and dependent variables. Each component of the independent variables is critical in determining the infant mortality rate among the communities living in Narok County.
In this conceptual framework, the independent variables considered were the social, economic, cultural and environmental determinants of infant mortality. All these determinants had an impact on the infant mortality rate within the study area because they are all considered direct influences on it.
The moderating variables that enhance the influence of the independent variables on the infant mortality rate were maternal reproductive health and attitude towards primary health services. Good maternal reproductive health and a positive attitude towards primary health services lower the IMR, whereas the reverse is also true.
CHAPTER THREE
RESEARCH DESIGN AND METHODOLOGY
3.1 Research Design
A research design provides an outline for data collection and analysis. The design used in this study is a mixed methods research design, which combines qualitative and quantitative approaches to strengthen the study overall. Within it, a descriptive survey was chosen because it involves collecting quantitative data to describe the current status of infant mortality in Narok County. The study collected data from health workers and mothers on factors contributing to the infant mortality rate in Narok County. The data collection tools used were questionnaires and an interview guide.
3.2 Study variable and measurements.
In this study, the outcome of interest was infant mortality, measured as a binary outcome: alive (coded as 1) or dead (coded as 0).
Factors relating to infant mortality were grouped into four categories, namely social, economic, environmental and cultural factors.
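3.3 Supervised Machine Learning methods.
3.3.1 Logistic Regression
Logistic regression is the analysis to conduct when the dependent variable is binary, i.e. 0 or 1. Here, the independent variables (x) are combined linearly using coefficient values (weights) to predict the dependent variable (y) through the logit function, where p is the probability of the event:
$$\operatorname{logit}(p) = \ln\left(\frac{p}{1-p}\right)$$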
• Step 1; Take a set of (input, output) training pairs, with input features x1, x2, …, xn and output y.
• Step 2; Let the log-odds of p(x) be a linear function of x, so that every unit increment of a component of x adds or subtracts a fixed amount to the log-odds.
• Step 3; Calculate the odds in favor of a particular event, p/(1-p).
• Step 4; Define the logit function as the logarithm of the odds, i.e. logit(p) = log(p/(1-p)).
• Step 5; The logit function takes input values in the range 0 to 1 and transforms them to values over the entire real number range, which expresses a linear relationship between feature values and the log-odds.
• Step 6; Predict the probability using the logistic function, the inverse of the logit, in order to assign the class.
• Step 7; Output: a set of weights (wi), one for each feature, whose linear combination predicts the value of y.
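The steps above can be sketched in a few lines of Python. This is a minimal illustration only: the text does not say how the weights were fitted, so batch gradient descent, the learning rate, the number of epochs and the one-feature toy data are all assumptions, and the function names are hypothetical.

```python
import math

def sigmoid(z):
    """Logistic function: maps a log-odds value z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Logarithm of the odds p / (1 - p); the inverse of the logistic function."""
    return math.log(p / (1.0 - p))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights w and intercept b by batch gradient descent (an assumed fitting method)."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Predicted probability from the linear combination of the features
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            error = p - yi
            for j in range(n_features):
                grad_w[j] += error * xi[j]
            grad_b += error
        # Move against the average gradient of the log-loss
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict(w, b, x, threshold=0.5):
    """Assign class 1 when the predicted probability reaches the threshold, else class 0."""
    p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
    return 1 if p >= threshold else 0

# Hypothetical toy data: one numeric feature, binary outcome
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(X, y)
print(predict(w, b, [0.0]), predict(w, b, [5.0]))
```

Here `logit` corresponds to Steps 3 to 5, `sigmoid` and `predict` to Step 6, and the returned weights to Step 7.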
3.3.2 Random Forest
Random Forest builds multiple decision trees, each with an element of randomness, and merges them to get a more accurate and stable prediction.
Input: a set of (input, output) training pairs, with input features (x1, x2, …, xn) and output y.
Step 1; Randomly select “k” features from the total of “m” features in the dataset, where k<m.
Step 2; Among the “k” features, calculate the node “d” using the best split point.
Step 3; Split the node into daughter nodes using the best split.
Step 4; Repeat steps 1 to 3 until “i” number of nodes has been reached.
Step 5; Build the forest by repeating steps 1 to 4 “n” times to create “n” trees.
Output; The forest combines the predictions of all the trees, by averaging or by majority vote, which cancels out individual biases. It attains its performance by choosing the best feature from a random subset at each split rather than always using the most important feature.
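As a rough illustration of the procedure above, the sketch below builds a deliberately simplified forest in which every tree has a single split (a decision stump) rather than growing to “i” nodes. The bootstrap resampling of rows, the Gini impurity criterion, the majority vote and all function names and toy values are assumptions made for the example, not details taken from the study.

```python
import random
from collections import Counter

def majority(labels):
    """Most common label in a list."""
    return Counter(labels).most_common(1)[0][0]

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_stump(X, y, feature_ids):
    """Return (feature, threshold, left_label, right_label) with the lowest weighted Gini."""
    # Fallback: a constant leaf, kept when no split improves on the unsplit node
    best_score = gini(y)
    best = (feature_ids[0], float("inf"), majority(y), majority(y))
    for f in feature_ids:
        values = sorted({row[f] for row in X})
        # Candidate thresholds are midpoints between consecutive observed values
        for low, high in zip(values, values[1:]):
            t = (low + high) / 2
            left = [label for row, label in zip(X, y) if row[f] <= t]
            right = [label for row, label in zip(X, y) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best_score:
                best_score = score
                best = (f, t, majority(left), majority(right))
    return best

def fit_forest(X, y, n_trees=25, k=1, seed=0):
    """Train n_trees stumps, each on a bootstrap sample using k randomly chosen features."""
    rng = random.Random(seed)
    m = len(X[0])
    forest = []
    for _ in range(n_trees):
        rows = [rng.randrange(len(X)) for _ in range(len(X))]  # bootstrap sample of rows
        features = rng.sample(range(m), k)                     # k of the m features (Step 1)
        forest.append(best_stump([X[i] for i in rows], [y[i] for i in rows], features))
    return forest

def predict_forest(forest, x):
    """Majority vote over the predictions of all stumps."""
    votes = Counter(left if x[f] <= t else right for f, t, left, right in forest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two numeric features, label 1 = alive, 0 = dead
X = [(1, 1), (2, 1), (1, 2), (2, 2), (8, 9), (9, 8), (8, 8), (9, 9)]
y = [0, 0, 0, 0, 1, 1, 1, 1]
forest = fit_forest(X, y)
print(predict_forest(forest, (1, 1)), predict_forest(forest, (9, 9)))
```

A full implementation would grow deeper trees and re-sample features at every node; the stump version is kept short so that the random feature selection and the voting step stay visible.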
3.4 Training and testing data.
A randomized 80%/20% split was done: 80% of the total sample (0.8 × 342 ≈ 274) was used as training data to fit the models, and the remaining 20% (0.2 × 342 ≈ 68) was used as test data to measure model performance.
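A split of this kind could be reproduced along the following lines. The helper name `train_test_split`, the seed and the placeholder record IDs are assumptions for illustration; the study does not state which tool performed the split.

```python
import random

def train_test_split(records, train_fraction=0.8, seed=42):
    """Shuffle the records and split them into training and test subsets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

# Placeholder IDs standing in for the 342 returned questionnaires
records = list(range(342))
train, test = train_test_split(records)
print(len(train), len(test))  # 274 68
```

Fixing the seed makes the split repeatable, so that every model is compared on the same test records.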
3.5 Performance Evaluation.
Algorithm evaluation is mostly judged by prediction accuracy. The most widely used technique for summarizing the performance of a supervised machine learning model is the confusion matrix.
3.5.1 Confusion Matrix.
A confusion matrix is a table used to describe the performance of a classification model on a set of test data for which the true values are known.
Model accuracy metrics show how well the model performs in prediction. Here, with dead cases treated as positive and alive cases as negative, the metrics were;
o Sensitivity; Refers to the proportion of dead cases that receive a positive result.
o Specificity; Refers to the proportion of alive cases that receive a negative result.
o Positive predictive value; Refers to the proportion of positive results that are true positives (truly dead). Negative predictive value refers to the proportion of negative results that are true negatives (truly alive).
o Accuracy; Defined as the fraction of predictions that the model got correct. With TP, TN, FP and FN denoting the numbers of true positives, true negatives, false positives and false negatives, the accuracy equation is given as;
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
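The metrics listed above can be computed directly from the four cells of the confusion matrix. In the sketch below the function names and the ten toy predictions are hypothetical; the coding (alive = 1, dead = 0) follows the text, and dead is assumed to be the positive class.

```python
def confusion_counts(y_true, y_pred, positive=0):
    """Count true/false positives and negatives, treating `positive` (dead = 0) as the positive class."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        else:
            if actual == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn

def ratio(numerator, denominator):
    """Safe division: returns NaN when a metric is undefined (zero denominator)."""
    return numerator / denominator if denominator else float("nan")

def classification_metrics(y_true, y_pred, positive=0):
    """Sensitivity, specificity, predictive values and accuracy from the confusion matrix."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }

# Hypothetical toy labels: 0 = dead, 1 = alive
y_true = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 1, 1, 1, 0, 1]
print(classification_metrics(y_true, y_pred))
```

For this toy example the counts are TP = 2, FN = 1, FP = 1 and TN = 6, giving an accuracy of 8/10.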
3.6 Target population.
The study targeted all mothers within the county of reproductive age (18-49 years) and some key health informants.
3.7 Sampling techniques and sample sizes.
3.7.1 Sampling Techniques
The study used both probability and non-probability procedures to select the sample. Non-probability sampling is used when the researcher intends to get information from sources with significant information for the study (here, interviewing health informants). Probability sampling is used when participants are selected from the population by chance (here, collecting data from mothers by means of questionnaires). For the mothers, the study used simple random sampling, which is a probability sampling technique.
The study targeted all mothers of reproductive age (18 to 49 years) within the county and
health officials who provided additional information concerning IMR in the County. A
sample of 381 mothers was given questionnaires and 4 key health officials were
interviewed directly, making a total sample of 385. Of the 381 questionnaires, only 342
were returned.
Research instruments are the means by which primary data is collected from the sample. In this
study, data was collected using both questionnaires and an interview schedule. Questionnaires
were used to collect data from mothers of reproductive age (18-49 years),
while interviews were administered to health informants.
Health officials were visited at their various offices and data concerning IMR was collected
from them through face-to-face interviews.
CHAPTER FOUR
This chapter reports the study’s findings.
The study sampled 381 households and 4 health practitioners at four facilities within the county.
The mothers in the sampled households were given questionnaires, while the
health care officials, who were the key informants, were interviewed. The response rate is
presented in Table 4.1.
Table 4.1: Response Return Rate
Of the 381 sampled household mothers, 342 returned fully completed questionnaires,
which accounted for an 89.8% return rate.
All 4 key informants provided complete interviews, accounting for a 100.0% return rate.
The study obtained an overall response return rate of 89.9%. [Mugenda 2003]
acknowledges that a response rate of at least 70% is sufficient for social science analysis.
Hence the response rate was found to be sufficient for the study.
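The return rates above follow directly from the counts reported in this section, as the short check below shows. The variable names are chosen for the example only.

```python
# Counts taken from the text above.
sampled_mothers = 381
returned_questionnaires = 342
sampled_informants = 4
completed_interviews = 4

mother_rate = returned_questionnaires / sampled_mothers
informant_rate = completed_interviews / sampled_informants
overall_rate = (returned_questionnaires + completed_interviews) / (
    sampled_mothers + sampled_informants
)

print(f"Mothers: {mother_rate:.1%}")
print(f"Key informants: {informant_rate:.1%}")
print(f"Overall: {overall_rate:.1%}")
```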
4.2. Demographic Information.
The background and demographic information of the respondents was captured in the study in
order to understand their profile in relation to the study objectives. This information is
presented below.
4.2.1 Age of the respondents
The current age and age at first birth of the respondents were captured. This gave the
researcher an understanding of the child bearing stage that the
participants were at. Findings on age are presented in Table 1.2.
Age group Frequency Percentage
The study found that the majority of the mothers who participated in the study (51.75%) were
aged above 40 years.
For age at first birth, the majority of mothers (73.09%) who participated in the study
had their first birth when old enough (20 years and above).
The study also found that there were cases of teenage pregnancies amongst participants
(1.46%).
In this study, the variable under investigation was infant mortality, which was treated as the
dependent variable. Infant mortality was measured by establishing whether there were infants
who died at or before age 1, the gender of the dead infants and their age at death.
4.2.2.1 Dead Infants at age ≤ 1 year
The study asked the participating women whether they had, in their history,
an infant who died within one year of age. The findings are presented in the table below.
Table 4.2: Any dead infant aged ≤ 1 Year?
From the findings, the majority of the women participants (80.7%) reported having lost an infant
aged less than 1 year, and only 19.3% reported that they had never had such an experience.
4.2.2.2 Infant Mortality by Gender
The study sought to find out how infant mortality varies according to gender. Thus, information
on the gender of dead infants was captured, analysed and presented as shown in the table
below.
Table 4.3: Gender
Gender    Total    Percentage
Male      175      51.17%
Female    167      48.83%
Total     342      100%
The study found that there were slightly more cases of infant deaths among male infants
(51.17%) than among female infants (48.83%).
The study also sought to establish infant mortality at various ages and stages of infancy. Thus,
information on age at infant death was captured, analysed and presented in the table below.
Table 4.4: Age at Infant Death
Age at Infant Death    Frequency    Percentage
7 - 12 months          76           22.22%
Totals.                342          100%
The study found that the largest group of mothers (153) lost their infant within 1-6 months. This
suggests that age is also a predictor variable for infant mortality.
4.3. Factors of Determinants of Infant Mortality
The study sought to find out the key determinants contributing to infant mortality rates.
It was found that infant mortality was influenced by four main groups of factors, namely;
• Social factors
• Economic factors
• Environmental factors
• Cultural factors.
Key determinants under each factor were examined and the results are presented in the tables below.
4.3.1 Social factors influencing infant mortality rates.
Place of residence.    Urban.    74     55     21.580a    .001
Totals                           342    276
Under maternal education, the majority of respondent mothers had primary education. Cases of
infant mortality were highest amongst mothers with a primary level of education,
followed by those with a secondary level and lastly those with a tertiary level.
For household size, the majority of mothers came from households of more than 6 people. Cases of
infant deaths were high amongst families of more than 6 persons.
For place of residence, the majority of participants (268) came from rural areas and 74 came from
urban areas. The number of infant deaths was higher amongst mothers who came
from rural areas (221) than amongst those living in urban areas (55).
A chi-square test of independence was used, and all determinants
investigated under social factors were found to have a significant influence on infant mortality (p-value<0.05).
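The chi-square test of independence used here compares the observed counts in a contingency table with the counts expected if the determinant and infant mortality were unrelated. The sketch below shows the calculation in plain Python. The counts in it are hypothetical and are not the study's data, and the function name is an assumption for the example.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # count expected under independence
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical 2x2 table: rows are two categories of a determinant,
# columns are (infant died, infant survived).
table = [[30, 10], [20, 40]]
statistic, dof = chi_square_statistic(table)

CRITICAL_VALUE_DF1 = 3.841  # chi-square critical value for 1 degree of freedom at the 0.05 level
print(round(statistic, 3), dof, statistic > CRITICAL_VALUE_DF1)
```

A statistic above the critical value corresponds to a p-value below 0.05, which is the criterion applied to each determinant in this chapter.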
Participant’s type of marriage    Monogamous marriage    260    205
Place of medical attention.       Hospital               134    98
Totals                                                   342    276    66.239a    .001
The study found that, under participant’s type of marriage, the majority of mothers came from
monogamous marriages.
Under place of medical attention, the majority of mothers seek medical attention at home
(from herbalists).
The number of infant deaths was found to be high amongst mothers from polygamous marriages.
It was also found to be high amongst mothers who seek medical attention from herbalists.
A chi-square test of independence was used, and the key determinants under cultural factors
were found to be statistically significant for infant mortality rates (p-value<0.05).
Environmental Determinants.    Category.    Freq    No. of infant deaths    Chi-squared value    Significance value.
The study found that the majority of mothers use a river as the source of their drinking water. Also, the majority
of mothers use pit latrines as their toilet facility. Under type of cooking energy, the majority of
mothers were found to use firewood.
For the cases of infant deaths, it was found that mothers who use a river as their source of
drinking water experienced a large number of infant deaths. Also, infant mortality was found to
be high amongst mothers who use pit latrines. Under type of cooking energy, infant mortality was
found to be high amongst mothers who use firewood.
All key determinants under environmental factors had a significant impact on infant mortality (p-
value<0.05).
The study found that the majority of mothers were not employed. Also, the majority of mothers had a
monthly income of below 15000/=.
For the cases of infant deaths, unemployed mothers had the highest number of infant deaths,
followed by those who own businesses. Also, the number of infant deaths was found to be high
amongst mothers with a monthly income of below 15000/=.
All key determinants under economic factors had a significant influence on infant mortality rates (p-
value<0.05).
Predicting infant mortality rates.
Infant mortality was predicted using four machine learning models, and their predictions were
summarized in confusion matrices. The predicted results on the test data were compared with the
actual results and the following was obtained.
                   Gradient boosting.    Logistic Regression.    K-Nearest Neighbor.    Random forest.
Actual             Alive  Dead  Tot      Alive  Dead  Tot        Alive  Dead  Tot       Alive  Dead  Tot
Alive              37     8     45       30     15    45         26     19    45        43     2     45
Dead               5      18    23       8      15    23         11     12    23        7      16    23
Tot                42     26    68       38     30    68         37     31    68        50     18    68
Model performance
Accuracy           80.9%                 66.2%                   55.9%                  86.8%
Specificity        78.3%                 65.2%                   52.2%                  69.6%
Precision (PPV)    88.1%                 78.9%                   70.3%                  86%
Sensitivity        82.2%                 66.7%                   57.8%                  95.6%
AUC-ROC curve      75%                   50%                     55%                    67%
The table above gives the results of the four machine learning models, namely; logistic regression,
gradient boosting, random forest and k-Nearest Neighbor.
Infant mortality prediction accuracy was highest for random forest (86.8%), followed
by the gradient boosting model (80.9%), then the logistic regression model (66.2%) and lastly
k-nearest neighbor (55.9%).
The gradient boosting model had the highest specificity (78.3%), followed by random forest (69.6%), then
the logistic regression model (65.2%) and lastly k-NN (52.2%).
Sensitivity was highest for the random forest model (95.6%), followed by the gradient
boosting model (82.2%), then logistic regression (66.7%) and lastly k-nearest neighbor
(57.8%).
Precision was highest for the gradient boosting model (88.1%), followed by random forest
(86%), then logistic regression (78.9%) and lastly K-Nearest Neighbor (70.3%).
The gradient boosting model had the highest Area Under the ROC curve (75%), followed by random
forest (67%), then k-NN (55%) and lastly logistic regression (50%).
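The accuracy, sensitivity, specificity and precision figures above can be recomputed from the four confusion matrices. The sketch below does this in plain Python; the function and variable names are assumptions for the example. Note that the reported values are reproduced when “Alive” is treated as the positive class, so that is the convention assumed in the code.

```python
# Confusion matrices as reported in the table above: rows are actual classes
# and columns are predicted classes, both in the order (Alive, Dead).
matrices = {
    "Gradient boosting": [[37, 8], [5, 18]],
    "Logistic regression": [[30, 15], [8, 15]],
    "K-nearest neighbor": [[26, 19], [11, 12]],
    "Random forest": [[43, 2], [7, 16]],
}


def metrics(matrix):
    """Compute the four metrics, taking 'Alive' as the positive class (assumed)."""
    (tp, fn), (fp, tn) = matrix
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
    }


for name, matrix in matrices.items():
    summary = {key: f"{100 * value:.1f}%" for key, value in metrics(matrix).items()}
    print(name, summary)
```

The AUC values are not recomputed here because they depend on the predicted probabilities of each model, which are not reported.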
Receiver Operating Characteristics (ROC) Curve.
The diagram below shows the Area Under the Curve (AUC) for the four machine learning models.
From the curves, the gradient boosting model had the highest AUC value, hence it was the best performing
model in differentiating between dead cases and alive cases.
CHAPTER FIVE
The results of this study found that the random forest model had a higher prediction accuracy
(86.8%) than the other models, hence it was the best performing model in terms of accuracy.
The results of the study also found that all the determinants of infant mortality rates
(maternal education, place of residence, source of drinking water, place of medical attention,
gender of infant) had a significant influence on infant mortality rates (p-value<0.05).
From the findings of the best performing model, male infants showed greater importance in
predicting infant mortality than female infants. [Monzel, 2018] indicated that male infants are
at high risk of dying in the first month because of high vulnerability to infectious diseases. This
might be because female infants have a biological advantage over boys against many causes of infant
mortality and are hence less vulnerable to infectious diseases.
5.2 CONCLUSION.
The study obtained an overall response return rate of 89.9%, which was found to be adequate
for the study. The majority of the respondents were aged above 40 years and most of the participants
had their first births when they were at least 19 years old. Further, the majority of the participants
reported having lost an infant, with most of the infant deaths occurring within the first six
months after delivery.
The study also employed four supervised machine learning algorithms to predict infant
mortality rates and identified important risk factors that will help in policy making.
The model revealed some important predictors of infant mortality rates; therefore, the model
can be used for policy making decisions regarding the survival of infants in Narok County. Factors
such as family size, age of the mother at her first birth and gender of the child play an important
role in childhood survival chances within Narok County.
The study found that all the determinants had a significant influence on infant mortality
rates with p < .05.
5.3. Recommendations
The study suggests that future work should use regression methods to investigate how
these factors affect infant mortality quantitatively.
REFERENCES
Abdalla, M. M., Oliveira, L. G. L., Azevedo, C. E. F., & Gonzalez, R. K. (2018). Quality in
qualitative organizational research: Types of triangulations as a methodological
alternative. Administração: ensino e pesquisa, 19(1), 66-98.
Adewusi, A. O., & Nwokocha, E. E. (2018). Maternal education and child mortality in Nigeria.
The Nigerian Journal of Sociology and Anthropology, 16(1), 112.
Aldirawi, A., El-Khateeb, A., Mustafa, A. A., & Abuzerr, S. (2019). Mothers' Knowledge of
Health Caring for Premature Infants after Discharge from Neonatal Intensive Care
Units in the Gaza Strip, Palestine. Open Journal of Pediatrics, 9(03), 239.
Anele, C. R., Hirakata, V. N., Goldani, M. Z., & da Silva, C. H. (2020). The influence of the
Municipal Human Development Index compared to maternal education on infant
mortality: a retrospective cohort study in the extreme south of Brazil.
Anyon, Y., Bender, K., Kennedy, H., & Dechants, J. (2018). A systematic review of youth
participatory action research (YPAR) in the United States: Methodologies, youth
outcomes, and future directions. Health Education & Behavior, 45(6), 865-878.
Baker, J. L., & Gadgil, G. U. (Eds.). (2017). East Asia and Pacific cities: Expanding
opportunities for the urban poor. The World Bank.
Banda, L. G. (2020). Limitations of the Use of Modernization Theory in Formulating and
Implementing Development Policies in Africa–The case of Tanzania and Malawi.
Journal of Development Economics, Forthcoming.
Bassani, D. G., Jha, P., Dhingra, N., & Kumar, R. (2010). Child mortality from solid-fuel use in
India: a nationally-representative case-control study. BMC Public Health, 10(1), 1-9.
Bednarczuk, N., Milner, A., & Greenough, A. (2020). The role of maternal smoking in sudden
fetal and infant death pathogenesis. Frontiers in Neurology, 11, 1256.
APPENDIX II: HOUSEHOLD QUESTIONNAIRE
Below 20 years
20 – 29 years
30 – 39 years
Below 19 years
Yes
No
If yes, state the number…………….
4. Have you ever had any case of infant mortality in your parenthood?
Yes
No
If yes, kindly complete the information required in the table below.
Sex of the child    Age at death (in months)
SECTION C: SOCIAL FACTORS ON INFANT MORTALITY
5. How many people stay within your house................?
6. What is your highest academic qualification?
Primary (KCPE)
Secondary (KCSE)
College/University (Certificate/Bachelor)
SECTION D: ECONOMIC FACTORS ON INFANT MORTALITY
7. What do you do for a living?
Farmer
Not employed
Business
8. What is your approximate monthly income?
Below 10 000
Above 30 000
SECTION E: CULTURAL FACTORS ON INFANT MORTALITY.
9. In most occasions where do mothers conduct their child delivery from?
Home
Village attendant
Hospital
10. In what kind of marriage are you?
Monogamy
Polygamy
11. What kind of medication do you seek when your children are sick?
Herbalist
Hospital
SECTION F: ENVIRONMENTAL FACTORS ON INFANT MORTALITY
14. What is the main source of the drinking water in this area?
River/ Stream
Borehole
Tap
15. What type of toilet facilities do you mainly have around?
Pit latrine
Bush
16. What type of cooking energy do you use at home?
Firewood
Charcoal