Kirkpatrick Model
Introduction
One of the best-known and most widely used models for evaluating training and
development programs is Kirkpatrick’s (1959) four-level evaluation model (the four levels
being reaction, learning, behavioral and results). Between November 1959 and February
1960, Kirkpatrick published a series of four articles titled “Techniques for evaluation of
training programs” in the Journal of the American Society of Training Directors. Developed
for use within business organizations (Bates, 2004), the model has received much attention
from researchers and organizations (Lien et al., 2007), as it provides a sound foundation for
the examination of the effect of training on organizations (Watkins et al., 1998). Over the
years, it has evolved to become one of the most widely accepted and influential models for
the assessment of training and development programs in a range of settings (Phillips, 1991;
Bassi and Cheney, 1997; Tamkin et al., 2002). It has also formed the basis of other
approaches to the assessment of training and development; as Bates (2004, p. 342) states,
“[t]he Kirkpatrick model has been the seed from which a number of other evaluation models have germinated.”
Received 17 December 2020; Revised 9 March 2021, 22 June 2021 and 21 July 2021; Accepted 28 July 2021
INDUSTRIAL AND COMMERCIAL TRAINING, VOL. 54 NO. 1 2022, pp. 36-63, © Emerald Publishing Limited, ISSN 0019-7858, DOI 10.1108/ICT-12-2020-0115
As 60 years have now passed since Kirkpatrick created his model, this bibliometric study
intends to reconsider the model, its utility, its effectiveness in meeting the need to evaluate
training activities, its importance in the field measured by the growth in studies on the model
and its applications in various settings and contexts. Furthermore, a bibliometric analysis of
business and training journals will provide better insight for scholars in the field. The paper
also examines the criticisms leveled at the model.
Bibliometrics, which uses scientometric methods, is defined as “the application of
mathematical and statistical methods to books and other means of communication, which
are mainly in charge of the management of libraries and documentation centers”
(Repanovici, 2010, p. 2). Typically used within the field of library and information science
(Repanovici, 2010), a bibliometric review is a type of domain-based review and a type of
systematic literature review (Paul and Criado, 2020). It involves analysis of a large number
of different types of publications using statistical tools to explore specific research areas for
current trends and for citations and/or co-citations; the scope of the analysis can be
restricted by several criteria, such as spatial factors (e.g. country), temporal factors (e.g.
year), author, journal, method, theory and research problem (Paul and Criado, 2020).
The process of defining questions in a mapping study is ill-defined, and guidance is needed
to define the research questions (Jia et al., 2016). To identify a starting point and the key
elements of the research to achieve the study’s aim, this study adopted a “5Ws + 1H” model
(why, when, who, where, what and how), commonly used in journalism to discover the most
important aspects of a story (Kipling, 1912). In this case, the model was used to explore
how the Kirkpatrick model has been applied. The 5Ws + 1H model uncovers essential
information that needs to be considered for any story to be told, namely, motivation, time,
actor, location, content and causality (Jia et al., 2016). A helpful starting point for these
questions is shown below (Tattersall, 2015a, 2015b; Jia et al., 2016; Calero et al.,
2020):
■ Why: why was this research conducted? Why was there a need for it?
■ When: when was this study conducted? When did the project start and when did it finish?
■ Who: who conducted the research?
■ Where: where were the results published?
■ What: what were the results of this research?
■ How: how was this study conducted?
Various studies have performed bibliometric analysis using a 5Ws approach for information-
gathering or problem-solving. For example, Calero et al. (2020) conducted a bibliometric
analysis of green and sustainable software using a 5Ws or 5Ws + 1H approach (Galindo
et al., 2019; Bibi et al., 2020) and Rahimi and Rosman (2020) used the 5Ws approach to
investigate enterprise content management.
For the purpose of this study, the when and how questions are related, showing the time of
the study and the mechanism adopted; therefore, when and how were merged. This is a
valid approach that has been adopted by other authors, such as Hart (1996), who reports
that the sixth item, how, can be rephrased or merged with any of the five W questions. The
approach allows researchers to explore key subjects while retaining control of what they say
and how they say it. The use of this approach may also help researchers to present their
findings to an audience without much effort (Tattersall, 2015a, 2015b).
Literature review
Related work
The goal of a systematic literature review (SLR) is to identify and evaluate relevant research to
answer a specific research question (Papamitsiou and Economides, 2014). Several SLRs
Kirkpatrick model
Between November 1959 and February 1960, Kirkpatrick published four articles titled
“Techniques for evaluation of training programs.” The purpose of Kirkpatrick’s model was to
provide an efficient means and systematic way for managers to evaluate training outcomes
among employees and in organizational systems (Cahapay, 2021). The Kirkpatrick model
comprises the following four levels: Level 1 – Reaction, which assesses participants’
satisfaction and interest in the training; Level 2 – Learning, which assesses the extent of
skills and knowledge gained; Level 3 – Behavioral, which measures trainees’ ability to apply
learned knowledge and skills in the workplace; and Level 4 – Results, which measures the
effect of training on the organization.
The Kirkpatrick model has inspired the development of several other assessment models
(Bates, 2004; Holton, 1996; Kaufman et al., 1995). Indeed, most of the evaluation models
found in the literature are based on the Kirkpatrick model (Holton, 1996; Nickols, 2005; Reio
et al., 2017).
However, the ascending order of the value of its levels and the assumptions surrounding
causality in the Kirkpatrick model have been criticized (Alliger and Janak, 1989; Bernthal,
1995; Clement, 1982). According to Giangreco et al. (2010), the implication of the first
criticism is that behavioral change has higher significance than positive reactions and a
positive outcome at Level 4 is the ultimate goal of every training program. Thus, the model is
hierarchical in nature, with its four levels arranged in ascending order (Reio et al., 2017).
However, in practice, an outcome is not necessary at each level for every training program; for
example, programs designed to instill company pride in employees can only be expected to
have an output (that is, an impact) at Level 1 (Reaction), while those involving learning about
organizational history and philosophy may only be evaluated at Level 2 (Learning) (Alliger
and Janak, 1989). Therefore, studies and empirical results do not furnish sufficient proof to
support the assumption that every subsequent level presents more valuable information
than the previous one (Bates, 2004).
The second criticism concerns the assumed causal chain between levels, with a positive
reaction required for learning to occur and learning being essential for transfer (Holton, 1996).
However, Levels 1 and 2 can be evaluated simultaneously (Alliger and Janak, 1989),
suggesting that one of the levels does not necessarily have an impact on the other.
Furthermore, learning can often be difficult and methods of ensuring effective learning may
be uncomfortable for participants (Knowles, 1980; Rodin and Rodin, 1973). However,
research has found that an unpleasant experience in training can somewhat encourage
learning (Rodin and Rodin, 1973), which suggests the possibility of a negative correlation
between Levels 1 and 2. Moreover, a fun lecture does not necessarily cause more learning
(Kaplan and Pascoe, 1977), as good reactions do not imply good learning (Rowold, 2007).
Therefore, it could be argued that there is a relationship between the levels, but that this
relationship is very complex. The issue of causality, therefore, remains controversial within
the literature.
Formulating 5Ws + 1H
In this study, the researchers developed their own specific research questions, each one
corresponding to the appropriate “W” or “H.” Here, we present the research questions and
then answer the “why” question. The answers to the remaining questions on research into
the use of Kirkpatrick’s model are provided in the methodology and results sections.
Research questions
Why
RQ1. Why is the field of research relevant?
When and how
RQ2. When was this study conducted? How?
Who
RQ3. Who are the main authors in this area?
RQ4. Which are the main institutions in this area?
Methodology (“when”)
RQ2. When was this study conducted? How?
There are many multidisciplinary bibliographic databases, including Scopus, Web of
Science (WoS), Google Scholar (Balstad and Berg, 2020; Wei, 2020) and Dimensions (van
den Besselaar and Sandström, 2020). In this study, the Scopus, Dimensions and WoS
databases were selected as sources of publications related to the study. Google Scholar
was omitted for several reasons. First, it does not always produce consistent search results,
as its indexing methods are not as strict as those of other databases, such as Scopus or
WoS (Shareefa and Moosa, 2020). Second, its search results cannot be downloaded, unlike
those from other databases, for example, WoS or Dimensions. Third, there is no application
programming interface (API) that can be used for search purposes (Moral-Muñoz et al.,
2020). Therefore, three databases were used to retrieve bibliometric data sets.
The research methodology is presented in Figure 1. The researchers set different
parameters for the search. Initially, it covered all languages to retrieve all publications
focusing on the search area. Similarly, in the first instance, the study focused on all subject
domains but later narrowed this to social sciences; however, a bibliometric analysis of both
areas (general and social sciences) was performed. The date range was left at the default,
imposing no limits on publication date. The search fields were set to title, abstract and
keywords for all the indexed databases.
The choice of keywords to be used in the search was based on different considerations to
obtain the highest degree of accuracy in search results. The study focuses on the
Kirkpatrick model and its use as a tool for assessment or evaluation. As the Kirkpatrick
model has been called “a model, system, framework, taxonomy, methodology, typography
and a vocabulary” (Holton, 1996, p. 50), all these terms were included within the search
string, as shown below.
An electronic search of the selected databases (Scopus, Dimensions and WoS) was
performed on July 14, 2020, using the keywords in Figure 2: String 1. The records from all
the databases can be seen in Figure 3, which shows that the highest number of records
was retrieved from the Scopus database. The chronological order of scientific publications
within these databases can be seen in Figure 4. Vílchez-Román et al. (2020) give three
reasons for the superiority of Scopus over other databases: Scopus holds twice
as many indexed journals as other databases (Shareefa and Moosa, 2020), as seen in
Figure 3; it indexes non-English sources; and previous bibliometric analysis
studies support the superiority of Scopus over other databases.
Several tools were used to analyze or visualize the data set retrieved from the Scopus
database. Two bibliometric programs were used as follows: Bibliometrix (Version 3.0) and
VOSviewer. Bibliometrix is an open-source program in the R language that was developed
to conduct a comprehensive quantitative science mapping analysis (Aria and Cuccurullo,
2017). It uses metadata from indexing databases, such as Scopus, WoS, Dimensions and
PubMed, to survey publications. Bibliometrix helps to calculate, evaluate and rank co-
citation, co-authorship, co-occurrence and other measures on metadata. Its results
include different types of outputs, such as keyword co-word analysis, abstracts, scientific
publications by country, publication sources, citations and network collaborations in
terms of countries and authors. It uses various types of algorithms to conduct these data-
mining techniques (Moral-Muñoz et al., 2020).
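To make the co-occurrence measures these tools report concrete, the following Python sketch counts author-keyword co-occurrences across a few toy publication records; the records and keywords are invented for illustration and are not taken from the study’s data set:

```python
from collections import Counter
from itertools import combinations

# Toy stand-ins for the author-keyword field of indexed publication records
records = [
    {"kirkpatrick model", "training evaluation", "medical education"},
    {"kirkpatrick model", "training evaluation", "simulation"},
    {"kirkpatrick model", "medical education", "curriculum"},
]

# Count every unordered keyword pair that appears together in one record
co_occurrence = Counter()
for keywords in records:
    for pair in combinations(sorted(keywords), 2):
        co_occurrence[pair] += 1

# Pairs with high counts form the links that VOSviewer-style maps cluster
top_pair, top_count = co_occurrence.most_common(1)[0]
```

Bibliometrix and VOSviewer compute this kind of link strength (along with co-citation and co-authorship counts) over the full metadata set before applying their clustering algorithms.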
Figure 4 Number of publications (1972–2020) for the nine most commonly used models
VOSviewer is a software tool that creates maps based on network data and visualizes and
explores these data (Eck and Waltman, 2020). It uses metadata from indexing databases,
such as Scopus, WoS and others, to show co-citation, bibliographic coupling, co-
authorship and co-occurrence using clustering methods; it also has the advantage of
visualization over other bibliometric tools (Moral-Muñoz et al., 2020). Other tools, such as MS
Excel and Google Sheets, were also used for data visualization.
publications and journal publications. The majority of publications (more than 70%) are
journal articles, as Table 1 shows.
The growth in the number of publications in all fields, including social sciences, is shown in
Figure 5. This indicates the increased interest of researchers and publishers in the
Kirkpatrick model for evaluating training programs. For social sciences, 1998 appears to be
the starting point for such research.
The annual growth rate (AGR) expresses the change in the number of publications relative
to the previous year. It is calculated from the number of publications in one year and the
number of publications in the previous year, as shown in equation (1):

AGR = ((N.publications(year) - N.publications(previous year)) / N.publications(previous year)) × 100   (1)
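As a quick check on equation (1), the sketch below computes the AGR from two pairs of yearly all-fields counts reported in this study (53 publications in 2018 versus 68 in 2019, and 23 in 2012 versus 18 in 2013):

```python
def annual_growth_rate(current: int, previous: int) -> float:
    """Equation (1): percentage change in publications versus the previous year."""
    return (current - previous) / previous * 100

# 2018 -> 2019, all fields: growth of roughly 28%
growth_2019 = annual_growth_rate(68, 53)

# 2012 -> 2013, all fields: the 21.7% decrease reported for 2013
decline_2013 = annual_growth_rate(18, 23)
```

Note that the decreases quoted in the text (21.7% in 2013, and 27.3% and 46.7% in social sciences) are the magnitudes of negative AGR values.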
Table 2 shows the AGR of our data in all fields and in social sciences for a nine-year period
from 2011 to 2019 (we have removed 2020 since the year had not finished when the search
was conducted). As can be seen, for all fields, the positive values are maintained except in
2013 when there was a decrease (21.7%). Similarly, in the social sciences field, the positive
values are maintained except for two years in which they decreased, namely, 2013 (27.3%)
and 2016 (46.7%). This indicates that the research on Kirkpatrick’s model or its application
is an active area and is growing in all fields, including social sciences.
To estimate the future trend in this field and predict the number of publications in
coming years, the least squares method was used, as shown in equation (2):

Y = (ΣY / N_Years) + (Σ(XY) / ΣX²) × X   (2)
A straight line was calculated using the data for the period 2010–2019, where Y is the
number of publications, N is the number of years and X is a centered time code, chosen so
that ΣX = 0. As per the analysis in Table 3, the estimate indicates that we would expect five
additional publications to be produced each year compared with the previous year in all
fields, including social sciences.
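A short sketch of equation (2), fitted on the 2010–2019 all-fields counts from Table 3 (X is a centered time code with ΣX = 0, stepping by 2 per year), reproduces both the 2021 estimate of 64 publications and the roughly five-per-year growth described above:

```python
# Publications per year in all fields, 2010-2019 (Table 3)
years = list(range(2010, 2020))
y = [17, 18, 23, 18, 22, 30, 33, 34, 53, 68]

n = len(years)
x = [2 * i - (n - 1) for i in range(n)]  # centered codes -9, -7, ..., 9; sum(x) == 0

# Equation (2): Y = (sum(Y) / N_years) + (sum(X*Y) / sum(X^2)) * X
a = sum(y) / n
b = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

def predict(year: int) -> float:
    """Extend the centered code past 2019 and evaluate the fitted line."""
    xi = 2 * (year - years[0]) - (n - 1)
    return a + b * xi

# X steps by 2 per calendar year, so the yearly growth is 2 * b (about 5 papers)
```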
Who
This section presents statistics related to the authorship of papers on the Kirkpatrick model.
Therefore, it will answer RQ3 and RQ4.
RQ3. Who are the main authors in this area?
Table 4 shows the most prolific authors in general and in social sciences specifically, who
use or discuss the Kirkpatrick model. It shows that the most prolific authors in general are
Kumar, Liu, Reeves, Sotomayor, Steinert and Triviño, each with three papers; Reeves also has
three papers in social sciences, making him the most prolific author in this field, as
Figure 6 confirms. Using VOSviewer to provide a graphical representation of the interaction
Table 3 Least squares calculation of the publication trend
                All fields                 Social sciences
Year      Y      X     XY     X²       Y      X     XY     X²
2010     17     -9   -153     81       6     -9    -54     81
2011     18     -7   -126     49       9     -7    -63     49
2012     23     -5   -115     25      11     -5    -55     25
2013     18     -3    -54      9       8     -3    -24      9
2014     22     -1    -22      1      11     -1    -11      1
2015     30      1     30      1      15      1     15      1
2016     33      3     99      9       8      3     24      9
2017     34      5    170     25      15      5     75     25
2018     53      7    371     49      15      7    105     49
2019     68      9    612     81      27      9    243     81
2020     35     11      –      –      18     11      –      –   (counts to July 2020)
Estimates: 2021 (X = 13): 64 all fields, 23 social sciences; 2022 (X = 15): 69 and 24;
2023 (X = 17): 73 and 26; 2024 (X = 19): 78 and 27; 2025 (X = 21): 83 and 29.
Table 4 Most prolific authors publishing on the Kirkpatrick model (author, number of publications)
All fields                 Social sciences
Kumar, A. 3                Reeves, S. 3
Liu, Z. 3                  Abdulghani, H. M. 2
Reeves, S. 3               Ahten, S. M. 2
Sotomayor, T. M. 3         Baker, L. 2
Steinert, Y. 3             Barr, H. 2
Triviño, X. 3              Bezuidenhout, J. 2
Abdulghani, H. M. 2        Borduas, F. 2
Ahten, S. M. 2             Catalano, G. D. 2
Figure 6 Authors with the most co-authorship interaction on publications using the
Kirkpatrick model
of authors in social sciences, we can see that there are three clusters as follows: Cluster 1
comprises Birch, Boot, Davies, Fletcher, Kitto, McFadyen and Rivera; Cluster 2 comprises
Baker, Cameron, Egan-Lee, Esdaile, Friesen and Onyura; and Cluster 3 comprises Reeves
(with Barr, Freeth, Hammick and Koppel). We can conclude that Cluster 1 is the most
interrelated.
Figure 6 provides a graphical representation of co-authorship interaction on publications
using this model.
Table 5 shows the percentages of single-authored publications and multi-authored
publications by year over a 20-year period from 2000 to July 2020. The vast majority of
publications are multi-authored. Single-authored publications account for only 11.2% of the
total, the remaining 88.8% being multi-authored; this indicates a high degree of
collaboration among authors in this area.
RQ4. Which are the main institutions in this area?
An analysis of the most prolific institutions is shown in Table 6. Although papers on studies
using the Kirkpatrick model originate in several institutions in various countries, the three
most prolific institutions in publishing in all fields are also the most prolific institutions in
social sciences, which are the University of Toronto, the University of British Columbia and
the University of Alberta.
Table 6 Most prolific institutions with regard to publishing on the Kirkpatrick model
Where
In this section, we will present the results of the analysis of publication types and countries.
RQ5. Which countries/regions produce the majority of publications on the Kirkpatrick
model?
Figures 7 and 8 show the distribution of scientific publications by country in all fields and in
social sciences, respectively.
As can be seen from both figures, the United States (USA) is the most prolific country and
the highest contributor to the body of knowledge in this area, followed by the United
Kingdom (UK), Canada and Australia. The Kirkpatrick model is well-established in the USA
and the abundance of publishing in the USA and the other countries listed above may be
attributed to the scientific progress in those countries, as well as to cooperation between
authors. Both figures reveal that the publication of scientific articles on the application of the
Kirkpatrick model in the evaluation of training is still in its initial stage of growth in the Middle
East and Africa. It can be argued that the reasons some parts of the world still do not apply
this model may be cultural (e.g. research on the model is published in English, whereas
researchers in these countries publish in their own languages) or educational (e.g. there may be a
difference between developing and developed countries in terms of education and
experience). Furthermore, knowledge level and field maturity can make a significant
difference between region-specific contexts; for example, in developing countries, the
focus may be on fundamental issues to be solved, whereas, in developed countries, the
focus is usually on developing, promoting and enhancing systems in a specific context,
such as enhancing the evaluation of training in education, management, medicine or other
fields.
The analysis was taken a step further to determine the clustering and co-occurrences of
countries. This was done using VOSviewer, where parameters were set to “analysis type:
co-authorship” and “minimum number of country occurrences: five.” Regarding
collaboration among countries, results in both Figures 9 and 10 indicate that there are three
groups of countries. The USA, Australia and The Netherlands are in the first group; the UK,
Iran and India are in the second group; and Canada and South Africa are in the third group.
RQ6. Which journals and conferences are the most effective (in terms of the number of
publications) in sharing information about the Kirkpatrick model?
The analysis of the extracted data set shows that, of the many types of publications, articles
and conference papers constitute the majority of the body of knowledge on the
Kirkpatrick model.
Table 7 shows the top 10 journals in all fields and in social sciences. According to Table 7,
the top journals, in general, were in the social science domain, with the evaluation of
Table 7 Most effective journals for sharing information about the Kirkpatrick model
What
In this section, we will present the results of the analysis of authors’ keywords and search
domains.
RQ7. What are the most common keywords in publications?
Table 8 Most effective conferences for sharing information about the Kirkpatrick model
The analysis was taken a step further to determine the clustering and co-occurrence of
authors’ keywords using VOSviewer, where parameters were set to “analysis type: co-
occurrence” and “minimum number of keyword occurrences: seven.” The retrieved
keywords in social sciences totaled 29, as seen in Figure 13, and were divided into seven
clusters. Table 9 shows four clusters. Figure 14 shows that the retrieved keywords in all
fields totaled 32, with the recent keywords being “teaching,” “students,” “quality
improvement,” “patient safety,” “continuing education,” “simulations,” “medical,” “dental
education” and “program evaluation.” This showed that the focus tends to be on higher
education, especially in medical schools; this is also shown in Figure 15, which presents a
timeline of trending topics in all fields.
As expected, Figures 13 and 14 indicate that research into the Kirkpatrick model focuses
on the field of social sciences, especially on assessments of curriculum, students and
education. According to the clustering table, SLR studies related to the Kirkpatrick model
occurred only in the field of medical education. Figure 15 affirms these findings by showing
that the most common topics in all fields in 2016 were evaluation, training, curriculum and
Sensitivity analysis
A sensitivity analysis of the publications was conducted to ensure that the search string had
been built correctly and to minimize the risk to sample accuracy. To calculate the
required number of random papers to be manually examined, we used Cochran’s sample
size formula (Cochran, 1977), as shown in equation (3):

n = (Z²p(1 - p) / e²) / (1 + Z²p(1 - p) / (e²N))   (3)
In the above formula, N is the total number of publications obtained from the Scopus
database using the search string (416). Z represents the parameter matching the
confidence interval. It is 95%, which implies that Z = 1.96. The population proportion
expected to have the characteristic of interest is 50%, which implies
that p = 0.5. Furthermore, e stands for the desired margin of error; it is set to 8%, which
implies that e = 0.08.
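Equation (3) and the parameter values above can be verified with a short sketch, which yields the 110 papers sampled in this study:

```python
def cochran_sample_size(N: int, Z: float = 1.96, p: float = 0.5, e: float = 0.08) -> float:
    """Equation (3): Cochran's sample size with finite-population correction."""
    n0 = Z ** 2 * p * (1 - p) / e ** 2  # unadjusted sample size (about 150)
    return n0 / (1 + n0 / N)

# N = 416 publications retrieved from Scopus; rounds to the 110 papers examined
sample = cochran_sample_size(416)
```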
Of the 110 randomly selected papers, 107 (97%) were accurate in that they were about the
Kirkpatrick model; only 3% were inaccurate, namely, there was no relationship between the
papers and the search string. This error rate is well within the desired margin of error
(e = 8%), indicating an adequate level of quality for the obtained data to be
analyzed.
Conclusion
This paper has discussed the emergence of the Kirkpatrick model for the evaluation of
training processes and the growth in its applications. It has presented the results of a
bibliometric analysis of publications on the Kirkpatrick model from 1959/1960 to July 2020
retrieved from Scopus. The search identified 416 papers in all fields, including 169
specifically in social sciences. The paper has answered the 5Ws þ 1H (why, when, who,
where, what and how) to explore key topics in this area. It concludes that the interest in
publishing on the evaluation of training started after 2000; however, Kirkpatrick’s model
remains the most commonly used model for evaluating training. The research using
Kirkpatrick’s model is an active and growing area. Moreover, our estimates indicate that five
additional publications are likely to be produced each year compared with the previous
year. The vast majority of publications in this area are multi-authored, with 88.8% of papers
demonstrating this trend.
The majority of the body of knowledge in this area comprises research papers and the most
effective journals for this type of paper are those in the medical field. Currently, there are no
effective conferences for publishing studies on the Kirkpatrick model.
Some parts of the world, such as the USA, the UK and Canada, are very active in research
in this area, whereas other parts of the world, including the Middle East, Africa and Russia,
are still not applying this model.
The most common subjects for the application of the Kirkpatrick model are social sciences,
medicine and computer science. This demonstrates the popularity and applicability of the
Kirkpatrick model in various fields, given that it was first developed for use within business
organizations.
Note
1. shorturl.at/hrxHQ
References
Agarwal, N., Pande, N. and Ahuja, V. (2019), “Expanding the Kirkpatrick evaluation model-towards more
efficient training in the IT sector”, Human Performance Technology, pp. 1092-1109.
Antos, M. and Bruening, T. (2006), “A model hypothesizing the effect of leadership style on the transfer of
training”, Journal of Leadership Education, Vol. 5 No. 3, pp. 31-52.
Aria, M. and Cuccurullo, C. (2017), “Bibliometrix: an R-tool for comprehensive science mapping
analysis”, Journal of Informetrics, Vol. 11 No. 4, pp. 959-975.
Balstad, M.T. and Berg, T. (2020), “A long-term bibliometric analysis of journals influencing management
accounting and control research”, Journal of Management Control, Springer Berlin Heidelberg, Vol. 30
No. 4, pp. 357-380.
Bassi, L.J. and Cheney, S. (1997), “Benchmarking the best”, Training &
Development, Vol. 51 No. 11, p. 60.
Bates, R. (2004), “A critical analysis of evaluation practice: the Kirkpatrick model and the principle of
beneficence”, Evaluation and Program Planning, Vol. 27 No. 3, pp. 341-347.
Beech, B. and Leather, P. (2006), “Workplace violence in the health care sector: a review of staff training
and integration of training evaluation models”, Aggression and Violent Behavior, Vol. 11 No. 1, pp. 27-43.
Bernardino, G. and Curado, C. (2020), “Training evaluation: a configurational analysis of success and failure of
trainers and trainees”, European Journal of Training and Development, Vol. 44 Nos 4/5, pp. 531-546.
Bernthal, P.R. (1995), “Evaluation that goes the distance”, Training & Development, Vol. 49 No. 9,
pp. 41-46.
Bibi, S., Zozas, I., Ampatzoglou, A., Sarigiannidis, P.G., Kalampokis, G. and Stamelos, I. (2020),
“Crowdsourcing in software development: empirical support for configuring contests”, IEEE Access,
IEEE, Vol. 8, pp. 58094-58117.
Bomberger, D.W. (2003), Evaluation of Training in Human Service Organizations: A Qualitative Case
Study, The PA State University.
Brauckmann, S. and Pashiardis, P. (2011), “Contextual framing for school leadership training: empirical
findings from the commonwealth project on leadership assessment and development (Co-LEAD)”,
Journal of Management Development, Vol. 31 No. 1, pp. 18-33.
Brinkerhoff, R.O. (1987), Achieving Results from Training, Jossey-Bass, San Francisco, CA.
Bushnell, D.S. (1990), “Input, process, output: a model for evaluating training”, Training & Development
Journal, Vol. 44 No. 3, pp. 41-43.
Cahapay, M.B. (2021), “Kirkpatrick model: its limitations as used in higher education evaluation”,
International Journal of Assessment Tools in Education, Vol. 8 No. 1, pp. 135-144.
Calero, C., Mancebo, J., Garcia, F., Moraga, M.A., Berna, J.A.G., Fernandez-Aleman, J.L. and Toval, A. (2020),
“5Ws of green and sustainable software”, Tsinghua Science and Technology, Vol. 25 No. 3, pp. 401-414.
Campbell, K., Taylor, V. and Douglas, S. (2019), “Effectiveness of online cancer education for nurses and
allied health professionals; a systematic review using Kirkpatrick evaluation framework”, Journal of
Cancer Education, Vol. 34 No. 2, pp. 339-356.
Cannon-Bowers, J.A. and Salas, E. (1997), “A framework for developing team performance measures in
training”, in Brannick, M.T., Salas, E. and Prince, C. (Eds), Series in Applied Psychology. Team
Performance Assessment and Measurement: Theory, Methods, and Applications, Lawrence Erlbaum
Associates Publishers, NJ, pp. 45-62.
Carlfjord, S., Roback, K. and Nilsen, P. (2017), “Five years’ experience of an annual course on
implementation science: an evaluation among course participants”, Implementation Science, Vol. 12
No. 1, pp. 1-8.
Chang, Y.E. (2010), “An empirical study of Kirkpatrick’s evaluation model in the hospitality industry”,
Doctoral thesis, Florida International University, 12 November, doi:
10.25148/etd.FI10120807.
Fitzpatrick, J.L., Sanders, J.R. and Worthen, B.R. (2003), Program Evaluation: Alternative Approaches
and Practical Guidelines Hardcover, 3rd ed., Pearson, London.
Frye, A.W. and Hemmer, P.A. (2012), “Program evaluation models and related theories: AMEE guide no.
67”, Medical Teacher, Vol. 34 No. 5, pp. 288-299.
Galindo, J.A., Benavides, D., Trinidad, P., Gutiérrez-Fernández, A.M. and Ruiz-Cortés, A. (2019),
“Automated analysis of feature models: quo Vadis?”, Computing, Springer Vienna, Vol. 101 No. 5,
pp. 387-433.
Garousi, V. and Mäntylä, M.V. (2016), “Citations, research topics and active countries in software
engineering: a bibliometrics study”, Computer Science Review, Elsevier Inc, Vol. 19, pp. 56-77.
Ghorbandoost, R., Zeinabadi, H., Mohammadi, Z. and Shafiabadi, M.S. (2018), “Evaluating the
effectiveness of neonatal resuscitation training course on nurses of Kowsar medical center in Qazvin
University of Medical Sciences based on Kirkpatrick model”, Annals of Tropical Medicine and Public
Health, Vol. 3.
Giangreco, A., Carugati, A. and Sebastiano, A. (2010), “Are we doing the right thing? Food for thought on
training evaluation and its context”, Personnel Review, Vol. 39 No. 2, pp. 162-177.
Hart, G. (1996), “The five W’s: an old tool for the new task of audience analysis”, Technical
Communication, Vol. 43 No. 2, pp. 139-145.
Hill, A.G., Yu, T.C., Barrow, M. and Hattie, J. (2009), “A systematic review of resident-as-teacher
programmes”, Medical Education, Vol. 43 No. 12, pp. 1129-1140.
Ho, A.D.D., Arendt, S.W., Zheng, T. and Hanisch, K.A. (2016), “Exploration of hotel managers’ training
evaluation practices and perceptions utilizing Kirkpatrick’s and Phillips’s models”, Journal of Human
Resources in Hospitality & Tourism, Taylor & Francis, Vol. 15 No. 2, pp. 184-208.
Holton, E.F. (1996), “The flawed four-level evaluation model”, Human Resource Development Quarterly,
Vol. 7 No. 1, pp. 5-21.
Holton, E.F. (2005), “Holton’s evaluation model: new evidence and construct elaborations”, Advances in
Developing Human Resources, Vol. 7 No. 1, pp. 37-54.
Jain, G., Sharma, N. and Shrivastava, A. (2021), “Enhancing training effectiveness for organizations
through blockchain-enabled training effectiveness measurement (BETEM)”, Journal of Organizational
Change Management, Vol. 34 No. 2, doi: 10.1108/JOCM-10-2020-0303.
Jia, C., Cai, Y., Yu, Y.T. and Tse, T.H. (2016), “5W+1H pattern: a perspective of systematic mapping
studies and a case study on cloud software testing”, Journal of Systems and Software, Elsevier Ltd,
Vol. 116, pp. 206-219.
Johnston, S., Coyer, F.M. and Nash, R. (2018), “Kirkpatrick’s evaluation of simulation and debriefing in
health care education: a systematic review”, Journal of Nursing Education, Vol. 57 No. 7, pp. 393-398.
Kang, J., Yang, E.B., Chang, Y.J., Choi, J.Y., Jho, H.J., Koh, S.J., Kim, W.C., Choi, E., Kim, Y. and Park, S.
(2015), “Evaluation of the national train-the-trainer program for hospice and palliative care in Korea”,
Asian Pacific Journal of Cancer Prevention, Vol. 16 No. 2, pp. 501-506.
Kaplan, R.M. and Pascoe, G.C. (1977), “Humorous lectures and humorous examples: some effects upon
comprehension and retention”, Journal of Educational Psychology, Vol. 69 No. 1, pp. 61-65.
Kaufman, R. and Keller, J.M. (1994), “Levels of evaluation: beyond Kirkpatrick”, Human Resource
Development Quarterly, Vol. 5, pp. 371-380.
Kaufman, R., Keller, J. and Watkins, R. (1995), “What works and what doesn’t: evaluation beyond
Kirkpatrick”, Performance + Instruction, Vol. 35 No. 2, pp. 8-12.
Kazan, E., Usmen, M., Desruisseaux, B., Kaya, S. and Seyoum, M. (2019), “Training effectiveness
analysis of OSHA silica and excavation standards for construction”, in Passerini, G., Garzia, F. and
Lombardi, M. (Eds), 8th International Conference on Safety and Security Engineering, SAFE 2019, WIT
Press, Italy, pp. 33-42.
Kipling, R. (1912), Just so Stories, MacMillan, London.
Knowles, S.M. (1980), The Modern Practice of Adult Education: From Pedagogy to Andragogy,
Cambridge Book Company, Cambridge.
Kraiger, K., Ford, J.K. and Salas, E. (1993), “Application of cognitive, skill-based, and affective theories of
learning outcomes to new methods of training evaluation”, Journal of Applied Psychology, Vol. 78 No. 2,
pp. 311-328.
Kirkpatrick, D.L. (1959), “Techniques for evaluating training programs”, Journal of the American Society of Training Directors, pp. 1-13.
Kumpikaitė, V. (2007), “Human resource training evaluation”, Engineering Economics, Vol. 55 No. 5,
pp. 29-36.
Lee, S.Y., Fisher, J., Wand, A.P.F., Milisen, K., Detroyer, E., Sockalingam, S., Agar, M., Hosie, E. and
Teodorczuk, A. (2020), “Developing delirium best practice: a systematic review of education
interventions for healthcare professionals working in inpatient settings”, European Geriatric Medicine,
Springer International Publishing, Vol. 11 No. 1, doi: 10.1007/s41999-019-00278-x.
Lewis, S.C., Zamith, R. and Hermida, A. (2013), “Content analysis in an era of big data: a hybrid approach
to computational and manual methods”, Journal of Broadcasting & Electronic Media, Vol. 57 No. 1,
pp. 34-52.
Li, T., Yang, Y. and Liu, Z. (2008), “An improved neural network algorithm and its application on
enterprise strategic management performance measurement based on Kirkpatrick model”, Proceedings
– 2008 2nd International Symposium on Intelligent Information Technology Application, IITA 2008, Vol. 1,
IEEE, Washington, DC, pp. 861-865.
Li, Z., Cheng, J., Zhou, T., Wang, S., Huang, S. and Wang, H. (2020), “Evaluating a nurse training
program in the emergency surgery department based on the Kirkpatrick’s model and clinical demand
during the COVID-19 pandemic”, Telemedicine and e-Health, Vol. 26 No. 8, pp. 985-991.
Liao, P.W. (2019), “Experiential learning is an effective training model to improve self-esteem”,
Humanities & Social Sciences Reviews, Vol. 7 No. 5, pp. 165-173.
Lien, B.Y.H., Yu Yuan Hung, R. and McLean, G.N. (2007), “Training evaluation based on cases of
Taiwanese benchmarked high-tech companies”, International Journal of Training and Development,
Vol. 11 No. 1, pp. 35-48.
Liu, Z., Zhang, S. and Xiong, F. (2008), “A novel FNN algorithm and its application on M&A performance
evaluation based on Kirkpatrick model”, Proceedings – 2008 International Conference on MultiMedia and
Information Technology, MMIT 2008, 30 December, IEEE, Three Gorges, China, pp. 66-69.
MacKie, D. (2007), “Evaluating the effectiveness of executive coaching: where are we now and where do
we need to be?”, Australian Psychologist, Vol. 42 No. 4, pp. 310-318.
MacLure, K. and Stewart, D. (2016), “Digital literacy knowledge and needs of pharmacy staff: a
systematic review”, Journal of Innovation in Health Informatics, Vol. 23 No. 3, pp. 560-571.
Maiti and Bidinger (2016), “A BEME systematic review of the effects of interprofessional education: BEME
guide no. 39”, Medical Teacher, Vol. 38 No. 7, pp. 656-668.
Maudsley, G. and Taylor, D. (2020), “Analysing synthesis of evidence in a systematic review in health
professions education: observations on struggling beyond Kirkpatrick”, Medical Education Online,
Vol. 25 No. 1, doi: 10.1080/10872981.2020.1731278.
Moral-Muñoz, J.A., Herrera-Viedma, E., Santisteban-Espejo, A. and Cobo, M.J. (2020), “Software tools
for conducting bibliometric analysis in science: an up-to-date review”, El Profesional de la Información,
Vol. 29 No. 1, pp. 1-20.
Moreau, K.A. (2017), “Has the new Kirkpatrick generation built a better hammer for our evaluation
toolbox?”, Medical Teacher, Vol. 39 No. 9, pp. 999-1001.
Nickols, F.W. (2005), “Why a stakeholder approach to evaluating training”, Advances in Developing
Human Resources, Vol. 7 No. 1, pp. 121-134.
Papamitsiou, Z. and Economides, A.A. (2014), “Learning analytics and educational data mining in
practice: a systemic literature review of empirical evidence”, Educational Technology and Society,
Vol. 17 No. 4, pp. 49-64.
Passmore, J. and Velez, M. (2012), “SOAP-M: a training evaluation model for HR”, Industrial and
Commercial Training, Vol. 44 No. 6, pp. 315-325.
Paul, J. and Criado, A.R. (2020), “The art of writing literature review: what do we know and what do we
need to know?”, International Business Review, Vol. 29 No. 4, pp. 1-7.
Pearlstein, R.B. (2010), “How to use Kirkpatrick’s taxonomy effectively in the workplace”, in Moseley, J.L.
and Dessinger, J.C. (Eds), Handbook of Improving Performance in the Workplace, International Society
for Performance Improvement, pp. 38-57.
Phillips, J.J. (1991), Handbook of Training Evaluation and Measurement Methods, Gulf Publishing
Company, Houston.
Phillips, J.J. (1996), “How much is the training worth?”, Training and Development,
Vol. 50 No. 4, pp. 20-24.
Phillips, P. (2003), Training Evaluation in the Public Sector, The University of Southern Mississippi, Hattiesburg.
Rahimi, M. and Rosman, M. (2020), “The 5Ws of enterprise content management (ECM) research: is it
worth?”, Open Journal of Science and Technology, Vol. 3 No. 1, pp. 46-70.
Raja Kasim, R.S. and Ali, S. (2011), “Measuring training transfer performance items among academic
staff of higher education institution in Malaysia using Rasch measurement”, 2011 IEEE Colloquium on
Humanities, Science and Engineering, CHUSER 2011, IEEE, pp. 756-760.
Reio, T.G., Rocco, T.S., Smith, D.H. and Chang, E. (2017), “A critique of Kirkpatrick’s evaluation model”,
New Horizons in Adult Education and Human Resource Development, Vol. 29 No. 2, pp. 35-53.
Repanovici, A. (2010), “Measuring the visibility of the university’s scientific production using
Google Scholar, ‘publish or perish’ software and scientometrics”, World Library and Information
Congress: 76th IFLA General Conference and Assembly, Gothenburg, Sweden, pp. 1-14.
Rodin, M. and Rodin, B. (1973), “Student evaluations of teachers”, The Journal of Economic Education,
Vol. 5 No. 1, pp. 5-9.
Rossett, A. (2007), “Leveling the levels”, T and D, Vol. 61 No. 2, pp. 49-53.
Russ-Eft, D. and Preskill, H. (2005), “In search of the holy grail: ROI evaluation in HRD”, Advances in
Developing Human Resources, pp. 71-85.
Salas, E. and Cannon-Bowers, J.A. (2001), “The science of training: a decade of progress”, Annual
Review of Psychology, Vol. 52 No. 1, pp. 471-499.
Schuettfort, V.M., Ludwig, T.A., Marks, P., Vetterlein, M.W., Maurer, V., Fuehner, C., Janisch, F., Soave,
A., Rink, M., Riechardt, S., Engel, O., Fisch, M., Dahlem, R. and Meyer, C.P. (2020), “Learning benefits of
live surgery and semi-live surgery in urology – informing the debate with results from the international
meeting of reconstructive urology (IMORU) VIII”, World Journal of Urology, Springer Berlin Heidelberg,
doi: 10.1007/s00345-020-03506-3.
Shareefa, M. and Moosa, V. (2020), “The most-cited educational research publications on differentiated
instruction: a bibliometric analysis”, European Journal of Educational Research, Vol. 9 No. 1, pp. 331-349.
Smidt, A., Balandin, S., Sigafoos, J. and Reed, V.A. (2009), “The Kirkpatrick model: a useful tool for
evaluating training outcomes”, Journal of Intellectual & Developmental Disability, Vol. 34 No. 3,
pp. 266-274.
Steensma, H. and Groeneveld, K. (2010), “Evaluating a training using the ‘four levels model’”, Journal of
Workplace Learning, Vol. 22 No. 5, pp. 319-331.
Steinert, Y., Mann, K., Centeno, A., Dolmans, D., Spencer, J., Gelula, M. and Prideaux, D. (2006), “A
systematic review of faculty development initiatives designed to improve teaching effectiveness in
medical education: BEME guide no. 8”, Medical Teacher, Vol. 28 No. 6, pp. 497-526.
Stufflebeam, D.L. (1983), “The CIPP model for program evaluation”, in Madaus, G.F., Scriven, M. and
Stufflebeam, D.L. (Eds), Evaluation Models: Viewpoints on Educational and Human Services Evaluation,
Kluwer, Norwell, MA, pp. 117-141.
Suharto, N.T., Slamet, P.H., Jaedun, A. and Purwanta, H. (2020), “The effectiveness of a school-based
disaster risk reduction program in Indonesia: a case study in the Klaten Regency’s junior high schools”,
International Journal of Innovation, Creativity and Change, Vol. 12 No. 12, pp. 949-962.
Tamkin, P., Yarnall, J. and Kerrin, M. (2002), Kirkpatrick and beyond: A Review of Training Evaluation, The
Institute for Employment Studies, Brighton, UK.
Tan, J.A., Hall, R.J. and Boyce, C. (2003), “The role of employee reactions in predicting training
effectiveness”, Human Resource Development Quarterly, Vol. 14 No. 4, pp. 397-411.
Tattersall, A. (2015a), “Who, what, where, when, why: using the 5 Ws to communicate your research |
impact of social sciences”, LSE Impact Blog, pp. 8-10.
Tattersall, A. (2015b), “Who, what, where, when, why: using the 5 Ws to communicate your research |
impact of social sciences”, LSE Impact Blog, pp. 8-10.
Topno, H. (2012), “Evaluation of training and development: an analysis of various models”, IOSR
Journal of Business and Management, Vol. 5 No. 2, pp. 16-22.
Tzeng, G.H., Chiang, C.H. and Li, C.W. (2007), “Evaluating intertwined effects in e-learning programs: a
novel hybrid MCDM model based on factor analysis and DEMATEL”, Expert Systems with Applications,
Vol. 32 No. 4, pp. 1028-1044.
van den Besselaar, P. and Sandström, U. (2020), “Bibliometrically disciplined peer review: on
using indicators in research evaluation”, Scholarly Assessment Reports, Vol. 2 No. 1, doi:
10.29024/sar.16.
Vílchez-Román, C., Sanguinetti, S. and Mauricio-Salas, M. (2020), “Applied bibliometrics and information
visualization for decision-making processes in higher education institutions”, Library Hi Tech, Vol. 39
No. 1, doi: 10.1108/LHT-10-2019-0209.
Warr, P., Bird, M. and Rackham, N. (1970), Evaluation of Management Training: A Practical Framework,
with Cases, for Evaluating Training Needs and Results, 2nd ed., Gower Press, London.
Watkins, R., Leigh, D., Foshay, R. and Kaufman, R. (1998), “Kirkpatrick plus: evaluation and continuous
improvement with a community focus”, Educational Technology Research and Development, Vol. 46
No. 4, pp. 90-96.
Wei, M. (2020), “Research on impact evaluation of open access journals”, Scientometrics, Vol. 122 No. 2,
Springer International Publishing, pp. 1027-1049.
Werner, J.M. and DeSimone, R.L. (2012), Human Resource Development, 6th ed., South-Western,
Cengage Learning, Mason.
Xiong, F., Zhang, Y., Li, T. and Liu, Z. (2008), “Integration of a novel neural network algorithm and
Kirkpatrick model and its application in R&D performance evaluation”, Proceedings - International
Conference on Computer Science and Software Engineering, CSSE. 12-14 December, Vol. 1, IEEE,
Wuhan, China, pp. 353-356.
Yahya, A.B., Noordin, M.K., Ali, D.F., Boon, Y., Hashim, S., Ahmad, J. and Ibrahim, M.A. (2017), “The
impact of ASNAF development training program on the quality of life of the poor and needy”, Man in India,
Vol. 97 No. 13, pp. 307-315.
Yang, S. and Zhu, Q. (2008), “Research on manager training effectiveness evaluation based on
Kirkpatrick model and fuzzy neural network algorithm”, 2008 4th International Conference on Wireless
Communications, Networking and Mobile Computing, IEEE, Dalian, China, pp. 1-4.
Zheng, L., Huang, R. and Yu, J. (2013), “Evaluation of the effectiveness of e-training: a case study on in-
service teachers’ training”, Proceedings – 2013 IEEE 13th International Conference on Advanced
Learning Technologies, ICALT 2013, 15-18 July, IEEE, Beijing, China, pp. 229-231.
Corresponding author
Aljawharah Alsalamah can be contacted at: 14587906@students.lincoln.ac.uk