Effects of Game Based Learning On Students' Mathematics Achievement: A Meta Analysis
All content following this page was uploaded by Elena Novak on 31 January 2019.
a College of Nursing, University of Missouri – St. Louis, St. Louis, MO
b School of Teaching, Learning and Curriculum Studies, Kent State University, Kent, OH
c Educational Psychology, Texas A&M University, College Station, TX
compared to traditional classroom instructional methods. Results from the 24 collected studies showed heterogeneity among effect
sizes, both in magnitude and direction. Using a random-effects model, a small but marginally significant overall effect (𝑑̅RE =
0.13, 𝑝 = .02) suggested that mathematics video games contributed to higher learning gains as compared to traditional instructional
methods. In addition, moderator analyses were mixed in terms of statistical significance and explored effect-size heterogeneity across
effects using grade level, instrument type, length of game-based intervention, country, publication type and study year characteristics.
Overall findings indicate that video games are a slightly effective instructional strategy for teaching mathematics across PreK-12th
grade levels.
Learning mathematics presents various challenges for many children due to the difficult and often tedious nature of the subject
(Sedig, 2008). Educational video games have the potential to address these challenges and positively impact mathematics learning
and attitudes. Video games are able to consume children’s attention for hours while providing instruction and an engaging learning
experience. Video games have been used to promote children’s mathematics achievement in various domains including problem-
solving and algebra skills (Abramovich, 2010), strategic and reasoning abilities (Bottino, Ferlino, Ott, & Tavella, 2007), critical
geometry skills (Yang & Chen, 2010), and arithmetic procedures (Moreno & Duran, 2004). Nevertheless, the National Mathematics
Advisory Panel (NMAP, 2008) and others (e.g., Martinez-Garza, Clark, & Nelson, 2013; Pellegrino & Hilton, 2012; Young et al.,
2012) do not provide a direct recommendation for using games “as a potentially useful tool in introducing and teaching specific
subject-matter content to specific populations” (NMAP, 2008; p. 51) due to the limited number of rigorous studies exploring effects of
game-based learning on mathematics skills development. This meta-analysis addresses this concern by systematically examining the
effects of mathematics video games used in PreK-12 curricula on student mathematics achievement compared to traditional, non-
video game-based classroom instructional methods (i.e., media comparisons). In addition, we assess several possible moderating
effects: Grade Level, Instrument Type, Length of Game-Based Intervention, Country, Publication Type and Publication Year.
In spite of the increased popularity of educational video games over the last two decades, empirical research on the effects of
math video gaming on student academic achievement is inconsistent. For example, Kebritchi (2008) found that high school students
who interacted with the mathematics video game DimensionM outperformed their non-gaming peers. Beserra, Nussbaum, Zeni,
Rodriguez, and Wurman (2014) examined third grade students’ arithmetic performance in game-based and traditional classroom
conditions in three different countries (Brazil, Chile, and Costa Rica). The authors found that game-based learning was more effective
than a traditional classroom approach. Conversely, several studies did not show positive benefits of using video games in a
mathematical classroom (e.g., Costabile, De Angeli, Roselli, Lanzilotti, & Plantamura, 2003; Gelman, 2010; Jones, 2011). For
instance, Ferguson (2014) found that traditional instruction was more effective than game-based instruction for high school Algebra 1
students. Swearingen’s (2011) research showed that both game-based and traditional instructional approaches were equally beneficial
for teaching high school mathematics. Moreover, not only did different research teams that used different games for promoting
distinct learning outcomes report mixed results, some findings by the same researchers who used the same math video games were
A meta-analytic review that quantitatively integrates findings of math video-gaming studies can provide an understanding of
the effectiveness of game-based learning for student mathematics achievement. Previous computer-assisted instruction (CAI) meta-
analyses focused on a broad range of interactive technologies, including drill-and-practice programs, web-based learning materials,
simulations, virtual reality technologies, digital visualization tools, and video games (e.g., Bayraktar, 2001; Chambers, 2002;
Christmann & Badgett, 2003; Cohen & Decanay, 1992). More recent media comparisons meta-analyses refine and focus on video
games, but due to the paucity of video gaming research, video games are lumped together with simulations and virtual reality
technologies (e.g., Merchant, Goetz, Cifuentes, Keeney-Kennicutt, & Davis, 2014). Like older CAI reviews, these meta-analyses did
not set game-based learning apart from learning with other interactive technologies, such as simulations and virtual reality. Moreover,
due to scarce empirical video gaming research these CAI meta-analyses examined general academic achievement or combined
mathematics and science academic achievement, without explicitly focusing on mathematics education. As such, no findings about
Several meta-analyses have attempted to quantitatively synthesize findings of empirical research on game-based learning and
academic achievement (e.g., Connolly et al., 2012; Young et al., 2012). However, due to methodological challenges associated with
the shortage of rigorous research in game-based learning, a qualitative synthesis of video gaming studies was implemented instead of
the initially planned meta-analysis. In addition, recent media comparisons meta-analyses (e.g., Clark, Tanner-Smith, & Killingsworth,
2016; Merchant et al., 2014; Vogel, Greenwood-Ericksen, Cannon-Bowers, & Bowers, 2006) spanned multiple content areas without
addressing the specific instructional needs and requirements of a single subject area (such as mathematics).
The term "game" refers to a learning video game. Garris et al. (2002) note that there is little consensus in the literature on how
to define educational games. Some consider Caillois’s (1961) definition as the most comprehensive analysis of games, characterizing
a game as “an activity that is voluntary and enjoyable, separate from the real world, uncertain, unproductive in that the activity does
not produce any goods of external value, and governed by rules” (p. 442).
In this meta-analysis, we distinguish between games and simulations, as simulations and games offer different learning
affordances. Simulations model real systems by displaying a procedure or phenomenon (Alessi & Trollip, 2001) and do not
necessarily include learning objectives, but do allow students to interact with the simulation and observe how variable manipulation
affects the observed phenomenon. Games, on the other hand, are designed to motivate students and create game-like learning
experiences. Games often encourage competition by setting clear objectives to score as many points as possible, move up in difficulty
level, or win in general (Young et al., 2012). For this meta-analysis we use Shute and Ke’s (2012) definition of video games, which
suggests that good games must have interactive problem solving, specific goals/rules, adaptive challenges, control, ongoing feedback,
uncertainty that evokes suspense and player engagement, and sensory stimuli (a combination of graphics, sounds, and/or storyline
used to excite the senses). These gaming characteristics are essential for creating an effective learning environment that enhances
To examine the learning effectiveness of mathematics video games that included game attributes suggested by Shute and Ke
(2012), this meta-analysis focused on modern digital games – the latest generation of learning video games developed using advanced
technologies and recent pedagogical approaches (Kebritchi, Hirumi, & Bai, 2010). Unlike the earlier generation of educational games
created in the 1980s and 1990s, modern learning video games offer more advanced graphics and interface design, telecommunication and
networking capabilities, immersive 3D learning environments, visual and audio effects, multi-player options, and a learner-centered
This meta-analysis examined the relationships between game-based learning for mathematics skills development and student
mathematics achievement in PreK-12th grades. Specifically, we assessed the relative effectiveness of game-based interventions
compared to a traditional, non-video game-based classroom instruction. Our work was motivated by previous meta-analyses that
emphasized the importance of studying factors which can influence the relationships among mathematics game-based learning and
academic achievement, including mathematics skills and knowledge promoted in a game, game design elements, and design of game-
based interventions (Clark et al., 2016; de Boer, Donker, & van der Werf, 2014; Sitzmann, 2011).
De Boer et al.’s (2014) meta-analysis revealed that instructional intervention methodology influenced students’ academic
performance. For instance, factors such as cooperative/individual learning and implementer of instructional interventions moderated
the intervention effect. This area of research is certainly relevant and important for mathematics video game research, as the
implementation of game-based interventions can directly affect students’ mathematics achievement and engagement.
We conducted an extensive search of published and unpublished research on mathematics video gaming (described in detail
later) in order to collect studies that compared the effectiveness of video gaming in mathematics with a traditional, non-video game-
study participant characteristics (age, gender, race, learning disabilities, and socio-economic status)
general study characteristics (length of game-based intervention, implementer of game-based intervention, teacher training on
game-based instruction, teacher’s familiarity with the learning game(s), amount of time students spent interacting with the
game characteristics
research characteristics (assessment type, outcome format, number of test items, instrument reliability)
However, this list of moderator variables was eventually reduced to six: grade level, instrument type, length of game-based
intervention, country, publication year and type. This reduction was mainly due to methodological challenges and
partially reported study characteristics and study methods. Specifically, classifying video games and math instructional approaches
with regard to their characteristics and types presented considerable challenges for several reasons. First, many studies reported
longitudinal research interventions that took place over an academic quarter or semester (e.g., Carr, 2012; Din & Caleo, 2000;
Swearingen, 2011; Weiss et al., 2010). These studies employed multiple video games for teaching mathematics and likely used
various mathematics classroom instructional approaches. In addition, studies often failed to describe the type of mathematics
classroom instruction used for the so-called “traditional mathematics classroom instruction.” We also considered classifying video
games as general-purpose commercial games and serious games that are “designed with the intention of improving some specific
aspect of learning” (Derryberry, 2010). However, a vast majority of video games employed in the selected studies were serious games
designed to improve specific math skills. Only three studies used general-purpose commercial games (Brain Age 2 in Gelman, 2010;
My Sims in Hawkins, 2008; Sims 2 in Panoutsopoulos & Sampson, 2012). As such, from a quantitative comparison perspective, this
study characteristic was excluded. Furthermore, we attempted to code the studies with regard to math content (e.g., geometry,
arithmetic, algebra). However, due to considerably different school settings and paucity of experimental and quasi-experimental
RQ1: What is the overall relative learning effectiveness of game-based interventions as compared to a traditional, non-video
RQ2: How heterogeneous are results from studies on the overall relative learning effectiveness of game-based interventions as
compared to a traditional, non-video game-based classroom instruction for student mathematics achievement in PreK-12th grades?
RQ3: To what extent do study characteristics, namely grade level, instrument type, length of game-based intervention, country,
4. Method
4.1 Literature search
Searches of ERIC, PsycINFO, Wilson, Google Scholar, JSTOR, and ISI Web of Science databases were performed to collect
empirical studies, peer-reviewed journals, book chapters, theses and dissertations, and conference papers, focusing on the effects of
computer games on student mathematics achievement. The following keywords were used: computer games, electronic games, video
games, computer software, mathematics achievement, mathematics education, number sense, numerical skills, numbers, experiment,
All studies from the initial search were examined by two reviewers and assessed for inclusion in the meta-analysis using the
following criteria:
2. Study employed game-based and traditional, non-video game-based classroom instructional interventions
3. Study used at least one game-based classroom and one traditional classroom
6. It was possible to infer that the video games could be characterized as “good video games” (Shute & Ke, 2012)
performance. Studies focusing on related or general learning outcomes and benefits (e.g., creativity, cognitive strategies, self-efficacy,
work ethics, enjoyment, motivation) were excluded. Upon review of the initial set of 860 studies there were 48 studies that satisfied
the first five inclusion criteria items. However, many studies did not include a clear description of the employed learning games.
These studies were further examined by the first two authors to determine if they could be characterized as learning games and if they
included Shute and Ke’s (2012) attributes of good learning games. The first author’s field of expertise is Educational Research
Methods, Measurement and Statistics, and Instructional Design. The second author specializes in Instructional Design and
Technology with a research focus on video games. The first step in determining whether the employed video games could be
characterized as “good video games” was to search the Web, including video game websites and YouTube videos about the games. If
the initial Web search was insufficient, we searched whether the games were mentioned in the literature with regard to the game
Moreover, some studies did not report sufficient data to calculate effect sizes. A total of 24 studies satisfied the inclusion criteria
and were included in the meta-analysis. From these 24 studies we extracted 39 statistically independent effect sizes. The reason for
having more effect sizes than the number of studies is that, in nine studies there were multiple groups (i.e., pairs) that produced more
than one effect size. Unlike instances where, say, multiple outcomes are provided by the same sample of students (e.g., one reading
outcome and one math outcome), these nine studies provided statistically independent effect sizes from multiple groups. Figure 1
provides a flowchart of the inclusion and exclusion decisions that led to the final dataset.
4.3 Selection of variables
Part of our methods included moderator analyses of select study characteristics. The following moderator variables were
equivalent if outside of the United States), 1st-6th grade students, and 7th grade and above. This variable evaluated whether the
effectiveness of mathematics game-based learning varied across grade levels, to account for the continued increase in
difficulty of mathematics skills and decrease in student motivation to learn from preschool-elementary to middle-high school settings
(Harter, 1981).
4.3.2 Instrument type. The instrument type variable classified the instrument types used in studies. This moderator consisted
of three levels, the first of which was researcher-made scale (surveys, questionnaires, and tests created or partially created by
researchers of the study). If researchers used selected questions or portions from a standardized instrument or large-scale assessment,
this was also considered as a researcher-made instrument. The rationale for this is that once an instrument is altered from its original
form, psychometric qualities often fluctuate and the instrument is no longer presented as intended by the original instrument creator(s).
The other two factor levels were commercial/standardized test and research-based scale. Commercial/standardized tests
consisted of utilitarian standardized instruments and large-scale assessments. These instruments have long-standing validity,
reliability, and psychometric properties which are generally accepted in research literature. Alternatively, research-based scales are
instruments previously used by researchers and are generally accepted and used in studies in their respective field. Overall, the
Instrument Type moderator variable was used in order to examine whether using different types of instruments (researcher-made,
commercial/standardized, and research-based scales) had an effect on student math achievement and showed significant differences
among studies.
4.3.3 Length of game-based intervention. This moderator represented the duration of the game-based intervention in order to
determine whether the intervention length contributed to the effectiveness of game-based learning. Intervention length was categorized
into three levels: up to one hour, between one hour and eight hours, and over eight hours.
4.3.4 Country. A dummy variable was created for the country moderator: Studies completed in the United States and studies
completed outside of the United States. The United States and other countries have different educational systems. This moderator
4.3.5 Publication year. We observed a somewhat large increase in the amount of video-gaming research over the last decade.
Thus, the publication year moderator assessed differences among study results over time. We scaled publication year (i.e., the first year
of publication within the set of studies was set to zero, the next publication year to one, and so forth) for ease of interpretation.
4.3.6 Publication type. As is often done in meta-analysis, we assessed any differences of effects based on the publication type
classification of a study. The publication type of a study consisted of two factor levels: journal and thesis/dissertation. Journal articles
were considered published documents and theses and dissertations were considered unpublished documents.
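The coding rules in Sections 4.3.3 to 4.3.6 can be illustrated with a short Python sketch. The function names are hypothetical, and the handling of the exact one-hour and eight-hour boundaries is an assumption, since the text does not say which level a boundary value falls into.

```python
def length_category(hours: float) -> str:
    """Map intervention duration (hours) to the three levels of Section 4.3.3.

    Assumption: exactly 1 hour counts as "up to one hour" and exactly
    8 hours counts as "between one hour and eight hours".
    """
    if hours <= 1:
        return "up to one hour"
    if hours <= 8:
        return "between one hour and eight hours"
    return "over eight hours"


def country_dummy(country: str) -> int:
    """Return 1 for studies completed in the United States, 0 otherwise."""
    return 1 if country.strip().upper() in {"USA", "UNITED STATES"} else 0


def scale_years(years: list[int]) -> list[int]:
    """Shift publication years so the earliest year in the set becomes 0."""
    first = min(years)
    return [year - first for year in years]


def is_published(publication_type: str) -> bool:
    """Journal articles count as published; theses/dissertations do not."""
    return publication_type.strip().lower() == "journal"
```

For example, `scale_years([2008, 2010, 2016])` returns `[0, 2, 8]`, so a regression slope on the scaled variable reads as the change in effect size per publication year.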
use an unbiased version of the standardized mean difference proposed by Hedges (1981). The unbiased sample standardized mean
$$d_k = \left(1 - \frac{3}{4\left(n_k^{T} + n_k^{C} - 2\right) - 1}\right)\frac{\bar{Y}_k^{T} - \bar{Y}_k^{C}}{S_k^{P}}, \tag{1}$$
where 𝑌̅𝑘T and 𝑌̅𝑘C are respective mean mathematics achievement outcomes for the treatment and control groups, 𝑆𝑘P is a pooled
standard deviation, and 𝑛𝑘T and 𝑛𝑘C are respective treatment and control sample sizes. A positive effect size is interpreted as a mean
difference favoring the treatment group and a negative effect size favoring the control group. In all instances the treatment group
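As a minimal sketch of Equation (1), the following Python functions compute the bias-corrected standardized mean difference from group summary statistics. The pooled standard deviation formula is the usual two-group one and is an assumption here, because the text does not spell it out; all function and argument names are illustrative.

```python
import math


def pooled_sd(sd_t: float, n_t: int, sd_c: float, n_c: int) -> float:
    """Pooled standard deviation (assumed standard two-group formula)."""
    numerator = (n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2
    return math.sqrt(numerator / (n_t + n_c - 2))


def hedges_d(mean_t: float, mean_c: float,
             sd_t: float, sd_c: float,
             n_t: int, n_c: int) -> float:
    """Bias-corrected standardized mean difference, following Equation (1)."""
    correction = 1 - 3 / (4 * (n_t + n_c - 2) - 1)
    return correction * (mean_t - mean_c) / pooled_sd(sd_t, n_t, sd_c, n_c)
```

With hypothetical inputs of means 75 and 70, a standard deviation of 10 in both groups, and 50 students per group, the uncorrected difference is 0.5 and the correction factor is 1 − 3/391, so the small-sample adjustment shrinks the estimate only slightly.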
For five effect sizes, insufficient group-mean results were reported. In these cases we computed the standardized mean
difference as
$$d_k = \pm\sqrt{\frac{F_k\left(n_k^{T} + n_k^{C}\right)}{n_k^{T}\, n_k^{C}}}, \tag{2}$$
where 𝐹𝑘 is the 𝐹 statistic from a one-way analysis of variance (see Borenstein, 2009). After computing effect-size estimates using (1)
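Equation (2) can be sketched in Python as below. The `favors_treatment` flag is a hypothetical argument standing in for the ± sign, which has to be taken from the direction of the group means reported in the primary study.

```python
import math


def d_from_f(f_stat: float, n_t: int, n_c: int,
             favors_treatment: bool = True) -> float:
    """Standardized mean difference recovered from a one-way ANOVA F, Equation (2).

    The sign cannot be recovered from F itself, so the caller supplies it.
    """
    magnitude = math.sqrt(f_stat * (n_t + n_c) / (n_t * n_c))
    return magnitude if favors_treatment else -magnitude
```

For instance, a hypothetical F of 4.0 with 50 students per group gives a magnitude of 0.4.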
An important step when conducting a meta-analysis is determining the degree of homogeneity of a collection of effects.
Several homogeneity measures were computed to assess the similarity of effects. First we computed the 𝑄 statistic (Hedges, 1982), a
$$Q = \sum_{k=1}^{K} v_k^{-1}\left(d_k - \bar{d}_{FE}\right)^2, \tag{4}$$
where 𝑑̅FE is the fixed-effect (i.e., inverse-variance weighted) mean. Under the null hypothesis of effect-size homogeneity, 𝑄 follows
an approximate chi-square distribution with 𝐾 − 1 degrees of freedom. Larger 𝑄 values correspond to more disagreement among
effect sizes.
The second homogeneity measure was 𝐼² (Higgins & Thompson, 2002; Higgins, Thompson, Deeks, & Altman, 2003),
computed as
$$I^2 = \frac{Q - K + 1}{Q} \times 100\%. \tag{5}$$
Rough interpretations of 𝐼² are no variation, low variation, moderate variation, and high variation for values 0, 25, 50, and 75,
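The two homogeneity measures in Equations (4) and (5) can be sketched as follows, given a list of effect sizes `d` and their sampling variances `v` (both names are illustrative). Truncating negative 𝐼² values at zero is a common convention and is an assumption here, not something stated in the text.

```python
def fixed_effect_mean(d: list[float], v: list[float]) -> float:
    """Inverse-variance weighted (fixed-effect) mean of the effect sizes."""
    weights = [1 / vk for vk in v]
    return sum(w * dk for w, dk in zip(weights, d)) / sum(weights)


def q_statistic(d: list[float], v: list[float]) -> float:
    """Homogeneity statistic Q, Equation (4); df = K - 1 under the null."""
    d_fe = fixed_effect_mean(d, v)
    return sum((dk - d_fe) ** 2 / vk for dk, vk in zip(d, v))


def i_squared(q: float, k: int) -> float:
    """I^2 in percent, Equation (5), truncated at 0 (assumed convention)."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - k + 1) / q * 100)
```

With two hypothetical effects of 0.0 and 0.4, each with variance 0.04, the fixed-effect mean is 0.2, 𝑄 is 2.0 on one degree of freedom, and 𝐼² is 50%.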
Based on the results from homogeneity tests (shown in the Results section), as well as our intention to generalize results to
more than the set of the collected studies, we adopted a random-effects model when making inferences to the overall results. A
benefit of the random-effects model is the incorporation of a non-zero variance parameter, 𝜏², which represents between-studies
heterogeneity. This between-studies variance component represents the degree of heterogeneity among the study-specific effects
(i.e., not at the primary-study level). Furthermore, under a random-effects model we allow individual study effects to differ across
studies. In distributional form, the random-effects model used in our research is
$$d_k \sim N(\delta_k, v_k), \qquad \delta_k \sim N(\mu, \tau^2), \tag{6}$$
where 𝛿𝑘 represents the true value of the 𝑘th effect size, 𝜇 is the population parameter for the overall mean of effects, and 𝜏² is the between-
For the overall effect we computed a random-effects mean, its variance, and the associated 95% confidence interval. The
$$v_{RE} = \frac{1}{\sum_{k=1}^{K}\left(v_k + \hat{\tau}^2\right)^{-1}}. \tag{8}$$
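A minimal Python sketch of the random-effects summary follows. The variance in `random_effects_summary` matches Equation (8); the weighted mean is the standard inverse-variance form, and the DerSimonian-Laird estimator in `dl_tau2` is an assumption, because the surviving text does not name the estimator of 𝜏² that was used. The 1.96 multiplier gives a normal-theory 95% confidence interval.

```python
import math


def dl_tau2(d: list[float], v: list[float]) -> float:
    """DerSimonian-Laird estimate of tau^2 (assumed estimator, see lead-in)."""
    k = len(d)
    if k < 2:
        return 0.0
    w = [1 / vk for vk in v]
    sw = sum(w)
    d_fe = sum(wk * dk for wk, dk in zip(w, d)) / sw
    q = sum(wk * (dk - d_fe) ** 2 for wk, dk in zip(w, d))
    c = sw - sum(wk**2 for wk in w) / sw
    return max(0.0, (q - (k - 1)) / c)


def random_effects_summary(d: list[float], v: list[float], tau2: float):
    """Return (mean, variance, 95% CI) under the random-effects model."""
    w = [1 / (vk + tau2) for vk in v]
    v_re = 1 / sum(w)                      # Equation (8)
    mean = v_re * sum(wk * dk for wk, dk in zip(w, d))
    half_width = 1.96 * math.sqrt(v_re)
    return mean, v_re, (mean - half_width, mean + half_width)
```

Usage is two steps: `tau2 = dl_tau2(d, v)` followed by `random_effects_summary(d, v, tau2)`. Setting `tau2` to zero reduces the result to the fixed-effect mean and variance.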
4.7 Moderator analyses
As a follow-up to our unconditional overall models, we considered two types of conditional random-effects models to explain
systematic effect-size heterogeneity: ANOVA-like models and meta-regression. Both modeling techniques use effect sizes as
dependent variables and study characteristics as independent variables. In total, we examined six study characteristics (Grade Level,
Instrument Type, Length of Game-Based Intervention, Country, Publication Year and Type) as moderators of effect sizes. Of the six
coded variables, five were categorical (analyzed using ANOVA-like models) and one was continuous (analyzed using meta-
regression).
For the ANOVA-like models, we report within-group effect means and their standard errors, as well as 95% confidence
intervals. We also report two forms of chi-square statistics, 𝑄𝑏 and 𝑄𝑤 . These two measures assess predictor-specific significance in
terms of explaining systematic effect-size heterogeneity (𝑄𝑏 ) and within-group variability of effects (𝑄𝑤 ). Because 𝑄𝑤 is group-
specific, each group within a moderator (i.e., factor level) will have an estimate, while 𝑄𝑏 will have a single value for each specific
factor.
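The partition behind the ANOVA-like models can be sketched as below. The function takes a 𝜏² value as an input: passing `tau2=0` gives fixed-effect weights, while a positive value gives mixed-effects weights. Which weights were used for each reported statistic is not fully recoverable from the text, so the sketch is illustrative only, and all names are hypothetical.

```python
from collections import defaultdict


def subgroup_analysis(d, v, groups, tau2=0.0):
    """Return per-group results, Q_b, and the degrees of freedom of Q_b."""
    w = [1 / (vk + tau2) for vk in v]
    grand_mean = sum(wk * dk for wk, dk in zip(w, d)) / sum(w)

    by_group = defaultdict(list)
    for dk, wk, g in zip(d, w, groups):
        by_group[g].append((dk, wk))

    results = {}
    q_between = 0.0
    for g, items in by_group.items():
        sw = sum(wk for _, wk in items)
        mean = sum(dk * wk for dk, wk in items) / sw
        q_within = sum(wk * (dk - mean) ** 2 for dk, wk in items)
        results[g] = {
            "mean": mean,
            "se": sw ** -0.5,          # standard error of the group mean
            "q_w": q_within,           # within-group homogeneity statistic
            "df": len(items) - 1,
        }
        q_between += sw * (mean - grand_mean) ** 2
    return results, q_between, len(by_group) - 1
```

When the same weights are used throughout, 𝑄𝑏 plus the sum of the group-specific 𝑄𝑤 values equals the total homogeneity statistic, which is a useful check on the arithmetic.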
The meta-regression results include coefficients and their standard errors, as well as 95% confidence intervals. Similar to the
ANOVA-like modeling, we also provide two chi-square statistics, 𝑄𝑚 and 𝑄𝑒. Both are related measures of model fit, with larger
values of 𝑄𝑚 and smaller values of 𝑄𝑒 corresponding to greater explanatory power of the effect-size heterogeneity by the set of model
predictors.
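For a single continuous moderator, the meta-regression quantities can be sketched with weighted least squares in plain Python. This is an illustration under the assumption that 𝜏² is supplied as a known value; the function and key names are hypothetical and do not correspond to any R package's interface.

```python
def meta_regression(d, x, v, tau2=0.0):
    """Weighted simple regression of effect sizes d on one moderator x.

    Weights are 1 / (v_k + tau2). Q_m is the Wald chi-square for the
    slope (df = 1) and Q_e is the residual heterogeneity (df = K - 2).
    """
    w = [1 / (vk + tau2) for vk in v]
    sw = sum(w)
    x_bar = sum(wk * xk for wk, xk in zip(w, x)) / sw
    d_bar = sum(wk * dk for wk, dk in zip(w, d)) / sw
    sxx = sum(wk * (xk - x_bar) ** 2 for wk, xk in zip(w, x))
    sxy = sum(wk * (xk - x_bar) * (dk - d_bar) for wk, xk, dk in zip(w, x, d))

    slope = sxy / sxx
    intercept = d_bar - slope * x_bar
    se_slope = sxx ** -0.5
    q_m = (slope / se_slope) ** 2
    q_e = sum(wk * (dk - intercept - slope * xk) ** 2
              for wk, xk, dk in zip(w, x, d))
    return {"intercept": intercept, "slope": slope, "se_slope": se_slope,
            "q_m": q_m, "q_e": q_e, "df_e": len(d) - 2}
```

A larger `q_m` relative to `q_e` indicates that the moderator accounts for more of the effect-size heterogeneity, matching the interpretation given above.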
Publication bias concerns studies with statistically significant and/or larger effects having publication preference over smaller
and/or non-statistically significant effects (see Rothstein, Sutton, & Borenstein, 2006). We used several methods to check for
publication bias. First, we provided a funnel plot (Figure 3) and assessed the expected relationship between effect-size magnitudes
and their standard errors. Along with this graphic, we used three quantitative assessment methods: Trim and Fill (Duval & Tweedie,
2000), Egger’s regression test (Egger et al., 1997), and Failsafe-N (Rosenthal, 1979). None of these tools proves the presence or
absence of publication bias; rather, they collectively indicate the likelihood of publication bias. All analyses and graphics were
completed in R (R Core Team, 2016) using the meta package (Schwarzer, 2015) and metafor package (Viechtbauer, 2010).
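Two of the quantitative checks can be sketched in Python for illustration. `failsafe_n` follows Rosenthal's formula with a one-tailed critical value of 1.645, and `egger_intercept` follows the classical form of Egger's test (an unweighted regression of the standardized effect on precision). Both are assumptions about the general technique and may differ in detail from the defaults of the R functions the analyses actually relied on.

```python
import math


def failsafe_n(d, v, z_alpha=1.645):
    """Rosenthal's fail-safe N: number of null effects needed to bring the
    combined one-tailed significance level up to alpha."""
    z_sum = sum(dk / math.sqrt(vk) for dk, vk in zip(d, v))
    return z_sum**2 / z_alpha**2 - len(d)


def egger_intercept(d, v):
    """Classical Egger regression: standardized effect on precision.

    Returns (intercept, t statistic); the t statistic has K - 2 df.
    An intercept far from zero suggests funnel-plot asymmetry.
    """
    se = [math.sqrt(vk) for vk in v]
    y = [dk / s for dk, s in zip(d, se)]
    x = [1 / s for s in se]
    n = len(d)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_intercept = math.sqrt(rss / (n - 2) * (1 / n + x_bar**2 / sxx))
    return intercept, intercept / se_intercept
```

Note that `egger_intercept` needs at least three effects and some residual scatter; with a perfect linear fit the standard error is zero and the t statistic is undefined.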
5. Results
The 39 effect sizes in the meta-analysis ranged from -0.73 to 0.87, with roughly 67% of point estimates being positive (i.e.,
favoring the video-gaming instruction group). Of the 24 unique studies, 9 contributed multiple (but statistically independent)
effect sizes. Primary study sample sizes ranged from 41 to 437 students (combined intervention and control group samples) with
publication dates from 2008 to 2016. Based on point estimates and confidence intervals shown in Figure 2, the variability of effect-
size magnitudes and precision appear somewhat heterogeneous. Not only do the effect-size magnitudes vary, the range spanned both
positive and negative sides of the spectrum, further suggesting a diverse collection of effects. Both statistical assessments of
homogeneity, 𝑄(38) = 92.25, 𝑝 < .001 and 𝐼² = 60.19%, supported parts of our interpretations of Figure 2 regarding the effect-size
heterogeneity. Given these results and our generalizability intentions when answering our research questions, we forwent using a
more restrictive fixed-effect model and adopted random-effects models when making statistical inferences. Overall and moderator
0.24]. The overall effect of video-gaming instruction on mathematical achievement was marginally significant and quite variable, as
denoted by the rather wide confidence interval and relatively large standard error (𝑆𝐸 = 0.06). Furthermore, Figure 2 suggests that
effects likely vary from study-to-study. This is supported by the between-studies variability estimate of 𝜏̂² = 0.07 (𝑆𝐸 = .03).
When assessing publication bias, all indicators pointed to a small likelihood of publication bias. The funnel plot (Figure 3) shows
a moderate amount of effect-size symmetry. Trim-and-Fill results required no imputed effects to achieve symmetry, which also
aligns with the lack of statistical significance of Egger’s regression test (−0.820, 𝑝 = .41). Last, the Failsafe-N assessment required
141 additional non-statistically significant effects to be added for our overall mean results for its observed significance level to a non-
In this section we discuss results of moderator analyses. The selected study characteristics varied in how much of the
effect-size variability they explained. We discuss each of the six moderators separately.
5.1.1 Grade Level. The grade level moderator did not explain a statistically significant amount of effect-size heterogeneity
(𝑄𝑏 (2) = 4.00, 𝑝 = .14). However, variability within groups was statistically significant for two of the three groups. More
specifically, effect sizes varied within the 1st Grade – 6th Grade group (𝑄𝑤 (25) = 52.84, 𝑝 < .001) and the 7th grade and above group
(𝑄𝑤 (9) = 24.22, 𝑝 < .01). Effect-size variability was not statistically significant for the Preschool-Kindergarten group (𝑄𝑤 (2) =
6.00, 𝑝 = .05). This result, in terms of statistical significance, is dependent on the choice of Type I error level (i.e., 𝛼). In our case we
chose to be more conservative and not assume statistical significance rather than risk the inflation of a result (i.e., assume statistical
significance).
5.1.2 Instrument type. A non-significant amount of the effect-size heterogeneity was explained by the instrument type
moderator (𝑄𝑏 (2) = 3.01, 𝑝 = .22). Thus, the use of different measurement instruments for the assessment of mathematics
achievement did not seem to impact the effect of the intervention. However, two of the three measure types did show significant
within-group variability, specifically both commercial/standardized tests (𝑄𝑤 (9) = 28.10, 𝑝 < .001) and researcher-made scales
(𝑄𝑤 (17) = 41.51, 𝑝 < .001). The effect-size variability was not statistically significant for the research-based scales (𝑄𝑤 (10) =
13.62, 𝑝 = .19).
5.1.3 Length of game-based intervention. Like the grade level, country, and instrument type moderators, the length of game-
based intervention did not explain a statistically significant amount of effect-size heterogeneity (𝑄𝑏 (2) = 2.51, 𝑝 = .28). However,
all three groups showed statistically significant within-group variability: up to one hour (𝑄𝑤 (10) = 25.04, 𝑝 < .01), between one hour and eight hours
(𝑄𝑤 (20) = 41.05, 𝑝 < .01), and over eight hours (𝑄𝑤 (6) = 22.92, 𝑝 < .001).
5.1.4 Country. Similar to the grade level moderator, we did not find a significant amount of explained effect-size
heterogeneity for the country moderator (𝑄𝑏 (1) = .29, 𝑝 = .60). Nevertheless, variability within groups was found to be statistically
significant. For studies from the United States, the homogeneity test statistic was 𝑄𝑤 (16) = 35.99, 𝑝 < .01 and for studies outside of
the United States the homogeneity statistic was 𝑄𝑤 (21) = 54.20, 𝑝 < .001.
5.1.5 Publication type. The publication type moderator was one of the few moderators found to have explained a significant
amount of effect-size heterogeneity (𝑄𝑏 (1) = 6.49, 𝑝 = .01). This indicates that effect sizes varied between publication types (journal or
thesis/dissertation). The mean effect of the journal group was 0.21(𝑆𝐸 = 0.06), while the mean for the thesis/dissertation group was
−0.07(𝑆𝐸 = 0.09), showing a disagreement in both magnitude and direction. Moreover, the variance within groups differed between
the source types. While the variance of effects within the thesis/dissertation type was not statistically significant (𝑄𝑤 (8) = 11.14, 𝑝 =
.19), the variance of effects within the journal type was significant (𝑄𝑤 (29) = 58.92, 𝑝 < .001).
5.1.6 Publication year. The publication year variable was scaled so that the year 2000 (i.e., the earliest year of collected
studies) was set to 0, then 2001 set to 1, and so forth. Our analysis revealed that the publication year slightly influenced the amount of
the effect-size variability (𝑄𝑚 (1) = 4.46, 𝑝 = .03). The slope from this regression model was 0.01(𝑆𝐸 = .01), indicating a small
increase in effect-size magnitude (i.e., an increase in the effect of video-gaming instruction on mathematics achievement) as the
publication year of a study increased. However, the publication year moderator did not explain all effect-size variability (𝑄𝑒 (37) =
Table 1
𝐾 Coefficient(SE) 95% CI 𝑄𝑒
Publication Year [𝑄𝑚 (1) = 4.46, p = .03] 39 0.01(0.01) [0.00, 0.02] 79.88***
*p < .05; **p < .01; ***p < .001
Note: Means within groups are weighted under the mixed-effects model; 𝑗 indicates a specific group; Regression coefficient is standardized.
6. Discussion
The empirical research on video games in mathematics education remains limited (Connolly et al., 2012) and our present study
has further confirmed the paucity of research in this area. Despite a considerably large number of the reviewed studies (over 800),
only 24 studies that compared mathematics game-based learning with traditional instructional methods were included in the meta-
analysis (see Table 2). To generalize beyond the collection of studies in this meta-analysis, two types of random-effects models were
used. One random-effects model provided an unconditional representation of the overall effect of video-gaming instruction on
mathematics achievement. The second set of random-effects models looked at potential explanatory factors of systematic effect-size
variation.
Our first research question asked: what is the overall relative learning effectiveness of game-based interventions as compared to
traditional, non-video game-based classroom instruction for student mathematics achievement in PreK-12th grades? A small but
marginally significant overall effect (𝑑̅RE = 0.13, 𝑝 = .02) suggests that mathematics video games contribute to a higher degree of
mathematics achievement compared to traditional instructional methods. Although previous meta-analyses did not focus specifically
on mathematics achievement, our findings converge with previous media-comparison meta-analyses that revealed benefits of CAI
relative to non-CAI conditions (Clark et al., 2016; Merchant et al., 2014; Vogel et al., 2006).
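To make the idea of an unconditional random-effects mean concrete, the following sketch pools a handful of standardized mean differences. It is only an illustration under stated assumptions: the effect sizes and variances are invented, the function name is hypothetical, and the DerSimonian-Laird estimator of the between-study variance is used for simplicity, which is not necessarily the estimator used in this meta-analysis.

```python
import math


def random_effects_pool(effects, variances):
    """Pool effect sizes with a DerSimonian-Laird random-effects model."""
    if len(effects) != len(variances) or not effects:
        raise ValueError("effects and variances must be non-empty and equal in length")

    # Fixed-effect (inverse-variance) weights and mean, needed to compute Q.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed_mean = sum(wi * d for wi, d in zip(w, effects)) / sum_w

    # Q: weighted squared deviations of each study from the fixed-effect mean.
    q = sum(wi * (d - fixed_mean) ** 2 for wi, d in zip(w, effects))
    df = len(effects) - 1

    # Method-of-moments estimate of between-study variance, floored at 0.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each study's sampling variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    mean = sum(wi * d for wi, d in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    # Two-sided p-value from a normal approximation.
    p = math.erfc(abs(mean / se) / math.sqrt(2.0))
    return {"mean": mean, "se": se, "tau2": tau2, "q": q, "p": p}


# Hypothetical effect sizes (d) and sampling variances, for illustration only.
demo = random_effects_pool([-0.1, 0.1, 0.5, 0.3], [0.02, 0.03, 0.01, 0.04])
print({k: round(v, 4) for k, v in demo.items()})
```

Because the random-effects weights include the between-study variance, large studies dominate the pooled mean less than they would under a fixed-effect model, which is why this model suits a collection of effects that differ in both magnitude and direction.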
Table 2
Study | Sample Size | Grade Level | Country | Instrument Type | Game(s) | Length of Game-based Intervention
Bai et al. (2012) | 437 | 7th-12th | USA | researcher-made instrument | DimensionM | long
Beserra et al. (2014) | 271 | 1st-6th | Brazil, Chile, and Costa Rica | research-based scale | Researcher-developed game | medium
Carr (2012) | 104 | 1st-6th | USA | researcher-made instrument | iPad math games | long
Chang et al. (2012) | 92 | 1st-6th | Taiwan | commercial/standardized test | Researcher-developed game | medium
Chang et al. (2015) | 306 | 1st-6th | USA | commercial/standardized test | Researcher-developed game | medium
Ferguson (2014) | 222 | 7th-12th | USA | researcher-made instrument | Slope game | medium
Garneli et al. (2016) | 80 | 1st-6th | Greece | commercial/standardized test | Gem-game | short
Gelman (2010) | 80 | 7th-12th | USA | researcher-made instrument | Brain Age 2, Nintendo DS | long
Hall (2015) | 405 | 1st-6th | USA | researcher-made instrument | iPad multiplication games | medium
Hawkins (2008) | 139 | 1st-6th | USA | researcher-made instrument | MySims Wii, Nintendo Wii | medium
Hung et al. (2014) | 69 | 1st-6th | China | research-based scale | Researcher-developed Brick Breaker game | medium
Ke (2008) | 358 | 1st-6th | USA | researcher-made instrument | ASTRA EAGLE (a series of web-based computer games; academic content is based on the Pennsylvania System of School Assessment (PSSA) standards for mathematics) | medium
King (2011) | 128 | 7th-12th | USA | research-based scale | VmathLive (academic content is based on the National Council of Teachers of Mathematics (NCTM) standards) | medium
Lin et al. (2013) | 64 | 1st-6th | Taiwan | research-based scale | Researcher-developed Monopoly game | short
Panoutsopoulos & Sampson (2012) | 57 | 1st-6th | Greece | commercial/standardized test | Sims 2–Open for Business | medium
Pareto et al. (2012) | 47 | 1st-6th | Sweden | researcher-made instrument | Researcher-developed Teachable Agents game | long
Ploger & Hecht (2009) | 196 | 1st-6th | USA | research-based scale | Chartworld | medium
Sedig (2008) | 59 | 1st-6th | Canada | research-based scale | Researcher-developed Super Tangrams game | medium
Shin et al. (2012) | 41 | 1st-6th | USA | research-based scale | Skills Arena | long
Starkey (2013) | 168 | 7th-12th | USA | researcher-made instrument | Lure of the Labyrinth | medium
Sung, Chang & Lee (2008) | 60 | PreK-K | Taiwan | commercial/standardized test | Researcher-developed game | short
Swearingen (2011) | 280 | 7th-12th | USA | research-based scale | Researcher-developed massive multiplayer online game (MMOG) | long
Weiss et al. (2006) | 116 | 1st-6th | Israel | research-based scale | Goldilocks series games | long
EFFECTS OF GAME-BASED LEARNING 31
Related to this, our second research question asked how variable are results from studies
video game-based classroom instruction for student mathematics achievement? Based on our
study results. In other words, not all studies agreed on the effectiveness of video-gaming
achievement.
To further explore potential reasons for this effect-size heterogeneity, our third research
question asked to what extent did study characteristics (grade level, country, instrument type,
length of game-based intervention, publication type and publication year) moderate the effect.
Of the six moderators used in our study, only two (publication type and publication year) had
With regard to grade level, results suggest that mathematics video games were similarly
beneficial for students from various grade levels. These findings converge with Vogel et al.'s
(2006) media-comparison meta-analysis, which examined the effects of games and simulations on
students' general performance across various disciplines. Vogel et al. (2006) found that interactive
games and simulations were more beneficial for cognitive gains than traditional teaching
methods for all age groups. Our literature search revealed only three published studies
students. Thus, the findings should be interpreted with caution when generalizing to the PreK-K
population since the results were obtained for the most part from studies with 1st-12th grade
students.
Studies used in the meta-analysis varied considerably in terms of the length of the game-based
interventions (from a single game session of 33 minutes to multiple game sessions
totaling 10,080 minutes). In line with previous research (Clark et
al., 2016; Merchant et al., 2014), the length of game-based interventions did not have significant
explanatory power. This finding may appear counterintuitive – one might expect that longer
interventions should be more effective than shorter interventions. However, prior research on
the effects of interventions in mathematics classrooms reports that intervention duration has only
a small impact on students’ academic achievement in both primary and secondary schools
Similar to Clark et al. (2016) and Merchant et al. (2014), this meta-analysis revealed non-
associated with higher quality, some video-gaming studies assess learning outcomes for which
validated instruments were unavailable. We note that if a study utilized only selected questions
researcher-made instrument in the present meta-analysis. Given that the present and previous
validated instruments are unavailable would not necessarily diminish the quality of video game
research.
The publication year of a study had significant explanatory power with respect to effect-
size heterogeneity. Though only to a small degree, the effect of game-based learning on
mathematics achievement increased as the year of publication increased. There are several
possible explanations for this finding. First, because game-based learning is a developing area of
research, it is plausible to assume that game-based interventions utilized in more recent studies
were informed by previous video gaming research and therefore capitalized on past results.
Furthermore, the video games industry is constantly seeking new ways to produce more
engaging and higher quality games. These innovations could possibly contribute to higher
In addition to publication year, the publication type moderator demonstrated that the
unpublished studies. The set of published studies had a statistically significant mean effect size
which was positive and small-to-moderately large in magnitude. Conversely, the collection of
unpublished studies had a mean effect size which was not statistically different from zero. This
implies that published studies tended to claim larger effectiveness of the video-gaming
This meta-analysis highlighted the need for more empirical research on mathematics
video games in order to deepen our understanding of how video games can enhance mathematics
learning. Our initial intention was to examine various factors that could affect the relationship
individual differences, video game design characteristics, and attributes of video game-based
interventions. However, most of the identified mathematics video game studies only provided
partial information about the video games and game-based instructional interventions, thus
limiting our ability to systematically examine the effects of several moderator variables. To
advance research on mathematics game-based learning, we urge authors to include more detailed
learning game(s) and expected learning outcomes. For example, it is important to report how the
employed game(s) align with the classroom curriculum, the amount of video game training that
teachers received, teacher familiarity with the game(s), how the video game intervention was
implemented and who implemented it, the duration and frequency of video game interventions,
With regard to future meta-analysis research on mathematics video games, there is a need
to examine how video games facilitate acquisition of mathematics skills and concepts within
different mathematical domains (e.g., geometry, arithmetic, algebra). Examining how video
games facilitate acquisition of various skills can advance our understanding of how to select an
optimal video game for enhanced learning of specific mathematics concepts and skills. Thus,
future research should attempt to examine whether mathematics learning tasks can explain the
Clark et al. (2016) emphasized the importance of studying the relationships among game
design and learning outcomes. This is certainly true for mathematics video game research. We
should devote more attention to connecting game design characteristics with specific learning
outcomes across various mathematical domains. However, current literature reviews suggest
that this type of research investigation can be a challenging task. Earlier research on educational
video games usually employed a single video game for an instructional intervention, which
allowed for a focus on game design and its impact on learning and engagement. However,
technological advances in digital video games created new opportunities and expectations for
teaching and learning. More recent studies implemented mathematics game-based learning using
a series of video games or multiple video-gaming apps that utilized various game designs and
genres and were played in the same game sessions, thus making the task of examining the role of
can improve the quality of video game research in general and mathematics video gaming in
particular (de Boer et al., 2014). This area of research is limited within the game-based learning
relationships among these attributes and learning outcomes would be an important contribution
Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
References
Alessi, S. M., & Trollip, S. R. (2001). Multimedia for learning: Methods and development. New
Jersey: Pearson.
*Bai, H., Pan, W., Hirumi, A., & Kebritchi, M. (2012). Assessing the effectiveness of a 3-D
doi:10.1111/j.1467-8535.2011.01269.x
*Beserra, V., Nussbaum, M., Zeni, R., Rodriguez, W., & Wurman, G. (2014). Practising
Borenstein, M. (2009). Effect sizes for continuous data. In H. M. Cooper, L.V. Hedges, & J. C.
Valentine (Eds.), The handbook of research synthesis and meta-analysis. (pp. 221-235).
Bottino, R. M., Ferlino, L., Ott, M., & Tavella, M. (2007). Developing strategic and reasoning
abilities with computer games at primary school level. Computers & Education, 49(4),
1272-1286.
Caillois, R. (1961). Man, play and games (M. Barash, Trans.). New York: Free Press.
*Carr, J. M. (2012). Does math achievement h'APP'en when iPads and game-based learning are
*Chang, K.-E., Wu, L.-J., Weng, S.-E., & Sung, Y.-T. (2012). Embedding game-based problem-
solving phase into problem-posing system for mathematics learning. Computers &
*Chang, M., Evans, M. A., Kim, S., Norton, A., & Samur, Y. (2015). Differential effects of
57.
Clark, D. B., Tanner-Smith, E., & Killingsworth, S. (2016). Digital games, design, and learning:
Cohen, P. A., & Dacanay, L. S. (1992). Computer-based instruction and health professions
259-281.
Connolly, T. M., Boyle, E. A., MacArthur, E., Hainey, T., & Boyle, J. M. (2012). A systematic
Costabile, M., De Angeli, A., Roselli, T., Lanzilotti, R., & Plantamura, P. (2003). Evaluating the
de Boer, H., Donker, A. S., & van der Werf, M. P. C. (2014). Effects of the attributes of
explanations of scoring and incentives on math learning, game performance, and help
Dignath, C., & Büttner, G. (2008). Components of fostering self-regulated learning among
*Din, F. S., & Caleo, J. (2000). Playing computer games versus better learning. Paper presented
Duval, S., & Tweedie, R. (2000). A nonparametric ‘Trim and Fill’ method of assessing
95(449), 89-98.
Egger, M., Smith, G. D., Schneider, M., & Minder, C. (1997). Bias in meta-analysis detected by
*Ferguson, T. (2014). Mathematics achievement with digital game-based learning in high school
Fu, K. (1999). My first math adventure – Counting and classification [Computer software].
*Garneli, V., Giannakos, M., & Chorianopoulos, K. (early view). Serious games as a malleable
Garris, R., Ahlers, R., & Driskell, J. E. (2002). Games, motivation, and learning: A research and
*Gelman, A. (2010). Mario math with millennials: the impact of playing the Nintendo DS on
*Giannakos, M. N. (2013). Enjoy and learn with educational games: Examining factors affecting
*Hall, M. M. (2015). Traditional vs. technology based math fluency practice and its effect on
of St. Francis.
Harter, S. (1981). A new self-report scale of intrinsic versus extrinsic orientation in the
300-312.
Hedges, L. V. (1981). Distribution theory for Glass’ estimator of effect size and related
Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis. Orlando, FL: Academic
Press.
Higgins, J. P. T., Thompson, S. G., Deeks, J. J., & Altman, D. G. (2003). Measuring
*Hung, C.-M., Huang, I., & Hwang, G.-J. (2014). Effects of digital game-based learning on
*Jones, V. C. (2011). The effects of computer gaming on student motivation and basic
*Ke, F. (2006). Computer-Based playing within alternative classroom goal structures on fifth-
grades' math learning outcomes: cognitive, metacognitive, and affective evaluation and
Ke, F. (2008a). Alternative goal structures for computer game-based learning. Computer-
*Ke, F. (2008b). A case study of computer gaming for math: Engaged learning from gameplay?
Ke, F. (2008c). Computer games application within alternative classroom goal structures:
*Ke, F., & Grabowski, B. (2007). Gameplaying for maths learning: cooperative or not? British
*Kebritchi, M., Hirumi, A., & Bai, H. (2010). The effects of modern mathematics computer
55(2), 427-443.
*King, A. (2011). Using interactive games to improve math achievement among middle school
Lin, C.-H., Liu, E. Z.-F., Chen, Y.-L., Liou, P.-Y., Chang, M., Wu, C.-H., & Yuan, S.-M. (2013).
Martinez-Garza, M., Clark, D. B., & Nelson, B. (2013). Digital games and the US National
Research Council's science proficiency goals. Studies in Science Education, 49, 170-208.
Merchant, Z., Goetz, E. T., Cifuentes, L., Keeney-Kennicutt, W., & Davis, T. J. (2014).
Moreno, R., & Duran, R. (2004). Do multiple representations need explanations? The role of
National Mathematics Advisory Panel (2008). Foundations for Success: The Final Report of the
games into school context. Educational Technology & Society, 15(1), 15–27.
*Pareto, L., Haake, M., Lindström, P., Sjödén, B., & Gulz, A. (2012). A teachable-agent-based
Pellegrino, J., & Hilton, M. (2012). Education for Life and Work: Developing Transferable
Knowledge and Skills in the 21st Century. Washington, D.C. National Research
Academy.
*Ploger, D., & Hecht, S. (2009). Enhancing children's conceptual understanding of mathematics
277.
R Core Team. (2016). R: A language and environment for statistical computing (version 3.3.2).
project.org
Rosenthal, R. (1979). The “file drawer problem” and tolerance for null results. Psychological
Rothstein, H. R., Sutton, A. J., & Borenstein, M. (Eds.). (2006). Publication bias in meta-
*Sedig, K. (2008). From play to thoughtful learning: A design strategy to engage children with
*Shin, N., Sutherland, L. M., Norris, C. A., & Soloway, E. (2012). Effects of game technology
Shute, V. J., & Ke, F. (2012). Games, learning and assessment. In D. Ifenthaler, D. Eseryel &
Siemer, J., & Angelides, M. C. (1995). Evaluating intelligent tutoring with gaming simulations.
SMEC (1999). Toby’ IQ training camp – The seed of logic [Computer Software]. Taipei: SMEC
*Starkey, P. L. (2013). The effects of digital games on middle school students’ mathematical
*Sung, Y.-T., Chang, K.-E., & Lee, M.-D. (2008). Designing multimedia games for young
*Swearingen, D. K. (2011). Effect of digital game based learning on ninth grade students’
*Van Eck, R., & Dempsey, J. (2002). The effect of competition and contextualized advisement
Vogel, J. J., Vogel, D. S., Cannon-Bowers, J., Bowers, C. A., Muse, K., & Wright, M. (2006).
*Weiss, I., Karamarski, B., & Talis, S. (2006). Effect of multimedia environments on
Yang, J. C., & Chen, S. Y. (2010). Effects of gender differences and spatial abilities within a
Young, M. F., Slota, S., Cutter, A. B., Jalette, G., Mullin, G., Lai, B., et al. (2012). Our princess
Figures: