McDermott 2002
■ Abstract This article reviews the use of experiments in political science. The
beginning section offers an overview of experimental design and measures, as well as
threats to internal and external validity, and discusses advantages and disadvantages
to the use of experimentation. The number and placements of experiments in political
science are reviewed. The bulk of the essay is devoted to an examination of what we
have learned from experiments in the behavioral economics, political economy, and
individual choice literatures.
INTRODUCTION
Anyone who takes an antibiotic, confident that illness will remit, is implicitly
trusting in the power and validity of experiments as applied to real-world contexts.
Indeed, the hard sciences, including biology, chemistry, physics, and medicine,
all rely primarily on experimentation to examine and illuminate basic processes.
Psychology embodies a long and distinguished history of experimentation, and
behavioral economics, which involves a great deal of experimentation, has re-
cently gained increasing prominence within the larger field of economics. But
the methodology of experimentation has been slow to garner a following in po-
litical science. Experimentation might easily dovetail with methods more estab-
lished in political science, such as formal modeling, to produce and cumulate
useful knowledge; however, political scientists typically prefer archival work,
case studies, field work, surveys, quantitative analysis, and formal modeling in-
stead. Yet these other methods need not compete with experimentation. Indeed,
the most exciting opportunity for methodological advancement using experimen-
tation lies at the intersection of formal modeling and experimental testing: For-
mal models present hypotheses that are tested, refined, and explored through
experimentation in a reciprocal manner. This process is widely and success-
fully employed within behavioral economics. As yet, however, political science
remains slow to embrace the added value offered by the methodology of
experimentation.
1094-2939/02/0615-0031$14.00 31
13 Apr 2002 14:59 AR AR158-02.tex AR158-02.SGM LaTeX2e(2001/05/10) P1: GJC
This essay addresses the use of experiments and the experimental method in
political science. Following a brief background discussion of the experimental
method, including threats to internal and external validity, relative advantages
and disadvantages, and ethics, the essay concentrates on what we have learned
of substance from experiments in behavioral economics and political science that
should be of interest to mainstream political scientists.
My overall goal is to advocate for the utility of experiments for political science.
I do not argue that experiments are the only, or the best, form of methodological
inquiry. Rather, I argue that experimentation can be particularly useful under certain
circumstances: when existing methods of inquiry have produced inconsistent or
contradictory results; when empirical validation of formal models is required;
Annu. Rev. Polit. Sci. 2002.5:31-61. Downloaded from www.annualreviews.org
dence is needed to support strong causal claims. Experiments can combine with
other methods to provide what Campbell described as a “fish scale model of
omniscience,” whereby each methodological layer serves to illuminate and support
other component parts.
EXPERIMENTAL METHODS
Experimental Design
Why do we need experiments? We need experiments because they help to reduce
the bias that can exist in less rigorous forms of observation. Experiments reduce the
impact of bias by introducing standardized procedures, measures, and analyses.
Important aspects of experimental design include standardization, randomiza-
tion, between-subjects versus within-subject design, and experimental bias.
1. Standardization remains crucial in experimentation because it ensures that
the same stimuli, procedures, responses, and variables are coded and ana-
lyzed. This reduces the likelihood that extraneous factors, of which the ex-
perimenter might not even be aware, could influence the results in decisive
behave or respond (Rosenthal 1966). Results then take the form of a self-fulfilling
by University of Illinois - Chicago on 05/17/12. For personal use only.
prophecy as experimenters create the reactions they hope to elicit with their subtle
signals and not with their controlled manipulation. Ways to overcome this bias
include having various experimenters run some subjects in all conditions, making
the experimenter blind to the subjects’ conditions, designing the experiment to
avoid experimenter involvement (as can be done with computer-generated experi-
ments), or treating the experimenter as a factor or variable in the statistical analysis
at the end of the experiment to determine if any particular experimenter elicited
distinctive responses from the subjects.
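The last of these remedies can be sketched in a few lines. The Python example below is a minimal illustration, not a procedure drawn from the studies cited here: it computes a one-way analysis of variance F statistic with the experimenter as the grouping factor. The function name, the experimenter labels, and the scores are all hypothetical.

```python
from statistics import mean

def f_statistic(groups):
    """One-way ANOVA F statistic for a dict mapping group label -> list of scores."""
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = mean(all_scores)
    k = len(groups)        # number of experimenters
    n = len(all_scores)    # total number of subjects
    # Variation of the experimenter means around the grand mean
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in groups.values())
    # Variation of individual scores around their own experimenter's mean
    ss_within = sum((x - mean(s)) ** 2 for s in groups.values() for x in s)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical dependent-variable scores, grouped by the experimenter who ran the session
by_experimenter = {
    "A": [5.1, 4.9, 5.3, 5.0],
    "B": [5.0, 5.2, 4.8, 5.1],
    "C": [6.4, 6.6, 6.1, 6.5],  # experimenter C elicits noticeably higher scores
}

f_value = f_statistic(by_experimenter)
f_without_c = f_statistic({k: v for k, v in by_experimenter.items() if k != "C"})
print(f"F with all experimenters: {f_value:.2f}; without C: {f_without_c:.2f}")
```

A large F relative to its reference distribution would suggest that at least one experimenter elicited distinctive responses. A full analysis would also enter the experimental condition as a factor, so that experimenter effects are not confused with treatment effects.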
EXPERIMENTER BIAS Experimenter bias can overlap with expectancy effects but
is in fact distinct from them theoretically. Many experimental choices originate
from the experimenter’s beliefs and attitudes, and these choices can influence the
design of an experiment in a nonrandom way. Sometimes this may be acceptable,
but concerns arise especially when an investigator remains unaware that his beliefs
have unduly affected the design of the study (Roth 1988).
Experimental Measures
Experimental measures strive for reliability as well as internal and external validity.
Reliability and validity are central concepts in all experimental measurement.
Reliability refers to the extent to which an experimenter tests the same thing
time and time again. A reliable result is one that is easily replicable. Reliability
improves when measures are standardized, when a larger number of measures have
been taken, and when factors that might bias the data are controlled in advance
(Zimbardo & Gerrig 1996).
Experimental measures can take several forms: self-reports, behavioral mea-
sures, physiological measures, and incentives.
form, can be coded into quantitative categories for later analysis. Behavioral mea-
sures require experimenters to observe the behavior of subjects by, for example,
videotaping them and later examining the tapes for characteristics such as facial
expressions or tendency to dominate in a group. Physiological measures include
such data as heart rate, galvanic skin response, blood pressure, or more extensive
tests such as magnetic resonance imaging (MRI) or positron-emission tomogra-
phy (PET) tests. More intrusive tests, such as those that analyze saliva, urine, or
blood, might also be conducted to determine hormone levels or other variables of
interest.
THREATS TO INTERNAL VALIDITY There are nine potential threats to internal va-
lidity in experimentation:
1. History refers to any event that occurs outside the experimenter’s control in
the time between the measures on the dependent variable. This phenomenon
2. Intersession history refers to events that occur inside the study itself, which
are beyond the control of the investigator and may affect the outcome of the
study. Extreme temperature fluctuation, unexpected fire drills, or unknown
preexisting relationships between some subjects might affect one session
of an experiment but not another. These confounds all threaten the internal
validity of the experiment.
3. Maturation refers to the natural needs, growth, and development of indi-
viduals over time. For example, an experiment that relies on deprivation
for motivation depends on maturation effects for thirst to occur. Maturation
processes that work independently of the investigator over the course of an
experiment can bias results as well.
4. Performance effects. Performance can change as a result of experience. Test
performance can be affected by the very act of having taken the test be-
fore. Therefore, pre- and post-tests with exactly the same questions do not
constitute identical assessments because, independent of the intervening ma-
nipulation, taking the first test may influence the answers to the second test
through the natural process of learning and experience.
5. Regression toward the mean. Since all scores represent some combination of
the real score and some random error, subjects who manifest an extreme score
are likely to move closer to the mean on the next measurement, as the random
error fluctuates. Experimenters who specifically pick subjects because they
manifest an extreme score on some dimension, like authoritarianism, are
likely to confound their results through their failure to incorporate regression
effects into their subject selection procedures.
6. Subject self-selection. Subjects who self-select into particular experiments
or conditions are likely to differ in some systematic way from those who are
randomly assigned to a condition.
7. Mortality occurs not only when subjects die but also when they are lost to
follow-up by the investigator. In political science, mortality occurs most often
in experiments that require the same subject to show up more than once, and
the subject fails to show up for the second part of the experiment. This can also
happen in longitudinal and field studies that examine dynamics over time; as
people move and change their living situations, they can become hard to trace.
Sometimes financial and other incentives can alleviate this problem. From an
ethical perspective, the most important cases of experimental mortality occur
when subjects leave in the middle of an experiment because some aspect of
the experiment has made them uncomfortable. If a study has a high degree
of interexperimental drop-out, the investigator should take pains to ascertain
that the experiment is being conducted in an appropriate and ethical manner.
8. Selection-maturation interaction occurs when subjects are placed into an
experimental condition in a nonrandom manner and some aspect of the group
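Regression toward the mean (threat 5 above) can be illustrated with a small simulation. The sketch below assumes, purely for illustration, that each observed score is a normally distributed true score plus independent normal error of the same spread; the sample size, means, and standard deviations are hypothetical. Subjects selected for extreme scores on a first test score closer to the population mean on a second test, even though nothing has been done to them.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is repeatable

N = 10_000
# Assumed model: observed score = true score + random measurement error
true_scores = [random.gauss(50, 10) for _ in range(N)]
test1 = [t + random.gauss(0, 10) for t in true_scores]
test2 = [t + random.gauss(0, 10) for t in true_scores]

# Select the subjects in the top 10% on the first test only
cutoff = sorted(test1)[int(0.9 * N)]
extreme = [i for i in range(N) if test1[i] >= cutoff]

mean_first = mean(test1[i] for i in extreme)
mean_second = mean(test2[i] for i in extreme)
print(f"selected group: first test {mean_first:.1f}, second test {mean_second:.1f}")
```

The selected group still scores above the population mean of 50 on the second test, but less extremely than on the first. A design that picks subjects for extreme scores and omits a comparison group could mistake this drift for a treatment effect.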
THREATS TO EXTERNAL VALIDITY Campbell (1968) outlines the six major threats
to external validity in experimentation:
1. Testing interaction effects. Testing can increase subjects’ sensitivity to the
variables under investigation, which makes it difficult to generalize the results
to a population that has not been pretested.
2. Unrepresentative subject population. What can college sophomores tell us
about real-world decision makers? Sears (1986) argues that college sopho-
mores differ in systematic and marked ways from other people: They are more
self-absorbed; they have less crystallized attitudes, a less clear sense of self,
higher rates of compliance, less stable peer relationships, and stronger cognitive
skills. Many experimental findings using college sophomores have nonetheless
proved remarkably robust (Roth 1988), yet many people remain concerned
about subject pools. Obviously, the best way to deal with
this problem of external validity would be to sample directly from the pop-
ulations of interest. Etheredge (1978) did this in his extensive study of 126
career foreign service officers at the State Department to examine “how emo-
tional predispositions might shape elite foreign policy thinking.” But often
this is not possible because such people are either too busy or not interested
in participating in experiments. Another strategy against this limitation in-
volves simulations with real or former decision makers using a hypothetical
or past crisis as a stimulus. Such simulations have produced very accurate
results. In one of the most powerful examples of a simulation’s prescient
prediction of a real-world outcome, the Joint War Games Agency of the Joint
Chiefs of Staff conducted a major war game simulation of the conflict in
Vietnam. The goal was to start with current resources as of July of 1965 and
simulate the likely outcome through September of 1966. The results indi-
cated that the United States would not be able to win the conflict in the long
run and was unlikely to do better than stalemate in the short run (as reported
in Burke & Greenstein 1989). Simulations that accurately mimic real-life
problems and resources can engage the same psychological processes that
operate in the real world.
3. Hawthorne effect. The third limitation on external validity involves the so-
called Hawthorne effect (Roethlisberger & Dickson 1939), whereby people
change their behavior merely because they are aware of being observed.
People who know they are in an experiment may behave differently than they
would if they were not in an experiment or were unaware of the experiment.
4. Professional subjects. On large and relatively anonymous college campuses,
a student eager to earn money can participate in many experiments across
subject has become. Overly experienced or jaded subjects may be more likely
to guess the underlying hypothesis or manipulation in an experiment if they
have participated in similar ones in the past.
5. Spurious measures. Some unexpected aspect of the experiment may induce
subjects to give systematically irrelevant responses to particular measures,
which are then understood to be experimental effects.
6. Irrelevant measures. Irrelevant aspects of the experimental condition might
produce results that appear to be experimental effects.
Advantages
The comparative advantages of experiments lie in their high degree of internal va-
lidity. No other methodology can offer the strong support for the causal inferences
that experiments allow. Correlational studies, for example, do not show causa-
tion. Since a laboratory setting allows investigators to control all aspects of the
environment so that only the independent variables differ, any differences on the
dependent variable can be attributed to the manipulation, and thus offer support
for causal inferences. Experiments offer at least five such advantages:
1. Ability to derive causal inferences. “The major advantage of laboratory
experiments is in its [sic] ability to provide us with unambiguous evidence
about causation” (Aronson & Carlsmith 1968). Because of the random-
ization of subjects and the control of the environment, experiments allow
Disadvantages
Experiments are not always the ideal methodology. Most concerns about their
disadvantages within political science revolve around questions of external validity
and how widely the findings of the laboratory apply to real-world actors and
phenomena. There are four main disadvantages to the use of experiments:
1. Artificial environment. Many experimental settings are artificially sterile and
unrepresentative of the environments in which subjects might normally per-
form the behavior under study. There are at least two important aspects of
this limitation. First, it might be impossible or unethical to create the desired
situation within a laboratory. An experimenter could not study the effects of
a life-threatening illness by causing such disease in a subject. Second, it may
be very hard to simulate many phenomena of interest—an election, a war,
an economic recession, and so on.
2. Unrepresentative subject pools. As noted above, subject pools may be un-
representative of the populations of interest.
3. External validity. For political scientists, questions surrounding external va-
lidity pose the greatest concern with experimentation. What can experiments
tell us about real-world political phenomena? Beyond the nature of the sub-
ject pool, this concern is at least twofold. First, in the laboratory it is difficult
to replicate key conditions that operate on political actors in the real world.
Subjects typically meet only for a short period and focus on a limited task.
Even when money serves as a material incentive, subject engagement may
be low. In the real world, actors have histories and shadows of the future with
each other, they interact around many complex issues over long periods, and
they have genuine strategic and material interests, goals, and incentives at
stake. Can the results of a single-session experiment tell us anything about
such a complicated world?
Second, and related, many aspects of real-world complexity are diffi-
cult to simulate in the laboratory. Cultural norms, relationships of authority,
and the multitask nature of the work itself might invalidate any results that
emerge from an experiment that does not, or cannot, fully incorporate these
there are no countervailing pressures acting on them, but quite another when
acting within the constrained organizational or bureaucratic environments
in which they work at their political jobs. Material and professional incen-
tives can easily override more natural psychological or ethical concerns that
might manifest themselves more readily in the unconstrained environment
of the laboratory. Failure to mimic or incorporate these constraints into ex-
periments, and difficulty in making these constraints realistic, might restrict
the applicability of experimental results to the real political world.
There are two important things to understand about external validity. First,
external validity is only fully established through replication. Experiments
testing the same model should be conducted on multiple populations using
multiple methods in order to determine the external validity of any given ex-
perimental paradigm. Second, external validity is more closely related to the
realism created within the experiment than to the external trappings of simi-
larity to real-world settings, which is referred to as mundane realism. As long
as the experimental situation engages the subject in an authentic way, experimental
realism has been constructed; under these circumstances, mundane
realism may be nice but is hardly required to establish causality. Moreover,
even if the experiment closely approximates real-world conditions, if its sub-
jects fail to engage in an experimentally realistic way, subsequent findings
are useless.
4. Experimenter bias. Experimenter bias, including expectancy effects and de-
mand characteristics, can limit the relevance, generalizability, or accuracy
of certain experimental results.
EXPERIMENTAL ETHICS
As a result of concerns about the ethical treatment of human subjects, the U.S.
Department of Health and Human Services imposes strict guidelines on all research
involving human subjects. Institutional review boards at major research institutions
oversee the administration of these guidelines and require advance approval on all
experiments with human subjects to ensure their ethical treatment. These boards
have the power to reject proposals that they deem to inadequately protect human
subjects from unnecessary pain and suffering. The job of these boards involves
weighing the risks and benefits of each study for the appropriate balance between
risk to the subject and benefit for science or society. The guidelines include four
important components:
1. Informed consent. Informed consent requires that the experimenter provide
every subject with a disclosure statement prior to the experiment, describing
the experimental procedures, along with expected gains and risks for the
subject. Subjects are told they can leave the experiment at any time without
penalty and are given contact information to report any concerns about their
outcome. These insights, gained from the experimental procedure, help with theory
development as well as theory testing.
Political science may have significant historical, cultural, or practical reasons
for its lack of affinity to the use of experimentation; nonetheless, the past need
not predict the future in this arena. Experimentation can achieve the same kind of
successful impact in political science that it has had in other fields such as psy-
chology and economics. We can learn from other social sciences that experiments
need not stand on their own, as in biology, in order to be effective and useful in
refining theory, providing evidence, and testing causal claims. Rather, experiments
can dovetail with other methods in order to produce a cumulation of knowledge
and an advancement in both theory and method within political science.
assume that experiments in political science need to mimic those in biology, if not
in substance, at least in process. Yet, this stringent standard is not required; we
need only note that experiments can dovetail with other methods to produce useful
knowledge in order to adequately justify their utility.
In order to illustrate this process in action, the remainder of this essay is devoted
to an explication of some of the experimental literature of relevance to mainstream
political science. Certain experimental literature in behavioral economics and so-
cial psychology is addressed as well, since much of this work is relevant to issues
of concern to political science. This article does not have the space to cover ac-
complishments in all experimental areas of interest to political science. Moreover,
it is only through a sequence of experiments exploring related topics in different
ways and on different populations that cumulative knowledge and external validity
emerge. Therefore, following a brief overview of the presence of experiments in
political science, I concentrate on systematic programs of larger research within
behavioral economics and political science that have produced results with great
relevance to major issues of interest in political science.
Overview
A comprehensive overview of experiments published by established political sci-
entists reveals a total of 105 articles between 1926 and 2000. Only about 57 of them
were published in political science journals. Many more strong articles written by
political scientists, often in collaboration with economists, on political topics have
been published in either psychology or economics journals. I examined 48 articles
that appeared in non–political science journals but were written by political scien-
tists; six or seven individuals, either alone or in collaboration, wrote the majority
of them. The list of 105 articles does not include those published in the now-
defunct Experimental Study of Politics. This journal was founded in 1971 because
many believed that their experimental work had been unfairly rejected from the
established political science journals (McConahay 1973); Experimental Study was
created expressly to redress this difficulty. However, most of the articles published
1970s and the 1980s may have been at least partly due to the demise of the
aforementioned Experimental Study of Politics journal in 1975.) Also interesting is
behavior. The second, third, and fourth most popular topics were bargaining (13),
games (10), and international relations topics (10). Other topics that attracted
experimental interest included committee work (8), experimental bias (6), race
(6), field experiments (5), media (4), leadership (4), and experiments embedded
in surveys (3). Given the high percentage of experimental articles focused on vot-
ing, the question arises whether more experimenters actually focus on voting, or
whether experiments on voting simply find more receptive journals available, and
thus are more likely to be published overall.
Several areas of experimental research are relevant to the concerns of main-
stream political scientists. Rather than examine each experiment individually, I
discuss the major relevant findings in two broad areas of experimental research.
Although the venues and contexts for investigation may differ, a great poten-
tial for substantive overlap exists between behavioral economics, psychology, and
political science, which might be exploited for greater cross-disciplinary and in-
terdisciplinary work. Current areas of research in psychology and economics offer
promising opportunities for collaboration with political scientists who share such
interests: social preferences, including investigations of norms, social networks,
altruism, status, and trust; bounded rationality, involving decision making in com-
plex environments; learning and expectation formation; attitudes toward risk; and
cognitive biases (Laibson 2000).
Behavioral economics has concentrated on six main substantive areas since
its experimental work began in the 1930s. These are (Roth 1995) (a) Prisoner’s
Dilemma and public goods issues; (b) problems of coordination and cooperation;
(c) dynamics of bargaining; (d) experimental markets; (e) auction behavior; and
(f) individual choice. Some experimental work has also been done in the area
of industrial organization, although the majority of work in this area has been
conducted by social psychologists.
PRISONER’S DILEMMA AND PUBLIC GOODS The Prisoner’s Dilemma game was
developed by Dresher and Flood at the Rand Corporation in 1950 (Flood 1952);
the story was added by Tucker (1950) later (Straffin 1980). The central conundrum
was that although cooperative play was transparently more profitable in the long
run, equilibrium choice favored defection, producing less benefit for each player.
Thus, the game produced a challenging test for equilibrium predictions. It was
ideal for experimental investigation, which could examine alternative predictions
in a controlled setting. In this environment, scholars developed a preference for
repeated-play games. Most of these experiments demonstrated that cooperation
begins early but breaks down over time, such that in multiround repeated games,
players learn to defect earlier and earlier (Selten & Stoecker 1986). In most of
these experiments, subjects know that a better outcome of mutual cooperation
exists, but fear of exploitation makes mutual defection the only stable strategy
over time. Even more interesting are recent findings suggesting that at least some
players hold values independent of the payoffs embedded in the game structure,
such as fairness, altruism, or concern with reputation building (Andreoni & Miller
1993). These findings will no doubt prompt further experimental work to test the
nature and limits of these seemingly noneconomic motivations. Political science
readers will also remember the famous Axelrod testing of Prisoner’s Dilemma
strategies in large-scale computer tournaments (Axelrod 1984). In his simulation,
a “tit for tat” strategy, in which the player begins with cooperation and makes the
same move as the opponent did the previous round, emerged as the most effective
strategy for maximizing payoffs.
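The tit-for-tat rule is simple enough to state in code. The following Python sketch plays a repeated Prisoner's Dilemma between two strategies; it is an illustration of the rule as described above, not a reconstruction of Axelrod's tournament. The payoff values, function names, and number of rounds are assumptions chosen for the example.

```python
# Payoffs as (row player, column player); "C" = cooperate, "D" = defect.
# The specific numbers are illustrative assumptions.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def tit_for_tat(opponent_history):
    """Cooperate on the first round, then copy the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]

def always_defect(opponent_history):
    """Defect on every round regardless of what the opponent has done."""
    return "D"

def play(strategy_a, strategy_b, rounds=10):
    """Return the total payoffs of two strategies over a repeated game."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(moves_b)  # each strategy sees only the other's past moves
        b = strategy_b(moves_a)
        pay_a, pay_b = PAYOFFS[(a, b)]
        score_a += pay_a
        score_b += pay_b
        moves_a.append(a)
        moves_b.append(b)
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat))    # mutual cooperation throughout
print(play(tit_for_tat, always_defect))  # exploited once, then mutual defection
```

In this sketch tit for tat never outscores the opponent it is paired with; it loses only the first round to a persistent defector and sustains cooperation with a cooperative partner, which is how a strategy of this kind can accumulate a high total across many pairings.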
Apparently the public goods problem was first presented by Swedish economist
Knut Wicksell in the nineteenth century (Roth 1995). Most experimental work on
the public goods and free rider problem has focused on the conditions under which
it might be most problematic and how its impact can be reduced in those situations.
Early work using single examples of public goods problems indicated little free
riding (Johansen 1977). Like the work in Prisoner’s Dilemma, experimental work
on public goods problems moved toward repeated plays, demonstrating that in
successive rounds, voluntary contributions decline (Isaac et al. 1985). Future work
will no doubt explore the conditions under which free riding produces the greatest
problems and examine ways in which to ameliorate its effects.
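The incentive to free ride can be made concrete with a linear public goods game of the kind commonly used in such experiments. The Python sketch below is an assumption-laden illustration rather than the design of any study cited here: the endowment, the group size of four, and the per-unit return on contributions (0.4) are hypothetical values.

```python
def payoffs(contributions, endowment=10, mpcr=0.4):
    """Payoff to each player in a linear public goods game.

    Each player keeps whatever is not contributed and receives `mpcr`
    (the assumed marginal per capita return) for every unit contributed
    by anyone in the group, including the player's own contribution.
    """
    group_return = mpcr * sum(contributions)
    return [endowment - c + group_return for c in contributions]

everyone_contributes = payoffs([10, 10, 10, 10])
nobody_contributes = payoffs([0, 0, 0, 0])
one_free_rider = payoffs([0, 10, 10, 10])  # the first player withholds everything

print(everyone_contributes, nobody_contributes, one_free_rider)
```

Under these assumed parameters the group does best when everyone contributes, yet any single player earns more by withholding, because each unit contributed returns less than one unit to the contributor. This is the tension that the declining contributions in repeated play reflect.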
ment, some of which are not adequately captured by traditional economic models.
In many cases, coordination problems prevent optimal decision making from a ra-
tional perspective. For example, a factor that should not matter from a traditional
economic perspective may exert a tremendous influence, as when the mere pres-
ence of a dominated strategy affects the equilibrium chosen (Cooper et al. 1990).
In an experiment involving only problems of coordination, with no inherent
conflict of interest, Van Huyck et al. (1990) found that in a repeated game where
outcomes are made public after each round, behavior quickly converges around the
least profitable equilibrium. Obviously, this outcome does not represent economic
rationality in the traditional sense. Crawford (1991) and others have sought to
explain these findings using learning models and other game theory models offered
by evolutionary biologists. Such models suggest that stable equilibria occur when
strategies are not subject to invasion and dominance by new strategies. Crawford
does not argue that the coordination problem itself is evolutionary in nature; rather,
learning within the game allows stable, albeit economically nonrational, equilibria
to emerge and dominate over time.
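A pure coordination problem of this kind can be sketched as a minimum-effort game, in which each player's payoff rises with the lowest effort chosen in the group and falls with the player's own effort. The Python example below is a hedged illustration: the payoff coefficients, the effort range of 1 to 7, and the simple adjustment rule (each player matches the previous round's announced minimum) are assumptions made for the example. It is not the learning model proposed by Crawford (1991).

```python
def payoff(own_effort, group_min, a=0.2, b=0.1, c=0.6):
    """Assumed payoff: gain from the group minimum, minus the cost of own effort."""
    return a * group_min - b * own_effort + c

def adjust_to_minimum(efforts, rounds=5):
    """Hypothetical adjustment rule: after the minimum is announced,
    every player matches it in the next round."""
    path = [list(efforts)]
    for _ in range(rounds):
        group_min = min(path[-1])
        path.append([group_min] * len(efforts))
    return path

# Any common effort level is an equilibrium, but payoffs differ across them.
best_equilibrium = payoff(7, 7)   # everyone chooses the highest effort
worst_equilibrium = payoff(1, 1)  # everyone chooses the lowest effort
wasted_effort = payoff(7, 1)      # high effort when someone else chose the lowest

# One cautious player is enough to pull the whole group down.
path = adjust_to_minimum([7, 6, 5, 7, 1, 4])
print(best_equilibrium, worst_equilibrium, wasted_effort, path[-1])
```

Because exceeding the group minimum is costly and earns nothing, no player has a reason to raise effort unilaterally once the minimum has fallen, so the low-payoff outcome is stable even though every player would prefer the high-effort equilibrium.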
(Roth 1995). This topic is discussed below in the political science section on co-
operation.
mation. For example, work on asset valuation (Forsythe et al. 1982) demonstrates
that prices tend to converge to a perfect equilibrium after replication. In addition,
information aggregation (Forsythe & Lundholm 1990) has been examined within
the context of experimental markets. Security markets have been examined in this
way as well (Plott & Sunder 1988). Forsythe et al. (1992) used an experimental
market whose ultimate value was tied to a future election outcome to examine how
markets aggregate information. They found that their experimental market prices
did a reasonable job of predicting the outcome of the election. On the basis of these
results, Forsythe et al. (1992) argue that market transactions reduce the impact of
biases such as political opinions on subjects’ pricing decisions.
Provocative work on cross-cultural behavior in bargaining and experimen-
tal market environments suggested that market outcomes converged to equilib-
rium and that there were no payoff differences between subjects in Jerusalem,
Ljubljana, Pittsburgh, and Tokyo (Roth et al. 1991). However, differences that
deviated from equilibrium predictions did occur everywhere in both the agree-
ment and the frequency of disagreement. The experimental procedures employed
made the experimenters confident that observed differences in bargaining behavior
did not result from differences in language or currency; rather, the investigators
tentatively attributed these discrepancies to cultural differences.
in the way people process information about probabilities and payoffs. Bids ap-
pear to be governed by payoffs, whereas choices tend to be driven by probabilities.
Although these findings remain inconsistent with expected utility and other rational
models of decision making, psychologists readily explain this result as an exam-
ple of the anchoring and adjustment heuristic (Kahneman et al. 1982), whereby
people initially latch onto a value, which can be arbitrary and irrelevant, and fail
to adequately adjust that value to present circumstances in making subsequent
judgments. The first anchor people fix on is the monetary payoff, and then they
insufficiently adjust choices to shifts in probabilities. Preference reversals result
from this inadequate adjustment from the initial monetary anchor.
The second main area of research within the individual choice literature is
judgment under uncertainty (Kahneman et al. 1982). This experimental work ex-
amines how individuals judge the frequency or likelihood of certain outcomes. This
work consistently and robustly demonstrates at least three important judgmental
heuristics that appear to control people’s assessments of frequency: anchoring and
adjustment (described above), representativeness, and availability. Under the
representativeness heuristic, individuals assess frequency based on the similarity
between the judged object or event and the categories to which it might belong.
Under the availability heuristic, people judge likelihood based on salience, i.e., the
ease with which an example can be retrieved from memory or imagined. All three
judgmental heuristics contradict central assumptions in most expected utility models,
which expect dominance, invariance, and transitivity to hold sway in judgments about
probabilities.
Decision making under risk has been most closely examined by the same psy-
chologists who conducted the seminal work on judgmental biases (Kahneman &
Tversky 2000). In attempting to develop a descriptively accurate model of choice
as an alternative to expected utility models, Kahneman & Tversky (1979; Tversky
& Kahneman 1992) delineated prospect theory. Prospect theory incorporates two
successive phases: editing and evaluation. In editing, prospects or choices are
framed for a decision maker. Robust experimental evidence indicates that trivial
aspects of framing options can consistently exert profound impacts on the substance
of choice. Specifically, seemingly trivial changes in the method, order, or form in
which options are presented to a decision maker systematically affect the content of
choice. Evaluation itself encompasses two components as well: the value function
and the weighting function. The value function has three central characteristics:
(a) outcomes are judged in relative, not absolute, terms; (b) individuals tend to be
risk-seeking in the domain of losses and risk-averse in the domain of gains; and
(c) people tend to be loss-averse in general. The role of the weighting function
is similar to, but distinct from, that of probability assessments in expected utility
models. First, people have great difficulty incorporating extremes such as impos-
sibility and certainty into their decision-making strategies. Second, individuals
tend to overweight low probabilities while simultaneously underweighting mod-
erate and high probabilities. All these empirical results surrounding the value and
weighting functions contradict the predictions of standard expected utility models.
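The value and weighting functions described above can be made concrete with a small numerical sketch. The review gives no functional forms, so the power-function value curve, the inverse-S weighting curve, and the parameter values used here (0.88, 2.25, 0.61) are assumptions borrowed from the parametric estimates reported in Tversky & Kahneman (1992); the names value and weight are illustrative only.

```python
# Illustrative sketch of the evaluation phase of prospect theory.
# ASSUMPTION: functional forms and parameters follow the parametric estimates
# in Tversky & Kahneman (1992); they are not stated in this review.

ALPHA = 0.88   # curvature of the value function (assumed)
LAMBDA = 2.25  # loss-aversion coefficient (assumed)
GAMMA = 0.61   # curvature of the weighting function for gains (assumed)


def value(outcome: float, reference: float = 0.0) -> float:
    """Subjective value of an outcome judged relative to a reference point."""
    change = outcome - reference
    if change >= 0:
        # Concave over gains, which produces risk aversion.
        return change ** ALPHA
    # Convex over losses and steeper than for gains (loss aversion).
    return -LAMBDA * ((-change) ** ALPHA)


def weight(p: float) -> float:
    """Decision weight attached to a stated probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    numerator = p ** GAMMA
    return numerator / ((numerator + (1.0 - p) ** GAMMA) ** (1.0 / GAMMA))


if __name__ == "__main__":
    for x in (100, -100):
        print(f"value({x}) = {value(x):.2f}")
    for p in (0.01, 0.5, 0.9):
        print(f"weight({p}) = {weight(p):.3f}")
```

Under these assumed parameters, a sure gain of 50 is valued above half the value of a gain of 100, a sure loss of 50 is valued below half the value of a loss of 100, and a 1% chance receives a decision weight above 0.01 while probabilities of 0.5 and 0.9 receive weights below their stated values. This matches the qualitative pattern described in the text.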
& Green (forthcoming; see footnote 1) found that personal canvassing increased voter turnout,
whereas phone calls appeared to have no impact. Direct mail appeared to have a
slight impact on voter turnout. In addition, they found that asking voters whether
they could be “counted on” to vote increased the impact of personal canvassing.
Other topics that have been investigated experimentally under the rubric of
voting and elections include candidate competition (Plott 1991), retrospective vot-
ing (McKelvey et al. 1987), political competition (Boylan et al. 1991), and voter
information costs.
Lau and Sears have used experiments to examine related topics. Their study
of the evaluation of public figures (Lau et al. 1979) concluded that the so-called
positivity bias often found in survey results is not an artifact of the measurement
process alone but rests on some real bias in assessment. Related work on political
preferences (Sears & Lau 1983) showed that apparent self-interest effects may be
artifacts of political and personal cues embedded in surveys. Finally, these authors
have experimentally explored the nature of political beliefs (Lau et al. 1991).
Political party identification has also been examined experimentally (Cowden
& McDermott 2000). We were intrigued by previous work, using different metho-
dologies, that achieved somewhat contradictory results regarding the long-term
stability of party identification. We designed an experiment that assessed student
subjects’ party identification, among other things, early in the semester. Later,
after participating in one experiment that manipulated the extremity of real candi-
dates in experimental elections, or another in which subjects role-played either the
prosecutor or defender of Clinton in the impeachment hearing, subjects filled out
a second, standard party identification measure. Our results indicated that party
identification, even in a young population that should have had less time to develop
strong associations, showed remarkable stability.
Media effects on candidate evaluation and voting have been another extremely
productive research topic. Some of the best and most imaginative experimentation
has been conducted in the area of media studies and political communication by
Iyengar and colleagues. Their creative studies have demonstrated that television
news influences how viewers weight problems and evaluate candidates (Iyengar
et al. 1982); that television news frames individuals’ explanation of events (Iyengar
1987); that negative advertising reduces voter turnout (Ansolabehere et al. 1994);
and that candidates gain the most by advertising on issues over which they can
claim “ownership” (Ansolabehere & Iyengar 1994). Iyengar continues to advance
the methodology of experimentation itself as well, with recent studies that use new
technology and field strategies to ameliorate some of the traditional criticisms of
external validity problems (Iyengar 2000). These strategies include bringing the
experiments into natural settings by creating living room environments in shopping
malls and asking subjects to watch television in those settings, with experiments
embedded in the programming. Further, Iyengar has begun to use the internet
1. Gerber A, Green D. The Effects of Canvassing, Phone Calls and Direct Mail on Voter
Turnout: A Field Experiment. Unpublished manuscript.
particular, gender differences in press coverage were more pronounced in the senate
race and for incumbents. This pattern appears to hurt female senatorial candidates.
On the other hand, sex stereotypes produce more positive evaluations of women
and appear to benefit gubernatorial candidates the most. Note that Kahn’s additional
experimental testing of her earlier findings allowed her to refine her results and
specify the conditions under which they hold. The findings of Huddy & Terkildsen (1993)
on gender stereotyping in the perception of candidates are consistent with Kahn’s. They too find that
female candidates are seen in a positive light on traits such as compassion, whereas
men are perceived to be more competent on military issues. Huddy & Terkildsen
suggest that a gender trait approach best explains the differences they find.
2. Falk A, Fischbacher U. 1998. Kindness is the parent of kindness: modeling reciprocity.
Unpublished manuscript.
experiences of previous studies and designing future studies to address past anoma-
lies or to ameliorate procedural difficulties. He finds that beliefs about others are
important and can change over time. These beliefs appear to be contingent on cues
that individuals receive over time about others. In this way, interaction develops
lasting reputations and labels. Wilson’s work suggests that theoretical models of
individual choice might be impaired by their failure to incorporate such seemingly
nonrational factors as altruism, inequality aversion, and mind reading.
have used experimentation to test their cognitive calculus model of decision mak-
ing in foreign policy (Geva & Skorick 1999, Geva et al. 2000, Geva & Skorick
2000). These authors use experimentation to test the predictions of their model
against actual behavior in a laboratory setting.
Work on coordination and cooperation remains closely tied to work in social
psychology and behavioral economics. Typically, scholars investigate this topic
using noncooperative game theory (Palfrey 1991). Experimentalists seek to provide
data related to certain models and push those models further by presenting evidence
that might either refute or extend the current theoretical claims. Specific results
indicate that communication increases group cooperation. Ostrom and colleagues
(e.g., Ostrom & Walker 1991) have demonstrated that face-to-face communication,
particularly in repeated-play settings involving common pool resources, exerts a
powerful impact on propensity for cooperation.
Palfrey and colleagues have undertaken a systematic program of experimental
research on topics related to coordination and cooperation. In one experiment,
discounted repeated play proved more effective in generating cooperation than a
single-shot trial in a public goods game with incomplete information; however,
results depended on the ability to monitor others and on the specific environmental
conditions (Palfrey & Rosenthal 1994). Palfrey and colleagues have concentrated
on the centipede game, in which two players alternately have a chance to take a
larger portion of a continually escalating amount of money (McKelvey & Palfrey
1992, Fey et al. 1996). Once one person takes the money, the game ends. Ac-
cording to game theory predictions under assumptions of complete information,
the first player should take the larger pile in the first round of play. However,
this does not happen in reality. Rather, subjects operating under conditions of un-
certainty and incomplete information about the payoff appear willing to consider
the small possibility that they are playing against an altruistic opponent. Although
the probability increases over time that a player will take the pile of money, the game
typically continues into subsequent rounds. Palfrey has also investigated choice
in other games (McKelvey & Palfrey 1995). This work shows great richness in
its ability to combine formal modeling with experimental testing of such models.
The combination of methods allows greater confidence in results that point in the
same direction.
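The game-theoretic prediction for the centipede game described above follows from backward induction, which can be sketched as follows. The payoff schedule (piles of 0.40 and 0.10 that double after every pass, with four decision nodes) and the function name solve_centipede are illustrative assumptions, not details reported in this review.

```python
# Backward induction in a simple centipede game.
# ASSUMPTION: two piles (0.40 and 0.10) double after every pass, and the
# player who takes receives the larger pile. These numbers are hypothetical.

def solve_centipede(n_moves=4, large=0.40, small=0.10, growth=2.0):
    """Return (plan, payoffs): the action chosen at each node, first to last,
    and the payoffs (player 0, player 1) under complete information."""

    def take_payoffs(node):
        # Players alternate; the mover at this node gets the larger pile.
        mover = node % 2
        payoffs = [0.0, 0.0]
        payoffs[mover] = large * growth ** node
        payoffs[1 - mover] = small * growth ** node
        return tuple(payoffs)

    # If everyone passes, the piles grow once more and the game ends as if
    # the player whose turn would come next had taken.
    continuation = take_payoffs(n_moves)
    plan = []
    for node in reversed(range(n_moves)):
        mover = node % 2
        take = take_payoffs(node)
        if take[mover] >= continuation[mover]:
            plan.append("take")
            continuation = take
        else:
            plan.append("pass")
    plan.reverse()
    return plan, continuation


if __name__ == "__main__":
    plan, payoffs = solve_centipede()
    print(plan, payoffs)
```

With these assumed payoffs, each mover does better by taking than by passing to an opponent who will take at the next node, so the induction unravels to the first node. The experimental finding that play typically continues past the first round is a departure from this benchmark.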
Experimental work by Miller and colleagues has explored a variety of topics,
including committees (Miller & Oppenheimer 1982). In work on games, Eavey &
Miller (1984a) demonstrate that when universalist options, which offer “something
for everyone,” exist in legislatures, concerns about fairness go beyond what expected
value calculations would predict. Further, Miller & Oppenheimer (1982)
find that competitive, minimum winning coalitions form only
when universalist options are unavailable. In work on bargaining, Eavey & Miller
(1984b) show that a bureaucratic monopoly on agenda setting allows bargaining
with a voting body without necessarily imposing the agenda setter’s preferences
on all. They conclude that bureaucratic agenda control in legislative bodies sup-
ports a bargaining model over an imposition one. Although some of this work
(Miller & Oppenheimer 1982, Palfrey & Rosenthal 1994) points out the discrep-
ancies between rational choice theory and the behavior of individuals in the real
world, experiments are used not only to test and critique existing formal models
but also to discover anomalies and challenges that are then incorporated into the
next generation of model development.
Bolton (1991) has used experimentation to investigate how actual bargaining
behavior differs from game theoretic predictions. Bolton & Zwick (1995) demon-
strate that the opportunity to punish an opponent who treats you unfairly presents
a more accurate explanation for deviations from perfect equilibrium solutions than
the existence of anonymity for the subject. Note that although experimental find-
ings may be at odds with some predictions of formal theory, the overall relationship
between game theoretic modeling and experimentation in these exercises is col-
laborative; experiments empirically test formal models and suggest discrepancies
as well as validations, and then formal modelers can attempt to incorporate these
empirical demonstrations into later, more sophisticated models.
In our work on topics related to international relations, we investigate the
impact of factors such as sex, uncertainty, and framing effects on arms races
and aggression. In one experiment involving three rounds of a simulated crisis
(McDermott & Cowden, forthcoming), we find that although uncertainty exerts
no systematic effect on weapons procurement or likelihood of war, men are sig-
nificantly more likely to purchase weapons and engage in aggressive action than
women. In another experiment involving a simulated crisis game (McDermott
et al. 2002), we examine the impact of framing in terms of striving for superiority
or parity with the opponent, two kinds of uncertainty, and the tone of messages
on weapons procurement. We find that embracing the frame of striving for su-
periority does indeed increase weapons procurement on the part of subjects. The
tone of the message exerts a tremendous impact as well; recipients of hostile
messages are much more likely to procure weapons than recipients of friendly
messages. As in our other work, uncertainty appears to have no effect on weapons
procurement. Finally, in more recent work, as yet unanalyzed, we manipulated
the incentive to go to war to further examine the impact of sex differences on
CONCLUSIONS
3. McDermott R. Experimental methodology in political science. Submitted.
some of the work in political science, the intersection of formal modeling and
experimental testing is highly productive. Experiments can be, and have been,
effectively used to test formal models, demonstrate unpredicted anomalies in out-
comes that then provoke more sophisticated models, and suggest extensions and
limitations of existing models under particular conditions.
In addition, experiments provide effective methodological help in examining
areas in which other methodologies have produced inconsistent or contradictory
findings, as was the case in our work on party identification. Experiments also
offer clear advantages over other methods in particular areas of investigation, such
as the validation of theories developed by formal modeling, or in further theory
testing and refinement. Experiments offer useful insights in work that investigates
ACKNOWLEDGMENTS
I would like to thank Jonathan Cowden, Margaret Levi, one anonymous reviewer,
the participants in the CBRSS Experimental Methods Conference at Harvard, and
especially Sidney Tarrow for enormous help in writing this review.
LITERATURE CITED
Andreoni J, Miller J. 1993. Rational cooperation in the finitely repeated Prisoner’s Dilemma: experimental evidence. Econ. J. 103:570–85
Ansolabehere S, Iyengar S. 1994. Riding the wave and claiming ownership over issues: the joint effects of advertising and news coverage in campaigns. Public Opin. Q. 58:335–57
Ansolabehere S, Iyengar S, Simon A, Valentino N. 1994. Does attack advertising demobilize the electorate? Am. Polit. Sci. Rev. 88:829–38
Aronson E, Carlsmith M. 1968. Experimentation in social psychology. In The Handbook of Social Psychology, ed. G Lindzey, E Aronson, Vol. 2. Reading, MA: Addison-Wesley. Rev. ed.
Axelrod R. 1984. The Evolution of Cooperation. New York: Basic Books
Baumrind D. 1985. Research using intentional deception: ethical issues revisited. Am. Psychol. 40:165–74
Bazerman M, Samuelson W. 1983. I won the auction but don’t want the prize. J. Confl. Resolut. 27:618–34
Bolton G. 1991. A comparative model of bargaining: theory and evidence. Am. Econ. Rev. 81:1096–36
Bolton G, Ockenfels A. 2000. Measuring motivations for the reciprocal responses observed in a simple dilemma game. Am. Econ. Rev. 90:166–93
Bolton G, Zwick R. 1995. Anonymity versus punishment in ultimatum bargaining. Games Econ. Behav. 10:95–121
Boylan R, Ledyard J, Lupia A, McKelvey R, Ordeshook P. 1991. Political competition in a model of economic growth: an experimental study. In Laboratory Research in Political Economy, ed. T Palfrey, pp. 33–68. Ann Arbor: Univ. Mich. Press
Burke J, Greenstein F. 1989. How Presidents Test Reality: Decisions on Vietnam, 1954 and 1965. New York: Russell Sage Fdn.
Camerer C. 1997. Progress in behavioral game theory. J. Econ. Persp. 11:167–88
Campbell DT. 1968. Quasi-experimental design. In International Encyclopedia of the Social Sciences, ed. DL Sills, Vol. 5. New York: Macmillan
Campbell DT, Ross HL. 1968. The Connecticut crackdown on speeding: time-series data in quasi-experimental analysis. Law Soc. Rev. 3:33–53
Campbell DT, Stanley JC. 1963. Experimental and quasi-experimental designs for research on teaching. In Handbook of Research on Teaching, ed. NL Gage. Chicago: Rand McNally
Chamberlain R. 1948. An experimental imperfect market. J. Polit. Econ. 56(2):95–108
Cooper R, DeJong D, Forsythe R, Ross T. 1990. Selection criteria in coordination games: some experimental results. Am. Econ. Rev. 80:218–33
Cowden J, McDermott R. 2000. Short term forces and partisanship. Polit. Behav. 22:197–222
Crawford V. 1991. An “evolutionary” interpretation of Van Huyck, Battalio and Beil’s experimental results on coordination. Games Econ. Behav. 3:25–59
Eavey C, Miller G. 1984a. Fairness in majority rule games with a core. Am. J. Polit. Sci. 28:570–86
Eavey C, Miller G. 1984b. Bureaucratic agenda control: imposition or bargaining? Am. Polit. Sci. Rev. 78:719–33
Eckel C, Grossman P. 1996. Altruism in anonymous dictator games. Games Econ. Behav. 16:181–91
Etheredge L. 1978. A World of Men: The Private Sources of American Foreign Policy. Cambridge, MA: MIT Press
Fehr E, Schmidt K. 1999. A theory of fairness, competition and cooperation. Q. J. Econ. 114:817–68
Felsenthal D, Rapoport A, Maoz A. 1988. Tacit cooperation in three alternative noncooperative voting games: a new model of sophisticated behavior under the plurality procedure. Elect. Stud. 7:143–61
Fey M, McKelvey R, Palfrey T. 1996. An experimental study of a constant-sum centipede game. Int. J. Game Theory 25:269–87
Fiorina M, Plott C. 1978. Committee decisions under majority rule: an experimental study. Am. Polit. Sci. Rev. 72:575–98
Flood M. 1952. Some experimental games. Res. Memo. RM-789, RAND Corp., June
Forsythe R, Horowitz J, Savin N, Sefton M. 1994. Fairness in simple bargaining games. Games Econ. Behav. 6:347–69
Forsythe R, Lundholm R. 1990. Information aggregation in an experimental market. Econometrica 58:309–48
Forsythe R, Nelson F, Neumann G, Wright J. 1992. Anatomy of an experimental political stock market. Am. Econ. Rev. 82:1142–61
Forsythe R, Palfrey T, Plott C. 1982. Asset valuation in an experimental market. Econometrica 50:537–67
Frank R. 1988. Passions Within Reason: The Strategic Role of Emotions. New York: Norton
Geva N, Mayhar J, Skorick JM. 2000. The cognitive calculus of foreign policy decision making: an experimental assessment. J. Confl. Resolut. 44:447–71
Geva N, Skorick JM. 1999. Information inconsistency and the cognitive algebra of foreign policy decision making. Int. Interact. 25:333–62
Geva N, Skorick JM. 2000. Process and outcome consequences of simultaneous foreign policy decisions. Presented at Annu. Meet. Int. Soc. Polit. Psychol., July 4–8, Seattle, WA
Gosnell H. 1926. An experiment in the stimulation of voting. Am. Polit. Sci. Rev. 20:869–74
Grether D, Plott C. 1984. The effects of market practices in oligopolistic markets: an experimental examination of the ethyl case. Econ. Inq. 22:479–507
Guarnaschelli S, McKelvey R, Palfrey T. 2000. An experimental study of jury decision rules. Presented at Exp. Methods Conf., May 11–12, Cambridge, MA
Hong J, Plott C. 1982. Rate filing policies for inland water transportation: an experimental
An experimental examination of sex stereotypes and press patterns in statewide campaigns. Am. J. Polit. Sci. 38:162–95
Kahneman D, Slovic P, Tversky A, eds. 1982. Judgment Under Uncertainty: Heuristics and Biases. Cambridge, UK: Cambridge Univ. Press
Kahneman D, Tversky A. 1979. Prospect theory: an analysis of decision under risk. Econometrica 47:263–91
Kahneman D, Tversky A. 2000. Choices, Values and Frames. New York: Cambridge Univ. Press/Russell Sage Fdn.
Korn J. 1987. Judgments of acceptability of de-
CONTENTS
BARGAINING THEORY AND INTERNATIONAL CONFLICT, Robert Powell 1
EXPERIMENTAL METHODS IN POLITICAL SCIENCE, Rose McDermott 31