Handbooks of Sociology and Social Research
Stephen L. Morgan, Editor
Handbook of Causal Analysis for Social Research
Preface
In the spring of 2010, Howard Kaplan invited me to compile a volume on sociological methodology for
the Springer series, Handbooks of Sociology and Social Research. I proposed causal analysis for the
focus of a new volume because (1) causal explanation is a common goal of social research, (2) the
nature and practice of causal analysis has been a topic of methodological debate for decades, and (3)
the literature on causality has moved quickly in the last 20 years to a point where a volume-length
assessment by a diverse collection of scholars would be of considerable value to readers in sociology
and in the social sciences more broadly.
After selecting causal analysis as the focus of the volume, I recruited contributors with established
track records of publishing sophisticated and readable methodological scholarship, most of whom
held appointments in sociology departments and/or were trained as sociologists. Contributors were
encouraged to recruit graduate student coauthors in order to expand the community of scholars in
sociology who write on methodological topics.
As a target audience, I asked contributors to write for advanced graduate students and faculty
researchers in sociology. I also recommended that contributors include conceptual and empirical
examples from sociology and from the allied social sciences whenever appropriate. To maximize
accessibility, I asked contributors to develop chapters with mathematical details and demands that
would be only as difficult as they needed to be, in recognition of the fact that too much methodological
scholarship already uses more mathematics than is necessary for the purposes at hand. As an objective
standard, I asked for chapters that required mathematical preparation that is no more advanced than is
necessary to read the typical articles published in Sociological Methodology and Sociological Methods
and Research.
I also made it clear to contributors that my goal, as Editor, was not to push for the adoption of
any particular model of causality, including the counterfactualist perspective on quantitative causal
analysis of which I am most enamored. However, I did note that, because of the shape of the recent
literature, I hoped that all chapters would engage some of the counterfactuals literature to some extent.
I indicated that such engagement could be critical and/or brief, as appropriate, and that I was inviting
a collection of scholars who I expected would collectively disagree on the ultimate value of the
potential outcome version of the counterfactual model. As readers of the complete Handbook will
discern, I succeeded in generating a diversity of positions on this issue.
I thank the contributors to the volume for their uniformly strong dedication to their chapters. We
hope that this Handbook will strengthen the conclusions typical of social research, providing a wide
range of researchers with methodological guidance that can help them to (a) select and utilize methods
of estimation and inference appropriately and (b) determine when causal conclusions are warranted,
based on the particular standards in the subfields in which they work. If this Handbook succeeds in
promoting these goals, then all of the credit is due to the talent and skill of the contributors to the
volume.
Ithaca, NY
Stephen L. Morgan
Contents
1 Introduction
Stephen L. Morgan
Part I Background and Approaches to Analysis
2 A History of Causal Analysis in the Social Sciences
Sondra N. Barringer, Scott R. Eliason, and Erin Leahey
3 Types of Causes
Jeremy Freese and J. Alex Kevern
Part II Design and Modeling Choices
4 Research Design: Toward a Realistic Role for Causal Analysis
Herbert L. Smith
5 Causal Models and Counterfactuals
James Mahoney, Gary Goertz, and Charles C. Ragin
6 Mixed Methods and Causal Analysis
David J. Harding and Kristin S. Seefeldt
Part III Beyond Conventional Regression Models
7 Fixed Effects, Random Effects, and Hybrid Models for Causal Analysis
Glenn Firebaugh, Cody Warner, and Michael Massoglia
8 Heteroscedastic Regression Models for the Systematic Analysis of Residual Variances
Hui Zheng, Yang Yang, and Kenneth C. Land
9 Group Differences in Generalized Linear Models
Tim F. Liao
10 Counterfactual Causal Analysis and Nonlinear Probability Models
Richard Breen and Kristian Bernt Karlson
11 Causal Effect Heterogeneity
Jennie E. Brand and Juli Simon Thomas
12 New Perspectives on Causal Mediation Analysis
Xiaolu Wang and Michael E. Sobel
Part IV Systems of Causal Relationships
13 Graphical Causal Models
Felix Elwert
14 The Causal Implications of Mechanistic Thinking: Identification Using Directed Acyclic Graphs (DAGs)
Carly R. Knight and Christopher Winship
15 Eight Myths About Causality and Structural Equation Models
Kenneth A. Bollen and Judea Pearl
Part V Influence and Interference
16 Heterogeneous Agents, Social Interactions, and Causal Inference
Guanglei Hong and Stephen W. Raudenbush
17 Social Networks and Causal Inference
Tyler J. VanderWeele and Weihua An
Part VI Retreat from Effect Identification
18 Partial Identification and Sensitivity Analysis
Markus Gangl
19 What You Can Learn from Wrong Causal Models
Richard A. Berk, Lawrence Brown, Edward George, Emil Pitkin, Mikhail Traskin, Kai Zhang, and Linda Zhao
Chapter 1
Introduction
Stephen L. Morgan
In disciplines such as sociology, the meaning and interpretations of key terms are debated with
great passion. From foundational concepts (e.g., class and structure) to more recent ones (e.g.,
globalization and social capital), alternative definitions grow organically from exchanges between
competing researchers who inherit and then strive to strengthen the conceptual apparatus of the
discipline. For the methodology of social inquiry, similar levels of contestation are less common,
presumably because there is less scope for dispute over matters that many regard as mere technique.1
The terms causality and causal are the clear exceptions. Here, the debates are heated and expansive,
engaging the fundamentals of theory (What constitutes a causal explanation, and must an explanation
be causal?), matters of research design (What warrants a causal inference, as opposed to a descriptive
regularity?), and domains of substance (Is a causal effect present or not, and which causal effect is
most important?). In contrast to many conceptual squabbles, these debates traverse all of the social
sciences, extending into most fields in which empirical relations of any form are analyzed. The present
volume joins these debates with a collection of chapters from leading scholars.
Summary of Contents
Part I offers two chapters of overview material on causal inference, weighted toward the forms of
causal analysis practiced in sociology. In Chap. 2, “A History of Causal Analysis in the Social
Sciences,” Sondra Barringer, Scott Eliason, and Erin Leahey provide an illuminating examination
of 12 decades of writing on causal analysis in sociology, beginning with Albion Small’s 1898
guidance published in the American Journal of Sociology. The chapter introduces readers to the
main variants of causal modeling that are currently in use in the social sciences, revealing their
connections to foundational writings from the nineteenth century and forecasting advances in their
likely development.
1 Then again, some methodological terms have shifting definitions that are not embraced by all, whether they are design
concepts (e.g., mixed methods and natural experiment), measurement concepts (e.g., reliability and validity), or features
of models (e.g., error term, fixed effect, and structural equation).
S.L. Morgan
Department of Sociology, Cornell University, Uris Hall 358, Ithaca, NY 14853, USA
e-mail: slm45@cornell.edu
S.L. Morgan (ed.), Handbook of Causal Analysis for Social Research,
Handbooks of Sociology and Social Research, DOI 10.1007/978-94-007-6094-3_1,
© Springer Science+Business Media Dordrecht 2013
In Chap. 3, “Types of Causes,” Jeremy Freese and J. Alex Kevern lay out the variety of causal
effects of concern to social scientists and some of the types of causal mechanisms that are posited
to generate them. Beginning with arrow salad, and followed by discussions of proximity, necessity,
and sufficiency, the chapter provides examples of causal effects that the social science literature has
labeled actual, basic, component, fundamental, precipitating, and surface. The chapter also draws
some of the connections to the literature in epidemiology and health-related social science, where
important methodological and substantive work has enriched the literature on causality (and in ways
still too infrequently appreciated by researchers working in the core social sciences).
Part II offers three chapters that assess some of the major issues in the design of social research. In
Chap. 4, “Research Design: Toward a Realistic Role for Causal Analysis,” Herbert Smith begins with
the principled guidelines for causal analysis supplied by the influential statisticians David Freedman,
Paul Holland, and Leslie Kish, which he then discusses alongside the design advice offered by social
scientists from the 1950s onward. Filled with examples from demography and the social sciences
more broadly, the chapter argues that many of the excesses of recent efforts to establish causality
should be replaced by more sober attempts to understand the full range of data available on outcomes
of interest.
In Chap. 5, “Causal Models and Counterfactuals,” James Mahoney, Gary Goertz, and Charles
Ragin argue for the supremacy of set-theoretic models of causal processes for small-N and case-
oriented social science. Contrary to the forecast offered by Barringer, Eliason, and Leahey in Chap. 2,
it seems rather unlikely that future innovations in set-theoretic approaches to causal analysis proposed
by Mahoney, Goertz, and Ragin will emerge from embracing probabilistic or potential outcome
models of counterfactuals. Practitioners of small-N research will find much in this chapter that will
help them bridge the communication divide that exists with large-N researchers who deploy alternative
methodologies. Large-N researchers will benefit from the same.
In Chap. 6, “Mixed Methods and Causal Analysis,” David Harding and Kristin Seefeldt explain
how using qualitative methods alongside quantitative methods can enhance the depth of research on
causal questions of importance. Stressing the value of qualitative methods for enhancing models of
selection processes, mechanisms, and heterogeneity, they develop their argument by detailing concrete
examples of success, often from the latest research on poverty, stratification, and urban inequality.
For Part III, six chapters present some of the important extensions to conventional regression-
based approaches to data analysis that may aid in the analysis of causal effects. In Chap. 7, “Fixed
Effects, Random Effects, and Hybrid Models for Causal Analysis,” Glenn Firebaugh, Cody Warner,
and Michael Massoglia explain the value of fixed effects models, and several variants of them, for
strengthening the warrants of desired causal conclusions. In Chap. 8, “Heteroscedastic Regression
Models for the Systematic Analysis of Residual Variances,” Hui Zheng, Yang Yang, and Kenneth Land
explain how variance-component models can deepen the analysis of within-group heterogeneity
for descriptive and causal contrasts. Both chapters offer empirical examples from stratification and
demography, which demonstrate how to estimate and interpret the relevant model parameters.
In Chap. 9, “Group Differences in Generalized Linear Models,” Tim Liao steps back to the
full generalized linear model and demonstrates the variety of group difference models that can be
deployed for outcomes of different types, paying particular attention to distributional assumptions
and the statistical tests that can rule out differences produced by chance variability. In Chap. 10,
“Counterfactual Causal Analysis and Nonlinear Probability Models,” Richard Breen and Kristian
Karlson then offer an extended analysis of the class of these models that are appropriate for binary
outcomes. Together these two chapters demonstrate how the generalized linear model can be put to use
to prosecute causal questions, and yet they also show how the parametric restrictions of particular
models can represent constraints on inference and subsequent explanation.
In Chap. 11, “Causal Effect Heterogeneity,” Jennie Brand and Juli Simon Thomas consider how
regression, from a potential outcome perspective, can offer misleading representations of causal effects
that vary across individuals. Taking this theme further, in Chap. 12, “New Perspectives on Causal
Mediation Analysis,” Xiaolu Wang and Michael Sobel show how models that assume variability of
individual-level causal effects, and permit general forms of nonlinearity across distributions of effects,
are incompatible with claims that regression techniques can identify and effectively estimate separate
“direct” and “indirect” effects. Together, these two chapters demonstrate that analysis can proceed
under reasonable assumptions that causal effects are not constant and additive, but the standard tool kit
offered in generic linear modeling textbooks will fail to deliver meaningful estimates. Both chapters
offer alternative solutions that are effective and less onerous than some researchers may assume.
For Part IV, three chapters cover most of the central issues in the identification of systems of
causal relationships, all united by their attention to how modern graphical models can be used to
represent them. In Chap. 13, “Graphical Causal Models,” Felix Elwert provides a careful introduction
to the burgeoning literature on causal graphs, fully explaining the utility of directed acyclic graphs
for considering whether or not causal effects are identified with the data available to an analyst.
With incisive examples from demography and health research, the chapter demonstrates when and
why common conditioning strategies impede a causal analysis as well as how identification strategies
for time-varying treatments can be developed.
In Chap. 14, “The Causal Implications of Mechanistic Thinking: Identification Using Directed
Acyclic Graphs (DAGs),” Carly Knight and Christopher Winship enrich the recent literature on causal
mechanisms in the social sciences, which is all too often cited while also being misunderstood. The
chapter clarifies the importance and promise of the empirical search for the mechanisms that generate
effects and demonstrates how mechanisms can be represented with causal graphs, all while remaining
grounded in the most prominent and convincing treatments of mechanisms from the philosophy of
science literature. The chapter also demonstrates how causal effects that remain unidentified by all
other methods may still be identified by the specification and observation of a mechanism, under
assumptions that may be no more restrictive than those commonly invoked for other models routinely
employed by others.
In Chap. 15, “Eight Myths About Causality and Structural Equation Models,” Kenneth Bollen and
Judea Pearl team up to dispel what they see as considerable misunderstanding in the literature on
the power and utility of structural equation models. Bridging their prior work, they return to the
origins of structural modeling, trace it through the modern literature on causal graphs, and provide a
convincing case that the best days for structural equation modeling are still in the future. The chapter
demonstrates both the depth of the literature before modern causal graph methodology was developed
and the contribution of the latter in clarifying adjustment criteria, mediation methodology, and the
role of conditional independence assumptions in effect identification. Here, as in other places in the
volume, the reader will find healthy disagreement with other chapters of the volume (most notably
with Chap. 12, which takes an alternative position on contributions to the mediation literature and the
value of causal graphs more generally).
For Part V, two chapters consider the emergent literature on models of influence and interference.
In Chap. 16, “Heterogeneous Agents, Social Interactions, and Causal Inference,” Guanglei Hong and
Stephen Raudenbush demonstrate how traditional assumptions of no-unit-level interference of causal
effects can be relaxed and why such relaxation may be essential to promote consistency between the
estimated model and the true processes unfolding in the observed world. The chapter demonstrates
that such modeling is possible and that it can greatly improve conclusions of research (and with
manageable additional demands on the analyst).
In Chap. 17, “Social Networks and Causal Inference,” Tyler VanderWeele and Weihua An consider
the other side of the noninterference coin: social influence that travels across network connections
that have been established, in most cases, prior to the introduction of a treatment or exposure to a
cause. Considering both the recent experimental literature and (controversial) attempts to identify
network effects with observational data, the chapter discusses the extent to which data can reveal
social influence effects that propagate through networks (and, additionally, the effects of interventions
on social networks, including those on an ego’s ties and those on the deeper structural features of
complete networks). No reader will fail to appreciate how difficult such effect identification can be
(nor, after some independent reflection, how naïve many explanatory claims from the new network
science clearly are).
For Part VI, two final chapters consider how empirical analysis that seeks to offer causal knowledge
can be undertaken, even though the identification of specific effects is not possible. In Chap. 18,
“Partial Identification and Sensitivity Analysis,” Markus Gangl explains the two most prominent
strategies to determine how much information is contained in data that cannot point-identify causal
effects. Sensitivity analysis considers how large a violation of a false maintained assumption would
have to be in order to invalidate a conclusion that rests on a claim of statistical significance.
Partial identification analysis considers how much can be said about an effect with certainty while
maintaining the strongest assumptions one can assert that all critics will agree are beyond reproach
(which, in reality, will therefore be weak assumptions). More researchers should use these techniques
than do, and this chapter shows them how.
Finally, in Chap. 19, “What You Can Learn from Wrong Causal Models,” Richard Berk and six
of his colleagues take empirical inquiry one step further. If one knows that a simple parametrically
constrained regression model will not deliver a warranted point estimate of some form of an average
causal effect, then why step away only from the causal interpretation? One should step away entirely
from the clearly incorrect model and its entailed parametric constraints and instead allow the data to
reveal more of the full complexity that nature must have constructed. The challenge is to represent
such complexity in ways that can still be summarized crisply by a model, and the chapter shows that
the most recent developments in nonparametric and semi-parametric statistics are more powerful and
practical than many researchers in the social sciences are aware. The chapter is justified by the claim,
echoed by other chapters in the volume (especially Chap. 4), that one does not need to estimate causal
effects in order to learn something about them.
Contribution
For a volume on causality, it seems especially appropriate to ask: What effects will this one have on
research practice? It is reasonable to hope that the considerable work that was required to produce it
will generate positive effects of some form.
Forecasting these effects requires that one first consider the challenges and realities of today’s
social science research. As relatively recent entrants into the academy, social scientists aspire to
produce knowledge of the highest utility that can elucidate the processes on which journalists, politicians,
and others opine. Yet, it would be surprising to all if such successes were easy to come by or if the
goals of social scientists were to settle by fiat the conundrums that eminently talented thinkers could
not lay to rest before the modern social sciences were established. Accordingly, nearly all domains
of substantive research in the social sciences are rife with everyday causal controversies. Verified
causal explanations to some scholars are spurious associations to others. Deep and compelling causal
accounts to some scholars are shallow surface narratives to others.
Why are causal controversies in the social sciences so persistent? It would appear that the answer
to this question is found in the confluence of substantive domains that are largely observational
with the freedom that academic researchers have from real-world demands for action. The former
prompts researchers to ask questions for which no infallible and easy-to-implement designs exist,
and the latter, when paired with the former, has bred fields of social science that lack inquiry-ending
standards. Consider some counterexamples, where observational inquiry is productively paired with
such standards. In the law, decisions must be rendered, either by judges or by juries, and so the
concepts of “cause-in-fact” and “legal cause” have been developed to bring cases to a close. In
medical practice, a treatment must begin, which requires that a diagnosis for the relevant malady first
be adopted. The diagnosis, this nonphysician perhaps mistakenly assumes, amounts to asserting the
existence of responsible causes in sufficient detail to pick from amongst the most effective available
treatments. In academic social science, what brings our causal controversies to conclusion in the
absence of shared routines for doing so? Too often, little more than fatigue and fashion.
I would not claim that any of the questions raised long ago by Hume, Mill, Peirce, and others have
been resolved by the contents of this volume. However, I am optimistic that this volume, when read
alongside other recent writing on causality, will move us closer to a threshold that we may soon cross.
On the other side, most researchers will understand when causal conclusions are warranted, when
off-the-shelf methods do not warrant them, and when causal questions cannot be answered with the
data that are available. We will then be able to evolve inquiry-ending standards, sustained by new
systems that promote the rapid diffusion of research findings. If we can cross this threshold, some
of the unproductive contestation that now prevails will subside, and manifestly incorrect results will
receive less attention. Fewer causal conclusions will be published, but those that are will be believed.