Elliott 2017
Abstract. Although selecting a probability sample has been the standard for decades when making inferences from a sample to a finite population, incentives are increasing to use nonprobability samples. In a world of “big data”, large amounts of data are available that are faster and easier to collect than are probability samples. Design-based inference, in which the distribution for inference is generated by the random mechanism used by the sampler, cannot be used for nonprobability samples. One alternative is quasi-randomization in which pseudo-inclusion probabilities are estimated based on covariates available for sample and nonsample units. Another is superpopulation modeling for the analytic variables collected on the sample units in which the model is used to predict values for the nonsample units. We discuss the pros and cons of each approach.
Key words and phrases: Coverage error, hierarchical regression, quasi-randomization, reference sample, selection bias, superpopulation model.
250 M. R. ELLIOTT AND R. VALLIANT
respondents would actually vote, question wording and framing, deliberate misreporting, and volatility in voters’ opinions about candidates. The samples for the 2015 British polls were online or telephone polls that could not be considered probability samples of all registered voters. Demographic population totals for characteristics like age, sex, region, social grade and working status were used to set quota sample and weighting targets. After evaluating eight putative explanations, Sturgis et al. (2016) concluded that the British polls were wrong because of their unrepresentative samples. The statistical adjustment procedures that were used did not correct this basic problem.
On the other hand, selecting a probability sample does not guarantee that the cooperating units will provide a good basis for inference to a population. In many types of surveys response rates have declined dramatically, casting doubt on how well these samples represent the population. Pew Research reported that their response rates (RRs) in typical telephone surveys dropped from 36% in 1997 to 9% in 2012 (Kohut et al., 2012). With such low response rates, a sample initially selected randomly can hardly be called a probability sample from the desired population. Low RRs raise the question of whether probability sampling is a viable methodology for general population surveys without expensive face-to-face data collection methods, which usually have higher response.
For some purposes, convenience samples or other types of nonprobability samples have long been acceptable. For example, using convenience samples in experimental studies is standard practice, even when the conclusions are intended to apply to some larger population. The inferences are model-based and come from assuming that the experimental effects are homogeneous among all units in the relevant population. Models are also used for inference in observational studies where, in contrast to designed experiments, assignments of interventions or treatments are not controlled by an experimenter. However, the lack of randomization in those studies may threaten their validity (Madigan et al., 2014). Inferences from nonprobability samples must also rely on models, rather than the distribution generated by random sampling, to project a sample to a larger finite population.
Obtaining data without exercising much control over the set of units for which it is collected is often cheaper and quicker than probability sampling where efforts are made to use a frame that covers most or all of the population, and units are randomly selected from the frame. Repeatedly attempting to get nonrespondents to cooperate, which is standard procedure in probability samples, can be expensive and time-consuming. Eliminating nonresponse followup is also an expedient way of cutting costs. In telephone-only surveys, no amount of nonresponse followup is likely to boost response to the rates that were considered minimally acceptable 10 to 15 years ago. For these reasons, nonprobability sampling is currently staging a kind of renascence (e.g., see Berzofsky, Williams and Biemer, 2009, Dever and Valliant, 2014).
There are also other data sources that are currently receiving attention and might be considered for finite population estimation (Couper, 2013). Social media and other data that can be scraped from the web might be used for gauging public opinion (Murphy et al., 2015) or measuring changes in consumer prices (Cavallo and Rigobon, 2016). Although the inferential issues raised subsequently apply to these “big data,” we mainly concern ourselves with nonprobability samples that were directly collected for the purposes of making finite population estimates.
1.1 Types of Nonprobability Samples
There are a number of types of nonprobability samples that are summarized briefly below. Regardless of type, there is quite a bit of controversy about the use of nonprobability surveys for making inferences. Section 2 describes the potential problems with nonprobability samples that can bias inferences. However, these concerns are not limited to finite population inference. Keiding and Louis (2016) is a recent discussion of problems with self-selected entry to epidemiological studies and surveys. Stuart et al. (2011) consider the use of propensity scores to generalize results from randomized trials to populations. Kaizar (2015) reviews approaches that have been proposed for combining randomized and nonrandomized studies in the estimation of treatment efficacy. O’Muircheartaigh and Hedges (2014) describe the use of stratified propensity scores for analyzing a nonrandomized social experiment.
For finite population sampling, the American Association for Public Opinion Research (AAPOR) has issued two task force reports on the use of nonprobability samples, neither of which favored their use. Baker et al. (2010) studied the use of online Internet panels; Baker et al. (2013a, 2013b) cover nonprobability sampling generally. Baker et al. (2010) recommended on several grounds that researchers not use online panels if the objective is to accurately estimate population values. Among other reasons, they noted that (i)
NONPROBABILITY SAMPLES 251
some comparative studies showed that nonprobability samples were less accurate than probability samples; (ii) the demographic composition of different panels can affect estimates; and (iii) not all panel vendors fully disclose their methods. Baker et al. (2013a) took a more nuanced view that inferences to a population from nonprobability samples can be valid but that the modeling assumptions needed are difficult to check.
Nonprobability surveys capture participants through various methods. The AAPOR task force on nonprobability sampling (Baker et al., 2013a) grouped these samples into three broad types:
1. Convenience sampling.
2. Sample matching.
3. Network sampling.
Baker et al. (2013a) describe these in some detail; we briefly summarize them here. Convenience sampling is a form of nonprobability sampling in which easily locating and recruiting participants is the primary consideration. No formal sample design is used. Some types of convenience samples are mall intercepts, volunteer samples, river samples, observational studies and snowball samples. In a mall intercept sample, interviewers try to recruit shoppers to take part in some study. Usually, neither the malls nor the people are probability samples.
Volunteer samples are common in social science, medicine and market research. Volunteers may participate in a single study or become part of a panel whose members may be recruited for different studies over the course of time. A recent development is the opt-in web panel in which volunteers are recruited when they visit particular web sites (Schonlau and Couper, 2017). After becoming part of a panel, the members may participate in many different surveys, often for some type of incentive. River samples are a version of opt-in web sampling in which volunteers are recruited at a number of websites. Some thought may be given to the set of websites used for recruitment with an eye toward obtaining a cross-section of demographic groups.
In sample matching, the members of a nonprobability sample are selected to match a set of important population characteristics. For example, a sample of persons may be constructed so that its distribution by age, race-ethnicity and sex closely matches the distribution of the inference population. Quota sampling is an example of sample matching. The matching is intended to reduce selection biases as long as the covariates that predict survey responses can be used in matching. Rubin (1979) presents the theory for matching in observational studies. A variation of matching in survey sampling is to match the units in a nonprobability sample with those in a probability sample. Each unit in the nonprobability sample is then assigned the weight of its match in the probability sample. Rivers (2007) describes this type of sample matching in the context of web survey panels. Other techniques developed by Rosenbaum and Rubin (1983) and others for analyzing observational data have also been applied when attempting to develop weights for some volunteer samples.
In network sampling, members of some target population (usually a rare one like intravenous drug users or men who have sex with men) are asked to identify other members of the population with whom they are somehow connected. Members of the population that are identified in this way are then asked to join the sample. This method of recruitment may proceed for several rounds. Snowball sampling (also called chain sampling, chain-referral sampling or referral sampling) is an example of network sampling in which existing study subjects recruit additional subjects from among their acquaintances. These samples typically do not represent any well-defined target population, although they are a way to accumulate a sizeable collection of units from a rare population.
Sirken (1970) is one of the earliest examples of network or multiplicity sampling in which the network that respondents report about is clearly defined (e.g., members of a person’s extended family). Properly done, a multiplicity sample is a probability sample because a person’s network of recruits is well-defined. Heckathorn (1997) proposed an extension to this called respondent driven sampling (RDS) in which persons would report how many people they knew in a rare population and recruit other members of the rare population. RDS has been used in many applications. For example, Frost et al. (2006) used RDS to locate intravenous drug users; Schonlau, Weidmer and Kapteyn (2014) used it in an attempt to recruit an internet panel. If some restrictive assumptions on how the recruiting is done are satisfied, probabilities of being included in a sample can be computed and used for inferences to a full rare population, but these assumptions can easily be violated (e.g., see Gile and Handcock, 2010). Because the network applications are extremely specialized, we will not address them further.
1.2 General Framework for Inference
Smith (1983) discusses the general problem of making inferences from nonrandom samples. His formulation is to consider the joint density of the population
vector of an analysis variable, $\mathbf{Y} = (Y_1, Y_2, \ldots, Y_N)$ and the population vector of 0–1 indicator variables, $\boldsymbol{\delta}_s = (\delta_1, \delta_2, \ldots, \delta_N)$ for a sample $s$. The presentations of Rubin (1976) and Little (1982) on selection mechanisms and survey nonresponse are closely related. Suppose that $\mathbf{X}$ is an $N \times p$ matrix of covariates that can be used in designing a sample or in constructing estimators. The conditional density of $\mathbf{Y}$ given $\mathbf{X}$ and a parameter vector $\Theta$ is $f(\mathbf{Y} \mid \mathbf{X}; \Theta)$. The density of $\boldsymbol{\delta}_s$ given $\mathbf{Y}$, $\mathbf{X}$, and another unknown parameter $\Phi$ is $f(\boldsymbol{\delta}_s \mid \mathbf{Y}, \mathbf{X}; \Phi)$. The joint model for $\mathbf{Y}$ and $\boldsymbol{\delta}_s$ is
$$f(\mathbf{Y}, \boldsymbol{\delta}_s \mid \mathbf{X}; \Theta, \Phi) = f(\mathbf{Y} \mid \mathbf{X}; \Theta)\, f(\boldsymbol{\delta}_s \mid \mathbf{Y}, \mathbf{X}; \Phi). \tag{1}$$
Note that this allows the possibility that being in the sample depends on $\mathbf{Y}$, that is, that the selection mechanism is not missing at random (NMAR). In a probability sample (without nonresponse or other missingness that is out of control of the sampler), $f(\boldsymbol{\delta}_s \mid \mathbf{Y}, \mathbf{X}; \Phi) = f(\boldsymbol{\delta}_s \mid \mathbf{X})$. The density $f(\boldsymbol{\delta}_s \mid \mathbf{X})$ is the randomization distribution and is the basis for design-based inference. However, in a nonprobability sample, the distribution of $\boldsymbol{\delta}_s$ can depend on both $\mathbf{Y}$ and an unknown parameter $\Phi$. Depending on the application, inference can be based on either $f(\mathbf{Y} \mid \mathbf{X}; \Theta)$ or $f(\boldsymbol{\delta}_s \mid \mathbf{Y}, \mathbf{X}; \Phi)$ or on a combination of both.
We term the two general approaches to making inferences from nonprobability samples quasi-randomization and superpopulation. Quasi-randomization is described in Section 3 and requires modeling $f(\boldsymbol{\delta}_s \mid \mathbf{Y}, \mathbf{X}; \Phi)$. Ideally, the probability of being in the sample is not NMAR and a model can be found for $f(\boldsymbol{\delta}_s \mid \mathbf{X}; \Phi)$. The superpopulation approach is covered in Section 4 and involves modeling $f(\mathbf{Y} \mid \mathbf{X}; \Theta)$. Both of these approaches involve models, but the approaches are fundamentally different. In the quasi-randomization approach the probability of a unit’s being included in the sample is modeled. In the superpopulation approach, the analytic variables (y’s) collected in the sample are modeled. Deville (1991) also covers these approaches in the context of quota sampling.
Descriptive statistics, like means and totals, and analytic statistics, like model parameters, are common estimands in finite population estimation. Detailed discussion of the latter is given in Lumley and Scott (2017). Finite population totals are the simplest target to discuss. A total of some quantity $Y$ can be written as the sum of the values over the set of sample units, $s$, and the sum over the nonsample units $\bar{s}$:
$$t_U = \sum_{i \in s} y_i + \sum_{i \in \bar{s}} Y_i \equiv t_s + t_{\bar{s}}.$$
Since the sample values are observed, we use lower case $y$ for them; upper case is used for the unobserved, nonsample values. In this simple case, the nonsample sum, $t_{\bar{s}}$, is often estimated (or predicted) by a weighted sum of the sample observations, that is, $\hat{t}_{\bar{s}} = \sum_{i \in s} w_i y_i$ where $w_i$ is a weight that may be dependent on the units in the sample. [Alternative ways of calculating weights in probability samples are discussed in Haziza and Beaumont (2017)]. Typically, the estimator can also be written as $\hat{t}_{\bar{s}} = \sum_{i \in \bar{s}} \hat{y}_i$ where $\hat{y}_i$ is a prediction for nonsample unit $i$. Thus, for totals the estimation problem is one of prediction.
Estimation of model parameters often requires solving a set of estimating equations for the parameter estimates. The estimating equations can be linear in the parameters, as for linear regression, or nonlinear, as for generalized linear models. In design-based finite population estimation, the estimating equations include survey weights and are estimators of types of finite population totals (Binder and Roberts, 2009). If weights are constructed for a nonprobability sample that are appropriate for estimating totals, then those weights can also be used in the estimating equations. Consequently, weight construction for nonprobability samples can play the same role in estimation as in probability sampling.
Baker et al. (2013a) discuss the methods that have been proposed for weighting nonprobability samples. Such samples lack many of the features that guide weighting in probability samples. A nonprobability sample is not selected randomly from an explicit sampling frame. Consequently, selection probabilities cannot be computed, and the usual method of computing base weights (inverses of selection probabilities) does not apply. Weights can, however, be computed using the quasi-randomization or superpopulation approaches noted above.
2. POTENTIAL PROBLEMS WITH NONPROBABILITY SAMPLES
Since nonprobability samples are often obtained in a poorly controlled or uncontrolled way, they can be subject to a number of biases when the goal is inference to a specific finite population. Several issues are listed here in the context of voluntary Internet panels, but other types of nonprobability samples can suffer from similar problems.
Selection bias occurs if the seen part of the population (the sample) differs from the unseen (the nonsample) in such a way that the sample cannot be projected
TABLE 1. Percentages of US households with Internet subscriptions; 2013 American Community Survey
due to questionnaire design, mode and peculiarities of respondents. For example, the persons who participate in panels tend to have higher education levels. The motivation for participating may be a sense of altruism for some but may be just to collect an incentive for others. Participants are often paid per survey completed. Some respondents speed through surveys, answering as quickly as possible to collect the incentive. This is a form of “satisficing” where respondents do just enough to get the job done (Simon, 1956). On the other hand, self-administered online surveys do tend to elicit more reports of socially undesirable behaviors, like drug use, than do face-to-face surveys. Higher reports are usually taken to be more nearly correct. But it may be that the people taking those surveys just behave undesirably more often than the general population.
Baker et al. (2010, page 739) list 19 studies where the same questionnaire was administered by interviewers to probability samples and online to nonprobability samples. As they noted, “Only one of these studies yielded consistently equivalent findings across methods, and many found differences in the distributions of answers to both demographic and substantive questions. Further, these differences generally were not substantially reduced by weighting.”
Despite all of these actual and potential problems, online panels are now widely used. For example, the Washington Post newspaper and the company SurveyMonkey have recently mounted a nonprobability, online poll of over 75,000 registered voters that covers all 50 states in the US (Clement, 2016). Baker et al. (2010) quote the market research newsletter Inside Research as estimating the total spent on online research in 2009 at about $2 billion USD, the vast majority of which is supported by online panels.
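To make the selection bias described at the start of Section 2 concrete, the following is a minimal simulation sketch, not a method from the paper. It assumes a hypothetical population in which y is normally distributed and in which units with larger y are five times as likely to volunteer, so that inclusion depends on y itself (the NMAR case allowed by the joint model in (1)). The function name, population size and inclusion probabilities are all illustrative assumptions.

```python
import random


def simulate_selection_bias(n_pop=20000, seed=1):
    """Compare the population mean with the mean of a self-selected sample.

    Hypothetical setup (an assumption for illustration only):
    y ~ Normal(50, 10); a unit volunteers with probability 0.10
    if y > 50 and with probability 0.02 otherwise.
    """
    rng = random.Random(seed)
    y = [rng.gauss(50.0, 10.0) for _ in range(n_pop)]

    # Inclusion depends directly on y, so the sample is not ignorable.
    sample = [yi for yi in y if rng.random() < (0.10 if yi > 50.0 else 0.02)]

    pop_mean = sum(y) / n_pop
    sample_mean = sum(sample) / len(sample)
    return pop_mean, sample_mean, len(sample)


pop_mean, sample_mean, n = simulate_selection_bias()
print(f"population mean: {pop_mean:.2f}")
print(f"unweighted sample mean: {sample_mean:.2f} (n = {n})")
```

Under these assumptions the unweighted sample mean overstates the population mean, and projecting it to a total would carry the same bias. A larger sample drawn by the same mechanism would not reduce it.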
found in a probability, reference sample, this would be individual-level matching. The matches would be found based on covariates available in each dataset. This may be done based on individual covariate values or on propensity scores as described in Rosenbaum and Rubin (1983). This is an example of predictive mean matching in which an imputation of an inclusion probability is made for each nonprobability unit.
Matching at the aggregate level consists of making the frequency distribution of the nonprobability sample the same as that of the population. Quota sampling is an example of this. For example, the age × race distribution of the sample might be controlled to be the same as that in the population. If we start with a large panel of volunteers, a subsample might be selected to achieve this kind of distributional balance. Each person would receive the same weight, which is the same way that a proportionally allocated probability sample would be treated. Considered in this way, quota sampling falls into the quasi-randomization framework.
A probability sample used as a reference survey or in sample matching ideally must not be subject to coverage or other types of bias. As noted in Section 1, many probability samples are now subject to high nonresponse rates and are tantamount to nonprobability samples themselves. Poor quality reference or matching samples can lead to biased estimators of the inclusion probabilities in (2) and, consequently, biased estimators from the nonprobability sample. This is an argument for using large, well-controlled samples conducted by central governments for reference or matching samples if at all possible. For example, in a household survey in the US, the American Community Survey (https://www.census.gov/programs-surveys/acs/) would be a good choice.
3.1 Estimation Using Pseudo-Weights
This approach assumes that the nonprobability sample actually does have a probability sampling mechanism, albeit one with probabilities that have to be estimated under identifying assumptions. The goal is to estimate this unknown probability of selection relying on a true probability sample or a census with common variables that explain the unknown sampling mechanism (Elliott, 2009, Elliott et al., 2010). Let $S_i$ denote the sampling indicator for the probability sample, $S_i^*$ denote the indicator for the nonprobability sample, and $x_i$ be the set of common covariates available to both samples that are assumed to fully govern the sampling mechanism for both. Applying Bayes rule, we have (Elliott and Davis, 2005):
$$
\begin{aligned}
P(S_i^* = 1 \mid x_i = x_o) &= \frac{P(x_i = x_o \mid S_i^* = 1)\, P(S_i^* = 1)}{P(x_i = x_o)} \\
&= \frac{P(x_i = x_o \mid S_i^* = 1)\, P(S_i^* = 1)\, P(S_i = 1 \mid x_i = x_o)}{P(S_i = 1)\, P(x_i = x_o \mid S_i = 1)} \\
&\propto \frac{P(x_i = x_o \mid S_i^* = 1)\, P(S_i = 1 \mid x_i = x_o)}{P(x_i = x_o \mid S_i = 1)},
\end{aligned}
\tag{3}
$$
where $P(S_i = 1)/P(S_i^* = 1)$ can be treated as a normalizing constant.
Estimating $P(x_i = x_o \mid S_i^* = 1)$ and $P(x_i = x_o \mid S_i = 1)$ can be difficult for a general joint distribution of covariates $x$, but extensions of discriminant analysis (without making a normality assumption) provide a way around this problem. Combine the probability and nonprobability samples and let $Z_i = 1$ for nonprobability cases (i.e., $S_i^* = 1$, $S_i = 0$) and $Z_i = 0$ for the probability cases (i.e., $S_i^* = 0$, $S_i = 1$) conditional on being in the combined probability-nonprobability sample (i.e., $S_i^* + S_i = 1$). Then
$$
\begin{aligned}
\frac{P(x_i = x_o \mid Z_i = 1)}{P(x_i = x_o \mid Z_i = 0)} &= \frac{P(Z_i = 1 \mid x_i = x_o)\, P(x_i = x_o)/P(Z_i = 1)}{P(Z_i = 0 \mid x_i = x_o)\, P(x_i = x_o)/P(Z_i = 0)} \\
&\propto \frac{P(Z_i = 1 \mid x_i = x_o)}{P(Z_i = 0 \mid x_i = x_o)}.
\end{aligned}
\tag{4}
$$
As long as sampling fractions are small, $P(S_i = 1, S_i^* = 0) \approx P(S_i = 1)$ and $P(S_i = 0, S_i^* = 1) \approx P(S_i^* = 1)$, so $P(x_i \mid Z_i = 0) = P(x_i \mid S_i = 1, S_i^* = 0) \approx P(x_i \mid S_i = 1)$ and $P(x_i \mid Z_i = 1) = P(x_i \mid S_i = 0, S_i^* = 1) \approx P(x_i \mid S_i^* = 1)$. Thus,
$$P(S_i^* = 1 \mid x_i = x_o) \propto P(S_i = 1 \mid x_i = x_o) \cdot \frac{P(Z_i = 1 \mid x_i = x_o)}{P(Z_i = 0 \mid x_i = x_o)}.$$
The resulting “pseudo-weight” is given by
$$w_i = 1/\hat{P}(S_i^* = 1 \mid x_i = x_o) \propto \frac{1}{\hat{P}(S_i = 1 \mid x_i = x_o)} \cdot \frac{\hat{P}(Z_i = 0 \mid x_i = x_o)}{\hat{P}(Z_i = 1 \mid x_i = x_o)}. \tag{5}$$
If the covariates $x_i$ that are available in both the nonprobability and probability sample match those used to design the probabilities of selection/inclusion in the
probability sample, (5) can be written as
$$w_i = 1/\hat{P}(S_i^* = 1 \mid x_i = x_o) \propto \tilde{w}_i\, \frac{\hat{P}(Z_i = 0 \mid x_i = x_o)}{\hat{P}(Z_i = 1 \mid x_i = x_o)}, \tag{6}$$
where $\tilde{w}_i$ is the inverse of the probability of selection for the nonprobability unit in the probability sampling frame. Otherwise, in the more likely setting where $x_i$ does not correspond precisely to the probability sample design variables, $\hat{P}(S_i = 1 \mid x_i = x_o)$ can be estimated by regressing $\tilde{w}_i^{-1}$ on $x_i$ via beta regression (Ferrari and Cribari-Neto, 2004) in the probability sample, and predicting $P(S_i = 1 \mid x_i = x_o)$ for the nonprobability sample elements.
The term $\hat{P}(Z_i = z \mid x_i = x_o)$ can be obtained via logistic regression, or, to reduce model misspecification if $x_i$ is of high dimensionality, via least absolute shrinkage and selection operator (LASSO) (Tibshirani, 1996, LeBlanc and Tibshirani, 1998), Bayesian additive regression trees (BART) (Chipman, George and McCulloch, 2010), or super learner algorithms that combine estimators from numerous model fitting methods (Van der Laan, Polley and Hubbard, 2007). In some settings, the nonprobability sample will represent only a portion of the population; for example, in a setting with a binary outcome $Y$ (e.g., injured/uninjured) only positive outcomes $Y = 1$ (e.g., injuries) might be represented in the nonprobability dataset; in this case (5) is updated as
$$w_i = 1/\hat{P}(S_i^* = 1 \mid x_i = x_o) \propto \frac{1}{\hat{P}(S_i = 1 \mid x_i = x_o, Y_i = 1)} \cdot \frac{\hat{P}(Z_i = 0 \mid x_i = x_o)}{\hat{P}(Z_i = 1 \mid x_i = x_o)}. \tag{7}$$
An alternative to estimating the probability of unit $i$’s being in the nonprobability sample is used by some panel vendors. The probability (reference) and nonprobability samples are combined, but a logistic regression is run to estimate $P(S_i^* = 1 \mid x_i = x_o)$, not conditioned on being in the combined probability and nonprobability sample (e.g., see Valliant and Dever, 2011). This is done by assigning a weight of 1 to the nonprobability cases, the probability sampling weight to the probability cases, and running a weighted logistic regression. The model predictions, thus, refer to the unconditional probability, $P(S_i^* = 1 \mid x_i = x_o)$, not the probability conditional on being in the combined sample. Whether this method is better or worse than (5) has not been studied, although, as noted above, (5) can be adapted to cases where the nonprobability sample represents only a portion of the population.
If analysis of the nonprobability sample only is required, the pseudo-weight construction is complete. If the nonprobability and probability samples are to be combined, the nonprobability sample pseudo-weights and probability sample weights are normalized so that the weighted fraction of the nonprobability sample is equal to the unweighted fraction of the nonprobability sample cases in the combined dataset, and similarly the weighted fraction of the probability sample is equal to the unweighted fraction of the probability sample cases in the combined dataset (Korn and Graubard, 1999, pages 278–284). This ensures that the sum of the combined weights continues to approximate the population size, and that each sample will contribute in proportion to its unweighted sample size. This is accomplished by setting $\hat{w}_i = C_{S^*} \times w_i$ with
$$C_{S^*} = \frac{n_{S^*}}{n_S + n_{S^*}} \times \frac{\sum_i I(Z_i = 0)\, \tilde{w}_i}{\sum_i I(Z_i = 1)\, w_i}$$
for the nonprobability sample cases and $\hat{w}_i = C_S \times \tilde{w}_i$ with $C_S = n_S/(n_S + n_{S^*})$ for the probability sample cases.
To obtain inference, the pseudo-weights or the normalized pseudo-weights and probability sample weights in the combined dataset can be used to obtain weighted point estimates. For variance estimation, a bootstrap or jackknife estimator should be used to incorporate sampling variability both in the estimation of the pseudo-weights and in the estimation of the main quantity of interest. In the absence of true design information in the nonprobability sample, resampling at the subject level for the bootstrap or leave-one-out computation of the pseudo-estimate for the jackknife can be applied. However, some thought must be given to the structure of the convenience sample. For example, the websites used to recruit a volunteer web panel might properly be considered as clusters if different types of persons visit the different sites (Brick, 2015). For the probability sample, resampling clusters within strata and use of the Rao–Wu bootstrap (Rao and Wu, 1988, Rao, Wu and Yue, 1992) to accommodate weights can be used. For the jackknife, clusters within strata should be dropped, with the standard weighting up by the number of clusters divided by the number of clusters retained to maintain the stratum size. For each bootstrap or jackknife iteration, the pseudo-weights should be recomputed as well as the point estimator using the dropped-out or resampled data.
4. SUPERPOPULATION MODEL APPROACH
In the superpopulation modeling approach, a statistical model is fitted for a Y analysis variable from the
sample and used to project the sample to the full population. That is, inferences are based on $f(\mathbf{Y} \mid \mathbf{X}; \Theta)$. This approach could, of course, also be used with a probability sample. The difference here is that design-based inference, where the randomization distribution is under the control of the sampler, is not an option for a nonprobability sample. As noted in Smith (1983), the sample selection mechanism can be ignored for model-based inferences about the distribution of $\mathbf{Y}$ if
$$f(\boldsymbol{\delta}_s \mid \mathbf{Y}, \mathbf{X}; \Phi) = f(\boldsymbol{\delta}_s \mid \mathbf{X}; \Phi), \tag{8}$$
which would be the formal justification for using only $f(\mathbf{Y} \mid \mathbf{X}; \Theta)$. There are purposive, nonprobability samples that satisfy (8). For example, selecting the $n$ units with the largest $x$ values as is done by US Energy Information Administration (2016), or sampling balanced on population moments of covariates (Royall, 1970, 1971) are ignorable, nonprobability plans. However, in nonprobability samples where the selection of sample units is not well-controlled, (8) may not hold and the quasi-randomization and superpopulation approaches could be combined.
Note that $\mathbf{Y}$ can be partitioned between the sample and nonsample units as $\mathbf{Y} = (\mathbf{Y}_s, \mathbf{Y}_{\bar{s}})$. Thus, $f(\mathbf{Y} \mid \mathbf{X}; \Theta) = f(\mathbf{Y}_s \mid \mathbf{Y}_{\bar{s}}, \mathbf{X}; \Theta)\, f(\mathbf{Y}_{\bar{s}} \mid \mathbf{X}; \Theta)$. If $f(\mathbf{Y}_s \mid \mathbf{Y}_{\bar{s}}, \mathbf{X}; \Theta) = f(\mathbf{Y}_s \mid \mathbf{X}; \Theta)$, then $\mathbf{Y}_s$ and $\mathbf{Y}_{\bar{s}}$ are independent conditional on the covariates, $\mathbf{X}$. If model-based inferences are desired for $\Theta$, these can be done based only on $f(\mathbf{Y}_s \mid \mathbf{X}; \Theta)$. However, if descriptive inferences are required for the full population $\mathbf{Y}$, then $f(\mathbf{Y}_{\bar{s}} \mid \mathbf{X}; \Theta)$ must be estimated. If this model has the same form as $f(\mathbf{Y}_s \mid \mathbf{X}; \Theta)$, then the model fitted from the sample can be used to predict values for the nonsample. If this is not the case, inference to the full population may be difficult or impossible.
To introduce the superpopulation approach, consider the simple case of estimating a finite population total. The general idea in model-based estimation when estimating a total is to sum the responses for the sample cases and add to them the sum of predictions for nonsample cases. The key to forming unbiased estimates is that the variables to be analyzed for the sample and nonsample follow a common model and that this model can be discovered by analyzing the sample responses. When both the sample and nonsample units follow the same model, model parameters can be estimated from the sample and used to make predictions for the nonsample cases. An appropriate model usually includes covariates, as in $f(\mathbf{Y}_s \mid \mathbf{X}; \Theta)$ above, which are known for each individual sample case. The covariates may or may not be known for individual nonsample cases. For some common estimation methods like poststratification, only population totals of the covariates are required to construct the estimator, so that individual nonsample $X$ values are unnecessary. Suppose that the mean of a variable $y_i$ follows a linear model:
$$E_M(y_i \mid x_i) = x_i^T \beta,$$
where the subscript $M$ means that the expectation is with respect to the model, $x_i$ is a vector of $p$ covariates for unit $i$ and $\beta$ is a parameter vector. Given a sample $s$, an estimator of the slope parameter is $\hat{\beta} = A_s^{-1} X_s^T y_s$ where $A_s = X_s^T X_s$, $X_s$ is the $n \times p$ matrix of covariates for the sample units, and $y_s$ is the $n$-vector of sample y’s. (Weighted least squares might also be used if there were evidence of nonhomogeneous model variances.) A prediction of the value of a unit in the set of nonsample units, denoted by $r$, is $\hat{y}_i = x_i^T \hat{\beta}$. A predictor of the population total is
$$\hat{t}_1 = \sum_{i \in s} y_i + \sum_{i \in \bar{s}} \hat{y}_i = \sum_{i \in s} y_i + (t_{Ux} - t_{sx})^T \hat{\beta}, \tag{9}$$
where $t_{Ux}$ is the total of the x’s in the population and $t_{sx}$ is the sample sum of the x’s. This estimator is also equal to the general regression estimator (GREG) of Särndal, Swensson and Wretman (1992) if the inverse selection probabilities in that estimator are all set to 1. The theory for this prediction approach is extensively covered in Valliant, Dorfman and Royall (2000). If the sample is a small fraction of the population, as would be the case for most volunteer web surveys, the prediction estimator is approximately the same as predicting the value for every unit in the population and adding the predictions:
$$\hat{t}_2 = \sum_{i \in U} \hat{y}_i = t_{Ux}^T \hat{\beta}. \tag{10}$$
The population mean of $y$ can be estimated by $\hat{\bar{Y}} = \bar{X}_U^T \hat{\beta}$ where $\bar{X}_U = t_{Ux}/N$, the population vector of covariate means.
The estimators in (9) or (10) are quite flexible in what covariates can be included. For example, we might predict the amount that people have saved for retirement based on their occupation, years of education, marital status, age, number of children they have and region of the country in which they live. Constructing the estimator would require that census counts be available for each of those covariates. Another possibility is
to use estimates from some other larger or more accurate survey (e.g., Dever and Valliant,
2010, 2016). The reference surveys mentioned earlier could be a source of estimated control
totals in which webographic covariates might be used.

Both (9) and (10) can be written so that they are weighted sums of y's. If (9) is used, the
weight for unit i is $w_{1i} = 1 + t_{rx}^T A_s^{-1} x_i$ where $t_{rx} = t_{Ux} - t_{sx}$. In (10),
the weight is $w_{2i} = t_{Ux}^T A_s^{-1} x_i$. The estimated total for an analysis variable can
be written as $\hat{t} = \sum_s w_i y_i$ where $w_i$ is either $w_{1i}$ or $w_{2i}$. Notice that
these weights depend only on the x's, not on y. As a result, the same set of weights could be
used for all estimates. It is true that a single set of weights will not be equally
efficient for every y, but this situation is also true for design-based weights.

In the superpopulation (y-model) approach, statistical properties, like bias and variance,
are computed conditional on the set of sample units that is observed. This contrasts with
the quasi-randomization approach where the pseudo design-based calculations average over
the random appearance in the sample of units that have the same configuration of covariates
observed in the sample. A quasi-randomization estimator that only uses inverse estimated
inclusion probabilities as weights will be biased under a y-model where $E_M(y \mid x)$
depends on covariates. Consequently, the y-model approach to constructing estimators can
produce more precise estimators than the quasi-randomization approach alone. Chen (2015)
gives some numerical illustrations of this approach applied to a nonprobability sample.

4.1 Variance Estimation for Prediction Estimators

For the frequentist methods, estimating the variance of an estimator is the usual step
toward making inferences about population values. There are several choices for variance
estimators when model-based weighting is used. These are described in Valliant, Dorfman and
Royall (2000, Chapter 5). To fully define the model, we need to add a variance
specification. The ones we summarize here are appropriate for models in which units are
mutually independent. Although model-based estimators have been extended to cases where
units are correlated within clusters (Valliant, Dorfman and Royall, 2000, Chapter 9), these
clustered structures are often unnecessary for the web surveys and similar cases that we
cover here. Suppose that the full model is

\[
\begin{aligned}
E_M(y_i \mid x_i) &= x_i^T \beta, \\
V_M(y_i \mid x_i) &= v_i,
\end{aligned}
\tag{11}
\]

where $v_i$ is a variance parameter that does not have to be specifically defined. The
variance estimators below will work regardless of the form of $v_i$ (as long as it is
finite).

For use below, define $a_i$ to be $w_i - 1$ where $w_i$ is either $w_{1i}$ or $w_{2i}$. The
variance estimators below then apply for either of the $w_{1i}$ or $w_{2i}$ weights. The
prediction variance of an estimator of a total, $\hat{t}$, is defined as

\[
V_M(\hat{t} - t_U) = \sum_{i \in s} a_i^2 v_i + \sum_{i \in r} v_i. \tag{12}
\]

The population total of y, $t_U$, is subtracted on the left-hand side because the sum is
random under the model. As long as the fraction of the population that is sampled is very
small, the second term on the right-hand side above is inconsequential compared to the
first. The variance estimators are built from the model residuals,
$r_i = y_i - x_i^T \hat{\beta}$. An estimator of the dominant, first term is

\[
\sum_{s} a_i^2 \hat{v}_i, \tag{13}
\]

where $\hat{v}_i$ can be any of three choices: (i) $r_i^2$, (ii) $r_i^2/(1 - h_{ii})$, or
(iii) $[r_i/(1 - h_{ii})]^2$ where $h_{ii}$ is the leverage for unit i, defined as the diagonal
element of the hat matrix $H = X_s^T A_s^{-1} X_s$. As the sample size increases and if no x
is extreme, each leverage will converge to zero.

The estimators of the first term are robust in the sense that they are approximately
model-unbiased regardless of the form of $v_i$ (which is unknown) as long as the sampling
fraction is small. The first choice, $\hat{v}_i = r_i^2$, when used in (13), gives an example
of a sandwich estimator. The second choice adjusts for the fact that $r_i^2$ is slightly
biased for $v_i$. The third choice is very similar to the jackknife in which one sample unit
at a time is deleted, a new estimate of the total computed, and the variance among those
delete-one estimates is used. Since the second term in (12) is usually negligible compared
to the first, misspecifying its form is likely to be unimportant. Valliant, Dorfman and
Royall (2000) provide some options for estimating that term.

The bootstrap is another replication estimator that should be equally robust, although, to
our knowledge, finite population, model-based theory has not been worked out for the
bootstrap. The bootstrap should also be consistent for estimating the variance of estimated
quantiles, unlike the jackknife. If the population totals for some of the covariates are
estimated from an independent survey, then the variance in (12) should be modified by
adding a term to reflect that additional uncertainty (e.g., see Dever and Valliant, 2010,
2016).
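
As an illustration only, the weights $w_{1i}$ and $w_{2i}$ and the three choices of $\hat{v}_i$ in (13) can be sketched in a few lines of Python with NumPy. The sketch assumes that $X_s$ is stored as an n × p array whose rows are $x_i^T$, that $A_s = X_s^T X_s$, and that $\hat{\beta}$ is the ordinary least squares estimate, so that the leverage is computed as $h_{ii} = x_i^T A_s^{-1} x_i$. The function names, the toy population and the selection mechanism are hypothetical and are not taken from the paper.

```python
import numpy as np

def prediction_weights(X_s, t_Ux, kind="w2"):
    """Model-based weights that depend only on the x's.

    kind="w1": w_1i = 1 + t_rx^T A_s^{-1} x_i, with t_rx = t_Ux - t_sx
    kind="w2": w_2i = t_Ux^T A_s^{-1} x_i
    """
    A_s = X_s.T @ X_s  # assumption: A_s = X_s^T X_s
    if kind == "w1":
        t_rx = t_Ux - X_s.sum(axis=0)
        return 1.0 + X_s @ np.linalg.solve(A_s, t_rx)
    if kind == "w2":
        return X_s @ np.linalg.solve(A_s, t_Ux)
    raise ValueError("kind must be 'w1' or 'w2'")

def robust_variance(X_s, y_s, w, choice="i"):
    """Estimate the dominant term of (12) as sum_s a_i^2 * v_hat_i, as in (13)."""
    A_s = X_s.T @ X_s
    beta_hat = np.linalg.solve(A_s, X_s.T @ y_s)  # assumption: OLS fit
    r = y_s - X_s @ beta_hat                      # model residuals r_i
    # Leverages h_ii = x_i^T A_s^{-1} x_i (diagonal of the hat matrix).
    h = np.einsum("ij,ji->i", X_s, np.linalg.solve(A_s, X_s.T))
    v_hat = {
        "i": r**2,                     # sandwich-type choice
        "ii": r**2 / (1.0 - h),        # adjusts for the bias of r_i^2
        "iii": (r / (1.0 - h)) ** 2,   # jackknife-like choice
    }[choice]
    a = w - 1.0
    return float(np.sum(a**2 * v_hat))

# Toy population (hypothetical): intercept, one continuous and one binary covariate.
rng = np.random.default_rng(0)
N, n = 5000, 200
X_U = np.column_stack([np.ones(N), rng.normal(size=N), rng.integers(0, 2, size=N)])
y_U = X_U @ np.array([10.0, 2.0, -3.0]) + rng.normal(size=N)

# A non-random "volunteer" sample: selection depends on the continuous covariate.
p_sel = np.exp(0.8 * X_U[:, 1])
p_sel /= p_sel.sum()
s = rng.choice(N, size=n, replace=False, p=p_sel)
X_s, y_s = X_U[s], y_U[s]
t_Ux = X_U.sum(axis=0)  # known population totals of the covariates

w1 = prediction_weights(X_s, t_Ux, "w1")
w2 = prediction_weights(X_s, t_Ux, "w2")
t_hat_1 = float(w1 @ y_s)  # weighted-sum form of (9)
t_hat_2 = float(w2 @ y_s)  # weighted-sum form of (10)
var_hat = {c: robust_variance(X_s, y_s, w1, c) for c in ("i", "ii", "iii")}

print("true total:", y_U.sum())
print("t_hat_1:", t_hat_1, "t_hat_2:", t_hat_2)
print("variance estimates for t_hat_1:", var_hat)
```

Because every leverage lies strictly between 0 and 1 in this setup, the three variance estimates are ordered (i) ≤ (ii) ≤ (iii). Both weight vectors reproduce the covariate totals $t_{Ux}$ exactly, which is why the same weights can be reused for any y.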
k1 =1
\[
\begin{aligned}
&+ \sum_{k_1=1}^{p} \sum_{k_2=2}^{p} \beta_{k_1,k_2} I(x_{ik_1} = 1) I(x_{ik_2} = 1) + \cdots \\
&+ \sum_{k_{p-1}=p-1}^{p} \beta_{k_1,k_2} I(x_{ik_{p-1}} = 1) I(x_{ik_p} = 1) \\
&+ \cdots + \beta_{k_1,k_2,\ldots,k_p} \prod_{l=1}^{p} I(x_{ik_l} = 1),
\end{aligned}
\tag{14}
\]

where $I(\cdot)$ is a binary indicator variable. Raking assumes main effects only:

\[
\mu_{y_i} = E_M(y_i \mid x_i) = \beta_0 + \sum_{k=1}^{p} \beta_k I(x_{ik} = 1). \tag{15}
\]

Denote the $2^p$ possible combinations of values of $x_1, \ldots, x_p$ by $h = 1, \ldots, 2^p$.
The resulting estimates of a population mean are given by

\[
\hat{\bar{Y}} = \sum_{h=1}^{2^p} P_h \hat{\mu}_h, \tag{16}
\]

where $P_h$ is the proportion of the population whose combination of binary indicator
variables is equal to h. That is, the $P_h$ are special cases of $\bar{X}_U$ at the beginning
of this section.

The estimated mean, $\hat{\mu}_h$, of the hth combination is found by replacing each $\beta$
with an estimator, $\hat{\beta}$, in (14) for the poststratification estimator and in (15) for
the raking estimator. (Note that $\hat{\mu}_h$ is an estimator of $\mu_{y_i}$ for each unit in
combination h.) These correspond to the weighted estimates obtained from poststratification
or raking. Both of these models can be extended to generalized linear regression by
replacing $\mu_{y_i}$ with the appropriate link function $g(\mu_{y_i})$ (logistic link for
logistic regression of a binary outcome, log link for a count outcome, etc.). Intermediate
models between poststratification (14) and raking (15) can be fit by incorporating some but
not all possible interaction terms.

$\tau^2$ and sample sizes $n_h$ within the hth combination of x's, and $n = \sum_h n_h$;
$\bar{y}_h$ is the sample mean for units in the hth combination and $\bar{y}$ is the mean for
all units. In practice, $\sigma^2$ and $\tau^2$ are replaced, for example, with empirical Bayes
estimators. Simulation studies in Elliott and Little (2000) showed that exchangeable priors
of the form (17) were somewhat fragile, tending to oversmooth when $\sigma^2$ and $\tau^2$ were
approximately equal. Alternative priors that ordered the strata or poststrata h by sampling
weights $w_h = N_h/n_h$ for population size $N_h$ and included information about this
structuring in either the prior mean or the variance (e.g., having the mean be a function
of $w_h$, or the variance an autoregressive structure as a function of $|h - h'|$) had much
better performance with respect to coverage and mean square error.

Wang et al. (2015) used an extension of this hierarchical model approach, termed multilevel
regression and poststratification (MRP), to obtain estimates of voting behavior in the 2012
US Presidential election from a highly nonrepresentative convenience sample of nearly
350,000 Xbox users, empaneled 45 days prior to the election. This large sample, combined
with highly predictive covariates about voting behavior, including information about party
identification and 2008 Presidential election voting behavior, allowed for a refined
prediction model that incorporated numerous interactions and used priors on the $\beta$s to
stabilize parameter estimates and resulting values of $\mu_h$. The values of $P_h$ were
estimated via probability sample exit polls from the 2008 US Presidential election,
themselves of very large size (over 100,000). Wang et al. (2015) showed that, despite the
fact that the raw Xbox estimates were severely biased in favor of Romney, reflecting its
largely male and white sample composition, accurate estimates of voting behavior were
obtained, based on comparisons with aggregated probability sampling polls as well as the
final election result. This accuracy was due to the large sample size
that allowed prediction of voting behavior among decidedly under-represented elements of
the population (e.g., older minority females), combined with the hierarchical regression
modeling to stabilize predictions.

4.2.1 Multilevel regression and poststratification via Bayesian finite population
inference. Wang et al.'s implementation of MRP ignored uncertainty in the estimation of the
$P_h$ from the probability sample. While this may have been warranted due to its large size,
in general failure to account for this variance will lead to anti-conservative inference
(confidence intervals that are too narrow). An alternative approach would be to utilize a
Bayesian finite population inference approach that treats the unsampled elements in the
population as missing data, together with the variable Y that is missing in the probability
sample data but available in the nonprobability sample data.

Let X be the variables available in the probability and nonprobability sample for
prediction of Y, Z be the probability sample design variables, and let $(X_{ns}, Z_{ns})$ and
$(X_p, Z_p)$ represent the nonsampled and probability-sampled elements of the population,
respectively. Dong, Elliott and Raghunathan (2014) obtain nonparametric draws from the
posterior predictive distribution of the nonsampled elements $(X_{ns} \mid X_s, Z_p)$

\[
p(X_{ns} \mid X_s, Z_p) \propto \int p(X_{ns}, Z_{ns} \mid X_p, Z_p)\, p(X_p, Z_p)\, dZ_{ns} \tag{18}
\]

under the assumption of ignorable sampling (X is independent of the sampling indicator I
conditional on Z) by making draws of $p(X_p, Z_p)$ from a Bayesian bootstrap (Rubin, 1981)
and draws from $p(X_{ns}, Z_{ns} \mid X_p, Z_p)$ via a finite population Bayesian bootstrap
(FPBB) procedure that accounts for probabilities of selection, clustering and weighting.
Treating the nonprobability sample $(Y_{np}, X_{np})$ as a certainty sample and concatenating
it with the probability sample to obtain $Y_s = Y_{np}$ and $X_s = (X_p, X_{np})$, we have
(Zhou, Elliott and Raghunathan, 2016c)

\[
\begin{aligned}
p(X_{ns} \mid Y_s, X_s, Z_p)
&\propto \int p(X_{ns}, Y_{ns} \mid Y_s, X_s, Z_p)\, dY_{ns} \\
&\propto \int\!\!\int p(Y_{ns} \mid X, Y_s, Z_p, \theta)\, p(X_{ns} \mid Y_s, X_s, Z_p, \theta)
   \cdot p(Y_s, X_s, Z_p \mid \theta)\, p(\theta)\, d\theta\, dY_{ns}
\end{aligned}
\]

under the assumption that $p(Y \mid X, \theta) = p(Y_s \mid X_{np}, \theta)$, that is, the model
for Y given X holds in both the probability and nonprobability samples. Draws of
$p(X_{ns} \mid Y_s, X_s, Z_p)$ can be made under (18) and imputations of $Y_{ns}$ made by
alternating between draws of $p(\theta \mid Y, X)$ and $p(Y_{ns} \mid Y_s, X, \theta)$. Full
implementation is made by obtaining L Bayesian draws of $Y_s, X_s, Z_p$, S draws of $X_{ns}$
via a weighted FPBB, and finally M draws of $Y_{ns}$ via standard multiple imputation methods
(including, possibly, MRP models of the form used in Wang et al., 2015). Inference about Y,
or, more typically, functions $Q \equiv Q(Y)$ can then be made via the approximate posterior
distribution of Q given by $t_{L-1}(\bar{Q}_L, (1 + L^{-1}) V_L)$ where

\[
\bar{Q}_L = \frac{1}{LMS} \sum_l \sum_m \sum_s q^{(lms)}
\quad \text{and} \quad
V_L = \frac{1}{L-1} \sum_l \bigl(\tilde{Q}^{(l)} - \bar{Q}_L\bigr)^2
\]

for $\tilde{Q}^{(l)} = \frac{1}{MS} \sum_m \sum_s q^{(lms)}$ and $q^{(lms)}$ is $Q(Y^{(lms)})$
where $Y^{(lms)} = (Y_s, Y_{ns}^{lms})$ for $Y_{ns}^{lms}$ obtained from the sth imputation of the
mth weighted FPBB of the lth BB. Details are available in Zhou, Elliott and Raghunathan
(2016c, 2016a, 2016b), where empirical results are also presented.

5. CONCLUSION

Although selection of probability samples has been the standard for inference in finite
populations for over 60 years, there are now many other sources of data that seem useful.
Data obtained from convenient sources like internal business records or the internet are
plentiful and tempting to use in estimation. Another contributing factor is that selecting
and maintaining probability samples becomes more difficult all the time, particularly when
surveying households and persons. Because of these considerations, methods of statistical
inference other than the design-based, repeated sampling approach are required.

Two alternatives are quasi-randomization and superpopulation modeling. In the former,
probabilities of being included in a sample are estimated based on covariates. Unit-level
covariates must be available for both the nonprobability sample and either a census of the
population or a well-controlled reference dataset that represents the nonsample units. The
reference sample may or may not be a probability sample. But, in any case, the reference
sample must permit inclusion probabilities to be estimated for the nonprobability units
when the two covariate sources are combined. The superpopulation approach constructs models
for y variables and uses them to predict finite population quantities like means or totals.
The quasi-randomization and superpopulation approaches can also be combined to create
estimators.

There are pros and cons to the two. In quasi-randomization, general inclusion probabilities
can be
uk-politics-32751993. BBC News online; accessed 06-November-2016.
Dever, J., Rafferty, A. and Valliant, R. (2008). Internet surveys: Can statistical adjustments eliminate coverage bias? Survey Research Methods 2 47–62.
Dever, J. and Valliant, R. (2010). A comparison of variance estimators for poststratification to estimated control totals. Surv. Methodol. 36 45–56.
Dever, J. and Valliant, R. (2014). Estimation with non-probability surveys and the question of external validity. In Proceedings of Statistics Canada Symposium 2014. Statistics Canada, Ottawa, ON.
Dever, J. and Valliant, R. (2016). GREG estimation with undercoverage and estimated controls. Journal of Survey Statistics and Methodology 4 289–318.
Deville, J. (1991). A theory of quota surveys. Surv. Methodol. 17 163–181.
Dong, Q., Elliott, M. and Raghunathan, T. (2014). A nonparametric method to generate synthetic populations to adjust for complex sample designs. Surv. Methodol. 40 29–46.
Elliott, M. (2009). Combining data from probability and non-probability samples using pseudo-weights. Survey Practice.
Elliott, M. R. and Davis, W. W. (2005). Obtaining cancer risk factor prevalence estimates in small areas: Combining data from two surveys. J. R. Stat. Soc. Ser. C. Appl. Stat. 54 595–609. MR2137256
Elliott, M. and Little, R. J. A. (2000). Model averaging methods for weight trimming. J. Off. Stat. 16 191–209.
Elliott, M., Resler, A., Flannagan, C. and Rupp, J. (2010). Combining data from probability and non-probability samples using pseudo-weights. Accident Analysis and Prevention 42 530–539.
Enten, H. (2014). Flying Blind Toward Hogan's Upset Win In Maryland. Available at http://fivethirtyeight.com/datalab/governor-maryland-surprise-brown-hogan/. FiveThirtyEight online; accessed 06-November-2016.
Ferrari, S. L. P. and Cribari-Neto, F. (2004). Beta regression for modelling rates and proportions. J. Appl. Stat. 31 799–815. MR2095753
File, T. and Ryan, C. (2014). Computer and internet use in the United States: 2013. Available at http://www.census.gov/content/dam/Census/library/publications/2014/acs/acs-28.pdf. US Census Bureau; accessed 06-November-2016.
Frost, S., Brouwer, K., Firestone-Cruz, M., Ramos, R., Ramos, M., Lozada, R., Magis-Rodriguez, C. and Strathdee, S. (2006). Respondent-driven sampling of injection drug users in two U.S.-Mexico border cities: Recruitment dynamics and impact on estimates of HIV and syphilis prevalence. Journal of Urban Health 83 83–97.
Gile, K. J. and Handcock, M. S. (2010). Respondent-driven sampling: An assessment of current methodology. Sociol. Method. 40 285–327.
Gosnell, H. F. (1937). How accurate were the polls? Public Opin. Q. 1 97–105.
Haziza, D. and Beaumont, J.-F. (2017). Construction of weights in surveys: A review. Statist. Sci. 32 206–226.
Heckathorn, D. D. (1997). Respondent-driven sampling: A new approach to the study of hidden populations. Soc. Probl. 44 174–199.
Holt, D. and Smith, T. M. F. (1979). Poststratification. J. R. Stat. Soc., A 142 33–46.
Kaizar, E. (2015). Incorporating both randomized and observational data into a single analysis. Annual Review of Statistics and Its Application 2 49–72.
Keiding, N. and Louis, T. (2016). Perils and potentials of self-selected entry to epidemiological studies and surveys. J. R. Stat. Soc., A 179 319–376. MR3461587
Kohut, A., Keeter, S., Doherty, C., Dimock, M. and Christian, L. (2012). Assessing the representativeness of public opinion surveys. Available at http://www.people-press.org/2012/05/15/assessing-the-representativeness-of-public-opinion-surveys/. Pew Research Center; accessed 06-November-2016.
Korn, E. and Graubard, B. (1999). Analysis of Health Surveys. Wiley, New York.
LeBlanc, M. and Tibshirani, R. (1998). Monotone shrinkage of trees. J. Comput. Graph. Statist. 7 417–433.
Lee, S. and Valliant, R. (2009). Estimation for volunteer panel web surveys using propensity score adjustment and calibration adjustment. Sociol. Methods Res. 37 319–343.
Liebermann, O. (2015). Why were the Israeli election polls so wrong? Available at http://www.cnn.com/2015/03/18/middleeast/israel-election-polls/. CNN online; accessed 06-November-2016.
Little, R. J. A. (1982). Models for nonresponse in sample surveys. J. Amer. Statist. Assoc. 77 237–250.
Little, R. J. A. (2003). Bayesian methods for unit and item nonresponse. In Analysis of Survey Data (R. Chambers and C. Skinner, eds.). Wiley, Chichester.
Lumley, T. and Scott, A. (2017). Fitting regression models to survey data. Statist. Sci. 32 265–278.
Madigan, D., Stang, P., Berlin, J., Schuemie, M., Overhage, J., Suchard, M., DuMouchel, W., Hartzema, W. and Ryan, P. (2014). A systematic statistical approach to evaluating evidence from observational studies. Annual Review of Statistics and Its Application 1 11–39.
Murphy, J., Link, M., Childs, J., Tesfaye, C., Dean, E., Stern, M., Pasek, J., Cohen, J., Callegaro, M. and Harwood, P. (2015). Social media in public opinion research. Public Opin. Q. 78 788–794.
Neyman, J. (1934). On the two different aspects of the representative method: The method of stratified sampling and the method of purposive selection. Journal of the Royal Statistical Society 97 558–625. MR0121942
O'Muircheartaigh, C. and Hedges, L. V. (2014). Generalizing from unrepresentative experiments: A stratified propensity score approach. J. R. Stat. Soc. Ser. C. Appl. Stat. 63 195–210. MR3234340
Rao, J. N. K. and Wu, C. F. J. (1988). Resampling inference with complex survey data. J. Amer. Statist. Assoc. 83 231–241. MR0941020
Rao, J. N. K., Wu, C. F. J. and Yue, K. (1992). Some recent work on resampling methods for complex surveys. Surv. Methodol. 18 209–217.
Rivers, D. (2007). Sampling for web surveys. Amazon Web Services. Available at https://s3.amazonaws.com/yg-public/Scientific/Sample+Matching_JSM.pdf.
Rosenbaum, P. and Rubin, D. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika 70 41–55. MR0742974
Royall, R. (1970). On finite population sampling theory under certain linear regression models. Biometrika 57 377–387.
Royall, R. (1971). Linear regression models in finite population sampling theory. In Foundations of Statistical Inference (V. Godambe and D. Sprott, eds.). Holt, Rinehart, and Winston, Toronto.
Rubin, D. B. (1976). Inference and missing data. Biometrika 63 581–592. MR0455196
Rubin, D. (1979). Using multivariate matched sampling and regression adjustment to control bias in observational studies. J. Amer. Statist. Assoc. 74 318–328.
Rubin, D. B. (1981). The Bayesian bootstrap. Ann. Statist. 9 130–134. MR0600538
Särndal, C.-E., Swensson, B. and Wretman, J. (1992). Model Assisted Survey Sampling. Springer, New York. MR1140409
Schonlau, M. and Couper, M. (2017). Options for conducting web surveys. Statist. Sci. 32 279–292.
Schonlau, M., van Soest, A. and Kapteyn, A. (2007). Are "Webographic" or attitudinal questions useful for adjusting estimates from web surveys using propensity scoring? Survey Research Methods 1 155–163.
Schonlau, M., Weidmer, B. and Kapteyn, A. (2014). Recruiting an Internet panel using respondent-driven sampling. J. Off. Stat. 30 291–310.
Simon, H. (1956). Rational choice and the structure of the environment. Psychological Review 63 129–138.
Sirken, M. (1970). Household surveys with multiplicity. J. Amer. Statist. Assoc. 65 257–266.
Smith, T. M. F. (1976). The foundations of survey sampling: A review. J. Roy. Statist. Soc. Ser. A 139 183–204. MR0445669
Smith, T. M. F. (1983). On the validity of inferences from non-random samples. J. R. Stat. Soc., A 146 394–403. MR0769995
Squire, P. (1988). Why the 1936 literary digest poll failed. Public Opin. Q. 52 125–133.
Stuart, E. A., Cole, S. R., Bradshaw, C. P. and Leaf, P. J. (2011). The use of propensity scores to assess the generalizability of results from randomized trials. J. R. Stat. Soc., A 174 369–386. MR2898850
Sturgis, P., Baker, N., Callegaro, M., Fisher, S., Green, J., Jennings, W., Kuha, J., Lauderdale, B. and Smith, P. (2016). Report of the Inquiry into the 2015 British general election opinion polls. Available at http://eprints.ncrm.ac.uk/3789/1/Report_final_revised.pdf. Accessed 06-November-2016.
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B. Stat. Methodol. 58 267–288. MR1379242
US Energy Information Administration (2016). Weekly petroleum status report. Available at https://www.eia.gov/petroleum/supply/weekly/pdf/appendixb.pdf. US Department of Energy online; accessed 06-November-2016.
Valliant, R. and Dever, J. A. (2011). Estimating propensity adjustments for volunteer web surveys. Sociol. Methods Res. 40 105–137. MR2758301
Valliant, R., Dever, J. A. and Kreuter, F. (2013). Practical Tools for Designing and Weighting Survey Samples. Springer, New York. MR3088726
Valliant, R., Dorfman, A. H. and Royall, R. M. (2000). Finite Population Sampling and Inference: A Prediction Approach. Wiley, New York.
van der Laan, M. J., Polley, E. C. and Hubbard, A. E. (2007). Super learner. Stat. Appl. Genet. Mol. Biol. 6.
Vonk, T. W. E., van Ossenbruggen, R. and Willems, P. (2006). The effects of panel recruitment and management on research results. Available at https://www.esomar.org/web/research_papers/Web-Panel_1476_The-effects-of-panel-recruitment-and-management-on-research-results.php. ESOMAR; accessed 06-November-2016.
Wang, W., Rothschild, D., Goel, S. and Gelman, A. (2015). Forecasting elections with non-representative polls. Int. J. Forecast. 31 980–991.
Zhou, H., Elliott, M. and Raghunathan, T. (2016a). Multiple imputation in two-stage cluster samples using the weighted finite population Bayesian bootstrap. Journal of Survey Statistics and Methodology 4 139–170.
Zhou, H., Elliott, M. and Raghunathan, T. (2016b). Synthetic multiple imputation procedure for multi-stage complex samples. J. Off. Stat. 32 251–256.
Zhou, H., Elliott, M. and Raghunathan, T. (2016c). A two-step semiparametric method to accommodate sampling weights in multiple imputation. Biometrics 72 242–252. MR3500593