Biostatistics and Epidemiology | Sem 1 (AY 23-24’) BSMLS 2D
Cagayan State University - College of Allied Health Sciences
Bachelor of Science in Medical Laboratory Science
“Consistency is Excellence” | Prof.
Types of Sampling Design & Sample Size Determination

I. Types of Sampling Design
II. Types of Probability Sampling Design
III. Sample Size Determination

Types of Sampling Designs

1. Non-probability Sampling Designs
a. The probability of each member of the sampling population being selected in the sample is difficult to determine or cannot be specified; hence, the reliability of the resulting sample estimates cannot be assessed.
b. The external validity of the results becomes an issue.
c. Examples of non-probability sampling designs are:
● Purposive sampling
● Judgement sampling
● Snowball technique (referral sampling)
d. These are the types of designs usually used in qualitative studies.

2. Probability Sampling Designs
a. The rules and procedures for selecting the sample and estimating the parameters are explicitly and rigidly specified.
b. The reliability of the resulting estimates can be determined.
c. Most quantitative studies use probability sampling designs in the selection of the subjects.

Types of Probability Sampling

Simple Random Sampling

Characteristics: Every element in the population has an equal chance of being included in the sample.

Procedures for Sample Selection:
a. Prepare the sampling frame.
b. Number all the population elements in the sampling frame chronologically from 1 to N, where N is the population size.
c. Determine the required sample size, n.
d. Select n numbers at random between 1 and N, using either the lottery method or computer-generated random numbers from software such as Excel.
e. The population elements in the list whose numbers match the n randomly selected numbers will comprise the simple random sample.

Stratified Random Sampling

Characteristics: This design is used when the investigator wants to:
a. Ensure that the groups of interest or subsections of the population considered important for the study are adequately represented
b. Derive reasonably precise estimates for important subsections of the population

Procedures:
a. Identify the stratification variable
b. Classify the population elements according to the categories of the stratification variable.
c. Number the population elements chronologically from 1 to N within each category of the stratification variable.
d. Determine the sample size needed from each stratum.
e. Within each stratum, select the required number of samples by simple random sampling.

Non-proportional Allocation
- Divide the sample size by the number of strata identified.
- Each stratum is allotted an equal number of sample elements.

Systematic Sampling

Characteristics:
a. Every element has an equal chance of being selected.
b. Often used under the following conditions:
● The population elements are too many to list or to number chronologically
● A frame is not available
c. Often used in combination with other designs.

Procedures:
a. Determine the required sample size, n.
b. Determine the sampling interval, k, where k = N/n.
c. Select a number at random between 1 and k. The population element in the frame corresponding to the random number selected will be the first to be included in the sample.
d. Include in the sample survey every kth population element after the first element selected.

Cluster Sampling

Characteristics:
a. Used when a frame for the individual elementary units in the population is not available, but a frame for groups or clusters of elements is available
b. The sampling unit is different from the
elementary unit
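
The selection procedures described above for simple random, stratified (equal allocation), and systematic sampling can be sketched in Python. This is only an illustrative sketch, not a prescribed method: the function names and the seed parameter are hypothetical, population elements are assumed to be numbered 1 to N, and the sampling interval k is taken as the whole-number part of N/n, which is an assumption for cases where N/n is not a whole number.

```python
import random


def simple_random_sample(N, n, seed=None):
    """Select n distinct element numbers from 1..N, each equally likely."""
    rng = random.Random(seed)  # seed is only for reproducible illustrations
    return sorted(rng.sample(range(1, N + 1), n))


def stratified_sample_equal(strata, n, seed=None):
    """Equal (non-proportional) allocation across strata.

    strata: dict mapping a stratum name to the list of its elements.
    The total sample size n is divided by the number of strata, and a
    simple random sample of that size is drawn within each stratum.
    """
    rng = random.Random(seed)
    per_stratum = n // len(strata)  # assumption: any remainder is ignored
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}


def systematic_sample(N, n, seed=None):
    """Random start between 1 and k, then every k-th element."""
    rng = random.Random(seed)
    k = N // n                  # sampling interval (assumed integer part of N/n)
    start = rng.randint(1, k)   # random start, inclusive of both 1 and k
    return [start + i * k for i in range(n)]


if __name__ == "__main__":
    print(simple_random_sample(N=100, n=10, seed=1))
    print(systematic_sample(N=100, n=10, seed=1))
    strata = {"urban": list(range(1, 51)), "rural": list(range(51, 81))}
    print(stratified_sample_equal(strata, n=20, seed=1))
```

With N = 100 and n = 10, the systematic sample always consists of 10 element numbers spaced exactly k = 10 apart, while the simple random sample has no fixed spacing.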
Sample Size Determination
SAMPLE PROBLEM: DETERMINING THE
SAMPLE SIZE IN ORDER TO ESTIMATE A
PROPORTION
In a project which aims to assess, among
others, the effectiveness of a post-disaster
nutrition program to be implemented in
Province X, one of the impact indicators to
be used is the decrease in the prevalence
of malnutrition among preschoolers in the
province. In order to measure this indicator,
it is planned to conduct a baseline survey
to determine the prevalence of
malnutrition among preschoolers in
Province X at the start of the program. A
review of nutrition data in the province
revealed that the only background data
available is the result of a study done in the
province 2 years ago, which indicates that
the prevalence of moderate and severe
malnutrition among preschoolers is 25%. If
it is decided to select a random sample of
preschoolers for the baseline survey, how
big should the sample size be if the error
rate is set to be within ±5%, with 95%
confidence?
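
For reference, the solution to this problem rests on the standard sample-size formula for estimating a proportion, which the notes apply without writing out. In the symbols used in the solution, z is the standard normal value for the chosen confidence level, P is the anticipated proportion, Q = 1 - P, and d is the acceptable margin of error:

```latex
n = \frac{z^{2}\,P\,Q}{d^{2}}, \qquad Q = 1 - P
```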
Based on the information provided in the sample
problem:
z = 1.96 (based on the desired confidence level of 95%)
P = 0.25 (malnutrition prevalence based on a survey done 2 years ago)
Q = (1 - P) = (1 - 0.25) = 0.75
d = 0.05

Therefore, n = (1.96)² (0.25)(0.75) / (0.05)² ≈ 288

DEALING WITH COMMON ISSUES IN SAMPLE SIZE DETERMINATION

a. When deciding on the sample size requirements for a study with more than one objective involving the estimation and/or testing of several parameters and hypotheses, the sample size requirement of each important parameter has to be computed and considered.

b. When estimating a proportion whose value is unknown, a common practice is to assume that P = 0.50. The basis for this is the fact that the variance of indicators which are in the form of proportions has its maximum value when P = 0.50 and Q = 0.50, and hence this assumption will ensure an adequate sample size irrespective of the actual value of P.

c. When the sampling design makes use of cluster sampling instead of pure simple random sampling, the sample size has to be corrected for the design effect (deff), i.e.,
n(cluster sampling) = n(simple random sampling) x deff
"Deff" is the factor by which the sample size for a cluster sample has to be increased in order to derive estimates with the same precision as a simple random sample. In the area of health, it has been shown that for most health surveys deff = 1.5 to 2.0, with deff = 2.0 being a common value used.

d. In order to ensure that the required sample size is reached, a correction factor for non-response is usually applied at the time of sample size determination.
● This avoids the need to look for "substitutes" during data collection, which usually introduces biases in sample selection. The non-response rate varies depending on the survey setting (e.g., urban areas generally have higher non-response rates than rural areas; surveys which ask sensitive questions also have higher non-response rates), but in general, an inflation factor of 10% for non-response has been shown to be adequate in most situations.
● Therefore, if, for example, the required
sample size of a given survey after
applying the design effect is 800,
then the revised target sample size
after applying the correction factor
for non-response will be 800 + 80 =
880. Data collection activities should
therefore be planned for a sample
size of 880.
e. There are instances when the computed
sample size is deemed too big relative to
the population size. (There are even
instances when the computed sample size
is bigger than the population size). This is
when the finite population correction (fpc)
can be applied to determine the final
sample size to be considered. The sample
size formula after application of the fpc is:
where:
nfpc = computed sample size after application of the finite population correction
n0 = initial sample size computed prior to application of the fpc
N = population size
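
The adjustments discussed in this section can be chained together in a short Python sketch. The function names are hypothetical and results are left unrounded. The non-response step adds 10% of n, following the 800 + 80 = 880 example in the notes. The finite population correction is written in one widely used form, nfpc = n0 / (1 + (n0 - 1) / N); this form is an assumption, since the notes define only the symbols, and the course formula may be written differently. The population size of 2,000 in the usage lines is likewise made up for illustration.

```python
def sample_size_proportion(z, p, d):
    """Unrounded n = z^2 * P * Q / d^2, with Q = 1 - P."""
    q = 1 - p
    return (z ** 2) * p * q / (d ** 2)


def apply_design_effect(n, deff):
    """n(cluster sampling) = n(simple random sampling) x deff."""
    return n * deff


def inflate_for_nonresponse(n, rate=0.10):
    """Add rate * n to the sample size, as in the 800 + 80 = 880 example."""
    return n + n * rate


def apply_fpc(n0, N):
    """ASSUMED form of the finite population correction:
    nfpc = n0 / (1 + (n0 - 1) / N)."""
    return n0 / (1 + (n0 - 1) / N)


if __name__ == "__main__":
    # Sample problem: z = 1.96, P = 0.25, d = 0.05
    n_srs = sample_size_proportion(z=1.96, p=0.25, d=0.05)
    print("simple random sampling:", round(n_srs, 2))

    # Non-response example from the notes: 800 after design effect
    print("after non-response:", round(inflate_for_nonresponse(800), 2))

    # Hypothetical population of 2,000 to illustrate the fpc
    print("after fpc:", round(apply_fpc(880, N=2000), 2))
```

Under this assumed form, the corrected sample size is always smaller than both n0 and N, so it stays below the population size even when the initial computed sample size exceeds it.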