Determination of Sample Size

The document discusses the importance of determining an optimum sample size for studies to ensure valid and reliable results. It outlines factors affecting sample size, including study objectives, type I and II errors, standard deviation, and allowable error. Additionally, it provides various formulas for calculating sample size for different types of studies and highlights the potential biases that can occur in sampling.

Uploaded by

jahanvi.aundhiya

23-02-2024

Statistics:
Determination of
Sample Size
Dr. Nilesh Fichadiya
M.D.(Community Medicine), D.P.H.

Calculation of Sample Size


• What should be the sample size?
  - To get the correct results

• Too small a sample → study is not valid

• Too large a sample → laborious, costly and time-consuming

• We need an optimum sample size, which gives reliable results

Sample and population


• If a sufficiently large and unbiased sample is taken from a defined population → inferences drawn from the sample can be applied to the whole population


Sample and population


• If the sample is representative of the population, the sample statistics (mean x̄, SD s and proportion p) will NOT differ significantly from the population parameters µ, σ and P

Characteristics of Representative Sample
1. Precision: Sample size

2. Unbiased character: Technique of sample selection

Precision
• Also known as reproducibility or reliability

• Defined as the ability of an instrument/test/sample to provide the same or a very similar result with repeated measurements of the same factor

• In the case of a sample, it depends on the sample size


Precision

$$\text{Precision} = \frac{\sqrt{n}}{s}$$
n = sample size
s = standard deviation (SD)
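
As a rough illustration of how precision grows with sample size, here is a minimal sketch. It assumes precision is taken as the square root of n divided by s; the function name `precision` is made up for this example.

```python
import math

def precision(n: int, s: float) -> float:
    """Precision taken as sqrt(n) / s (assumed definition for this sketch)."""
    if n <= 0 or s <= 0:
        raise ValueError("n and s must be positive")
    return math.sqrt(n) / s

# With the same SD, a sample four times larger is only twice as precise.
print(precision(100, 2.0))
print(precision(400, 2.0))
```

Because precision rises with the square root of n, each further gain in precision costs disproportionately more subjects.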

Sample size depends on:


1. Study objective and study design:
  - To estimate a value: descriptive study
  - To test a hypothesis: analytical study or experimental study
  - Whether we have one or two sets of data
  - Whether the data is paired or unpaired

Sample Size and Study Design


• In an analytical study, there is ALWAYS a comparison
  - So we have cases and controls, or a comparison group
  - Here we need to multiply the sample size by (r+1)/r, where r = ratio of controls to cases

• In an interventional study:
  - We have an intervention group and a control group
  - We have to double the sample size


Sample Size and Study Design


• In a before-after study,
  - We have one group
  - We need not double the sample size

• In cluster sampling,
  - We have to multiply the sample size by the design effect (discussed in previous topic)
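
The design adjustments above can be sketched as two small helpers. This is only an illustration: the function names are hypothetical, `n_base` is assumed to be a size computed before any group or design adjustment, and the result of the (r+1)/r step is assumed to be the number of cases, with r controls per case.

```python
import math

def adjust_for_control_ratio(n_base: float, r: float) -> tuple[int, int]:
    """Scale a base size by (r + 1) / r and return (cases, controls).

    Assumption: the scaled figure is the number of cases, and the
    number of controls is r times that.
    """
    if r <= 0:
        raise ValueError("r must be positive")
    cases = math.ceil(n_base * (r + 1) / r)
    controls = math.ceil(cases * r)
    return cases, controls

def apply_design_effect(n: float, design_effect: float) -> int:
    """Inflate a sample size by the design effect (cluster sampling)."""
    return math.ceil(n * design_effect)

print(adjust_for_control_ratio(50, 1))  # equal groups: factor (1 + 1) / 1 = 2
print(adjust_for_control_ratio(50, 2))  # two controls per case: factor 3 / 2
print(apply_design_effect(100, 1.5))
```

With r = 1 the factor (r+1)/r equals 2, which matches the rule of doubling the sample size when there are two equal groups.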

Sample size depends on:


2. The degree of type I error (denoted as α)
  - α is the level of significance, i.e. the cut-off with which the p value is compared
  - 1 − α is the confidence level (taken as 95% or 99%)

3. The degree of type II error (denoted as β)
  - (1 − β) is the "power" of the study (taken as 80% or 90%)

α and β errors
• α error (type I error) is the probability of concluding that a difference is real when it has actually occurred by chance

• So, 1 − α (confidence level) is the probability of not declaring a chance difference to be real

• Prior to starting a study, we set an acceptable value for α, so that results are called significant at p<0.05 or p<0.01 (accordingly, the confidence level is 95% or 99%)

α and β errors
• β error is the probability that the study will miss a true difference
  - i.e. there is a real difference but we fail to detect it

• So, (1 − β), or the "power" of the study, is the probability that, if there is a difference, we will detect it correctly

α and β errors
• As a rule:
  - For estimating a value → only α error is considered
  - For testing a hypothesis → both α and β errors are considered

• If β error is considered in the formula of sample size, we get a higher sample size
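
A short worked comparison shows why including β raises the sample size. It assumes α = 0.05 (two-tailed, Z = 1.96) and 80% power (Z = 0.84), and it compares only the Z factor that multiplies the rest of a formula, not a complete sample size.

```latex
\text{Estimation (only } \alpha\text{):}\quad Z_{1-\alpha/2}^{2} = (1.96)^{2} \approx 3.84
\\
\text{Hypothesis testing (} \alpha \text{ and } \beta\text{):}\quad
\left(Z_{1-\alpha/2} + Z_{1-\beta}\right)^{2} = (1.96 + 0.84)^{2} = (2.80)^{2} = 7.84
```

Under these assumptions the Z factor roughly doubles once power is taken into account.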

Sample size depends on:


4. Standard deviation (SD, s) or variance (s²) in the population

5. The proportion of people experiencing the attribute (p) and of people not experiencing it (1 − p, or q)


Sample size depends on:


6. The level of acceptable or allowable error (E)
  - Can be taken as 5% or 10% or 20%

7. The degree of difference or effect size expected between the two sets of data (d)
  - We can estimate the effect size based on previously reported or preclinical studies
  - If the effect size is large → the sample size is smaller
  - If the effect size is small → the sample size is larger

Sample size is larger if:


• Confidence level desired is high
• Power desired is high
• Variance (or SD) is large
• Proportion of people with the attribute is small (when the allowable error is taken as a fraction of p)
• Margin of error (allowable error) is small
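
The effect of the margin of error and of the effect size can be made concrete. In the formulae of this deck the allowable error E (estimation) and the difference d (comparison) enter as a square in the denominator, so, with everything else held fixed:

```latex
N \propto \frac{1}{E^{2}} \quad \text{(estimation)}, \qquad
N \propto \frac{1}{d^{2}} \quad \text{(comparison)}
\\
\frac{N(E/2)}{N(E)} = \frac{E^{2}}{(E/2)^{2}} = 4
```

Halving the allowable error, or halving the difference to be detected, therefore needs about four times as many subjects.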

Various Formulae for Calculation of Sample Size:
• To estimate the value by descriptive cross-sectional study
  - Quantitative variable (mean):

$$N = \frac{Z_{1-\alpha/2}^{2}\, s^{2}}{E^{2}}$$
where,
$Z_{1-\alpha/2}$ = value of the standard normal deviate for the chosen value of α,
s = standard deviation,
E = allowable error


Value of $Z_{1-\alpha/2}$ and $Z_{1-\alpha}$ for two-tailed and one-tailed studies
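
The Z values for two-tailed and one-tailed studies can be recomputed rather than looked up. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `z_value` is an assumption of this example.

```python
from statistics import NormalDist

def z_value(alpha: float, two_tailed: bool = True) -> float:
    """Standard normal deviate for a significance level alpha.

    Two-tailed studies use the 1 - alpha/2 quantile,
    one-tailed studies use the 1 - alpha quantile.
    """
    quantile = 1 - alpha / 2 if two_tailed else 1 - alpha
    return NormalDist().inv_cdf(quantile)

for alpha in (0.10, 0.05, 0.01):
    print(alpha, round(z_value(alpha), 2), round(z_value(alpha, two_tailed=False), 2))
```

For α = 0.05 this gives the familiar 1.96 (two-tailed) and about 1.64 (one-tailed).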

Exercise
• The mean hemoglobin level of girl students in colleges is estimated to be 11.5 gm% with an SD of 1.5 gm%. Calculate the sample size for a study of hemoglobin estimation in physiotherapy colleges of the Saurashtra region with an allowable error of 0.2 gm% at the 5% significance level.

$$N = \frac{Z_{1-\alpha/2}^{2}\, s^{2}}{E^{2}} = \frac{(1.96)^{2}(1.5)^{2}}{(0.2)^{2}} = \frac{3.84 \times 2.25}{0.04} = 216$$
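
The same calculation can be sketched in code. This is a minimal illustration of the formula for a mean, N = Z²s²/E²; the function name is hypothetical and the result is returned unrounded.

```python
from statistics import NormalDist

def sample_size_for_mean(sd: float, allowable_error: float, alpha: float = 0.05) -> float:
    """Unrounded sample size to estimate a mean: (Z * sd / E) ** 2."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed Z, about 1.96 for alpha = 0.05
    return (z * sd / allowable_error) ** 2

# Hemoglobin example: SD 1.5 gm%, allowable error 0.2 gm%, 5% significance.
print(sample_size_for_mean(sd=1.5, allowable_error=0.2))
```

With the unrounded Z the result is about 216.1. The hand calculation gives exactly 216 because (1.96)² is rounded to 3.84.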


Sample size calculation: For Quantitative data:
• Exercise:
• The mean pulse rate of a population is believed to be 72/minute with a standard deviation of 8. Calculate the minimum sample size needed to verify this if the allowable error is 1 at the 5% significance level.

Various Formulae for Calculation of Sample Size:
• To estimate the value by descriptive cross-sectional study
  - Qualitative variable (proportion):

$$N = \frac{Z_{1-\alpha/2}^{2}\, p q}{E^{2}}$$
where,
p = positive character,
q = negative character = 1 − p, or q = 100 − p in percentage as p + q = 100%,
E = allowable error of p, usually 10% or 20% of p

Example
The incidence rate in the last SARS CoV epidemic was found to be 50/1000 (5%) of the population exposed.
What should be the size of the sample to find the incidence rate of SARS CoV in the current epidemic if the allowable error is 10% and 20%?


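p = 5%,
q = 100 − p = 100 − 5 = 95%
E = 0.5 (at 10% of p)
E = 1 (at 20% of p)
Here 4 is the rounded value of $Z_{1-\alpha/2}^{2} = (1.96)^{2}$.

• Sample size calculation at 10% allowable error

$$n = \frac{4pq}{E^{2}} = \frac{4 \times 5 \times 95}{0.5 \times 0.5} = 7600$$
• Sample size calculation at 20% allowable error
$$n = \frac{4pq}{E^{2}} = \frac{4 \times 5 \times 95}{1 \times 1} = 1900$$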
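
A small sketch of this calculation follows. It assumes, as in the example, that p is given in percent, that the allowable error is a fraction of p, and that Z² is rounded to 4; the function name is hypothetical.

```python
def sample_size_for_proportion(p_percent: float, relative_error: float,
                               z_squared: float = 4.0) -> float:
    """Sample size to estimate a proportion: Z^2 * p * q / E^2 (p, q, E in percent)."""
    q_percent = 100.0 - p_percent
    allowable_error = relative_error * p_percent  # E as a fraction of p
    return z_squared * p_percent * q_percent / allowable_error ** 2

# SARS CoV example: p = 5%, allowable error 10% and 20% of p.
print(sample_size_for_proportion(5, 0.10))
print(sample_size_for_proportion(5, 0.20))
```

Doubling the allowable error from 10% to 20% of p cuts the required sample to a quarter, from 7600 to 1900.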

Sample size calculation: For Qualitative data:

• Exercise:
The prevalence rate of musculoskeletal disorders was found to be 40% in earlier studies.
Calculate the size of sample required to find the prevalence rate of musculoskeletal disorders in your area if the allowable error is 10% or 20%.

Various Formulae for Calculation of Sample Size:
• To establish association by case control study or independent sample cohort study:
  - Quantitative Data (sample size for each group):

$$N = \frac{2\,(Z_{1-\alpha/2} + Z_{1-\beta})^{2}\,(s_1 \times s_2)}{d^{2}}$$


Value of $Z_{1-\beta}$ for different powers of studies
Value of $Z_{1-\alpha/2}$ and $Z_{1-\alpha}$ for two-tailed and one-tailed studies

Example:
• An investigator wants to find out whether pollution affects lung function differently in traffic police and in the general population, by comparing forced expiratory volume (FEV) between the two groups.
• From a previous study it is known that the SDs (s) of FEV are 3.5 l/min and 5 l/min among traffic police and the general population respectively.
• The investigator wishes to be 90% confident of detecting a difference of 2.5 l/min or more in either direction at the 5% level of significance.
• How many subjects from traffic police and from the general population are required for testing the null hypothesis that there is no difference in FEV between the two groups?

$$N = \frac{2\,(Z_{1-\alpha/2} + Z_{1-\beta})^{2}\,(s_1 \times s_2)}{d^{2}} = \frac{2\,(1.96 + 1.28)^{2}\,(3.5 \times 5)}{(2.5)^{2}} = \frac{2\,(10.50 \times 17.5)}{6.25} = 58.8 \approx 59 \text{ from each population}$$
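
The FEV example can be checked with a short sketch. It implements the formula exactly as used on this slide, with the product s1 × s2 in the numerator, and takes the Z values from `statistics.NormalDist`; the function name is an assumption of this example.

```python
import math
from statistics import NormalDist

def n_per_group_means(s1: float, s2: float, d: float,
                      alpha: float = 0.05, power: float = 0.90) -> float:
    """Per-group size: 2 * (Z_alpha + Z_beta)^2 * (s1 * s2) / d^2 (slide's formula)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed, about 1.96
    z_beta = NormalDist().inv_cdf(power)           # about 1.28 for 90% power
    return 2 * (z_alpha + z_beta) ** 2 * (s1 * s2) / d ** 2

n = n_per_group_means(s1=3.5, s2=5.0, d=2.5)
print(n, math.ceil(n))
```

The unrounded result is a little under 59, so rounding up gives 59 subjects from each population, in line with the hand calculation.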


Various Formulae for Calculation of Sample Size:
• To establish association by case control study or independent sample cohort study:
  - Qualitative Data (sample size for each group):

$$N = \frac{(Z_{1-\alpha/2} + Z_{1-\beta})^{2}\,(p_1 q_1 + p_2 q_2)}{d^{2}}$$

Example
• An investigator is interested in studying whether lower back pain among bank officers in Ahmedabad is more than or equal to that in the rest of the population.
• Previous studies reported that the prevalence of lower back pain among bank officers in Ahmedabad is 55%, whereas it is 45% among the general population.
• The investigator wishes to be 80% confident of detecting a difference of 10% or more in either direction at the 1% level of significance.
• How many bank officers and members of the general population should be included in this study?

$$N = \frac{(Z_{1-\alpha/2} + Z_{1-\beta})^{2}\,(p_1 q_1 + p_2 q_2)}{d^{2}} = \frac{(2.57 + 0.84)^{2}\,[(0.45 \times 0.55) + (0.55 \times 0.45)]}{(0.1)^{2}} = \frac{(3.41)^{2}\,(0.2475 + 0.2475)}{0.01} \approx 576$$
in each group. If each pq is rounded to 0.25, the result is 581.
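
A sketch of the same calculation is given below. It takes the Z values as explicit inputs so that the slide's rounded figures (2.57 and 0.84) can be reproduced; the function name and its parameters are assumptions of this example.

```python
def n_per_group_proportions(p1: float, p2: float, d: float,
                            z_alpha: float, z_beta: float) -> float:
    """Per-group size: (Z_alpha + Z_beta)^2 * (p1*q1 + p2*q2) / d^2."""
    q1, q2 = 1 - p1, 1 - p2
    return (z_alpha + z_beta) ** 2 * (p1 * q1 + p2 * q2) / d ** 2

# Lower back pain example, proportions as fractions, slide's rounded Z values.
n = n_per_group_proportions(p1=0.55, p2=0.45, d=0.10, z_alpha=2.57, z_beta=0.84)
print(n)
```

Without rounding each pq term the result is about 575.6, i.e. 576 per group; rounding 0.2475 up to 0.25 is what produces the slightly larger figure of 581.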


Various Formulae for Calculation of Sample Size:
• To identify significant difference between two groups by intervention study
  - Quantitative Data (sample size for each group):

$$N = \frac{2\,(Z_{1-\alpha/2} + Z_{1-\beta})^{2}\,s^{2}}{d^{2}}$$

For unpaired data,
• Take pooled 's'
• Calculated sample size is for each group

Calculate Pooled s or p

• Pooled 𝑠 =

• Pooled 𝑝 =

Various Formulae for Calculation of Sample Size:
• To identify significant difference between two groups by intervention study
  - Qualitative Data (sample size for each group):

$$N = \frac{2\,(Z_{1-\alpha/2} + Z_{1-\beta})^{2}\,p q}{d^{2}}$$

For unpaired data,
• Take pooled 'p' & 'q'
• Calculated sample size is for each group
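
The intervention-study formulae can be read as special cases of the case control formulae given earlier. The short derivation below assumes the quantitative formula carries s² in the numerator and that the pooled values simply stand in for both groups.

```latex
\text{Quantitative: set } s_1 = s_2 = s \text{ (pooled):}\quad
\frac{2\,(Z_{1-\alpha/2} + Z_{1-\beta})^{2}\,(s_1 \times s_2)}{d^{2}}
= \frac{2\,(Z_{1-\alpha/2} + Z_{1-\beta})^{2}\,s^{2}}{d^{2}}
\\
\text{Qualitative: set } p_1 = p_2 = p \text{ (pooled):}\quad
\frac{(Z_{1-\alpha/2} + Z_{1-\beta})^{2}\,(p_1 q_1 + p_2 q_2)}{d^{2}}
= \frac{2\,(Z_{1-\alpha/2} + Z_{1-\beta})^{2}\,p q}{d^{2}}
```

In both cases the factor 2 reflects the two groups being compared, and the result is the size of each group.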


Bias in Sampling
• Bias: a systematic error that makes a result differ from the true value

• Examples of Bias

• Types:
1. Selection Bias
2. Information Bias
3. Measurement Bias
4. Bias due to confounding

Bias in Sampling
• Selection Bias
  - Selection of subjects
  - Non-response
  - Loss to follow up

• Information Bias
  - Quality and extent of information obtained from different subjects
  - Recall bias

Bias in Sampling
• Measurement Bias
  - Misclassification or mis-diagnosis of subjects
  - Unequal diagnostic work-up in different groups
  - Measurement error

• Bias due to confounding
  - A confounder is a factor that is associated with both the risk factor and the disease outcome


Thank You
