Introduction to Sampling Theory
Lecture 5
Simple Random Sampling
Shalabh
Department of Mathematics and Statistics
Indian Institute of Technology Kanpur
Slides can be downloaded from
http://home.iitk.ac.in/~shalab/sp
Probability of Selection of a Unit
Let the size of the population be N.
One out of the N sampling units is to be chosen.
SRSWOR
The probability of drawing a sampling unit = 1/N
SRSWR
The probability of drawing a sampling unit = 1/N
Probability of Selection of a Unit by SRSWOR or SRSWR
Balls numbered 1, 2, ..., 10:
Probability of drawing ball 1 = 1/10
Probability of drawing ball 2 = 1/10
...
Probability of drawing ball 10 = 1/10
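The 1/10 value can be checked empirically. The following is a minimal R sketch, not part of the original slides, that repeats a single draw many times and tabulates the relative frequency of each ball. The seed, the number of repetitions and the variable names are illustrative assumptions.

```r
# Repeat the experiment "draw one ball out of N = 10" many times.
set.seed(1)                      # arbitrary seed, only for reproducibility
N <- 10
reps <- 100000                   # number of repetitions (assumed value)

# Each entry of draws is the result of one independent single draw.
draws <- sample(1:N, size = reps, replace = TRUE)

# Relative frequency of each ball; every entry should be close to 1/N.
rel_freq <- table(factor(draws, levels = 1:N)) / reps
print(round(rel_freq, 3))
```

The relative frequencies fluctuate around 1/N = 0.1 and get closer to it as reps grows.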
Proof: Probability of Selection of a Unit: SRSWOR
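Let Ai : Event that a particular jth unit is not selected at the ith draw, with complement Āi (the unit is selected at the ith draw).
The probability of selecting, say, the jth unit at the kth draw is

\[
\begin{aligned}
P(\text{selection of } u_j \text{ at the } k\text{th draw})
&= P(A_1 \cap A_2 \cap \cdots \cap A_{k-1} \cap \bar{A}_k) \\
&= P(A_1)\, P(A_2 \mid A_1)\, P(A_3 \mid A_1 A_2) \cdots P(A_{k-1} \mid A_1 A_2 \cdots A_{k-2})\, P(\bar{A}_k \mid A_1 A_2 \cdots A_{k-1}) \\
&= \left(1 - \frac{1}{N}\right)\left(1 - \frac{1}{N-1}\right)\left(1 - \frac{1}{N-2}\right) \cdots \left(1 - \frac{1}{N-k+2}\right) \frac{1}{N-k+1} \\
&= \frac{N-1}{N} \cdot \frac{N-2}{N-1} \cdots \frac{N-k+1}{N-k+2} \cdot \frac{1}{N-k+1} \\
&= \frac{1}{N}.
\end{aligned}
\]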
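The telescoping product in this proof can be evaluated numerically. The sketch below is an illustration only: the helper name prob_at_kth_draw and the choice N = 10 are assumptions, not part of the lecture.

```r
# Probability that a fixed unit is selected at the kth draw under SRSWOR,
# computed as the product of conditional probabilities used in the proof.
prob_at_kth_draw <- function(N, k) {
  # Unit is not selected at draws 1, ..., k - 1:
  # (1 - 1/N)(1 - 1/(N-1)) ... (1 - 1/(N-k+2)); empty product for k = 1.
  not_selected <- if (k > 1) prod(1 - 1 / (N - 0:(k - 2))) else 1
  # Unit is selected at draw k, when N - k + 1 units remain.
  not_selected * (1 / (N - k + 1))
}

N <- 10                                   # assumed population size
probs <- sapply(1:N, function(k) prob_at_kth_draw(N, k))
print(probs)                              # each value equals 1/N
```

Whatever the draw number k, the product collapses to 1/N, which is the statement being proved.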
Probability of Selection of a Sample
Let the size of the population be N.
Let the size of the sample be n.
A sample of n sampling units out of N sampling units is to be chosen.
Probability of Selection of a Sample
SRSWOR
Total number of combinations to choose n sampling units out of N sampling units = $\binom{N}{n}$
The probability of drawing a sample = $1 / \binom{N}{n}$
Probability of Selection of a Sample
SRSWOR
Suppose N = 3, n = 2, with units 1, 2, 3.
Total samples = $\binom{3}{2}$ = 3
Sample 1: 1, 2
Sample 2: 2, 3
Sample 3: 1, 3
Probability of drawing a sample = 1/3
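The same enumeration can be reproduced in R. This is a small sketch using the base functions combn() and choose(); the variable names are illustrative, and combn() lists the samples in its own order, which differs from the numbering on the slide.

```r
# All SRSWOR samples of size n = 2 from the N = 3 units 1, 2, 3.
units <- 1:3
n <- 2

all_samples <- combn(units, n)    # each column is one possible sample
n_samples <- ncol(all_samples)    # number of possible samples

print(all_samples)
print(n_samples)                  # equals choose(3, 2)
print(1 / n_samples)              # probability of drawing any one sample
```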
Proof: Probability of Selection of a Sample: SRSWOR
A unit can be selected at any one of the n draws.
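Let ui be the ith unit selected in the sample.
This unit can be selected in the sample either at the first draw, the second draw, …, or the nth draw.
Let Pj(i) denote the probability of selection of ui at the jth draw, j = 1,2,...,n. Since selection at different draws is mutually exclusive, the probability that ui is included in the sample is

\[
\sum_{j=1}^{n} P_j(i) = P_1(i) + P_2(i) + \cdots + P_n(i)
= \frac{1}{N} + \frac{1}{N} + \cdots + \frac{1}{N} \ (n \text{ times})
= \frac{n}{N}.
\]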
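The n/N result can be verified by listing every possible sample. The sketch below assumes a small population (N = 5, n = 2, chosen only for illustration) and counts the fraction of equally likely SRSWOR samples that contain each unit.

```r
# Exact inclusion probability of each unit under SRSWOR by enumeration.
N <- 5                                   # assumed population size
n <- 2                                   # assumed sample size

all_samples <- combn(N, n)               # columns = all samples from 1, ..., N

# For unit i: fraction of the samples that contain i.
incl_prob <- sapply(1:N, function(i) mean(colSums(all_samples == i) > 0))

print(incl_prob)                         # every entry equals n / N
```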
Proof: Probability of Selection of a Sample: SRSWOR
Let u1, u2,…,un be the n units selected in the sample.
The probability of their selection is
P(u1, u2,…, un) = P(u1). P(u2). . .P(un)
where each factor is the probability of drawing one of the sample units not yet selected, given the units already drawn.
When the first unit is to be selected, then there are n units left to be selected in the sample from the population of N units.
So P(u1) = n/N
Proof: Probability of Selection of a Sample: SRSWOR
When the second unit is to be selected, then there are (n – 1) units left to be selected in the sample from the population of (N – 1) units.
So P(u2) = (n – 1)/(N – 1)
When the third unit is to be selected, then there are (n – 2) units left to be selected in the sample from the population of (N – 2) units and so on.
So P(u3) = (n – 2)/(N – 2)
And so on, P(un) = 1/(N – n + 1)
Proof: Probability of Selection of a Sample: SRSWOR
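Thus the probability of their selection is

\[
\begin{aligned}
P(u_1, u_2, \ldots, u_n) &= P(u_1) \cdot P(u_2) \cdots P(u_n) \\
&= \frac{n}{N} \cdot \frac{n-1}{N-1} \cdot \frac{n-2}{N-2} \cdots \frac{1}{N-n+1} \\
&= \frac{1}{\binom{N}{n}}.
\end{aligned}
\]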
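As a numerical check of this product, the following sketch multiplies the factors n/N, (n – 1)/(N – 1), …, 1/(N – n + 1) and compares the result with 1/choose(N, n). The values N = 10 and n = 3 are assumptions chosen for illustration.

```r
# Product of the step-by-step selection probabilities under SRSWOR.
N <- 10                                   # assumed population size
n <- 3                                    # assumed sample size

# Factors n/N, (n-1)/(N-1), ..., 1/(N-n+1), in that order.
step_probs <- (n:1) / (N:(N - n + 1))

prob_product <- prod(step_probs)
prob_formula <- 1 / choose(N, n)

print(step_probs)
print(c(product = prob_product, formula = prob_formula))
```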
Probability of Selection of a Sample
SRSWR
Total number of ways to choose n sampling units out of N sampling units, counting the order of the draws and allowing repeats = N^n
The probability of drawing a sample = 1/N^n
Probability of Selection of a Sample
SRSWR
Suppose N = 3, n = 2, with units 1, 2, 3.
Total samples = N^n = 3^2 = 9
Sample 1: 1, 1    Sample 4: 2, 1    Sample 7: 3, 1
Sample 2: 1, 2    Sample 5: 2, 2    Sample 8: 3, 2
Sample 3: 1, 3    Sample 6: 2, 3    Sample 9: 3, 3
Probability of drawing a sample = 1/9
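The nine samples can also be generated in R. The sketch below uses the base function expand.grid(); the column names first and second are illustrative assumptions, and the columns are arranged so that the rows follow the numbering Sample 1, …, Sample 9 used in this example.

```r
# All SRSWR samples of size n = 2 from the N = 3 units 1, 2, 3.
units <- 1:3

# expand.grid() varies its first argument fastest, so "second" is passed
# first and the columns are reordered afterwards.
all_samples <- expand.grid(second = units, first = units)[, c("first", "second")]

print(all_samples)
print(nrow(all_samples))          # equals length(units)^2
print(1 / nrow(all_samples))      # probability of drawing any one sample
```

Unlike the SRSWOR case, samples such as (1, 1) appear, and (1, 2) and (2, 1) are counted separately.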
Proof: Probability of Selection of a Sample: SRSWR
Let ui be the ith unit selected in the sample.
This unit can be selected in the sample either at 1st draw, 2nd draw,
…, or nth draw.
At any stage, there are always N units in the population in case of SRSWR, so the probability of selection of ui at any stage = 1/N for all i = 1,2,…,n.
Proof: Probability of Selection of a Sample: SRSWR
Then the probability of selection of n units u1, u2,…,un in the
sample is
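
\[
\begin{aligned}
P(u_1, u_2, \ldots, u_n) &= P(u_1) \cdot P(u_2) \cdots P(u_n) \\
&= \frac{1}{N} \cdot \frac{1}{N} \cdots \frac{1}{N} \\
&= \frac{1}{N^n}.
\end{aligned}
\]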
Notations:
The following notation will be used:
N : Number of sampling units in the population (Population size)
n : Number of sampling units in the sample (Sample size)
Y : The characteristic under consideration
Yi : Value of the characteristic for the ith unit of the population
(i = 1, 2,…,N)
yi : Value of the characteristic for the ith unit of the sample
(i = 1, 2,…,n)
Notations: Example
Y: Height of students in a class
N = 10 : Number of students in the class (Population size)
n = 3 : Number of students in the sample (Sample size)
Yi : Height of ith student in the population
Example
Y: Height of students in a class
N = 10 : Number of students in the class (Population size)
n = 3 : Number of students in the sample (Sample size)
Name of student    Yi = Height of student (in centimetres)
A                  Y1 = 151
B                  Y2 = 152
C                  Y3 = 153
D                  Y4 = 154
E                  Y5 = 155
F                  Y6 = 156
G                  Y7 = 157
H                  Y8 = 158
I                  Y9 = 159
J                  Y10 = 160
Notations: Example
Suppose
Y1 = 151 cm, Y2 = 152 cm, Y3 = 153 cm, Y4 = 154 cm, Y5 = 155 cm,
Y6 = 156 cm, Y7 = 157 cm, Y8 = 158 cm, Y9 = 159 cm, Y10 = 160 cm.
yi : Height of ith student in the sample
Selected sample = 3rd, 7th and 9th students
y1 = Y3 = 153 cm, y2 = Y7 = 157 cm, y3 = Y9 = 159 cm.
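The relation between the population values Yi and the sample values yi corresponds to indexing a vector in R. A minimal sketch, with the vector names Y, selected and y chosen here for illustration:

```r
# Population values Y1, ..., Y10 (heights in cm).
Y <- c(151, 152, 153, 154, 155, 156, 157, 158, 159, 160)

# Positions of the selected students: the 3rd, 7th and 9th.
selected <- c(3, 7, 9)

# Sample values y1, y2, y3 are the population values at those positions.
y <- Y[selected]
print(y)
```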
Drawing of sample
Suppose we want to draw a sample of the names or the heights of the students.
The data in R will usually be given in a data frame, CSV file or any other format.
Suppose the data is stored in a data frame heightdata by using the following commands:
# Heights (in cm) and names of the N = 10 students
height=c(151,152,153,154,155,156,157,158,159,160)
name=c("A","B","C","D","E","F","G","H","I","J")
# stringsAsFactors=TRUE stores name as a factor (the default before R 4.0.0)
heightdata=data.frame(name,height,stringsAsFactors=TRUE)
Drawing of sample using R
> heightdata
name height
1 A 151
2 B 152
3 C 153
4 D 154
5 E 155
6 F 156
7 G 157
8 H 158
9 I 159
10 J 160
> names=heightdata$name
> names
[1] A B C D E F G H I J
Levels: A B C D E F G H I J
> heights=heightdata$height
> heights
[1] 151 152 153 154 155 156 157 158 159 160
Drawing of sample using R : SRSWOR
Suppose we want this sample in terms of names of persons.
sample(names, size=5, replace = FALSE)
> sample(names, size=5, replace = FALSE)
[1] G F A B H
Levels: A B C D E F G H I J
Suppose we want this sample in terms of heights of persons.
sample(heights, size=5, replace = FALSE)
> sample(heights, size=5, replace = FALSE)
[1] 152 156 154 155 158
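Because sample() is random, the output above changes from run to run. The sketch below, which is an addition and not part of the lecture, fixes a seed with set.seed() so the draw can be repeated, and samples row positions so that each name stays paired with its height. The seed value and the variable names idx and srswor_sample are assumptions.

```r
# Rebuild the data frame so that this block runs on its own.
heightdata <- data.frame(name = LETTERS[1:10], height = 151:160)

set.seed(123)                     # arbitrary seed, only for reproducibility

# SRSWOR of 5 row positions out of the 10 rows.
idx <- sample(nrow(heightdata), size = 5, replace = FALSE)

# Keep the selected rows: names and heights stay together.
srswor_sample <- heightdata[idx, ]
print(srswor_sample)
```

With replace = FALSE no row can appear twice in the sample.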
Drawing of sample using R : SRSWR
Suppose we want this sample in terms of names of persons.
Sample of size 5
> sample(names, size=5, replace = TRUE)
[1] F F I E A
Levels: A B C D E F G H I J
Sample of size 8
> sample(names, size=8, replace = TRUE)
[1] C C D D J H G E
Levels: A B C D E F G H I J
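One practical difference between the two schemes in R is the allowed sample size. The sketch below is an illustration with assumed names (student_names, wr, wor_possible): with replace = TRUE the sample may be larger than the population, while with replace = FALSE the same request stops with an error.

```r
student_names <- LETTERS[1:10]    # the ten names A, ..., J

set.seed(1)                       # arbitrary seed, only for reproducibility

# SRSWR: 12 draws from 10 names are allowed, so some name must repeat.
wr <- sample(student_names, size = 12, replace = TRUE)
print(wr)

# SRSWOR: the same request is an error, caught here and turned into FALSE.
wor_possible <- tryCatch({
  sample(student_names, size = 12, replace = FALSE)
  TRUE
}, error = function(e) FALSE)
print(wor_possible)
```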
Estimation of population mean: Notations
Y1, Y2, ..., YN : Population
y1, y2, ..., yn : Sample

Population mean:
\[
\bar{Y} = \frac{1}{N} \sum_{i=1}^{N} Y_i
\]

Sample mean:
\[
\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i
\]
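Both means can be computed with mean() in R. The sketch below reuses the height example from the earlier slides, with the 3rd, 7th and 9th students as the sample; the variable names Y, y, Ybar and ybar are chosen here for illustration.

```r
# Population: heights (in cm) of the N = 10 students.
Y <- c(151, 152, 153, 154, 155, 156, 157, 158, 159, 160)

# Sample: the 3rd, 7th and 9th students (n = 3).
y <- Y[c(3, 7, 9)]

Ybar <- mean(Y)                   # population mean, sum(Y) / N
ybar <- mean(y)                   # sample mean, sum(y) / n

print(c(population_mean = Ybar, sample_mean = ybar))
```

For this particular sample the sample mean (469/3) differs from the population mean (155.5), as it generally will for a single sample.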