MATH 264
Statistics for
Social Sciences
Chapter 6
Discrete Random Variables and
Probability Distributions
Definition
A random variable is a variable that takes on
numerical values determined by the outcome of a
random experiment.
Introduction to
Probability Distributions
Random Variable
Represents a possible numerical value from
a random experiment
Random variables are of two types:
Discrete Random Variable (Ch. 6)
Continuous Random Variable (Ch. 7)
Discrete Random Variables
Can only take on a countable number of values
Examples:
Roll a die twice
Let X be the number of times 4 comes up
(then X could be 0, 1, or 2 times)
Toss a coin 5 times.
Let X be the number of heads
(then X = 0, 1, 2, 3, 4, or 5)
Definition
A Probability Distribution Function, P(x), of a
discrete random variable X expresses the probability that the
random variable X takes the value x, as a function of
x.
That is,
$P(x) = P(X = x)$ for all values of x
Discrete Probability Distribution
Experiment: Toss 2 Coins. Let X = # heads.
Show P(x) , i.e., P(X = x) , for all values of x:
4 possible outcomes: TT, TH, HT, HH
Probability Distribution
x Value    Outcomes    Probability
0          TT          1/4 = .25
1          TH, HT      2/4 = .50
2          HH          1/4 = .25
[Figure: bar chart of Probability against x, with bars of height .25, .50, and .25 at x = 0, 1, 2]
Probability Distribution
Required Properties
$P(x) \ge 0$ for any value of x
The individual probabilities sum to 1:
$\sum_x P(x) = 1$
(The notation indicates summation over all possible x values)
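The two-coin distribution and the two required properties can be checked by direct enumeration. This is a minimal sketch, assuming fair coins; the names `outcomes` and `pmf` are chosen here for illustration only.

```python
from fractions import Fraction
from itertools import product

# Enumerate the 4 equally likely outcomes of tossing 2 fair coins.
outcomes = list(product("HT", repeat=2))

# X = number of heads; each outcome contributes probability 1/4.
pmf = {}
for outcome in outcomes:
    x = outcome.count("H")
    pmf[x] = pmf.get(x, Fraction(0)) + Fraction(1, len(outcomes))

# Required properties: P(x) >= 0 for every x, and the probabilities sum to 1.
assert all(p >= 0 for p in pmf.values())
assert sum(pmf.values()) == 1

for x in sorted(pmf):
    print(x, pmf[x])
```

Exact fractions are used so that the sum-to-one check is not affected by floating-point rounding.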
Cumulative Probability Function
The cumulative probability function, denoted
F(x0), shows the probability that X is less than or
equal to x0
$F(x_0) = P(X \le x_0)$
In other words,
$F(x_0) = \sum_{x \le x_0} P(x)$
Properties of F(x0)
0 ≤ F(x0) ≤ 1 for every number x0
If x0 and x1 are two numbers with x0 < x1, then
F(x0) ≤ F(x1)
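As a small illustration, the cumulative probability function of the two-coin example can be built as a running total of P(x). This is a sketch only; the helper name `F` and the list names are assumptions made here.

```python
from itertools import accumulate

# Two-coin example from the text: P(0) = .25, P(1) = .50, P(2) = .25
xs = [0, 1, 2]
probs = [0.25, 0.50, 0.25]

# F(x0) at each possible value is a running total of the probabilities.
cdf = dict(zip(xs, accumulate(probs)))

def F(x0):
    """Cumulative probability P(X <= x0) for any real number x0."""
    return sum(p for x, p in zip(xs, probs) if x <= x0)

print(cdf)                 # F at x = 0, 1, 2
print(F(-1), F(1.5), F(2)) # F is 0 below the smallest value and 1 at the largest
```

Note that F stays flat between possible values of X, for example F(1) and F(1.5) are equal.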
Descriptive Measures
Functions of Random Variables
If P(x) is the probability function of a discrete
random variable X , and g(X) is some function of
X , then the expected value of function g is
$E[g(X)] = \sum_x g(x) P(x)$
Note that the expected value is a weighted average of g(x), with the probabilities P(x) as weights
Expected Value
Expected Value (or mean) of a discrete
distribution (Weighted Average)
$\mu = E(x) = \sum_x x P(x)$
Example: Toss 2 coins, x = # of heads, compute expected value of x:
x    P(x)
0    .25
1    .50
2    .25
E(x) = (0 × .25) + (1 × .50) + (2 × .25) = 1.0
Variance and Standard
Deviation
Variance of a discrete random variable X
$\sigma^2 = E[(X - \mu)^2] = \sum_x (x - \mu)^2 P(x)$
Standard Deviation of a discrete random variable X
$\sigma = \sqrt{\sigma^2} = \sqrt{\sum_x (x - \mu)^2 P(x)}$
Shortcut for variance formula
$\mathrm{Var}(X) = \sigma^2 = E(X^2) - \mu^2 = \sum_x x^2 P(x) - \mu^2$
Standard Deviation Example
Example: Toss 2 coins, X = # heads,
compute standard deviation (recall E(x) = 1)
$\sigma = \sqrt{\sum_x (x - \mu)^2 P(x)}$
$\sigma = \sqrt{(0 - 1)^2(.25) + (1 - 1)^2(.50) + (2 - 1)^2(.25)} = \sqrt{.50} = .707$
Possible number of heads = 0, 1, or 2
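The mean, variance, and standard deviation of the two-coin example can be reproduced in a few lines. This is a minimal sketch; the function names `mean`, `variance`, and `variance_shortcut` are hypothetical helpers, not part of the course material.

```python
import math

def mean(pmf):
    """Expected value: sum of x * P(x)."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Variance from the definition: sum of (x - mu)^2 * P(x)."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

def variance_shortcut(pmf):
    """Variance from the shortcut: E(X^2) - mu^2."""
    return sum(x ** 2 * p for x, p in pmf.items()) - mean(pmf) ** 2

two_coins = {0: 0.25, 1: 0.50, 2: 0.25}
mu = mean(two_coins)
var = variance(two_coins)
sd = math.sqrt(var)
print(mu, var, round(sd, 3))
```

Both variance functions should agree up to floating-point rounding, which is a quick way to check the shortcut formula.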
Rules for Expectation & Variance
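E(c) = c                      V(c) = 0
E(cX) = cE(X)                 V(cX) = c²V(X)
E(cX + Y) = cE(X) + E(Y)      V(cX + Y) = c²V(X) + V(Y)
E(X − cY) = E(X) − cE(Y)      V(X − cY) = V(X) + c²V(Y)
where X and Y are random variables and c is a constant.
The variance rules for cX + Y and X − cY hold when X and Y are independent.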
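These rules can be checked numerically on a small example. The sketch below assumes X is the number of heads in two fair coin tosses, Y is a single independent fair coin toss, and c = 3; all three choices, and the helper names `E`, `V`, and `combine`, are illustrative assumptions. Independence of X and Y is assumed when the joint probabilities are formed as products.

```python
from fractions import Fraction as Fr
from itertools import product

# Illustrative distributions (assumed, not from the slides).
X = {0: Fr(1, 4), 1: Fr(1, 2), 2: Fr(1, 4)}  # heads in two fair coin tosses
Y = {0: Fr(1, 2), 1: Fr(1, 2)}               # one fair coin toss
c = 3

def E(pmf):
    """Expected value of a discrete distribution given as {value: probability}."""
    return sum(v * p for v, p in pmf.items())

def V(pmf):
    """Variance of a discrete distribution given as {value: probability}."""
    mu = E(pmf)
    return sum((v - mu) ** 2 * p for v, p in pmf.items())

def combine(f):
    """Distribution of f(x, y) when X and Y are independent."""
    out = {}
    for (x, px), (y, py) in product(X.items(), Y.items()):
        out[f(x, y)] = out.get(f(x, y), 0) + px * py
    return out

Z = combine(lambda x, y: c * x + y)  # cX + Y
W = combine(lambda x, y: x - c * y)  # X - cY

print(E(Z), c * E(X) + E(Y))
print(V(Z), c ** 2 * V(X) + V(Y))
print(E(W), E(X) - c * E(Y))
print(V(W), V(X) + c ** 2 * V(Y))
```

Each printed pair should match exactly, since the arithmetic is done with fractions.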
Solve CE4,1 and CE4,2
CE4,1: Consider the following probability distribution function (pdf). Let X denote the number of cars sold in a
day.
X 0 1 2 3 4 5 6
P(X) 0.07 0.19 0.23 0.17 0.16 0.14 0.04
a) P(X>3) = ?
b) P(2<X<5) = ?
c) P(X≥2) = ?
d) P(X<6) = ?
CE4,2: Suppose that the PDF for the number of errors, X, on pages from a business textbook is
P(0) = 0.81 P(1) = 0.17 P(2) = 0.02
Find the mean and standard deviation of the number of errors per page.
Solution: CE4,3
CE4,3: Consider the following probability distribution function (pdf). Let X denote the number of cars
sold in a day.
X 0 1 2 3 4 5 6
P(X) 0.07 0.19 0.23 0.17 0.16 0.14 0.04
Find the mean and standard deviation.
Mean: $\mu = E(X) = \sum_x x P(x) = (0)(0.07) + (1)(0.19) + \dots + (6)(0.04) = 2.74 \approx 3$ cars will be sold on any
random day
Variance: $\sigma^2 = \sum_x (x - \mu)^2 P(x) = (0 - 2.74)^2(0.07) + (1 - 2.74)^2(0.19) + \dots + (6 - 2.74)^2(0.04) = 2.63$
Standard deviation: $\sigma = \sqrt{2.63} = 1.62 \approx 2$ cars is the expected deviation from the center
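The arithmetic in this solution can be verified with a short script. This is only a sketch; the dictionary name `cars` is an assumption made here.

```python
import math

# Distribution of X = number of cars sold in a day (from CE4,3).
cars = {0: 0.07, 1: 0.19, 2: 0.23, 3: 0.17, 4: 0.16, 5: 0.14, 6: 0.04}

mu = sum(x * p for x, p in cars.items())               # mean
var = sum((x - mu) ** 2 * p for x, p in cars.items())  # variance
sd = math.sqrt(var)                                    # standard deviation

print(round(mu, 2), round(var, 2), round(sd, 2))  # prints 2.74 2.63 1.62
```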
Probability Distributions
Discrete Probability Distributions (Ch. 6): Bernoulli, Binomial, Hypergeometric
Continuous Probability Distributions (Ch. 7): Normal
Bernoulli Distribution
Consider only two outcomes: “success” or “failure”
Let P denote the probability of success
Let 1 – P be the probability of failure
Define random variable X:
x = 1 if success, x = 0 if failure
Then the Bernoulli probability function is
$P(0) = 1 - P$ and $P(1) = P$
Bernoulli Distribution
Mean and Variance
The mean is µ = P
$\mu = E(X) = \sum_x x P(x) = (0)(1 - P) + (1)P = P$
The variance is σ² = P(1 – P)
$\sigma^2 = E[(X - \mu)^2] = \sum_x (x - \mu)^2 P(x) = (0 - P)^2 (1 - P) + (1 - P)^2 P = P(1 - P)$
Binomial Distribution
A fixed number of observations, n
e.g., 15 tosses of a coin; ten light bulbs taken from a warehouse
Two mutually exclusive and collectively exhaustive
categories
e.g., head or tail in each toss of a coin; defective or not defective
light bulb
Generally called “success” and “failure”
Probability of success is P , probability of failure is 1 – P
Constant probability for each observation
e.g., Probability of getting a tail is the same each time we toss
the coin
Observations are independent
The outcome of one observation does not affect the outcome of
any other
Possible Binomial Distribution
Settings
A manufacturing plant labels items as
either defective or acceptable
A firm bidding for contracts will either get a
contract or not
A marketing research firm receives survey
responses of “yes I will buy” or “no I will
not”
New job applicants either accept the offer
or reject it
Developing Binomial Distribution
Let's repeat a success/failure experiment n independent times.
Assume that we obtain the sequence below, with x successes followed by n − x failures.
S S S……S S F F F……F F
The probability of observing this sequence is
p·p·…·p·(1−p)·(1−p)·…·(1−p)
Since X is defined as the number of successes, we can write this as
$p^x (1-p)^{n-x}$
Is this the only possible sequence that we can observe?
How many different sequences can be observed?
Sequences of x Successes
in n Trials
The number of sequences with x successes in n
independent trials is:
$C^n_x = \frac{n!}{x!\,(n - x)!}$
Where n! = n·(n – 1)·(n – 2)· . . . ·1 and 0! = 1
These sequences are mutually exclusive, since no two
can occur at the same time
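The counting formula can be compared against a brute-force listing of sequences. The sketch below uses n = 4 and x = 2 as illustrative values (an assumption made here) and Python's standard `math.comb` and `math.factorial`.

```python
from itertools import product
from math import comb, factorial

n, x = 4, 2  # illustrative values (assumed)

# Brute force: every length-n string of S/F with exactly x successes.
sequences = ["".join(s) for s in product("SF", repeat=n) if s.count("S") == x]

# The formula n! / (x! (n - x)!), and the built-in equivalent.
by_formula = factorial(n) // (factorial(x) * factorial(n - x))

print(sequences)
print(len(sequences), by_formula, comb(n, x))
```

All three counts should agree; for these values each equals 6.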
Developing Binomial Distribution
We can observe $C^n_x$ different sequences with x successes, each with probability $p^x (1-p)^{n-x}$, so
$P(x) = C^n_x \, p^x (1-p)^{n-x}$
Example
The random variable X represents the number of students
who prefer news from the Internet among a random
sample of students from a large university. The
population proportion of students who prefer Internet
news is 0.6. Relabel the outcome “Internet” as
a success (S) and “not Internet” as a failure (F). List the
elementary outcomes of sampling 4 students, together with the
value of X and the associated probability for each.
Example: Answer
Value of X   Sample points (basic outcomes)   Probability of each outcome   Number of outcomes   P(X)
0            FFFF                             $0.6^0 \cdot 0.4^4$           $C^4_0 = 1$          $0.6^0 \cdot 0.4^4$
1            SFFF FSFF FFSF FFFS              $0.6^1 \cdot 0.4^3$           $C^4_1 = 4$          $4 \cdot 0.6^1 \cdot 0.4^3$
2            SSFF SFSF SFFS FSSF FSFS FFSS    $0.6^2 \cdot 0.4^2$           $C^4_2 = 6$          $6 \cdot 0.6^2 \cdot 0.4^2$
3            SSSF SSFS SFSS FSSS              $0.6^3 \cdot 0.4^1$           $C^4_3 = 4$          $4 \cdot 0.6^3 \cdot 0.4^1$
4            SSSS                             $0.6^4 \cdot 0.4^0$           $C^4_4 = 1$          $0.6^4 \cdot 0.4^0$
In general,
$P(x) = C^n_x \, p^x (1-p)^{n-x}$
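The same result can be obtained by enumerating all 16 sample points for n = 4 and p = 0.6 and adding up their probabilities by value of X. This is a minimal sketch; the names `pmf_enum` and `pmf_formula` are chosen here for illustration.

```python
from itertools import product
from math import comb

p, n = 0.6, 4

# Add up the probability of every S/F sequence, grouped by number of successes.
pmf_enum = {x: 0.0 for x in range(n + 1)}
for outcome in product("SF", repeat=n):
    x = outcome.count("S")
    pmf_enum[x] += p ** x * (1 - p) ** (n - x)

# Same distribution from (number of sequences) * (probability of each sequence).
pmf_formula = {x: comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)}

for x in range(n + 1):
    print(x, round(pmf_enum[x], 4), round(pmf_formula[x], 4))
```

Rounded to four decimals, the probabilities for X = 0, 1, 2, 3, 4 are 0.0256, 0.1536, 0.3456, 0.3456, and 0.1296.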
Binomial Distribution Formula
$P(x) = \frac{n!}{x!\,(n - x)!} P^x (1 - P)^{n - x}$
P(x) = probability of x successes in n trials,
with probability of success P on each trial
x = number of ‘successes’ in sample (x = 0, 1, 2, ..., n)
n = sample size (number of trials or observations)
P = probability of “success”
Example: Flip a coin four times, let x = # heads:
n = 4
P = 0.5
1 - P = (1 - 0.5) = 0.5
x = 0, 1, 2, 3, 4
Example:
Calculating a Binomial Probability
What is the probability of one success in five
observations if the probability of success is 0.1?
x = 1, n = 5, and P = 0.1
$P(x = 1) = \frac{n!}{x!\,(n - x)!} P^x (1 - P)^{n - x}$
$= \frac{5!}{1!\,(5 - 1)!} (0.1)^1 (1 - 0.1)^{5 - 1}$
$= (5)(0.1)(0.9)^4$
$= .32805$
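The calculation above can be wrapped in a small reusable function. This is a sketch under the stated formula; `binomial_pmf` is a hypothetical helper name, and `math.comb` supplies the number of sequences.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials,
    each with success probability p."""
    if not 0 <= x <= n:
        return 0.0
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

prob = binomial_pmf(1, 5, 0.1)
print(round(prob, 5))  # prints 0.32805
```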
Binomial Distribution
The shape of the binomial distribution depends on the
values of P and n
[Figure: two bar charts of P(x) against x = 0, 1, ..., 5, with the mean marked. First panel: n = 5 and P = 0.1. Second panel: n = 5 and P = 0.5.]
Binomial Distribution
Mean and Variance
Mean
$\mu = E(x) = nP$
Variance and Standard Deviation
$\sigma^2 = nP(1 - P)$
$\sigma = \sqrt{nP(1 - P)}$
Where n = sample size
P = probability of success
(1 – P) = probability of failure
Binomial Characteristics
Examples
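For n = 5 and P = 0.1:
$\mu = nP = (5)(0.1) = 0.5$
$\sigma = \sqrt{nP(1 - P)} = \sqrt{(5)(0.1)(1 - 0.1)} = 0.6708$
For n = 5 and P = 0.5:
$\mu = nP = (5)(0.5) = 2.5$
$\sigma = \sqrt{nP(1 - P)} = \sqrt{(5)(0.5)(1 - 0.5)} = 1.118$
[Figure: bar chart of P(x) against x = 0, 1, ..., 5 for each case, with the mean marked]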
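The formulas μ = nP and σ = √(nP(1 − P)) can be compared with the mean and standard deviation computed directly from the binomial probabilities. The sketch below does this for the two cases shown; `binomial_summary` is a hypothetical helper name.

```python
import math
from math import comb

def binomial_summary(n, p):
    """Mean and standard deviation computed directly from the binomial pmf."""
    pmf = [comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)]
    mu = sum(x * q for x, q in enumerate(pmf))
    var = sum((x - mu) ** 2 * q for x, q in enumerate(pmf))
    return mu, math.sqrt(var)

for n, p in [(5, 0.1), (5, 0.5)]:
    mu, sd = binomial_summary(n, p)
    # Compare with the closed-form results nP and sqrt(nP(1 - P)).
    print(n, p, round(mu, 4), round(sd, 4), n * p, round(math.sqrt(n * p * (1 - p)), 4))
```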
CE4,4
The Cubs are to play a series of 5 games in St. Louis
against the Cardinals. For any one game it is estimated
that the probability of a Cubs win is 0.4. The
outcomes of the five games are independent of one
another.
Solution: CE4,4
a) What is the probability that the Cubs will win all five games?
$P(X = 5) = C^5_5 (0.4)^5 (1 - 0.4)^0 = 0.0102 \approx 1\%$
b) What is the probability that the Cubs will win a majority of the five games?
$P(X \ge 3) = P(X = 3) + P(X = 4) + P(X = 5)$
$= C^5_3 (0.4)^3 (1 - 0.4)^2 + C^5_4 (0.4)^4 (1 - 0.4)^1 + C^5_5 (0.4)^5 (1 - 0.4)^0 = 0.3174 \approx 32\%$
c) If the Cubs win the first game, what is the probability that they will win a majority of the five games?
The Cubs have won the first game, so 4 games remain and they need at least 2 more wins. With X = number of wins in the remaining 4 games (n = 4):
$P(X \ge 2) = 0.5248 \approx 52\%$
d) Before the series begins, what is the expected number of Cubs wins in these five games?
Mean = nP = 5(0.4) = 2 games
e) If the Cubs win the first game, what is the expected number of Cubs wins in these five games?
The Cubs have won the first game, so 4 games remain. Let X = number of wins in the remaining 4 games and Y = 1 + X.
Mean = E(Y) = E(1 + X) = 1 + E(X) = 1 + (4)(0.4) = 2.6 ≈ 3 games
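The answers to CE4,4 can be checked numerically. This is a sketch only; `binomial_pmf` is a hypothetical helper, and the variable names `a` to `e` simply mirror the parts of the exercise.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

p = 0.4  # probability of a Cubs win in any one game

a = binomial_pmf(5, 5, p)                            # win all five games
b = sum(binomial_pmf(x, 5, p) for x in range(3, 6))  # win at least 3 of 5
c = sum(binomial_pmf(x, 4, p) for x in range(2, 5))  # at least 2 of the remaining 4
d = 5 * p                                            # expected wins in 5 games
e = 1 + 4 * p                                        # one win already, plus 4 more games

print(round(a, 4), round(b, 4), round(c, 4), d, round(e, 1))
```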
The Hypergeometric Distribution
“n” trials in a sample taken from a finite
population of size N
Sample taken without replacement
Outcomes of trials are dependent
Concerned with finding the probability of “X”
successes in the sample where there are “S”
successes in the population
Hypergeometric Distribution
Formula
$P(x) = \frac{C^S_x \, C^{N-S}_{n-x}}{C^N_n} = \frac{\dfrac{S!}{x!\,(S - x)!} \cdot \dfrac{(N - S)!}{(n - x)!\,(N - S - n + x)!}}{\dfrac{N!}{n!\,(N - n)!}}$
Where
N = population size
S = number of successes in the population
N – S = number of failures in the population
n = sample size
x = number of successes in the sample
n – x = number of failures in the sample
Hypergeometric Distribution
Mean and Variance
Mean = E(X) = μ = n(S/N)
Variance = Var(X) = σ² = n(S/N)(1 − S/N)((N − n)/(N − 1))
where (N − n)/(N − 1) is the finite population correction.
Use the hypergeometric distribution when sampling from a finite population
without replacement with n/N > 0.05, so that the
probability of a success changes from trial to trial.
Otherwise, the binomial distribution can be used as an approximation.
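The mean and variance formulas can be compared with values computed directly from the hypergeometric probabilities. The sketch below assumes illustrative values N = 10, S = 4, n = 3; the helper name `hypergeom_pmf` is hypothetical.

```python
from math import comb

def hypergeom_pmf(x, N, S, n):
    """P(X = x): x successes in a sample of n drawn without replacement
    from a population of N items containing S successes."""
    return comb(S, x) * comb(N - S, n - x) / comb(N, n)

# Illustrative values (assumed here).
N, S, n = 10, 4, 3
support = range(max(0, n - (N - S)), min(n, S) + 1)

# Mean and variance computed directly from the probabilities.
mean_from_pmf = sum(x * hypergeom_pmf(x, N, S, n) for x in support)
var_from_pmf = sum((x - mean_from_pmf) ** 2 * hypergeom_pmf(x, N, S, n) for x in support)

# Mean and variance from the formulas, including the finite population correction.
mean_formula = n * S / N
var_formula = n * (S / N) * (1 - S / N) * (N - n) / (N - 1)

print(round(mean_from_pmf, 4), round(mean_formula, 4))
print(round(var_from_pmf, 4), round(var_formula, 4))
```

Without the factor (N − n)/(N − 1), the variance would be the binomial value n(S/N)(1 − S/N), which is larger here.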
Using the
Hypergeometric Distribution
■ Example: 3 different computers are checked from 10 in
the department. 4 of the 10 computers have illegal
software loaded. What is the probability that 2 of the 3
selected computers have illegal software loaded?
N = 10, n = 3
S = 4, x = 2
$P(x = 2) = \frac{C^S_x \, C^{N-S}_{n-x}}{C^N_n} = \frac{C^4_2 \, C^6_1}{C^{10}_3} = \frac{(6)(6)}{120} = 0.3$
The probability that 2 of the 3 selected computers have illegal
software loaded is 0.30, or 30%.
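The computer example can be reproduced with a short function built on `math.comb`. This is a minimal sketch; `hypergeom_pmf` is a hypothetical helper name.

```python
from math import comb

def hypergeom_pmf(x, N, S, n):
    """P(X = x): x successes in a sample of n drawn without replacement
    from a population of N items containing S successes."""
    return comb(S, x) * comb(N - S, n - x) / comb(N, n)

# 10 computers, 4 with illegal software, 3 checked, 2 found with illegal software.
prob = hypergeom_pmf(2, N=10, S=4, n=3)
print(round(prob, 2))  # prints 0.3

# Full distribution of X for this example; the probabilities sum to 1.
dist = {x: hypergeom_pmf(x, 10, 4, 3) for x in range(0, 4)}
print({x: round(q, 4) for x, q in dist.items()})
```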
CE4,5
A company receives a shipment of 16 items. A random
sample of 4 items is selected, and the shipment is
rejected if any of these items proves to be defective.
N=16 n=4
X=number of defectives in the sample
Decision rule: Reject shipment if X≥1 and accept
shipment if X=0
Solution: CE4,5
N=16 n=4 X=number of defectives in the sample
Decision rule: Reject shipment if X≥1 and accept shipment if X=0
a) What is the probability of accepting a shipment containing 4 defective items?
S = 4
$P(\text{accept}) = P(X = 0) = \frac{C^4_0 \, C^{12}_4}{C^{16}_4} = \frac{(1)(495)}{1820} = 0.27$
b) What is the probability of accepting a shipment containing 1 defective item?
S = 1
$P(\text{accept}) = P(X = 0) = \frac{C^1_0 \, C^{15}_4}{C^{16}_4} = \frac{(1)(1365)}{1820} = 0.75$
c) What is the probability of rejecting a shipment containing 1 defective item?
S = 1
P(reject) = 1 − P(accept) = 1 − 0.75 = 0.25
Discuss: CE4,6 & CE4,7
CE4,6: An analyst predicted that 3.5% of all small corporations
would file for bankruptcy in the coming year. For a random sample
of 100 small corporations, estimate the probability that at least 3
will file for bankruptcy in the next year.
CE4,7: An auditor reviewing the invoices of a small company finds
that there are errors in 1.5% of them. If the auditor looks at 500
invoices, what is the probability that he finds more than 1 invoice
with errors?
Chapter Summary
Defined discrete random variables and
probability distributions
Discussed descriptive measures
Discussed the Bernoulli distribution
Discussed the Binomial distribution
Discussed the Hypergeometric distribution