5 Joint Probability Distributions and Random Samples
Week 5, 2011 Stat 4570/5570
Material from Devore’s book (Ed 8), and Cengage
Covariance
When two random variables X and Y are not independent,
it is frequently of interest to assess how strongly they are
related to one another.
The covariance between two rv’s X and Y is
Cov(X, Y) = E[(X – µX)(Y – µY)]
= Σx Σy (x – µX)(y – µY) p(x, y) (X, Y discrete)
= ∫∫ (x – µX)(y – µY) f(x, y) dx dy (X, Y continuous)
Covariance
If both variables tend to deviate in the same direction (both
go above their means or below their means at the same
time), then the covariance will be positive. If the opposite is
true, the covariance will be negative.
If X and Y are not strongly related, the covariance will be
near 0.
Covariance shortcut
The following shortcut formula for Cov(X, Y) simplifies the
computations.
Proposition
Cov(X, Y) = E(XY) – µX µY
According to this formula, no intermediate subtractions are
necessary; only at the end of the computation is µX µY
subtracted from E(XY).
This is analogous to the “shortcut” for the variance
computation we saw earlier.
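The shortcut follows from expanding the product inside the expectation and using linearity of expectation. A short derivation, written with the same symbols as the proposition above:

```latex
\begin{aligned}
\operatorname{Cov}(X, Y) &= E[(X - \mu_X)(Y - \mu_Y)] \\
&= E[XY - \mu_Y X - \mu_X Y + \mu_X \mu_Y] \\
&= E(XY) - \mu_Y E(X) - \mu_X E(Y) + \mu_X \mu_Y \\
&= E(XY) - \mu_Y \mu_X - \mu_X \mu_Y + \mu_X \mu_Y \\
&= E(XY) - \mu_X \mu_Y
\end{aligned}
```

Setting Y = X in the last line gives V(X) = E(X²) – µX², which is the variance shortcut.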
Covariance
The covariance depends on both the set of possible pairs
and the probabilities of those pairs.
Below are examples of 3 types of “co-varying”:
Figure 5.4: (a) positive covariance; (b) negative covariance; (c) covariance near zero
Example 1
An insurance agency services customers who have both a
homeowner’s policy and an automobile policy. For each
type of policy, a deductible amount must be specified.
For an automobile policy, the choices are $100 and $250,
whereas for a homeowner’s policy, the choices are $0,
$100, and $200.
Suppose an individual, Bob, is selected at random from
the agency’s files. Let X = his deductible amount on the
auto policy and Y = his deductible amount on the
homeowner’s policy.
Example 1 cont’d
Suppose the joint pmf is given by the insurance company in
the accompanying joint probability table:
What is the covariance between X and Y?
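A minimal Python sketch of the computation. The joint probability table did not survive in these notes, so the values below are an assumption: they are the ones used for this insurance example in Devore's text (Example 5.1) and should be replaced if the slide's table differs. The helper name `expectation` is hypothetical.

```python
# ASSUMED joint pmf p(x, y): values taken from Devore's Example 5.1,
# because the table from the slide is missing. Keys are (x, y) pairs.
joint_pmf = {
    (100, 0): 0.20, (100, 100): 0.10, (100, 200): 0.20,
    (250, 0): 0.05, (250, 100): 0.15, (250, 200): 0.30,
}

def expectation(pmf, g):
    """E[g(X, Y)] for a discrete joint pmf given as {(x, y): probability}."""
    return sum(g(x, y) * p for (x, y), p in pmf.items())

mu_x = expectation(joint_pmf, lambda x, y: x)
mu_y = expectation(joint_pmf, lambda x, y: y)

# Definition: Cov(X, Y) = E[(X - mu_X)(Y - mu_Y)]
cov_definition = expectation(joint_pmf, lambda x, y: (x - mu_x) * (y - mu_y))

# Shortcut: Cov(X, Y) = E(XY) - mu_X * mu_Y
cov_shortcut = expectation(joint_pmf, lambda x, y: x * y) - mu_x * mu_y

print(mu_x, mu_y, cov_definition, cov_shortcut)
```

Under the assumed table, µX = 175, µY = 125, E(XY) = 23,750, and both routes give Cov(X, Y) = 1875, up to floating-point rounding.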
Correlation
Definition
The correlation coefficient of X and Y, denoted by
Corr(X, Y), ρX,Y, or just ρ, is defined by
ρX,Y = Cov(X, Y) / (σX σY)
It represents a “scaled” covariance: correlation always lies
between –1 and 1.
Example
In the insurance example, what is the correlation between
X and Y?
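One way to work this out in Python is sketched below. As before, the joint pmf is an assumption (the values from Devore's Example 5.1, since the slide's table is missing), and `expectation` is a hypothetical helper.

```python
import math

# ASSUMED joint pmf p(x, y): Devore's Example 5.1 values, not the slide's own table.
joint_pmf = {
    (100, 0): 0.20, (100, 100): 0.10, (100, 200): 0.20,
    (250, 0): 0.05, (250, 100): 0.15, (250, 200): 0.30,
}

def expectation(pmf, g):
    """E[g(X, Y)] for a discrete joint pmf given as {(x, y): probability}."""
    return sum(g(x, y) * p for (x, y), p in pmf.items())

mu_x = expectation(joint_pmf, lambda x, y: x)
mu_y = expectation(joint_pmf, lambda x, y: y)

# Shortcut formulas for the covariance and the two variances.
cov_xy = expectation(joint_pmf, lambda x, y: x * y) - mu_x * mu_y
var_x = expectation(joint_pmf, lambda x, y: x * x) - mu_x ** 2
var_y = expectation(joint_pmf, lambda x, y: y * y) - mu_y ** 2

# Corr(X, Y) = Cov(X, Y) / (sigma_X * sigma_Y)
rho = cov_xy / (math.sqrt(var_x) * math.sqrt(var_y))

print(round(rho, 3))
```

With the assumed table, σX = 75 and σY ≈ 82.9, so ρ ≈ 0.30: a positive but fairly weak linear relationship.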
Correlation
Propositions
1. Cov(aX + b, cY + d) = ac Cov(X, Y)
2. Corr(aX + b, cY + d) = sgn(ac) Corr(X, Y), for a ≠ 0 and c ≠ 0
3. For any two rv’s X and Y, –1 ≤ Corr(X, Y) ≤ 1
4. ρ = 1 or –1 iff Y = aX + b for some numbers a and b with
a ≠ 0.
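Propositions 1 and 2 can be checked numerically on a small example. The sketch below uses a made-up joint pmf and made-up constants a, b, c, d, chosen only for illustration. All function names are hypothetical.

```python
import math

def expectation(pmf, g):
    """E[g(X, Y)] for a discrete joint pmf given as {(x, y): probability}."""
    return sum(g(x, y) * p for (x, y), p in pmf.items())

def covariance(pmf):
    mu_x = expectation(pmf, lambda x, y: x)
    mu_y = expectation(pmf, lambda x, y: y)
    return expectation(pmf, lambda x, y: x * y) - mu_x * mu_y

def correlation(pmf):
    mu_x = expectation(pmf, lambda x, y: x)
    mu_y = expectation(pmf, lambda x, y: y)
    var_x = expectation(pmf, lambda x, y: x * x) - mu_x ** 2
    var_y = expectation(pmf, lambda x, y: y * y) - mu_y ** 2
    return covariance(pmf) / math.sqrt(var_x * var_y)

def linear_transform(pmf, a, b, c, d):
    """Joint pmf of (aX + b, cY + d); requires a != 0 and c != 0."""
    return {(a * x + b, c * y + d): p for (x, y), p in pmf.items()}

# Hypothetical joint pmf on {0, 1} x {0, 1}, for illustration only.
pmf = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

# Hypothetical constants; ac = -1 < 0, so the correlation should flip sign.
a, b, c, d = -2, 3, 0.5, 10
transformed = linear_transform(pmf, a, b, c, d)

print(covariance(pmf), covariance(transformed))    # second should equal a*c times the first
print(correlation(pmf), correlation(transformed))  # second should equal sgn(ac) times the first
```

The shifts b and d drop out entirely. Only the product ac rescales the covariance, and only its sign survives in the correlation.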
Correlation
If X and Y are independent, then ρ = 0, but ρ = 0 does
not imply independence.
The correlation coefficient ρ is a measure of the linear
relationship between X and Y, and only when the two
variables are perfectly related in a linear manner will ρ be
as positive or negative as it can be.
A ρ less than 1 in absolute value indicates only that the
relationship is not completely linear, but there may still be a
very strong nonlinear relation.
Correlation
Also, ρ = 0 does not imply that X and Y are independent,
but only that there is a complete absence of a linear
relationship.
When ρ = 0, X and Y are said to be uncorrelated.
Two variables could be uncorrelated yet highly dependent
because there is a strong nonlinear relationship, so be
careful not to conclude too much from knowing that ρ = 0.
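A standard illustration of this point, sketched in Python: let X take the values –1, 0, 1 with equal probability and let Y = X². Y is completely determined by X, yet the covariance is exactly zero. This distribution is an illustrative choice, not one from the slides. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# X is uniform on {-1, 0, 1} and Y = X**2 (illustrative choice).
third = Fraction(1, 3)
joint_pmf = {(-1, 1): third, (0, 0): third, (1, 1): third}

def expectation(pmf, g):
    """E[g(X, Y)] for a discrete joint pmf given as {(x, y): probability}."""
    return sum(g(x, y) * p for (x, y), p in pmf.items())

mu_x = expectation(joint_pmf, lambda x, y: x)
mu_y = expectation(joint_pmf, lambda x, y: y)
cov_xy = expectation(joint_pmf, lambda x, y: x * y) - mu_x * mu_y

# Independence would require p(x, y) = pX(x) * pY(y) for every pair; check (0, 0).
p_x0 = sum(p for (x, y), p in joint_pmf.items() if x == 0)
p_y0 = sum(p for (x, y), p in joint_pmf.items() if y == 0)
p_joint_00 = joint_pmf.get((0, 0), Fraction(0))

print("Cov(X, Y) =", cov_xy)
print("p(0, 0) =", p_joint_00, "but pX(0) * pY(0) =", p_x0 * p_y0)
```

Here Cov(X, Y) = 0, so ρ = 0, while p(0, 0) = 1/3 differs from pX(0) pY(0) = 1/9, so X and Y are not independent.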
Interpreting Correlation
xkcd.com/552/