
STATISTICS-SUMMARY

PREM DAGAR

1. Probability

Definition 1.1. Sample Space. The set of all possible outcomes of an experiment is called the sample
space, and it is usually denoted by the letter S. Each outcome in a sample space is called an element of the
sample space, or simply a sample point.

Definition 1.2. Event. An event is a subset of a sample space.

Definition 1.3. Mutually Exclusive Events. Two events having no elements in common are said to be
mutually exclusive.

Definition 1.4. Conditional Probability. If A and B are any two events in a sample space S and P(A) ≠ 0,
the conditional probability of B given A is

P(B | A) = P(A ∩ B) / P(A).

Theorem 1.1. If A and B are any two events in a sample space S and P(A) ≠ 0, then

P(A ∩ B) = P(A) · P(B | A).

Theorem 1.2. If A, B, and C are any three events in a sample space S such that P(A ∩ B) ≠ 0, then

P(A ∩ B ∩ C) = P(A) · P(B | A) · P(C | A ∩ B).

Definition 1.5. Two events A and B are independent if and only if

P(A ∩ B) = P(A) · P(B).

Theorem 1.3. If the events B_1, B_2, . . . , B_k constitute a partition of the sample space S and P(B_i) ≠ 0 for
i = 1, 2, . . . , k, then for any event A in S,

P(A) = Σ_{i=1}^{k} P(B_i) · P(A | B_i).

Theorem 1.4 (Bayes' Theorem). If B_1, B_2, . . . , B_k constitute a partition of the sample space S and P(B_i) ≠ 0
for i = 1, 2, . . . , k, then for any event A in S such that P(A) ≠ 0,

P(B_r | A) = [P(B_r) · P(A | B_r)] / [Σ_{i=1}^{k} P(B_i) · P(A | B_i)]   for r = 1, 2, . . . , k.
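As an illustration of Theorems 1.3 and 1.4, the short Python sketch below computes P(A) by the rule of total probability and then the posterior probabilities P(B_r | A); the partition probabilities and conditional probabilities are made-up values for the example.

```python
# Hypothetical three-part partition B1, B2, B3 with prior probabilities P(Bi)
# and conditional probabilities P(A | Bi); the numbers are illustrative only.
prior = [0.5, 0.3, 0.2]          # P(B1), P(B2), P(B3)
likelihood = [0.02, 0.05, 0.10]  # P(A | B1), P(A | B2), P(A | B3)

# Theorem 1.3 (total probability): P(A) = sum_i P(Bi) * P(A | Bi)
p_a = sum(p * l for p, l in zip(prior, likelihood))

# Theorem 1.4 (Bayes' theorem): P(Br | A) = P(Br) * P(A | Br) / P(A)
posterior = [p * l / p_a for p, l in zip(prior, likelihood)]

print(p_a)        # 0.045
print(posterior)  # [0.222..., 0.333..., 0.444...]
```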

2. Distribution

Definition 2.1 (Random Variable). If S is a sample space with a probability measure and X is a real-valued
function defined over the elements of S, then X is called a random variable.

Definition 2.2 (Probability Distribution). If X is a discrete random variable, the function given by f (x) =
P (X = x) for each x within the range of X is called the probability distribution of X.

Theorem 2.1. A function can serve as the probability distribution of a discrete random variable X if and
only if its values, f(x), satisfy the conditions:

(1) f(x) ≥ 0 for each value within its domain;

(2) Σ_x f(x) = 1, where the summation extends over all the values within its domain.

Definition 2.3 (Distribution Function). If X is a discrete random variable, the function given by

F(x) = P(X ≤ x) = Σ_{t≤x} f(t)   for −∞ < x < ∞,

where f(t) is the value of the probability distribution of X at t, is called the distribution function or the
cumulative distribution of X.

Theorem 2.2. The values F (x) of the distribution function of a discrete random variable X satisfy the
following conditions:
(1) F (−∞) = 0 and F (∞) = 1;
(2) If a < b, then F (a) ≤ F (b) for any real numbers a and b.

Theorem 2.3. If the range of a random variable X consists of the values x1 < x2 < x3 < · · · < xn , then
f (x1 ) = F (x1 ) and f (xi ) = F (xi ) − F (xi−1 ) for i = 2, 3, . . . , n.

Definition 2.4 (Probability Density Function). A function with values f(x), defined over the set of all real
numbers, is called a probability density function of the continuous random variable X if and only if

P(a ≤ X ≤ b) = ∫_a^b f(x) dx

for any real constants a and b with a ≤ b.

Theorem 2.4. If X is a continuous random variable and a and b are real constants with a ≤ b, then
P (a ≤ X ≤ b) = P (a ≤ X < b) = P (a < X ≤ b) = P (a < X < b).

Theorem 2.5. A function can serve as a probability density of a continuous random variable X if its values,
f(x), satisfy the conditions:
(1) f(x) ≥ 0 for −∞ < x < ∞;
(2) ∫_{−∞}^{∞} f(x) dx = 1.

Definition 2.5 (Distribution Function). If X is a continuous random variable and the value of its probability
density at t is f(t), then the function given by

F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(t) dt   for −∞ < x < ∞

is called the distribution function or the cumulative distribution function of X.

Theorem 2.6. If f(x) and F(x) are the values of the probability density and the distribution function of X
at x, then

P(a ≤ X ≤ b) = F(b) − F(a)

for any real constants a and b with a ≤ b, and

f(x) = dF(x)/dx

where the derivative exists.
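A minimal numerical sketch of Theorem 2.6, assuming an exponential density f(x) = e^{−x} for x > 0 and an arbitrary interval [a, b]: integrating the density over [a, b] and evaluating F(b) − F(a) give the same probability.

```python
# Check of Theorem 2.6 for an exponential density (illustrative choice of density and interval).
from scipy import integrate, stats

a, b = 0.5, 2.0
f = lambda x: stats.expon.pdf(x)            # density f(x)
F = stats.expon.cdf                         # distribution function F(x)

prob_integral, _ = integrate.quad(f, a, b)  # P(a <= X <= b) as the integral of f over [a, b]
prob_cdf = F(b) - F(a)                      # P(a <= X <= b) = F(b) - F(a)

print(prob_integral, prob_cdf)              # both approximately 0.4712
```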

Definition 2.6 (Joint Probability Distribution). If X and Y are discrete random variables, the function given
by
f (x, y) = P (X = x, Y = y)
for each pair of values (x, y) within the range of X and Y is called the joint probability distribution of X and
Y.

Theorem 2.7. A bivariate function can serve as the joint probability distribution of a pair of discrete random
variables X and Y if and only if its values, f(x, y), satisfy the conditions:
(1) f(x, y) ≥ 0 for each pair of values (x, y) within its domain;
(2) Σ_x Σ_y f(x, y) = 1, where the double summation extends over all possible pairs (x, y) within its domain.

Definition 2.7 (Joint Probability Density Function). A bivariate function with values f(x, y) defined over
the xy-plane is called a joint probability density function of the continuous random variables X and Y if and
only if

P((X, Y) ∈ A) = ∫∫_A f(x, y) dx dy

for any region A in the xy-plane.

Theorem 2.8. A bivariate function can serve as a joint probability density function of a pair of continuous
random variables X and Y if its values, f(x, y), satisfy the conditions:
(1) f(x, y) ≥ 0 for −∞ < x < ∞, −∞ < y < ∞;
(2) ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dx dy = 1.

Definition 2.8 (Joint Distribution Function). If X and Y are continuous random variables, the function
given by

F(x, y) = P(X ≤ x, Y ≤ y) = ∫_{−∞}^{y} ∫_{−∞}^{x} f(s, t) ds dt   for −∞ < x < ∞, −∞ < y < ∞,

where f(s, t) is the joint probability density of X and Y at (s, t), is called the joint distribution function of
X and Y.

If X and Y are discrete random variables and f(x, y) is the value of their joint probability distribution at
(x, y), the function given by

g(x) = Σ_y f(x, y)

for each x within the range of X is called the marginal distribution of X. Correspondingly, the function given
by

h(y) = Σ_x f(x, y)

for each y within the range of Y is called the marginal distribution of Y.

Definition 2.9 (Marginal Density). If X and Y are continuous random variables and f(x, y) is the value of
their joint probability density at (x, y), the function given by

g(x) = ∫_{−∞}^{∞} f(x, y) dy   for −∞ < x < ∞

is called the marginal density of X. Correspondingly, the function given by

h(y) = ∫_{−∞}^{∞} f(x, y) dx   for −∞ < y < ∞

is called the marginal density of Y.

Definition 2.10 (Conditional Distribution). If f(x, y) is the value of the joint probability distribution of the
discrete random variables X and Y at (x, y) and h(y) is the value of the marginal distribution of Y at y, the
function given by

f(x | y) = f(x, y) / h(y),   h(y) ≠ 0,

for each x within the range of X is called the conditional distribution of X given Y = y. Correspondingly, if
g(x) is the value of the marginal distribution of X at x, the function given by

w(y | x) = f(x, y) / g(x),   g(x) ≠ 0,

for each y within the range of Y is called the conditional distribution of Y given X = x.

Definition 2.11 (Conditional Density). If f(x, y) is the value of the joint density of the continuous random
variables X and Y at (x, y) and h(y) is the value of the marginal density of Y at y, the function given by

f(x | y) = f(x, y) / h(y),   h(y) ≠ 0,

for −∞ < x < ∞, is called the conditional density of X given Y = y. Correspondingly, if g(x) is the value of
the marginal density of X at x, the function given by

f(y | x) = f(x, y) / g(x),   g(x) ≠ 0,

for −∞ < y < ∞, is called the conditional density of Y given X = x.

Definition 2.12 (Independence of Discrete Random Variables). If f (x1 , x2 , . . . , xn ) is the value of the joint
probability distribution of the discrete random variables X1 , X2 , . . . , Xn at (x1 , x2 , . . . , xn ) and fi (xi ) is the
value of the marginal distribution of Xi at xi for i = 1, 2, . . . , n, then the n random variables are independent
if and only if
f (x1 , x2 , . . . , xn ) = f1 (x1 ) · f2 (x2 ) · . . . · fn (xn )
for all (x1 , x2 , . . . , xn ) within their range.

3. Expectations

Definition 3.1 (Expected Value). If X is a discrete random variable and f(x) is the value of its probability
distribution at x, the expected value of X is

E(X) = Σ_x x · f(x).

Correspondingly, if X is a continuous random variable and f(x) is the value of its probability density at x,
the expected value of X is

E(X) = ∫_{−∞}^{∞} x · f(x) dx.

Theorem 3.1. If X is a discrete random variable and f(x) is the value of its probability distribution at x,
the expected value of g(X) is given by

E[g(X)] = Σ_x g(x) · f(x).

Correspondingly, if X is a continuous random variable and f(x) is the value of its probability density at x,
the expected value of g(X) is given by

E[g(X)] = ∫_{−∞}^{∞} g(x) · f(x) dx.

Theorem 3.2. If a and b are constants, then

E(aX + b) = aE(X) + b.

Corollary 3.3. If a is a constant, then


E(aX) = aE(X).

Theorem 3.4. If c_1, c_2, . . . , c_n are constants, then

E[Σ_{i=1}^{n} c_i g_i(X)] = Σ_{i=1}^{n} c_i E[g_i(X)].

Theorem 3.5. If X and Y are discrete random variables and f(x, y) is the value of their joint probability
distribution at (x, y), the expected value of g(X, Y) is

E[g(X, Y)] = Σ_x Σ_y g(x, y) · f(x, y).

Correspondingly, if X and Y are continuous random variables and f(x, y) is the value of their joint probability
density at (x, y), the expected value of g(X, Y) is

E[g(X, Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) · f(x, y) dx dy.

Theorem 3.6. If c_1, c_2, . . . , c_n are constants, then

E[Σ_{i=1}^{n} c_i g_i(X_1, X_2, . . . , X_k)] = Σ_{i=1}^{n} c_i E[g_i(X_1, X_2, . . . , X_k)].

Definition 3.2 (Moments about the Origin). The rth moment about the origin of a random variable X,
denoted by µ′_r, is the expected value of X^r; symbolically,

µ′_r = E(X^r) = Σ_x x^r · f(x)

for r = 0, 1, 2, . . . when X is discrete, and

µ′_r = E(X^r) = ∫_{−∞}^{∞} x^r · f(x) dx

when X is continuous.

Definition 3.3 (Mean of a Distribution). µ′1 is called the mean of the distribution of X, or simply the mean
of X, and it is denoted simply by µ.

Definition 3.4 (Moments about the Mean). The rth moment about the mean of a random variable X,
denoted by µ_r, is the expected value of (X − µ)^r; symbolically,

µ_r = E[(X − µ)^r] = Σ_x (x − µ)^r · f(x)

for r = 0, 1, 2, . . . when X is discrete, and

µ_r = E[(X − µ)^r] = ∫_{−∞}^{∞} (x − µ)^r · f(x) dx

when X is continuous.

Definition 3.5 (Variance). µ_2 is called the variance of the distribution of X, or simply the variance of X,
and it is denoted by σ^2, σ_X^2, var(X), or V(X). The positive square root of the variance, σ, is called the
standard deviation of X.

Theorem 3.7.

σ^2 = µ′_2 − µ^2

Theorem 3.8. If X has the variance σ^2, then

var(aX + b) = a^2 σ^2.
Theorem 3.9 (Chebyshev's Theorem). If µ and σ are the mean and the standard deviation of a random
variable X, then for any positive constant k the probability is at least 1 − 1/k^2 that X will take on a value
within k standard deviations of the mean; symbolically,

P(|X − µ| < kσ) ≥ 1 − 1/k^2,   σ ≠ 0.
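For a quick numerical illustration of Chebyshev's theorem, the sketch below (the exponential population and k = 2 are arbitrary choices) estimates P(|X − µ| < kσ) by simulation and compares it with the bound 1 − 1/k².

```python
# Empirical check of Chebyshev's theorem for an arbitrary (exponential) population.
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=100_000)   # mean 1, standard deviation 1

mu, sigma, k = x.mean(), x.std(), 2.0
within = np.mean(np.abs(x - mu) < k * sigma)   # estimate of P(|X - mu| < k*sigma)

print(within, 1 - 1 / k**2)                    # roughly 0.95 versus the bound 0.75
```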
Definition 3.6. The moment generating function of a random variable X, where it exists, is given by

M_X(t) = E(e^{tX}) = Σ_x e^{tx} · f(x)

when X is discrete, and

M_X(t) = E(e^{tX}) = ∫_{−∞}^{∞} e^{tx} · f(x) dx

when X is continuous.

Theorem 3.10.

d^r M_X(t)/dt^r |_{t=0} = µ′_r

Theorem 3.11. If a and b are constants, then

(1) M_{X+a}(t) = E[e^{(X+a)t}] = e^{at} · M_X(t);
(2) M_{bX}(t) = E[e^{bXt}] = M_X(bt);
(3) M_{(X+a)/b}(t) = E[e^{((X+a)/b)t}] = e^{(a/b)t} · M_X(t/b).

Definition 3.7 (Product Moments About the Origin). The rth and sth product moment about the origin of
the random variables X and Y, denoted by µ′_{r,s}, is the expected value of X^r Y^s; symbolically,

µ′_{r,s} = E(X^r Y^s) = Σ_x Σ_y x^r y^s · f(x, y)

for r = 0, 1, 2, . . . and s = 0, 1, 2, . . . when X and Y are discrete, and

µ′_{r,s} = E(X^r Y^s) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} x^r y^s · f(x, y) dx dy

when X and Y are continuous.

Definition 3.8 (Product Moments About the Mean). The rth and sth product moment about the means of
the random variables X and Y, denoted by µ_{r,s}, is the expected value of (X − µ_X)^r (Y − µ_Y)^s; symbolically,

µ_{r,s} = E[(X − µ_X)^r (Y − µ_Y)^s] = Σ_x Σ_y (x − µ_X)^r (y − µ_Y)^s · f(x, y)

for r = 0, 1, 2, . . . and s = 0, 1, 2, . . . when X and Y are discrete, and

µ_{r,s} = E[(X − µ_X)^r (Y − µ_Y)^s] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} (x − µ_X)^r (y − µ_Y)^s · f(x, y) dx dy

when X and Y are continuous.

Definition 3.9 (Covariance). µ_{1,1} is called the covariance of X and Y, and it is denoted by σ_{XY}, cov(X, Y),
or C(X, Y).

Theorem 3.12.

σ_{XY} = µ′_{1,1} − µ_X µ_Y

Theorem 3.13. If X and Y are independent, then

E(XY ) = E(X) · E(Y ) and σXY = 0.



Remark: It is of interest to note that the independence of two random variables implies a zero covariance,
but a zero covariance does not necessarily imply their independence.

Theorem 3.14. If X1 , X2 , . . . , Xn are independent, then


E(X1 X2 · · · Xn ) = E(X1 ) · E(X2 ) · · · E(Xn ).

Theorem 3.15. If X_1, X_2, . . . , X_n are random variables and

Y = Σ_{i=1}^{n} a_i X_i,

where a_1, a_2, . . . , a_n are constants, then

E(Y) = Σ_{i=1}^{n} a_i E(X_i)

and

var(Y) = Σ_{i=1}^{n} a_i^2 · var(X_i) + 2 Σ_{i<j} a_i a_j · cov(X_i, X_j),

where the double summation extends over all values of i and j, from 1 to n, for which i < j.
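As a concrete check of Theorem 3.15, the sketch below (the coefficients and the covariance matrix are made-up values) compares the formula for var(Y) with the direct matrix computation var(aᵀX) = aᵀΣa.

```python
# Numerical check of Theorem 3.15 for an illustrative three-variable case.
import numpy as np

a = np.array([2.0, -1.0, 0.5])            # constants a_1, a_2, a_3
cov = np.array([[1.0, 0.3, 0.1],          # an arbitrary covariance matrix of X_1, X_2, X_3
                [0.3, 2.0, -0.4],
                [0.1, -0.4, 0.5]])

# var(Y) = sum_i a_i^2 var(X_i) + 2 * sum_{i<j} a_i a_j cov(X_i, X_j)
var_formula = np.sum(a**2 * np.diag(cov)) + 2 * sum(
    a[i] * a[j] * cov[i, j] for i in range(3) for j in range(i + 1, 3))

# Direct computation: var(a'X) = a' Sigma a
var_direct = a @ cov @ a

print(var_formula, var_direct)            # the two values agree
```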

Corollary 3.16. If the random variables X_1, X_2, . . . , X_n are independent and

Y = Σ_{i=1}^{n} a_i X_i,

then

var(Y) = Σ_{i=1}^{n} a_i^2 · var(X_i).

Theorem 3.17. If X_1, X_2, . . . , X_n are random variables and

Y_1 = Σ_{i=1}^{n} a_i X_i   and   Y_2 = Σ_{i=1}^{n} b_i X_i,

where a_1, a_2, . . . , a_n and b_1, b_2, . . . , b_n are constants, then

cov(Y_1, Y_2) = Σ_{i=1}^{n} a_i b_i · var(X_i) + Σ_{i<j} (a_i b_j + a_j b_i) · cov(X_i, X_j).

Corollary 3.18. If the random variables X_1, X_2, . . . , X_n are independent and

Y_1 = Σ_{i=1}^{n} a_i X_i   and   Y_2 = Σ_{i=1}^{n} b_i X_i,

then

cov(Y_1, Y_2) = Σ_{i=1}^{n} a_i b_i · var(X_i).

Definition 3.10 (Conditional Expectation). If X is a discrete random variable and f(x | y) is the conditional
probability distribution of X given Y = y at x, then the conditional expectation of u(X) given Y = y is defined as

E[u(X) | y] = Σ_x u(x) · f(x | y).

Correspondingly, if X is a continuous random variable and f(x | y) is the conditional probability density of X
given Y = y at x, then the conditional expectation of u(X) given Y = y is

E[u(X) | y] = ∫_{−∞}^{∞} u(x) · f(x | y) dx.

4. Special Probability Distributions

Definition 1. Discrete Uniform Distribution.
A random variable X has a discrete uniform distribution and is called a discrete uniform random variable if
and only if its probability distribution is given by

f(x) = 1/k   for x = x_1, x_2, . . . , x_k,

where x_i ≠ x_j when i ≠ j.
Definition 2. Bernoulli Distribution.
A random variable X has a Bernoulli distribution and is called a Bernoulli random variable if and only if its
probability distribution is given by

f(x; θ) = θ^x (1 − θ)^{1−x}   for x = 0, 1,

where 0 ≤ θ ≤ 1.
Definition 3. Binomial Distribution. A random variable X has a binomial distribution and is called a
binomial random variable if and only if its probability distribution is given by

b(x; n, θ) = C(n, x) θ^x (1 − θ)^{n−x}   for x = 0, 1, 2, . . . , n,

where n is a non-negative integer and 0 ≤ θ ≤ 1.
Theorem 1.

b(x; n, θ) = b(n − x; n, 1 − θ)

Theorem 2. The mean and the variance of the binomial distribution are

µ = nθ   and   σ^2 = nθ(1 − θ).

Theorem 3. If X has a binomial distribution with parameters n and θ, and Y = X/n, then

E(Y) = θ   and   σ_Y^2 = θ(1 − θ)/n.

Now, if we apply Chebyshev's theorem with kσ = c, we can assert that for any positive constant c, the
probability is at least

1 − θ(1 − θ)/(nc^2)

that the proportion of successes in n trials falls between θ − c and θ + c.
Hence, as n → ∞, the probability approaches 1 that the proportion of successes will differ from θ by less
than any arbitrary constant c.
This result is called a law of large numbers, and it should be observed that it applies to the proportion of
successes, not to their actual number. It is a fallacy to suppose that when n is large, the number of successes
must necessarily be close to nθ.
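A small simulation sketch of this law of large numbers (θ and the sample sizes are illustrative choices): the proportion X/n concentrates around θ as n grows, while the raw count X typically drifts farther from nθ in absolute terms.

```python
# Simulation of the law of large numbers for the binomial proportion (illustrative values).
import numpy as np

rng = np.random.default_rng(1)
theta = 0.3

for n in (100, 10_000, 1_000_000):
    x = rng.binomial(n, theta)                        # number of successes in n trials
    print(n, abs(x / n - theta), abs(x - n * theta))
    # |X/n - theta| shrinks with n, while |X - n*theta| typically grows.
```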
Theorem 4. The moment-generating function of the binomial distribution is given by

M_X(t) = [1 + θ(e^t − 1)]^n.

Definition 4. Negative Binomial Distribution.
A random variable X has a negative binomial distribution, and is referred to as a negative binomial random
variable, if and only if its probability distribution is given by

b*(x; k, θ) = C(x − 1, k − 1) θ^k (1 − θ)^{x−k}   for x = k, k + 1, k + 2, . . .

Thus, the number of the trial on which the kth success occurs is a random variable having a negative binomial
distribution with the parameters k and θ. The name "negative binomial distribution" derives from the fact
that the values of b*(x; k, θ) for x = k, k + 1, k + 2, . . . are the successive terms of the binomial expansion of

[1/θ − (1 − θ)/θ]^{−k}.

In the literature of statistics, negative binomial distributions are also referred to as binomial waiting-time
distributions or as Pascal distributions.
Theorem 5.

b*(x; k, θ) = (k/x) · b(k; x, θ)

Theorem 6. The mean and the variance of the negative binomial distribution are

µ = k/θ   and   σ^2 = (k/θ)(1/θ − 1).
Definition 5. Geometric Distribution.
A random variable X has a geometric distribution and is referred to as a geometric random variable if and
only if its probability distribution is given by

g(x; θ) = θ(1 − θ)^{x−1}   for x = 1, 2, 3, . . .

Definition 6. Hypergeometric Distribution.
A random variable X has a hypergeometric distribution, and is referred to as a hypergeometric random
variable, if and only if its probability distribution is given by

h(x; n, N, M) = C(M, x) C(N − M, n − x) / C(N, n)   for x = 0, 1, 2, . . . , n,

subject to the constraints x ≤ M and n − x ≤ N − M.
Theorem 7. The mean and the variance of the hypergeometric distribution are

µ = nM/N   and   σ^2 = nM(N − M)(N − n) / [N^2 (N − 1)].
Definition 7. Poisson Distribution.
A random variable X has a Poisson distribution, and is referred to as a Poisson random variable, if and only
if its probability distribution is given by

p(x; λ) = λ^x e^{−λ} / x!   for x = 0, 1, 2, . . .
Theorem 8. The mean and the variance of the Poisson distribution are given by

µ = λ   and   σ^2 = λ.

Theorem 9. The moment-generating function of the Poisson distribution is given by

M_X(t) = e^{λ(e^t − 1)}.

Definition 1. Uniform Distribution.
A random variable X has a uniform distribution and is referred to as a continuous uniform random variable
if and only if its probability density function is given by

u(x; α, β) = 1/(β − α)   for α < x < β,   and 0 elsewhere.
Theorem 1. The mean and the variance of the uniform distribution are given by

µ = (α + β)/2   and   σ^2 = (β − α)^2 / 12.
Definition 2. Gamma Distribution.
A random variable X has a gamma distribution and is referred to as a gamma random variable if and only
if its probability density function is given by

g(x; α, β) = x^{α−1} e^{−x/β} / [β^α Γ(α)]   for x > 0,   and 0 elsewhere,

where α > 0 and β > 0.
Definition 3. Exponential Distribution.
A random variable X has an exponential distribution and is referred to as an exponential random variable if
and only if its probability density function is given by

g(x; θ) = (1/θ) e^{−x/θ}   for x > 0,   and 0 elsewhere,

where θ > 0.
Definition 4. Chi-Square Distribution.
A random variable X has a chi-square distribution and is referred to as a chi-square random variable if and
only if its probability density function is given by

f(x; ν) = x^{ν/2 − 1} e^{−x/2} / [2^{ν/2} Γ(ν/2)]   for x > 0,   and 0 elsewhere,

where ν > 0 is the degrees of freedom.
Theorem 2. The rth moment about the origin of the gamma distribution is given by

µ′_r = β^r Γ(α + r) / Γ(α).
Theorem 3. The mean and the variance of the gamma distribution are given by

µ = αβ   and   σ^2 = αβ^2.

Corollary 1. The mean and the variance of the exponential distribution are given by

µ = θ   and   σ^2 = θ^2.

Corollary 2. The mean and the variance of the chi-square distribution are given by

µ = ν   and   σ^2 = 2ν.

Definition 5. Beta Distribution.
A random variable X has a beta distribution and is referred to as a beta random variable if and only if its
probability density function is given by

f(x; α, β) = [Γ(α + β) / (Γ(α) Γ(β))] x^{α−1} (1 − x)^{β−1}   for 0 < x < 1,   and 0 elsewhere,

where α > 0 and β > 0.
Theorem 5. The mean and the variance of the beta distribution are given by

µ = α/(α + β)   and   σ^2 = αβ / [(α + β)^2 (α + β + 1)].
Definition 6. Normal Distribution.
A random variable X has a normal distribution and is referred to as a normal random variable if and only
if its probability density function is given by

n(x; µ, σ) = (1/(σ√(2π))) e^{−(1/2)((x − µ)/σ)^2}   for −∞ < x < ∞,

where σ > 0.
Theorem 6. The moment-generating function of the normal distribution is given by

M_X(t) = e^{µt + σ^2 t^2 / 2}.
Definition 7. Standard Normal Distribution.
The normal distribution with µ = 0 and σ = 1 is referred to as the standard normal distribution.
If a random variable X has a normal distribution with mean µ and standard deviation σ, then

Z = (X − µ)/σ

has the standard normal distribution.
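For instance (a short sketch using scipy; the parameters and the cutoff are arbitrary), if X is normal with µ = 100 and σ = 15, the probability that X exceeds 120 is found by standardizing:

```python
# Standardizing a normal random variable (illustrative parameters).
from scipy.stats import norm

mu, sigma, x = 100.0, 15.0, 120.0
z = (x - mu) / sigma                 # Z = (X - mu) / sigma
p = 1 - norm.cdf(z)                  # P(X > 120) = P(Z > 1.333...)

print(z, p)                          # approximately 1.333 and 0.0912
```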
Theorem 8. If X is a random variable having a binomial distribution with parameters n and θ, then the
moment-generating function of

Z = (X − nθ) / √(nθ(1 − θ))

approaches that of the standard normal distribution as n → ∞.
Definition 8. Bivariate Normal Distribution.
A pair of random variables X and Y has a bivariate normal distribution and are referred to as jointly normally
distributed random variables if and only if their joint probability density function is given by

f(x, y) = (1 / (2π σ_1 σ_2 √(1 − ρ^2)))
          · exp{ −[1/(2(1 − ρ^2))] [ ((x − µ_1)/σ_1)^2 − 2ρ((x − µ_1)/σ_1)((y − µ_2)/σ_2) + ((y − µ_2)/σ_2)^2 ] }

for −∞ < x < ∞ and −∞ < y < ∞, where σ_1 > 0, σ_2 > 0, and −1 < ρ < 1.
Theorem 9. If X and Y have a bivariate normal distribution, the conditional density of Y given X = x is a
normal distribution with mean

µ_{Y|x} = µ_2 + ρ(σ_2/σ_1)(x − µ_1)

and variance

σ_{Y|x}^2 = σ_2^2 (1 − ρ^2).

Similarly, the conditional density of X given Y = y is a normal distribution with mean

µ_{X|y} = µ_1 + ρ(σ_1/σ_2)(y − µ_2)

and variance

σ_{X|y}^2 = σ_1^2 (1 − ρ^2).

Theorem 10. If two random variables have a bivariate normal distribution, they are independent if and only if ρ = 0.
Theorem 3. If X_1, X_2, . . . , X_n are independent random variables and

Y = X_1 + X_2 + · · · + X_n,

then the moment-generating function of Y is

M_Y(t) = Π_{i=1}^{n} M_{X_i}(t),

where M_{X_i}(t) is the moment-generating function of X_i evaluated at t.

5. Sampling Distributions

Definition 1. Population.
A set of numbers from which a sample is drawn is referred to as a population. The distribution of the numbers
constituting a population is called the population distribution.
Definition 2. Random Sample.
If X1 , X2 , . . . , Xn are independent and identically distributed random variables, we say that they constitute
a random sample from the infinite population given by their common distribution.
Definition 3. Sample Mean and Sample Variance.
If X_1, X_2, . . . , X_n constitute a random sample, then the sample mean is given by

X̄ = (1/n) Σ_{i=1}^{n} X_i

and the sample variance is given by

S^2 = (1/(n − 1)) Σ_{i=1}^{n} (X_i − X̄)^2.
Theorem 1. If X_1, X_2, . . . , X_n constitute a random sample from an infinite population with mean µ and
variance σ^2, then

E(X̄) = µ   and   var(X̄) = σ^2/n.
Theorem 2. For any positive constant c, the probability that X̄ will take on a value between µ − c and µ + c is at least

1 − σ^2/(nc^2).

As n → ∞, this probability approaches 1.
Theorem 3. Central Limit Theorem.
If X_1, X_2, . . . , X_n constitute a random sample from an infinite population with mean µ, variance σ^2, and
moment-generating function M_X(t), then the limiting distribution of

Z = (X̄ − µ) / (σ/√n)

as n → ∞ is the standard normal distribution.
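A brief simulation sketch of the central limit theorem (the exponential parent population and the sample size are arbitrary choices): the standardized sample means behave approximately like a standard normal variable.

```python
# Simulation of the central limit theorem with an exponential parent population.
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, n, reps = 1.0, 1.0, 50, 20_000        # exponential(1): mu = sigma = 1

samples = rng.exponential(scale=1.0, size=(reps, n))
z = (samples.mean(axis=1) - mu) / (sigma / np.sqrt(n))  # standardized sample means

print(z.mean(), z.std())                         # close to 0 and 1
print(np.mean(z < 1.96))                         # close to Phi(1.96) ≈ 0.975
```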
Theorem 4. If X̄ is the mean of a random sample of size n from a normal population with mean µ and variance σ^2, then
its sampling distribution is a normal distribution with mean µ and variance σ^2/n.

Definition 4. Random Sample—Finite Population.
If X_1 is the first value drawn from a finite population of size N, X_2 is the second value drawn, . . ., X_n is the
nth value drawn, and the joint probability distribution of these n random variables is given by

f(x_1, x_2, . . . , x_n) = 1 / [N(N − 1) · · · (N − n + 1)]

for each ordered n-tuple of values of these random variables, then X_1, X_2, . . . , X_n are said to constitute a
random sample from the given finite population.
Definition 5. Mean and Variance—Finite Population. The mean and the variance of the finite
population {c_1, c_2, . . . , c_N} are

µ = Σ_{i=1}^{N} c_i · (1/N)

and

σ^2 = Σ_{i=1}^{N} (c_i − µ)^2 · (1/N).
Theorem 5. If X_r and X_s are the rth and sth random variables of a random sample of size n drawn from
the finite population {c_1, c_2, . . . , c_N}, then

cov(X_r, X_s) = −σ^2/(N − 1).
Theorem 6. If X̄ is the mean of a random sample of size n taken without replacement from a finite
population of size N with the mean µ and the variance σ^2, then

E(X̄) = µ   and   var(X̄) = (σ^2/n) · (N − n)/(N − 1).
Definition. A random variable X has a chi-square distribution with ν degrees of freedom if its probability
density function is given by

f(x) = x^{ν/2 − 1} e^{−x/2} / [2^{ν/2} Γ(ν/2)]   for x > 0,   and 0 elsewhere.

Theorem 7. If X has the standard normal distribution, then X^2 has the chi-square distribution with ν = 1
degree of freedom.

Theorem 8. If X_1, X_2, . . . , X_n are independent random variables having standard normal distributions, then

Y = Σ_{i=1}^{n} X_i^2

has the chi-square distribution with ν = n degrees of freedom.
Theorem 9. If X_1, X_2, . . . , X_n are independent random variables having chi-square distributions with
ν_1, ν_2, . . . , ν_n degrees of freedom, then

Y = Σ_{i=1}^{n} X_i

has the chi-square distribution with ν_1 + ν_2 + · · · + ν_n degrees of freedom.
Theorem 10. If X1 and X2 are independent random variables, X1 has a chi-square distribution with ν1
degrees of freedom, and X1 + X2 has a chi-square distribution with ν > ν1 degrees of freedom, then X2 has a
chi-square distribution with ν − ν1 degrees of freedom.
Theorem 11. If X̄ and S^2 are the mean and the variance of a random sample of size n from a normal
population with mean µ and standard deviation σ, then
(1) X̄ and S^2 are independent;
(2) the random variable

(n − 1)S^2 / σ^2

has a chi-square distribution with n − 1 degrees of freedom.
Definition. If Y has a chi-square distribution with ν degrees of freedom, Z has the standard normal
distribution, and Y and Z are independent, then the distribution of

T = Z / √(Y/ν)

is given by

f(t) = [Γ((ν + 1)/2) / (√(πν) Γ(ν/2))] (1 + t^2/ν)^{−(ν+1)/2}   for −∞ < t < ∞,

and it is called the t distribution with ν degrees of freedom.
Theorem 13. If X̄ and S^2 are the mean and the variance of a random sample of size n from a normal
population with mean µ and variance σ^2, then

T = (X̄ − µ) / (S/√n)

has the t distribution with n − 1 degrees of freedom.
Theorem 14. If U and V are independent random variables having chi-square distributions with ν_1 and ν_2
degrees of freedom, then

F = (U/ν_1) / (V/ν_2)

is a random variable having an F distribution, that is, a random variable whose probability density function
is given by

g(f) = [Γ((ν_1 + ν_2)/2) / (Γ(ν_1/2) Γ(ν_2/2))] (ν_1/ν_2)^{ν_1/2} f^{ν_1/2 − 1} (1 + (ν_1/ν_2) f)^{−(ν_1 + ν_2)/2}   for f > 0,

and 0 elsewhere.

Theorem 15. If S_1^2 and S_2^2 are the variances of independent random samples of sizes n_1 and n_2 from normal
populations with the variances σ_1^2 and σ_2^2, then

F = (S_1^2/σ_1^2) / (S_2^2/σ_2^2) = σ_2^2 S_1^2 / (σ_1^2 S_2^2)

is a random variable having an F distribution with n_1 − 1 and n_2 − 1 degrees of freedom.
The F distribution is also known as the variance-ratio distribution.

6. Point Estimation

Definition 6.1 (Point Estimation). Using the value of a sample statistic to estimate the value of a population
parameter is called point estimation. The value of the statistic obtained from the sample is referred to as a
point estimate.

Definition 6.2 (Unbiased Estimator). A statistic θ̂ is an unbiased estimator of the parameter θ of a given
distribution if and only if
E(θ̂) = θ
for all possible values of θ.

Definition 6.3 (Asymptotically Unbiased Estimator). Letting b_n(θ) = E(θ̂) − θ express the bias of an
estimator θ̂ based on a random sample of size n from a given distribution, we say that θ̂ is an asymptotically
unbiased estimator of θ if and only if

lim_{n→∞} b_n(θ) = 0.

Theorem 6.1. If S^2 is the variance of a random sample from an infinite population with finite variance σ^2, then

E(S^2) = σ^2.

Definition 6.4. Minimum Variance Unbiased Estimator. The estimator for the parameter θ of a given
distribution that has the smallest variance of all unbiased estimators for θ is called the minimum variance
unbiased estimator, or the best unbiased estimator for θ.

Theorem 6.2. If θ̂ is an unbiased estimator of θ and

var(θ̂) = 1 / { n · E[(∂ ln f(X)/∂θ)^2] },

then θ̂ is a minimum variance unbiased estimator of θ.

Definition 6.5 (Consistent Estimator). A statistic θ̂ is a consistent estimator of the parameter θ of a given
distribution if and only if for each c > 0,

lim_{n→∞} P(|θ̂ − θ| < c) = 1.

Theorem 6.3. If θ̂ is an unbiased estimator of the parameter θ and


var(θ̂) → 0 as n → ∞,
then θ̂ is a consistent estimator of θ.

Definition 6.6. The statistic θ̂ is a sufficient estimator of the parameter θ of a given distribution if and only
if for each value of θ̂, the conditional probability distribution or density of the random sample X1 , X2 , . . . , Xn ,
given θ̂ = t, is independent of θ.

Theorem 6.4. The statistic θ̂ is a sufficient estimator of the parameter θ if and only if the joint probability
distribution or density of the random sample can be factored as
f (x1 , x2 , . . . , xn ; θ) = g(θ̂, θ) · h(x1 , x2 , . . . , xn ),
where g(θ̂, θ) depends only on θ̂ and θ, and h(x1 , x2 , . . . , xn ) does not depend on θ.

Definition 6.7 (Sample Moments). The kth sample moment of a set of observations x_1, x_2, . . . , x_n is the
mean of their kth powers and it is denoted by m_k; symbolically,

m_k = (1/n) Σ_{i=1}^{n} x_i^k.

Definition 6.8 (Maximum Likelihood Estimator). If x1 , x2 , . . . , xn are the values of a random sample from
a population with the parameter θ, the likelihood function of the sample is given by
L(θ) = f (x1 , x2 , . . . , xn ; θ)
for values of θ within a given domain. Here, f (x1 , x2 , . . . , xn ; θ) is the value of the joint probability distribution
or the joint probability density of the random variables X1 , X2 , . . . , Xn at X1 = x1 , X2 = x2 , . . . , Xn = xn .
We refer to the value of θ that maximizes L(θ) as the maximum likelihood estimator of θ.

Example 6.1. If x_1, x_2, . . . , x_n are the values of a random sample from an exponential population, find the
maximum likelihood estimator of its parameter θ.
Solution: The likelihood function is given by

L(θ) = f(x_1, x_2, . . . , x_n; θ) = Π_{i=1}^{n} f(x_i; θ) = (1/θ)^n e^{−(1/θ) Σ_{i=1}^{n} x_i}.

Differentiation of ln L(θ) with respect to θ yields

d[ln L(θ)]/dθ = −n/θ + (1/θ^2) Σ_{i=1}^{n} x_i.

Equating this derivative to zero and solving for θ, we get

θ̂ = (1/n) Σ_{i=1}^{n} x_i = x̄.

Hence, the maximum likelihood estimator is θ̂ = X̄.
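A small numerical sketch of this example (the data are simulated with an arbitrary true θ): maximizing ln L(θ) numerically gives the same answer as the sample mean, in agreement with the derivation above.

```python
# Maximum likelihood estimation of the exponential parameter theta (illustrative data).
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(3)
x = rng.exponential(scale=2.5, size=1_000)     # simulated sample; true theta = 2.5

# Negative log-likelihood: -ln L(theta) = n*ln(theta) + sum(x)/theta
neg_log_lik = lambda theta: len(x) * np.log(theta) + x.sum() / theta

result = minimize_scalar(neg_log_lik, bounds=(1e-6, 100.0), method="bounded")

print(result.x, x.mean())   # the numerical maximizer is (essentially) the sample mean
```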

7. Interval Estimation

Definition 7.1 (Confidence Interval). If θ̂1 and θ̂2 are values of the random variables Θ̂1 and Θ̂2 such that
P (Θ̂1 < θ < Θ̂2 ) = 1 − α
for some specified probability 1 − α, we refer to the interval
θ̂1 < θ < θ̂2
as a (1 − α) × 100% confidence interval for θ. The probability 1 − α is called the degree of confidence, and the
endpoints of the interval are called the lower and upper confidence limits.

Theorem 7.1. If X̄, the mean of a random sample of size n from a normal population with known variance
σ^2, is to be used as an estimator of the mean of the population, then the probability is 1 − α that the error will
be less than

z_{α/2} · σ/√n.
Theorem 7.2. If x̄ is the value of the mean of a random sample of size n from a normal population with
known variance σ^2, then

x̄ − z_{α/2} · σ/√n < µ < x̄ + z_{α/2} · σ/√n

is a (1 − α)100% confidence interval for the mean of the population.

Theorem 7.3. If x̄ and s are the values of the mean and the standard deviation of a random sample of size
n from a normal population, then

x̄ − t_{α/2, n−1} · s/√n < µ < x̄ + t_{α/2, n−1} · s/√n

is a (1 − α)100% confidence interval for the mean of the population.
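A sketch of the t interval of Theorem 7.3 (the data and the confidence level are illustrative), computed directly and then checked against scipy's built-in interval.

```python
# 95% t confidence interval for a normal mean with unknown variance (Theorem 7.3).
import numpy as np
from scipy import stats

x = np.array([10.2, 9.8, 10.5, 10.1, 9.9, 10.4, 10.0, 10.3])  # illustrative sample
n, alpha = len(x), 0.05

xbar, s = x.mean(), x.std(ddof=1)                # sample mean and standard deviation
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)    # t_{alpha/2, n-1}

lower = xbar - t_crit * s / np.sqrt(n)
upper = xbar + t_crit * s / np.sqrt(n)

print(lower, upper)
print(stats.t.interval(0.95, df=n - 1, loc=xbar, scale=s / np.sqrt(n)))  # same interval
```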

Theorem 7.4. If x̄_1 and x̄_2 are the values of the means of independent random samples of sizes n_1 and n_2
from normal populations with known variances σ_1^2 and σ_2^2, then

(x̄_1 − x̄_2) − z_{α/2} · √(σ_1^2/n_1 + σ_2^2/n_2) < µ_1 − µ_2 < (x̄_1 − x̄_2) + z_{α/2} · √(σ_1^2/n_1 + σ_2^2/n_2)

is a (1 − α)100% confidence interval for the difference between the two population means.

Theorem 7.5. If x̄_1, x̄_2, s_1, and s_2 are the values of the means and the standard deviations of independent
random samples of sizes n_1 and n_2 from normal populations with equal variances, then

(x̄_1 − x̄_2) − t_{α/2, n_1+n_2−2} · s_p √(1/n_1 + 1/n_2) < µ_1 − µ_2 < (x̄_1 − x̄_2) + t_{α/2, n_1+n_2−2} · s_p √(1/n_1 + 1/n_2)

is a (1 − α)100% confidence interval for the difference between the two population means, where the pooled
standard deviation is

s_p = √( [(n_1 − 1)s_1^2 + (n_2 − 1)s_2^2] / (n_1 + n_2 − 2) ).
Theorem 7.6. If X is a binomial random variable with parameters n and θ, n is large, and θ̂ = X/n, then

θ̂ − z_{α/2} · √(θ̂(1 − θ̂)/n) < θ < θ̂ + z_{α/2} · √(θ̂(1 − θ̂)/n)

is an approximate (1 − α)100% confidence interval for θ.
Theorem 7.7. If θ̂ = x/n is used as an estimate of θ, then with (1 − α)100% confidence, the error in the
estimate is less than

z_{α/2} · √(θ̂(1 − θ̂)/n).
Theorem 7.8. If X_1 is a binomial random variable with parameters n_1 and θ_1, and X_2 is a binomial random
variable with parameters n_2 and θ_2, where n_1 and n_2 are large, and θ̂_1 = x_1/n_1, θ̂_2 = x_2/n_2, then

(θ̂_1 − θ̂_2) − z_{α/2} · √(θ̂_1(1 − θ̂_1)/n_1 + θ̂_2(1 − θ̂_2)/n_2) < θ_1 − θ_2 < (θ̂_1 − θ̂_2) + z_{α/2} · √(θ̂_1(1 − θ̂_1)/n_1 + θ̂_2(1 − θ̂_2)/n_2)

is an approximate (1 − α)100% confidence interval for θ_1 − θ_2.

Theorem 7.9. If s^2 is the value of the variance of a random sample of size n from a normal population, then

(n − 1)s^2 / χ^2_{α/2, n−1} < σ^2 < (n − 1)s^2 / χ^2_{1−α/2, n−1}

is a (1 − α)100% confidence interval for σ^2.
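A sketch of Theorem 7.9 with illustrative data: the chi-square critical values come from scipy, and the resulting interval covers σ² with approximately (1 − α)100% confidence when the population is normal.

```python
# 95% confidence interval for a normal variance via the chi-square distribution (Theorem 7.9).
import numpy as np
from scipy import stats

x = np.array([4.1, 5.3, 3.8, 4.9, 5.1, 4.4, 4.7, 5.0, 4.2, 4.6])  # illustrative sample
n, alpha = len(x), 0.05
s2 = x.var(ddof=1)                                              # sample variance s^2

lower = (n - 1) * s2 / stats.chi2.ppf(1 - alpha / 2, df=n - 1)  # divide by chi^2_{alpha/2, n-1}
upper = (n - 1) * s2 / stats.chi2.ppf(alpha / 2, df=n - 1)      # divide by chi^2_{1-alpha/2, n-1}

print(s2, (lower, upper))
```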

Theorem 7.10. If s_1^2 and s_2^2 are the values of the variances of independent random samples of sizes n_1 and
n_2 from normal populations, then

(s_1^2/s_2^2) · (1/F_{α/2, n_1−1, n_2−1}) < σ_1^2/σ_2^2 < (s_1^2/s_2^2) · F_{α/2, n_2−1, n_1−1}

is a (1 − α)100% confidence interval for σ_1^2/σ_2^2.

No. | Distribution        | PDF/PMF                                                   | Mean       | Variance                     | MGF
1   | Uniform             | 1/(b − a),  a < x < b                                     | (a + b)/2  | (b − a)^2/12                 | (e^{tb} − e^{ta}) / (t(b − a))
2   | Bernoulli           | θ^x (1 − θ)^{1−x},  x = 0, 1                              | θ          | θ(1 − θ)                     | 1 − θ + θe^t
3   | Binomial            | C(n, x) θ^x (1 − θ)^{n−x},  x = 0, 1, . . . , n           | nθ         | nθ(1 − θ)                    | (1 − θ + θe^t)^n
4   | Geometric           | θ(1 − θ)^{x−1},  x = 1, 2, . . .                          | 1/θ        | (1 − θ)/θ^2                  | θe^t / (1 − (1 − θ)e^t)
5   | Negative Binomial   | C(x − 1, r − 1) θ^r (1 − θ)^{x−r},  x = r, r + 1, . . .   | r/θ        | r(1 − θ)/θ^2                 | [θe^t / (1 − (1 − θ)e^t)]^r
6   | Poisson             | e^{−λ} λ^x / x!,  x = 0, 1, 2, . . .                      | λ          | λ                            | e^{λ(e^t − 1)}
7   | Exponential         | λ e^{−λx},  x > 0                                         | 1/λ        | 1/λ^2                        | λ/(λ − t)
8   | Gamma               | λ^α x^{α−1} e^{−λx} / Γ(α),  x > 0                        | α/λ        | α/λ^2                        | [λ/(λ − t)]^α
9   | Beta                | x^{α−1} (1 − x)^{β−1} / B(α, β),  0 < x < 1               | α/(α + β)  | αβ / [(α + β)^2 (α + β + 1)] | —
10  | Normal              | (1/(σ√(2π))) e^{−(x−µ)^2/(2σ^2)},  −∞ < x < ∞             | µ          | σ^2                          | e^{µt + σ^2 t^2/2}
11  | Chi-square          | x^{ν/2−1} e^{−x/2} / (2^{ν/2} Γ(ν/2)),  x > 0             | ν          | 2ν                           | (1 − 2t)^{−ν/2}

(In rows 7 and 8 the exponential and gamma densities are written in terms of the rate λ, i.e., λ = 1/θ and λ = 1/β in the notation of Section 4.)
