
Probability, Statistics & Mathematics

(Cheat Sheet)

Author: Anik Chakraborty


Contents

1 Probability
  1.1 Theory of Probability
    1.1.1 Sigma Field
    1.1.2 Properties
    1.1.3 Conditional Probability
    1.1.4 Stochastic Independence
  1.2 Random Variable
    1.2.1 Univariate
    1.2.2 Bivariate
    1.2.3 Results
  1.3 Generating Functions
    1.3.1 Moments
    1.3.2 Cumulants
    1.3.3 Characteristic Function
    1.3.4 Probability Generating Function
  1.4 Inequalities
    1.4.1 Markov & Chebyshev
    1.4.2 Cauchy-Schwarz
    1.4.3 Jensen
    1.4.4 Lyapunov
  1.5 Theoretical Distributions
    1.5.1 Discrete
    1.5.2 Continuous
    1.5.3 Multivariate
    1.5.4 Truncated Distribution
  1.6 Sampling Distributions
    1.6.1 Chi-square, t, F
    1.6.2 Order Statistics
  1.7 Distribution Relationships
    1.7.1 Binomial
    1.7.2 Negative Binomial
    1.7.3 Poisson
    1.7.4 Normal
    1.7.5 Gamma
    1.7.6 Beta
    1.7.7 Cauchy
    1.7.8 Others
  1.8 Transformations
    1.8.1 Orthogonal
    1.8.2 Polar
    1.8.3 Special Transformations

2 Statistics
  2.1 Point Estimation
    2.1.1 Minimum MSE
    2.1.2 Consistency
    2.1.3 Sufficiency
    2.1.4 Completeness
    2.1.5 Exponential Family
    2.1.6 Methods of finding UMVUE
    2.1.7 Cramer-Rao Inequality
    2.1.8 Methods of Estimation
  2.2 Testing of Hypothesis
    2.2.1 Tests of Significance
  2.3 Interval Estimation
    2.3.1 Methods of finding C.I.
    2.3.2 Wilk's Optimum Criteria
    2.3.3 Test Inversion Method
  2.4 Large Sample Theory
    2.4.1 Modes of Convergence

3 Mathematics
  3.1 Basics
    3.1.1 Combinatorial Analysis
    3.1.2 Difference Equation
  3.2 Linear Algebra
    3.2.1 Vectors & Vector Spaces
    3.2.2 Matrices
    3.2.3 Determinants
    3.2.4 System of Linear Equation
Chapter 1

Probability

1.1 Theory of Probability


1.1.1 Sigma Field
Ω: universal set. A non-empty class A of subsets of Ω is said to form a sigma field (σ-field) on Ω if it satisfies the following properties -

(i) A ∈ A =⇒ Ac ∈ A (closed under complementation)

(ii) A1 , A2 , . . . , An , . . . ∈ A =⇒ ⋃_{n=1}^{∞} An ∈ A (closed under countable union)

Theorems
(1) A σ-field is closed under finite unions.

(2) A σ-field must include the null set, ∅, and the whole set, Ω.

(a) A = {∅, Ω} is the smallest/minimal σ-field on Ω.


(b) If A ⊆ Ω, then A = {∅, A, Ac , Ω} is the minimal σ-field containing A, on Ω.
(c) The power set of Ω (the set of all subsets of Ω) is the largest σ-field on Ω.

(3) A σ-field is closed under countable intersections.

1.1.2 Properties
(1) For two sets A, B ∈ A -

(a) Monotonic Property: If A ⊆ B, P (A) ≤ P (B)


(b) P (A ∪ B) = P (A) + P (B) − P (A ∩ B) =⇒ P (A ∪ B) ≤ P (A) + P (B)
(c) P (A ∪ B) = P (A − B) + P (B − A) + P (A ∩ B)
(d) P (A ∩ B) ≤ min{P (A), P (B)} =⇒ P (A ∩ B) ≤ √(P (A) · P (B))
(e) P (A ∩ B) ≥ P (A) + P (B) − 1
(f) P (A) = P (A ∩ B) + P (A ∩ B c ) =⇒ P (A − B) = P (A) − P (A ∩ B)


(2) For any n events A1 , A2 , . . . , An ∈ A -

(a) Boole's inequality: P(⋃_{i=1}^{n} Ai) ≤ Σ_{i=1}^{n} P(Ai)

(b) Bonferroni's inequality: P(⋂_{i=1}^{n} Ai) ≥ Σ_{i=1}^{n} P(Ai) − (n − 1)

(c) Σ_{i=1}^{n} P(Ai) − Σ_{1≤i1<i2≤n} P(Ai1 ∩ Ai2) ≤ P(⋃_{i=1}^{n} Ai) ≤ Σ_{i=1}^{n} P(Ai)

(d) Poincare's theorem (inclusion-exclusion):

P(⋃_{i=1}^{n} Ai) = Σ_{i=1}^{n} P(Ai) − Σ_{1≤i1<i2≤n} P(Ai1 ∩ Ai2) + Σ_{1≤i1<i2<i3≤n} P(Ai1 ∩ Ai2 ∩ Ai3) − · · · + (−1)^{n−1} P(⋂_{i=1}^{n} Ai)

(e) Jordan's theorem (C(a, b) denotes the binomial coefficient "a choose b"):

i. The probability that exactly m of the n events will occur is -

P(m) = Sm − C(m+1, m) Sm+1 + C(m+2, m) Sm+2 − · · · + (−1)^{n−m} C(n, m) Sn

ii. The probability that at least m of the n events will occur is -

Pm = P(m) + P(m+1) + · · · + P(n)
   = Sm − C(m, m−1) Sm+1 + C(m+1, m−1) Sm+2 − · · · + (−1)^{n−m} C(n−1, m−1) Sn

where, Sr = Σ_{1≤i1<i2<···<ir≤n} P (Ai1 ∩ Ai2 ∩ · · · ∩ Air ), r = 1(1)n

1.1.3 Conditional Probability


Consider a probability space (Ω, A, P).

(1) Compound probability: n events A1 , A2 , . . . , An ∈ A are such that P(⋂_{i=1}^{n−1} Ai) > 0. Then,

P (A1 ∩ A2 ∩ · · · ∩ An ) = P (A1 ) · P (A2 /A1 ) · P (A3 /A1 ∩ A2 ) · · · P (An / ⋂_{i=1}^{n−1} Ai )

(2) Total Probability Theorem: If (B1 , B2 , . . . , Bn ) is a partition of Ω with P (Bi ) > 0 ∀i, then for any event A ∈ A, P (A) = Σ_{i=1}^{n} P (Bi ) · P (A/Bi )

(3) Bayes' Theorem: P (Bi /A) = P (Bi ) P (A/Bi ) / Σ_{k=1}^{n} P (Bk ) P (A/Bk ), i = 1(1)n, if P (A) > 0
k=1

(4) Bayes' theorem with future events: Let C ∈ A be an event under the previous conditions with P (A/Bi ) > 0, i = 1(1)n. Then,

P (C/A) = Σ_{i=1}^{n} P (Bi ) P (A/Bi ) P (C/A ∩ Bi ) / Σ_{i=1}^{n} P (Bi ) P (A/Bi )
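A quick numerical illustration of the total probability theorem (2) and Bayes' theorem (3) above. This sketch is not part of the original sheet; the priors and likelihoods below are made-up values chosen only to show the mechanics.

# Bayes' theorem over a partition B1, B2, B3 of the sample space (hypothetical numbers).
prior = [0.5, 0.3, 0.2]          # P(B1), P(B2), P(B3) -- must sum to 1
likelihood = [0.02, 0.05, 0.10]  # P(A/B1), P(A/B2), P(A/B3)

# Total probability theorem: P(A) = sum_i P(Bi) * P(A/Bi)
p_a = sum(p * l for p, l in zip(prior, likelihood))

# Bayes' theorem: P(Bi/A) = P(Bi) * P(A/Bi) / P(A)
posterior = [p * l / p_a for p, l in zip(prior, likelihood)]

print(f"P(A) = {p_a:.4f}")                            # 0.5*0.02 + 0.3*0.05 + 0.2*0.10 = 0.045
print("P(Bi/A):", [round(x, 4) for x in posterior])   # posterior probabilities, sum to 1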

1.1.4 Stochastic Independence


(1) For two independent events A, B -

P (A/B) = P (A/B c ) = P (A) ⇐⇒ P (A ∩ B) = P (A) · P (B)

(2) Pairwise independence: n events A1 , A2 , . . . , An ∈ A are said to be ‘pairwise’ independent if -

P (Ai1 ∩ Ai2 ) = P (Ai1 ) · P (Ai2 ), ∀i1 < i2

(3) Mutual independence: The above events are ‘mutually’ independent if -

P (Ai1 ∩ Ai2 ) = P (Ai1 ) · P (Ai2 ), ∀i1 < i2

P (Ai1 ∩ Ai2 ∩ Ai3 ) = P (Ai1 ) · P (Ai2 ) · P (Ai3 ), ∀i1 < i2 < i3


..
.
P (A1 ∩ A2 ∩ · · · ∩ An ) = P (A1 ) · P (A2 ) · · · P (An )

1.2 Random Variable


1.2.1 Univariate
X : Ω → R, such that {ω : X(ω) ≤ x} ∈ A, ∀x ∈ R is a random variable on {Ω, A}.

(1) The same function X(·) may be a R.V. for a particular choice of σ-field but not for another choice of σ-field.

(2) X is a R.V. on (Ω, A) =⇒ f (X) is also a R.V. on (Ω, A). (for any f )

(3) Continuity theorem of Probability: (A1 ⊂ A2 ⊂ . . .) or (A1 ⊃ A2 ⊃ . . .)

=⇒ lim_{n→∞} P (An ) = P (lim_{n→∞} An )

(4) CDF: FX (x) = P [{ω : X(ω) ≤ x}], ∀x ∈ R

(a) Non-decreasing: −∞ < x1 < x2 < ∞ =⇒ F (x1 ) ≤ F (x2 )


(b) Normalized: lim F (x) = 0, lim F (x) = 1
x→−∞ x→+∞

(c) Right Continuous: lim_{x→a+} F (x) = F (a), ∀a ∈ R

For any R.V. X with CDF F (·) -

P (a < X < b) = F (b − 0) − F (a) P (a ≤ X ≤ b) = F (b) − F (a − 0)

P (a < X ≤ b) = F (b) − F (a) P (a ≤ X < b) = F (b − 0) − F (a − 0)

(5) Decomposition theorem: F (x) = αFc (x) + (1 − α)Fd (x) where, 0 ≤ α ≤ 1 and Fc (x), Fd (x)
are continuous and discrete D.F., respectively.

(a) α = 0 =⇒ X is purely discrete.


(b) α = 1 =⇒ X is purely continuous.
(c) 0 < α < 1 =⇒ X is mixed.

(6) X is non-negative with E(X) = 0 =⇒ P (X = 0) = 1

(7) P (a ≤ X ≤ b) = 1 =⇒ V ar(X) ≤ (b − a)²/4

(8) P (|X| ≤ M ) = 1 for some 0 ≤ M < ∞ =⇒ µ′r exists ∀ r

(9) P (X ∈ {0, 1, 2, . . .}) = 1 =⇒ E(X) = Σ_{x=0}^{∞} {1 − F (x)}

(10) P (X ∈ [0, ∞)) = 1 =⇒ lim_{x→∞} x{1 − F (x)} = 0, if E(X) exists.

(11) E(X) = ∫_{0}^{∞} {1 − F (x)} dx for any non-negative R.V. X. More generally,

E(X^r ) = r ∫_{0}^{∞} x^{r−1} {1 − F (x)} dx

(12) ln(GMX ) = E(ln X)

(13) pth quantile: ξp such that F (ξp − 0) ≤ p ≤ F (ξp ). For continuous case, F (ξp ) = p

Symmetry
X has a symmetric distribution about ‘a’ if any of the following holds -

(a) P (X ≤ a − x) = P (X ≥ a + x), ∀x ∈ R

(b) F (a − x) + F (a + x) = 1 + P (X = a + x)

Again, if X is continuous then F (a − x) + F (a + x) = 1 or f (a − x) = f (a + x), ∀x ∈ R

ˆ E(X) = a, if it exists

ˆ Med (X) = a

1.2.2 Bivariate
(X, Y ) : Ω → R², such that {ω : X(ω) ≤ x, Y (ω) ≤ y} ∈ A, ∀(x, y) ∈ R², is a bivariate random variable on (Ω, A).

(1) CDF: F (x, y) = P [{ω : X(ω) ≤ x, Y (ω) ≤ y}], ∀(x, y) ∈ R2

(a) F (x, y) is non-decreasing and right continuous w.r.t. each of the arguments x and y.
(b) F (−∞, y) = F (x, −∞) = 0, F (+∞, +∞) = 1
(c) For x1 < x2 , y1 < y2 -

P (x1 < X < x2 , y1 < Y < y2 ) = F (x2 , y2 ) − F (x1 , y2 ) − F (x2 , y1 ) + F (x1 , y1 ) ≥ 0

Marginal CDF

FX (x) = lim_{y→∞} FX,Y (x, y), FY (y) = lim_{x→∞} FX,Y (x, y)

• FX (x) + FY (y) − 1 ≤ FX,Y (x, y) ≤ √(FX (x) · FY (y)), ∀(x, y) ∈ R²

(2) Joint distribution cannot be determined uniquely from the marginals.

(3) fX,Y (x, y) = fX (x) · fY |X (y|x) = fY (y) · fX|Y (x|y)


(4) fX,Y (x, y; α) = fX (x) fY (y) {1 + α · (2FX (x) − 1) · (2FY (y) − 1)}, α ∈ [−1, 1]

Stochastic Independence
(5) FX,Y (x, y) = FX (x) · FY (y), ∀(x, y) ∈ R2 =⇒ fX,Y (x, y) = fX (x) · fY (y), ∀(x, y)

(6) X ⊥⊥ Y =⇒ f (X) ⊥⊥ g(Y ) (converse is true when f , g is 1-1)

(7) X ⊥⊥ Y iff fX,Y (x, y) = k · f1 (x) · f2 (y), ∀ x, y ∈ R for some k > 0.

1.2.3 Results
(1) Sum Law: E(X + Y ) = E(X) + E(Y ), if all exists

(2) Product Law: X ⊥⊥ Y =⇒ E(XY ) = E(X) · E(Y )


Cov (X, Y ) = 0 ⇏ X ⊥⊥ Y

(3) X, Y identically distributed ⇏ P (X = Y ) = 1

(4) X1 , X2 , . . . , Xn are iid and continuous R.V.s =⇒ n! arrangements are equally likely
(5) X, Y iid =⇒ (X − Y ) is symmetric about ‘0’

(6) PDF of U = max{X, Y } : fU (u) = ∫_{−∞}^{u} {f (u, t) + f (t, u)} dt, where f is the joint PDF of (X, Y )

Conditional Distribution
(7) X ⊥⊥ Y =⇒ E(Y |X = x) = k, some constant ∀ x
 
(8) X ⊥⊥ (Y − ρ(σY /σX )X) =⇒ E(Y |X = x) is linear in x

(9) E(Y ) = E[E(Y |X)] or E(X) = E[E(X|Y )]


(10) V ar (Y ) = V ar {E(Y |X)} + E{V ar (Y |X)}
(11) Correlation ratio: η²_{Y X} = V ar {E(Y |X)} / V ar (Y )

(12) Wald's equation: {Xn }: sequence of iid R.V.s; N : a positive integer-valued R.V. independent of {Xn } (P (N ∈ N) = 1). Define, SN = Σ_{i=1}^{N} Xi

=⇒ E(SN ) = E(X1 ) E(N )

=⇒ V ar (SN ) = V ar (X1 ) · E(N ) + E 2 (X1 ) · V ar (N )

1.3 Generating Functions


1.3.1 Moments
 
(1) MGF: MX (t) = E(e^{tX} ), |t| < t0 , for some t0 > 0 [if E(e^{tX} ) < ∞]
It determines a distribution uniquely.

(2) µ′r : coefficient of t^r /r! in the expansion of MX (t), r = 0, 1, 2, . . .

(3) If the power series Σ_{r=0}^{∞} t^r µ′r /r! converges absolutely, then the sequence of moments {µ′r } determines a distribution uniquely. For a bounded R.V. this always holds.

(4) Xi are independent with MGF Mi (t) =⇒ MS (t) = Π_{i=1}^{n} Mi (t), where S = Σ_{i=1}^{n} Xi

(5) Bivariate MGF: MX,Y (t1 , t2 ) = E(e^{t1 X + t2 Y} ) for |ti | < hi for some hi > 0, i = 1, 2

(6) µ′r,s : coefficient of t1^r t2^s /(r! s!) in the expansion of MX,Y (t1 , t2 )

(7) Also, ∂^{r+s} /(∂t1^r ∂t2^s ) MX,Y (t1 , t2 ) |_{(t1 =0, t2 =0)} = µ′r,s

(8) Marginal MGF: MX (t) = MX,Y (t, 0) & MY (t) = MX,Y (0, t)

(9) X & Y are independent ‘iff’ MX,Y (t1 , t2 ) = MX,Y (t1 , 0) · MX,Y (0, t2 ), ∀(t1 , t2 )

1.3.2 Cumulants
(1) CGF: KX (t) = ln{MX (t)}, provided the expansion is a convergent power series.

(2) k1 = µ′1 (mean), k2 = µ2 (variance), k3 = µ3 & k4 = µ4 − 3k2²

(3) For two independent R.V. X & Y , kr (X + Y ) = kr (X) + kr (Y )



1.3.3 Characteristic Function



(1) CF: ϕX (t) = E(e^{itX} )

(2) ϕX (0) = 1, |ϕX (t)| ≤ 1

(3) ϕX (t) is continuous on R and always exists for t ∈ R

(4) ϕX (−t) = ϕ̄X (t), the complex conjugate of ϕX (t)

(5) If X has a symmetric distribution about ‘0’ then ϕX (t) is real valued and an even function of t.

(6) Uniqueness property and independence as of MGF.


(7) Inversion theorem: If ∫_{−∞}^{∞} |ϕX (t)| dt < ∞, then the pdf of the distribution is -

f (x) = (1/2π) ∫_{−∞}^{∞} e^{−itx} ϕX (t) dt

1.3.4 Probability Generating Function


(1) PGF: PX (t) = E(t^X ), for |t| ≤ 1

(2) It generates probabilities and factorial moments. It also determines a distribution uniquely.

(3) rth order factorial moment: µ[r] = d^r /dt^r PX (t) |_{t=1} , r = 0, 1, . . .

(4) X1 , X2 , . . . , Xn are independent with PGF Pi (t) =⇒ PS (t) = Π_{i=1}^{n} Pi (t), where S = Σ_{i=1}^{n} Xi

1.4 Inequalities
1.4.1 Markov & Chebyshev
(1) Markov: For a non-negative R.V. X, P (X ⩾ a) ⩽ E(X)/a, for a > 0.
‘=’ holds if X has a two-point distribution.

(2) Chebyshev: P (|X − µ| ⩾ tσ) ⩽ 1/t², t > 0, where µ = E(X) & σ² = V ar(X) < ∞.
‘=’ holds if X is such that -

f (x) = 1/(2t²) if x = µ ± tσ, and f (x) = 1 − 1/t² if x = µ   (t > 1)

(3) One-sided Chebyshev: E(X) = 0, V ar(X) = σ² < ∞

P (X ⩾ t) ⩽ σ²/(σ² + t²), if t > 0
P (X ⩾ t) ⩾ t²/(σ² + t²), if t < 0

(4) If also µ4 < ∞ then,

P (|X − µ| ⩾ tσ) ⩽ (µ4 − σ⁴)/{µ4 − σ⁴ + (t² − 1)² σ⁴}

It is an improvement over Chebyshev's inequality if t² ≥ µ4 /σ⁴

(5) Bivariate Chebyshev: (X1 , X2 ) is a bivariate R.V. with means (µ1 , µ2 ), variances (σ1², σ2²) & correlation ρ. Then for t > 0,

P (|X1 − µ1 | ⩾ tσ1 or |X2 − µ2 | ⩾ tσ2 ) ⩽ {1 + √(1 − ρ²)}/t²
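A quick Monte Carlo check of the Markov and Chebyshev bounds above. This sketch is not part of the original sheet; it assumes numpy is available and uses an exponential distribution purely as an example.

import numpy as np

# Empirical check of Markov and Chebyshev bounds for X ~ Exp(mean = 2).
rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=1_000_000)
mu, sigma = x.mean(), x.std()

a, t = 5.0, 2.0
markov_lhs = np.mean(x >= a)                      # P(X >= a)
markov_rhs = mu / a                               # E(X)/a
cheb_lhs = np.mean(np.abs(x - mu) >= t * sigma)   # P(|X - mu| >= t*sigma)
cheb_rhs = 1 / t**2

print(f"Markov   : {markov_lhs:.4f} <= {markov_rhs:.4f}")
print(f"Chebyshev: {cheb_lhs:.4f} <= {cheb_rhs:.4f}")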

1.4.2 Cauchy-Schwarz
If a bivariate R.V. (X, Y ) has finite variances and E(XY ) exists, then -

E 2 (XY ) ≤ E(X 2 )E(Y 2 )

‘=’ holds iff X & Y are linearly related passing through the origin i.e. P (X + λY = 0) = 1, for any
λ.

1.4.3 Jensen
f (·) is convex function and E(X) exists, then E [f (X)] ≥ f [E(X)]

Note: A function f (·) is said to be convex on an interval I if, for all x1 , x2 ∈ I and all λ ∈ [0, 1],
f [λx1 + (1 − λ)x2 ] ≤ λf (x1 ) + (1 − λ)f (x2 )

If f (·) is twice differentiable then f ′′ (x) ≥ 0 is the condition for convexity.

1.4.4 Lyapunov
For a R.V. X, define βr = E(|X|^r ) (assuming it exists). Then {βr^{1/r} } is non-decreasing in r, i.e. βr^{1/r} ≤ β_{r+1}^{1/(r+1)}

1.5 Theoretical Distributions


1.5.1 Discrete

U {x1 , . . . , xN }: CDF = #{i : xi ⩽ x}/N ; PMF = 1/N ; E(X) = (N + 1)/2 ; Var(X) = (N² − 1)/12 ; MGF = e^t (e^{N t} − 1)/{N (e^t − 1)}

Bernoulli (p): PMF = p^x (1 − p)^{1−x} , x = 0, 1 ; E(X) = p ; Var(X) = p(1 − p) ; MGF = (1 − p + pe^t )

Bin (n, p): CDF = I_{1−p}(n − x, x + 1)[1] ; PMF = C(n, x) p^x (1 − p)^{n−x} ; E(X) = np ; Var(X) = np(1 − p) ; MGF = (1 − p + pe^t )^n

Hyp (N, n, p): PMF = C(N p, x) C(N − N p, n − x)/C(N, n) ; E(X) = np ; Var(X) = np(1 − p) (N − n)/(N − 1) ; MGF: no simple closed form

Geo (p): CDF = 1 − (1 − p)^{x+1} ; PMF = p(1 − p)^x , x = 0, 1, 2, . . . ; E(X) = (1 − p)/p ; Var(X) = (1 − p)/p² ; MGF = p/{1 − (1 − p)e^t }

NB (n, p): CDF = I_p (n, x + 1)[1] ; PMF = C(n + x − 1, n − 1) p^n (1 − p)^x ; E(X) = n(1 − p)/p ; Var(X) = n(1 − p)/p² ; MGF = [p/{1 − (1 − p)e^t }]^n

Poisson (λ): CDF = ∫_λ^∞ e^{−t} t^x /Γ(x + 1) dt ; PMF = e^{−λ} λ^x /x! ; E(X) = λ ; Var(X) = λ ; MGF = e^{λ(e^t − 1)}

[1] I_p (k, n − k + 1) = ∫_0^p t^{k−1} (1 − t)^{n−k} /B(k, n − k + 1) dt (Incomplete Beta Function)
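A hedged cross-check of a few table entries, assuming scipy is available; this is not part of the original sheet. Note that the geometric distribution here counts failures before the first success (support 0, 1, 2, . . .), while scipy's geom counts trials, hence the loc shift.

from scipy import stats

# Binomial(n, p): mean np, variance np(1 - p)
n, p = 10, 0.3
b = stats.binom(n, p)
assert abs(b.mean() - n * p) < 1e-12
assert abs(b.var() - n * p * (1 - p)) < 1e-12

# Poisson(lam): mean = variance = lam
lam = 4.0
po = stats.poisson(lam)
assert abs(po.mean() - lam) < 1e-12 and abs(po.var() - lam) < 1e-12

# Geometric as tabulated here has support 0, 1, 2, ...; scipy's geom starts at 1,
# so shift by loc = -1 to match the parametrization of the table.
g = stats.geom(0.25, loc=-1)
print(g.mean(), (1 - 0.25) / 0.25)   # both 3.0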

Properties
Binomial
(1) Mode: [(n + 1) p] if (n + 1) p is not an integer, else {(n + 1) p − 1} and (n + 1) p.

(2) Factorial Moment: µ(r) = (n)r pr


(3) Bin (n, p) is symmetric iff p = 1/2

(4) Variance of Bin (n, p) is maximum iff p = 1/2, and the maximum variance = n/4.

(5) X, Y iid∼ Bin (n, 1/2) =⇒ P (X = Y ) = C(2n, n) (1/2)^{2n}

Geometric
(1) X : number of trials required to get the 1st success, then E(X) = 1/p and V ar(X) = (1 − p)/p²

(2) Lack of Memory: X ∼ Geo (p) ⇐⇒ P (X > m + n | X > m) = P (X ⩾ n), ∀m, n ∈ N

Negative Binomial
(1) Mode: [(n − 1)(1 − p)/p] if (n − 1)(1 − p)/p is not an integer, else {(n − 1)(1 − p)/p − 1} and (n − 1)(1 − p)/p.

(2) NB (n, p) ≡ Bin (−n, P ) where P = −(1 − p)/p

(3) Y : number of trials required to get the rth success. Then -

P (Y = y) = C(y − 1, r − 1) p^r (1 − p)^{y−r} , y = r, r + 1, . . .

Here, Y is discrete waiting time R.V. (Pascal Distribution)

(4) X ∼ Bin (n, p), Y ∼ NB (r, p) =⇒ P (X ≥ r) = P (Y ≤ n)

Poisson
(1) Mode: [λ] if λ is not an integer, else (λ − 1) and λ.

1.5.2 Continuous

U (a, b): CDF = (x − a)/(b − a) ; PDF = I{a<x<b} /(b − a) ; E(X) = (a + b)/2 ; Var(X) = (b − a)²/12 ; MGF = (e^{tb} − e^{ta} )/{t(b − a)}

Gamma (n, θ): CDF = Γ_x (n, θ)[2] ; PDF = e^{−x/θ} x^{n−1} /{θ^n Γ(n)} ; E(X) = nθ ; Var(X) = nθ² ; MGF = 1/(1 − tθ)^n

Exp (θ): CDF = 1 − e^{−x/θ} ; PDF = (1/θ) e^{−x/θ} ; E(X) = θ ; Var(X) = θ² ; MGF = 1/(1 − tθ)

Beta (m, n): CDF = I_x (m, n) ; PDF = x^{m−1} (1 − x)^{n−1} /B(m, n) ; E(X) = m/(m + n) ; Var(X) = mn/{(m + n)² (m + n + 1)} ; MGF: no simple closed form

Beta2 (m, n): PDF = x^{m−1} /{B(m, n) (1 + x)^{m+n} } ; E(X) = m/(n − 1) (n > 1) ; Var(X) = m(m + n − 1)/{(n − 2)(n − 1)²} (n > 2)

N (µ, σ²): CDF = Φ((x − µ)/σ) ; PDF = {1/(σ√(2π))} e^{−(x−µ)²/(2σ²)} ; E(X) = µ ; Var(X) = σ² ; MGF = e^{tµ + t²σ²/2}

Λ (µ, σ²) (lognormal): CDF = Φ((ln x − µ)/σ) ; PDF = {1/(xσ√(2π))} e^{−(ln x−µ)²/(2σ²)} ; E(X) = e^{µ + σ²/2} ; Var(X) = e^{2µ+σ²} (e^{σ²} − 1) ; MGF does not exist

C (µ, σ) (Cauchy): CDF = 1/2 + (1/π) tan^{−1}((x − µ)/σ) ; PDF = σ/{π(σ² + (x − µ)²)} ; mean, variance and MGF do not exist

SE (µ, σ) (shifted exponential): CDF = 1 − e^{−(x−µ)/σ} ; PDF = (1/σ) e^{−(x−µ)/σ} , x > µ ; E(X) = µ + σ ; Var(X) = σ² ; MGF = e^{tµ} /(1 − tσ)

DE (µ, σ) (double exponential): CDF = (1/2) e^{(x−µ)/σ} for x ≤ µ, 1 − (1/2) e^{−(x−µ)/σ} for x > µ ; PDF = {1/(2σ)} e^{−|x−µ|/σ} ; E(X) = µ ; Var(X) = 2σ² ; MGF = e^{tµ} /(1 − t²σ²)

Pareto (x0 , θ): CDF = 1 − (x0 /x)^θ ; PDF = θ x0^θ /x^{θ+1} , x > x0 ; E(X) = θx0 /(θ − 1) (θ > 1) ; Var(X) = θx0² /{(θ − 2)(θ − 1)²} (θ > 2)

Logistic (α, β): CDF = 1/{1 + e^{−(x−α)/β} } ; PDF = e^{−(x−α)/β} /[β {1 + e^{−(x−α)/β} }²] ; E(X) = α ; Var(X) = β²π²/3 ; MGF = πβt e^{tα} /sin(πβt)

[2] Γ_x (n, θ) = ∫_0^x e^{−t/θ} t^{n−1} /{θ^n Γ(n)} dt (Incomplete Gamma Function)

Properties
Uniform
(1) µ′r = (b^{r+1} − a^{r+1})/{(r + 1)(b − a)}

(2) X ∼ U (0, n), n ∈ N =⇒ X − [X] ∼ U (0, 1)


(3) Classical & Geometric definition of probability is based on ‘Uniform distribution’ over discrete
& continuous space, respectively.

Gamma
(1) Moments: µ′r = θ^r Γ(n + r)/Γ(n), if r > −n

(2) HM: (n − 1) θ, if n > 1


(3) Mode: Mode is at (n − 1) θ, if n > 1; 0, if n = 1 and for 0 < n < 1 no mode.

Exponential
(1) µ′r = θ^r r!

(2) ξp = −θ ln(1 − p) =⇒ Median = θ ln 2

(3) Mode is at x = 0; mean deviation about the mean: MD_θ = 2θ/e

(4) Lack of Memory: X ∼ Exp (θ) ⇐⇒ P (X > m + n | X > m) = P (X > n), ∀m, n > 0

(5) F ′(x)/{1 − F (x)} = constant ∀x > 0 ⇐⇒ X ∼ Exponential

(6) X ∼ Exp (λ) =⇒ [X] ∼ Geo (p = 1 − e^{−1/λ}) and (X − [X]) ⊥⊥ [X]

(7) Xi iid∼ DE (θ, 1) =⇒ P [X(1) ≤ θ ≤ X(n) ] = 1 − (1/2)^{n−1}

Beta
(1) µ′r = B(r + m, n)/B(m, n), if r + m > 0

(2) HM: (m − 1)/(m + n − 1), if m > 1

(3) Mode: (m − 1)/(m + n − 2), if m > 1, n > 1

(4) If m = n, median = 1/2, ∀n > 0 and mode = 1/2, if n > 1, else no mode.

(5) Beta (1, 1) ≡ U (0, 1) (identical in distribution)

Beta2

(1) µ′r = B(r + m, n − r)/B(m, n), if −m < r < n

(2) HM: (m − 1)/n, if m > 1

(3) Mode: (m − 1)/(n + 1), if m > 1; for 0 < m < 1, no mode.

Normal
(1) median = mode = µ and bell-shaped (unimodal)
(2) µ_{2r−1} = 0, µ_{2r} = (2σ²)^r Γ(r + 1/2)/√π = {(2r − 1)(2r − 3) · · · 5 · 3 · 1} σ^{2r}

(3) MD_µ = σ √(2/π)

(4) ∫ t ϕ(t) dt = −ϕ(t) + c

(5) For x > 0, (1/x − 1/x³) ϕ(x) < 1 − Φ(x) < ϕ(x)/x =⇒ 1 − Φ(x) ≃ ϕ(x)/x, for large x (x > 3)

(6) X ∼ N (0, 1) =⇒ E([X]) = −1/2 ([·]: greatest integer function)

Lognormal
(1) µ′r = e^{rµ + r²σ²/2}

(2) HM: e^{µ − σ²/2} , GM: e^µ , Median: e^µ , Mode: e^{µ − σ²}

=⇒ Mean > Median > Mode =⇒ Positively skewed

(3) Xi iid∼ Λ (µ, σ²) =⇒ GM(X) ∼ Λ (µ, σ²/n)
Cauchy
(1) µ′r exists for −1 < r < 1

(2) Median = Mode = µ

1.5.3 Multivariate
′
A ‘p’-component (dimensional) Random Vector (R.V.), X p×1 = X1 X2 · · · Xp defined on
(Ω, A) is a vector of p real-valued functions X1 (·), X2 (·), . . . , X˜p (·) defined on ‘Ω’ such that -
′
{ω : X1 (ω) ≤ x1 , X2 (ω) ≤ x2 , · · · , Xp (ω) ≤ xp } ∈ A, ∀x = x1 x2 · · · xp ∈ Rp is a random vector.
˜
 
(1) CDF: FX (x) = P {ω : X1 (ω) ≤ x1 , X2 (ω) ≤ x2 , · · · , Xp (ω) ≤ xp } , ∀x ∈ Rp
˜ ˜ ˜
(a) FX (x) is non-decreasing and right continuous w.r.t. each of x1 , x2 , . . . , xp .
˜ ˜
(b) FX (+∞, +∞, . . . , +∞) = 1, lim FX (x) = 0, ∀i = 1(1)p
˜ xi →−∞ ˜ ˜

(c) For h1 , h2 , . . . , hp > 0 -

P (x1 < X1 < x1 + h1 , x2 < X2 < x2 + h2 , . . . , xp < Xp < xp + hp ) ≥ 0


p
P p
(2) FX (xi ) − (p − 1) ≤ FX (x) ≤ p
FX1 (x1 ) FX2 (x2 ) · · · FXp (xp )
i=1 ˜ ˜ ˜

(3) The distribution of any sub-vector is a marginal distribution. There are (2p − 1) marginals.
 
p×1 X(1)
(4) Independence: X = ˜ , X(1) ⊥⊥ X(2) ⇐⇒ FX (x) = FX(1) (x(1) )·FX(2) (x(2) ), ∀x ∈ Rp
˜ X (2) ˜ ˜ ˜ ˜ ˜ ˜ ˜ ˜ ˜
˜

(5) E(a′ X) = a′ µ, V ar(a′ X) = a′ Σ a, Cov(a′ X, b′ X) = a′ Σ b for non-stochastic vectors a, b ∈ Rp


˜ ˜ ˜˜ ˜ ˜ ˜ ˜ ˜ ˜ ˜˜ ˜ ˜ ˜ ˜
′ ′
(6) E(AX) = Aµ, D(AX) = AΣA , Cov(AX, B X) = AΣB for non-stochastic matrices Aq×p , B r×p
 ˜ ˜ ˜ ˜ ˜
(7) E (X − α) A(X − α) = trace(AΣ) + (µ − α)′ A(µ − α)

˜ ˜ ˜ ˜ ˜ ˜ ˜ ˜
(8) A matrix Σ = (σij ) is a dispersion matrix if and only if it is n.n.d.
(9) Generalized variance: det(Σ), where Σ = E (X − µ)(X − µ)′ = E(X X ′ )−µµ′ = D (X p×1 )

˜ ˜ ˜ ˜ ˜ ˜ ˜˜ ˜
(10) Σ is p.d. iff there is no a ̸= 0 for which P (a′ X = c) = 1
Σ is p.s.d. iff there is a ˜vector
˜ a ̸= 0 for which
˜ ˜ P a′ (X − µ) = 0 = 1
˜ ˜ ˜ ˜ ˜
(11) det(Σ) > 0 =⇒ Non-singular, det(Σ) = 0 =⇒ Singular Distribution
(12) Σ = BB ′ for any dispersion matrix Σ, where B is n.n.d.
(13) Σ is p.d. =⇒ Σ = BB ′ , B is non-singular and let, Y = B −1 (X−µ) =⇒ E(Y ) = 0, D(Y ) = Ip
˜ ˜ ˜ ˜ ˜ ˜
ρ12 − ρ23 ρ31
(14) ρ12· 3 = p p
1 − ρ213 1 − ρ223

Multinomial
PMF: (X1 , X2 , . . . , Xk ) ∼ Multinomial (n; p1 , p2 , . . . , pk )
(a) Singular -
k k

n! X X
px1 pxk · · · pxkk

 , if xi = n, pi = 1
fX (x1 , x2 , . . . , xk ) = x1 ! x2 ! · · · xk ! 1 k i=1 i=1
˜ 

0 , otherwise
k−1 k−1
 k−1 k−1

P P P P
(b) Non-singular - xi ≤ n, pi < 1 xk = n − xi , p k = 1 − pi
i=1 i=1 i=1 i=1
Properties
(
npi (1 − pi ) , if i = j
(1) E(Xi ) = npi , Cov(Xi , Xj ) = , i, j = 1, 2, . . . , k − 1
−npi pj ̸ j
, if i =
r
pi pj  k
P 
(2) ρij = ρ(Xi , Xj ) = − , i ̸= j As Xi = n, Xi ↑ =⇒ Xj ↓ on an average
(1 − pi )(1 − pj ) i=1

D = diag (p1 , p2 , . . . , pk−1 )


(3) det(Σ) = nk−1 det(D − P P ′ ) = nk−1 det(D)(1 − P ′ D−1 P )
˜˜ ˜ ˜ P = (p1 , p2 , . . . , pk−1 )′
˜
k−1
(4) X k−1×1 ∼ Multinomial (n; p1 , . . . , pk−1 ),
P
pi < 1
˜ i=1
 
k−1
 X p1 
=⇒ X1 | (X2 = x2 , . . . , Xk−1 = xk−1 ) ∼ Bin n − xi ,
 
k−1

 P 
i=2 1− pi
i=2

=⇒ the regression of X1 on X2 , X3 , . . . , Xk−1 is linear and the distribution is heteroscedastic.


 k−1
n
t′ X

pi (eti − 1)
P
(5) MGF: E e˜ ˜ = 1+
i=1
Multiple Correlation
(6) For singular case, ρ1· 23···k = 1
k−1
P
p1 · pi
i=2
(7) For non-singular case, ρ21· 23···k−1 =  k−1

P
(1 − p1 ) 1 − pi
i=2

p1 p2
ρ12· 34···k−1 = −p p
(1 − p2 − p3 − · · · − pk−1 ) (1 − p1 − p3 − · · · − pk−1 )

Bivariate Normal
(X, Y ) ∼ BN(µ1 , µ2 , σ1 , σ2 , ρ)
 
(1) X ∼ N (µ1 , σ12 ), Y | X = x ∼ N µ2 + ρ σσ22 (x − µ1 ), σ22 (1 − ρ2 )
 2     2 
1 X−µ1 X−µ1 Y −µ2 Y −µ2
(2) Q(X, Y ) = 1−ρ2 σ1
− 2ρ σ1 σ2
+ σ2 = U 2 + V 2 ∼ χ22

Y − µ2 − ρ σσ21 (X − µ1 ) X − µ1 iid
where, U = p , V = ∼ N (0, 1)
σ2 1 − ρ2 σ1
(3) (X, Y ) is independent ⇐⇒ ρ = 0
(4) (X, Y ) ∼ BN(0, 0, 1, 1, ρ)
 q 
X+Y
(a) (X + Y ) ⊥⊥ (X − Y ) =⇒ X−Y ∼ C 0, 1+ρ 1−ρ
q  q 
(b) E{max(X, Y )} = 1−ρ π
, PDF: f U (u) = 2ϕ(u)Φ u 1−ρ
1+ρ

(c) ρ(X 2 , Y 2 ) = ρ2
(
−1, X < 0
(5) (X, Y ) ∼ BN(0, 0, 1, 1, 0), Y1 = X1 sgn(X2 ), Y2 = X2 sgn(X1 ), where sgn(X) =
1, X>0
2
=⇒ (Y1 , Y2 ) ≁ BN, ρ(Y1 , Y2 ) = π

1.5.4 Truncated Distribution


Univariate
F (x) be the CDF of X over the sample space X. Let, A = (a, b] ⊂ X, then the CDF of X over
truncated space A is -


 0 ,x ≤ a

 F (x) − F (a)
G(x) = P (X ≤ x | X ∈ A) = ,a < x ≤ b

 F (b) − F (a)

1 ,x > b

f (x)
PMF/PDF: P (X∈A)
, x∈A

Results
ˆ E(X) = E(X|A) · P (X ∈ A) + E(X|Ac ) · P (X ∈ Ac )

ˆ X ∼ Geo (p) =⇒ (X − k) | X ≥ k ∼ Geo (p)

ˆ Truncated Normal distribution is platykurtic.

ˆ Truncated Cauchy distribution has finite moments.

Bivariate
(X, Y ): bivariate R.V. with PDF, f (x, y) over the sample space, X ⊆ R2 . Let, A ⊂ X, then the PDF
over the truncated space is -

f (x, y)
g(x, y) =   , if (x, y) ∈ A
P (X, Y ) ∈ A

• µ′r,s (A) = E(X r Y s |A) = xr y s  f (x,y)  dx dy


RR
A P (X,Y )∈A

1.6 Sampling Distributions


1.6.1 Chi-square, t, F
χ2n
(1) E(X) = n, Var (X) = 2n
Γ( n2 + r)
(2) µ′r =2 ·r
n , if r > − n2
Γ( 2 )
D n

(3) χ2n ≡ Gamma 2
,2 , n∈N

tn
n
(1) E(X) = 0 (n > 1), Var (X) = n−2
(n > 2)

Γ( 21 + r) Γ( n2 − r)
(2) µ′2r = nr · √ · , if −1 < 2r < n
π Γ( n2 )
D
(3) t1 ≡ C (0, 1)
D
(4) t2n ≡ F1,n

Fn1 ,n2
n2 2n22 (n1 + n2 − 2)
(1) E(X) = (if n2 > 2), Var (X) = (if n2 > 4)
n2 − 2 n1 (n2 − 2)2 (n2 − 4)
n2 (n1 − 2)
(2) Mode: (if n1 > 2) =⇒ Mean > 1 > Mode (if n1 , n2 > 2)
n1 (n2 + 2)
r
Γ( n21 + r) Γ( n22 − r)

n2
(3) µ′r = · · , if −n1 < 2r < n2
n1 Γ( n21 ) Γ( n22 )

(4) ξp and ξp′ are pth quantile of Fn1 ,n2 and Fn2 ,n1 respectively =⇒ ξp ξ1−p =1
1 D
(5) F ∼ Fn,n =⇒ F ≡ and median (F ) = 1
F
n1 n n 
1 2
(6) F ∼ Fn1 ,n2 =⇒ F ∼ Beta2 ,
n2 2 2
(7) Points of inflexion are equidistant from mode (if n > 4)

1.6.2 Order Statistics


Order Statistics: X(1) ≤ X(2) ≤ · · · ≤ X(n)
n
n
{F (x)}k {1 − F (x)}n−k = IF (x) (r, n − r + 1)
P 
(1) FX(r) (x) = k
k=r

(2) FX(1) ,X(n) (x1 , x2 ) = {F (x2 )}n − {F (x2 ) − F (x1 )}n , x1 < x2

Only for Absolutely Continuous Random Variable - CDF: F (x), PDF: f (x)
(3) fX(r) (x) = n!
(r−1)!(n−r)!
{F (x)}r−1 f (x) {1 − F (x)}n−r , x ∈ R

x<y
(4) Joint PDF: n!
(r−1)!(s−r−1)!(n−r)!
{F (x)}r−1 f (x) {F (y) − F (x)}s−r−1 f (y) {1 − F (y)}n−s ,
(r < s)
R∞
(5) Sample Range: fR (r) = n(n − 1) {F (r + s) − F (s)}n−2 f (r + s)f (s) ds, 0 < r < ∞
−∞

Results
(6) Xi iid∼ U(0, 1) =⇒ X(r) ∼ Beta (r, n − r + 1), r = 1, 2, . . . , n (see the simulation sketch at the end of this subsection)
iid   σ   σ
(7) X1 , X2 ∼ N (µ, σ 2 ) =⇒ E X(1) = µ − √ , E X(2) = µ + √
π π
iid 1
(8) X1 , X2 , X3 ∼ N (µ, σ 2 ) =⇒ Sample Range: 2
(|X1 − X2 | + |X2 − X3 | + |X3 − X1 |)
iid
(9) Xi ∼ Exp (θ) =⇒ E[X(n) ] = θ 1 + 21 + 31 · · · + 1

n

iid ⊥
⊥ θ

(10) Xi ∼ Exp (θ) =⇒ Ui = X(i) − X(i−1) ∼ Exp n−i+1
, X(0) = 0

(11) X1 , X2 , . . . , X2k+1 : random sample from a continuous distribution, symmetric about µ


=⇒ Distribution of X̃ is also symmetric about µ
 
X(1) +X(n)
=⇒ E(X̃) = µ, E 2
= µ (if exists)

(12) Xi iid∼ Shifted Exp (θ, 1) =⇒ (n − i + 1){X(i) − X(i−1) } iid∼ Exp (1)

=⇒ 2n{X(1) − θ} ∼ χ²_2 ⊥⊥ 2 Σ_{i=2}^{n} {X(i) − X(1) } ∼ χ²_{2n−2}
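A Monte Carlo check of result (6) above; this sketch is not part of the original sheet and assumes numpy is available. The r-th order statistic of U(0, 1) samples should match the Beta(r, n − r + 1) mean and variance.

import numpy as np

# Check: for X1,...,Xn iid U(0,1), X_(r) ~ Beta(r, n - r + 1).
rng = np.random.default_rng(1)
n, r, reps = 10, 3, 200_000

samples = rng.uniform(size=(reps, n))
x_r = np.sort(samples, axis=1)[:, r - 1]        # r-th order statistic of each sample

beta_mean = r / (n + 1)                          # mean of Beta(r, n-r+1)
beta_var = r * (n - r + 1) / ((n + 1) ** 2 * (n + 2))
print(x_r.mean(), beta_mean)                     # ~0.2727 vs 3/11
print(x_r.var(), beta_var)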

1.7 Distribution Relationships


1.7.1 Binomial
n
iid P
(1) Xi ∼ Bernoulli (p) =⇒ Xi ∼ Bin (n, p)
i=1

k
 k


⊥ P P
(2) Xi ∼ Bin (ni , p) =⇒ Xi ∼ Bin ni , p
i=1 i=1

iid
(3) X1 , X2 ∼ Bin (n, 12 ) =⇒ X1 − X2 is symmetric about ‘0’.
m
 k

⊥⊥ P P nk
(4) Xi ∼ Bin (ni , p) =⇒ Xk Xi = t ∼ Hyp N = ni , t, , k = 1(1)m
i=1 i=1 N
(5) Bin (n, p) → Poisson (λ = np), for n → ∞ and p → 0 such that np is finite.

(6) Bin (n, p) → N np, np(1 − p) , for large n and moderate p.

1.7.2 Negative Binomial


iid
(1) Xi ∼ Geo (p) =⇒ X(1) ∼ Geo (1 − q n ), where q = 1 − p
iid
(2) X, Y ∼ Geo (p) ⇐⇒ X X + Y = t ∼ U {0, 1, 2, . . . , t}
iid
(3) X, Y ∼ Geo (p) =⇒ min{X, Y } ⊥⊥ (X − Y )
n
iid P
(4) Xi ∼ Geo (p) =⇒ Xi ∼ NB (n, p)
i=1

k
 k


⊥ P P
(5) Xi ∼ NB (ni , p) =⇒ Xi ∼ NB ni , p
i=1 i=1

(6) NB (n, p) → Poisson (λ = n(1 − p)), for n → ∞ and p → 1 such that n(1 − p) is finite.

1.7.3 Poisson
k
 k


⊥ P P
(1) Xi ∼ Poisson (λi ) =⇒ Xi ∼ Poisson λi
i=1 i=1

m
  m

⊥ P λk P
(2) Xi ∼ Poisson (λi ) =⇒ Xk Xi = t ∼ Bin t, p = , k = 1(1)m, where λ = λi
i=1 λ i=1

k

⊥ P
(3) Xi ∼ Poisson (λi ) =⇒ (X1 , X2 , . . . , Xk ) Xi = t ∼ Multinomial (t, p1 , p2 , . . . , pk ), where
i=1
λi
pi = k
, i = 1, 2, . . . , k
P
λi
i=1

1.7.4 Normal
D
(1) X ∼ N (0, σ 2 ) =⇒ X ≡ −X
n
 n n



N (µi , σi2 ) a2i σi2
P P P
(2) Xi ∼ =⇒ ai X i ∼ N ai µ i ,
i=1 i=1 i=1
n n
iid
(3) Xi ∼ N (µ, σ 2 ),
P P
ai Xi ⊥⊥
bi Xi ⇐⇒ a.b = 0
i=1 ˜˜
i=1
X̄ and (X1 − X̄, X2 − X̄, . . . , Xn − X̄) are independently distributed

1.7.5 Gamma
iid θ

(1) Xi ∼ Exp (θ) =⇒ X(1) ∼ Exp n
iid
(2) X, Y ∼ Exp (θ) =⇒ X X + Y = t ∼ U(0, t)
(3) X ∼ Shifted Exp (µ, θ) =⇒ (X − µ) ∼ Exp (θ)
X−µ
(4) X ∼ DE (µ, σ) =⇒ σ
∼ Exp (θ = 1) and |X| ∼ Shifted Exp (µ, σ)
n
iid P
(5) Xi ∼ Exp (θ) ≡ Gamma (n = 1, θ) =⇒ Xi ∼ Gamma (n, θ)
i=1
n
 k


⊥ P P
(6) Xi ∼ Gamma (ni , θ) =⇒ Xi ∼ Gamma ni , θ
i=1 i=1

1.7.6 Beta
X
(1) X ∼ Beta (m, n) =⇒ 1−X
∼ Beta2 (m, n)
X
(2) X ∼ Beta2 (m, n) =⇒ 1+X
∼ Beta (m, n)

(3) X1 ∼ Beta (n1 , n2 ) & X2 ∼ Beta (n1 + 21 , n2 ), independently =⇒ X1 X2 ∼ Beta (2n1 , 2n2 )

1.7.7 Cauchy
iid
(1) Xi ∼ C (µ, σ) =⇒ X̄n ∼ C(µ, σ)
iid iid
(2) Xi ∼ C (0, σ) =⇒ X1i ∼ C 0, σ1 =⇒ HMX ∼ C(0, σ)

˜
n
n n

⊥⊥ P P P
(3) Xi ∼ C (µi , σi ) =⇒ Xi ∼ C µi , σi
i=1 i=1 i=1

1.7.8 Others
n
iid
Xi2 ∼ χ2n
P
(1) Xi ∼ N (0, 1) =⇒
i=1

(2) U ∼ N (0, 1), V ∼ χ2n , independently =⇒ √U ∼ tn


V /n


⊥ U1 /n1
(3) Ui ∼ χ2ni =⇒ U2 /n2
∼ Fn1 ,n2
D
(4) X is symmetric about ‘0’ =⇒ X ≡ −X

1.8 Transformations
1.8.1 Orthogonal
y = T (x) = An×n xn×1 → Linear Transformation. [If det(A) ̸= 0, Jacobian: J = det(A−1 )].
˜ ˜ ˜
(1) If T (x) is orthogonal transformation then AT A = In =⇒ det(A) = ±1 & |J| = 1
˜
(2) y y = xT x =⇒ |y|2 = |x|2 (length is preserved)
T
˜ ˜ ˜ ˜ ˜ ˜
n
iid
Xi2 = X T A1 X + X T A2 X, where A1 , A2 are n.n.d.
P
(3) Cochran’s theorem: Xi ∼ N (0, 1) &
i=1 ˜ ˜ ˜ ˜
matrices with ranks r1 , r2 , r1 + r2 = n

=⇒ X T A1 X ∼ χ2r1 and X T A2 X ∼ χ2r2 , independently.


˜ ˜ ˜ ˜

1.8.2 Polar
(1) For a point with Cartesian coordinates (x1 , x2 , . . . , xn ) in Rn -

x1 = r cos θ1

x2 = r sin θ1 cos θ2
x3 = r sin θ1 sin θ2 cos θ3
..
.
xn−1 = r sin θ1 · · · sin θn−2 cos θn−1
xn = r sin θ1 · · · sin θn−2 sin θn−1
n
where, r2 = x2i ,
P
0 < r < ∞ and 0 < θ1 , θ2 , . . . , θn−2 < π, 0 < θn−1 < 2π
i=1

Jacobian: |J| = rn−1 (sin θ1 )n−2 (sin θ2 )n−3 · · · sin θn−2

(2) X = R cos θ, Y = R sin θ, 0 < R < ∞, 0 < θ < 2π


iid θ ∼ U (0, 2π)
X, Y ∼ N (0, 1) ⇐⇒ D , independently.
R2 ∼ Exp (2) ≡ χ22

(3) θ ∼ U(0, 2π) ⊥⊥ R2 ∼ χ22 =⇒ R sin(θ + θ0 ) ∼ N (0, 1), θ0 is a fixed quantity

1.8.3 Special Transformations


X−a

(1) X ∼ U(a, b) =⇒ − ln b−a
∼ Exp (1)
iid
(2) X1 , X2 ∼ U(0, 1) =⇒ X1 + X2 ∼ Triangular (0, 2), |X1 − X2 | ∼ Beta (1, 2)

(3) Box-Muller Transformation (see the sketch at the end of this list):

X1 , X2 iid∼ U (0, 1) =⇒ Y1 = √(−2 ln X1 ) cos(2πX2 ), Y2 = √(−2 ln X1 ) sin(2πX2 ) iid∼ N (0, 1)

(4) X ∼ Gamma (n1 , θ), Y ∼ Gamma (n2 , θ)


X
=⇒ X + Y ∼ Gamma (n1 + n2 , θ), X+Y
∼ Beta (n1 , n2 ), independently
X
=⇒ X + Y ∼ Gamma (n1 + n2 , θ), Y
∼ Beta2 (n1 , n2 ), independently
iid X X
(5) X, Y ∼ N (0, 1) =⇒ ,
Y |Y |
∼ C(0, 1)
iid 1
=⇒ (X1 − X2 ), RX1 − (1 − R)X2 ∼ DE 0, 1θ
 
(6) X1 , X2 ∼ Exp (θ), R ∼ Bernoulli 2

(7) X ∼ Beta (a, b) ⊥⊥ Y ∼ Beta (a + b, c) =⇒ XY ∼ Beta (a, b + c)

(8) X ∼ U − π2 , π2 ⇐⇒ tan X ∼ C(0, 1)




iid
(9) Dirichlet Transformation: Xi ∼ Exp (θ)
n−k+1
P
n
Xi
P i=1
=⇒ Y1 = Xi ∼ Gamma (n, θ), Yk = n−k+2
∼ Beta (n − k + 1, 1), k = 2, 3, . . . , n
i=1 P
Xi
i=1

Y1 , Y2 , . . . , Yn are independently distributed.


iid
(10) X1 , X2 , X3 , X4 ∼ N (0, 1) =⇒ X1 X2 ± X3 X4 ∼ DE (0, 1) (valid for any combination)
n
iid 2
Xi ∼ χ22n
P
(11) Xi ∼ Exp (θ) =⇒ θ
i=1

(12) X ∼ Beta (θ, 1) =⇒ − ln X ∼ Exp 1θ




 
(13) X ∼ Pareto (θ, x0 ) =⇒ ln xX0 ∼ Exp 1

θ

(14) X1 , X2 iid∼ χ²_2 =⇒ (aX1 + bX2 )/(X1 + X2 ) ∼ U(a, b) (a < b)
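A minimal sketch of the Box-Muller transformation from item (3) of this list; not part of the original sheet and it assumes numpy is available.

import numpy as np

def box_muller(n, rng=None):
    """Generate n pairs of independent N(0,1) variates via the Box-Muller transform."""
    rng = rng or np.random.default_rng()
    x1 = rng.uniform(size=n)
    x2 = rng.uniform(size=n)
    y1 = np.sqrt(-2 * np.log(x1)) * np.cos(2 * np.pi * x2)
    y2 = np.sqrt(-2 * np.log(x1)) * np.sin(2 * np.pi * x2)
    return y1, y2   # two independent N(0,1) samples of size n

y1, y2 = box_muller(100_000, np.random.default_rng(2))
print(y1.mean(), y1.std(), np.corrcoef(y1, y2)[0, 1])   # ~0, ~1, ~0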
Chapter 2

Statistics

2.1 Point Estimation


2.1.1 Minimum MSE
(1) Measures of Closeness: T : Statistic/Estimator, ψ(θ): Parametric function
Destroying the randomness, general measures of closeness are -

(a) E|T − θ|r , for some r > 0 (smaller value is better)


 
(b) P |T − θ| < ϵ , for ϵ > 0 (higher value is better)

(2) Mean Square Error: MSE_{ψ(θ)}(T ) = E[T − ψ(θ)]² = V ar(T ) + b²(ψ(θ), T ), where b(ψ(θ), T ) is the bias.
T can be called a ‘good estimator’ of ψ(θ) if it has a small MSE.

(3) E(m′r ) = µ′r , if µ′r exists =⇒ E(X̄) = µ, provided µ = E(X1 ) exists


 n

2 1 2
= σ 2 , population variance (if exists) but E(m2 ) ̸= σ 2 = µ2
P
(4) E s = n−1
(Xi − X̄)
i=1

 n
  rn 
iid 2
pπ 1
P 2
P 2
(5) Xi ∼ N (0, σ ) =⇒ E T1 = 2
· n
|Xi | = σ = E T2 = Cn Xi .
i=1 i=1
2
n n Γ( n2 )
Xi2 ,
P P
Here, T1 , T2 are two UEs of σ based on |Xi | and respectively. (Cn = √ )
i=1 i=1 2 Γ( n+1
2
)

iid n−1 1

(6) Xi ∼ Exp (θ) =⇒ E(X̄) = θ, E nX̄
= θ

iid
(7) Xi ∼ N (µ, σ 2 ) =⇒ T ′ = n−1
n+1
· s2 has the smallest MSE in the class {bs2 : b > 0} i.e. a biased

estimator T is better than an UE s2 , in terms of MSE.

(8) X ∼ Poisson (λ) =⇒ T (X) = (−1)^X is the UMVUE of e^{−2λ}, which is an absurd UE.
Note: an absurd unbiased estimator is an unbiased estimator that can take values outside the parameter space.


iid
(9) Xi ∼ Poisson (λ) =⇒ Tα = αX̄ + (1 − α)s2 is an UE of λ for any α ∈ [0, 1]
=⇒ There may be infinitely many UEs

(10) Estimable Parametric Functions

(a) X ∼ Bin (n, p) =⇒ E[(X)r ] = (n)r pr , r = 1, 2, . . . , n


=⇒ Only polynomials of degree ≤ n are estimable.
(b) X ∼ Bernoulli (p) =⇒ Only ψ(p) = a + bp is estimable.

(c) X ∼ Poisson (λ) =⇒ E[(X)r ] = λr , r = 1, 2, . . . =⇒ e−λ is estimable but not λ1 , λ.

1
(11) X ∼ Bernoulli (θ), T1 (X) = X and T2 (X) = 2
=⇒ Between T1 and T2 none are uniformly better than the other, in terms of MSE.

iid    
(12) Xi ∼ f (x; θ), E T (X1 ) = θ, V ar T (X1 ) < ∞
=⇒ lim V ar(Sn ) = 0, where Sn is the UMVUE of θ
n→∞

(13) Best Linear Unbiased Estimator (BLUE)


T1 , T2 , . . . , Tk be UEs of ψ(θ) with known variances v1 , v2 , . . . , vk and are independent
k
1 X Ti
=⇒ BLUE of ψ(θ) : T = k
P 1 v
i=1 i
vi
i=1

2.1.2 Consistency
 
P |Tn − θ| < ϵ → 1
Tn is consistent for θ ⇐⇒  or  as n → ∞, ∀θ ∈ Ω for every ϵ > 0
P |Tn − θ| > ϵ → 0

(1) Sufficient Condition


P
E(Tn − θ)2 → 0 ⇐⇒ E(Tn ) → θ, V ar(Tn ) → 0 as n → ∞ =⇒ Tn −→ θ

n
P
(2) m′r = 1
xri −→ µ′r = E(X1r ), r = 1, 2, . . . , k (if k th order moment exists)
P
n
i=1

P
(3) If Tn −→ θ then -
P
(a) bn Tn −→ θ, if bn → 1 as n → ∞
P
(b) an + Tn −→ θ, if an → 0 as n → ∞

This also shows that, ‘unbiasedness’ and ‘consistency’ are not interrelated.

P P
(4) Invariance Property: Tn −→ θ =⇒ ψ(Tn ) −→ ψ(θ), provided ψ(·) is continuous

2.1.3 Sufficiency
S is sufficient for θ ⇐⇒ (X1 , X2 , . . . , Xn ) | S = s is independent of θ, ∀ s
S is sufficient for θ ⇐⇒ T | S = s is independent of θ, ∀ s, for all statistic T .

(1) Any one-to-one function of a sufficient statistic is also sufficient for a parameter.

(2) Factorization Theorem


n
Y 
f (xi ; θ) = g T (x); θ · h(x) ⇐⇒ T (X) is sufficient for θ
i=1
˜ ˜ ˜

where, g T (x); θ depends on θ and on x only through T (x) and h(x) is independent of θ.
˜ ˜ ˜ ˜
(3) Trivial Sufficient Statistic: (X1 , X2 , . . . , Xn ) and (X(1) , X(2) , . . . , X(n) ).
Sufficiency means “space reduction without losing any information”. In this aspect, the
order statistics, (X(1) , X(2) , . . . , X(n) ) is better as a sufficient statistic than the whole sample
i.e. (X1 , X2 , . . . , Xn ), with respect to data summarization.

(4) T1 , T2 are two sufficient statistic for θ =⇒ they are related


n
iid P
(5) Xi ∼ DE (µ, σ), ∃ non-trivial sufficient statistic if µ is known (say, µ0 ) and that is |Xi − µ0 |.
i=1

Minimal Sufficient Statistic


(6) T0 is a minimal sufficient of θ if,

(a) T0 is sufficient
(b) T0 is a function of every sufficient statistic
(7) Theorem: If, for every two sample points x and y, the ratio f (x; θ)/f (y; θ) is independent of θ if and only if T (x) = T (y), then T (X) is minimal sufficient for θ.
˜ ˜ ˜

2.1.4 Completeness
T is complete for θ ⇐⇒ “E[h(T )] = 0, ∀θ ∈ Ω =⇒ P [h(T ) = 0] = 1, ∀θ ∈ Ω”

Remark
If a two component statistic (T1 , T2 ) is minimal sufficient for a single component parameter θ, then
in general (T1 , T2 ) is not complete.
It is possible to find h1 (T1 ) and h2 (T2 ) such that,

E[h1 (T1 )] = ψ(θ) = E[h2 (T2 )], ∀θ

=⇒ E[h(T1 , T2 )] = 0, ∀θ where, h(T1 , T2 ) = h1 (T1 ) − h2 (T2 ) ̸= 0


=⇒ (T1 , T2 ) is not complete.

2.1.5 Exponential Family


One Parameter
An one parameter family of PDFs or PMFs, {f (x; θ) : θ ∈ Ω} that can be expressed in the form -
 
f (x; θ) = exp T (x)u(θ) + v(θ) + w(x) , x ∈ S

with the following regularity conditions -


C1 : The support, S = {x : f (x; θ) > 0} is independent of θ
C2 : The parameter space, Ω is an open interval in R i.e. Ω = {θ : a < θ < b}
C3 : {1, T (x)} and {1, u(θ)} are linearly independent i.e. T (x) and u(θ) are non-constant functions
is called One Parameter Exponential Family (OPEF)

K Parameter
A K-parameter family of PDFs or PMFs, {f (x; θ) : θ ∈ Ω ⊆ Rk } satisfying the form -
" k ˜ ˜ #
X
f (x; θ) = exp Ti (x)ui (θ) + v(θ) + w(x) , x ∈ S
˜ i=1
˜ ˜

with the following regularity conditions -


C1 : The support, S = {x : f (x; θ) > 0} is independent of θ
˜ ˜
C2 : The parameter space, Ω ⊆ Rk is an open rectangle in Rk i.e. ai < θi < bi , i = 1(1)k
C3 : {1, T1 (x), . . . , Tk (x)} and {1, u1 (θ), . . . , uk (θ)} are linearly independent
˜ ˜
is called K-parameter Exponential Family

Theorem
n
iid P
(a) X ∼ f (x; θ) ∈ OPEF =⇒ T (Xi ) is complete and sufficient for the family.
˜ i=1
 n n

iid P P
(b) X ∼ f (x; θ) ∈ K-parameter Exponential Family =⇒ T1 (Xi ), . . . , Tk (Xi ) is complete
˜ ˜ i=1 i=1
and sufficient for the family.

Distributions in Exponential Family



a(x) θx
a(x) θx
P
(a) f (x; θ) = g(θ)
, x = 0, 1, 2, . . . ; 0 < θ < ρ, a(x) ≥ 0, g(θ) = (Power Series)
x=0
=⇒ Binomial (n known), Poisson, Negative Binomial (n known) are in OPEF.
(b) Normal, Exponential, Gamma, Beta, Pareto (x0 known) are in the Exponential family.
(c) Uniform, Cauchy, Laplace, Shifted Exponential, {N (θ, θ2 ) : θ ̸= 0}, {N (θ, θ) : θ > 0} are not
in the Exponential Family.
The last two families are identified by Lehmann as ‘Curved Exponential Family’.

2.1.6 Methods of finding UMVUE


Theorem 2.1.6.1 (Necessary & Sufficient n Condition for UMVUE) Let X has a distribution
o
   
from {f (x; θ) : θ ∈ Ω}. Define, Uψ = T (X) : Eθ T (X) = ψ(θ), V arθ T (X) < ∞, ∀θ ∈ Ω and
n     o
U0 = u(X) : Eθ T (X) = 0, V arθ u(X) < ∞, ∀θ ∈ Ω . Then, T0 ∈ Uψ is UMVUE of θ if and
only if Covθ (T0 , u) = 0, ∀θ ∈ Ω, ∀u ∈ U0

Results
ˆ UMVUE if exists, is unique
k k
ˆ Ti is UMVUE of ψ(θ) =⇒
P P
ai Ti is UMVUE of ai ψi (θ)
i=1 i=1

ˆ T is UMVUE =⇒ T k is UMVUE =⇒ any polynomial function, f (T ) is UMVUE of their


expectations

Theorem 2.1.6.2 (Rao-Blackwell) Let X has a distribution from {f (x; θ) : θ ∈ Ω} and h be a


statistic from Uψ = {h : E(h) = ψ(θ), V ar(h) < ∞, ∀θ ∈ Ω}. Let, T be a sufficient statistic for θ.
Then-

(a) E(h | T ) is an UE of ψ(θ)


 
(b) V ar E(h | T ) ≤ V ar(h), ∀θ ∈ Ω

Implication: UMVUE is necessarily a function of minimal sufficient statistic

Theorem 2.1.6.3 (Lehmann-Scheffe) Let X has a distribution from {f (x; θ) : θ ∈ Ω} and T be


a complete sufficient statistic for θ. Then-
 
(a) If E h(T ) = ψ(θ), then UMVUE of ψ(θ) is the unique UE, h(T )

(b) If h∗ is an UE of ψ(θ), then E(h∗ | T ) is the UMVUE of ψ(θ)

UMVUE of Different Families


Binomial
n
iid P
Xi ∼ Bernoulli (p) =⇒ Complete Sufficient: T = Xi
i=1

T
(1) p = E(X1 ) : n

T (n−T )
(2) p(1 − p) = V ar(X1 ) : n(n−1)

(T )r T (T −1)···(T −r+1)
(3) pr : (n)r
= n(n−1)···(n−r+1)
, r = 1, 2, . . . , n

n
iid P
Xi ∼ Bin (n, p) =⇒ Complete Sufficient: T = Xi
i=1
X̄n T
→ p: = 2
n n

Poisson
n
iid P
Xi ∼ Poisson (λ) =⇒ Complete Sufficient: T = Xi
i=1

(T )r
(1) λr : nr
, r = 1, 2, . . .
k T (n−1)T −k
(2) e−k λk! = P (X1 = k) :

k nT

k T
(3) e−kλ = P (X1 = 0, X2 = 0, . . . , Xk = 0) : 1 −

n
, 1≤k<n

Geometric
n
iid P
Xi ∼ Geo (p) =⇒ Complete Sufficient: T = Xi
i=1
n−1
→ p = P (X1 = 0) : n−1+T

Uniform
iid
(1) Discrete: Xi ∼ U{1, 2, . . . , N } =⇒ Complete Sufficient: T = X(n)
T n+1 −(T −1)n+1
→N : T n −(T −1)n

iid
(2) Continuous: Xi ∼ U(0, θ) =⇒ Complete Sufficient: T = X(n)
n ′ o
→ ψ(θ) : T ψn(T ) + ψ(T ) ψ(θ) = θr : n+r
  r
n
T
iid 
Also if, Xi ∼ U(θ1 , θ2 ) =⇒ Complete Sufficient: T = X(1) , X(n)
nX(1) −X(n) nX(n) −X(1)
→ θ1 : n−1
θ2 : n−1

Gamma
n
iid P
Xi ∼ Exp (θ) =⇒ Complete Sufficient: T = Xi
i=1

T
(1) θ = E(X1 ) : n
1 n−1
(2) θ
: T
k k n−1
(3) P (X1 > k) = e− θ : 1 −

T
, if k < T
k (n−1) k n−2
(4) f (k; θ) = 1θ e− θ :

T
1− T
, if k < T
 n n

iid P P
Xi ∼ Gamma (p, θ) =⇒ Complete Sufficient: ln Xi , Xi (mean = p θ)
i=1 i=1

Γ(np)
For known p, θr : T r , r > −np
Γ(np + r)
iid σ0
Xi ∼ Shifted Exp(θ, σ0 ) =⇒ Complete Sufficient: X(1) → θ : X(1) −
n
n Γ(n)
iid
|Xi − µ0 | → σ r : T r , r > −n
P
Xi ∼ DE(µ0 , σ) =⇒ Complete Sufficient: T =
i=1 Γ(n + r)

Beta
 n n

iid P P
Xi ∼ Beta (θ1 , θ2 ) =⇒ Complete Sufficient: ln Xi , ln(1 − Xi )
i=1 i=1
n−1
For θ2 = 1, UMVUE of θ1 is n
P
− ln Xi
i=1

Normal
iid
n σ02
Xi ∼ N (µ, σ02 ) =⇒ Complete Sufficient: µ2 : X̄ 2 −
P
Xi or X̄ → µ : X̄
i=1 n
n
iid
Xi ∼ N (µ0 , σ 2 ) =⇒ Complete Sufficient: (Xi − µ0 )2 or S02 → σ r : S0r Kn,r , r > −n
P
i=1

 n n

iid 
Xi ∼ N (µ, σ 2 ) =⇒ Complete Sufficient: Xi2 or X̄, S 2
P P
Xi ,
i=1 i=1

(1) µ = E(X1 ) : X̄
" r
#
(n − 1) 2 Γ( n−1
2
)
(2) σ 2 : S 2 , σ r : S r Kn−1,r , r > −(n − 1) Kn−1,r = r n−1+r
2 2 Γ( 2 )
µ
(3) : X̄ · Kn−1,−r S −r , r < (n − 1)
σr
(4) pth quantile of X1 = ξp = µ + σ Φ−1 (p) : X̄ + Kn−1,1 S Φ−1 (p)
h i
iid
Xi ∼ N (θ, 1) ϕ(x; µ, σ 2 ) : PDF of N (µ, σ 2 )
 r 
n
(1) Φ(k − θ) = P (X1 ≤ k) : Φ (k − X̄)
n−1
(2) ϕ k; θ, 1) : ϕ(k; X̄, n−1

n

h2
(3) ehθ : ehX̄− 2n

Others
n n
iid Q P
Xi ∼ Pareto(θ, x0 ) =⇒ Complete Sufficient: Xi or ln Xi (x0 known)
i=1 i=1
P  Xi  r
n 
1 Γ(n)
→ r : ln x0 , r > −n
θ Γ(n + r) i=1
n−1
Special case: r = −1 =⇒ θ : P
n  
ln Xi
x0
i=1

iid
Xi ∼ Pareto(θ0 , α) =⇒ Complete Sufficient: X(1) (θ0 known)
 
r r r
 
→α : 1− X(1) if r < nθ0
nθ0

2.1.7 Cramer-Rao Inequality


Let X has a distribution from {f (x; θ) : θ ∈ Ω} satisfying the following regularity conditions -
(i) The parameter space, Ω is an open interval in R i.e. Ω = {θ : a < θ < b}

(ii) The support, S = {x : f (x; θ) > 0} is independent of θ



 
(iii) For each x ∈ S, ∂θ ln f (x; θ) exists and finite
P R
(iv) The identity “ f (x; θ) = 1” or “ f (x; θ)dx = 1” can be differentiated under the summation
x∈S S
or integral sign.
n     o
(v) T ∈ Uψ = T (X) : Eθ T (X) = ψ(θ), V arθ T (X) < ∞, ∀θ ∈ Ω is an UE of ψ(θ) such that
 
the derivative of ψ(θ) = E T (X) with respect to θ can be evaluated by differentiating under
the summation or integral sign.
′ (θ)}2 2
Then, V ar T (X) ≥ {ψI(θ)
  ∂
where I(θ) = E ∂θ ln f (x; θ) >0

Equality Case
‘=’ holds in CR Inequality iff -
∂  I(θ)
ln f (x; θ) = ± ′ {T − ψ(θ)} a.e. . . . . . . (∗)
∂θ ψ (θ)
⇐⇒ the family {f (x; θ) : θ ∈ Ω} belongs to OPEF
→ (∗) is the necessary and sufficient condition for attaining CRLB by an UE, T (X) of ψ(θ).

Remarks
 
(1) Even in OPEF, the only parametric function for which T (X) attains CRLB, is that E T (X)
′ (θ)
(2) If MVBUE T (X) of ψ(θ) exists, then it is given by, T (X) = ψ(θ) ± ψI(θ) ∂

· ∂θ ln f (X; θ)
MVBUE is also the UMVUE but UMVUE may not be MVBUE always -
ˆ Non-regular case: one of the regularity conditions does not hold, eg. {U(0, θ) : θ > 0}
ˆ If all the regularity conditions hold but CRLB is not attainable, then there may exist
UMVUE but that is not the MVBUE
(3) Fisher’s Information
∂ 2 h 2 i

(a) I(θ) = E ∂θ ln f (X; θ) = E − ∂θ2 ln f (X; θ)

(b) IX (θ) = n · IX1 (θ), if the regularity conditions hold


˜
iid
(c) X ∼ {f (x; θ) : θ ∈ Ω} =⇒ for any statistic T (X), IT (X) (θ) ≤ IX (θ)
˜ ˜ ˜ ˜
‘=’ holds if and only if T (X) is sufficient
˜ ∂ 2
E(T ) {ψ ′ (θ) + b′ (θ)}2
(4) Lower bound for the MSE of any estimator: MSEψ(θ) (T ) ≥ ∂θ =
I(θ) I(θ)
(5) {C(θ, 1) : θ ∈ R} is a regular family as the CR inequality holds, but CRLB is not attainable

2.1.8 Methods of Estimation


Method of Moments
If the sample drawn is a good representation of the population, then this method is quite reasonable.
Equate ‘k’ sample moments m′r with corresponding population moments µ′r and solve for k unknowns
for a k-parameter family.

Method of Least Squares


Here we minimize the sum of squares of errors with respect to the parameter (θ1 , θ2 , . . . , θk )

Model: yi = E(Y | X = xi ) + zi

Assumptions: Conditional distribution of Y | X = xi is homoscedastic.

Method of Maximum Likelihood


(1) Bernoulli (p)

(a) p ∈ (0, 1) =⇒ No MLE of p when x = 0 or x = 1, else X̄


˜ ˜ ˜ ˜
(b) Ω = p : p ∈ {Q′ ∩ [0, 1]} =⇒ No MLE of p ∈ Ω


iid
(2) Xi ∼ U(0, θ), θ > 0 =⇒ θ̂ = X(n)
 
iid
(3) Xi ∼ U(α, β), α < β =⇒ θˆ = α̂, β̂ = X(1) , X(n)

˜
(4) MLE is not unique
iid
Xi ∼ U(θ − k1 , θ + k2 ) =⇒ θ̂ = α(X(n) − k2 ) + (1 − α)(X(1) + k1 ), α ∈ [0, 1]

iid 
(5) Xi ∼ U(−θ, θ), θ > 0 =⇒ θ̂ = max {|Xi |} = max −X(1) , X(n)
i=1(1)n

   n

iid 2 2 1
P 2
(6) Xi ∼ N (µ, σ ) =⇒ µ̂, σ = X̄, n (Xi − X̄)
b
i=1

iid X̄
(7) Xi ∼ Gamma (p0 , θ) =⇒ θ̂ = p0
(p0 known)
iid n
(8) Xi ∼ Beta (θ, 1) =⇒ θ̂ = n
P
− ln Xi
i=1

 
 
iid
(9) Xi ∼ Pareto (x0 , θ) =⇒ xˆ0 , θ̂ = X(1) , n
n 
P Xi
ln X(1)
i=1

 n

iid 1
P
(10) Xi ∼ DE (µ, σ) =⇒ (µ̂, σ̂) = X̃, n |Xi − X̃|
i=1

iid 
(11) Xi ∼ Shifted Exp (µ, σ) =⇒ (µ̂, σ̂) = X(1) , X̄ − X(1)
In particular if µ = σ > 0, then µ̂ = X(1)

iid 1 3

(12) Truncated parameter: Xi ∼ Bernoulli (p), p ∈ ,
4 4
. Here, the MLE of p is -
1 1


 , if X̄ <


 4 4
1 3

p̂(X) = X̄, if ≤ X̄ ≤
˜ 
 4 4
 3 , if 3


X̄ >

4 4
1 3
It is better than the UMVUE, X̄ of p ∈ 4 , 4 , in terms of variability

Properties
(13) MLE, if exists is a function of (minimal) sufficient statistic
(14) Under the regularity conditions of CR inequality MVBUE exists, then that is the MLE
(15) Invariance property: θ̂ is the MLE of θ =⇒ h(θ̂) is the MLE of h(θ) for any function h(·)
(16) For large n, the bias of MLE become insignificant
(17) Under normality, LSE ≡ MLE.
(18) Asymptotic property
(a) Under certain regularity conditions, the MLE θ̂ of θ is consistent and also
√ 
   
a 1 1 
a 1
θ̂ ∼ N θ, = or n θ̂ − θ ∼ N 0,
nI1 (θ) In (θ) I1 (θ)

(b) In OPEF, if θ̂ is the MLE of θ then -

√n (θ̂ − θ) ∼a N (0, 1/I1 (θ)) =⇒ √n {ψ(θ̂) − ψ(θ)} ∼a N (0, {ψ′(θ)}²/I1 (θ))
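A hedged numerical illustration of two MLEs listed above, item (2) θ̂ = X(n) for U(0, θ) and item (6) (µ̂, σ̂²) for the normal; not part of the original sheet, assumes numpy is available, and the parameter values are made up.

import numpy as np

rng = np.random.default_rng(3)

# (2) U(0, theta): the MLE of theta is the sample maximum X_(n)
theta = 5.0
x = rng.uniform(0, theta, size=1_000)
theta_hat = x.max()
print("U(0, theta)  MLE:", theta_hat)          # slightly below 5 (the MLE is biased downward)

# (6) N(mu, sigma^2): MLEs are the sample mean and (1/n) * sum (xi - xbar)^2
y = rng.normal(loc=2.0, scale=3.0, size=1_000)
mu_hat = y.mean()
sigma2_hat = np.mean((y - mu_hat) ** 2)        # note: divisor n, not n - 1
print("Normal MLEs :", mu_hat, sigma2_hat)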

2.2 Testing of Hypothesis


2.2.1 Tests of Significance
Univariate Normal
(1) H0 : µ = µ0 against H1 : µ ̸= µ0
n √ o √ 
(a) σ = σ0 (known) → ω = x : n(x̄−µ σ0
0)
> τ α
2
n(x̄−µ0 )
σ0
> τα if H1 : µ > µ0
˜
n √ o n
(b) σ unknown → ω = x : n(x̄−µ 0) 2 1
(Xi − X̄)2
P
s
> t α
2
; n−1 , s = n−1
˜ i=1

(2) H0 : σ = σ0 against H1 : σ ̸= σ0
n o
ns20 2 2
(a) µ = µ0 (known), Z = σ02
→ ω = Zobs > χ α ; n or Zobs < χ1− α ; n
2 2
n o
(n−1)s2
(b) µ unknown, Z = σ02
→ ω = Zobs > χ2α ; n−1 or Zobs < χ21− α ; n−1
2 2

Two Independent Univariate Normal


(1) H0 : µ1 − µ2 = ξ0 (known) against H0 : µ1 − µ2 ̸= ξ0

(a) σ1 , σ2 are known Z = X̄r1 −σ2X̄2 −ξ



σ 2
0
→ ω = |Zobs | > τ α2
1+ 2
n1 n2

1 −X̄2 −ξ0
X̄q
 (n1 −1)s21 +(n2 −1)s22
(b) σ1 = σ2 = σ (unknown), Z = → ω = |Zobs | > t α2 ; n1 +n2 −2 , s2 = n1 +n2 −2
s n1 + n1
1 2

σ1 σ1
(2) H0 : σ2
= ξ0 (known) against H1 : σ2
̸= ξ0
n o
s210 1 1
(a) µ1 , µ2 are known, F = · → ω = Fobs > F 2 ; n1 ,n2 or Fobs > F 2 ; n2 ,n1
s220 ξ02
α α

n o
s21 1 1
(b) µ1 , µ2 are unknown F = s2 · ξ2 → ω = Fobs > F 2 ; n1 −1,n2 −1 or Fobs > F 2 ; n2 −1,n1 −1
α α
2 0

Bivariate Normal (Correlated Case)


n √ o
n(x̄−ȳ−ξ0 )
(1) H0 : µ1 − µ2 = ξ0 (known) → ω = (x, y) : sxy
> t 2 ; n−1 , s2xy = s2x + s2y + 2rsx sy
α
˜ ˜
n √ o
(2) H0 : ρ = 0 → ω = r√1−r n−2
2 > t α2 ; n−2 [r: sample correlation coefficient of (x, y)]
˜ ˜
 √ 
U = X + ξ0 Y
(3) H0 : σσ21 = ξ0 → ω = r√ uv n−2
> t α2 ; n−2
2
1−ruv V = X − ξ0 Y

Binomial Proportion
(I) Single Proportion - H0 : p = p0 , observed value: x0

(a) H1 : p > p0 , p-value = P1 = PH0 (X ≥ x0 )


(b) H1 : p < p0 , p-value = P2 = PH0 (X ≤ x0 )
(c) H1 : p ̸= p0 , p-value = P3 = 2 · min{P1 , P2 } (Reject H0 if p-value ≤ α)

(II) Two Proportions - H0 : p1 = p2 = p, observed value of X1 : x10 and X1 + X2 : x0

(a) H1 : p1 > p2 , p-value = P1 = PH0 (X1 ≥ x10 | X1 + X2 = x0 )


(b) H1 : p1 < p2 , p-value = P2 = PH0 (X1 ≤ x10 | X1 + X2 = x0 )
(c) H1 : p1 ̸= p2 , p-value = P3 = 2 · min{P1 , P2 } (Reject H0 if p-value ≤ α)
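A hedged sketch of the single-proportion p-values in (I) above, computed exactly with scipy.stats.binom; not part of the original sheet, and the numbers n, p0, x0 are hypothetical.

from scipy import stats

# Exact test of H0: p = p0 for X ~ Bin(n, p0), observed value x0 (hypothetical numbers).
n, p0, x0 = 20, 0.5, 15

p_greater = stats.binom.sf(x0 - 1, n, p0)   # P1 = P(X >= x0) under H0
p_less = stats.binom.cdf(x0, n, p0)         # P2 = P(X <= x0) under H0
p_two_sided = 2 * min(p_greater, p_less)    # P3, as defined above

print(p_greater, p_less, p_two_sided)       # reject H0 at level alpha if p-value <= alpha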

Poisson Mean
n
P
(I) Single Population - H0 : λ = λ0 , observed value of S = X i : s0
i=1

(a) H1 : λ > λ0 , p-value = P1 = PH0 (S ≥ s0 )


(b) H1 : λ < λ0 , p-value = P2 = PH0 (S ≤ s0 )
(c) H1 : λ ̸= λ0 , p-value = P3 = 2 · min{P1 , P2 } (Reject H0 if p-value ≤ α)

n1
P
(II) Two Populations - H0 : λ1 = λ2 = λ, observed value of S1 = X1i : s10 and S1 + S2 : s0
i=1

(a) H1 : λ > λ0 , p-value = P1 = PH0 (S1 ≥ s10 | S1 + S2 = s0 )


(b) H1 : λ < λ0 , p-value = P2 = PH0 (S1 ≤ s10 | S1 + S2 = s0 )
(c) H1 : λ ̸= λ0 , p-value = P3 = 2 · min{P1 , P2 } (Reject H0 if p-value ≤ α)

2.3 Interval Estimation



T : sufficient statistic and θ1 (T ), θ2 (T ) is a confidence interval with confidence coefficient (1 − α)
  
=⇒ P θ1 (T ), θ2 (T ) ∋ ψ(θ) = 1 − α ∀θ ∈ Ω

2.3.1 Methods of finding C.I.


Find a function ϕ(T, θ), whose sampling distribution is completely specified. This is the pivot.
Then find c1 , c2 based on the restriction: Pθ [c1 < ϕ(T, θ) < c2 ] = 1 − α

Note
For a parameter θ, the method of guessing θ is known as estimation and an interval estimate of θ
is known as confidence interval for θ.
For a R.V. Y , a method of guessing Y is known as prediction and an interval prediction of Y is
known as prediction limits.
 q 
iid
Xi ∼ N (µ, σ 2 ), i = 1(1)n =⇒ Prediction limits for Xn+1 : X̄ ∓ t α2 ; n−1 n+1
n
s

2.3.2 Wilk’s Optimum Criteria



Definition: A (1 − α) level confidence interval θ∗ (T ), θ (T ) of θ ∈ Ω, is said to be shortest length


confidence interval, in the class of all level (1 − α) confidence intervals based on a pivot ψ(T, θ), if

Eθ θ∗ (T ) − θ (T ) ≤ Eθ θ(T ) − θ(T ) , ∀θ ∈ Ω
   


whatever the other (1 − α) level confidence interval θ(T ), θ(T ) based on ψ(T, θ).

2.3.3 Test Inversion Method


Let A(θ0 ) be the “Acceptance Region” of a size ‘α’ test of H0 : θ = θ0 . Define,

I(x) = {θ ∈ Ω : A(θ) ∋ x} , x ∈ X
˜ ˜
then I(x) is a confidence interval for θ at confidence coefficient (1 − α).
˜

List of Confidence Intervals


 
(1) Xi iid∼ U(0, θ) : Pivot = X(n) /θ =⇒ ( X(n) , X(n) /α^{1/n} )

(2) Xi iid∼ Shifted Exp (µ, σ0 ) : Pivot = (n/σ0 )[X(1) − µ] =⇒ ( X(1) + (σ0 /n) ln α , X(1) ) (finite length)
Infinite length: ( −∞ , X(1) + (σ0 /n) ln(1 − α) )   (σ0 known)

(3) Xi iid∼ N (µ, σ²) : Pivot = √n (X̄ − µ)/S =⇒ ( X̄ ∓ t_{α/2; n−1} S/√n )

(4) Xi iid∼ Exp (θ) : Pivot = (2/θ) Σ_{i=1}^{n} Xi = 2T /θ =⇒ ( 2T /χ²_{α/2; 2n} , 2T /χ²_{1−α/2; 2n} )
Based on X(1) , Pivot = (2n/θ) X(1) =⇒ ( 2nX(1) /χ²_{α/2; 2} , 2nX(1) /χ²_{1−α/2; 2} )
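A hedged sketch of interval (3) above, the t pivot for a normal mean; not part of the original sheet, assumes numpy and scipy are available, and the simulated data are only an example.

import numpy as np
from scipy import stats

# 95% confidence interval for a normal mean using the t pivot (item (3) above).
rng = np.random.default_rng(4)
x = rng.normal(loc=10.0, scale=2.0, size=25)

n, alpha = len(x), 0.05
xbar, s = x.mean(), x.std(ddof=1)            # s uses the (n - 1) divisor
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)
half_width = t_crit * s / np.sqrt(n)

print(f"({xbar - half_width:.3f}, {xbar + half_width:.3f})")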

2.4 Large Sample Theory


2.4.1 Modes of Convergence
(I) Convergence in Distribution
Definition: A sequence {Xn } of random variables with the corresponding sequence Fn (x) of D.F.s
is said to converge to a random variable X with D.F. F (x), if

lim Fn (x) = F (x), at every continuity point of F(x)


n→∞

Results
iid  D D
(1) Xn ∼ U(0, θ) =⇒ n θ − X(n) −→ Exp (θ) ←− nX(1)
iid  D
(2) Xn ∼ Shifted Exp (0, θ) =⇒ n X(n) − θ −→ Exp (1)
iid D
(3) Xn ∼ N (µ, σ 2 ) =⇒ X̄ −→ µ
D
(4) Xn ∼ Geo (pn = nθ ) =⇒ Xn
n
−→ Exp ( 1θ )
D
(5) X ∼ NB (n, p) =⇒ 2pX −→ χ22n as p → 0
Xn D
(6) Xn ∼ Gamma (n, β) =⇒ n
−→ β

Limiting MGF
D
(7) MGF → Xn : Mn (t), X : M (t), E(Xn ) exists ∀ n and Xn −→ X
If lim Mn (t), lim E(Xn ) is finite then MN (t) → M (t), E(Xn ) → E(X) as n → ∞
n→∞ n→∞

(8) Theorem: Let, {Fn } be a sequence of D.F.s with corresponding M.G.F.s {Mn } and suppose
that Mn (t) exists for |t| ≤ t0 , ∀n. If there exists a D.F. F with corresponding M.G.F. M ,
which exists for |t| ≤ t1 < t0 , such that Mn (t) → M (t) as n → ∞ for every t ∈ [−t1 , t1 ], then
W
Fn −→ F

(II) Convergence in Probability


Definition: Let, {Xn } be a sequence of R.V.s defined on the probability space (Ω, A, P). Then we
say that {Xn } converges in probability to a R.V. X, defined on (Ω, A, P), if for every ϵ > 0,

lim P [|Xn − X| < ϵ] = 1 or lim P [|Xn − X| > ϵ] = 0


n→∞ n→∞

Sufficient Condition: If {Xn } is a sequence of R.V.s such that E(Xn ) → C and V ar(Xn ) → 0 as
P
n → ∞ or E(Xn − C)2 → 0 as n → ∞, then Xn −→ C.
Counter example:

1

1 −
 ,x = k
P (Xn = x) = n =⇒ E(Xn − k)2 ̸→ 0 as n → ∞ but Xn −→ k
P
1
,x = k + n


n

Results
 n
 n1
iid Q P 1
(1) Xi ∼ U(0, 1) =⇒ Xi −→ e
i=1

P P
(2) Xn −→ X, lim an = a ∈ R =⇒ an Xn −→ aX
n→∞

D P
(3) Xn −→ X, lim an = ∞, an > 0 ∀n =⇒ a−1
n Xn −→ 0
n→∞

P P
(4) Limit Theorems: If Xn −→ X, Yn −→ Y , then,
P
(a) Xn ± Yn −→ X ± Y
P
(b) Xn Yn −→ XY
P
(c) g(Xn ) −→ g(X), if g(·) is continuous (Invariance Property)
(d) Xn /Yn →P X/Y , provided P (Yn = 0) = 0 = P (Y = 0)
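A small simulation illustrating convergence in probability of the sample mean (the sufficient condition above with C = µ); not part of the original sheet and it assumes numpy is available.

import numpy as np

# P(|Xbar_n - mu| > eps) shrinks as n grows (weak law of large numbers).
rng = np.random.default_rng(5)
mu, eps, reps = 1.0, 0.1, 2_000

for n in (10, 100, 1000, 5000):
    xbar = rng.exponential(scale=mu, size=(reps, n)).mean(axis=1)
    print(n, np.mean(np.abs(xbar - mu) > eps))   # decreasing toward 0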
Chapter 3

Mathematics

3.1 Basics
3.1.1 Combinatorial Analysis
(1) For a population with $n$ elements, the number of samples of size $r$ is -

$$\text{ordered sample} = \begin{cases} n^r, & \text{WR} \\ {}^{n}P_r \text{ or } (n)_r, & \text{WOR} \end{cases} \qquad \text{unordered sample} = {}^{n}C_r \text{ or } \binom{n}{r}, \ \text{WOR}$$

(2) Partition of population - The number of ways in which a population of $n$ elements can be divided into $k$ ordered parts of which the $i$th part consists of $r_i$ members, $i = 1, 2, \ldots, k$, is -

$$\binom{n}{r_1}\binom{n - r_1}{r_2}\cdots\binom{n - r_1 - r_2 - \cdots - r_{k-1}}{r_k} = \frac{n!}{r_1!\, r_2! \cdots r_k!} = \binom{n}{r_1\ r_2\ \cdots\ r_k}, \qquad r_1 + r_2 + \cdots + r_k = n$$

(a) The number of different distributions of $r$ identical balls into $n$ cells, i.e. the number of different solutions $(r_1, r_2, \ldots, r_n)$ of the equation
$$r_1 + r_2 + \cdots + r_n = r \quad \text{where } r_i \ge 0 \text{ are integers, is } \binom{n + r - 1}{r} = \binom{n + r - 1}{n - 1}$$

(b) The number of different distributions of $r$ indistinguishable balls into $n$ cells in which no cell remains empty, i.e. the number of different solutions $(r_1, r_2, \ldots, r_n)$ of the equation
$$r_1 + r_2 + \cdots + r_n = r \quad \text{where } r_i \ge 1 \text{ are integers, is } \binom{r - 1}{n - 1}$$
(a numerical check of (1) and (2) appears after item (3))

(3) (a) $\dbinom{n}{0}^2 + \dbinom{n}{1}^2 + \cdots + \dbinom{n}{n}^2 = \dbinom{2n}{n}$

(b) $\dbinom{n}{0}\dbinom{n}{m} + \dbinom{n}{1}\dbinom{n}{m+1} + \cdots + \dbinom{n}{n-m}\dbinom{n}{n} = \dbinom{2n}{n-m}$
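Some of these counts can be verified directly with Python's standard library; a small sketch (our own illustration, with arbitrary $n$ and $r$):

```python
# Sketch: sampling counts and occupancy (stars-and-bars) counts from the rules above.
# Purely illustrative; the chosen n and r are arbitrary.
from math import comb, perm

n, r = 5, 3
print(n ** r)            # ordered samples WR: 125
print(perm(n, r))        # ordered samples WOR: 60
print(comb(n, r))        # unordered samples WOR: 10

# r identical balls into n cells (r_i >= 0): C(n + r - 1, r)
print(comb(n + r - 1, r))     # 35
# no cell empty (r_i >= 1): C(r - 1, n - 1)  -- here r must be >= n
print(comb(7 - 1, 5 - 1))     # 15, for r = 7 balls into n = 5 cells
```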


3.1.2 Difference Equation


If $\{x_n\}$ is a sequence, then $x_n = f(x_{n-1}, \ldots, x_2, x_1)$ is a difference equation.
A.P.: $x_n = x_{n-1} + d$
G.P.: $x_n = r\, x_{n-1}$
$x_n = a_1 x_{n-1} + \cdots + a_p x_{n-p}$ is a linear difference equation of order $p$.

First Order Linear Difference Equation

$$x_n = ax_{n-1} + b,\ n \ge 1 \implies x_n - c = (x_1 - c)a^{n-1}, \quad \text{where } b = c(1 - a)\ \left(\text{i.e. } c = \tfrac{b}{1-a},\ a \ne 1\right)$$

Second Order Linear Difference Equation

$$x_n = ax_{n-1} + bx_{n-2},\ n \ge 2$$
Characteristic Equation: $u^2 - au - b = 0$ with roots $u_1, u_2$
Case I: $u_1 \ne u_2 \implies x_n = Au_1^n + Bu_2^n$
Case II: $u_1 = u_2 = u \implies x_n = (A + Bn)u^n$
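A minimal sketch (our own example) that solves the distinct-roots case numerically, using the Fibonacci recurrence $x_n = x_{n-1} + x_{n-2}$ as a test case:

```python
# Sketch: solve x_n = a*x_{n-1} + b*x_{n-2} via the characteristic equation
# u^2 - a*u - b = 0 (distinct-roots case only; illustrative, names our own).
import numpy as np

def solve_recurrence(a, b, x0, x1, n):
    u1, u2 = np.roots([1, -a, -b])                 # roots of u^2 - a*u - b
    # Fit x_n = A*u1^n + B*u2^n to the two initial values x0, x1
    A, B = np.linalg.solve([[1, 1], [u1, u2]], [x0, x1])
    return (A * u1 ** n + B * u2 ** n).real

print(solve_recurrence(1, 1, 0, 1, 10))            # 10th Fibonacci number: 55
```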

3.2 Linear Algebra


3.2.1 Vectors & Vector Spaces
(1) Length of $a$ : $|a| = \sqrt{\sum\limits_{i=1}^{n} a_i^2}$; also $\sum\limits_{i=1}^{n} a_i = \mathbf{1}\cdot a$, where $\mathbf{1} = (1, 1, \ldots, 1)$

(2) Distance between $a$ and $b$ $= |b - a| = \sqrt{(b - a)\cdot(b - a)} = \sqrt{\sum\limits_{i=1}^{n}(b_i - a_i)^2}$

(3) Angle ($\theta$) between two non-null vectors $a$ and $b$ is given by $\cos\theta = \dfrac{a\cdot b}{|a|\,|b|}$

(4) Cauchy-Schwarz: $(a\cdot b)^2 \le |a|^2 |b|^2$; '$=$' holds iff $b = \lambda a$ for some $\lambda$

(5) Triangle Inequality: $|a - b| + |b - c| \ge |a - c|$

Vector Spaces
A vector space V over a field F is a non-empty collection of elements, satisfying the following
axioms:

(a) a, b ∈ V =⇒ a + b ∈ V [closed under vector addition]

(b) a ∈ V =⇒ αa ∈ V, ∀ α ∈ F [closed under scalar multiplication]

Note: Every vector space must include the null vector $0$.
Useful Results

(1) "$\sum\limits_{i=1}^{r} \lambda_i a_i = 0 \implies \lambda_1 = \lambda_2 = \cdots = \lambda_r = 0$" $\iff$ $\{a_1, a_2, \ldots, a_r\}$ are Linearly Independent.
The set is L.D. iff there exist $\lambda_i$'s, not all zero, for which $\sum\limits_{i=1}^{r} \lambda_i a_i = 0$

(2) A set of vectors is LIN $\implies$ any subset is LIN
A set of vectors is LD $\implies$ any superset is LD

(3) Basis: a set of vectors that (i) spans $V$ and (ii) is LIN

(a) $\{1, t, t^2, \ldots, t^n\}$ : polynomials of degree $\le n$

(b) $\{e_1, e_2, \ldots, e_n\}$ : $n$ dimensional vector space

(4) The representation of every vector in terms of a basis is unique.

(5) Replacement theorem: $\{a_1, a_2, \ldots, a_r\}$ is a basis and $b = \sum\limits_{i=1}^{r} \lambda_i a_i$.
If $\lambda_i \ne 0$ then $a_i$ may be replaced by $b$, i.e. $\{a_1, a_2, \ldots, a_{i-1}, b, a_{i+1}, \ldots, a_r\}$ is also a basis.

(6) (a) Any basis of $V_n$ contains exactly $n$ vectors
(b) Any $n$ LIN vectors from $V_n$ form a basis for $V_n$
(c) Any set of $(n + 1)$ vectors from $V_n$ is LD

(7) Extension theorem: A set of m(< n) LIN vectors from Vn can be extended to a basis of Vn

(8) Dimension: Number of vectors in a basis or maximum number of LIN vectors in the space
Dimension of a subspace: {Total no. of vectors} - {No. of LIN restrictions}

(9) $V = \{0\}$ has no basis $\implies \dim(V)$ is undefined [We assume $\dim(V) = 0$]
(10) Consider two subspaces $S, T$ of a vector space $V$ over the field $F$ -

(a) $S \cap T$ is also a subspace and $\dim(S \cap T) \le \min\{\dim(S), \dim(T)\} \le \sqrt{\dim(S)\dim(T)}$
(b) $S + T = \{a + b : a \in S,\ b \in T\} \implies \dim(S + T) = \dim(S) + \dim(T) - \dim(S \cap T)$
(c) $S \subseteq T \implies \dim(S) \le \dim(T)$, and then $\dim(S) = \dim(T) \iff S = T$
In general, $\dim(S) = \dim(T) \not\Rightarrow S = T$

Orthogonal Vectors
(11) $a, b \in E^n$ are orthogonal ($\perp$) if $a\cdot b = 0$ [$0$ is orthogonal to every vector]

(12) The set of vectors $\{a_1, a_2, \ldots, a_n\}$ is mutually orthogonal if $a_i\cdot a_j = 0,\ \forall\, i \ne j$

(13) If a mutually orthogonal set includes the null vector then it is LD, else LIN

(14) $E^n \ni a \perp S_n \subseteq E^n \iff a$ is orthogonal to a basis of $S_n$

(15) Ortho-complement of $S_n$, $O(S_n)$ : collection of all vectors in $E^n$ which are orthogonal to $S_n$

(16) (a) $S_n \cap O(S_n) = \{0\}$ (b) $S_n + O(S_n) = E^n$ (c) $O\{O(S_n)\} = S_n$
where $S_n$ is a subspace of $E^n$

(17) The sum $S + T = \{x + y : x \in S,\ y \in T\}$ is called a direct sum, written $S \oplus T$, when every element of $S + T$ has a unique representation $x + y$ with $x \in S,\ y \in T$

(18) $S \oplus T \iff$ "$S \cap T = \{0\}$" $\iff$ "If $x \in S,\ y \in T,\ x, y \ne 0$ then $\{x, y\}$ is LIN"
$\iff$ "$x + y = 0 \implies x = y = 0$ for $x \in S,\ y \in T$" $\iff$ "$\dim(S + T) = \dim(S) + \dim(T)$"

(19) Subspaces $S, T$ of $V$ are said to be complementary if $S \oplus T = V$

(20) $M_{n\times n}(\mathbb{R})$ : vector space of all $(n \times n)$ real matrices
$S$ : $(n \times n)$ symmetric matrices, $T$ : $(n \times n)$ skew-symmetric matrices
$\implies S, T$ are subspaces of $M_{n\times n}$ and $S \oplus T = M_{n\times n}$, $\dim(S) = \dfrac{n(n+1)}{2}$, $\dim(T) = \dfrac{n(n-1)}{2}$
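Result (20) can be seen concretely: any square matrix splits uniquely into a symmetric and a skew-symmetric part. A short numpy sketch (our own illustration, with an arbitrary example matrix):

```python
# Sketch: unique decomposition A = S + T with S symmetric, T skew-symmetric,
# illustrating M_nxn = S (+) T (direct sum). Example matrix is arbitrary.
import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 10.]])
S = (A + A.T) / 2            # symmetric part
T = (A - A.T) / 2            # skew-symmetric part

print(np.allclose(S, S.T), np.allclose(T, -T.T))   # True True
print(np.allclose(S + T, A))                        # True: A = S + T
```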

3.2.2 Matrices
(1) Consider a matrix $A_{m\times n}$

(a) Row Space: $R(A) = \{x'A : x \in \mathbb{R}^m\}$

(b) Column Space: $C(A) = \{Ax : x \in \mathbb{R}^n\}$

(c) Null Space: $N(A) = \{x \in \mathbb{R}^n : Ax = 0\}$

(d) Left Null Space: $N'(A) = \{x \in \mathbb{R}^m : x'A = 0'\}$

(2) $N(A) = O\{R(A)\} \implies \dim N(A) = n - \dim R(A) =$ nullity of $A$

(3) $A_{m\times n},\ B_{n\times p}$ with $AB = O \implies C(B) \subseteq N(A) \implies r(A) + r(B) \le n$

(4) $A_{n\times n}$ with $A^2 = A \implies C(I_n - A) = N(A)$

Rank
(5) $r(A_{m\times n}) \le \min\{m, n\}$

(6) $r(AB) \le \min\{r(A), r(B)\}$ if $AB$ is defined

(7) $r(AB) = r(A)$, if $\det B \ne 0$

(8) $A^2 = A \iff r(A) + r(I_n - A) = n$

(9) $r(A + B) \le r(A) + r(B)$

(10) $r(A) = r(A') = r(A'A) = r(AA')$

(11) $r(A) = r \implies A = \sum\limits_{k=1}^{r} M_k$ with $r(M_k) = 1,\ k = 1, 2, \ldots, r$

(12) $r(AB - I) \le r(A - I) + r(B - I)$

(13) $A_{m\times n},\ B_{s\times n}$ with $AB' = O \implies r(A'A + B'B) = r(A) + r(B)$

(14) $A^2 = A,\ B^2 = B,\ \det(I - A - B) \ne 0 \implies r(A) = r(B)$

(15) $r\begin{pmatrix} A & B \\ O & C \end{pmatrix} \ge r(A) + r(C)$

(16) Sylvester Inequality: for $A_{m\times n},\ B_{n\times p}$, $r(AB) \ge r(A) + r(B) - n$
$\implies r(A) + r(B) - n \le r(AB) \le \min\{r(A), r(B)\}$

(17) $r(A + B) \le r(A) + r(B) \le r(AB) + n$, provided $AB$ and $(A + B)$ are defined

(18) $r(AB) = r(B) - \dim\{N(A) \cap C(B)\}$

(19) $r(x'x) = r(x'y) = r(xy') = 1$, where $x, y \ne 0 \in \mathbb{R}^n$ (for $r(x'y) = 1$ one also needs $x'y \ne 0$)
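Several of the rank results above, e.g. (10) and (16), are easy to spot-check numerically; a small numpy sketch (our own illustration, with random integer matrices):

```python
# Sketch: numerical spot-checks of r(A) = r(A'A) = r(AA') and the Sylvester bound
# r(AB) >= r(A) + r(B) - n for random matrices (illustrative only).
import numpy as np

rng = np.random.default_rng(3)
A = rng.integers(-2, 3, size=(4, 6)).astype(float)
B = rng.integers(-2, 3, size=(6, 5)).astype(float)
rank = np.linalg.matrix_rank

print(rank(A), rank(A.T @ A), rank(A @ A.T))          # all equal
n = A.shape[1]
print(rank(A) + rank(B) - n <= rank(A @ B) <= min(rank(A), rank(B)))   # True
```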
Other Results
(20) Sum of all entries in a matrix $A$ : $\mathbf{1}'A\mathbf{1}$

(21) $A^2 = A \implies (I_n - A)$ is also idempotent

(22) $C_{m\times r} = A_{m\times n}B_{n\times r} \implies C = (c_{ij}),\ c_{ij} = \sum\limits_{k=1}^{n} a_{ik}b_{kj}$

(23) $tr(A + B) = tr(A) + tr(B)$
$tr(AB) = tr(BA)$

3.2.3 Determinants
(1) $\begin{vmatrix} 1 & a & a^2 \\ 1 & b & b^2 \\ 1 & c & c^2 \end{vmatrix} = (a - b)(b - c)(c - a)$

(2) $\begin{vmatrix} a & b & \cdots & b \\ b & a & \cdots & b \\ \vdots & \vdots & \ddots & \vdots \\ b & b & \cdots & a \end{vmatrix}_{n\times n} = \{a + (n - 1)b\}(a - b)^{n-1}$

(3) $\begin{vmatrix} a+b & b+c & c+a \\ b+c & c+a & a+b \\ c+a & a+b & b+c \end{vmatrix} = \begin{vmatrix} a & b & c \\ b & c & a \\ c & a & b \end{vmatrix} \times \begin{vmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{vmatrix} = 2\begin{vmatrix} a & b & c \\ b & c & a \\ c & a & b \end{vmatrix}$

(4) Tridiagonal matrix

$$A_n = \begin{vmatrix} a & b & 0 & 0 & \cdots & 0 & 0 \\ c & a & b & 0 & \cdots & 0 & 0 \\ 0 & c & a & b & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & c & a \end{vmatrix}_{n\times n} \implies A_n = aA_{n-1} - bc\,A_{n-2};$$
in particular, when $a = 1 + bc$, $A_n = 1 + bc + (bc)^2 + \cdots + (bc)^n$
 
(5) $\begin{vmatrix} x_1^2 + y_1^2 & x_1x_2 + y_1y_2 & \cdots & x_1x_n + y_1y_n \\ x_2x_1 + y_2y_1 & x_2^2 + y_2^2 & \cdots & x_2x_n + y_2y_n \\ \vdots & \vdots & \ddots & \vdots \\ x_nx_1 + y_ny_1 & x_nx_2 + y_ny_2 & \cdots & x_n^2 + y_n^2 \end{vmatrix}_{n\times n} = |A|\,|A'| = 0$, where $A = \begin{pmatrix} x_1 & y_1 & 0 & \cdots & 0 \\ x_2 & y_2 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ x_n & y_n & 0 & \cdots & 0 \end{pmatrix}_{n\times n}$

(6) $A^{n\times n}$ and $B^{n\times n}$ differ only by a single row (or column) $\implies |A + B| = 2^{n-1}\left(|A| + |B|\right)$

(7) $A = (a_{ij}),\ B = (b_{ij})$ with $b_{ij} = r^{i-j}a_{ij} \implies |B| = |A|$

(8) $\begin{vmatrix} I & O \\ O & A \end{vmatrix} = |A|$, $\begin{vmatrix} I & B \\ O & A \end{vmatrix} = |A| \implies \begin{vmatrix} A & B \\ O & C \end{vmatrix} = \begin{vmatrix} I & O \\ O & C \end{vmatrix} \times \begin{vmatrix} A & B \\ O & I \end{vmatrix} = |C||A| = |A||C|$

(9) $A^{n\times n}$ : $A(\text{adj}\,A) = (\text{adj}\,A)A = |A|\,I_n$

$$\sum_{i=1}^{n} a_{ri}A_{si} = \begin{cases} |A|, & r = s \\ 0, & r \ne s \end{cases} \qquad r, s = 1, 2, \ldots, n$$

(10) (a) $|\text{adj}\,A| = |A|^{n-1}$

(b) $(\text{adj}\,A)^{-1} = \text{adj}(A^{-1}) = \dfrac{A}{|A|}$

(c) $\text{adj}(\text{adj}\,A) = |A|^{n-2}A$

(d) $|\text{adj}(\text{adj}\,A)| = |A|^{(n-1)^2}$

(11) $|kA| = k^n|A|$, $\text{adj}(kA) = k^{n-1}\,\text{adj}\,A$

(12) $\text{adj}(AB) = \text{adj}(B)\,\text{adj}(A)$ [if $|A|, |B| \ne 0$]

(13) Adjoint of a symmetric matrix is symmetric
Adjoint of a skew-symmetric matrix is symmetric for odd order, skew-symmetric for even order
Adjoint of a diagonal matrix is diagonal

(14) $r(\text{adj}\,A) = \begin{cases} 0, & \text{if } r(A) \le n - 2 \\ 1, & \text{if } r(A) = n - 1 \\ n, & \text{if } r(A) = n \end{cases}$

Inverse of a Matrix
(15) (a) $(AB)^{-1} = B^{-1}A^{-1}$

(b) $(A^{-1})' = (A')^{-1}$

(c) $|A + B| \ne 0 \implies |A^{-1} + B^{-1}| \ne 0 \implies (B^{-1} + A^{-1})^{-1} = A(A + B)^{-1}B$
(16) $A^{n\times n} = \begin{pmatrix} A_{11}^{k\times k} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$, $|A| \ne 0$

$$|A| = \begin{cases} |A_{11}|\,|A_{22} - A_{21}A_{11}^{-1}A_{12}|, & \text{if } |A_{11}| \ne 0 \\ |A_{22}|\,|A_{11} - A_{12}A_{22}^{-1}A_{21}|, & \text{if } |A_{22}| \ne 0 \end{cases}$$
 
(17) $M = \begin{pmatrix} A & -u \\ v' & 1 \end{pmatrix}$, $|A| \ne 0 \implies |M| = |A|(1 + v'A^{-1}u) = |A + uv'|$

(18) $|A| \ne 0$ : $|A + uv'| \ne 0 \iff (1 + v'A^{-1}u) \ne 0$, and
$$(A + uv')^{-1} = A^{-1} - \frac{A^{-1}uv'A^{-1}}{1 + v'A^{-1}u}$$
(a numerical check of this appears after item (20))

(19) $A_{a,b}^{n\times n} = (a - b)I_n + b\,\mathbf{1}\mathbf{1}' \implies A_{a,b}^{-1} = A_{c,d}$ iff $\Delta = (a - b)\{a + (n - 1)b\} \ne 0$,
where $c = \dfrac{a + (n - 2)b}{\Delta}$, $d = -\dfrac{b}{\Delta}$

(20) $A^{n\times n} = (a_{ij})$, $|A| \ne 0$, $\sum\limits_{j=1}^{n} a_{ij} = k\ \forall\, i \implies \sum\limits_{j=1}^{n} b_{ij} = \dfrac{1}{k}\ \forall\, i$, where $A^{-1} = (b_{ij})$
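A numerical check of the rank-one update formula in (18) (our own sketch; the matrices are arbitrary, and $A$ is shifted to keep it comfortably nonsingular):

```python
# Sketch: verify the Sherman-Morrison identity
# (A + u v')^{-1} = A^{-1} - (A^{-1} u v' A^{-1}) / (1 + v' A^{-1} u)
import numpy as np

rng = np.random.default_rng(4)
A = rng.normal(size=(4, 4)) + 4 * np.eye(4)     # well-conditioned, nonsingular
u = rng.normal(size=(4, 1))
v = rng.normal(size=(4, 1))

Ainv = np.linalg.inv(A)
lhs = np.linalg.inv(A + u @ v.T)
rhs = Ainv - (Ainv @ u @ v.T @ Ainv) / (1 + v.T @ Ainv @ u)
print(np.allclose(lhs, rhs))                     # True
```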

Orthogonal Matrix
 
(21) $A^{n\times n} = \begin{pmatrix} a_1' \\ a_2' \\ \vdots \\ a_n' \end{pmatrix}$ where $a_i' = \begin{pmatrix} a_{i1} & a_{i2} & \cdots & a_{in} \end{pmatrix}$, $i = 1, 2, \ldots, n$

$$AA' = I_n \implies a_i'a_j = \begin{cases} 1, & \text{when } i = j \\ 0, & \text{when } i \ne j \end{cases} \qquad |A| = \pm 1$$

(22) $AA' = A'A = I_n$, $|I_n + A| \ne 0 \implies (I_n + A)^{-1}(I_n - A)$ is skew-symmetric

(23) $A, B$ real orthogonal with $|A| + |B| = 0 \implies |A + B| = 0$

(24) $AA' = kI_n \implies A'A = kI_n$



Rank & Determinant


(25) Theorem: For a matrix Am×n the rank of A is the order of the “highest order non-vanishing
minor” of A.

(26) $A^{n\times n}$ : $r(A) = n \iff |A| \ne 0$

(27) Elementary Matrices: An elementary matrix is a matrix which differs from the identity matrix by a single row (or column) operation.

(28) Elementary Row Operation ≡ Pre-multiplying by corresponding elementary row matrix


Elementary Column Operation ≡ Post-multiplying by corresponding elementary column matrix
 
(29) $r(A_{m\times n}) = k \implies \exists\ P_{m\times m}, Q_{n\times n}$ with $|P|, |Q| \ne 0$ such that $PAQ = \begin{pmatrix} I_k & O \\ O & O \end{pmatrix}$ (Normal Form)
(30) Any non-singular matrix can be written as a product of elementary matrices

(31) Rank Factorization: Let $r(A_{m\times n}) = k$; a pair $(P_{m\times k}, Q_{k\times n})$ of matrices is said to be a rank factorization of $A$ if $A_{m\times n} = P_{m\times k}Q_{k\times n}$ (the factorization is not unique; see the SVD-based sketch after item (34))

(32) $A_{m\times n} = P_{m\times k}Q_{k\times n}$, $r(A) \le k$; the following statements are equivalent -

(a) $r(A) = k$, i.e. $(P, Q)$ is a rank factorization of $A$

(b) $r(P_{m\times k}) = r(Q_{k\times n}) = k$

(c) The columns of $P$ form a basis of $C(A)$ and the rows of $Q$ form a basis of $R(A)$

(33) $A^2 = A$, $A_{m\times m} = P_{m\times k}Q_{k\times m}$ where $k = r(A) \implies$ (i) $QP = I_k$ (ii) $r(A) = tr(A)$

(34) r(A | B) = r(A) ⇐⇒ B = AC, for some C
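A rank factorization as in (31)-(32) can be obtained numerically in many ways; one choice (our own sketch) uses the truncated SVD:

```python
# Sketch: build a rank factorization A = P Q with P (m x k), Q (k x n), k = r(A),
# from the truncated SVD (the factorization is not unique).
import numpy as np

A = np.array([[1., 2., 3.],
              [2., 4., 6.],
              [1., 0., 1.]])
U, s, Vt = np.linalg.svd(A)
k = np.sum(s > 1e-10)                  # numerical rank
P = U[:, :k] * s[:k]                   # m x k
Q = Vt[:k, :]                          # k x n
print(k, np.allclose(P @ Q, A))        # rank and True
```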

3.2.4 System of Linear Equation


System of Linear Equations: $A_{m\times n}\, x_{n\times 1} = b_{m\times 1}$

(1) Homogeneous System: $Ax = 0$

(a) $Ax = 0$ is always consistent, since $x = 0$ is a trivial solution

(b) $A_{m\times n}x = 0$ has a non-trivial solution iff $r(A) < n$

(c) The number of LIN solutions of $Ax = 0$ is $\dim N(A) = n - r(A)$

(d) Elementary row operations on a matrix $A$ do not alter $N(A)$

(2) General System: $Ax = b,\ b \ne 0$

(a) $r(A\,|\,b)$ is either $r(A)$ or $r(A) + 1$; $C(A\,|\,b) \supseteq C(A)$

(b) $Ax = b$ is $\begin{cases} \text{Consistent} & \iff r(A\,|\,b) = r(A) \\ \text{Inconsistent} & \iff r(A\,|\,b) > r(A) \end{cases}$

(c) $A_{m\times n}x = b$ is consistent $\implies \begin{cases} \text{Unique solution} & \iff r(A) = n \\ \text{At least two solutions} & \iff r(A) < n \end{cases}$

(d) $Ax_1 = Ax_2 = b \implies \alpha x_1 + (1 - \alpha)x_2$ is also a solution
$\implies$ If a system has two distinct solutions then it has infinitely many solutions

(3) Theorem: Let $Ax = b$ be a consistent system with $x_0$ as a particular solution. Then the set of all possible solutions of $Ax = b$ is given by $x_0 + N(A) = \{x_0 + u : u \in N(A)\}$

• Points, lines and planes not necessarily passing through the origin are called 'flats'. If $W$ is a non-empty flat and $x_0$ is a fixed vector, then the translation of $W$ by $x_0$ is $x_0 + W = \{x_0 + w : w \in W\}$, which is a flat parallel to $W$.
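The "particular solution + null space" description in (3) is easy to see numerically; a short sketch (our own illustration) using numpy and scipy with a deliberately rank-deficient system:

```python
# Sketch: for a consistent system Ax = b, every x0 + (vector in N(A)) is again
# a solution. Example system is arbitrary and rank-deficient on purpose.
import numpy as np
from scipy.linalg import null_space

A = np.array([[1., 2., 3.],
              [2., 4., 6.]])          # rank 1, so dim N(A) = 2
b = np.array([6., 12.])

x0, *_ = np.linalg.lstsq(A, b, rcond=None)   # a particular solution
N = null_space(A)                             # basis of N(A), shape (3, 2)

x = x0 + N @ np.array([1.5, -2.0])            # another solution
print(np.allclose(A @ x0, b), np.allclose(A @ x, b))   # True True
```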
