
EC2303 Final Formula Sheet PDF

This document provides formulas and definitions for key concepts in descriptive statistics, probability, and sampling distributions. It includes: 1) Formulas for population and sample means, variances, covariances, and correlations. 2) Definitions of probability rules like addition, multiplication, and Bayes' theorem. 3) Distributions for discrete and continuous random variables like the binomial, Poisson, uniform, and normal distributions. 4) Formulas for the mean and variance of the sample mean and the central limit theorem.

Uploaded by

norman

EC2303 Formula Sheet

Descriptive statistics:
Population mean, variance and coefficient of variation:
$$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i ;\quad \sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2 ;\quad CV = \frac{\sigma}{\mu} \times 100\%$$

Population covariance and correlation:


$$cov(x, y) = \sigma_{xy} = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu_x)(y_i - \mu_y) ;\quad \rho_{xy} = \frac{cov(x, y)}{\sigma_x \sigma_y} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$$

Sample mean and variance:


$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i ;\quad s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 ;\quad CV = \frac{s}{\bar{x}} \times 100\%$$

Sample covariance and correlation:


$$cov(x, y) = s_{xy} = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}) ;\quad r_{xy} = \frac{cov(x, y)}{s_x s_y} = \frac{s_{xy}}{s_x s_y}$$
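
As a quick numerical check of the sample formulas above, the following Python sketch computes the sample mean, variance, coefficient of variation, covariance and correlation by hand. The data values and the function name `sample_stats` are illustrative assumptions, not part of the formula sheet.

```python
import math

def sample_stats(x, y):
    """Sample mean, variance (n - 1 divisor), CV, covariance and correlation."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    var_x = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
    var_y = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    r_xy = cov_xy / math.sqrt(var_x * var_y)
    cv_x = math.sqrt(var_x) / x_bar * 100  # coefficient of variation, in percent
    return {"x_bar": x_bar, "y_bar": y_bar, "var_x": var_x, "var_y": var_y,
            "cov_xy": cov_xy, "r_xy": r_xy, "cv_x": cv_x}

# Hypothetical data: y is an exact linear function of x, so r_xy should be 1.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
stats = sample_stats(x, y)
print(stats)
```

Replacing the divisor `n - 1` with `n` gives the population versions of the same quantities.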

Introduction to probability:
Permutation rule formula:
$$P_x^n = n(n-1)(n-2)\cdots(n-x+1) = \frac{n!}{(n-x)!}$$
where $x! = x(x-1)(x-2)\cdots(1)$
Combination rule formula:
$$C_x^n = \frac{n(n-1)(n-2)\cdots(n-x+1)}{x(x-1)\cdots(1)} = \frac{n!}{x!\,(n-x)!} = \frac{P_x^n}{x!}$$
where $x! = x(x-1)(x-2)\cdots(1)$
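
A minimal Python sketch of the two counting rules, assuming Python 3.8 or later (where `math.perm` and `math.comb` are available). The values of `n` and `x` are hypothetical.

```python
import math

n, x = 5, 2  # hypothetical values: choose 2 items out of 5

permutations = math.perm(n, x)  # n! / (n - x)!
combinations = math.comb(n, x)  # n! / (x! (n - x)!)

# Cross-check against the factorial forms of the two rules.
assert permutations == math.factorial(n) // math.factorial(n - x)
assert combinations == permutations // math.factorial(x)

print(permutations, combinations)  # prints: 20 10
```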
Addition rule:
𝑃(𝐴) + 𝑃(𝐵) = 𝑃(𝐴 ∪ 𝐵) + 𝑃(𝐴 ∩ 𝐵)
Conditional probability:
$$P(A|B) = \frac{P(A \cap B)}{P(B)}$$
Multiplication rule:
𝑃(𝐴 ∩ 𝐵) = 𝑃(𝐴|𝐵)𝑃(𝐵) = 𝑃(𝐵|𝐴)𝑃(𝐴)
Law of total probability:
$$P(A) = P(A|B_1)P(B_1) + P(A|B_2)P(B_2) + \cdots + P(A|B_k)P(B_k)$$
where $B_1, B_2, \ldots, B_k$ are mutually exclusive and collectively exhaustive events.
Bayes’ theorem:
$$P(A|B) = \frac{P(B|A)P(A)}{P(B)}$$
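
The law of total probability and Bayes' theorem are often used together: the first gives the denominator for the second. The sketch below illustrates this in Python; the function name `posterior` and all probabilities are hypothetical.

```python
def posterior(priors, likelihoods):
    """Return P(A) and the list of P(B_i | A).

    priors[i]      = P(B_i), with the B_i mutually exclusive and exhaustive
    likelihoods[i] = P(A | B_i)
    """
    # Law of total probability: P(A) = sum_i P(A | B_i) P(B_i)
    p_a = sum(l * p for l, p in zip(likelihoods, priors))
    # Bayes' theorem applied to each B_i
    return p_a, [l * p / p_a for l, p in zip(likelihoods, priors)]

# Hypothetical numbers: B_1 is rare, and A is much more likely under B_1.
priors = [0.01, 0.99]
likelihoods = [0.95, 0.05]
p_a, post = posterior(priors, likelihoods)
print(p_a, post)
```

Even though A is far more likely under B_1, the posterior probability of B_1 stays modest (about 0.16 here) because its prior is small.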

Discrete random variable distribution:


Probability distribution function
𝑃(𝑥) = 𝑃(𝑋 = 𝑥) for all values of 𝑥
Cumulative probability distribution

$$F(x_m) = P(X \le x_m) = \sum_{x \le x_m} P(x)$$

Mean and variance:

$$E[X] = \mu = \sum_{x} x P(x) ;\quad \sigma^2 = E[(X - \mu)^2] = \sum_{x} (x - \mu)^2 P(x)$$

Let 𝑌 = 𝑔(𝑋) be any function of 𝑋:

$$E[Y] = E[g(X)] = \sum_{x} g(x) P(x)$$
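
The mean and variance are both special cases of the expectation of a function of X, which the following Python sketch makes explicit. The probability table `pmf` and the helper name `expectation` are assumptions chosen for illustration.

```python
def expectation(pmf, g=lambda x: x):
    """E[g(X)] = sum over x of g(x) P(x), for a pmf given as {x: P(x)}."""
    return sum(g(x) * p for x, p in pmf.items())

# Hypothetical distribution of X on the values 0, 1, 2.
pmf = {0: 0.25, 1: 0.5, 2: 0.25}

mu = expectation(pmf)                             # E[X]
var = expectation(pmf, lambda x: (x - mu) ** 2)   # E[(X - mu)^2]
print(mu, var)  # prints: 1.0 0.5
```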

Bernoulli distribution
$$P(X = 1) = P ;\quad P(X = 0) = 1 - P$$
Mean: $\mu = P$; variance: $\sigma^2 = P(1 - P)$
Binomial distribution
$$P(x) = \frac{n!}{x!\,(n-x)!} P^x (1 - P)^{n-x}$$
Mean: $\mu = nP$; variance: $\sigma^2 = nP(1 - P)$
Poisson distribution
$$P(x) = \frac{\lambda^x}{x!} e^{-\lambda} \quad \text{for } x = 0, 1, 2, \ldots$$
Mean: $\mu = \lambda$ and variance: $\sigma^2 = \lambda$
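
The binomial and Poisson probability functions can be checked numerically against their stated means and variances. This is a small Python sketch under assumed parameter values; the function names are illustrative, and the Poisson sum is truncated at 100 terms as an approximation.

```python
import math

def binomial_pmf(x, n, p):
    """P(x) = n! / (x! (n - x)!) * p^x * (1 - p)^(n - x)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """P(x) = lam^x / x! * exp(-lam)."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

# Hypothetical parameters.
n, p = 10, 0.3
binom_mean = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))
binom_var = sum((x - binom_mean) ** 2 * binomial_pmf(x, n, p) for x in range(n + 1))

lam = 2.0
pois_mean = sum(x * poisson_pmf(x, lam) for x in range(100))  # truncated sum

print(binom_mean, binom_var, pois_mean)
```

The computed values should agree with nP = 3, nP(1 - P) = 2.1 and lambda = 2 up to floating-point error.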

Continuous random variable distribution:


Probability of 𝑋 between 𝑎 and 𝑏:
$$P(a < X < b) = \int_{a}^{b} f(x)\,dx$$

Cumulative distribution function:
$$F(x) = P(X \le x) = P(X < x) = \int_{x_{min}}^{x} f(u)\,du$$

Mean and variance:


$$\mu = E[X] = \int_{x_{min}}^{x_{max}} x f(x)\,dx ;\quad \sigma^2 = E[(X - \mu)^2] = \int_{x_{min}}^{x_{max}} (x - \mu)^2 f(x)\,dx$$

Uniform distribution
$$f(x) = \frac{1}{b - a} ;\quad F(x) = \frac{x - a}{b - a} \quad \text{for any } a \le x \le b$$
Mean: $\mu = \frac{a + b}{2}$; variance: $\sigma^2 = \frac{(b - a)^2}{12}$
Normal distribution
For $X \sim N(\mu, \sigma^2)$,
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2 / 2\sigma^2}$$
Standard normal distribution:
For $Z \sim N(0, 1)$,
$$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2 / 2}$$
Exponential distribution:
$$f(t) = \lambda e^{-\lambda t} ;\quad F(t) = 1 - e^{-\lambda t} \quad \text{for } t > 0$$
Mean: $\mu = \frac{1}{\lambda}$ and variance: $\sigma^2 = \frac{1}{\lambda^2}$
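
The continuous densities above translate directly into code. The Python sketch below is illustrative only: the function names are assumptions, and the normal CDF uses the error function `math.erf`, a standard closed form that does not appear on the sheet itself.

```python
import math

def uniform_cdf(x, a, b):
    """F(x) = (x - a) / (b - a) for a <= x <= b."""
    return (x - a) / (b - a)

def normal_pdf(x, mu=0.0, sigma2=1.0):
    """Density of N(mu, sigma2); the defaults give the standard normal."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

def normal_cdf(x, mu=0.0, sigma2=1.0):
    """P(X <= x) for N(mu, sigma2), written with the error function."""
    return 0.5 * (1 + math.erf((x - mu) / math.sqrt(2 * sigma2)))

def exponential_cdf(t, lam):
    """F(t) = 1 - exp(-lam * t) for t > 0."""
    return 1 - math.exp(-lam * t)

print(uniform_cdf(3.0, 2.0, 6.0))   # 0.25
print(normal_pdf(0.0))              # density of Z at 0
print(normal_cdf(1.96))             # close to 0.975
print(exponential_cdf(1.0, 2.0))    # P(T <= 1) when lam = 2
```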
Joint Probability Distribution:
Joint probability distribution:
𝑃(𝑥, 𝑦) = 𝑃(𝑋 = 𝑥 ∩ 𝑌 = 𝑦)
Marginal probability distributions

$$P(x) = \sum_{y} P(x, y) \quad \text{and} \quad P(y) = \sum_{x} P(x, y)$$

Conditional probability distribution:
$$P(y|x) = \frac{P(x, y)}{P(x)} ;\quad P(x|y) = \frac{P(x, y)}{P(y)}$$
Conditional mean

$$\mu_{Y|X} = E[Y|X = x] = \sum_{y} y\, P(y|x)$$

Conditional variance
$$\sigma^2_{Y|X} = E\left[(Y - \mu_{Y|X})^2 \,\middle|\, X = x\right] = \sum_{y} (y - \mu_{Y|X})^2 P(y|x)$$

Covariance

$$Cov(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = \sum_{y}\sum_{x} (x - \mu_X)(y - \mu_Y) P(x, y)$$

Correlation
$$\rho = Corr(X, Y) = \frac{Cov(X, Y)}{\sigma_X \sigma_Y}$$
General rules:
$$E[X + Y] = \mu_X + \mu_Y$$
$$E[X - Y] = \mu_X - \mu_Y$$
$$Var(X + Y) = \sigma_X^2 + \sigma_Y^2 + 2Cov(X, Y)$$
$$Var(X - Y) = \sigma_X^2 + \sigma_Y^2 - 2Cov(X, Y)$$
$$Var(X + Y + Z) = \sigma_X^2 + \sigma_Y^2 + \sigma_Z^2 + 2Cov(X, Y) + 2Cov(X, Z) + 2Cov(Y, Z)$$
$$Var(X - Y + Z) = \sigma_X^2 + \sigma_Y^2 + \sigma_Z^2 - 2Cov(X, Y) + 2Cov(X, Z) - 2Cov(Y, Z)$$
$$Var(X - Y - Z) = \sigma_X^2 + \sigma_Y^2 + \sigma_Z^2 - 2Cov(X, Y) - 2Cov(X, Z) + 2Cov(Y, Z)$$
$$Cov(X, X) = Var(X)$$
$$Cov(X + Y, Z) = Cov(X, Z) + Cov(Y, Z)$$
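
The rule for Var(X + Y) can be verified on a small joint distribution. The Python sketch below uses a hypothetical 2-by-2 joint probability table; the helper `expect` is an assumption introduced for the example.

```python
import math

# Hypothetical joint distribution P(x, y) on {0, 1} x {0, 1}.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def expect(g):
    """E[g(X, Y)] = sum over (x, y) of g(x, y) P(x, y)."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - mu_x) ** 2)
var_y = expect(lambda x, y: (y - mu_y) ** 2)
cov_xy = expect(lambda x, y: (x - mu_x) * (y - mu_y))
rho = cov_xy / math.sqrt(var_x * var_y)

# Variance of the sum computed directly from the joint distribution.
mu_sum = expect(lambda x, y: x + y)
var_sum = expect(lambda x, y: (x + y - mu_sum) ** 2)

print(cov_xy, rho)
print(var_sum, var_x + var_y + 2 * cov_xy)  # the two values should agree
```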

Sampling Distribution of Sample Mean:


Denote $\bar{X}$ as the sample mean of a random sample from a population with mean $\mu$ and variance $\sigma^2$.
Mean of sample mean:
$$\mu_{\bar{X}} = \mu$$
Variance of sample mean if the population size is infinitely large:
$$\sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}$$
Variance of sample mean if the population size is finite (i.e., $n/N > 0.05$):
$$\sigma_{\bar{X}}^2 = \frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1}$$

Central Limit Theorem: Let $X_1, X_2, \ldots, X_n$ be a sample of $n$ independent random variables with an arbitrary distribution with mean $\mu$ and variance $\sigma^2$, and let $\bar{X}$ be the sample mean. As $n$ gets larger, the distribution of $Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$ approaches the standard normal distribution.
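
A small simulation can make the Central Limit Theorem concrete. The Python sketch below draws repeated samples from an exponential population (a skewed, clearly non-normal distribution) and standardizes each sample mean. The seed, the rate `lam`, the sample size and the number of replications are all arbitrary assumptions.

```python
import random
import statistics

random.seed(0)      # fixed seed so the run is reproducible

lam = 1.0           # exponential population: mean 1/lam, variance 1/lam^2
n = 50              # sample size
reps = 2000         # number of simulated samples

mu = 1 / lam
sigma_xbar = (1 / lam) / n ** 0.5   # sigma / sqrt(n)

z_values = []
for _ in range(reps):
    sample = [random.expovariate(lam) for _ in range(n)]
    x_bar = sum(sample) / n
    z_values.append((x_bar - mu) / sigma_xbar)

z_mean = statistics.mean(z_values)
z_sd = statistics.stdev(z_values)
share_within = sum(abs(z) < 1.96 for z in z_values) / reps

print(z_mean, z_sd, share_within)
```

If the approximation is working, the standardized means should have mean near 0, standard deviation near 1, and roughly 95% of them should fall within 1.96 of zero.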

Sampling Distribution of Sample Proportion:


Denote $\hat{p}$ as the sample proportion of a random sample from a population with proportion $P$.
Mean of sample proportion:
$$\mu_{\hat{p}} = P$$
Variance of sample proportion:
$$\sigma_{\hat{p}}^2 = \frac{P(1 - P)}{n}$$

Central Limit Theorem: Let $X_1, X_2, \ldots, X_n$ be a sample of $n$ independent random variables with an arbitrary distribution with population proportion $P$, and let $\hat{p}$ be the sample proportion. As $n$ gets larger, the distribution of $Z = \frac{\hat{p} - P}{\sigma_{\hat{p}}}$ approaches the standard normal distribution.

Sampling Distribution of Sample Variance:


Denote $s^2$ as the sample variance of a random sample from a population with mean $\mu$ and variance $\sigma^2$.
Mean of sample variance:
$$E(s^2) = \sigma^2$$
If the population distribution is normal, then
$$\frac{(n - 1)s^2}{\sigma^2} = \chi^2_{n-1}$$
has a $\chi^2$ distribution with $n - 1$ degrees of freedom.
If the population distribution is normal, variance of sample variance:
$$Var(s^2) = \frac{2\sigma^4}{n - 1}$$
Confidence Interval:
Confidence Level    90%      95%      98%      99%
$\alpha$            0.10     0.05     0.02     0.01
$z_{\alpha/2}$      1.645    1.96     2.33     2.58

If the population is normal or the sample size is large, and the population variance $\sigma^2$ is known, the $100(1 - \alpha)\%$ confidence interval for the population mean is:
$$\bar{X} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$
If the population is normal or the sample size is large, and the population variance $\sigma^2$ is unknown, the $100(1 - \alpha)\%$ confidence interval for the population mean is:
$$\bar{X} \pm t_{n-1,\alpha/2} \frac{s}{\sqrt{n}}$$
If the population is normal or the sample size is large, the $100(1 - \alpha)\%$ confidence interval for the population proportion is:
$$\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

If the population is normal, the $100(1 - \alpha)\%$ confidence interval for the population variance is:
$$\frac{(n - 1)s^2}{\chi^2_{n-1,\alpha/2}} < \sigma^2 < \frac{(n - 1)s^2}{\chi^2_{n-1,1-\alpha/2}}$$
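
The z-based intervals are simple enough to compute by hand or in a few lines of code. The Python sketch below covers the known-variance interval for the mean and the interval for a proportion; the function names and input values are hypothetical, and the critical value 1.96 is taken from the 95% column of the table above. The t and chi-square intervals follow the same pattern once the critical values are looked up in a table.

```python
import math

def mean_ci_known_sigma(x_bar, sigma, n, z):
    """Interval x_bar +/- z * sigma / sqrt(n)."""
    half_width = z * sigma / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

def proportion_ci(p_hat, n, z):
    """Interval p_hat +/- z * sqrt(p_hat (1 - p_hat) / n)."""
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# Hypothetical inputs with the 95% critical value z = 1.96.
mean_low, mean_high = mean_ci_known_sigma(x_bar=100.0, sigma=15.0, n=36, z=1.96)
prop_low, prop_high = proportion_ci(p_hat=0.4, n=600, z=1.96)

print(mean_low, mean_high)   # approximately 95.1 and 104.9
print(prop_low, prop_high)   # approximately 0.3608 and 0.4392
```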

If the population size is infinitely large and the margin of error for the population mean is required to be $ME$, the sample size $n$ is determined by
$$n = \frac{z_{\alpha/2}^2 \sigma^2}{ME^2}$$
If the population size is finite and the margin of error for the population mean is required to be $ME$, the sample size $n$ is determined by
$$n = \frac{n_0 N}{n_0 + N - 1}$$
where $n_0 = \frac{z_{\alpha/2}^2 \sigma^2}{ME^2}$

If the margin of error for the population proportion is required to be $ME$, the sample size $n$ is determined by
$$n = \frac{0.25\, z_{\alpha/2}^2}{ME^2}$$
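
The three sample-size formulas can be sketched in Python as follows. The function names and the inputs are hypothetical. Rounding the result up to the next whole number is a common convention and an assumption here; the sheet itself does not state a rounding rule.

```python
import math

def n_for_mean(z, sigma, me):
    """n = z^2 sigma^2 / ME^2 (infinitely large population)."""
    return z ** 2 * sigma ** 2 / me ** 2

def n_for_mean_finite(z, sigma, me, pop_size):
    """n = n0 N / (n0 + N - 1), with n0 from the infinite-population formula."""
    n0 = n_for_mean(z, sigma, me)
    return n0 * pop_size / (n0 + pop_size - 1)

def n_for_proportion(z, me):
    """n = 0.25 z^2 / ME^2 (worst case, P = 0.5)."""
    return 0.25 * z ** 2 / me ** 2

# Hypothetical inputs at the 95% level (z = 1.96); results rounded up.
n_infinite = math.ceil(n_for_mean(1.96, sigma=15, me=3))
n_finite = math.ceil(n_for_mean_finite(1.96, sigma=15, me=3, pop_size=1000))
n_prop = math.ceil(n_for_proportion(1.96, me=0.05))

print(n_infinite, n_finite, n_prop)  # prints: 97 88 385
```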
Hypothesis Test:
One-sided

Significance Level    10%      5%       1%
$\alpha$              0.10     0.05     0.01
$z_{\alpha}$          1.28     1.645    2.33

Two-sided

Significance Level    10%      5%       1%
$\alpha$              0.10     0.05     0.01
$z_{\alpha/2}$        1.645    1.96     2.58

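If the population variance $\sigma^2$ is known, reject the null hypothesis $H_0: \mu = \mu_0$ at significance level $\alpha$,
against the one-sided alternative $H_1: \mu > \mu_0$ if
$$z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} > z_{\alpha}$$
against the one-sided alternative $H_1: \mu < \mu_0$ if
$$z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} < -z_{\alpha}$$
against the two-sided alternative $H_1: \mu \ne \mu_0$ if
$$z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} > z_{\alpha/2} \quad \text{or} \quad z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} < -z_{\alpha/2}$$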

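If the population variance $\sigma^2$ is unknown, reject the null hypothesis $H_0: \mu = \mu_0$ at significance level $\alpha$,
against the one-sided alternative $H_1: \mu > \mu_0$ if
$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} > t_{n-1,\alpha}$$
against the one-sided alternative $H_1: \mu < \mu_0$ if
$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} < -t_{n-1,\alpha}$$
against the two-sided alternative $H_1: \mu \ne \mu_0$ if
$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} > t_{n-1,\alpha/2} \quad \text{or} \quad t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} < -t_{n-1,\alpha/2}$$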
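
The decision rules above share one structure: compute a test statistic, then compare it with a critical value from a table. The Python sketch below illustrates the known-variance case with hypothetical data; the function names and the strings used for `alternative` are assumptions made for the example. The same `reject_h0` comparison applies to the t statistic when a t critical value is supplied instead.

```python
import math

def z_statistic(x_bar, mu0, sigma, n):
    """z = (x_bar - mu0) / (sigma / sqrt(n))."""
    return (x_bar - mu0) / (sigma / math.sqrt(n))

def reject_h0(stat, crit, alternative):
    """Apply the rejection rule.

    crit is z_alpha (or t_{n-1,alpha}) for one-sided alternatives and
    z_{alpha/2} (or t_{n-1,alpha/2}) for the two-sided alternative.
    """
    if alternative == "greater":      # H1: mu > mu0
        return stat > crit
    if alternative == "less":         # H1: mu < mu0
        return stat < -crit
    if alternative == "two-sided":    # H1: mu != mu0
        return stat > crit or stat < -crit
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")

# Hypothetical sample: x_bar = 52, H0: mu = 50, sigma = 6, n = 36.
z = z_statistic(x_bar=52.0, mu0=50.0, sigma=6.0, n=36)

print(z)                                   # 2.0
print(reject_h0(z, 1.645, "greater"))      # True at the 5% level
print(reject_h0(z, 1.96, "two-sided"))     # True at the 5% level
print(reject_h0(z, 2.58, "two-sided"))     # False at the 1% level
```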

Univariate Regression (Least Squares Regression):


Population regression model:
$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i$$
where $\epsilon_i$ is the population error.
Sample regression model:
$$y_i = \hat{y}_i + e_i = b_0 + b_1 x_i + e_i$$
where $e_i$ is the regression residual.
Slope coefficient
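$$b_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = \frac{cov(x, y)}{s_x^2} = r \frac{s_y}{s_x}$$
where $cov(x, y) = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{n - 1}$ and $r = \frac{cov(x, y)}{s_x s_y}$.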

Intercept coefficient
$$b_0 = \bar{y} - b_1 \bar{x}$$
Error/residual sum of squares, SSE, is:
$$SSE = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
Total sum of squares, SST, is
$$SST = \sum_{i=1}^{n} (y_i - \bar{y})^2$$
Regression sum of squares, SSR, is
$$SSR = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$$
$$SST = SSR + SSE$$


Coefficient of determination, R-squared, $R^2$:
$$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$
Under univariate least squares regression,
$$r^2 = R^2$$
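
The least squares formulas above can be traced on a tiny data set. The following Python sketch is illustrative only: the five data points and the function name `fit_line` are assumptions. It computes the coefficients, the three sums of squares and R-squared, and also the sample correlation so that the identity between its square and R-squared can be checked.

```python
import math

def fit_line(x, y):
    """Least squares fit of y on x with sums of squares and R-squared."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b1 = sxy / sxx               # slope
    b0 = y_bar - b1 * x_bar      # intercept
    fitted = [b0 + b1 * xi for xi in x]

    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    ssr = sum((fi - y_bar) ** 2 for fi in fitted)

    # r = cov(x, y) / (s_x s_y); the (n - 1) divisors cancel.
    r = sxy / math.sqrt(sxx * sst)
    return {"b0": b0, "b1": b1, "sse": sse, "sst": sst, "ssr": ssr,
            "r2": ssr / sst, "r": r}

# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
fit = fit_line(x, y)
print(fit)
```

For these data the fit gives a slope of 0.6 and an intercept of 2.2, with SST = 6 splitting into SSR = 3.6 and SSE = 2.4, up to floating-point error.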
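If the population error $\epsilon$ follows a normal distribution and the variance of the population error $\sigma^2$ is known, $b_1$ follows the normal distribution:
$$b_1 \sim N(\beta_1, \sigma_{b_1}^2)$$
where
$$\sigma_{b_1}^2 = \frac{\sigma^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = \frac{\sigma^2}{(n - 1)s_x^2}$$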
The estimator for the variance of the population error:
$$\hat{\sigma}^2 = s_e^2 = \frac{\sum_{i=1}^{n} e_i^2}{n - 2}$$
where $s_e$ is called the residual standard error.
If the population error $\epsilon$ follows a normal distribution and the variance of the population error, $\sigma^2$, is unknown,
$$t = \frac{b_1 - \beta_1}{s_{b_1}} \sim t_{n-2}$$
follows the Student's t-distribution with $n - 2$ degrees of freedom, where
$$s_{b_1}^2 = \frac{s_e^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = \frac{s_e^2}{(n - 1)s_x^2}$$
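The $100(1 - \alpha)\%$ confidence interval for the population slope coefficient $\beta_1$ is
$$b_1 - t_{n-2,\alpha/2}\, s_{b_1} < \beta_1 < b_1 + t_{n-2,\alpha/2}\, s_{b_1}$$
Reject the null hypothesis $H_0: \beta_1 = \beta^*$ against the alternative $H_1: \beta_1 \ne \beta^*$ if
$$t = \frac{b_1 - \beta^*}{s_{b_1}} > t_{n-2,\alpha/2} \quad \text{or} \quad t = \frac{b_1 - \beta^*}{s_{b_1}} < -t_{n-2,\alpha/2}$$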
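
Inference on the slope combines the residual standard error, the standard error of the slope, and a t critical value. The Python sketch below is a minimal illustration with hypothetical data and an assumed function name `slope_inference`. The critical value 3.182 is the tabulated two-sided 5% value for 3 degrees of freedom and is passed in by hand rather than computed.

```python
import math

def slope_inference(x, y, t_crit, beta_star=0.0):
    """Standard error, t statistic, confidence interval and test for the slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_e2 = sse / (n - 2)               # estimated error variance
    s_b1 = math.sqrt(s_e2 / sxx)       # standard error of the slope

    t = (b1 - beta_star) / s_b1
    ci = (b1 - t_crit * s_b1, b1 + t_crit * s_b1)
    reject = t > t_crit or t < -t_crit  # two-sided rejection rule
    return {"b1": b1, "s_b1": s_b1, "t": t, "ci": ci, "reject": reject}

# Hypothetical data with n = 5, so n - 2 = 3 degrees of freedom.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
result = slope_inference(x, y, t_crit=3.182)  # t_{3, 0.025} from a t table
print(result)
```

With so few observations the interval for the slope is wide and contains zero, so the null hypothesis of a zero slope is not rejected at the 5% level in this example.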
