Review of Statistics
Topics
Descriptive Statistics
Mean, Variance
Probability
Union event, joint event
Random Variables
Discrete and Continuous
Distributions, Moments
Two Random Variables
Covariance and correlation
Central Limit Theorem
Hypothesis testing
z-test, p-value
Simple Linear Regression
Statistical Methods
Two branches: descriptive statistics and inferential statistics.
Descriptive Statistics
Involves: collecting, presenting, and characterizing data.
Purpose: describe data.
[Example figure: bar chart of quarterly data (1st-4th Qtr) for East, West, and North.]
Inferential Statistics
Involves: estimation and hypothesis testing.
Purpose: make decisions about population characteristics.
Descriptive Statistics
Mean
Measure of central tendency
Acts as Balance Point
Affected by extreme values (outliers)
Formula:
X̄ = (1/n) Σ_{i=1}^{n} X_i = (X_1 + X_2 + … + X_n) / n
Median
Measure of central tendency
Middle value in ordered sequence
If n is odd: the middle value of the ordered sequence
If n is even: the average of the two middle values
Value that splits the distribution into two halves
Not affected by extreme values
Median (Example)
Raw Data: 17 16 21 18 13 16 12 11
Ordered: 11 12 13 16 16 17 18 21
Position: 1 2 3 4 5 6 7 8
Median = (16 + 16) / 2 = 16
Mode
Measure of Central Tendency
Value That Occurs Most Often
Not Affected by Extreme Values
There May Be Several Modes
Raw Data: 17 16 21 18 13 16 12 11
Ordered:  11 12 13 16 16 17 18 21
Mode = 16 (occurs most often)
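As a quick illustration, here is a minimal Python sketch (using the standard-library statistics module, our choice rather than the slides') that computes all three measures of central tendency for the slides' example data:

```python
# Central tendency for the slides' example data.
from statistics import mean, median, mode

data = [17, 16, 21, 18, 13, 16, 12, 11]

print(mean(data))    # 15.5  (balance point; pulled by outliers)
print(median(data))  # 16.0  (average of the two middle values, 16 and 16)
print(mode(data))    # 16    (value that occurs most often)
```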
Sample Variance
S² = Σ_{i=1}^{n} (X_i − X̄)² / (n − 1)
   = [ (X_1 − X̄)² + (X_2 − X̄)² + … + (X_n − X̄)² ] / (n − 1)
Note the n − 1 in the denominator! (Use n for the population variance.)
Sample Standard Deviation
S = √S² = √[ Σ_{i=1}^{n} (X_i − X̄)² / (n − 1) ]
  = √[ ( (X_1 − X̄)² + (X_2 − X̄)² + … + (X_n − X̄)² ) / (n − 1) ]
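A short sketch of these two formulas, computed both by hand (note the n − 1 denominator) and with the standard library; the data reuses the example above:

```python
# Sample variance and standard deviation with the n - 1 denominator.
# (statistics.pvariance would use n instead, for a population.)
from statistics import variance, stdev

data = [17, 16, 21, 18, 13, 16, 12, 11]

xbar = sum(data) / len(data)
s2 = sum((x - xbar) ** 2 for x in data) / (len(data) - 1)  # manual S^2

print(s2)              # 11.142857...
print(variance(data))  # same value via the standard library
print(stdev(data))     # S = sqrt(S^2), about 3.338
```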
Probability
Event, Sample Space
Event: one possible outcome
Sample space S: the collection of all possible outcomes
Probability of an outcome: the proportion of times that the outcome occurs in the long run
The complement of event A (symbol Ā): all the outcomes that are not part of event A
Properties of an Event
1. Mutually exclusive: two outcomes that cannot occur at the same time
2. Collectively exhaustive: one outcome in the sample space must occur
Example: observe the gender of one person; the outcomes are mutually exclusive and collectively exhaustive.
Joint Events
Joint event: an event that has two or more characteristics
A ∩ B means the intersection of event (set) A and event (set) B
Example: A and B (A ∩ B): female, under age 20
Compound Events
Union of event A and event B (A ∪ B): the total area of the two circles in a Venn diagram
A ∪ B contains all the outcomes that are part of event (set) A, part of event (set) B, or part of both A and B
Compound Probability Addition Rule
Used to get compound probabilities for unions of events:
P(A or B) = P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
For mutually exclusive events:
P(A or B) = P(A ∪ B) = P(A) + P(B)
Random variables
Random variable:
a numerical summary of a random outcome
a function that assigns a numerical value to each simple event in a sample space
Discrete or continuous random variables
Discrete: only a discrete set of possible values
=> summarized by a probability distribution: a list of all possible values of the variable and the probability that each value will occur
Continuous: a continuum of possible values
=> summarized by the probability density function (pdf)
Discrete Probability Distribution
1. A list of pairs [X_i, P(X_i)]
   X_i = value of the random variable (outcome)
   P(X_i) = probability associated with that value
2. Mutually exclusive (no overlap)
3. Collectively exhaustive (nothing left out)
4. 0 ≤ P(X_i) ≤ 1
5. Σ P(X_i) = 1
Joint Probability Using a Contingency Table

           Event B1      Event B2      Total
Event A1   P(A1 ∩ B1)    P(A1 ∩ B2)    P(A1)
Event A2   P(A2 ∩ B1)    P(A2 ∩ B2)    P(A2)
Total      P(B1)         P(B2)         1

Joint distribution: the cell entries P(Ai ∩ Bj)
Marginal distributions: the row and column totals P(Ai) and P(Bj)
Conditional distribution: P(A1 | B1) = P(A1 ∩ B1) / P(B1),  P(A2 | B1) = P(A2 ∩ B1) / P(B1)
Contingency Table Example
Joint event: draw 1 card; note kind and color.

Type      Red     Black   Total
Ace       2/52    2/52     4/52
Non-Ace  24/52   24/52    48/52
Total    26/52   26/52    52/52

Marginal probabilities: P(Red) = 26/52, P(Ace) = 4/52
Joint probability: P(Ace and Red) = 2/52
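To make the table concrete, here is a small sketch (using Python's fractions module for exact arithmetic, our choice) that reproduces the joint, marginal, and conditional probabilities from the card table, together with the addition rule from earlier:

```python
# Card-deck contingency table: joint, marginal, conditional probabilities.
from fractions import Fraction

F = Fraction
p_ace_red = F(2, 52)   # joint: P(Ace and Red)
p_ace     = F(4, 52)   # marginal: P(Ace)
p_red     = F(26, 52)  # marginal: P(Red)

# Addition rule: P(Ace or Red) = P(Ace) + P(Red) - P(Ace and Red)
p_ace_or_red = p_ace + p_red - p_ace_red
print(p_ace_or_red)       # 7/13  (= 28/52)

# Conditional probability: P(Ace | Red) = P(Ace and Red) / P(Red)
print(p_ace_red / p_red)  # 1/13  (= 2/26)
```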
Moments: Mean, Variance (Discrete Case)
Moment: a summary of a certain aspect of a distribution.
Mean (expected value) of a probability distribution — the weighted average of all possible values:
μ = E(X) = Σ X_i P(X_i)
Variance — the weighted average squared deviation about the mean:
σ² = E[(X − μ)²] = Σ (X_i − μ)² P(X_i)
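As a worked example of these formulas (a fair six-sided die, our example rather than the slides'):

```python
# Discrete moments for a fair die: each outcome 1..6 has probability 1/6.
outcomes = [(x, 1 / 6) for x in range(1, 7)]

mu = sum(x * p for x, p in outcomes)               # E(X)
var = sum((x - mu) ** 2 * p for x, p in outcomes)  # E[(X - mu)^2]

print(mu)   # 3.5
print(var)  # 2.9166...  (= 35/12)
```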
Statistical Independence
When the outcome of one event (B) does not affect the
probability of occurrence of another event (A), the events A
and B are said to be statistically independent.
Example: Toss a coin twice => no causality
Condition for independence: two events A and B are statistically independent if and only if (iff)
P(A | B) = P(A)
Bayes Theorem and Multiplication Rule
Bayes' theorem:
P(A | B) = P(A ∩ B) / P(B)
The difficult part is P(A ∩ B). Use the multiplication rule to derive it:
P(A and B) = P(A ∩ B) = P(A) P(B | A) = P(B) P(A | B)
For independent events:
P(A and B) = P(A ∩ B) = P(A) P(B)
Covariance
Measures the joint variability of two random variables
σ_XY = Σ_{i=1}^{N} (X_i − μ_X)(Y_i − μ_Y) P(X_i, Y_i)
Can take any value in the real numbers.
Depends on the units of measurement (e.g., dollars, cents, billions of dollars).
Example:
positive covariance = y and x are positively related;
when y is above its mean, x tends to be above its mean;
when y is below its mean, x tends to be below its mean.
Correlation
Standardized covariance, takes values in [-1, 1]
Does not depend on unit of measurement
Correlation coefficient (ρ):
ρ = cov(X, Y) / (σ_X σ_Y) = σ_XY / (σ_X σ_Y)
Covariance and correlation measure only linear dependence!
Example: Cov(X, Y) = 0 does not necessarily imply that Y and X are independent; they may be non-linearly related.
But if X and Y are jointly normally distributed, zero covariance does imply that they are independent.
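A brief NumPy sketch of sample covariance and correlation; the data below is made up purely for illustration:

```python
# Sample covariance and correlation with NumPy.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.0, 9.8])  # roughly linear in x

cov_xy = np.cov(x, y)[0, 1]    # off-diagonal entry of the 2x2 covariance matrix
rho = np.corrcoef(x, y)[0, 1]  # standardized: always in [-1, 1]

print(cov_xy)  # positive: x and y move together
print(rho)     # close to +1: strong positive linear dependence
```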
Sum of Two Random Variables
Expected value of the sum of two random variables:
E(X + Y) = E(X) + E(Y)
Variance of the sum of two random variables:
Var(X + Y) = σ²_{X+Y} = σ²_X + σ²_Y + 2σ_XY
Continuous Probability Distributions: Normal Distribution
Bell-shaped, symmetrical
Mean, median, and mode are equal
Infinite range
68% of the data are within 1 standard deviation of the mean
95% of the data are within 2 standard deviations of the mean
In the early 1800s, the German mathematician and physicist Carl Gauss used it to analyze astronomical data; it is therefore also known as the Gaussian distribution.
Normal Distribution: Probability Density Function
f(X) = (1 / (σ √(2π))) · exp( −(1/2) ((X − μ) / σ)² )
where
f(X) = density of random variable X
π ≈ 3.14159; e ≈ 2.71828
σ = population standard deviation
X = value of the random variable (−∞ < X < ∞)
μ = population mean
Effect of Varying Parameters (μ & σ)
[Figure: three normal curves A, B, C illustrating how changing μ shifts the curve and changing σ widens or narrows it.]
Normal Distribution Probability
Probability is the area under the curve!
P(c ≤ X ≤ d) = ∫_c^d f(x) dx
An Infinite Number of Normal Distribution Tables?
Normal distributions differ by mean and standard deviation, so each distribution would require its own table. That's an infinite number!
Standardize the Normal Distribution
Z = (X − μ) / σ
Any normal distribution can be converted to the standardized normal distribution, which has μ_Z = 0 and σ_Z = 1.
One table!
Standardizing Example
Normal distribution: μ = 5, σ = 10, X = 6.2
Z = (X − μ) / σ = (6.2 − 5) / 10 = 0.12
Standardized normal distribution: μ_Z = 0, σ_Z = 1, Z = 0.12
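The standardizing example can be checked numerically; this sketch assumes SciPy is available and uses its norm distribution object:

```python
# Standardization: Z = (X - mu) / sigma, with the slides' numbers.
from scipy.stats import norm

mu, sigma, x = 5.0, 10.0, 6.2
z = (x - mu) / sigma
print(z)                       # 0.12

# The same probability from either parameterization:
print(norm.cdf(x, mu, sigma))  # P(X <= 6.2)
print(norm.cdf(z))             # P(Z <= 0.12) -- identical
```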
Moments: Mean, Variance
(Continuous Case)
Mean (expected value) of a probability distribution — the weighted average of all possible values:
μ = E(X) = ∫_{−∞}^{∞} X f(X) dX
Variance — the weighted average squared deviation about the mean:
σ² = E[(X − μ)²] = ∫_{−∞}^{∞} (X − μ)² f(X) dX
Moments: Skewness, Kurtosis
Skewness: S = E[(X − μ)³] / σ³
Measures asymmetry in a distribution. The larger the absolute size of the skewness, the more asymmetric the distribution. A large positive value indicates a long right tail, a large negative value indicates a long left tail, and a zero value indicates symmetry around the mean.
Kurtosis: K = E[(X − μ)⁴] / σ⁴
Measures the thickness of the tails of a distribution. A kurtosis above three indicates fat tails (leptokurtosis) relative to the normal, i.e. extreme events are more likely to occur.
Central Limit Theorem: Basic Idea
As the sample size gets large (n ≥ 30), the sampling distribution of the sample mean approaches a normal distribution.
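A small simulation sketch of this idea (assuming NumPy): sample means of a decidedly non-normal uniform distribution already behave normally at n = 30:

```python
# CLT simulation: means of uniform samples concentrate around the
# population mean with standard error sigma / sqrt(n).
import numpy as np

rng = np.random.default_rng(0)
n, reps = 30, 10_000
sample_means = rng.uniform(0, 1, size=(reps, n)).mean(axis=1)

print(sample_means.mean())  # close to 0.5 (the population mean)
print(sample_means.std())   # close to sqrt(1/12)/sqrt(30), about 0.0527
```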
Important Continuous Distributions
All derived from the normal distribution:
χ² distribution: arises from squared normal random variables
t distribution: arises from ratios of normal and χ² variables
F distribution: arises from ratios of χ² variables
[Figures: the χ² distribution; the t distribution (red) against the normal distribution (blue); the F distribution.]
Fundamentals of Hypothesis Testing
Identifying Hypotheses
1. Pose the question, e.g. test that the population mean is equal to 3
2. State the question statistically (H0: μ = 3)
3. State its opposite statistically (H1: μ ≠ 3)
   Hypotheses are mutually exclusive & exhaustive
   Sometimes it is easier to form the alternative hypothesis first.
4. Choose the level of significance α
   Typical values are 0.01, 0.05, 0.10
Rejection region of the sampling distribution: the unlikely values of the sample statistic if the null hypothesis is true
Identifying Hypotheses: Examples
1. Is the population average amount of TV viewing 12 hours?
   H0: μ = 12
   H1: μ ≠ 12
2. Is the population average amount of TV viewing different from 12 hours?
   H0: μ = 12
   H1: μ ≠ 12
Hypothesis Testing: Basic Idea
[Figure: sampling distribution of the sample mean under H0: μ = 50, with an observed sample mean of 20 far out in the tail.]
It is unlikely that we would get a sample mean of this value if in fact this were the population mean. Therefore, we reject the null hypothesis that μ = 50.
Example: Z-Test Statistic (σ known)
1. Convert the sample statistic (e.g., X̄) to a standardized Z variable:
Z = (X̄ − μ) / σ_X̄,  where σ_X̄ = σ / √n
2. Compare to the critical Z values:
If the Z-test statistic falls in the critical region, reject H0; otherwise, do not reject H0.
p-value
Probability of obtaining a test statistic at least as extreme as the actual sample value, given that H0 is true
The smallest value of α for which H0 can be rejected
Used to make the rejection decision:
If p-value ≥ α, do not reject H0
If p-value < α, reject H0
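Putting the Z-test and the p-value together, here is a sketch with made-up numbers (X̄ = 52.5, σ = 10, n = 100, testing H0: μ = 50), assuming SciPy for the normal cdf:

```python
# Two-sided z-test (sigma known): standardize the sample mean and
# compare the p-value to alpha.
from math import sqrt
from scipy.stats import norm

mu0, sigma, n, xbar = 50.0, 10.0, 100, 52.5  # hypothetical sample
z = (xbar - mu0) / (sigma / sqrt(n))         # Z-test statistic
p_value = 2 * (1 - norm.cdf(abs(z)))         # two-tailed p-value

alpha = 0.05
print(z, p_value)       # z = 2.5, p about 0.0124
print(p_value < alpha)  # True -> reject H0: mu = 50
```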
One-Tailed Test: Rejection Region
Left-tailed: H0: μ ≥ μ0, H1: μ < μ0 — reject H0 in the left tail; the sample statistic must be significantly below μ0.
Right-tailed: H0: μ ≤ μ0, H1: μ > μ0 — reject H0 in the right tail; here, small values don't contradict H0.
One-Tailed Z Test: Finding Critical Z Values
What is Z given α = 0.025?
.500 − .025 = .475; find .475 in the body of the standardized normal probability table (portion):

Z      .05     .06     .07
1.6   .4505   .4515   .4525
1.7   .4599   .4608   .4616
1.8   .4678   .4686   .4693
1.9   .4744   .4750   .4756

=> Z = 1.96
Two-Tailed Test: Rejection Regions
H0: μ = μ0, H1: μ ≠ μ0
The rejection region is split between the two tails of the sampling distribution, with probability α/2 in each tail. The nonrejection region in the middle has probability 1 − α (the level of confidence) and is bounded by a critical value on each side of μ0.
t-test, F-test
The test statistic may not be normally distributed => z-test not applicable.
Examples:
Variance unknown, but estimated; or testing the hypothesis that the slope of a regression line differs significantly from zero => t-test
Testing the hypothesis that the standard deviations of two normally distributed populations are equal => F-test
Jarque-Bera test
Assesses whether a given sample of data is normally distributed.
Aggregates information in the data about both skewness and kurtosis: a test of the hypothesis that S = 0 and K = 3, based on the sample estimates Ŝ and K̂.
Test statistic:
JB = (T/6) [ Ŝ² + (1/4)(K̂ − 3)² ]
(here T is the number of observations)
Under the null hypothesis of independent normally distributed observations, the Jarque-Bera statistic is distributed in large samples as a χ² random variable with 2 degrees of freedom.
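SciPy ships this test as scipy.stats.jarque_bera; a quick sketch on simulated data (our own, for illustration):

```python
# Jarque-Bera normality test on normal vs. fat-tailed samples.
import numpy as np
from scipy.stats import jarque_bera

rng = np.random.default_rng(0)
normal_data = rng.normal(size=1000)
fat_tailed = rng.standard_t(df=3, size=1000)  # leptokurtic sample

print(jarque_bera(normal_data))  # small JB, large p: do not reject normality
print(jarque_bera(fat_tailed))   # large JB, tiny p: reject normality
```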
Simple Linear Regression
Simple Linear Regression Model
y_t = β0 + β1 x_t + ε_t
y_t: dependent (response) variable
x_t: independent (explanatory) variable
β0: y-intercept
β1: slope
ε_t: random iid error, ε_t ~ (0, σ²)
Linear Regression Assumptions
1. x is exogenously determined
2. ε_t are iid(0, σ²)
   (iid = independently and identically distributed)
   Zero mean
   Independence of errors (no autocorrelation)
   Constant variance (homoscedasticity)
More things to think about:
Normality of ε_t (if not satisfied, inference procedures are only asymptotically valid)
Model specification (e.g. linearity, β1 constant over time?)
Simple Linear Regression Model
[Figure: population regression line with observed values scattered around it.]
y_t = β0 + β1 x_t + ε_t, where ε_t is the disturbance
E[y | x*] = β0 + β1 x*
Sample Linear Regression Model
[Figure: fitted line through the sampled observations; an unsampled observation lies off the line.]
y_i = b0 + b1 x_i + e_i, where e_i is the random error (residual)
ŷ_i = b0 + b1 x_i  (fitted line)
Ordinary Least Squares
OLS minimizes the sum of squared residuals (y_t − ŷ_t):
min_{β̂0, β̂1} Σ_{t=1}^{T} (y_t − β̂0 − β̂1 x_t)² = Σ_{t=1}^{T} e_t²
[Figure: observed values y_t = β0 + β1 x_t + ε_t scattered around the fitted line, with residuals e_1, e_2, e_3, e_4 measured vertically.]
Fitted value (in-sample forecast): ŷ_t = β̂0 + β̂1 x_t
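A minimal OLS sketch using the closed-form estimates, cross-checked against np.polyfit; the data is invented for illustration:

```python
# OLS slope and intercept from the closed-form formulas.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.0, 9.8])

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
print(b0, b1)               # OLS estimates

print(np.polyfit(x, y, 1))  # [b1, b0] -- same line, library version

residuals = y - (b0 + b1 * x)  # e_t = y_t - yhat_t
print(np.sum(residuals ** 2))  # the minimized sum of squared residuals
```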
On Thursday: Evaluating the Model
1. Examine variation measures
   coefficient of determination (goodness of fit)
   standard error of the estimate
2. Analyze the residuals e_t of the fitted model ŷ_t = β̂0 + β̂1 x_t
   serial correlation
3. Test coefficients for significance
Random Error Variation
1. Variation of actual Y from predicted Ŷ
2. Measured by the standard error of the estimate
   Sample standard deviation of the residuals e
   Denoted S_YX
3. Affects several factors
   Parameter significance
   Prediction accuracy
Measures of Variation in Regression
1. Total sum of squares (SST)
   Measures variation of the observed Y_i around their mean Ȳ
2. Explained variation (SSR)
   Variation due to the relationship between X and Y
3. Unexplained variation (SSE)
   Variation due to other factors
Variation Measures
[Figure: decomposition of the deviation of Y_i from Ȳ around the fitted line Ŷ_i = b0 + b1 X_i.]
Total sum of squares: Σ (Y_i − Ȳ)²
Unexplained sum of squares: Σ (Y_i − Ŷ_i)²
Explained sum of squares: Σ (Ŷ_i − Ȳ)²
SST = SSR + SSE
Coefficient of Determination
Proportion of variation explained by the relationship between X and Y:
r² = SSR / SST = Explained Variation / Total Variation,  0 ≤ r² ≤ 1
Computational form:
r² = [ b0 Σ Y_i + b1 Σ X_i Y_i − n Ȳ² ] / [ Σ Y_i² − n Ȳ² ]
Coefficients of Determination (r²) and Correlation (r)
[Figure: four scatter plots with fitted lines Ŷ_i = b0 + b1 X_i, showing r² = 1, r = +1; r² = 1, r = −1; r² = .8, r = +0.9; r² = 0, r = 0.]
Standard Error of Estimate
S_YX = √[ Σ_{i=1}^{n} (Y_i − Ŷ_i)² / (n − 2) ]
     = √[ ( Σ Y_i² − b0 Σ Y_i − b1 Σ X_i Y_i ) / (n − 2) ]
Residual Analysis
1. Graphical analysis of residuals
   Plot residuals vs. X_i values
   Residuals estimate the errors: the difference between actual Y_i and predicted Ŷ_i
2. Purposes
   Examine the functional form (linear vs. non-linear model)
   Evaluate violations of assumptions
Test of Slope Coefficient for Significance
1. Tests whether there is a linear relationship between X and Y
2. Hypotheses
   H0: β1 = 0 (no linear relationship)
   H1: β1 ≠ 0 (linear relationship)
3. Test statistic (with n − 2 degrees of freedom):
t = (b1 − β1) / S_b1,  where S_b1 = S_YX / √( Σ_{i=1}^{n} X_i² − n X̄² )
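To close the loop, here is a sketch of the slope test on the same kind of invented data as above, assuming SciPy for the t distribution:

```python
# Slope significance test: t = (b1 - 0) / S_b1 with n - 2 degrees of freedom.
import numpy as np
from scipy.stats import t as t_dist

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.0, 9.8])
n = len(x)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

s_yx = np.sqrt(np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2))  # std error of estimate
s_b1 = s_yx / np.sqrt(np.sum((x - x.mean()) ** 2))          # std error of slope

t_stat = b1 / s_b1                              # under H0: beta1 = 0
p_value = 2 * t_dist.sf(abs(t_stat), df=n - 2)  # two-tailed
print(t_stat, p_value)  # large |t|, small p -> slope is significant
```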