Mod 1 Stats

Uploaded by

jennylehuynh29

Module 1: Descriptive Statistics

Lecture 1: Introduction and Descriptive Statistics

Data Types
Qualitative/categorical
● Mutually exclusive labels (one label cannot mean two things)
● Usually not numbers; if numbers are used, they have no mathematical meaning
- Nominal: ordering/ranking makes no sense, numerical labels are arbitrary
- Ordinal: ordering/ranking has meaning/can be interpreted, numerical labels respect the ordering
Quantitative/numerical
● Numbers used to record certain events, numbers have mathematical meaning
- Interval: the difference between two quantities is meaningful, but their ratio is not; zero has no natural meaning (eg. temperature in °C)
- Ratio: both the difference and the ratio of two quantities are meaningful; zero is meaningful

Using categorical/qualitative data

Frequency distribution
● Frequency: the total number of occurrences for each category
● Relative frequency: the fraction of the total number of items belonging to a category (eg. 102 ÷ 808 = 0.1262)
● Percent frequency: relative frequency x 100%
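The three quantities above can be sketched in a few lines of Python. The category labels and counts below are invented purely for illustration; only the definitions (count, count ÷ total, and that fraction x 100%) come from the notes.

```python
from collections import Counter

# Hypothetical categorical sample (labels are made up for illustration)
responses = ["bus", "car", "car", "bike", "car", "bus", "car", "bike"]

counts = Counter(responses)        # frequency: occurrences per category
total = sum(counts.values())       # total number of items

# Relative frequency: fraction of all items that fall in each category
relative = {cat: n / total for cat, n in counts.items()}

# Percent frequency: relative frequency x 100%
percent = {cat: 100 * r for cat, r in relative.items()}

for cat in counts:
    print(cat, counts[cat], round(relative[cat], 4), round(percent[cat], 2))
```

The relative frequencies always sum to 1 and the percent frequencies to 100%, which is a quick sanity check on any frequency table.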
Histograms
● Categories on x-axis
● Frequency, relative frequency, percent frequency on y-axis

Using numerical/quantitative data

Frequency distributions and histograms
● Categories on x-axis are grouped (eg. 0-5, 5-10, 10-15)
● Density frequency

Probability theory
● Random variable (r.v.) - a variable whose value is determined by a random outcome
● Population - the complete pool of possible observations of a certain random variable
● Sample - a random collection of a certain size drawn from the population

Probability distribution
● Probability distribution - the general shape of probability for
values that a random variable may take

Notation
● Random variable denoted by X, Y (capital letters)
- Eg. X: number of children in household
- Eg. Y: amount of time spent by husband on
housework per day
● Realisations/observations of a random variable are denoted by xᵢ, yᵢ (lowercase letters with a subscript indexing the observation)
- Eg. x₁: number of children in the 1st household observed
- Eg. y₁₃₇: amount of time spent on housework per day by the husband in the 137th household observed
● N and n denote the size or number of observations.
- N denotes the population size
- n denotes the sample size

Descriptive Statistics
Central tendency
● A measure of central tendency yields info about the centre of a set of numbers (the distribution of a r.v.) – it does not focus on the span of the dataset or how far values are from the middle numbers
● Gives an idea of the typical, middle, or average value that a r.v. can take
● Sometimes called measures of location

three measures of central tendency

Mode ● most frequently occurring value in a set of data

● If there are 2 modes, the 2 modes are listed and the data is said to be bimodal
● Datasets with 3 or more modes are referred to as multimodal
● Concept of mode is often used in determining sizes
● Appropriate descriptive summary measure for categorical data

Median ● middle value in an ordered array of numbers

● locate the median by finding the (n+1)/2-th term in the ordered array (when n is even, this falls halfway between the two middle terms, so take their average)
● Large and small values do not inordinately influence the median, hence it is the best measure of location to use in the analysis of variables in which extreme but acceptable values can occur at just one end of the data
● Not all info from the dataset is used
● Data must be quantitative or be able to be ranked

Mean ● Average of a set of numbers

● Sample mean is represented by X̄
● Population mean is represented by µ
● Data should be quantitative as it needs to be summed
● Affected by all values – advantage because it reflects all the data, but
disadvantage because extreme values pull the mean towards extremes
● To calculate the mean forecast value (the expected value of a r.v.), multiply each possible value by its probability and sum the products.

- If we denote the r.v. by X: $E(X) = \sum_i x_i \, P(X = x_i)$
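As a rough illustration of the three measures and of the probability-weighted mean, here is a minimal Python sketch using the standard `statistics` module. The sample and the probability table are hypothetical; the sample deliberately contains one extreme value to show how the mean is pulled towards it while the median is not.

```python
import statistics

# Hypothetical sample with one extreme (but acceptable) value at the top end
data = [2, 3, 3, 5, 7, 8, 40]

mode = statistics.mode(data)      # most frequently occurring value
median = statistics.median(data)  # (n+1)/2-th term of the ordered data
mean = statistics.mean(data)      # sum of values divided by n

# Mean of a discrete r.v. X: multiply each value by its probability and sum.
# The values and probabilities below are assumed for illustration only.
values = [0, 1, 2, 3]
probs = [0.1, 0.4, 0.3, 0.2]
expected = sum(x * p for x, p in zip(values, probs))

print(mode, median, round(mean, 3), round(expected, 3))
```

Here the median stays at 5 while the mean is close to 9.7, which reflects the point that extreme values influence the mean far more than the median.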

Variability
● Measures of variability yield info about how far realisations of the r.v. tend to lie from the centre of its distribution; they describe the spread/dispersion of a dataset
● Gives an idea of fluctuation and volatility across realisations of the r.v.
● The more variability in a dataset, the less typical the central value is of the whole set
● Using measures of variability in conjunction with measures of central tendency makes possible a more complete numerical description of the data (a measure of variability is necessary to complement the mean value when describing data)
● The more spread out the r.v. is, the larger its risk/dispersion
● Also called measures of scale, spread, dispersion or risk
● Measures of variability
- Variance (Var): average of squared distances from the mean
- Standard deviation (std): square root of variance
- Coefficient of variation (CV): standard deviation ÷ mean x 100%

Variability formulas
Variance
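● Variance is the average squared distance between the data points and their mean; the formula depends on whether the data are a population or a sample
● Population variance
- Finite population of size N
- Denoted by σ² (sigma squared) or Var(X), the variance of X
- $\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2$
● Sample variance
- Denoted by s²
- $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{X})^2$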
Standard deviation
● Standard deviation solves the problem of squared units: it has the same unit as the original data
● Population standard deviation
- Denoted by σ (sigma) or std(X): $\sigma = \sqrt{\sigma^2}$
● Sample standard deviation
- Denoted by s: $s = \sqrt{s^2}$
Coefficient of variation
● Measures standard deviation per unit of mean
● In finance, when the r.v. X denotes asset returns, CV measures risk per unit of expected return
● It is unit free, because the numerator and denominator both have the same unit as the original data, so the units cancel
● Population CV: $CV = \frac{\sigma}{\mu} \times 100\%$
- when σ increases, CV increases
- when µ increases, CV decreases
- Ratio between risk and expected return
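The variability measures can be checked numerically with Python's `statistics` module. This is only a sketch: the return figures are hypothetical, and it assumes the usual convention that the population variance divides by N while the sample variance divides by n − 1.

```python
import statistics

# Hypothetical asset returns in percent (invented for illustration)
returns = [4.0, 6.0, 5.0, 9.0, 6.0]

pop_var = statistics.pvariance(returns)   # population variance: divide by N
samp_var = statistics.variance(returns)   # sample variance: divide by n - 1
pop_std = statistics.pstdev(returns)      # square root of population variance
samp_std = statistics.stdev(returns)      # square root of sample variance

# Coefficient of variation: standard deviation per unit of mean, in percent
cv = pop_std / statistics.mean(returns) * 100

print(round(pop_var, 3), round(samp_var, 3), round(pop_std, 3),
      round(samp_std, 3), round(cv, 2))
```

Because the sample variance divides by a smaller number, it is always at least as large as the population variance computed on the same data.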
Skewness
Shape
● Central tendency and variability are useful to describe and summarise data or the
distribution of r.v.’s
● Skewness - a measure of the asymmetry of a distribution
● In a skewed distribution:
- Mode: value on the horizontal axis where the high point of the curve occurs
- Mean: towards the tail of the distribution (drawn towards the extreme values)
- Median: generally located somewhere between the mode and the mean

Lecture 2: Probability theory

● Multi-dimensional data
● Experiment: a random process that creates outcomes (eg. the data collection
procedure)
● Sample space: the set of all possible outcomes
● Event: a set of outcomes (can contain no outcome, single outcome or multiple
outcomes) of an experiment to which probability is assigned. So an event is a subset
of the sample space
● Relative frequency: outcomes receive probability corresponding to their number of occurrences → P(outcomeᵢ) = number of occurrences of outcomeᵢ ÷ total number of occurrences of all outcomes

Law of addition
Joint vs marginal probabilities
● Distinguish joint and marginal probability through multidimensional outcomes
● Joint probability: denotes relative frequency when asking about all dimensions
- Eg. what is the relative frequency with which a customer bought a $49 plan on a weekday?
● Marginal probability: displays relative frequency when only asking about a single
dimension
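To make the joint/marginal distinction concrete, the sketch below builds both from a small two-dimensional count table. All counts, plan prices other than $49, and the helper name `marginal` are assumptions made for illustration; they are not data from the lecture.

```python
# Hypothetical purchase counts: (plan price, day type) -> number of customers
counts = {
    ("$29", "weekday"): 30, ("$29", "weekend"): 10,
    ("$49", "weekday"): 40, ("$49", "weekend"): 20,
}
total = sum(counts.values())

# Joint probability: relative frequency of each (price, day type) pair
joint = {key: n / total for key, n in counts.items()}

def marginal(dimension, value):
    """Sum the joint probabilities over the other dimension.

    dimension 0 is the plan price, dimension 1 is the day type.
    """
    return sum(p for key, p in joint.items() if key[dimension] == value)

p_49_and_weekday = joint[("$49", "weekday")]  # asks about both dimensions
p_49 = marginal(0, "$49")                     # asks about the price only
p_weekday = marginal(1, "weekday")            # asks about the day type only

print(round(p_49_and_weekday, 2), round(p_49, 2), round(p_weekday, 2))
```

A marginal probability is never smaller than any joint probability that contributes to it, since it is a sum of such joint probabilities.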

Law of total probability, version 1

● The complement of an event A is denoted A’ (pronounced “A prime”) and means “not A”: the prime mark indicates that the outcome does not occur
● When referring to joint probability, we use the intersection symbol “∩”. The event A∩B (read: “the intersection of A and B” or “A intersection B”) is the event where both A and B are true, i.e. both A and B occur
● A marginal probability is the sum of the joint probabilities over the other event and its complement: P(A) = P(A∩B) + P(A∩B’)

Venn diagram: visualisation of probability

● A Venn diagram shows logical relations between sets
● The external rectangle indicates the whole sample space
● The internal circle indicates some event A
Joint events
● A joint event such as A ∩ B is the intersection (∩) of A and B
Union of events
● Indicates the event that A or B (or both) happens
● This is denoted by A∪B, pronounced as the union of A and B or A union B. So P(A∪B) indicates the probability that A or B is true, i.e. that A or B occurs

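General rule of addition

● P(A∪B) = P(A) + P(B) − P(A∩B): the joint probability is subtracted because it is counted in both P(A) and P(B)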

Mutually exclusive events

● If events A and B cannot occur at the same time (A occurs only if B does not occur), we say A and B are mutually exclusive (events)
● Any event and its complement are mutually exclusive. Either “A occurs” or “A does not occur”
● P(A∩A’) = 0

Collectively exhaustive events

● If the occurrence of events A and B covers the whole sample space, we say A and B are collectively exhaustive (events)
● Any event and its complement are collectively exhaustive. “A occurs” and “A does not occur” make up all possible outcomes
● P(A∪A’) = 1
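These set relations can be checked directly with Python sets. The example below assumes one roll of a fair die as the experiment (an assumption chosen for illustration, not an example from the lecture) and uses exact fractions, so the identities hold without rounding. It verifies the general rule of addition, P(A∪B) = P(A) + P(B) − P(A∩B), along with the two complement properties.

```python
from fractions import Fraction

# Assumed experiment: one roll of a fair die
sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event (a subset of the sample space), equally likely outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {2, 4, 6}                  # "roll is even"
B = {4, 5, 6}                  # "roll is greater than 3"
A_comp = sample_space - A      # complement of A

# General rule of addition: subtract the overlap counted twice
union_direct = prob(A | B)
union_by_rule = prob(A) + prob(B) - prob(A & B)

mutually_exclusive = prob(A & A_comp)       # an event and its complement never co-occur
collectively_exhaustive = prob(A | A_comp)  # together they cover the sample space

print(union_direct, union_by_rule, mutually_exclusive, collectively_exhaustive)
```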

Conditional probability and independence

Conditional probabilities
● P(A|B) denotes the probability that event A occurs, conditional on B occurring
● The symbol P(X=x|Y=y) denotes the probability of r.v. X taking value x, conditional on the r.v. Y taking value y
● formula: P(A|B) = P(A∩B) ÷ P(B), provided P(B) > 0

● Bayes rule: P(A|B) = P(B|A) x P(A) ÷ P(B)

Law of total probability

● Joint probability = conditional probability multiplied by the marginal probability: P(A∩B) = P(A|B) x P(B), so that P(A) = P(A|B) x P(B) + P(A|B’) x P(B’)
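A small numerical sketch may help tie these rules together. It uses the standard identities P(A∩B) = P(A|B) x P(B), P(A) = P(A|B) x P(B) + P(A|B’) x P(B’), and Bayes rule P(B|A) = P(A|B) x P(B) ÷ P(A). The three input probabilities are hypothetical numbers chosen for illustration.

```python
from fractions import Fraction

# Hypothetical inputs (not from the lecture)
p_B = Fraction(3, 10)             # marginal probability P(B)
p_A_given_B = Fraction(1, 2)      # conditional probability P(A|B)
p_A_given_notB = Fraction(1, 5)   # conditional probability P(A|B')

p_notB = 1 - p_B                  # complement: P(B') = 1 - P(B)

# Joint probability = conditional probability x marginal probability
p_A_and_B = p_A_given_B * p_B

# Law of total probability: sum over B and its complement
p_A = p_A_given_B * p_B + p_A_given_notB * p_notB

# Bayes rule: reverse the direction of conditioning
p_B_given_A = p_A_given_B * p_B / p_A

print(p_A_and_B, p_A, p_B_given_A)
```

The same value of P(B|A) is obtained from the definition P(A∩B) ÷ P(A), which is a useful cross-check.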

Independent events: formula

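● If A and B are independent events, whether or not B occurs should not affect the probability that A occurs; likewise, whether or not A occurs should not affect the probability that B occurs
● Formula: P(A∩B) = P(A) x P(B)

● Bayes rule: P(A|B) = P(B|A) x P(A) ÷ P(B) = P(A), since P(B|A) = P(B) under independence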
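Independence can be tested by comparing P(A∩B) with P(A) x P(B). The sketch below assumes two tosses of a fair coin as the experiment (an illustrative choice) and contrasts an independent pair of events with a dependent pair.

```python
from fractions import Fraction
from itertools import product

# Assumed experiment: two tosses of a fair coin, four equally likely outcomes
sample_space = set(product("HT", repeat=2))

def prob(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event), len(sample_space))

A = {o for o in sample_space if o[0] == "H"}        # first toss is heads
B = {o for o in sample_space if o[1] == "H"}        # second toss is heads
C = {o for o in sample_space if o.count("H") == 2}  # both tosses are heads

# Independent: the joint probability equals the product of the marginals
independent_AB = prob(A & B) == prob(A) * prob(B)

# Not independent: knowing A occurred changes the probability of C
independent_AC = prob(A & C) == prob(A) * prob(C)

# Under independence the conditional equals the marginal: P(A|B) = P(A)
p_A_given_B = prob(A & B) / prob(B)

print(independent_AB, independent_AC, p_A_given_B, prob(A))
```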

Implications of formulas

Binomial experiments
● Eg. toss a coin 3 times in a row and you are interested in how likely it is that you get
exactly two heads
● A binomial experiment assesses the number of a certain outcome from repeated
independent trials
● Each trial has two possible outcomes (eg. heads or tails, success or failure)

Binomial tree
● When two outcomes are independent, P(A|B) = P(A)
● Suppose we have three products, each of which can be defective (D) with probability p or functional (F) with probability q = 1 − p
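The binomial tree for the three products can be enumerated explicitly. This sketch assumes the products are independent, so the probability of each path is the product of its branch probabilities; the value p = 0.1 is a hypothetical defect rate.

```python
from itertools import product

p = 0.1      # assumed probability that a product is defective (D)
q = 1 - p    # probability that a product is functional (F)

# Every path through the tree: DDD, DDF, DFD, ..., FFF (2^3 = 8 paths)
paths = {}
for path in product("DF", repeat=3):
    path_prob = 1.0
    for state in path:
        # Independence: multiply the branch probabilities along the path
        path_prob *= p if state == "D" else q
    paths["".join(path)] = path_prob

# Group the paths by the number of defective products
dist = {}
for path, path_prob in paths.items():
    k = path.count("D")
    dist[k] = dist.get(k, 0.0) + path_prob

for k in sorted(dist):
    print(k, round(dist[k], 4))
```

Three different paths contain exactly one defective product, which is why the probability of one defect is 3 x p x q² and not just p x q².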
Binomial distribution
● A r.v. X taking values in {0, 1, ..., n} is said to follow the binomial distribution, denoted by 𝑋 ~ 𝐵𝑖𝑛(𝑛, 𝑝), with probability function $P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}$
● $p^x$: the probability of x successes
● $(1-p)^{n-x}$: the probability of n−x failures. So in total we have n trials
● The factor (combinatorial operator) $\binom{n}{x} = \frac{n!}{x!\,(n-x)!}$ computes the number of cases/combinations of choosing x objects from the set of n objects. Remember the factorial operator m! = 1 x 2 x 3 x … x (m-1) x m

● Properties of binomial distribution:

- Almost all distributions have expectation (i.e. mean) and variance (and thus
standard deviation).
- Every distribution (their pdf) is characterised by some parameters.
→ The binomial distribution has two parameters, 𝒏 (the number of trials) and
𝒑 (the success probability or success rate)
→ the mean (expectation) and variance of 𝑋~𝐵𝑖𝑛(𝑛, 𝑝) are given by: $E(X) = np$ and $\mathrm{Var}(X) = np(1-p)$
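As a check on the standard binomial results, the sketch below computes the probability function with `math.comb` and compares the mean and variance obtained from the definition with np and np(1 − p). It reuses the coin example from the notes (n = 3 tosses, exactly two heads) and assumes a fair coin, p = 0.5; the function name `binom_pmf` is just an illustrative choice.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p): C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 3, 0.5   # three tosses of a coin assumed to be fair

pmf = [binom_pmf(x, n, p) for x in range(n + 1)]
p_two_heads = pmf[2]   # probability of exactly two heads

# Mean and variance computed from the definition, for comparison with np and np(1-p)
mean = sum(x * pmf[x] for x in range(n + 1))
var = sum((x - mean) ** 2 * pmf[x] for x in range(n + 1))

print(p_two_heads, mean, n * p, var, n * p * (1 - p))
```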
