Lesson 4

Business Analytics

Mrs Hashani Kumarasinghe

Software Engineer - Data Science
Zone 24X7
hashanik@zone24X7.com
Today's Plan
1. Measures of central tendency
2. Measure of dispersion
3. Chebyshev’s theorem
4. Measure of shape
5. Descriptive statistics for grouped data
6. Descriptive statistics for categorical data
7. Measure of Association
Population and Samples
Population

A population consists of all items of interest for a particular decision or investigation.

Sample

A sample is a subset of a population.
Measures of central tendency
A measure of central tendency is a summary statistic that represents the center point or typical value of a
dataset. These measures indicate where most values in a distribution fall and are also referred to as the central
location of a distribution.

The three most common measures of central tendency are the mean, median, and mode.

Mean = sum of the observations divided by the number of observations

Population mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$

Sample mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$

Property of the mean

The sum of the deviations of each observation from the mean is zero: $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$
Median

The median is the measure of location that specifies the middle value when the data are arranged from least
to greatest.

Mode

The mode is the observation that occurs most frequently.

How to calculate central tendency measures using R?
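The slide asks about R, where the built-in mean() and median() functions cover the first two measures (base R has no built-in function for the statistical mode). As a rough, language-neutral sketch, the same three measures can be computed with Python's standard statistics module. The data values below are hypothetical and are not taken from the lesson's data set.

```python
import statistics

# Hypothetical observations (an assumption for illustration only).
data = [4, 8, 8, 5, 10]

mean = statistics.mean(data)      # sum of observations / number of observations
median = statistics.median(data)  # middle value once the data are sorted
mode = statistics.mode(data)      # most frequently occurring observation

# Property of the mean: the deviations from the mean sum to zero.
deviation_sum = sum(x - mean for x in data)

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("sum of deviations:", deviation_sum)
```

With an even number of observations, statistics.median returns the average of the two middle values.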

Measure of dispersion
Dispersion refers to the degree of variation in the data. The most common measures of dispersion are the range,
the variance and the standard deviation.

Range = difference between the maximum value and the minimum value in the data set.

Variance (population): $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$

Variance (sample): $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

The larger the variance, the more the data are spread out from the mean and the more variability one can
expect in the observations.
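Standard deviation: The standard deviation is the square root of the variance.

Population standard deviation: $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$

Sample standard deviation: $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$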
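To make the population/sample distinction concrete, here is a minimal Python sketch using the standard statistics module. The observations are hypothetical; the only difference between the population and sample versions is the divisor (N versus n - 1).

```python
import statistics

# Hypothetical observations (an assumption for illustration only).
data = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(data) - min(data)     # maximum minus minimum

pop_var = statistics.pvariance(data)   # population variance: divides by N
samp_var = statistics.variance(data)   # sample variance: divides by n - 1

pop_sd = statistics.pstdev(data)       # square root of the population variance
samp_sd = statistics.stdev(data)       # square root of the sample variance

print("range:", data_range)
print("population variance:", pop_var, "sample variance:", samp_var)
print("population sd:", pop_sd, "sample sd:", samp_sd)
```

The sample variance is always slightly larger than the population variance for the same numbers, because it divides by n - 1 rather than N.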

Chebyshev’s Theorem
For any set of data, the proportion of values that lie within k standard deviations (k > 1) of the mean is at
least 1 - 1/k^2.

k = 2 → 1 - 1/4 = 3/4 = 75%

k = 3 → 1 - 1/9 = 8/9 ≈ 89%

Example: For the Cost per order data in the Purchase Orders database:

The two-standard-deviation interval around the mean is [-$33,390.34, $85,980.98]; it contains 89 of the 94 observations (94.68%).

The three-standard-deviation interval is [-$63,233.17, $115,823.81]; it contains 92 of the 94 observations (97.9%).

Both proportions exceed the Chebyshev minimums of 75% and 89%.

Example: Mean sales per month for 2020 = 200,000 LKR

Standard deviation = 10,000 LKR

Two-standard-deviation range (mean ± 2σ) = 180,000 to 220,000 LKR

1 → 200,000

2→ 210,000

3→ 170,000
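The bound and the sales example can be checked with a short Python sketch. The function names are illustrative, and treating the three listed figures (200,000, 210,000 and 170,000) as individual monthly sales observations is an assumption; the slide does not say what they represent.

```python
def chebyshev_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's bound is only informative for k > 1")
    return 1 - 1 / k**2


def k_sigma_interval(mean, sd, k):
    """Interval [mean - k*sd, mean + k*sd]."""
    return (mean - k * sd, mean + k * sd)


# Sales figures quoted in the lesson (LKR).
mean_sales, sd_sales = 200_000, 10_000
low, high = k_sigma_interval(mean_sales, sd_sales, 2)

# Assumption: the three listed values are monthly sales observations.
observations = [200_000, 210_000, 170_000]
inside = [low <= x <= high for x in observations]

print("bound for k=2:", chebyshev_bound(2))
print("bound for k=3:", chebyshev_bound(3))
print("two-sigma interval:", (low, high))
print("inside interval:", inside)
```

Under that assumption, the third value falls more than two standard deviations below the mean, which the theorem allows for at most 25% of the observations.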
Empirical rules
For data that are approximately normally distributed (bell-shaped):

1. Approximately 68% of the observations will fall within one standard deviation of the mean, or between
x̄ - s and x̄ + s.

2. Approximately 95% of the observations will fall within two standard deviations of the mean, or between
x̄ - 2s and x̄ + 2s.

3. Approximately 99.7% of the observations will fall within three standard deviations of the mean, or
between x̄ - 3s and x̄ + 3s.
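The three percentages come from the normal distribution. As a small sketch, they can be reproduced with NormalDist from Python's standard statistics module; the helper name share_within is illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


shares = {k: share_within(k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(f"within {k} standard deviation(s): {share:.4f}")
```

These figures are much tighter than the Chebyshev minimums because they assume a normal shape, whereas Chebyshev's theorem holds for any data set.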
Process Capability index
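To measure how well a manufacturing process can achieve the specifications, we usually take a sample of
output, measure the dimension, compute the total variation using the third empirical rule (that is, estimate
the total variation as six standard deviations), and compare the result with the specification range:

$C_p = \frac{\text{upper specification} - \text{lower specification}}{6s}$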
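A minimal sketch of this calculation in Python, assuming the common formulation in which the specification width is divided by six sample standard deviations. The measurements and specification limits below are hypothetical.

```python
import statistics


def process_capability(sample, lower_spec, upper_spec):
    """Specification width divided by the total variation (six sample standard deviations)."""
    total_variation = 6 * statistics.stdev(sample)
    return (upper_spec - lower_spec) / total_variation


# Hypothetical measured dimensions and specification limits.
measurements = [9.8, 10.0, 10.2, 10.0, 10.0]
cp = process_capability(measurements, lower_spec=9.5, upper_spec=10.5)

print("Cp:", round(cp, 3))
```

A value above 1 suggests that the natural spread of the process fits inside the specification limits; a value below 1 suggests that it does not.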
Standardized Values
Standardized values (also called standard scores or normal deviates) are the same thing as z-scores. A
standardized value is obtained by subtracting the mean from a data point and dividing the result by the
standard deviation: $z = \frac{x - \mu}{\sigma}$ (or $z = \frac{x - \bar{x}}{s}$ for a sample). It tells us
how far from the mean we are in terms of standard deviations.
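As a brief illustration, the z-score calculation can be sketched in Python. The mean and standard deviation are the monthly sales figures quoted earlier in the lesson (200,000 and 10,000 LKR); the two data points being standardized are used only as examples.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


mean_sales, sd_sales = 200_000, 10_000

z_high = z_score(210_000, mean_sales, sd_sales)
z_low = z_score(170_000, mean_sales, sd_sales)

print("z for 210,000:", z_high)
print("z for 170,000:", z_low)
```

A positive z-score means the value lies above the mean, and a negative one means it lies below.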
Coefficient of Variation
The coefficient of variation (CV) provides a relative measure of the dispersion in data relative to the mean
and is defined as

$CV = \frac{\text{standard deviation}}{\text{mean}}$

The coefficient of variation provides a relative measure of risk to return.
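The sketch below, in Python, shows why a relative measure is useful: two hypothetical series with the same standard deviation but very different means have very different coefficients of variation. The data and the function name are illustrative.

```python
import statistics


def coefficient_of_variation(data):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(data) / statistics.mean(data)


# Hypothetical series: same spread (standard deviation 2), different means.
series_a = [10, 12, 14]
series_b = [100, 102, 104]

cv_a = coefficient_of_variation(series_a)
cv_b = coefficient_of_variation(series_b)

print("CV of series A:", round(cv_a, 4))
print("CV of series B:", round(cv_b, 4))
```

Series B has the smaller CV, so its variability is smaller relative to its average level.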

Measures of Shape
The histograms of the Cost per order and A/P terms variables can have different shapes:

Cost per order is positively skewed.

A/P terms is symmetric.

The coefficient of skewness (CS) measures the degree of asymmetry of observations around the mean.
The coefficient of skewness is computed as

$CS = \frac{\frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^3}{\sigma^3}$
Characteristics of a Skewed Distribution
If the distribution is perfectly symmetrical and unimodal → mean = median = mode

If the distribution is negatively skewed → mean < median < mode

If the distribution is positively skewed → mode < median < mean

Coefficient of kurtosis (CK)
Kurtosis refers to the peakedness (i.e., high, narrow) or flatness (i.e., short, flat-topped) of a histogram. The
coefficient of kurtosis (CK) measures the degree of kurtosis of a population:

$CK = \frac{\frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^4}{\sigma^4}$

Distributions with values of CK less than 3 are more flat with a wide degree of dispersion; those with values
of CK greater than 3 are more peaked with less dispersion.
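Both shape coefficients are ratios of central moments, so they can be sketched with one helper in Python. This assumes the population (divide-by-N) forms of CS and CK; the data sets and function names are hypothetical.

```python
def central_moment(data, order):
    """Average of (x - mean) raised to the given power, dividing by N."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** order for x in data) / len(data)


def coefficient_of_skewness(data):
    """Third central moment divided by the cube of the standard deviation."""
    sigma = central_moment(data, 2) ** 0.5
    return central_moment(data, 3) / sigma**3


def coefficient_of_kurtosis(data):
    """Fourth central moment divided by the fourth power of the standard deviation."""
    variance = central_moment(data, 2)
    return central_moment(data, 4) / variance**2


symmetric = [1, 2, 3, 4, 5]       # hypothetical symmetric data
right_skewed = [1, 1, 1, 2, 10]   # hypothetical data with a long right tail

cs_symmetric = coefficient_of_skewness(symmetric)
cs_skewed = coefficient_of_skewness(right_skewed)
ck_symmetric = coefficient_of_kurtosis(symmetric)

print("CS (symmetric):", cs_symmetric)
print("CS (right-skewed):", cs_skewed)
print("CK (symmetric):", ck_symmetric)
```

The symmetric data give a CS of zero, the long right tail gives a positive CS, and the evenly spread symmetric data give a CK well below 3, consistent with a flat histogram.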
Descriptive statistics for grouped data
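Here $x_i$ is the value (or class midpoint) of group $i$, $f_i$ is its frequency, $k$ is the number of groups, and $N$ (or $n$) is the sum of the frequencies.

Mean of the population: $\mu = \frac{\sum_{i=1}^{k} f_i x_i}{N}$

Mean of the sample: $\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{n}$

Population variance: $\sigma^2 = \frac{\sum_{i=1}^{k} f_i (x_i - \mu)^2}{N}$

Sample variance: $s^2 = \frac{\sum_{i=1}^{k} f_i (x_i - \bar{x})^2}{n - 1}$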
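A minimal Python sketch of the grouped-data calculations, assuming each group is summarized by a single value (or class midpoint) and a frequency. The frequency table and function names are hypothetical.

```python
def grouped_mean(values, freqs):
    """Frequency-weighted mean: sum of f*x divided by the total frequency."""
    total = sum(freqs)
    return sum(f * x for x, f in zip(values, freqs)) / total


def grouped_variance(values, freqs, sample=False):
    """Frequency-weighted variance; divides by n - 1 when sample=True, else by N."""
    total = sum(freqs)
    mean = grouped_mean(values, freqs)
    squared_dev = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs))
    return squared_dev / (total - 1 if sample else total)


# Hypothetical frequency table: value (or midpoint) and number of observations.
values = [1, 2, 3]
freqs = [1, 2, 1]

g_mean = grouped_mean(values, freqs)
g_pop_var = grouped_variance(values, freqs)
g_samp_var = grouped_variance(values, freqs, sample=True)

print("grouped mean:", g_mean)
print("population variance:", g_pop_var)
print("sample variance:", g_samp_var)
```

When class midpoints stand in for the raw observations, the results are approximations of the ungrouped statistics.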
Descriptive statistics for Categorical data
Proportion: Proportions are key descriptive statistics for categorical data, such as defects or errors in
quality control applications or consumer preferences in market research. A proportion is the fraction of the
data that has a certain characteristic.

Ex:
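One way to picture the calculation is the Python sketch below, which counts the items that have a characteristic and divides by the total number of items. The inspection results are hypothetical, and the function name is illustrative.

```python
def proportion(items, has_characteristic):
    """Fraction of items for which has_characteristic(item) is true."""
    matches = sum(1 for item in items if has_characteristic(item))
    return matches / len(items)


# Hypothetical quality-control inspection results.
inspection = ["ok", "defect", "ok", "ok", "defect", "ok", "ok", "ok", "ok", "ok"]

p_defect = proportion(inspection, lambda result: result == "defect")

print("proportion defective:", p_defect)
```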
Measure of association
How would you answer questions like the following?

Does a higher percentage of students in the top 10% of their high school class suggest a higher
graduation rate? Is acceptance rate related to the amount spent per student? Do schools with
lower acceptance rates tend to accept students with higher SAT scores?
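Covariance = measure of the linear association between two variables.

Population covariance: $\mathrm{cov}(X, Y) = \frac{\sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)}{N}$

Thus the sample covariance is: $\mathrm{cov}(X, Y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$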

Correlation: Correlation is a measure of the linear relationship between two variables, X and Y, which does not
depend on the units of measurement.

Correlation is measured using the correlation coefficient.

Correlation coefficient for a population: $\rho_{xy} = \frac{\mathrm{cov}(X, Y)}{\sigma_x \sigma_y}$

Correlation coefficient for a sample: $r_{xy} = \frac{\mathrm{cov}(X, Y)}{s_x s_y}$
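The unit-independence of correlation can be seen in a short Python sketch that implements the sample covariance and the sample correlation coefficient directly. The data are hypothetical, and the rescaling by 100 stands in for a change of units.

```python
def sample_covariance(xs, ys):
    """Sum of (x - mean_x)(y - mean_y) divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def sample_correlation(xs, ys):
    """Sample covariance divided by the product of the sample standard deviations."""
    sd_x = sample_covariance(xs, xs) ** 0.5   # covariance of x with itself is its variance
    sd_y = sample_covariance(ys, ys) ** 0.5
    return sample_covariance(xs, ys) / (sd_x * sd_y)


# Hypothetical paired observations with a perfect positive linear relationship.
x = [1, 2, 3, 4]
y = [2, 4, 6, 8]
y_rescaled = [100 * value for value in y]     # same quantity in smaller units

cov_xy = sample_covariance(x, y)
cov_rescaled = sample_covariance(x, y_rescaled)
r_xy = sample_correlation(x, y)
r_rescaled = sample_correlation(x, y_rescaled)

print("covariance:", cov_xy, "after rescaling:", cov_rescaled)
print("correlation:", r_xy, "after rescaling:", r_rescaled)
```

Rescaling y multiplies the covariance by 100 but leaves the correlation coefficient unchanged, which is why correlation is easier to compare across data sets.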

[Diagram not reproduced. Diagram source: web]
