Lecture 6

Uploaded by

umutkeiyinci


MATH 240 – INTRODUCTION

TO PROBABILITY AND
STATISTICS FOR ENGINEERS
Chapter 1
Introduction to
Statistics and
Data Analysis

Copyright © 2017 Pearson Education, Ltd. All rights reserved.

Probability and Statistics

• Probability: a field devoted to the study of

random variation in systems
– Inferential statistics (where we use the information
in a sample to draw a conclusion about the
population)
– Other applications in engineering
• Statistics deals with collection, presentation,
analysis and use of data to make decisions and
solve problems.
Statistical Problem-Solving
The engineering or scientific method of formulating and solving
problems follows certain steps:

[Figure: steps of the engineering method; the "Collect data" step is marked as where engineering statistics is used]
Data, Information, and
Knowledge
• Data:
– Quantifiable measurements of some physical
phenomenon
– “Patient A weighs 80 kg”
– “The date is April 1, 2019”
• Information:
– Synthesized data that produces meaning
– “The average weight of patients over a 10 week period
dropped by 5 kgs when taking drug X”
• Knowledge:
– Our internal model of the way the world works
– Present in the human mind
– Used to make predictions about cause/effect
relationships
– “I think that drug X causes weight loss”
Data, Information, Decision

[Diagram: Data (measure) → Statistics → Information (compare) → Knowledge (decide)]
Figure 1.2 Fundamental relationship
between probability and inferential
statistics

• Estimating properties of the population without

examining the entire population
Population and Sample
Random sample

If you were to take two different random

samples from the same population and calculate
the sample means, you would expect them to be
different.
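
The claim above can be checked with a short simulation. This is a minimal sketch in Python; the population (10,000 normally distributed values with mean 12.0 and standard deviation 0.4), the sample size of 14, and the seed are all assumptions chosen for illustration, not values from the lecture.

```python
import random
import statistics

random.seed(240)  # arbitrary seed so the run is repeatable (assumption)

# Hypothetical population: 10,000 measurements centred on 12.0
population = [random.gauss(12.0, 0.4) for _ in range(10_000)]

# Two independent random samples of the same size from the same population
sample_a = random.sample(population, 14)
sample_b = random.sample(population, 14)

mean_a = statistics.fmean(sample_a)
mean_b = statistics.fmean(sample_b)

print(f"sample A mean:   {mean_a:.3f}")
print(f"sample B mean:   {mean_b:.3f}")
print(f"population mean: {statistics.fmean(population):.3f}")
```

Both sample means should land near the population mean without matching it or each other, which is the sampling variability that inferential statistics has to account for.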
Variability
• Inconsistency
– Repeated experiments (runs) will yield
slightly different results
• Sources of Variation
– Natural variation
– Assignable causes
Making Comparisons

Dot diagram: useful

for displaying a small
number of data points
(≈20).

Allows us to see the

location, and scatter
or variability of the
dataset.

• Does wall thickness have an effect?

• How confident can you be? Do we know that another
specimen will not give another result?
• Is this sample adequate?
• Statistical methodology can answer these questions.
Techniques of Statistical Inference

• Point estimation (lecture 7)

• Interval estimation (lecture 8)
• Hypothesis testing (lecture 9)
Section 1.3
Measures of
Location: The
Sample Mean
and Median


Three things to check while
analyzing data

• Central Tendency
– Address of population
• Spread (Scatter)
– How wide the population
is
– How high the variability is
• Shape
– How the distribution looks
Definition 1.1


Measures of Central
Tendency

1. Mean (Average) of a sample
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
where $\bar{x}$ is the sample mean, $x_i$ is an individual observation, and n is the sample size.
Class Exercise 1:
Xi: 11.8, 11.9, 11.8, 12.4, 12.8, 12.4, 12.1, 12.6, 12.0, 11.3, 11.8, 11.7, 11.5, 11.9
• What is the sample average?
Sample average = 12.0
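
As a quick check of Class Exercise 1, the sample mean can be computed directly from its definition. This is a small Python sketch; the variable names are illustrative only.

```python
# The 14 observations from Class Exercise 1
x = [11.8, 11.9, 11.8, 12.4, 12.8, 12.4, 12.1,
     12.6, 12.0, 11.3, 11.8, 11.7, 11.5, 11.9]

n = len(x)          # sample size
x_bar = sum(x) / n  # sample mean: sum of observations divided by n

print(n, round(x_bar, 1))
```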
Geometric Meaning of Mean

Repeated runs

Mean is the target value.

• Deviations from the mean sum to zero: $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$
• Centroid of the data
• Fulcrum that balances the weights
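
The balance property can be verified numerically on the Class Exercise 1 data. In this Python sketch the tolerance of 1e-9 is an assumption, needed only because floating-point arithmetic leaves a tiny rounding residue instead of an exact zero.

```python
x = [11.8, 11.9, 11.8, 12.4, 12.8, 12.4, 12.1,
     12.6, 12.0, 11.3, 11.8, 11.7, 11.5, 11.9]

x_bar = sum(x) / len(x)

# Signed distance of each observation from the mean
deviations = [xi - x_bar for xi in x]
total = sum(deviations)

# Negative and positive deviations cancel, up to rounding error
print(abs(total) < 1e-9)
```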
Measures of Central
Tendency

1. Mean (Average) of a population
$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$
where μ is the population mean, $x_i$ is an individual observation, and N is the population size.
Watch out for the difference between a sample and a population!
Mean
• Advantages of Using the mean
– It is the center of gravity of the data
– It uses all data
– No sorting is needed
• Disadvantages of Using the mean
– The mean may not be the actual value of any
data points.
– Extreme data values may distort the picture
Effect of Extreme Points

[Figure: data with an extreme point (outlier) and the mean marked]

• Assignable cause
• Potential reasons: Wrong measurement,
different population
Trimmed means

• To reduce the effect of extreme points on the mean, a trimmed mean may be used.
– Calculated by trimming away a certain percentage of both the largest and the smallest values.
– A 10% trimmed mean omits the top 10% and the bottom 10% of the data points from the calculation of the mean.


Trimmed means

• Consider the following 5 data points from an experiment: 1.5, 2.2, 2.5, 3.1, 5.2
– The mean is 2.90
– The 20% trimmed mean is 2.60 (the mean of 2.2, 2.5, and 3.1), which reduces the effect of extreme values.
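
The trimmed-mean calculation above can be sketched in Python as follows. The function name `trimmed_mean` and the choice to round the number of dropped points down when the percentage does not give a whole number are assumptions; the slides do not specify that case.

```python
def trimmed_mean(data, percent):
    """Mean after dropping `percent`% of the points from each end.

    `percent` is an integer percentage; the number of points dropped
    per end is rounded down (an assumption, not from the lecture).
    """
    ordered = sorted(data)
    k = len(ordered) * percent // 100   # points to drop at each end
    kept = ordered[k:len(ordered) - k]
    if not kept:
        raise ValueError("trimming removed every data point")
    return sum(kept) / len(kept)


data = [1.5, 2.2, 2.5, 3.1, 5.2]
plain = trimmed_mean(data, 0)     # ordinary mean
trimmed = trimmed_mean(data, 20)  # drops 1.5 and 5.2

print(round(plain, 2), round(trimmed, 2))
```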
Measures of Central Tendency

2. Median
The middle value when the data are ordered in ascending or descending order.
If n is odd: $\tilde{x} = x_{(n+1)/2}$
If n is even: $\tilde{x} = \frac{x_{n/2} + x_{n/2+1}}{2}$
Definition 1.2


Class Exercise 2:
Xi: 11.8, 11.9, 11.8, 12.4, 12.8, 12.4, 12.1, 12.6, 12.0, 11.3, 11.8, 11.7, 11.5, 11.9
• What is the median?
Median = 11.9
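
The odd/even rule for the median translates directly into code. This is a minimal Python sketch; the function name `median` is illustrative, and the indices are shifted by one because Python lists are 0-based while the slide formula is 1-based.

```python
def median(data):
    """Middle value of the ordered data (odd n), or the average of
    the two middle values (even n)."""
    ordered = sorted(data)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # x_{(n+1)/2}, 1-based
    return (ordered[mid - 1] + ordered[mid]) / 2  # (x_{n/2} + x_{n/2+1}) / 2


x = [11.8, 11.9, 11.8, 12.4, 12.8, 12.4, 12.1,
     12.6, 12.0, 11.3, 11.8, 11.7, 11.5, 11.9]

med_even = median(x)                         # n = 14, even case
med_odd = median([1.5, 2.2, 2.5, 3.1, 5.2])  # n = 5, odd case

print(med_even, med_odd)
```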
Median

• Advantages of using median

– provides an idea where most data are
located
– little calculation required
• Disadvantages of using median
– data must be sorted and arranged
– does not use all the data
– extreme values may be important
Measures of Central
Tendency

3. Mode
The most frequently occurring value in a data set.
Class Exercise 3:
Xi: 11.8, 11.9, 11.8, 12.4, 12.8, 12.4, 12.1, 12.6, 12.0, 11.3, 11.8, 11.7, 11.5, 11.9
• What is the sample mode?
Mode = 11.8
Mode

• Advantages
– no calculations necessary
– not influenced by extreme values
– an actual value

• Disadvantage
– The data may not have a mode!
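
The "no mode" case can be made explicit in code. In this Python sketch the function name `modes` and the convention of returning an empty list when every value occurs exactly once are assumptions chosen for illustration.

```python
from collections import Counter


def modes(data):
    """Return the most frequent value(s), or [] if nothing repeats."""
    if not data:
        return []
    counts = Counter(data)
    top = max(counts.values())
    if top == 1:
        return []  # every value occurs once: the data have no mode
    return [value for value, count in counts.items() if count == top]


x = [11.8, 11.9, 11.8, 12.4, 12.8, 12.4, 12.1,
     12.6, 12.0, 11.3, 11.8, 11.7, 11.5, 11.9]

print(modes(x))                          # 11.8 occurs three times
print(modes([1.5, 2.2, 2.5, 3.1, 5.2]))  # no repeated value
```

A data set can also have more than one mode, in which case this sketch returns all of them.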
Example

• Suppose a data set consists of the following

observations:
0.32 0.53 0.28 0.37 0.47 0.43 0.36 0.42 0.38 0.43
• Find the mean, median, and mode.
• Mean: 0.399
• Median: 0.40
• Mode: 0.43
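
The three answers in this example can be reproduced with Python's standard `statistics` module. This is only a sketch for checking the arithmetic.

```python
import statistics

data = [0.32, 0.53, 0.28, 0.37, 0.47, 0.43, 0.36, 0.42, 0.38, 0.43]

mean = statistics.fmean(data)     # arithmetic mean
median = statistics.median(data)  # average of the 5th and 6th ordered values
mode = statistics.mode(data)      # most frequent value

print(round(mean, 3), round(median, 2), mode)
```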


Section 1.4
Measures of
Variability


Measures of Spread

1. Range
R = Max - Min
• Advantages
• easy to calculate
• Disadvantages
• does not use all the data
• if n>7, use standard
deviation
Class Exercise 4:
Xi: 11.8, 11.9, 11.8, 12.4, 12.8, 12.4, 12.1, 12.6, 12.0, 11.3, 11.8, 11.7, 11.5, 11.9
• What is the range?
Range = 12.8 − 11.3 = 1.5
Measures of Spread

2. Standard Deviation
Population: $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$
Sample: $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$

If we knew the mean of the population, we would not need a sample. In practice, μ is almost never known, so a sample must be used.
However, observations tend to be closer to $\bar{x}$ than to μ. To compensate for this, we use n − 1 as the divisor rather than n.
Measures of Spread

3. Variance
Population: $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
Sample: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

• Variance is square of standard deviation

• Standard deviation used more frequently
– Same unit as the measures of central tendency
Definition 1.3


How to calculate s

$\bar{x} = 104.0/8 = 13.0$
$s^2 = \frac{1.6}{7} = 0.229$
$s = \sqrt{0.229} = 0.479$
Standard Deviation

What is the sample standard deviation for the given dataset?

Xi      Xi − X̄    (Xi − X̄)²
11.8    -0.2      0.04
11.9    -0.1      0.01
11.8    -0.2      0.04
12.4     0.4      0.16
12.8     0.8      0.64
12.4     0.4      0.16
12.1     0.1      0.01
12.6     0.6      0.36
12.0     0.0      0.00
11.3    -0.7      0.49
11.8    -0.2      0.04
11.7    -0.3      0.09
11.5    -0.5      0.25
11.9    -0.1      0.01
Sum = 2.30

X̄ = 12.0
Sample variance = s² = 2.30/(14 − 1) = 0.177
Sample standard deviation = s = √0.177 = 0.42
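
The table calculation can be followed step by step in code. This Python sketch mirrors the table's columns; the variable names are illustrative only.

```python
import math

x = [11.8, 11.9, 11.8, 12.4, 12.8, 12.4, 12.1,
     12.6, 12.0, 11.3, 11.8, 11.7, 11.5, 11.9]

n = len(x)
x_bar = sum(x) / n

# Third column of the table: squared deviations from the sample mean
squared_devs = [(xi - x_bar) ** 2 for xi in x]
ss = sum(squared_devs)  # the "Sum" row

s2 = ss / (n - 1)       # sample variance, divisor n - 1
s = math.sqrt(s2)       # sample standard deviation

print(round(ss, 2), round(s2, 3), round(s, 2))
```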
What is the population variance?

Xi      Xi − X̄    (Xi − X̄)²
11.8    -0.2      0.04
11.9    -0.1      0.01
11.8    -0.2      0.04
12.4     0.4      0.16
12.8     0.8      0.64
12.4     0.4      0.16
12.1     0.1      0.01
12.6     0.6      0.36
12.0     0.0      0.00
11.3    -0.7      0.49
11.8    -0.2      0.04
11.7    -0.3      0.09
11.5    -0.5      0.25
11.9    -0.1      0.01

Population variance = σ² = 2.30/14 = 0.164
Population standard deviation = σ = √0.164 = 0.405
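
Python's standard `statistics` module provides both divisors, which makes the sample/population distinction easy to compare. This is a sketch that treats the same 14 values first as a whole population (divisor N) and then as a sample (divisor n − 1).

```python
import statistics

x = [11.8, 11.9, 11.8, 12.4, 12.8, 12.4, 12.1,
     12.6, 12.0, 11.3, 11.8, 11.7, 11.5, 11.9]

# Population formulas: divide by N
pop_var = statistics.pvariance(x)
pop_sd = statistics.pstdev(x)

# Sample formulas: divide by n - 1
samp_var = statistics.variance(x)
samp_sd = statistics.stdev(x)

print(f"population: variance = {pop_var:.3f}, sd = {pop_sd:.3f}")
print(f"sample:     variance = {samp_var:.3f}, sd = {samp_sd:.3f}")
```

The sample values are always slightly larger than the population values for the same data, because the divisor is smaller.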
Class Exercise

• Preventing fatigue crack propagation in aircraft

structures is an important element of aircraft
safety. An engineering study to investigate
fatigue crack in n = 9 cyclically loaded wing
boxes reported the following crack lengths (in
mm): 2.13, 2.96, 3.02, 1.82, 1.15, 1.37, 2.04,
2.47, and 2.60. Calculate the sample average and
sample standard deviation. Construct a dot
diagram of the data.
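
One way to approach this exercise is sketched below in Python. The text-based dot diagram, which groups the crack lengths into 0.1 mm rows, is an assumption made for illustration; a hand-drawn dot diagram would place each point at its exact value.

```python
import statistics
from collections import Counter

# Crack lengths in mm from the exercise
cracks = [2.13, 2.96, 3.02, 1.82, 1.15, 1.37, 2.04, 2.47, 2.60]

mean = statistics.fmean(cracks)
s = statistics.stdev(cracks)  # sample standard deviation (divisor n - 1)
print(f"mean = {mean:.3f} mm, s = {s:.3f} mm")

# Crude text dot diagram: one row per 0.1 mm, one dot per observation.
# Working in integer tenths avoids floating-point dictionary keys.
tenths = Counter(round(v * 10) for v in cracks)
for t in range(min(tenths), max(tenths) + 1):
    print(f"{t / 10:4.1f} | {'.' * tenths[t]}")
```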
Section 1.6
Statistical
Modeling,
Scientific
Inspection, and
Graphical
Diagnostics


• Graphical presentation of data may yield
important information in engineering.
– Dot diagram
– Scatter plot
– Stem and leaf diagram
– Histogram
– Box plots


Table 1.1 Data Set for Example 1.2


Observing Processes Over Time
Example: Tensile strength


Figure 1.5 Scatter plot of tensile
strength and cotton percentages
One method of presentation is
the scatter plot.

Example: Car Battery Life

Table 1.5 Stem-and-Leaf Plot of
Battery Life
Another method is the stem and leaf plot

[Figure: histogram of battery life; x-axis: Battery Life (years), y-axis: Frequency (number)]
Table 1.6 Double-Stem-and-Leaf
Plot of Battery Life

Table 1.7 Relative Frequency
Distribution of Battery Life

Figure 1.6 Relative frequency
histogram
A histogram takes the
information from the
stem and leaf table to
graphically represent
data

Figure 1.7 Estimating frequency
distribution

Stem and Leaf example –
number of bins
Figure 1.8 Skewness of data

Characteristics of a Stable
Distribution

• Most of the data are

near the average
• centerline divides
curve into two
symmetrical halves
• few points near MAX
and MIN
• bell-shaped
• no points beyond
curve
Unstable Distributions

[Figure panels: Skewed, Spikes, Bi-modal]

not a bell-shaped curve

• unstable process
• unpredictable
• assignable causes
• Variation
– Natural
– Assignable causes → Determine
Here is another example on graphical
representation of data
Graphical Presentation of Data

BOX PLOTS

Percentiles

• The kth percentile is the point below which approximately k% of the ordered samples lie (and approximately (100 − k)% lie above)
• Computation:
– Compute the position 0.01*k*(n+1) in the ordered data to determine the samples just below and above it
– Use the fractional part of the position to interpolate between those two samples
Percentile Example

• Compute 25th percentile of samples of RMS voltage in

office electrical outlet:
– 110v, 111v, 110.5v, 109v, 110v
• 0.01*k*(n+1) = 0.01*25*(5+1) = 1.5
• Interpolate between 1st and 2nd samples:
– 109 and 110
• 25th percentile = 109+(110-109)*0.5=109.5
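
The percentile rule used in this example can be sketched in Python as follows. The function name `percentile` is illustrative, and clamping to the smallest or largest sample when the computed position falls outside the data is an assumption; the slides do not cover that case.

```python
import math


def percentile(samples, k):
    """kth percentile using the position rule 0.01 * k * (n + 1)."""
    ordered = sorted(samples)
    n = len(ordered)
    pos = k * (n + 1) / 100  # 1-based position in the ordered data
    lower = math.floor(pos)  # index of the sample just below
    frac = pos - lower       # fractional part used to interpolate
    if lower < 1:            # position before the first sample (assumption)
        return ordered[0]
    if lower >= n:           # position at or past the last sample (assumption)
        return ordered[-1]
    below, above = ordered[lower - 1], ordered[lower]
    return below + frac * (above - below)


voltages = [110, 111, 110.5, 109, 110]  # RMS voltage samples
p25 = percentile(voltages, 25)
print(p25)  # 109.5
```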
Quartiles

• When an ordered set of data is divided into four

equal parts, the division points/values are called
quartiles

• 1st Quartile (q1): 25th percentile (25% of dataset

below value)
– Also called Lower Quartile
• 2nd Quartile (q2): 50th percentile (equal to the
median)
• 3rd Quartile (q3): 75th percentile
– Also called Upper Quartile
Quartiles

• Quartiles may not coincide with data points. You might need to find the 20.25th point in the ordered dataset
– Use interpolation
• Position of q1 = (n+1)/4; position of q3 = 3(n+1)/4

• Interquartile Range (IQR):
– Difference between the upper and lower quartiles: IQR = q3 − q1
Box Plots to represent data

• A box plot is a graphical display that describes several important features of the data at once: the center, the spread, departure from symmetry, and observations that lie unusually far from the bulk of the data (outliers).
• The three quartiles are displayed on a rectangular box. q1 is the left or lower edge of the box, q3 is the right or upper edge, and a line inside the box marks q2 (the median).
• A whisker extends from each end of the box to the most extreme data point within 1.5 IQR of that end.
• Data beyond the whiskers are plotted as individual points.
• Points beyond the whiskers but within 3 IQR of the box edges are outliers. Points more than 3 IQR from the box edges are extreme outliers.
Box Plots to represent data

[Figure: box plot marking Q1, the median, Q3, and IQR = Q3 − Q1]
Between 1.5 and 3 IQR away → outlier
Beyond 3 IQR away → extreme outlier
Box Plot of Compressive Data
Table 1.8 Nicotine Data for
Example 1.5
Figure 1.9 Box-and-whisker plot
for Example 1.5
Exercise

• A battery-operated pacemaker device helps the

human heart to beat in regular rhythm. The
activation rate is important in stimulating the
heart, when necessary. Fourteen activation rates
(in sec.) were collected on a newly designed
device: 0.670, 0.697, 0.699, 0.707, 0.732, 0.733,
0.737, 0.747, 0.751, 0.774, 0.777, 0.804, 0.819,
0.827
• (a) Compute the sample mean and standard
deviation
• (b) Find the sample upper and lower quartiles
• (c) Find the sample median
• (d) Construct a box plot of the data
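
The numerical parts of this exercise, and the quantities a box plot needs, can be sketched in Python as below. The helper `percentile` applies the 0.01*k*(n+1) position rule from the Percentiles slide; its name and its clamping at the ends of the data are assumptions. Drawing the plot itself is left out.

```python
import math
import statistics


def percentile(samples, k):
    """kth percentile using the position rule 0.01 * k * (n + 1)."""
    ordered = sorted(samples)
    n = len(ordered)
    pos = k * (n + 1) / 100
    lower = math.floor(pos)
    frac = pos - lower
    if lower < 1:   # clamp to the smallest sample (assumption)
        return ordered[0]
    if lower >= n:  # clamp to the largest sample (assumption)
        return ordered[-1]
    return ordered[lower - 1] + frac * (ordered[lower] - ordered[lower - 1])


# Activation rates in seconds from the exercise
rates = [0.670, 0.697, 0.699, 0.707, 0.732, 0.733, 0.737,
         0.747, 0.751, 0.774, 0.777, 0.804, 0.819, 0.827]

mean = statistics.fmean(rates)  # (a)
s = statistics.stdev(rates)     # (a) sample standard deviation
q1 = percentile(rates, 25)      # (b) lower quartile
q3 = percentile(rates, 75)      # (b) upper quartile
q2 = percentile(rates, 50)      # (c) median

# (d) Box plot ingredients: box edges q1 and q3, median line q2,
# and the 1.5 IQR fences used to flag outliers
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [r for r in rates if r < lower_fence or r > upper_fence]

print(f"mean = {mean:.4f}, s = {s:.4f}")
print(f"q1 = {q1:.5f}, median = {q2:.4f}, q3 = {q3:.5f}, IQR = {iqr:.5f}")
print(f"fences = ({lower_fence:.4f}, {upper_fence:.4f}), outliers = {outliers}")
```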
