CH 06

The document discusses various methods for summarizing and displaying data, including the sample mean, sample variance, and sample range, highlighting their usefulness and limitations. It also covers stem-and-leaf diagrams, frequency distributions, histograms, and box plots, explaining how these tools can reveal important features of data sets. Additionally, time series plots are introduced to illustrate trends and cycles in data collected over time.

6 SCATTER DIAGRAMS
6-1 Numerical Summaries of Data

Definition: Sample Mean
If the n observations in a sample are denoted by x1, x2, ..., xn, the sample mean is
$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$

6-1 Data Summary and Display
Example 6-1

Figure 6-1 Dot diagram showing the sample mean as a balance point for a system of weights.

Population Mean
For a finite population with N measurements, the mean is
$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$

The sample mean $\bar{x}$ is a reasonable estimate of the population mean µ.

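Although the sample mean is useful, it does not convey all of the information about a sample of data.
→ Sample Variance
If x1, x2, ..., xn is a sample of n observations, the sample variance is
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
and the sample standard deviation s is the positive square root of the sample variance.
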
How Does the Sample Variance Measure Variability?

Figure 6-2 How the sample variance measures variability through the deviations $x_i - \bar{x}$.

Example 6-2

Efficient Computation of $s^2$:
$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2 / n}{n - 1}$

Computation of $s^2$

Population Variance
When the population is finite and consists of N values, we may define the population variance as
$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2$

The sample variance is a reasonable estimate of the population variance.
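
The numerical summaries in this section can be sketched in a few lines of Python. This is a minimal illustration, not the course's own code: the function names are made up for this example, the five data values are illustrative only, and the shortcut version assumes the usual sum-of-squares identity for the sample variance. Note that the sample variance divides by n - 1 while the population variance divides by N.

```python
import math

def sample_mean(xs):
    """Arithmetic average of the observations."""
    return sum(xs) / len(xs)

def sample_variance(xs):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    xbar = sample_mean(xs)
    return sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)

def sample_variance_shortcut(xs):
    """Same quantity computed from the sums of x and x^2 only."""
    n = len(xs)
    return (sum(x * x for x in xs) - sum(xs) ** 2 / n) / (n - 1)

def population_variance(xs):
    """Average squared deviation from the mean, treating xs as the whole population."""
    mu = sum(xs) / len(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)

sample = [1, 3, 5, 8, 9]  # illustrative values only
print(sample_mean(sample))                  # sample mean
print(sample_variance(sample))              # definition form
print(sample_variance_shortcut(sample))     # shortcut form, same value up to rounding
print(math.sqrt(sample_variance(sample)))   # sample standard deviation
print(population_variance(sample))          # divides by N instead of n - 1
```

The two variance functions agree up to floating-point rounding; the shortcut form needs only one pass over the data but can lose precision when the values are large relative to their spread.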

Figure 6-3 Relationship between a population and a sample.

Definition: Sample Range
If the n observations in a sample are denoted by x1, x2, ..., xn, the sample range is
$r = \max(x_i) - \min(x_i)$

Sample Range
• It is easy to calculate, but it ignores all of the information in the sample data between the largest and smallest values.
• Example: the samples [1, 3, 5, 8, 9] and [1, 5, 5, 5, 9] both have the same range (r = 8). However, their standard deviations differ: s1 = 3.35 > s2 = 2.83.
→ The variability is actually less in the second sample.
• Sometimes, when the sample size is small (say 8 or 10), the information loss associated with the range is not too serious.
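
The two-sample comparison above can be checked directly. The sketch below uses Python's standard `statistics.stdev` (which divides by n - 1); the helper name `sample_range` is introduced here for illustration.

```python
import statistics

def sample_range(xs):
    """Largest observation minus smallest observation."""
    return max(xs) - min(xs)

first = [1, 3, 5, 8, 9]
second = [1, 5, 5, 5, 9]

for name, xs in (("first", first), ("second", second)):
    # Same range for both samples, but different standard deviations.
    print(name, sample_range(xs), round(statistics.stdev(xs), 2))
```

Both samples report a range of 8, while the rounded standard deviations are 3.35 and 2.83, matching the slide.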

Tutorial
6-2 Stem-and-Leaf Diagrams

Steps for Constructing a Stem-and-Leaf Diagram

Figure 6-4 Stem-and-leaf diagram for the compressive strength data in Table 6-2.

Example 6-4
Inspection of this display immediately reveals that:
• most of the compressive strengths lie between 110 and 200 psi
• a central value is somewhere between 150 and 160 psi
• the strengths are distributed approximately symmetrically about the central value
→ The stem-and-leaf diagram enables us to determine quickly some important features of the data that were not immediately obvious in the original display in Table 6-2.
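
As a rough sketch of how such a display can be built, the code below assumes the common convention of using the last digit of each integer observation as the leaf and the remaining leading digits as the stem. The data values are hypothetical, not the Table 6-2 data, and the function name is chosen for this example.

```python
from collections import defaultdict

def stem_and_leaf(data):
    """Return the lines of an ordered stem-and-leaf display for non-negative integers."""
    groups = defaultdict(list)
    for x in sorted(data):
        stem, leaf = divmod(x, 10)  # leading digits and final digit
        groups[stem].append(leaf)
    lines = []
    # List every stem between the smallest and largest, even if it has no leaves.
    for stem in range(min(groups), max(groups) + 1):
        leaves = "".join(str(leaf) for leaf in groups.get(stem, []))
        lines.append(f"{stem:>3} | {leaves}")
    return lines

strengths = [105, 97, 118, 115, 110, 121, 101, 123]  # hypothetical values
for line in stem_and_leaf(strengths):
    print(line)
```

Because the data are sorted first, the leaves on each stem appear in increasing order, which makes the median and quartiles easy to read off by counting.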

Example 6-5

Figure 6-5 Stem-and-leaf displays for Example 6-5.

Figure 6-6 A typical computer-generated stem-and-leaf diagram.
Data Features
• The median is a measure of central tendency that divides the data into two equal parts, half below the median and half above.
→ If the number of observations is even, the median is halfway between the two central values. From Fig. 6-6, the 40th and 41st values of strength are 160 and 163, so the median is (160 + 163)/2 = 161.5.
→ If the number of observations is odd, the median is the central value.
• The sample mode is the most frequently occurring data value.
• The range is a measure of variability that can be easily computed from the ordered stem-and-leaf display: range = 245 - 76 = 169.
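
The even and odd cases for the median, together with the mode and range, can be sketched as follows. The function names are illustrative, and `modes` returns every value tied for the highest count, since a sample can have more than one mode.

```python
from collections import Counter

def median(xs):
    """Central value (odd n) or midpoint of the two central values (even n)."""
    s = sorted(xs)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2

def modes(xs):
    """All values that occur most frequently, in increasing order."""
    counts = Counter(xs)
    top = max(counts.values())
    return sorted(value for value, count in counts.items() if count == top)

def sample_range(xs):
    """Largest observation minus smallest observation."""
    return max(xs) - min(xs)

print(median([150, 160, 163, 170]))  # even n: midpoint of 160 and 163
print(median([3, 1, 2]))             # odd n: the central value
print(modes([1, 5, 5, 5, 9]))
print(sample_range([76, 160, 245]))  # the smallest and largest strengths quoted above
```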

When an ordered set of data is divided into four equal parts, the division points are called quartiles.

The first or lower quartile, q1, is a value that has approximately one-fourth (25%) of the observations below it and approximately 75% of the observations above.

The second quartile, q2, has approximately one-half (50%) of the observations below its value.
→ The second quartile is exactly equal to the median.

The third or upper quartile, q3, has approximately three-fourths (75%) of the observations below its value.
→ As with the median, the quartiles may not be unique.

• The compressive strength data in Figure 6-6 contain n = 80 observations.
• Minitab calculates the first and third quartiles as the (n + 1)/4 and 3(n + 1)/4 ordered observations and interpolates as needed.
→ For example, (80 + 1)/4 = 20.25 and 3(80 + 1)/4 = 60.75.
• Therefore, Minitab interpolates:
  • between the 20th and 21st observations to obtain q1 = 143.50
  • between the 60th and 61st observations to obtain q3 = 181.00.
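
The (n + 1)/4 rule described above can be sketched as linear interpolation between neighboring ordered observations. This is an illustration of the rule as stated on the slide, not Minitab's actual code; the function names and the toy data set (the integers 1 to 80, chosen so each value equals its position) are assumptions made for the example.

```python
def value_at_position(sorted_xs, position):
    """Value at a 1-based, possibly fractional, position in sorted data."""
    lower = int(position)       # e.g. 20 for position 20.25
    frac = position - lower     # e.g. 0.25
    if lower < 1:
        return sorted_xs[0]
    if lower >= len(sorted_xs):
        return sorted_xs[-1]
    a, b = sorted_xs[lower - 1], sorted_xs[lower]
    return a + frac * (b - a)   # linear interpolation between neighbors

def quartiles(xs):
    """First and third quartiles at positions (n + 1)/4 and 3(n + 1)/4."""
    s = sorted(xs)
    n = len(s)
    q1 = value_at_position(s, (n + 1) / 4)
    q3 = value_at_position(s, 3 * (n + 1) / 4)
    return q1, q3

toy = list(range(1, 81))  # hypothetical data with n = 80
print(quartiles(toy))     # positions 20.25 and 60.75
```

Other software uses slightly different position rules, so quartiles from different packages may not match exactly.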

• The interquartile range is the difference between the upper and lower quartiles, and it is sometimes used as a measure of variability.
→ IQR = q3 - q1
• In general, the 100kth percentile is a data value such that approximately 100k% of the observations are at or below this value and approximately 100(1 - k)% of them are above it.
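
A small sketch of both ideas: the IQR computed from the quartiles quoted earlier for the compressive strength data (q1 = 143.50, q3 = 181.00), and a check of the percentile definition by counting the fraction of observations at or below a candidate value. The helper names and the 1-to-100 toy data are hypothetical.

```python
def interquartile_range(q1, q3):
    """Difference between the upper and lower quartiles."""
    return q3 - q1

def fraction_at_or_below(xs, value):
    """Share of observations that are less than or equal to value."""
    return sum(1 for x in xs if x <= value) / len(xs)

print(interquartile_range(143.50, 181.00))  # quartiles quoted for the strength data

toy = list(range(1, 101))             # hypothetical data: 1, 2, ..., 100
print(fraction_at_or_below(toy, 25))  # 25 behaves like the 25th percentile here
```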

Tutorial: 6-16.

Median? Q1? Q3?

6-3 Frequency Distributions and Histograms

• A frequency distribution is a more compact summary of data than a stem-and-leaf diagram.
• To construct a frequency distribution, we must divide the range of the data into intervals, which are usually called class intervals, cells, or bins.

Constructing a Histogram (Equal Bin Widths):

Figure 6-7 Histogram of compressive strength for 80 aluminum-lithium alloy specimens.

Figure 6-8 A histogram of the compressive strength data from Minitab with 17 bins.

Figure 6-9 A histogram of the compressive strength data from Minitab with nine bins.

Choose the number of bins approximately equal to the square root of the number of observations. For the n = 80 compressive strength values, √80 ≈ 9, which is consistent with the nine bins used in Figure 6-9.

A frequency distribution for the compressive strength data in Table 6-2 is:

Relative frequency distribution
→ Relative frequencies are found by dividing the observed frequency in each bin by the total number of observations.

Cumulative frequency distribution
→ Frequency distributions are often easier to interpret than tables of data.
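
The following sketch builds a frequency distribution with equal-width bins, using the square-root rule when no bin count is given, and derives the relative and cumulative frequencies from it. The function names and the nine-value data set are hypothetical; placing the maximum value in the last bin is a design choice made here, not something the slides specify.

```python
import math

def sqrt_rule_bins(n):
    """Number of bins approximately equal to the square root of n."""
    return max(1, round(math.sqrt(n)))

def frequency_distribution(xs, num_bins=None):
    """Return (counts, relative frequencies, cumulative counts) for equal-width bins."""
    n = len(xs)
    if num_bins is None:
        num_bins = sqrt_rule_bins(n)
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / num_bins or 1  # fall back to 1 if all values are equal
    counts = [0] * num_bins
    for x in xs:
        # The largest value would land one past the end, so clamp it to the last bin.
        idx = min(int((x - lo) / width), num_bins - 1)
        counts[idx] += 1
    relative = [c / n for c in counts]
    cumulative = []
    running = 0
    for c in counts:
        running += c
        cumulative.append(running)
    return counts, relative, cumulative

print(sqrt_rule_bins(80))  # bin count suggested for 80 observations
counts, relative, cumulative = frequency_distribution([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(counts, relative, cumulative)
```

The relative frequencies always sum to 1 and the last cumulative count equals n, which gives a quick sanity check on any hand-built table.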

Figure 6-10 A cumulative distribution plot of the compressive strength data from Minitab.

Figure 6-11 Histograms for symmetric and skewed distributions.

• Frequency distributions and histograms can also be used with qualitative or categorical data.

Figure 6-12 Boeing airplane production in 1985.

6-4 Box Plots

• The box plot is a graphical display that simultaneously describes several important features of a data set, such as:
  • center
  • spread
  • departure from symmetry
  • identification of observations that lie unusually far from the bulk of the data (outliers and extreme outliers)
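
The slides do not state the cutoffs that separate outliers from extreme outliers. The sketch below assumes the widely used convention: points more than 1.5 IQR beyond a quartile are outliers, and points more than 3 IQR beyond are extreme outliers. The function name is illustrative; the quartiles are the values quoted earlier for the compressive strength data.

```python
def classify_point(x, q1, q3):
    """Label a value using 1.5*IQR and 3*IQR fences (an assumed convention)."""
    iqr = q3 - q1
    inner_low, inner_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outer_low, outer_high = q1 - 3 * iqr, q3 + 3 * iqr
    if x < outer_low or x > outer_high:
        return "extreme outlier"
    if x < inner_low or x > inner_high:
        return "outlier"
    return "within whiskers"

q1, q3 = 143.50, 181.00  # quartiles quoted for the strength data
for value in (76, 160, 245):
    print(value, classify_point(value, q1, q3))
```

With these quartiles the inner fences sit at 87.25 and 237.25, so under this convention the smallest and largest strengths (76 and 245) would be flagged as outliers but not as extreme outliers.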

Figure 6-13 Description of a box plot (also called a box-and-whisker plot).

Figure 6-14 Box plot for compressive strength data in Table 6-2.

Figure 6-15 Comparative box plots of a quality index at three plants.
6-5 Time Sequence Plots

• A time series or time sequence is a data set in which the observations are recorded in the order in which they occur.
• A time series plot is a graph in which the vertical axis denotes the observed value of the variable (say x) and the horizontal axis denotes the time (which could be minutes, days, years, etc.).
• When measurements are plotted as a time series, we often see
  • trends,
  • cycles, or
  • other broad features of the data.
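
To make the axis convention concrete, here is a minimal text-only sketch of a time series plot: each column is one time point in recording order, and the row a marker appears in reflects the observed value. The function name, the five-row resolution, and the data are all hypothetical; a real analysis would normally use a plotting library.

```python
def time_series_rows(values, height=5):
    """Return text rows (top row first) with one column per time point."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid division by zero for a constant series
    # Map each observation to a row level: 0 is the bottom row, height - 1 the top.
    levels = [round((v - lo) / span * (height - 1)) for v in values]
    return [
        "".join("*" if lvl == level else " " for lvl in levels)
        for level in range(height - 1, -1, -1)
    ]

sales = [1, 2, 3, 4, 5]  # hypothetical yearly values with an upward trend
for row in time_series_rows(sales):
    print(row)
```

Because the columns follow recording order, a steady rise from the bottom-left to the top-right of the output corresponds to an upward trend.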

Figure 6-16 Company sales by year (a) and by quarter (b).

Figure 6-17 A digidot plot of the compressive strength data in Table 6-2.

Figure 6-18 A digidot plot of chemical process concentration readings, observed hourly.
