
Reviewer in StatAna - Chapter 2

Chapter 2 of the document discusses descriptive statistics, focusing on summarizing categorical and quantitative data through various methods such as frequency distributions, bar charts, pie charts, and histograms. It provides guidelines for determining class limits, class widths, and the use of graphical displays like dot plots and stem-and-leaf displays to visualize data. The chapter also highlights the importance of relative and percent frequency distributions in understanding data insights.

Uploaded by

e-shien.duran-21

Reviewer in StatAna

Chapter 2 – Descriptive Statistics: Tabular and Graphical Displays

Þ Summarizing Data for a Categorical Variable
• Categorical data use labels or names to identify categories of like items.
Þ Summarizing Data for a Quantitative Variable
• Quantitative data are numerical values that indicate how much or how many.

Summarizing Categorical Data

§ Frequency Distribution
§ Relative Frequency Distribution
§ Percent Frequency Distribution
§ Bar Chart
§ Pie Chart

Frequency Distribution

§ A frequency distribution is a tabular summary of data showing the number (frequency) of observations in each of several non-overlapping categories or classes.
§ The objective is to provide insights about the data that cannot be quickly obtained by looking only at the original data.

Example:
§ Soft drink purchasers were asked to select one of five popular soft drinks: Coca-cola, Diet Coke, Dr. Pepper, Pepsi, and Sprite.
§ The soft drinks selected by a sample of 20 purchasers are:

Coca-cola     Pepsi         Dr. Pepper
Diet Coke     Dr. Pepper    Dr. Pepper
Dr. Pepper    Pepsi         Pepsi
Pepsi         Coca-cola     Diet Coke
Pepsi         Diet Coke     Dr. Pepper
Pepsi         Pepsi         Sprite
Pepsi         Pepsi

Soft Drink    Frequency
Coca-cola     2
Diet Coke     3
Dr. Pepper    5
Pepsi         9
Sprite        1
Total         20

Relative Frequency and Percent Frequency Distributions

Soft Drink    Relative Frequency    Percent Frequency
Coca-cola     .10                   10
Diet Coke     .15                   15
Dr. Pepper    .25                   25
Pepsi         .45                   45
Sprite        .05                   5
Total         1.00                  100

Relative Frequency Distribution

§ The relative frequency of a class is the fraction or proportion of the total number of data items belonging to a class.

Formula:
Relative frequency of a class = (Frequency of the class) / n
where n is the total number of observations.

§ A relative frequency distribution is a tabular summary of a set of data showing the relative frequency for each class.

Bar Chart

§ A bar chart is a graphical display for depicting qualitative data.
§ On one axis (usually the horizontal axis), we specify the labels that are used for each of the classes.
§ A frequency, relative frequency, or percent frequency scale can be used for the other axis (usually the vertical axis).
§ Using a bar of fixed width drawn above each class label, we extend the height appropriately.
§ The bars are separated to emphasize the fact that each class is separate.

Pareto Diagram

§ In quality control, bar charts are used to identify the most important causes of problems.
§ When the bars are arranged in descending order of height from left to right (with the most frequently occurring cause appearing first), the bar chart is called a Pareto diagram.
§ This diagram is named for its founder, Vilfredo Pareto, an Italian economist.

Pie Chart

§ The pie chart is a commonly used graphical display for presenting relative frequency and percent frequency distributions for categorical data.
§ First draw a circle, then use the relative frequencies to subdivide the circle into sectors that correspond to the relative frequency for each class.
§ Since there are 360 degrees in a circle, a class with a relative frequency of .25 would consume .25(360) = 90 degrees of the circle.
§ Example:
- Inferences from the Pie Chart
o Almost one-half of the customers surveyed preferred Pepsi (looking at the left side of the pie).
o The second preference is for Dr. Pepper, with 25% of the customers opting for it.
o Only 5% of the customers opted for Sprite.

Summarizing Quantitative Data

§ Frequency Distribution
§ Relative Frequency and Percent Frequency Distributions
§ Dot Plot
§ Histogram
§ Cumulative Distributions
§ Stem-and-Leaf Display

Frequency Distribution

Þ Step 1: Determine the number of non-overlapping classes.
Þ Step 2: Determine the width of each class.
Þ Step 3: Determine the class limits.

Example: Sanderson and Clifford, a small public accounting firm, wants to determine the time in days required to complete year-end audits. It takes a sample of 20 clients.

Year-end Audit Time (in Days)

12    14    19    18
15    15    18    17
20    27    22    23
22    21    33    28
14    18    16    13

Guidelines for Determining the Width of Each Class

§ Use classes of equal width.
§ Approximate Class Width = (Largest data value – Smallest data value) / Number of classes
§ Making the classes the same width reduces the chance of inappropriate interpretations.

Note on Number of Classes and Class Width

§ In practice, the number of classes and the appropriate class width are determined by trial and error.
§ Once a possible number of classes is chosen, the appropriate class width is found.
§ The process can be repeated for a different number of classes.
§ Ultimately, the analyst uses judgement to determine the combination of the number of classes and class width that provides the best frequency distribution for summarizing the data.
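
The trial-and-error process for choosing a class width can be sketched in a few lines of Python. This is only an illustration: the data are the 20 Sanderson and Clifford audit times, but the candidate class counts (5 to 8) and the variable names are assumptions made for the example.

```python
# Year-end audit times (in days) for the 20 sampled clients.
audit_times = [12, 14, 19, 18, 15, 15, 18, 17, 20, 27,
               22, 23, 22, 21, 33, 28, 14, 18, 16, 13]

# Largest data value minus smallest data value.
data_range = max(audit_times) - min(audit_times)

# Candidate numbers of classes; illustrative choices, not taken from the notes.
approx_widths = {k: data_range / k for k in (5, 6, 7, 8)}

for k, width in approx_widths.items():
    print(f"{k} classes -> approximate class width {width:.2f}")
```

Each approximate width would then be rounded to a convenient value, and the analyst compares the resulting frequency distributions before settling on one.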
Guidelines for Determining the Class Limits

§ Class limits must be chosen so that each data item belongs to one and only one class.
§ The lower class limit identifies the smallest possible data value assigned to the class.
§ The upper class limit identifies the largest possible data value assigned to the class.
§ The appropriate values for the class limits depend on the level of accuracy of the data.
§ An open-end class requires only a lower class limit or an upper class limit.

Guidelines for Determining the Number of Classes

§ Use between 5 and 20 classes.
§ Data sets with a larger number of elements usually require a larger number of classes.
§ Smaller data sets usually require fewer classes.
§ The goal is to use enough classes to show the variation in the data, but not so many classes that some contain only a few data items.

Example: Sanderson and Clifford

§ If we choose five classes:
§ Approximate Class Width = (33 – 12)/5 = 4.2, which is rounded up to a class width of 5 days (classes 10 – 14, 15 – 19, and so on).

Time in days    Frequency
10 – 14         4
15 – 19         8
20 – 24         5
25 – 29         2
30 – 34         1
Total           20

Class Midpoint

§ In some cases, we want to know the midpoints of the classes in a frequency distribution for quantitative data.
§ The class midpoint is the value halfway between the lower and upper class limits. For the class 10 – 14, the midpoint is (10 + 14)/2 = 12.

Relative Frequency and Percent Frequency Distributions

Example: Sanderson and Clifford

Audit time (in days)    Relative Frequency    Percent Frequency
10 – 14                 .20                   20 (.20 × 100)
15 – 19                 .40                   40
20 – 24                 .25                   25
25 – 29                 .10                   10
30 – 34                 .05                   5
Total                   1.00                  100

Insights obtained from the Percent Frequency Distribution:

Þ 40% of the audits required from 15 to 19 days.
Þ Another 25% of the audits required 20 to 24 days.
Þ Only 5% of the audits required 30 or more days.

Dot Plot

§ One of the simplest graphical summaries of data is a dot plot.
§ A horizontal axis shows the range of data values.
§ Then each data value is represented by a dot placed above the axis.

Example: Sanderson and Clifford (dot plot of the audit times; figure not included)

Histogram

§ Another common graphical display of quantitative data is a histogram.
§ The variable of interest is placed on the horizontal axis.
§ A rectangle is drawn above each class interval with its height corresponding to the interval’s frequency, relative frequency, or percent frequency.
§ Unlike a bar graph, a histogram has no natural separation between rectangles of adjacent classes.

Histograms Showing Skewness

§ Moderately Skewed Left
• A longer tail to the left
• Example: Exam Scores
§ Moderately Skewed Right
• A longer tail to the right
• Example: Housing Values
§ Symmetric
• Left tail is the mirror image of the right tail
• Example: Heights of people

Cumulative Distributions

§ Cumulative frequency distribution – shows the number of items with values less than or equal to the upper limit of each class.
§ Cumulative relative frequency distribution – shows the proportion of items with values less than or equal to the upper limit of each class.
§ Cumulative percent frequency distribution – shows the percentage of items with values less than or equal to the upper limit of each class.
§ Example: Sanderson and Clifford

Audit time (days)    Cumulative Frequency    Cumulative Relative Frequency    Cumulative Percent Frequency
≤ 14                 4                       .20                              20
≤ 19                 12                      .60                              60
≤ 24                 17                      .85                              85
≤ 29                 19                      .95                              95
≤ 34                 20                      1.00                             100

§ The last entry in a cumulative frequency distribution always equals the total number of observations.
§ The last entry in a cumulative relative frequency distribution always equals 1.00.
§ The last entry in a cumulative percent frequency distribution always equals 100.

Stem-and-Leaf Display

§ A stem-and-leaf display shows both the rank order and shape of the distribution of the data.
§ It is similar to a histogram on its side, but it has the advantage of showing the actual data values.
§ The first digits of each data item are arranged to the left of a vertical line.
§ To the right of the vertical line we record the last digit for each item in rank order.
§ Each line (row) in the display is referred to as a stem.
§ Each digit on a stem is a leaf.
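
The construction of a stem-and-leaf display can be sketched in Python as follows. As an assumption for illustration, the sketch reuses the 20 Sanderson and Clifford audit times rather than the data set used in the notes, and treats the tens digit as the stem and the units digit as the leaf.

```python
from collections import defaultdict

# Year-end audit times (in days); reused here only as sample data.
audit_times = [12, 14, 19, 18, 15, 15, 18, 17, 20, 27,
               22, 23, 22, 21, 33, 28, 14, 18, 16, 13]

stems = defaultdict(list)
for value in sorted(audit_times):     # sorting puts the leaves in rank order
    stem, leaf = divmod(value, 10)    # tens digit is the stem, units digit is the leaf
    stems[stem].append(leaf)

# One text row per stem: stem, vertical line, then the leaves.
display = [f"{stem} | {' '.join(map(str, leaves))}"
           for stem, leaves in sorted(stems.items())]
print("\n".join(display))
```

Read sideways, the row lengths give the same shape as a histogram with classes 10 – 19, 20 – 29, and 30 – 39, while every original value remains readable.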
Example: Number of questions answered correctly by 50 students (stem-and-leaf figure not included).

Leaf Units

§ A single digit is used to define each leaf.
§ In the preceding example, the leaf unit was 1.
§ Leaf units may be 100, 10, 1, 0.1, and so on.
§ Where the leaf unit is not shown, it is assumed to equal 1.
§ The leaf unit indicates how to multiply the stem-and-leaf numbers in order to approximate the original data. For example, with a leaf unit of 10, stem 16 and leaf 8 represent 168 × 10 = 1680.
§ Example: Leaf Unit = 0.1
If we have data with values such as
8.6, 11.7, 9.4, 9.1, 10.2, 11.0, 8.8

Leaf Unit = 0.1

 8 | 6 8
 9 | 1 4
10 | 2
11 | 0 7

§ Example: Leaf Unit = 10
If we have data with values such as
1806, 1717, 1974, 1791, 1682, 1910, 1838

Leaf Unit = 10

16 | 8
17 | 1 9
18 | 0 3
19 | 1 7

The 82 in 1682 is rounded down to 80 and is represented as an 8.
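
The effect of the leaf unit can be checked with a short Python sketch. The function name `stem_and_leaf` and its parameters are hypothetical, and the sketch assumes non-negative values and that digits below the leaf unit are dropped rather than rounded to the nearest unit. `Decimal` is used so that a value such as 11.7 divided by 0.1 gives exactly 117. The value 9.1 is included in the first data set so that the output matches the "9 | 1 4" row of the display.

```python
from decimal import Decimal

def stem_and_leaf(values, leaf_unit=1):
    """Group values into {stem: [leaves]} for the given leaf unit.

    Digits below the leaf unit are dropped (1682 with unit 10 becomes 168).
    Assumes non-negative values.
    """
    unit = Decimal(str(leaf_unit))
    rows = {}
    for value in sorted(values):                    # rank order
        units = int(Decimal(str(value)) // unit)    # whole leaf units in the value
        stem, leaf = divmod(units, 10)              # last digit is the leaf
        rows.setdefault(stem, []).append(leaf)
    return rows

# 9.1 is an assumed data value, added to match the displayed leaves on stem 9.
tenths = stem_and_leaf([8.6, 11.7, 9.4, 9.1, 10.2, 11.0, 8.8], leaf_unit=0.1)
tens = stem_and_leaf([1806, 1717, 1974, 1791, 1682, 1910, 1838], leaf_unit=10)

for label, rows in (("Leaf Unit = 0.1", tenths), ("Leaf Unit = 10", tens)):
    print(label)
    for stem, leaves in rows.items():
        print(f"{stem:>3} | {' '.join(str(leaf) for leaf in leaves)}")
```

Multiplying a stem-and-leaf number by the leaf unit recovers the approximate original value, for example 168 × 10 = 1680 for the observation 1682.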
