Statistics-1
Section (1)
Prepared By : Doaa Ghaleb
Definition of statistics
● Statistics : 1- Analyze , 2- Present , 3- Collect Data
Online / Offline
Data Mining , Data Analysis , Algorithms , …etc.
Market Basket Analysis → Patterns → Decision
Problem Solving → Problem → Solve → Decision
Definition of Data
● Data : Unit of information + Set of values + Raw facts which do not carry
any specific meaning
Names | IDs
Doaa | 20202341
Silvana | 20202331
Sahar | 20202321
….. | ….
Quantitative Data (Numeric)
Data is expressed in terms of numbers
Quantity
Examples : Age , height
Qualitative Data (Non-Numeric)
Data is NOT expressed in terms of numbers BUT rather by natural language description
Quality
Examples : Names
Types of Data
Quantitative Data (Numeric) Qualitative Data (Non-Numeric)
Data Type of Data
Family Size Quantitative
Street no. Quantitative
Balance in your account Quantitative
Height = 165 Quantitative
state Qualitative
Note : Height = “tall” is Qualitative
Note : “No.” is Qualitative when it is a label (# street) and Quantitative when it is a count (Number # of students)
Data vs. Information
Data | Information
Raw facts | Data with context
No context | Processed Data
Just numbers and text | Summarized , Organized , Analyzed Data
Data → process → Information
Names | IDs | GPAs
Doaa | 20202341 | 4
Silvana | 20202331 | 3.5
Sahar | 20202321 | 1.5
….. | …. | ….
Examples :
List of Names → Data
List of IDs → Data
Doaa gets a GPA of 4.0 → Information
Average of GPAs → Information
Count of students → Information
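The step from Data to Information can be sketched in a few lines of Python. This is only an illustration: it uses the three rows listed in the table (the elided "….." rows are not included), and the variable names are assumptions, not part of the slides.

```python
from statistics import mean

# Raw data: only the three rows listed in the table (elided rows are left out).
students = [
    ("Doaa", 20202341, 4.0),
    ("Silvana", 20202331, 3.5),
    ("Sahar", 20202321, 1.5),
]

# Information: summaries obtained by processing the raw data.
student_count = len(students)
average_gpa = mean(gpa for _, _, gpa in students)

print(f"Count of students = {student_count}")
print(f"Average of GPAs = {average_gpa}")
```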
Quantitative Data (Numeric) is either Discrete or Continuous (cts. / con’t)
Discrete : ONLY certain values + usually integers
Examples : Family Size
Continuous : Any value within interval
Examples : Height
Data | Types of Data
Distance | Quantitative(cts.)
Color | Qualitative
# of rooms | Quantitative(Discrete)
Temperature | Quantitative(cts.)
Weight | Quantitative(cts.)
X a random value that takes = 0.1,0.2,0.3 ONLY | Quantitative(Discrete)
Representing Data
Quantitative Data (Numeric)
1. Table ( Frequency)
2. Graph
2.1 Histogram
2.2 Polygon
3. Numerical Measures
Qualitative Data (Non-Numeric)
1. Table ( Frequency)
2. Graph
2.1 Bar Chart
2.2 Pie Chart
Types of statistics
● Statistics : 1- Analyze , 2- Present , 3- Collect Data
Descriptive Statistics
Describe Data using :
1. Table
2. Graph
3. Numerical Measures
Inferential Statistics
● Sample : Subset of the population
● Population : Collection of all observations of a specified characteristic.
Making generalizations about the whole (population) by examination of the part (sample)
1.1 Qualitative Data
(Non – Numeric Data )
1- Qualitative Data (Non-Numeric Data)
1. Table ( Frequency)
F F F M M F M M
M M M F M F M M
→ Construct Frequency Table
Relative Frequency (RF) = f ÷ n
Percentage Frequency (%) = (f ÷ n) * 100
Degree = (f ÷ n) * 360°
Classes | Tally | Frequency (f) | RF | % | Degree
F | //// / | 6 | 6 ÷ 16 | (6 ÷ 16) * 100 = 37.5% | (6 ÷ 16) * 360° = 135°
M | //// //// | 10 | 10 ÷ 16 | (10 ÷ 16) * 100 = 62.5% | (10 ÷ 16) * 360° = 225°
Sample Size n = 16
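The frequency table above can be reproduced with a short Python sketch. It assumes the 16 observations are read row by row as listed; the names `observations` and `rows` are illustrative only.

```python
from collections import Counter

# The 16 observations from the example, read row by row.
observations = "F F F M M F M M M M M F M F M M".split()

n = len(observations)              # sample size
frequency = Counter(observations)  # class -> frequency (f)

rows = {}
for cls in sorted(frequency):
    f = frequency[cls]
    rows[cls] = {
        "f": f,
        "rf": f / n,            # relative frequency
        "pct": f / n * 100,     # percentage frequency
        "degree": f / n * 360,  # pie-chart angle in degrees
    }
    print(cls, rows[cls])
```

The percentages add up to 100 and the degrees add up to 360, which is a quick check on any frequency table of this kind.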
2. Graph ( Bar Chart / Pie Chart)
[Figure: Bar Chart of the frequencies (F = 6 , M = 10) and Pie Chart of the percentages (F = 37.5% , 135° ; M = 62.5% , 225°)]
Some Definitions
● Frequency: the number of times an event or item occurs in a data set.
● Relative Frequency: The relative frequency of a particular observation or class interval is found by
dividing the frequency (f) by the number of observations (n), that is, (f ÷ n). Thus:
Relative frequency = frequency ÷ number of observations
● Percentage Frequency: The percentage frequency is found by multiplying each relative frequency value
by 100. Thus:
Percentage frequency = relative frequency * 100 = (f ÷ n) * 100
● Cumulative Frequency: The cumulative frequency of a row is the running total of the frequencies up to and
including that row. To find it, add all the previous frequencies to the frequency of the current row.
● Frequency distribution: A chart or table showing how often each value or range of values of a variable
appears in a data set.
● Frequency table: a table presenting statistical data by listing each value of a characteristic along
with the number of times that value appears in the data set.
Another Example
● Construct a Pie Chart for the following table.
Classes | Frequency (f) | Degree = (f ÷ n) * 360°
T.V | 1000 | (1000 ÷ 6000) * 360° = 60°
Newspapers | 2000 | (2000 ÷ 6000) * 360° = 120°
Posters | 3000 | (3000 ÷ 6000) * 360° = 180°
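As a rough sketch, the pie-chart angles of this example can be computed with a small helper. The function name `pie_degrees` is hypothetical; it simply applies Degree = (f ÷ n) * 360° to every class, with n taken as the sum of the frequencies.

```python
def pie_degrees(frequencies):
    """Return the pie-chart angle (in degrees) of each class: (f / n) * 360."""
    n = sum(frequencies.values())  # total number of observations
    return {cls: f * 360 / n for cls, f in frequencies.items()}


media = {"T.V": 1000, "Newspapers": 2000, "Posters": 3000}
angles = pie_degrees(media)
print(angles)
```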
1.2 Quantitative Data
(Numeric Data )
2- Quantitative Data (Numeric Data)
20 , 25 , 22 , 80 , 48 , 21 , 33 , 60 , 40 , 55 .
A sample of size 10 , with smallest value 20 and largest value 80 . Construct a
Frequency Table.
1. Table ( Frequency)
1) Sort Data 20 , 21 , 22 , 25 , 33 , 40 , 48 , 55 , 60 , 80
2) Get Sample Size n = 10
3) Calculate # of classes 2^k ≥ n → 2^k ≥ 10 → k = 4
4) Calculate Class Interval (CI) / Class Width CI = Range ÷ k = (largest − smallest) ÷ k = (80 − 20) ÷ 4 = 15
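Steps 1 to 4 can be sketched in Python as follows. The helper names are hypothetical, and rounding the class width up to a whole number is an assumption taken from the way Exercise (1) rounds 6.8 to 7.

```python
import math


def number_of_classes(n):
    """Smallest integer k >= 1 with 2**k >= n."""
    k = 1
    while 2 ** k < n:
        k += 1
    return k


def class_width(data, k):
    """Range (largest - smallest) divided by k, rounded up (assumed rule)."""
    return math.ceil((max(data) - min(data)) / k)


data = sorted([20, 25, 22, 80, 48, 21, 33, 60, 40, 55])  # 1) sort
n = len(data)                                            # 2) sample size
k = number_of_classes(n)                                 # 3) number of classes
ci = class_width(data, k)                                # 4) class interval
print(n, k, ci)
```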
Classes | Tally | Frequency (f) | Relative Frequency (RF) | Percentage Frequency (%)
20 -< 35 | //// | 5 | 5 ÷ 10 | (5 ÷ 10) * 100 = 50%
35 -< 50 | // | 2 | 2 ÷ 10 | 20%
50 -< 65 | // | 2 | 2 ÷ 10 | 20%
65 - 80 | / | 1 | 1 ÷ 10 | 10%
n = 10
Less than (<) | Cumulative Frequency
<20 | 0
<35 | 5
<50 | 7
<65 | 9
<80 | 10
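A minimal sketch of how the class frequencies and the cumulative column can be counted, assuming the class boundaries 20 , 35 , 50 , 65 , 80 and the convention that each class includes its lower bound but not its upper bound, except the last class, which is closed.

```python
from bisect import bisect_right
from itertools import accumulate

data = [20, 21, 22, 25, 33, 40, 48, 55, 60, 80]
edges = [20, 35, 50, 65, 80]  # class boundaries (class width 15)

n = len(data)
frequency = [0] * (len(edges) - 1)
for x in data:
    # lower <= x < upper; the last class also includes its upper bound (80).
    i = min(bisect_right(edges, x) - 1, len(frequency) - 1)
    frequency[i] += 1

percentage = [f * 100 / n for f in frequency]
# Cumulative frequency at each boundary: 0 observations before the first class.
cumulative = [0] + list(accumulate(frequency))

print(frequency, percentage, cumulative)
```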
2. Graph ( Histogram / Polygon)
[Figure: Histogram (x-axis: class boundaries 20 , 35 , 50 , 65 , 80 ; y-axis: frequency 1 to 5) and Frequency Polygon (x-axis: class midpoints 27.5 , 42.5 , 57.5 , 72.5 ; y-axis: frequency 1 to 5)]
Classes | Tally | Frequency (f) | Mid point
20 -< 35 | //// | 5 | (20+35)/2=27.5
35 -< 50 | // | 2 | (35+50)/2=42.5
50 -< 65 | // | 2 | (50+65)/2=57.5
65 - 80 | / | 1 | (65+80)/2=72.5
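The midpoints used for the polygon can be computed from the class boundaries. The sketch below pairs each midpoint with its class frequency; the name `polygon_points` is illustrative only.

```python
edges = [20, 35, 50, 65, 80]  # class boundaries
frequency = [5, 2, 2, 1]      # class frequencies from the table

# Midpoint of each class = (lower bound + upper bound) / 2
midpoints = [(lower + upper) / 2 for lower, upper in zip(edges, edges[1:])]

# Points (midpoint, frequency) that the polygon joins with straight lines.
polygon_points = list(zip(midpoints, frequency))
print(polygon_points)
```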
Some Definitions
1- Histogram
It has connected bars that display the frequency or proportion of cases that fall within defined intervals.
2- Frequency Polygon
To draw a polygon , plot each class frequency above the midpoint of its interval (the midpoint of the top of
the corresponding rectangle in the histogram) and join these points together by straight lines.
Exercise (1)
A school nurse weighed 30 students in Year 10. Their weights (in kg) were
recorded as follows :
50 52 53 54 55 65 60 70 48 63
74 40 46 59 68 44 47 56 49 58
63 66 68 61 57 58 62 52 56 58
Present this information in a frequency table.
1) Sort Data
40 , 44 , 46 , 47 , 48 , 49 , 50 , 52 , 52 , 53 ,
54 , 55 , 56 , 56 , 57 , 58 , 58 , 58 , 59 , 60 ,
61 , 62 , 63 , 63 , 65 , 66 , 68 , 68 , 70 , 74
2) Get Sample Size n = 30
3) Calculate # of classes 2^k ≥ n → 2^k ≥ 30 → k = 5
4) Calculate Class Interval (CI) / Class Width
CI = Range ÷ k = (largest − smallest) ÷ k = (74 − 40) ÷ 5 = 34 ÷ 5 = 6.8 ≈ 7
1. Table ( Frequency)
Classes | Frequency (f)
40-<47 | 3
47-<54 | 7
54-<61 | 10
61-<68 | 6
68-<75 | 4
n=30
Less than ( < ) | Cumulative Frequency
< 40 | 0
< 47 | 3
< 54 | 10
< 61 | 20
< 68 | 26
< 75 | 30
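The whole procedure (sort, sample size, number of classes, class width, counting) can be put together in one function. This is a sketch under stated assumptions: the data are integers that are not all equal, the class width is rounded up, and the function name `frequency_table` is hypothetical.

```python
import math
from itertools import accumulate


def frequency_table(data):
    """Return (classes, frequencies, cumulative) following steps 1 to 4.

    Assumes integer data that are not all equal.
    """
    data = sorted(data)                          # 1) sort
    n = len(data)                                # 2) sample size
    k = 1
    while 2 ** k < n:                            # 3) smallest k with 2**k >= n
        k += 1
    width = math.ceil((data[-1] - data[0]) / k)  # 4) class width, rounded up
    freq = [0] * k
    for x in data:
        # lower <= x < lower + width; clamp so the maximum falls in the last class
        freq[min((x - data[0]) // width, k - 1)] += 1
    classes = [(data[0] + i * width, data[0] + (i + 1) * width) for i in range(k)]
    cumulative = [0] + list(accumulate(freq))
    return classes, freq, cumulative


weights = [
    50, 52, 53, 54, 55, 65, 60, 70, 48, 63,
    74, 40, 46, 59, 68, 44, 47, 56, 49, 58,
    63, 66, 68, 61, 57, 58, 62, 52, 56, 58,
]
classes, freq, cumulative = frequency_table(weights)
for (lower, upper), f in zip(classes, freq):
    print(f"{lower}-<{upper}: {f}")
print(cumulative)
```

Applied to the 30 weights, the sketch gives the same classes, frequencies, and cumulative frequencies as the table above.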
Exercise (2)
Given the following frequency table
Classes | Frequency (f) | Relative Frequency
10 - | 7 | 7/60
15 - | 12 | 12/60
20 - | 13 | 13/60
25 - | 20 | 20/60
30 - | 6 | 6/60
35 - 40 | 2 | 2/60
1. What is the sample size? n = 60
2. What is the most frequent class? 25-<30
3. Compute the relative frequency. RF = f ÷ n (see the Relative Frequency column)
4. Represent data graphically.
5. Are these data symmetric? No , they are negatively skewed.
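Questions 1 to 3 can be checked with a short sketch. The class labels are written out in full (for example "10-<15" for the class "10 -"), which is an assumption about the notation; exact fractions are used so that the relative frequencies can be verified to sum to 1.

```python
from fractions import Fraction

classes = ["10-<15", "15-<20", "20-<25", "25-<30", "30-<35", "35-40"]
frequency = [7, 12, 13, 20, 6, 2]

n = sum(frequency)                                       # 1) sample size
modal_class = classes[frequency.index(max(frequency))]   # 2) most frequent class
relative = [Fraction(f, n) for f in frequency]           # 3) relative frequency f / n

print(n, modal_class)
print([str(rf) for rf in relative])  # fractions are printed in lowest terms
```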
[Figure: graph of the frequency table above; x-axis: class boundaries 10 , 15 , 20 , 25 , 30 , 35 , 40 ; y-axis: frequency from 3 to 21 in steps of 3]
Anti-symmetric OR Skewness : look at the Tail !
Normal Distribution (Symmetric)
Positive / Right skew (Anti-Symmetric) : +ve , tail on the right
Negative / Left skew (Anti-Symmetric) : -ve , tail on the left
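The sign of the skew can also be checked numerically. The slides read the skew from the tail of the graph; the sketch below instead uses the moment-based skewness coefficient on the class midpoints of Exercise (2), which is an assumption and not a method given in the slides. A negative value indicates a left (negative) skew.

```python
lower_bounds = [10, 15, 20, 25, 30, 35]  # classes of Exercise (2)
width = 5
frequency = [7, 12, 13, 20, 6, 2]

# Treat every observation in a class as if it sat at the class midpoint.
midpoints = [lower + width / 2 for lower in lower_bounds]
n = sum(frequency)
mean = sum(f * m for f, m in zip(frequency, midpoints)) / n

# Second and third central moments of the grouped data.
m2 = sum(f * (m - mean) ** 2 for f, m in zip(frequency, midpoints)) / n
m3 = sum(f * (m - mean) ** 3 for f, m in zip(frequency, midpoints)) / n

skewness = m3 / m2 ** 1.5
print(mean, m3, skewness)
```

For this table the coefficient comes out slightly below zero, which agrees with the answer "negatively skewed".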