
Statistics for Data Analysis

1. The document provides information on calculating class boundaries, class midpoints, frequencies, and relative frequencies for grouped data. 2. Formulas are given for calculating measures of central tendency like the mean, median, and mode as well as measures of dispersion like quartiles, percentiles, and standard deviation. 3. Examples are worked out applying the formulas to sample data to find values like the first quartile, third quartile, 77th percentile, and standard deviation.

MODULE I: TEST 3

1. Range
R = HV - LV
R = 19 - 2
R = 17

2. Number of classes
k = 1 + 3.3 log n
k = 1 + 3.3 log 30
k = 5.87, rounded up to 6

3. Class interval
i = R / k
i = 17 / 6
i = 2.83, rounded to 3

Lower class limits, counting up from the lowest value: a, a + 3 = b, b + 3 = c, and so on.
2, 5, 8, 11, 14, 17

Upper class limits, counting down from the highest value: z, z - 3 = y, y - 3 = x, and so on.
19, 16, 13, 10, 7, 4

4. Class boundaries
LCB = lower class limit - 0.5
UCB = upper class limit + 0.5

5. Class midpoint
CM = (LCB + UCB) / 2, which gives the same value as (lower limit + upper limit) / 2

Classes    f    LCB                UCB                 CM
17 - 19    3    17 - 0.5 = 16.5    19 + 0.5 = 19.5     (17 + 19)/2 = 18
14 - 16    5    14 - 0.5 = 13.5    16 + 0.5 = 16.5     (14 + 16)/2 = 15
11 - 13    6    11 - 0.5 = 10.5    13 + 0.5 = 13.5     (11 + 13)/2 = 12
8 - 10     8    8 - 0.5 = 7.5      10 + 0.5 = 10.5     (8 + 10)/2 = 9
5 - 7      5    5 - 0.5 = 4.5      7 + 0.5 = 7.5       (5 + 7)/2 = 6
2 - 4      3    2 - 0.5 = 1.5      4 + 0.5 = 4.5       (2 + 4)/2 = 3
Total      30
n = 30

MODULE I: TEST 4

Classes    LCB  -  UCB     CM    f     RF       CF<   CRF<     CF>   CRF>
55 - 59    54.5 - 59.5     57    5     5.88     85    100.00   5     5.88
50 - 54    49.5 - 54.5     52    9     10.59    80    94.12    14    16.47
45 - 49    44.5 - 49.5     47    10    11.76    71    83.53    24    28.24
40 - 44    39.5 - 44.5     42    22    25.88    61    71.76    46    54.12
35 - 39    34.5 - 39.5     37    18    21.18    39    45.88    64    75.29
30 - 34    29.5 - 34.5     32    13    15.29    21    24.71    77    90.59
25 - 29    24.5 - 29.5     27    6     7.06     8     9.41     83    97.65
20 - 24    19.5 - 24.5     22    2     2.35     2     2.35     85    100.00
Total                            85    100.00
n = 85

FORMULA AND SOLUTION

LCB  = lower class limit - 0.5
UCB  = upper class limit + 0.5
CM   = (LCB + UCB) / 2
RF   = f / n * 100
CF<  = f + CF< of the next lower class
CRF< = CF< / n * 100
CF>  = f + CF> of the next higher class
CRF> = CF> / n * 100

Classes    LCB         UCB         CM             RF
55 - 59    55 - 0.5    59 + 0.5    (55 + 59)/2    5/85*100
50 - 54    50 - 0.5    54 + 0.5    (50 + 54)/2    9/85*100
45 - 49    45 - 0.5    49 + 0.5    (45 + 49)/2    10/85*100
40 - 44    40 - 0.5    44 + 0.5    (40 + 44)/2    22/85*100
35 - 39    35 - 0.5    39 + 0.5    (35 + 39)/2    18/85*100
30 - 34    30 - 0.5    34 + 0.5    (30 + 34)/2    13/85*100
25 - 29    25 - 0.5    29 + 0.5    (25 + 29)/2    6/85*100
20 - 24    20 - 0.5    24 + 0.5    (20 + 24)/2    2/85*100

Classes    CF<        CRF<          CF>        CRF>
55 - 59    80 + 5     85/85*100     5          5/85*100
50 - 54    71 + 9     80/85*100     5 + 9      14/85*100
45 - 49    61 + 10    71/85*100     14 + 10    24/85*100
40 - 44    39 + 22    61/85*100     24 + 22    46/85*100
35 - 39    21 + 18    39/85*100     46 + 18    64/85*100
30 - 34    8 + 13     21/85*100     64 + 13    77/85*100
25 - 29    2 + 6      8/85*100      77 + 6     83/85*100
20 - 24    2          2/85*100      83 + 2     85/85*100
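
The columns of the Test 4 table can also be generated programmatically. The following is a minimal Python sketch, assuming integer class limits listed from the highest class down as in the worksheet; the function name `frequency_table` and the dictionary keys are hypothetical and chosen only for illustration.

```python
# Illustrative sketch; frequency_table and its keys are not part of the worksheet.
def frequency_table(classes, freqs):
    """Build the grouped-data columns.

    classes: list of (lower_limit, upper_limit), highest class first.
    freqs:   frequencies in the same order.
    """
    n = sum(freqs)
    rows = []
    cf_greater = 0
    for (lower, upper), f in zip(classes, freqs):
        cf_greater += f  # CF>: accumulate from the highest class downward
        rows.append({
            "class": (lower, upper),
            "lcb": lower - 0.5,
            "ucb": upper + 0.5,
            "cm": (lower + upper) / 2,
            "f": f,
            "rf": round(f / n * 100, 2),
            "cf_gt": cf_greater,
            "crf_gt": round(cf_greater / n * 100, 2),
        })
    cf_less = 0
    for row in reversed(rows):  # CF<: accumulate from the lowest class upward
        cf_less += row["f"]
        row["cf_lt"] = cf_less
        row["crf_lt"] = round(cf_less / n * 100, 2)
    return rows


classes = [(55, 59), (50, 54), (45, 49), (40, 44),
           (35, 39), (30, 34), (25, 29), (20, 24)]
freqs = [5, 9, 10, 22, 18, 13, 6, 2]
table = frequency_table(classes, freqs)

for row in table:
    print(row["class"], row["lcb"], row["ucb"], row["cm"], row["f"],
          row["rf"], row["cf_lt"], row["crf_lt"], row["cf_gt"], row["crf_gt"])
```

Rounding to two decimals mirrors the worksheet; an unrounded version would simply drop the `round` calls.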
MODULE II - TEST II

Given:      4   9   11  12  17  5   8   12  14       Σx = 92
            x1  x2  x3  x4  x5  x6  x7  x8  x9
Arranged:   4   5   8   9   11  12  12  14  17       n = 9

Determine the following:

1. Mean (x̄)
Formula:   x̄ = Σx / n
Solution:  x̄ = (4 + 5 + 8 + 9 + 11 + 12 + 12 + 14 + 17) / 9
             = 92 / 9
             = 10.22

2. Median (x͂)
Formula:   x͂ = value at position (n + 1) / 2
Solution:  position = (9 + 1) / 2
                    = 10 / 2
                    = 5
           x͂ = x5 = 11

3. Mode (Xmo)
Find the value that appears with the highest frequency.
12 appears twice while the rest appear only once.
Mode = 12

4. Quartile 1
Formula:   Q1 = value at position 1(n + 1) / 4
Solution:  position = 1(9 + 1) / 4
                    = 10 / 4
                    = 2.5
           Q1 = x2 + 0.5(x3 - x2)
              = 5 + 0.5(8 - 5)
              = 5 + 1.5
              = 6.5

5. Quartile 3
Formula:   Q3 = value at position 3(n + 1) / 4
Solution:  position = 3(9 + 1) / 4
                    = 30 / 4
                    = 7.5
           Q3 = x7 + 0.5(x8 - x7)
              = 12 + 0.5(14 - 12)
              = 12 + 1
              = 13

6. Decile 5
Formula:   D5 = value at position 5(n + 1) / 10
Solution:  position = 5(9 + 1) / 10
                    = 50 / 10
                    = 5
           D5 = x5 = 11

7. Percentile 77
Formula:   P77 = value at position 77(n + 1) / 100
Solution:  position = 77(9 + 1) / 100
                    = 770 / 100
                    = 7.7
           P77 = x7 + 0.7(x8 - x7)
               = 12 + 0.7(14 - 12)
               = 12 + 1.4
               = 13.4

8. Standard Deviation (s)

x      x²
4      16
5      25
8      64
9      81
11     121
12     144
12     144
14     196
17     289
92     1080
Σx     Σx²

Formula:   s = √[ (nΣx² - (Σx)²) / (n(n - 1)) ]
Solution:  s = √[ (9(1080) - 92²) / (9(9 - 1)) ]
             = √[ (9720 - 8464) / (9 * 8) ]
             = √(1256 / 72)
             = √17.44
             = 4.18
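
The ungrouped measures can be cross-checked with a short Python sketch. It assumes the (n + 1) position rule with linear interpolation between neighbouring ranked values, as in the worked solutions; the helper names `value_at_position` and `quantile` are hypothetical. The mean, median, mode, and sample standard deviation come from the standard library `statistics` module.

```python
import statistics


def value_at_position(sorted_data, pos):
    """Value at a 1-based, possibly fractional, position in ranked data.

    A position such as 2.5 is read as x2 + 0.5 * (x3 - x2).
    """
    k = int(pos)          # rank of the lower neighbour
    frac = pos - k        # fractional part used for interpolation
    if k < 1:
        return sorted_data[0]
    if k >= len(sorted_data):
        return sorted_data[-1]
    lower = sorted_data[k - 1]
    upper = sorted_data[k]
    return lower + frac * (upper - lower)


def quantile(data, j, parts):
    """j-th quantile out of `parts` (4 = quartiles, 10 = deciles, 100 = percentiles)."""
    xs = sorted(data)
    return value_at_position(xs, j * (len(xs) + 1) / parts)


data = [4, 9, 11, 12, 17, 5, 8, 12, 14]

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
q1 = quantile(data, 1, 4)
q3 = quantile(data, 3, 4)
d5 = quantile(data, 5, 10)
p77 = quantile(data, 77, 100)
s = statistics.stdev(data)   # sample standard deviation, divisor n - 1

print(round(mean, 2), median, mode, q1, q3, d5, round(p77, 2), round(s, 2))
```

Other software may use a different position rule for quantiles, so results from spreadsheet or library functions can differ slightly from the (n + 1) rule used here.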
MODULE II - TEST III

Class      f      m                   fm
50 - 54    1      (50 + 54)/2 = 52    1*52  = 52
45 - 49    5      (45 + 49)/2 = 47    5*47  = 235
40 - 44    11     (40 + 44)/2 = 42    11*42 = 462
35 - 39    18     (35 + 39)/2 = 37    18*37 = 666
30 - 34    30     (30 + 34)/2 = 32    30*32 = 960
25 - 29    24     (25 + 29)/2 = 27    24*27 = 648
20 - 24    11     (20 + 24)/2 = 22    11*22 = 242
Total      100                        3265
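CF>     CB               CF<
1       49.5 - 54.5      100
6       44.5 - 49.5      99
17      39.5 - 44.5      94
35      34.5 - 39.5      83
65      29.5 - 34.5      65
89      24.5 - 29.5      35
100     19.5 - 24.5      11

CB:  lower limit - 0.5, upper limit + 0.5
     50 - 0.5, 54 + 0.5
     45 - 0.5, 49 + 0.5
     40 - 0.5, 44 + 0.5
     35 - 0.5, 39 + 0.5
     30 - 0.5, 34 + 0.5
     25 - 0.5, 29 + 0.5
     20 - 0.5, 24 + 0.5
CF>: starting from the highest class, add the next f to the running total.
CF<: starting from the lowest class, add the next f to the running total.

Mean
x̄ = Σfm / n
  = 3265 / 100
  = 32.65

Median
The median class is 30 - 34 because n/2 = 50 falls inside it. This solution interpolates downward from the upper boundary L = 34.5, so cf = 35 is the CF> of the class above and c = -5.
x͂ = L + ((n/2 - cf) / f) x c
  = 34.5 + ((50 - 35) / 30) x (-5)
  = 34.5 + (15/30) x (-5)
  = 34.5 - 2.5
  = 32

Mode
The modal class is 30 - 34 (f1 = 30). Measured from the same upper boundary, f0 = 18 is the frequency of the class above and f2 = 24 is the frequency of the class below.
xmo = L + ((f1 - f0) / (2 x f1 - f0 - f2)) x c
    = 34.5 + ((30 - 18) / (2 x 30 - 18 - 24)) x (-5)
    = 34.5 + (12/18) x (-5)
    = 34.5 - 3.33
    = 31.17

For the quantiles below, L is the lower boundary of the class containing the position, cf is the CF< of the class below it, and c = 5.

Q2 = L + ((2n/4 - cf) / f) ⋅ c
   = 29.5 + ((50 - 35) / 30) ⋅ 5
   = 29.5 + (15/30) ⋅ 5
   = 29.5 + 2.5
   = 32

D9 = L + ((9n/10 - cf) / f) ⋅ c
   = 39.5 + ((90 - 83) / 11) ⋅ 5
   = 39.5 + (7/11) ⋅ 5
   = 39.5 + 3.1818
   = 42.6818

P95 = L + ((95n/100 - cf) / f) ⋅ c
    = 44.5 + ((95 - 94) / 5) ⋅ 5
    = 44.5 + (1/5) ⋅ 5
    = 44.5 + 1
    = 45.5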
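
The grouped-data calculations can be sketched in Python as follows. This is an illustration under stated assumptions, not the worksheet's own procedure: classes are listed from lowest to highest, every quantile is interpolated upward from the lower class boundary, and the function names `grouped_quantile` and `grouped_mode` are hypothetical. The worksheet's median and mode start from the upper boundary 34.5 with a negative class width; the lower-boundary form used here is algebraically equivalent and returns the same values.

```python
def grouped_quantile(lower_bounds, freqs, width, j, parts):
    """Interpolated j-th quantile out of `parts` for grouped data.

    lower_bounds: lower class boundaries, lowest class first.
    freqs:        class frequencies in the same order.
    width:        class width c.
    """
    n = sum(freqs)
    target = j * n / parts      # e.g. 9n/10 for D9
    cf = 0                      # cumulative frequency below the current class
    for lower, f in zip(lower_bounds, freqs):
        if cf + f >= target:
            return lower + (target - cf) / f * width
        cf += f
    raise ValueError("quantile position exceeds the total frequency")


def grouped_mode(lower_bounds, freqs, width):
    """Mode by interpolation inside the class with the highest frequency."""
    i = max(range(len(freqs)), key=freqs.__getitem__)
    f1 = freqs[i]
    f_below = freqs[i - 1] if i > 0 else 0
    f_above = freqs[i + 1] if i + 1 < len(freqs) else 0
    return lower_bounds[i] + (f1 - f_below) / (2 * f1 - f_below - f_above) * width


lower_bounds = [19.5, 24.5, 29.5, 34.5, 39.5, 44.5, 49.5]
midpoints = [22, 27, 32, 37, 42, 47, 52]
freqs = [11, 24, 30, 18, 11, 5, 1]
width = 5

mean = sum(f * m for f, m in zip(freqs, midpoints)) / sum(freqs)
median = grouped_quantile(lower_bounds, freqs, width, 1, 2)
d9 = grouped_quantile(lower_bounds, freqs, width, 9, 10)
p95 = grouped_quantile(lower_bounds, freqs, width, 95, 100)
mode = grouped_mode(lower_bounds, freqs, width)

print(mean, median, round(d9, 4), p95, round(mode, 2))
```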
MODULE 3 Te

                       2000                            2006
Commodities    Price (po)    Quantity (qo)    Price (pn)    Quantity (qn)
I              400           35               460           45
II             800           10               1000          18
III            300           15               350           12
IV             4000          10               5000          8
V              250           20               300           27

1. Simple Aggregate of Prices for 2006 (2000 → 100)

I2006 = ΣP2006 / ΣP2000 x 100
      = (460 + 1000 + 350 + 5000 + 300) / (400 + 800 + 300 + 4000 + 250) x 100
      = 7110 / 5750 x 100
      = 1.23652174 x 100
      = 123.65

2. Simple Average of Relative Prices for 2006 (2000 → 100)

For each commodity: Icom = P2006 / P2000 x 100

Icom1 = 460 / 400 x 100   = 1.15 x 100        = 115.00
Icom2 = 1000 / 800 x 100  = 1.25 x 100        = 125.00
Icom3 = 350 / 300 x 100   = 1.16666667 x 100  = 116.67
Icom4 = 5000 / 4000 x 100 = 1.25 x 100        = 125.00
Icom5 = 300 / 250 x 100   = 1.2 x 100         = 120.00

I2006 = (115.00 + 125.00 + 116.67 + 125.00 + 120.00) / 5
      = 601.67 / 5
      = 120.33
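
As an illustration of the two unweighted index formulas, here is a small Python sketch that applies them to the price columns of the commodity table. The variable and function names are hypothetical; only the price data comes from the worksheet.

```python
# Prices taken from the Price (po) and Price (pn) columns of the table.
prices_2000 = {"I": 400, "II": 800, "III": 300, "IV": 4000, "V": 250}
prices_2006 = {"I": 460, "II": 1000, "III": 350, "IV": 5000, "V": 300}


def simple_aggregate_index(base_prices, current_prices):
    """Sum of current prices over sum of base prices, times 100."""
    return sum(current_prices.values()) / sum(base_prices.values()) * 100


def simple_average_of_relatives(base_prices, current_prices):
    """Average of the per-commodity price relatives, each times 100."""
    relatives = [current_prices[k] / base_prices[k] * 100 for k in base_prices]
    return sum(relatives) / len(relatives)


aggregate = simple_aggregate_index(prices_2000, prices_2006)
average = simple_average_of_relatives(prices_2000, prices_2006)
print(round(aggregate, 2), round(average, 2))
```

Neither formula uses the quantity columns; quantities enter only in the weighted indexes.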
MODULE 3 Test 2
3. Weighted Aggregate Price Index for 2006 (2000 → 100)

Commodity    poqo     pnqn     poqn     pnqo
I            14000    20700    18000    16100
II           8000     18000    14400    10000
III          4500     4200     3600     5250
IV           40000    40000    32000    50000
V            5000     8100     6750     6000
Totals       71500    91000    74750    87350

Laspeyres' Index (2000 → 100)

IL = Σ(pnqo) / Σ(poqo) x 100
   = 87,350 / 71,500 x 100
   = 1.221678 x 100
   = 122.17

Paasche's Index (2000 → 100)

IP = Σ(pnqn) / Σ(poqn) x 100
   = 91,000 / 74,750 x 100
   = 1.217391 x 100
   = 121.74

Fisher's Ideal Index (2000 → 100)

IF = √(IL x IP)
   = √(122.17 x 121.74)
   = √14872.61 (computed from the unrounded index values)
   = 121.95
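
The three weighted indexes can be reproduced with a few lines of Python. This is a minimal sketch using the prices and quantities from the commodity table; the helper name `weighted_sum` is hypothetical.

```python
from math import sqrt

p0 = [400, 800, 300, 4000, 250]    # 2000 prices (po)
q0 = [35, 10, 15, 10, 20]          # 2000 quantities (qo)
pn = [460, 1000, 350, 5000, 300]   # 2006 prices (pn)
qn = [45, 18, 12, 8, 27]           # 2006 quantities (qn)


def weighted_sum(prices, quantities):
    """Sum of price * quantity over all commodities."""
    return sum(p * q for p, q in zip(prices, quantities))


laspeyres = weighted_sum(pn, q0) / weighted_sum(p0, q0) * 100   # base-year weights
paasche = weighted_sum(pn, qn) / weighted_sum(p0, qn) * 100     # current-year weights
fisher = sqrt(laspeyres * paasche)                              # geometric mean of the two

print(round(laspeyres, 2), round(paasche, 2), round(fisher, 2))
```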
MODULE 3 Test 3

Year    Old Price Index    Revised Price Index
2000    100
2001    102.3
2002    105.3
2003    107.6
2004    111.9
2005    114.2              100
2006                       102.5
2007                       106.4
2008                       108.3
2009                       111.7
2010                       117.8

SOLUTION

Formula:
Divisor       = Old Price Index in the overlap year / 100 = 114.2 / 100 = 1.142
Spliced Index = Old Price Index / Divisor (2000 to 2005)
Spliced Index = Revised Price Index (2006 to 2010)

Year    Old Price Index    Revised Price Index    Divisor    Spliced Index
2000    100                                       1.142      100 / 1.142   = 87.57
2001    102.3                                     1.142      102.3 / 1.142 = 89.58
2002    105.3                                     1.142      105.3 / 1.142 = 92.21
2003    107.6                                     1.142      107.6 / 1.142 = 94.22
2004    111.9                                     1.142      111.9 / 1.142 = 97.99
2005    114.2              100                    1.142      114.2 / 1.142 = 100
2006                       102.5                             102.5 (same as revised index)
2007                       106.4                             106.4 (same as revised index)
2008                       108.3                             108.3 (same as revised index)
2009                       111.7                             111.7 (same as revised index)
2010                       117.8                             117.8 (same as revised index)
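
The splicing rule can be expressed as a short Python sketch, assuming the two series overlap in exactly one year (2005 here). The function name `splice` and the dictionary layout are hypothetical.

```python
def splice(old_index, revised_index, overlap_year):
    """Rescale the old series to the base of the revised series.

    Old values are divided by old[overlap] / revised[overlap];
    revised values are carried over unchanged.
    """
    divisor = old_index[overlap_year] / revised_index[overlap_year]
    spliced = {year: round(value / divisor, 2) for year, value in old_index.items()}
    spliced.update(revised_index)   # revised values win from the overlap year onward
    return spliced


old_index = {2000: 100, 2001: 102.3, 2002: 105.3, 2003: 107.6, 2004: 111.9, 2005: 114.2}
revised_index = {2005: 100, 2006: 102.5, 2007: 106.4, 2008: 108.3, 2009: 111.7, 2010: 117.8}

spliced = splice(old_index, revised_index, 2005)
for year in sorted(spliced):
    print(year, spliced[year])
```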
Module V : No. 3

Given

              Region 1    Region 2    Region 3    Total
Basketball    75          58          48          181
Soccer        33          15          23          71
Volleyball    47          36          29          112
Total         155         109         100         364 (overall total)

Hypothesis
Ho: sport and region are independent
Ha: sport and region are not independent

Formula
Expected Value (EV) = (Total active leagues for the sport in all regions) x (Total active leagues in the region) / Overall total
χ² = Σ (Observed - Expected)² / Expected

Observed    Expected    Obs. - Exp.    (Obs. - Exp.)²    (Obs. - Exp.)² / Exp.
75          77.07       -2.07          4.30              0.06
33          30.23       2.77           7.65              0.25
47          47.69       -0.69          0.48              0.01
58          54.20       3.80           14.44             0.27
15          21.26       -6.26          39.20             1.84
36          33.54       2.46           6.06              0.18
48          49.73       -1.73          2.98              0.06
23          19.51       3.49           12.21             0.63
29          30.77       -1.77          3.13              0.10
                                              χ² =       3.40

Solution

Observed    Expected       Obs. - Exp.    (Obs. - Exp.)²    (Obs. - Exp.)² / Exp.
75          181*155/364    75 - 77.07     -2.07 x -2.07     4.30/77.07
33          71*155/364     33 - 30.23     2.77 x 2.77       7.65/30.23
47          112*155/364    47 - 47.69     -0.69 x -0.69     0.48/47.69
58          181*109/364    58 - 54.20     3.80 x 3.80       14.44/54.20
15          71*109/364     15 - 21.26     -6.26 x -6.26     39.20/21.26
36          112*109/364    36 - 33.54     2.46 x 2.46       6.06/33.54
48          181*100/364    48 - 49.73     -1.73 x -1.73     2.98/49.73
23          71*100/364     23 - 19.51     3.49 x 3.49       12.21/19.51
29          112*100/364    29 - 30.77     -1.77 x -1.77     3.13/30.77

Degrees of freedom = (rows - 1)(columns - 1)
                   = (3 - 1)(3 - 1)
                   = 2 * 2
                   = 4

df = 4                  critical value (table intersection) = 9.49
significance = 0.05     test statistic = 3.40

The test statistic, 3.40, is less than the critical value, 9.49, so the null hypothesis is not rejected.
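
The test of independence can be checked with a plain-Python sketch that needs no external libraries. The function name `chi_square_independence` is hypothetical, and the critical value 9.49 is taken from the worksheet (df = 4, significance 0.05) rather than computed.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table.

    table: list of rows of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


observed = [
    [75, 58, 48],   # Basketball
    [33, 15, 23],   # Soccer
    [47, 36, 29],   # Volleyball
]
CRITICAL_VALUE = 9.49   # from the worksheet: df = 4, significance 0.05

statistic, df = chi_square_independence(observed)
reject_null = statistic > CRITICAL_VALUE
print(round(statistic, 2), df, reject_null)
```

Because the sketch sums unrounded terms, its statistic can differ from the hand calculation in the third decimal place.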
     Player      Age (x)    Points Per Game (y)    xy                      x²                y²
1    Bosh        29         15.1                   29 x 15.1 = 437.9       29 x 29 = 841     15.1 x 15.1 = 228.01
2    Bryant      36         26.7                   36 x 26.7 = 961.2       36 x 36 = 1296    26.7 x 26.7 = 712.89
3    Durant      25         28.9                   25 x 28.9 = 722.5       25 x 25 = 625     28.9 x 28.9 = 835.21
4    James       29         23.4                   29 x 23.4 = 678.6       29 x 29 = 841     23.4 x 23.4 = 547.56
5    Nowitzki    35         19.9                   35 x 19.9 = 696.5       35 x 35 = 1225    19.9 x 19.9 = 396.01
6    Rose        23         27.6                   23 x 27.6 = 634.8       23 x 23 = 529     27.6 x 27.6 = 761.76
7    Wade        32         20.9                   32 x 20.9 = 668.8       32 x 32 = 1024    20.9 x 20.9 = 436.81
Total (n = 7)    209        162.5                  4800.3                  6381              3918.25

Formula

r = (nΣxy - ΣxΣy) / ( √[nΣx² - (Σx)²] x √[nΣy² - (Σy)²] )

Substitution and solution

r = (7(4800.3) - (209)(162.5)) / ( √[7(6381) - (209)²] x √[7(3918.25) - (162.5)²] )
  = (33,602.10 - 33,962.50) / ( √[44,667 - 43,681] x √[27,427.75 - 26,406.25] )
  = -360.40 / ( √986 x √1,021.50 )
  = -360.40 / √1,007,199
  = -360.40 / 1003.6
  = -0.36
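
The correlation coefficient can be verified with a short Python sketch built on the same computational formula. The function name `pearson_r` is hypothetical; the ages and points per game come from the player table.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation using the raw-sums computational formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


ages = [29, 36, 25, 29, 35, 23, 32]
points_per_game = [15.1, 26.7, 28.9, 23.4, 19.9, 27.6, 20.9]

r = pearson_r(ages, points_per_game)
print(round(r, 2))
```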
Formula

b = (nΣxy - ΣxΣy) / (nΣx² - (Σx)²)

Substitution and solution

b = (7(4800.3) - (209)(162.5)) / (7(6381) - (209)²)
  = (33602.1 - 33962.5) / (44667 - 43681)
  = -360.4 / 986
  = -0.37

Formula

a = (Σy - bΣx) / n

Substitution and solution

a = (162.5 - (-0.37)(209)) / 7
  = (162.5 - (-76.3931)) / 7
  = 238.8931 / 7
  = 34.13

Here bΣx = -76.3931 is computed with the unrounded slope, b = -360.4 / 986.

Hence, the equation of the line that best fits the data is:
y = 34.13 - 0.37x
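
A least-squares line can be checked numerically with the sketch below, which uses the slope formula b = (nΣxy - ΣxΣy) / (nΣx² - (Σx)²) and the intercept formula a = (Σy - bΣx) / n. The function name `least_squares_line` is hypothetical, and the data are the ages and points per game from the player table.

```python
def least_squares_line(x, y):
    """Return (a, b) for the fitted line y = a + b * x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)   # slope
    a = (sum_y - b * sum_x) / n                                    # intercept
    return a, b


ages = [29, 36, 25, 29, 35, 23, 32]
points_per_game = [15.1, 26.7, 28.9, 23.4, 19.9, 27.6, 20.9]

a, b = least_squares_line(ages, points_per_game)
print(round(a, 2), round(b, 2))

# A fitted line always passes through the point of means, which is a quick sanity check.
mean_x = sum(ages) / len(ages)
mean_y = sum(points_per_game) / len(points_per_game)
fitted_at_mean = a + b * mean_x
```

The intercept subtracts bΣx from Σy before dividing by n; multiplying the two terms instead produces a value far outside the range of the data.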
