Solution Question Bank Unit-3
Contents: Measures of central tendency, Skewness, Kurtosis, Curve Fitting, Method of least
squares, fitting of straight lines, fitting of second-degree parabola, Exponential curves, Correlation
and Rank correlation, Regression Analysis: Regression lines of y on x and x on y, regression
coefficients, properties of regression coefficients and nonlinear regression.
Course Outcome (CO3): Understand basic statistical concepts such as moments, skewness,
kurtosis, curve fitting, correlation and regression.
Question 1
A cooperative bank has two branches employing 50 and 70 workers respectively. The aver-
age salaries paid by two respective branches are Rs. 360 and Rs. 390 per month. Calculate
the mean of the salaries of all the employees.
Solution:
Note:
To calculate the mean salary of all employees, we use the mean of a composite series: if x̄i
(i = 1, 2, ..., k) are the means of k component series of sizes ni (i = 1, 2, ..., k) respectively, then the
mean x̄ of the composite series is given by the formula:
\[ \bar{x} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + \dots + n_k \bar{x}_k}{n_1 + n_2 + \dots + n_k} = \frac{\sum_i n_i \bar{x}_i}{\sum_i n_i} \]
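Let n1 and n2 denote the numbers of workers in the two branches, and let x̄1 and x̄2 denote
their respective average salaries (in rupees). Let x̄ denote the average salary of all the workers
of the bank.
We are given that:
x̄1 = 360, x̄2 = 390, n1 = 50, n2 = 70
Mean Salary:
\[ \bar{x} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2} = \frac{50 \times 360 + 70 \times 390}{50 + 70} = \frac{45300}{120} = 377.5 \]
Hence the mean salary of all the employees is Rs. 377.50 per month.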
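As a quick numerical check of the composite-mean formula, here is a minimal Python sketch. The function name combined_mean is an illustrative choice and not part of the source.

```python
def combined_mean(sizes, means):
    """Mean of a composite series: sum(n_i * mean_i) / sum(n_i)."""
    if len(sizes) != len(means) or not sizes:
        raise ValueError("sizes and means must be non-empty and of equal length")
    total = sum(n * m for n, m in zip(sizes, means))
    return total / sum(sizes)

# Two branches with 50 and 70 workers, average salaries Rs. 360 and Rs. 390.
print(combined_mean([50, 70], [360, 390]))  # 377.5
```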
Solution:
1. Arrange the numbers in ascending order:
2. Count the total number of data points: The total number of data points (n) is 7, which is
odd.
3. Find the position of the median: For an odd number of data points, the median is the value
at the position:
\[ \text{Median Position} = \frac{n+1}{2} \]
Substituting n = 7:
\[ \text{Median Position} = \frac{7+1}{2} = 4 \]
4. The 4th number in the dataset is 10.
4, 6, 5, 7, 9, 8, 10, 4, 7, 6, 5, 8, 7, 7, 9
Solution:
1. Arrange the data and count the frequency of each number:
x           4   5   6   7   8   9   10
Frequency   2   2   2   4   2   2   1
2. Identify the mode: The mode is the number that appears the most frequently. Here, 7 appears
4 times, which is more than any other number.
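The frequency count can be reproduced with a short Python sketch using the standard-library Counter. The variable names are illustrative, and the data list is the one given just before this solution.

```python
from collections import Counter

data = [4, 6, 5, 7, 9, 8, 10, 4, 7, 6, 5, 8, 7, 7, 9]

counts = Counter(data)                      # value -> frequency
highest = max(counts.values())              # largest frequency
modes = sorted(v for v, c in counts.items() if c == highest)

print(counts[7], modes)  # 4 [7]
```

Collecting every value that reaches the highest frequency keeps the sketch correct for multimodal data as well.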
x 1 2 3 4 5 6 7
f 5 9 12 17 14 10 6
Solution:
The formula for the arithmetic mean is:
\[ \text{Arithmetic Mean} = \frac{\sum f x}{\sum f} \]
Step 1: Compute f x for each value of x, and the column totals:
x f fx
1 5 5
2 9 18
3 12 36
4 17 68
5 14 70
6 10 60
7 6 42
Sum ∑ f = 73 ∑ f x = 299
Step 2: Compute the Arithmetic Mean
\[ \text{Arithmetic Mean} = \frac{\sum f x}{\sum f} = \frac{299}{73} \approx 4.10 \]
Final Answer: The arithmetic mean is approximately 4.10.
Question 5
In an asymmetrical distribution, the mean is 16 and the median is 20. Calculate the mode of
the distribution.
Solution:
The empirical relationship between the mean, median, and mode is given by:
\[ \text{Mode} = 3 \times \text{Median} - 2 \times \text{Mean} \]
Given:
Mean = 16, Median = 20
Substitute the values:
Mode = 3 × 20 − 2 × 16
Mode = 60 − 32 = 28
Final Answer: The mode of the distribution is 28.
Question 6
The first three central moments of a distribution are 0, 15, -31. Find the moment
coefficient of skewness. (2019-20, 2 Marks)
Solution:
Given:
µ1 = 0, µ2 = 15, µ3 = −31
The formula for the moment coefficient of skewness is:
\[ \gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} \]
Hence
\[ \gamma_1 = \frac{-31}{\sqrt{15^3}} = \frac{-31}{58.09} \approx -0.534 \]
The moment coefficient of skewness is approximately:
γ1 ≈ −0.534
Question 7
The first two moments of a distribution about the value ‘2’ of the variable are 1, 16. Show
that mean is 3, and variance is 15. (2020-2021 2 marks)
Solution:
Given that
A = 2, µ1′ = 1 and µ2′ = 16
Now the mean is
\[ \bar{x} = \mu_1' + A = 1 + 2 = 3 \]
The variance is σ² = µ2, where
\[ \mu_2 = \mu_2' - (\mu_1')^2 = 16 - (1)^2 = 15 \]
Hence
variance = 15
Question 8
The fourth central moment is µ4 = 48. What must be its standard deviation (σ ) in order for
the distribution to be mesokurtic?
Solution:
The kurtosis (β2) is given by:
\[ \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{\mu_4}{\sigma^4} \qquad (\mu_2 = \sigma^2) \]
Step 1: Substitute the known values: For a mesokurtic distribution, β2 = 3, and µ4 = 48.
Substituting these into the formula:
\[ 3 = \frac{48}{\sigma^4} \]
Step 2: Solve for σ⁴:
\[ \sigma^4 = \frac{48}{3} = 16 \]
Step 3: Solve for σ: Taking the fourth root (or square root twice) of both sides:
\[ \sigma = \sqrt[4]{16} = \sqrt{4} = 2 \]
Question 9
Write the normal equations to fit the curve y = ax2 + b by the method of least squares.
Solution:
Let
\[ y = ax^2 + b \tag{1} \]
be the equation of best fit to a set of n points (xi, yi), i = 1, 2, ..., n. Using the principle of least
squares, we have to determine the constants a and b so that
\[ E = \sum_{i=1}^{n} (y_i - a x_i^2 - b)^2 \]
is minimum. Equating to zero the partial derivatives of E with respect to a and b separately, we get
\[ \frac{\partial E}{\partial a} = -2 \sum_{i=1}^{n} x_i^2 (y_i - a x_i^2 - b) = 0 \]
\[ \frac{\partial E}{\partial b} = -2 \sum_{i=1}^{n} (y_i - a x_i^2 - b) = 0 \]
which give the normal equations for estimating a and b:
\[ \sum_{i=1}^{n} x_i^2 y_i = a \sum_{i=1}^{n} x_i^4 + b \sum_{i=1}^{n} x_i^2 \]
\[ \sum_{i=1}^{n} y_i = a \sum_{i=1}^{n} x_i^2 + n b \]
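To see the two normal equations for y = ax² + b in action, the following Python sketch solves them by Cramer's rule. The function name fit_ax2_plus_b and the sample points are made up for illustration; the points lie exactly on y = 2x² + 1, so the fit should recover a = 2 and b = 1.

```python
def fit_ax2_plus_b(xs, ys):
    """Least-squares fit of y = a*x**2 + b via the two normal equations."""
    n = len(xs)
    s_x2 = sum(x ** 2 for x in xs)
    s_x4 = sum(x ** 4 for x in xs)
    s_y = sum(ys)
    s_x2y = sum(x ** 2 * y for x, y in zip(xs, ys))
    # Normal equations:
    #   s_x2y = a * s_x4 + b * s_x2
    #   s_y   = a * s_x2 + n * b
    det = s_x4 * n - s_x2 * s_x2
    if det == 0:
        raise ValueError("normal equations are singular")
    a = (s_x2y * n - s_x2 * s_y) / det
    b = (s_x4 * s_y - s_x2 * s_x2y) / det
    return a, b

# Hypothetical points lying exactly on y = 2x^2 + 1
a, b = fit_ax2_plus_b([0, 1, 2, 3], [1, 3, 9, 19])
print(a, b)  # 2.0 1.0
```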
Question 10
Write the formula for Karl Pearson’s correlation coefficient and state the range of the corre-
lation coefficient.
Solution:
Karl Pearson Correlation Coefficient Formula:
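The Karl Pearson correlation coefficient (r) is given by the formula:
\[ r = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y} \]
or
\[ r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}} \]
or
\[ r = \frac{n \sum xy - \sum x \sum y}{\sqrt{n \sum x^2 - (\sum x)^2}\,\sqrt{n \sum y^2 - (\sum y)^2}} \]
where:
• x and y are the individual data points of the two variables X and Y,
• x̄ and ȳ are their means and n is the number of pairs of observations.
Range of the correlation coefficient:
\[ -1 \le r \le 1 \]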
Solution:
The formula for the coefficient of correlation (r) is:
\[ r = \frac{\operatorname{Cov}(x, y)}{\sigma_x \sigma_y} \]
Given:
• Cov(x, y) = 10,
• variance σx² = 16, so σx = √16 = 4,
• variance σy² = 9, so σy = √9 = 3.
Hence:
\[ r = \frac{10}{4 \times 3} = \frac{10}{12} = \frac{5}{6} \approx 0.833 \]
Hence the coefficient of correlation is r ≈ 0.833.
Question 12
The lines of regression of y on x and x on y are respectively:
Solution
Given:
• Line of regression of y on x: y = x + 5
Slope of this line (byx ) = 1.
Solution:
The coefficient of correlation in terms of the regression coefficients is:
\[ r = \pm\sqrt{b_{yx} \cdot b_{xy}} \]
Given
byx = 0.8, bxy = 0.2
We get
\[ r = \pm\sqrt{0.8 \times 0.2} = \pm\sqrt{0.16} = \pm 0.4 \]
The Sign of r:
The sign of r is the common sign of the regression coefficients. Since both byx and bxy are
positive, the correlation coefficient is positive:
r = 0.4
Question 14
If the regression coefficients are byx = 0.8 and bxy = 0.8, find the value of the coefficient of
correlation (r).
Solution:
The formula for the coefficient of correlation is:
\[ r = \pm\sqrt{b_{yx} \cdot b_{xy}} \]
Given:
byx = 0.8, bxy = 0.8
Substitute these values into the formula:
\[ r = \pm\sqrt{0.8 \times 0.8} = \pm\sqrt{0.64} = \pm 0.8 \]
The Sign of r:
The sign of r is the common sign of the regression coefficients. Since both byx and bxy are
positive, the correlation coefficient is positive:
r = 0.8
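The rule r = ±√(byx · bxy), with the sign taken from the regression coefficients, can be sketched in Python as follows. The function name and the validity checks are illustrative assumptions; the checks reflect the facts that the two coefficients share a sign and that their product cannot exceed 1.

```python
import math

def correlation_from_regression(b_yx, b_xy):
    """Return r = ±sqrt(b_yx * b_xy), signed like the regression coefficients."""
    product = b_yx * b_xy
    if product < 0:
        raise ValueError("regression coefficients must have the same sign")
    if product > 1:
        raise ValueError("b_yx * b_xy cannot exceed 1")
    r = math.sqrt(product)
    return -r if b_yx < 0 else r

print(round(correlation_from_regression(0.8, 0.2), 4))  # 0.4
print(round(correlation_from_regression(0.8, 0.8), 4))  # 0.8
```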
Question 15
What is the relation between the regression coefficients and the coefficient of Correlation?
Spearman’s Rank Correlation with Tied Ranks
In the case of tied ranks, the formula for Spearman's rank correlation coefficient (rs) is:
\[ r_s = 1 - \frac{6\left\{\sum d_i^2 + \frac{1}{12}\sum m_i (m_i^2 - 1)\right\}}{n(n^2 - 1)} \]
Where:
mi is the number of times a tied rank is repeated (one term for each group of ties).
Question Description (7 Marks)
Question 18
Calculate the first four central moments and also comment upon Skewness and Kurtosis
from the following data:
Solution:
Given Data:
∑( f x)
x̄ =
∑f
210
x̄ = = 21
10
Calculation of Central Moments, Skewness, and Kurtosis
Class Interval   f    x    (x − x̄)   f(x − x̄)   f(x − x̄)²   f(x − x̄)³   f(x − x̄)⁴
0–10             1    5    −16       −16        256         −4096       65536
10–20            4    15   −6        −24        144         −864        5184
20–30            3    25   4         12         48          192         768
30–40            2    35   14        28         392         5488        76832
Total            10                  0          840         720         148320
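First Central Moment:
\[ \mu_1 = \frac{\sum f (x - \bar{x})}{\sum f} = 0 \]
Second, third and fourth central moments:
\[ \mu_2 = \frac{\sum f (x - \bar{x})^2}{\sum f} = \frac{840}{10} = 84 \]
\[ \mu_3 = \frac{\sum f (x - \bar{x})^3}{\sum f} = \frac{720}{10} = 72 \]
\[ \mu_4 = \frac{\sum f (x - \bar{x})^4}{\sum f} = \frac{148320}{10} = 14832 \]
Hence
\[ \text{Skewness: } \beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{72^2}{84^3} = \frac{5184}{592704} \approx 0.0087 \]
\[ \text{Kurtosis: } \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{14832}{84^2} = \frac{14832}{7056} \approx 2.102 \]
Skewness: Since µ3 > 0 and β1 is close to zero, the distribution is slightly positively skewed,
meaning the right tail is a little longer than the left.
Kurtosis: Since β2 < 3, the distribution has a flatter peak than the normal distribution,
i.e. it is platykurtic.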
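The tabular calculation of central moments can be cross-checked with a small Python sketch. The function name central_moments is illustrative; the class mid-points and frequencies are those of the table in this solution.

```python
def central_moments(values, freqs):
    """Return (mean, mu2, mu3, mu4) of a frequency distribution."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n

    def mu(r):
        # r-th central moment: sum f*(x - mean)^r / sum f
        return sum(f * (x - mean) ** r for x, f in zip(values, freqs)) / n

    return mean, mu(2), mu(3), mu(4)

# Class mid-points 5, 15, 25, 35 with frequencies 1, 4, 3, 2
mean, mu2, mu3, mu4 = central_moments([5, 15, 25, 35], [1, 4, 3, 2])
beta1 = mu3 ** 2 / mu2 ** 3   # measure of skewness
beta2 = mu4 / mu2 ** 2        # measure of kurtosis
print(mean, mu2, mu3, mu4)                 # 21.0 84.0 72.0 14832.0
print(round(beta1, 4), round(beta2, 3))    # 0.0087 2.102
```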
Question 19
Calculate the first four central moments about the mean, Skewness, and Kurtosis for the
following data (2021-22):
x 0 1 2 3 4 5 6 7 8
f 1 8 28 56 70 56 28 8 1
Solution:
Given Data:
x 0 1 2 3 4 5 6 7 8
f 1 8 28 56 70 56 28 8 1
The mean is
\[ \bar{x} = \frac{\sum f x}{\sum f} \]
x   f   fx
0 1 0
1 8 8
2 28 56
3 56 168
4 70 280
5 56 280
6 28 168
7 8 56
8 1 8
Total 256 1024
\[ \bar{x} = \frac{1024}{256} = 4 \]
Now for central moments
x   f   (x − x̄)   (x − x̄)²   f(x − x̄)²   f(x − x̄)³   f(x − x̄)⁴
0 1 −4 16 16 −64 256
1 8 −3 9 72 −216 648
2 28 −2 4 112 −224 448
3 56 −1 1 56 −56 56
4 70 0 0 0 0 0
5 56 1 1 56 56 56
6 28 2 4 112 224 448
7 8 3 9 72 216 648
8 1 4 16 16 64 256
Total 256 0 512 0 2816
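\[ \mu_1 = \frac{\sum f (x - \bar{x})}{\sum f} = \frac{0}{256} = 0 \]
\[ \mu_2 = \frac{\sum f (x - \bar{x})^2}{\sum f} = \frac{512}{256} = 2 \]
\[ \mu_3 = \frac{\sum f (x - \bar{x})^3}{\sum f} = \frac{0}{256} = 0 \]
\[ \mu_4 = \frac{\sum f (x - \bar{x})^4}{\sum f} = \frac{2816}{256} = 11 \]
Hence
\[ \text{Skewness: } \gamma_1 = \sqrt{\beta_1} = \frac{\mu_3}{\mu_2^{3/2}} = 0 \]
\[ \text{Kurtosis: } \gamma_2 = \beta_2 - 3 = \frac{\mu_4}{\mu_2^2} - 3 = \frac{11}{2^2} - 3 = 2.75 - 3 = -0.25 \]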
Question 20
Compute Skewness and Kurtosis, if the first four moments of a frequency distribution about
the value 4 of the variable are 1, 4, 10, and 45.
Solution:
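We are given the first four moments about the value A = 4:
µ1′ = 1, µ2′ = 4, µ3′ = 10, µ4′ = 45
The central moments are:
µ2 = µ2′ − (µ1′)² = 4 − 1 = 3
µ3 = µ3′ − 3µ2′µ1′ + 2(µ1′)³ = 10 − 12 + 2 = 0
µ4 = µ4′ − 4µ3′µ1′ + 6µ2′(µ1′)² − 3(µ1′)⁴ = 45 − 40 + 24 − 3 = 26
Hence
Skewness (γ1):
\[ \gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} = \frac{0}{(3)^{3/2}} = 0 \]
Since γ1 = 0, the distribution is symmetric.
Kurtosis:
\[ \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{26}{(3)^2} = \frac{26}{9} \approx 2.89 \]
The kurtosis is slightly lower than the normal value of 3, indicating a distribution that is
slightly platykurtic but close to normal.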
Question 21
The first four moments of a frequency distribution about the value 4 of the variable are -1.5,
17,-30 and 80. Find µ1 , µ2 , µ3 , µ4 about mean. Also find β1 and β2 .
Solution
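We are given the first four moments about the value A = 4:
µ1′ = −1.5, µ2′ = 17, µ3′ = −30, µ4′ = 80
The formulae for central moments (µr) in terms of moments about A (µr′) are:
\[ \mu_1 = 0, \quad \mu_2 = \mu_2' - (\mu_1')^2, \quad \mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2(\mu_1')^3, \quad \mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'(\mu_1')^2 - 3(\mu_1')^4 \]
Step 1: Calculate Central Moments
1. Second Central Moment (µ2):
µ2 = µ2′ − (µ1′)²
µ2 = 17 − (−1.5)²
µ2 = 17 − 2.25 = 14.75
2. Third Central Moment (µ3):
µ3 = −30 − 3(17)(−1.5) + 2(−1.5)³
µ3 = −30 + 76.5 − 6.75 = 39.75
3. Fourth Central Moment (µ4):
µ4 = 80 − 4(−30)(−1.5) + 6(17)(−1.5)² − 3(−1.5)⁴
µ4 = 80 − 180 + 229.5 − 15.1875 = 114.3125
Step 2: Calculate β1 and β2
\[ \beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{39.75^2}{14.75^3} = \frac{1580.0625}{3209.0469} \approx 0.492 \]
\[ \beta_2 = \frac{\mu_4}{\mu_2^2} \]
Substitute µ4 = 114.3125 and µ2 = 14.75:
\[ \beta_2 = \frac{114.3125}{(14.75)^2} = \frac{114.3125}{217.5625} \approx 0.525 \]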
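The conversion from moments about an arbitrary point A to central moments is mechanical, so a short Python sketch is a convenient way to check the arithmetic. The function name central_from_raw is illustrative; the formulas are the ones quoted in this solution.

```python
def central_from_raw(m1, m2, m3, m4):
    """Convert moments about a point A into central moments (mu2, mu3, mu4)."""
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return mu2, mu3, mu4

# Moments about A = 4: -1.5, 17, -30, 80
mu2, mu3, mu4 = central_from_raw(-1.5, 17, -30, 80)
beta1 = mu3 ** 2 / mu2 ** 3
beta2 = mu4 / mu2 ** 2
print(mu2, mu3, mu4)                      # 14.75 39.75 114.3125
print(round(beta1, 3), round(beta2, 3))   # 0.492 0.525
```

The same function reproduces the central moments 16, −64, 162 for moments 2, 20, 40, 50 and 1.5, 0, 6 for moments 1, 2.5, 5.5, 16.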
Question 22
The first four moments of a frequency distribution about the value 2 of the variable are 2,
20, 40 and 50 respectively. Comment upon the skewness and kurtosis of the distribution.
Solution
Analysis of Skewness and Kurtosis
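The first four moments about A = 2 are given as:
µ1′ = 2, µ2′ = 20, µ3′ = 40, µ4′ = 50.
Central Moments
The central moments are calculated using the formula (with µ0′ = 1):
\[ \mu_r = \sum_{k=0}^{r} \binom{r}{k} (-1)^{r-k} \mu_k' (\mu_1')^{r-k} \]
which gives
µ1 = 0,
µ2 = µ2′ − (µ1′)² = 20 − 4 = 16,
µ3 = µ3′ − 3µ2′µ1′ + 2(µ1′)³ = 40 − 120 + 16 = −64,
µ4 = µ4′ − 4µ3′µ1′ + 6µ2′(µ1′)² − 3(µ1′)⁴ = 50 − 320 + 480 − 48 = 162.
Skewness (γ1)
Skewness is calculated as:
\[ \gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} = \frac{-64}{(16)^{3/2}} = -1 \]
Interpretation: Since γ1 is negative, the distribution is negatively skewed.
Kurtosis (β2)
Kurtosis is calculated as:
\[ \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{162}{16^2} = \frac{162}{256} \approx 0.6328 \]
Excess kurtosis is:
γ2 = β2 − 3 = 0.6328 − 3 ≈ −2.367.
Interpretation: The negative excess kurtosis indicates that the distribution is platykurtic (flatter
than a normal distribution).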
Question 23
The first four moments of a frequency distribution about the value 5 of the variable are 1, 2.5,
5.5 and 16 respectively.Find the four central moments, moments about origin and coefficient
of skewness.
Solution:
Given:
µ1′ = 1,
µ2′ = 2.5,
µ3′ = 5.5,
µ4′ = 16.
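The value of A = 5. The mean is given by:
x̄ = A + µ1′ = 5 + 1 = 6.
The central moments are:
µ2 = µ2′ − (µ1′)² = 2.5 − 1 = 1.5,
µ3 = µ3′ − 3µ2′µ1′ + 2(µ1′)³ = 5.5 − 7.5 + 2 = 0,
µ4 = µ4′ − 4µ3′µ1′ + 6µ2′(µ1′)² − 3(µ1′)⁴ = 16 − 22 + 15 − 3 = 6.
Thus, the central moments are:
µ1 = 0, µ2 = 1.5, µ3 = 0, µ4 = 6.
The moments about the origin (written νr here to distinguish them from the moments about A) are:
ν1 = x̄ = 6,
ν2 = µ2 + x̄² = 1.5 + 6² = 37.5,
ν3 = µ3 + 3µ2 x̄ + x̄³ = 0 + 3(1.5)(6) + 6³ = 243,
ν4 = µ4 + 4µ3 x̄ + 6µ2 x̄² + x̄⁴ = 6 + 4(0)(6) + 6(1.5)(6²) + 6⁴ = 1626.
Hence
The coefficient of skewness γ1 is given by:
\[ \gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} = \frac{0}{(1.5)^{3/2}} = 0 \]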
Solution
We are given the following frequency distribution:
\[ \bar{x} = \frac{\sum f x}{\sum f} \]
We first calculate f x:
Class Interval f Mid Point(x) fx
10 − 20 18 15 270
20 − 30 20 25 500
30 − 40 30 35 1050
40 − 50 22 45 990
50 − 60 10 55 550
Total 100 3360
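Thus, the mean is:
\[ \bar{x} = \frac{3360}{100} = 33.6 \]
Step 2: Calculate Moments
\[ \mu_2 = \frac{\sum f (x - \bar{x})^2}{\sum f} = \frac{15204}{100} = 152.04 \]
\[ \mu_3 = \frac{\sum f (x - \bar{x})^3}{\sum f} = \frac{2131.2}{100} = 21.312 \]
\[ \mu_4 = \frac{\sum f (x - \bar{x})^4}{\sum f} = \frac{4732752}{100} = 47327.52 \]
Hence, Skewness
\[ \gamma_1 = \sqrt{\beta_1} = \frac{\mu_3}{\mu_2^{3/2}} = \frac{21.312}{(152.04)^{3/2}} \approx 0.011 \]
Kurtosis
\[ \gamma_2 = \beta_2 - 3 = \frac{\mu_4}{\mu_2^2} - 3 = \frac{47327.52}{(152.04)^2} - 3 \approx -0.953 \]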
Question 25
Find the coefficient of correlation from the following points of observation
(1,3),(2,2),(3,5),(4,4),(5,6).
Solution:
To find the coefficient of correlation r for the given points of observation, we use the Pearson
correlation coefficient formula:
\[ r = \frac{n \sum xy - \sum x \sum y}{\sqrt{[n \sum x^2 - (\sum x)^2][n \sum y^2 - (\sum y)^2]}} \]
x y xy x2 y2
1 3 3 1 9
2 2 4 4 4
3 5 15 9 25
4 4 16 16 16
5 6 30 25 36
∑x = 15, ∑y = 20, ∑xy = 68, ∑x² = 55, ∑y² = 90
n = 5 (number of points)
Now, substitute the values into the formula for the Pearson correlation coefficient:
\[ r = \frac{5 \times 68 - 15 \times 20}{\sqrt{[5 \times 55 - (15)^2][5 \times 90 - (20)^2]}} \]
Simplifying each part:
\[ r = \frac{340 - 300}{\sqrt{[275 - 225][450 - 400]}} = \frac{40}{\sqrt{50 \times 50}} = \frac{40}{\sqrt{2500}} = \frac{40}{50} = 0.8 \]
Answer: The coefficient of correlation r is 0.8. This indicates a strong positive correlation between
X and Y.
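The computational form of the Pearson formula translates directly into Python. The sketch below uses only the standard library; the function name pearson_r is an illustrative choice, and the points are the five observations of this question.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient from raw sums."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    num = n * sxy - sx * sy
    den = math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
    return num / den

points = [(1, 3), (2, 2), (3, 5), (4, 4), (5, 6)]
xs, ys = zip(*points)
print(pearson_r(xs, ys))  # 0.8
```

Applied to the eight (X, Y) pairs 65/67, 66/68, 67/65, 67/68, 68/72, 69/72, 70/69, 72/71, the same function returns approximately 0.603.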
Question 26
A random sample of 5 college students is selected and their grades in Mathematics and
Statistics are found to be:
Students 1 2 3 4 5
Mathematics 85 60 73 40 90
Statistics 93 75 65 50 80
Calculate the rank correlation coefficient.
Solution:
Spearman’s Rank Correlation Coefficient
The formula for the rank correlation coefficient ρ is:
6 ∑ di2
ρ = 1−
n(n2 − 1)
Where: n is the number of data points (in this case, n = 5), di is the difference between the ranks
of corresponding values of Mathematics and Statistics for each student.
Arrange the X and Y series in descending order and assign ranks starting from 1 (the highest mark gets rank 1):
X-Series: 90 85 73 60 40
Rank: 1 2 3 4 5
Y-Series: 93 80 75 65 50
Rank: 1 2 3 4 5
Calculate the Differences in Ranks and Square Them Now, we calculate di = RankX − RankY and
di2 :
Student RankX RankY di = RankX − RankY di2
1 2 1 1 1
2 4 3 1 1
3 3 4 −1 1
4 5 5 0 0
5 1 2 −1 1
Total: ∑ di² = 4
Now, substitute into the formula:
\[ \rho = 1 - \frac{6 \times 4}{5(5^2 - 1)} = 1 - \frac{24}{5(25 - 1)} = 1 - \frac{24}{5 \times 24} = 1 - \frac{24}{120} = 1 - 0.2 = 0.8 \]
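The ranking and the formula ρ = 1 − 6∑d²/(n(n² − 1)) can be checked with the Python sketch below. The helper names are illustrative, and the ranking helper assumes there are no tied values (ties would need the corrected formula for tied ranks).

```python
def ranks_descending(values):
    """Rank 1 for the largest value; assumes all values are distinct."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]

def spearman_rho(xs, ys):
    """Spearman's rank correlation for data without ties."""
    n = len(xs)
    rx, ry = ranks_descending(xs), ranks_descending(ys)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

maths = [85, 60, 73, 40, 90]
stats = [93, 75, 65, 50, 80]
print(ranks_descending(maths), ranks_descending(stats))  # [2, 4, 3, 5, 1] [1, 3, 4, 5, 2]
print(round(spearman_rho(maths, stats), 4))              # 0.8
```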
Solution:
The Pearson correlation coefficient r is given by the formula:
\[ r = \frac{n \sum xy - \sum x \sum y}{\sqrt{[n \sum x^2 - (\sum x)^2][n \sum y^2 - (\sum y)^2]}} \]
We are given the data for 8 students, so n = 8.
X Y X2 Y2 XY
65 67 4225 4489 4355
66 68 4356 4624 4488
67 65 4489 4225 4355
67 68 4489 4624 4556
68 72 4624 5184 4896
69 72 4761 5184 4968
70 69 4900 4761 4830
72 71 5184 5041 5112
∑x = 544, ∑y = 552, ∑x² = 37028, ∑y² = 38132, ∑xy = 37560
Now, substitute the values into the formula:
\[ r = \frac{8 \times 37560 - 544 \times 552}{\sqrt{[8 \times 37028 - (544)^2][8 \times 38132 - (552)^2]}} = \frac{192}{\sqrt{288 \times 352}} \approx 0.603 \]
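SHORT-CUT METHOD
Let U = X − 68 and V = Y − 69.
X    Y    U = X − 68   V = Y − 69   U²   V²   UV
65   67   −3           −2           9    4    6
66   68   −2           −1           4    1    2
67   65   −1           −4           1    16   4
67   68   −1           −1           1    1    1
68   72   0            3            0    9    0
69   72   1            3            1    9    3
70   69   2            0            4    0    0
72   71   4            2            16   4    8
∑X = 544, ∑Y = 552, ∑U = 0, ∑V = 0, ∑U² = 36, ∑V² = 44, ∑UV = 24
\[ \bar{U} = \frac{1}{n}\sum U = 0, \qquad \bar{V} = \frac{1}{n}\sum V = 0 \]
\[ \operatorname{Cov}(U, V) = \frac{1}{n}\sum UV - \bar{U}\bar{V} = \frac{1}{8} \times 24 = 3 \]
\[ \sigma_U^2 = \frac{1}{n}\sum U^2 - \bar{U}^2 = \frac{1}{8} \times 36 = 4.5 \]
\[ \sigma_V^2 = \frac{1}{n}\sum V^2 - \bar{V}^2 = \frac{1}{8} \times 44 = 5.5 \]
Since the correlation coefficient is unaffected by a change of origin,
\[ r = \frac{\operatorname{Cov}(U, V)}{\sigma_U \sigma_V} = \frac{3}{\sqrt{4.5 \times 5.5}} \approx 0.603 \]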
Question 28
Fit a parabolic curve of second degree to the following data:
x 0 1 2 3 4
y 1 1.8 1.3 2.5 6.3
Solution: The equation of the curve is y = a + bx + cx2 . The normal equations are:
∑ y = n · a + b ∑ x + c ∑ x2
∑ xy = a ∑ x + b ∑ x2 + c ∑ x3
∑ x2 y = a ∑ x2 + b ∑ x3 + c ∑ x4
x y x2 x3 x4 xy x2 y
0 1 0 0 0 0 0
1 1.8 1 1 1 1.8 1.8
2 1.3 4 8 16 2.6 5.2
3 2.5 9 27 81 7.5 22.5
4 6.3 16 64 256 25.2 100.8
∑ x = 10 ∑ y = 12.9 ∑ x2 = 30 ∑ x3 = 100 ∑ x4 = 354 ∑ xy = 37.1 ∑ x2 y = 130.3
Using normal equations
12.9 = 5a + 10b + 30c,
37.1 = 10a + 30b + 100c,
130.3 = 30a + 100b + 354c.
Solving these equations, we get a = 1.42, b = −1.07 and c = 0.55, so the fitted parabola is
y = 1.42 − 1.07x + 0.55x²
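The three normal equations of the second-degree parabola can be built and solved programmatically. The following Python sketch is one way to do it with the standard library only; solve_linear and fit_parabola are illustrative names, and Gauss-Jordan elimination is simply one convenient choice of solver.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_parabola(xs, ys):
    """Least-squares fit of y = a + b*x + c*x**2 via the normal equations."""
    s = lambda p: sum(x ** p for x in xs)                    # sum of x^p
    t = lambda p: sum(x ** p * y for x, y in zip(xs, ys))    # sum of x^p * y
    lhs = [[len(xs), s(1), s(2)],
           [s(1), s(2), s(3)],
           [s(2), s(3), s(4)]]
    rhs = [t(0), t(1), t(2)]
    return solve_linear(lhs, rhs)

a, b, c = fit_parabola([0, 1, 2, 3, 4], [1, 1.8, 1.3, 2.5, 6.3])
print(round(a, 2), round(b, 2), round(c, 2))  # 1.42 -1.07 0.55
```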
Question 29
Use the method of least squares to find the curve y = ab^x that best fits the following data:
X   2     3      4      5      6
Y   8.3   15.4   33.1   65.2   127.4
Solution: We assume the equation is of the form y = ab^x. Taking the common (base-10)
logarithm of both sides:
\[ \log y = \log(ab^x) = \log a + x \log b \]
Let Y = log y, A = log a, and B = log b, so the equation becomes:
Y = A + Bx
Now, we apply the method of least squares to the transformed equation.
The normal equations are:
\[ \sum Y = nA + B \sum x \]
\[ \sum xY = A \sum x + B \sum x^2 \]
We compute the required sums:
x   y       Y        xY        x²
2   8.3     0.9191   1.8382    4
3   15.4    1.1875   3.5626    9
4   33.1    1.5198   6.0793    16
5   65.2    1.8142   9.0712    25
6   127.4   2.1052   12.6310   36
∑x = 20, ∑Y = 7.5458, ∑xY = 33.1823, ∑x² = 90
7.5458 = 5A + 20B
33.1823 = 20A + 90B
Solving these, we find
A = 0.3095, B = 0.2999.
Thus,
a = 10^A = 2.0395
b = 10^B = 1.9948
Therefore, the best-fitting curve is:
y = 2.0395 (1.9948)^x
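The log-linearisation used for y = ab^x can be sketched in Python as follows. The function name fit_ab_pow_x is illustrative. Because the sketch keeps full floating-point precision instead of four-decimal logarithms, its results can differ from the hand calculation in the last digit or two.

```python
import math

def fit_ab_pow_x(xs, ys):
    """Fit y = a * b**x by least squares on log10(y) = log10(a) + x*log10(b)."""
    n = len(xs)
    logs = [math.log10(y) for y in ys]
    sx, sy = sum(xs), sum(logs)
    sxy = sum(x * v for x, v in zip(xs, logs))
    sxx = sum(x * x for x in xs)
    big_b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)   # B = log10(b)
    big_a = (sy - big_b * sx) / n                       # A = log10(a)
    return 10 ** big_a, 10 ** big_b

a, b = fit_ab_pow_x([2, 3, 4, 5, 6], [8.3, 15.4, 33.1, 65.2, 127.4])
print(round(a, 3), round(b, 3))
```

The same function can be reused for any data set of this form, for example the data 144, 172.8, 207.4, 248.8, 298.5 at x = 2, ..., 6, for which it returns values close to a = 100 and b = 1.2.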
Question 30
Use the method of least squares to find the curve y = ab^x that best fits the following data:
x   2     3       4       5       6
y   144   172.8   207.4   248.8   298.5
Solution: We assume the equation is of the form y = ab^x. Taking the common (base-10)
logarithm of both sides:
\[ \log y = \log a + x \log b \]
Let Y = log y, A = log a and B = log b, so that
Y = A + Bx
Now, we apply the method of least squares to the transformed equation.
The normal equations are:
\[ \sum Y = nA + B \sum x \]
\[ \sum xY = A \sum x + B \sum x^2 \]
x   y       Y        xY        x²
2   144     2.1584   4.3167    4
3   172.8   2.2375   6.7126    9
4   207.4   2.3168   9.2672    16
5   248.8   2.3959   11.9793   25
6   298.5   2.4749   14.8497   36
∑x = 20, ∑Y = 11.5835, ∑xY = 47.1255, ∑x² = 90
We solve the system of equations:
11.5835 = 5A + 20B
47.1255 = 20A + 90B
which gives B ≈ 0.0791 and A ≈ 2.0001, so that a = 10^A ≈ 100.0 and b = 10^B ≈ 1.2.
Therefore, the best-fitting curve is:
y = 100.0 (1.2)^x
Problem 14:
Use the method of least squares to fit the curve y = ax² + bx to the following data:
x 1 2 3 4 5 6 7 8
y 1 1.2 1.8 2.5 3.6 4.7 6.6 9.1
Solution:
The normal equations of y = ax² + bx are:
\[ \sum x^2 y = a \sum x^4 + b \sum x^3 \tag{1} \]
\[ \sum xy = a \sum x^3 + b \sum x^2 \tag{2} \]
x y x2 x2 y x3 x4 xy
1 1 1 1 1 1 1
2 1.2 4 4.8 8 16 2.4
3 1.8 9 16.2 27 81 5.4
4 2.5 16 40 64 256 10
5 3.6 25 90 125 625 18
6 4.7 36 169.2 216 1296 28.2
7 6.6 49 323.4 343 2401 46.2
8 9.1 64 582.4 512 4096 72.8
∑x² = 204, ∑x²y = 1227, ∑x³ = 1296, ∑x⁴ = 8772, ∑xy = 184
The system of equations after substituting the values of the sums is:
1227 = 8772a + 1296b
184 = 1296a + 204b
Solving, a ≈ 0.108 and b ≈ 0.217, so the fitted curve is:
y = 0.217x + 0.108x²
Question 31
Find the exponential curve of the form K = PV^γ for the following data using the method of
least squares:
Solution: We assume the curve is of the form K = PV^γ, where K and γ are constants to be
determined. Taking the common (base-10) logarithm of both sides:
\[ \log K = \log P + \gamma \log V \]
Let A = log K, Y = log P, X = log V and B = γ. Then
A = Y + BX
=⇒ Y = A − BX
We apply the method of least squares to the linear equation. The normal equations are:
\[ \sum Y = nA - B \sum X \]
\[ \sum XY = A \sum X - B \sum X^2 \]
V     P     X = log V   Y = log P   XY      X²
50    135   1.698       2.13        3.616   2.883
100   48    2           1.681       3.362   4
150   26    2.176       1.414       3.077   4.735
200   17    2.301       1.23        2.830   5.295
∑X = 8.175, ∑Y = 6.455, ∑XY = 12.884, ∑X² = 16.911
We substitute these values into the normal equations (n = 4):
6.455 = 4A − 8.175B
12.884 = 8.175A − 16.911B
Solving this system of equations we get A = 4.713 and B = 1.516. Once we have A and B, we
can compute K = 10^A = 51641.6 and γ = B = 1.516.
Thus, the fitted curve is
PV^1.516 = 51641.6
Question 32
Using the method of least squares, fit the curve y = c0 x + c1/√x to the following data:
Solution:
The equation of the curve is y = c0 x + c1/√x. The normal equations are:
\[ \sum xy = c_0 \sum x^2 + c_1 \sum \sqrt{x} \]
\[ \sum \frac{y}{\sqrt{x}} = c_0 \sum \sqrt{x} + c_1 \sum \frac{1}{x} \]
x     y    √x      x²     xy    y/√x     1/x
0.2   16   0.447   0.04   3.2   35.777   5
0.3   14   0.547   0.09   4.2   25.560   3.333
0.5   11   0.707   0.25   5.5   15.556   2
1     6    1       1      6     6        1
2     3    1.414   4      6     2.121    0.5
∑√x = 4.116, ∑x² = 5.38, ∑xy = 24.9, ∑(y/√x) = 85.015, ∑(1/x) = 11.833
Substituting the values:
85.015 = 11.833 c1 + 4.116 c0,
24.9 = 4.116 c1 + 5.38 c0.
Solving the equations gives:
c1 = 7.60, c0 = −1.18.
Hence the fitted curve is y = −1.18x + 7.60/√x.
Question 33
Using the method of least squares, fit the curve f(x) = a + bx + cx² to the following data:
x      1   2     3     4     5
f(x)   1   1.2   1.8   2.5   3.6
Solution: The equation of the curve is y = a + bx + cx². The normal equations are:
\[ \sum y = na + b \sum x + c \sum x^2 \]
\[ \sum xy = a \sum x + b \sum x^2 + c \sum x^3 \]
\[ \sum x^2 y = a \sum x^2 + b \sum x^3 + c \sum x^4 \]
x   y     x²   x³    x⁴    xy    x²y
1   1     1    1     1     1     1
2   1.2   4    8     16    2.4   4.8
3   1.8   9    27    81    5.4   16.2
4   2.5   16   64    256   10    40
5   3.6   25   125   625   18    90
∑x = 15, ∑y = 10.1, ∑x² = 55, ∑x³ = 225, ∑x⁴ = 979, ∑xy = 36.8, ∑x²y = 152
Using normal equations
10.1 = 5a + 15b + 55c,
36.8 = 15a + 55b + 225c,
152 = 55a + 225b + 979c.
Solving these equations, we get a = 1.02, b = −0.164 and c = 0.136, so the fitted curve is
y = 1.02 − 0.164x + 0.136x²
Question 34
Using the method of least squares, fit a curve of the form:
y = ae^(bx)
x 1 2 3 4 5
y 1 1.2 1.8 2.5 3.6
Solution
Taking the natural logarithm of both sides:
\[ y = ae^{bx} \implies \ln y = \ln a + bx \ln e = \ln a + bx \quad (\because \ln e = 1) \]
Let Y = ln y and A = ln a, so:
Y = A + bx.
The normal equations are:
∑ Y = nA + b ∑ x
∑ xY = A ∑ x + b ∑ x2
x y Y x2 xY
1 1 0.000 1 0.000
2 1.2 0.182 4 0.365
3 1.8 0.588 9 1.763
4 2.5 0.916 16 3.665
5 3.6 1.280 25 6.404
∑x = 15, ∑Y = 2.967, ∑x² = 55, ∑xY = 12.197
Using these sums, the normal equations become:
5A + 15b = 2.967,
15A + 55b = 12.197.
Solving, we get
A = −0.3953, b = 0.3296.
Converting A to a:
a = e^A = 0.6735.
The fitted curve is:
y = 0.6735 e^(0.3296x).
Question 35
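If 4x − 5y + 33 = 0 and 20x − 9y = 107 are the lines of regression of y on x and of x on y
respectively, find the mean values of x and y, the coefficient of correlation, and the standard
deviation of y if the variance of x is 9.
Solution:
(i) Since both the lines of regression pass through the point (X̄, Ȳ), we have
4X̄ − 5Ȳ = −33,
20X̄ − 9Ȳ = 107.
Solving, we get X̄ = 13, Ȳ = 17.
(ii) Let
4X − 5Y + 33 = 0 be the line of regression of Y on X
and 20X − 9Y = 107 be the line of regression of X on Y.
These can be written as Y = (4/5)X + 33/5 and X = (9/20)Y + 107/20, so bYX = 4/5 and
bXY = 9/20. Hence r² = bYX · bXY = 9/25, and since both regression coefficients are positive,
r = +0.6
(iii) We have
\[ b_{YX} = r \cdot \frac{\sigma_Y}{\sigma_X} \implies \frac{4}{5} = \frac{3}{5} \times \frac{\sigma_Y}{3} \quad (\because \sigma_X^2 = 9 \implies \sigma_X = 3) \]
Hence σY = 4.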
Question 36
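In a partially destroyed laboratory record of an analysis of correlation data, the following
results are legible:
Variance of x: σx² = 9.
The regression equations are: 8X − 10Y + 66 = 0 and 40X − 18Y = 214.
Find (i) the mean values of X and Y, (ii) the coefficient of correlation between X and Y, and
(iii) the standard deviation of Y.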
Solution
(i) Since both the lines of regression pass through the point (X̄, Ȳ ), we have
8X̄ − 10Ȳ = −66,
40X̄ − 18Ȳ = 214.
Solving, we get X̄ = 13, Ȳ = 17
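(ii) Let
8X − 10Y + 66 = 0 be the line of regression of Y on X
and 40X − 18Y = 214 be the line of regression of X on Y.
These equations can be put in the form:
\[ Y = \frac{8}{10} X + \frac{66}{10}, \qquad X = \frac{18}{40} Y + \frac{214}{40} \]
\[ \therefore\ b_{YX} = \text{regression coefficient of } Y \text{ on } X = \frac{8}{10} = \frac{4}{5} \]
\[ \text{and } b_{XY} = \text{regression coefficient of } X \text{ on } Y = \frac{18}{40} = \frac{9}{20} \]
\[ \text{Hence } r^2 = b_{YX} \cdot b_{XY} = \frac{4}{5} \cdot \frac{9}{20} = \frac{9}{25} \]
\[ \therefore\ r = \pm\frac{3}{5} = \pm 0.6 \]
But since both the regression coefficients are positive, we take
r = +0.6
(iii) We have
\[ b_{YX} = r \cdot \frac{\sigma_Y}{\sigma_X} \implies \frac{4}{5} = \frac{3}{5} \times \frac{\sigma_Y}{3} \]
Hence σY = 4.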
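The three steps (means from the intersection of the lines, r from the regression coefficients, σY from bYX) can be collected into one Python sketch. The function name and the (p, q, c) encoding of a line pX + qY = c are assumptions made for illustration, and the sketch assumes the caller already knows which line is Y on X and that r is not zero.

```python
import math

def analyse_regression_lines(y_on_x, x_on_y, var_x):
    """Each line is a tuple (p, q, c) meaning p*X + q*Y = c.

    y_on_x is the regression line of Y on X, x_on_y that of X on Y.
    Returns (mean_x, mean_y, r, sigma_y).
    """
    p1, q1, c1 = y_on_x
    p2, q2, c2 = x_on_y
    det = p1 * q2 - p2 * q1
    mean_x = (c1 * q2 - c2 * q1) / det       # Cramer's rule
    mean_y = (p1 * c2 - p2 * c1) / det
    b_yx = -p1 / q1                           # slope when solved for Y
    b_xy = -q2 / p2                           # slope when solved for X
    r = math.copysign(math.sqrt(b_yx * b_xy), b_yx)
    sigma_y = b_yx * math.sqrt(var_x) / r     # from b_yx = r * sigma_y / sigma_x
    return mean_x, mean_y, r, sigma_y

# 8X - 10Y = -66 (Y on X) and 40X - 18Y = 214 (X on Y), variance of X = 9
mean_x, mean_y, r, sigma_y = analyse_regression_lines((8, -10, -66), (40, -18, 214), 9)
print(mean_x, mean_y, round(r, 3), round(sigma_y, 3))  # 13.0 17.0 0.6 4.0
```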
Question 37
Solution:
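Rearrange the Regression Equations
The regression equations are 5x − 2y = 52 and 3x − 8y = 12. Since the product of the two
regression coefficients cannot exceed 1, the first must be the line of regression of x on y and the
second the line of regression of y on x:
\[ 5x - 2y = 52 \implies x = \frac{2}{5} y + \frac{52}{5} \]
and
\[ 3x - 8y = 12 \implies y = \frac{3}{8} x - \frac{12}{8} \]
(The opposite choice would give bxy · byx = (8/3)(5/2) > 1, which is impossible.)
Calculate Mean Values of x and y
To find the mean values of x and y, we substitute x = x̄ and y = ȳ in the regression equations
and solve:
5x̄ − 2ȳ = 52,
3x̄ − 8ȳ = 12.
Thus, the mean values are:
x̄ = 196/17 ≈ 11.53, ȳ = 48/17 ≈ 2.82.
Calculate the Coefficient of Correlation
The formula for the correlation coefficient r is given by:
\[ r = \pm\sqrt{b_{xy} \times b_{yx}} \]
where bxy is the regression coefficient of x on y, and byx is the regression coefficient of y on x.
From the regression equations: bxy = 2/5, byx = 3/8.
Thus, since both coefficients are positive:
\[ r = \sqrt{\frac{2}{5} \times \frac{3}{8}} = \sqrt{\frac{3}{20}} \approx 0.387 \]
So, the coefficient of correlation is approximately r ≈ 0.387.
Calculate the Variance of y
We are given that the variance of x is σx² = 12. The standard deviation of y can be calculated
using the formula:
\[ b_{xy} = r \frac{\sigma_x}{\sigma_y} \]
Substitute the known values:
\[ \sigma_y = \frac{r \sigma_x}{b_{xy}} = \frac{0.3873 \times \sqrt{12}}{0.4} \approx 3.354 \]
Thus, the standard deviation of y is approximately σy ≈ 3.354, and the variance of y is σy² ≈ 11.25.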
Question 38
The following table gives the age (x) in years of cars and annual maintenance cost (y) in
hundred rupees.
x 1 3 5 7 9
y 15 18 21 23 22
Calculate the maintenance cost for a 4-year-old car after finding the regression equation.
Solution
The regression equation is of the form:
y = a + bx
x y xy x2 y2
1 15 15 1 225
3 18 54 9 324
5 21 105 25 441
7 23 161 49 529
9 22 198 81 484
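∑x = 25, ∑y = 99, ∑xy = 533, ∑x² = 165, ∑y² = 2003
Calculate x̄ and ȳ
\[ \bar{x} = \frac{\sum x}{n} = \frac{25}{5} = 5 \]
\[ \bar{y} = \frac{\sum y}{n} = \frac{99}{5} = 19.8 \]
Calculate byx and bxy
\[ b_{yx} = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - (\sum x)^2} = \frac{5(533) - (25)(99)}{5(165) - (25)^2} = \frac{190}{200} = 0.95 \]
\[ b_{xy} = \frac{n \sum xy - \sum x \sum y}{n \sum y^2 - (\sum y)^2} = \frac{5(533) - (25)(99)}{5(2003) - (99)^2} = \frac{190}{214} \approx 0.888 \]
Regression Equation y on x
y − ȳ = byx (x − x̄)
y − 19.8 = 0.95(x − 5)
y = 0.95x + 15.05
Maintenance cost for a 4-year-old car
Putting x = 4 in the regression equation: y = 0.95(4) + 15.05 = 18.85.
Hence the estimated annual maintenance cost is 18.85 hundred rupees, i.e. Rs. 1885.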
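The regression line of y on x and the prediction for a 4-year-old car can be reproduced with the Python sketch below. The function name regression_y_on_x is illustrative; the data are the age and cost values of this question.

```python
def regression_y_on_x(xs, ys):
    """Return (slope, intercept) of the least-squares line of y on x."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    slope = (n * sxy - sx * sy) / (n * sxx - sx ** 2)   # b_yx
    intercept = sy / n - slope * sx / n                 # y-bar - b_yx * x-bar
    return slope, intercept

age = [1, 3, 5, 7, 9]
cost = [15, 18, 21, 23, 22]           # in hundred rupees
slope, intercept = regression_y_on_x(age, cost)
predicted = slope * 4 + intercept     # estimated cost for a 4-year-old car
print(round(slope, 2), round(intercept, 2), round(predicted, 2))
```

Swapping the roles of the two lists gives the regression coefficient of x on y in the same way.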
x 6 2 10 4 8
y 9 11 5 8 7
Solution
The regression equation of y on x is:
y − ȳ = byx (x − x̄),
where:
n ∑ xy − ∑ x ∑ y
byx =
n ∑ x2 − (∑ x)2
The regression equation of x on y is:
x − x̄ = bxy (y − ȳ),
where:
n ∑ xy − ∑ x ∑ y
bxy =
n ∑ y2 − (∑ y)2
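Step 1: Calculate Required Sums
x    y    xy   x²    y²
6    9    54   36    81
2    11   22   4     121
10   5    50   100   25
4    8    32   16    64
8    7    56   64    49
∑x = 30, ∑y = 40, ∑xy = 214, ∑x² = 220, ∑y² = 340
Calculate Means
\[ \bar{x} = \frac{\sum x}{n} = \frac{30}{5} = 6.0, \qquad \bar{y} = \frac{\sum y}{n} = \frac{40}{5} = 8.0 \]
Calculate Regression Coefficients
\[ b_{yx} = \frac{5 \times 214 - 30 \times 40}{5 \times 220 - 30^2} = \frac{-130}{200} = -0.65 \]
\[ b_{xy} = \frac{5 \times 214 - 30 \times 40}{5 \times 340 - 40^2} = \frac{-130}{100} = -1.3 \]
Regression Equations
y − 8 = −0.65(x − 6) =⇒ y = −0.65x + 11.9
x − 6 = −1.3(y − 8) =⇒ x = −1.3y + 16.4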
Question 40
Fit a parabolic curve of regression of y on x to the following data:
Solution
The parabolic regression curve is of the form:
y = a + bx + cx²
The normal equations are:
\[ \sum y = na + b \sum x + c \sum x^2 \]
\[ \sum xy = a \sum x + b \sum x^2 + c \sum x^3 \]
\[ \sum x^2 y = a \sum x^2 + b \sum x^3 + c \sum x^4 \]
x y xy x2 x2 y x3 x4
1 1.1 1.1 1 1.1 1 1
1.5 1.3 1.95 2.25 2.925 3.375 5.0625
2 1.6 3.2 4 6.4 8 16
2.5 2 5 6.25 12.5 15.625 39.0625
3 2.7 8.1 9 24.3 27 81
3.5 3.4 11.9 12.25 41.65 42.875 150.0625
4 4.1 16.4 16 65.6 64 256
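∑x = 17.5, ∑y = 16.2, ∑xy = 47.65, ∑x² = 50.75, ∑x²y = 154.475, ∑x³ = 161.875, ∑x⁴ = 548.1875
Substituting these sums (n = 7) into the normal equations:
16.2 = 7a + 17.5b + 50.75c
47.65 = 17.5a + 50.75b + 161.875c
154.475 = 50.75a + 161.875b + 548.1875c
Solving, a ≈ 1.036, b ≈ −0.193 and c ≈ 0.243, so the parabolic curve of regression is
y = 1.036 − 0.193x + 0.243x²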
Question 41
Find the multiple regression equation of X1 on X2 and X3 from the data given below:
X1 3 5 6 8 12 10
X2 10 10 5 7 5 2
X3 20 25 15 16 15 2
Solution
The multiple regression equation is given by:
X1 = a + bX2 + cX3
The normal equations are:
\[ \sum X_1 = na + b \sum X_2 + c \sum X_3 \]
\[ \sum X_1 X_2 = a \sum X_2 + b \sum X_2^2 + c \sum X_2 X_3 \]
\[ \sum X_1 X_3 = a \sum X_3 + b \sum X_2 X_3 + c \sum X_3^2 \]
X1 X2 X3 X1 X2 X22 X2 X3 X1 X3 X32
3 10 20 30 100 200 60 400
5 10 25 50 100 250 125 625
6 5 15 30 25 75 90 225
8 7 16 56 49 112 128 256
12 5 15 60 25 75 180 225
10 2 2 20 4 4 20 4
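∑X1 = 44, ∑X2 = 39, ∑X3 = 93, ∑X1X2 = 246, ∑X2² = 303, ∑X2X3 = 716, ∑X1X3 = 603, ∑X3² = 1735
Substituting these sums (n = 6) into the normal equations:
44 = 6a + 39b + 93c
246 = 39a + 303b + 716c
603 = 93a + 716b + 1735c
Solving, a ≈ 12.36, b ≈ −1.40 and c ≈ 0.26, so the multiple regression equation is
X1 = 12.36 − 1.40X2 + 0.26X3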
Question 42
For the data given , determine the lines of regression :
x 2 4 6 8 10
y 5 7 9 8 11
Solution
x y xy x2 y2
2 5 10 4 25
4 7 28 16 49
6 9 54 36 81
8 8 64 64 64
10 11 110 100 121
∑x = 30, ∑y = 40, ∑xy = 266, ∑x² = 220, ∑y² = 340
Calculate Means
\[ \bar{x} = \frac{\sum x}{n} = \frac{30}{5} = 6.0, \qquad \bar{y} = \frac{\sum y}{n} = \frac{40}{5} = 8.0 \]
Regression of y on x:
The regression equation of y on x is:
y − ȳ = byx (x − x̄)
where:
\[ b_{yx} = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - (\sum x)^2} = \frac{5 \times 266 - 30 \times 40}{5 \times 220 - 30^2} = \frac{130}{200} = 0.65 \]
y − 8 = 0.65(x − 6)
y = 0.65x + 4.1
Regression of x on y:
The regression equation of x on y is:
x − x̄ = bxy (y − ȳ)
where:
\[ b_{xy} = \frac{n \sum xy - \sum x \sum y}{n \sum y^2 - (\sum y)^2} = \frac{5 \times 266 - 30 \times 40}{5 \times 340 - (40)^2} = \frac{130}{100} = 1.3 \]
x − 6 = 1.3(y − 8)
x = 1.3y − 4.4
Question 43
If 3x + 2y = 26 and 6x + y = 31 are two lines of regression. Find (i) mean values of x and y
(ii) the coefficient of correlation between x and y (iii) find variance of y if the variance of x
is 9. (2024-25) 7 Marks
Solution:
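(i) Since both the lines of regression pass through the point (X̄, Ȳ), we have
3X̄ + 2Ȳ = 26,
6X̄ + Ȳ = 31.
Solving, we get X̄ = 4, Ȳ = 7
(ii) Let
3X + 2Y = 26 be the line of regression of Y on X
and 6X + Y = 31 be the line of regression of X on Y.
These can be written as Y = −(3/2)X + 13 and X = −(1/6)Y + 31/6, so bYX = −3/2 and
bXY = −1/6. Hence r² = bYX · bXY = 1/4, and since both regression coefficients are negative,
r = −0.5
(The opposite assignment would give bYX · bXY = (−6)(−2/3) = 4 > 1, which is impossible.)
(iii) We have
\[ b_{YX} = r \cdot \frac{\sigma_Y}{\sigma_X} \implies -\frac{3}{2} = -\frac{1}{2} \times \frac{\sigma_Y}{3} \quad (\because \sigma_X^2 = 9 \implies \sigma_X = 3) \]
Hence σY = 9, i.e. the variance of y is σY² = 81.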
Question 44
x 2 4 6 8 10
y 4.077 11.084 30.128 81.897 222.62
(2024-25) 7 marks
\[ y = ae^{bx} \implies \ln y = \ln a + bx \ln e = \ln a + bx \quad (\because \ln e = 1) \]
Let Y = ln y and A = ln a, so :
Y = A + bx.
The normal equations are:
∑ Y = nA + b ∑ x
∑ xY = A ∑ x + b ∑ x2
x y Y = ln y x2 xY
2 4.077 1.4054 4 2.8107
4 11.084 2.4055 16 9.6220
6 30.128 3.4055 36 20.4327
8 81.897 4.4055 64 35.2437
10 222.62 5.4055 100 54.0547
∑x = 30, ∑Y = 17.0272, ∑x² = 220, ∑xY = 122.1638
Using these sums, the normal equations become:
5A + 30b = 17.0272,
30A + 220b = 122.1638.
Solving, we get
A = 0.41, b = 0.50.
Hence
a = e^A = 1.50.
The required curve is:
y = 1.5 e^(0.5x).
Question 45
Calculate all four moments about mean and also Skewness and Kurtosis.
Marks 0 − 10 10 − 20 20 − 30 30 − 40 40 − 50 50 − 60 60 − 70
No of Students 1 6 10 15 11 7 10
(2024-25) 7 marks
Solution
We are given the following frequency distribution:
Marks 0 − 10 10 − 20 20 − 30 30 − 40 40 − 50 50 − 60 60 − 70
No of Students 1 6 10 15 11 7 10
Class Interval x f (x − x̄) f (x − x̄) f (x − x̄)2 f (x − x̄)3 f (x − x̄)4
0 − 10 5 1 −35 −35 1225 −42875 1500625
10 − 20 15 6 −25 −150 3750 −93750 2343750
20 − 30 25 10 −15 −150 2250 −33750 506250
30 − 40 35 15 −5 −75 375 −1875 9375
40 − 50 45 11 5 55 275 1375 6875
50 − 60 55 7 15 105 1575 23625 354375
60 − 70 65 10 25 250 6250 156250 3906250
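Total: ∑f = 60, ∑f(x − x̄) = 0, ∑f(x − x̄)² = 15700, ∑f(x − x̄)³ = 9000, ∑f(x − x̄)⁴ = 8627500
Here the mean is x̄ = ∑fx / ∑f = 2400/60 = 40, and µ1 = 0.
\[ \mu_2 = \frac{\sum f (x - \bar{x})^2}{\sum f} = \frac{15700}{60} = 261.67 \]
\[ \mu_3 = \frac{\sum f (x - \bar{x})^3}{\sum f} = \frac{9000}{60} = 150 \]
\[ \mu_4 = \frac{\sum f (x - \bar{x})^4}{\sum f} = \frac{8627500}{60} = 143791.67 \]
Hence, Skewness
\[ \gamma_1 = \sqrt{\beta_1} = \frac{\mu_3}{\mu_2^{3/2}} = \frac{150}{(261.67)^{3/2}} \approx 0.035 \]
Kurtosis
\[ \gamma_2 = \beta_2 - 3 = \frac{\mu_4}{\mu_2^2} - 3 = \frac{143791.67}{(261.67)^2} - 3 \approx -0.90 \]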