
Solution Question Bank Unit-3

The document is a question bank for the Mathematics IV course at Raj Kumar Goel Institute of Technology, covering statistical techniques such as measures of central tendency, skewness, kurtosis, and regression analysis. It includes various statistical problems with solutions, demonstrating calculations for mean, median, mode, and correlation coefficients. The course aims to help students understand fundamental statistical concepts and their applications.

RAJ KUMAR GOEL INSTITUTE OF TECHNOLOGY, GHAZIABAD

Session (2024-25): Odd Sem
MATHEMATICS – IV (BAS-303)
Question bank (Unit-3): Statistical Technique-I

April 23, 2025

Contents: Measures of central tendency, Skewness, Kurtosis, Curve Fitting, Method of least squares, fitting of straight lines, fitting of second-degree parabola, Exponential curves, Correlation and Rank correlation, Regression Analysis: Regression lines of y on x and x on y, regression coefficients, properties of regression coefficients and nonlinear regression.
Course Outcome (CO3): Understand basic statistical concepts such as moments, skewness, kurtosis, curve fitting, correlation and regression.

Question 1
A cooperative bank has two branches employing 50 and 70 workers respectively. The aver-
age salaries paid by two respective branches are Rs. 360 and Rs. 390 per month. Calculate
the mean of the salaries of all the employees.

Solution:
Note: To calculate the mean salary of all employees, we use the mean of a composite series. If $\bar{x}_i$ $(i = 1, 2, \dots, k)$ are the means of $k$ component series of sizes $n_i$ $(i = 1, 2, \dots, k)$ respectively, then the mean $\bar{x}$ of the composite series is given by the formula:
\[ \bar{x} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + \dots + n_k \bar{x}_k}{n_1 + n_2 + \dots + n_k} = \frac{\sum_i n_i \bar{x}_i}{\sum_i n_i} \]

Let $n_1$ and $n_2$ denote the numbers of workers in the first and second branches, and let $\bar{x}_1$ and $\bar{x}_2$ denote their respective average salaries (in rupees). Let $\bar{x}$ denote the average salary of all the workers in the bank.
We are given that:
\[ \bar{x}_1 = 360, \quad \bar{x}_2 = 390, \quad n_1 = 50, \quad n_2 = 70 \]
Mean Salary:
\[ \bar{x} = \frac{50(360) + 70(390)}{50 + 70} = \frac{18000 + 27300}{120} = \frac{45300}{120} = 377.5 \]

Mean Salary = Rs. 377.5 per month


Final Answer: The mean salary of all the employees is Rs. 377.5 per month.
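As a quick numerical check of the composite-mean formula, the following Python sketch recomputes the answer. The function name `combined_mean` is an illustrative choice and not part of the original solution.

```python
def combined_mean(sizes, means):
    """Mean of a composite series from component sizes n_i and component means x_i."""
    weighted_total = sum(n * m for n, m in zip(sizes, means))
    return weighted_total / sum(sizes)


# Branch sizes and average salaries from Question 1.
mean_salary = combined_mean([50, 70], [360, 390])
print(mean_salary)  # 377.5
```

The same function works for any number of component series, provided the two lists have equal length.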
Question 2
Find the median of the dataset: 6, 8, 9, 10, 11, 12, 13

Solution:
1. Arrange the numbers in ascending order:

6, 8, 9, 10, 11, 12, 13

2. Count the total number of data points: The total number of data points (n) is 7, which is odd.

3. Find the position of the median: For an odd number of data points, the median is the value at the position:
\[ \text{Median Position} = \frac{n + 1}{2} \]
Substituting n = 7:
\[ \text{Median Position} = \frac{7 + 1}{2} = 4 \]
4. The 4th number in the dataset is 10.

Final Answer: The median of the dataset is: 10


Question 3
Find the mode of the following marks obtained by 15 students:

4, 6, 5, 7, 9, 8, 10, 4, 7, 6, 5, 8, 7, 7, 9

Solution:
1. Arrange the data and count the frequency of each number:

x           4   5   6   7   8   9   10
Frequency   2   2   2   4   2   2   1

2. Identify the mode: The mode is the number that appears the most frequently. Here, 7 appears
4 times, which is more than any other number.

Final Answer: The mode of the given data is: 7
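The frequency count above can be reproduced with a short Python sketch using `collections.Counter` from the standard library. The variable names are illustrative.

```python
from collections import Counter

marks = [4, 6, 5, 7, 9, 8, 10, 4, 7, 6, 5, 8, 7, 7, 9]

# Frequency of each distinct mark, printed in ascending order of the mark.
frequency = Counter(marks)
for value in sorted(frequency):
    print(value, frequency[value])

# most_common(1) returns the (value, count) pair with the highest count.
mode, count = frequency.most_common(1)[0]
print(mode, count)  # 7 4
```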


Question 4
Find the arithmetic mean of the following distribution.

x 1 2 3 4 5 6 7
f 5 9 12 17 14 10 6

Solution:
The formula for the arithmetic mean is:
\[ \text{Arithmetic Mean} = \frac{\sum f x}{\sum f} \]
where:

• f is the frequency of each observation.

• x is the value of each observation.

Step 1: Calculate f x for each x

x f fx
1 5 5
2 9 18
3 12 36
4 17 68
5 14 70
6 10 60
7 6 42
Sum ∑ f = 73 ∑ f x = 299
Step 2: Compute the Arithmetic Mean

\[ \text{Arithmetic Mean} = \frac{\sum f x}{\sum f} = \frac{299}{73} \approx 4.10 \]

Final Answer: The arithmetic mean is approximately: 4.10
Question 5
In an asymmetrical distribution, the mean is 16 and the median is 20. Calculate the mode of
the distribution.

Solution:
The empirical relationship between the mean, median, and mode is given by:

Mode = 3 × Median − 2 × Mean

Given:
Mean = 16, Median = 20
Substitute the values:
Mode = 3 × 20 − 2 × 16
Mode = 60 − 32 = 28
Final Answer: The mode of the distribution is: 28
Question 6
The first three central moments of a distribution are 0, 15, -31. Find the moment coefficient of skewness. (2019-20, 2 Marks)

Solution:
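Given:
\[ \mu_1 = 0, \quad \mu_2 = 15, \quad \mu_3 = -31 \]
The formula for the moment coefficient of skewness is:
\[ \gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} \]
Hence
\[ \gamma_1 = \frac{-31}{\sqrt{15^3}} \approx -0.53 \]
The moment coefficient of skewness is approximately:

\[ \gamma_1 \approx -0.534 \]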
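The substitution into $\gamma_1 = \mu_3 / \mu_2^{3/2}$ can be checked numerically. This is a minimal sketch; `moment_skewness` is a hypothetical helper name.

```python
def moment_skewness(mu2, mu3):
    """Moment coefficient of skewness: gamma_1 = mu_3 / mu_2^(3/2)."""
    return mu3 / mu2 ** 1.5


# Central moments from Question 6.
gamma1 = moment_skewness(15, -31)
print(round(gamma1, 3))
```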

Question 7
The first two moments of a distribution about the value ‘2’ of the variable are 1, 16. Show
that mean is 3, and variance is 15. (2020-2021 2 marks)

Solution:
Given that
\[ A = 2, \quad \mu_1' = 1 \quad \text{and} \quad \mu_2' = 16 \]
Now
\[ \bar{x} = \mu_1' + A = 1 + 2 = 3 \]
The variance is
\[ \sigma^2 = \mu_2 \]
and
\[ \mu_2 = \mu_2' - (\mu_1')^2 = 16 - (1)^2 = 15 \]
Hence
variance = 15
Question 8
The fourth central moment is µ4 = 48. What must be its standard deviation (σ ) in order for
the distribution to be mesokurtic?

Solution:
The kurtosis ($\beta_2$) is given as:
\[ \beta_2 = \frac{\mu_4}{\sigma^4} \quad (\mu_2 = \sigma^2) \]
where:

• µ4 is the fourth central moment.

• σ is the standard deviation.

Step 1: Substitute the known values: For a mesokurtic distribution, $\beta_2 = 3$, and $\mu_4 = 48$. Substituting these into the formula:
\[ 3 = \frac{48}{\sigma^4} \]
Step 2: Solve for $\sigma^4$:
\[ \sigma^4 = \frac{48}{3} = 16 \]
Step 3: Solve for $\sigma$: Taking the fourth root (or square root twice) of both sides:
\[ \sigma = \sqrt[4]{16} = \sqrt{4} = 2 \]

Final Answer: The standard deviation ($\sigma$) = 2

Question 9

Write the normal equations to fit the curve $y = ax^2 + b$ by the method of least squares.

Solution: Fitting of second degree parabola

Let
\[ y = ax^2 + b \qquad (1) \]
be the equation of best fit to a set of n points $(x_i, y_i)$, $i = 1, 2, \dots, n$. Using the principle of least squares, we have to determine the constants a and b so that
\[ E = \sum_{i=1}^{n} (y_i - a x_i^2 - b)^2 \]
is minimum. Equating to zero the partial derivatives of E with respect to a and b separately, we get the normal equations for estimating a and b as
\[ \frac{\partial E}{\partial a} = -2 \sum_{i=1}^{n} x_i^2 (y_i - a x_i^2 - b) = 0 \]
\[ \frac{\partial E}{\partial b} = -2 \sum_{i=1}^{n} (y_i - a x_i^2 - b) = 0 \]
which give
\[ \sum x^2 y = a \sum x^4 + b \sum x^2 \]
\[ \sum y = a \sum x^2 + n b \]
where each summation is taken over i from 1 to n for the given set of points $(x_i, y_i)$.

Question 10
Write the formula for Karl Pearson’s correlation coefficient and state the range of the correlation coefficient.

Solution:
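Karl Pearson Correlation Coefficient Formula:

The Karl Pearson correlation coefficient (r) is given by the formula:
\[ r = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y} \]
or
\[ r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}} \]
or
\[ r = \frac{n \sum xy - \sum x \sum y}{\sqrt{n \sum x^2 - (\sum x)^2} \, \sqrt{n \sum y^2 - (\sum y)^2}} \]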

where:

• x and y are the individual data points of the two variables X and Y ,

• x̄ and ȳ are the means of the variables X and Y , respectively.

Range of the Correlation Coefficient:


The value of the correlation coefficient r lies between -1 and +1, inclusive:

−1 ≤ r ≤ 1

• r = 1: Perfect positive correlation

• r = −1: Perfect negative correlation

• r = 0: No linear correlation


Question 11
If the covariance between variables x and y is 10, and the variances of x and y are 16 and 9
respectively, find the coefficient of correlation.

Solution:
The formula for the coefficient of correlation (r) is:
\[ r = \frac{\operatorname{Cov}(x, y)}{\sigma_x \sigma_y} \]
where:

• Cov(x, y) is the covariance between x and y,

• σx and σy are the standard deviations of x and y, respectively.

Given:

• Cov(x, y) = 10,

• variance $\sigma_x^2 = 16$, so $\sigma_x = \sqrt{16} = 4$,

• variance $\sigma_y^2 = 9$, so $\sigma_y = \sqrt{9} = 3$.

Hence:
\[ r = \frac{10}{4 \times 3} = \frac{10}{12} = \frac{5}{6} \approx 0.833 \]
Hence the coefficient of correlation is r ≈ 0.833.
Question 12
The lines of regression of y on x and x on y are respectively:

y = x+5 and 16x − 9y = 94,

find the correlation coefficient.

Solution
Given:

• Line of regression of y on x: y = x + 5
Slope of this line: $b_{yx} = 1$.

• Line of regression of x on y: 16x − 9y = 94
Rewrite it as $x = \frac{9}{16} y + \frac{94}{16}$, so the slope is $b_{xy} = \frac{9}{16}$.

Formula for Correlation Coefficient:
\[ r = \pm \sqrt{b_{yx} \cdot b_{xy}} \]
Substitute the values:
\[ r = \pm \sqrt{1 \cdot \frac{9}{16}} = \pm \sqrt{\frac{9}{16}} = \pm \frac{3}{4} \]

Determining the Sign of r:

The sign of r is the common sign of the two regression coefficients. Since both regression coefficients are positive, r is positive.
\[ r = \frac{3}{4} \]
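The rule $r = \pm\sqrt{b_{yx} \cdot b_{xy}}$, with the sign taken from the regression coefficients, can be sketched in Python as follows. The function name and the validity check are illustrative assumptions, not part of the original solution.

```python
import math


def correlation_from_regression(b_yx, b_xy):
    """Return r with |r| = sqrt(b_yx * b_xy) and the common sign of the coefficients."""
    product = b_yx * b_xy
    if product < 0 or product > 1:
        raise ValueError("coefficients must share a sign and satisfy b_yx * b_xy <= 1")
    return math.copysign(math.sqrt(product), b_yx)


# y = x + 5 gives b_yx = 1; 16x - 9y = 94 gives b_xy = 9/16.
print(correlation_from_regression(1, 9 / 16))  # 0.75
```

The check on the product reflects the property $b_{yx} \cdot b_{xy} = r^2 \le 1$; a pair of lines that violates it has been assigned the wrong roles.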
Question 13
If the regression coefficients are byx = 0.8 and bxy = 0.2, find the value of the coefficient of
correlation (r).

Solution:
The coefficient of correlation in terms of the regression coefficients:
\[ r = \pm \sqrt{b_{yx} \cdot b_{xy}} \]

Given
\[ b_{yx} = 0.8, \quad b_{xy} = 0.2 \]
We get
\[ r = \pm \sqrt{0.8 \cdot 0.2} = \pm \sqrt{0.16} = \pm 0.4 \]

The Sign of r:
The sign of r depends on the signs of the regression coefficients. Since both $b_{yx}$ and $b_{xy}$ are positive, the correlation coefficient is positive.

r = 0.4

Question 14
If the regression coefficients are byx = 0.8 and bxy = 0.8, find the value of the coefficient of
correlation (r).

Solution:
The formula for the coefficient of correlation is:
\[ r = \pm \sqrt{b_{yx} \cdot b_{xy}} \]

Given:
\[ b_{yx} = 0.8, \quad b_{xy} = 0.8 \]
Substitute these values into the formula:
\[ r = \pm \sqrt{0.8 \cdot 0.8} \]
Simplify:
\[ r = \pm \sqrt{0.64} = \pm 0.8 \]

The Sign of r:
The sign of r depends on the signs of the regression coefficients. Since both $b_{yx}$ and $b_{xy}$ are positive, the correlation coefficient is positive.
r = 0.8
Question 15
What is the relation between the regression coefficients and the coefficient of Correlation?

Relationship Between Regression Coefficients and the Coefficient of Correlation


The relationship between the regression coefficients ($b_{yx}$ and $b_{xy}$) and the coefficient of correlation (r) is as follows:
\[ r = \pm \sqrt{b_{yx} \cdot b_{xy}} \]
where r takes the common sign of the two regression coefficients.
Question 16
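Write the normal equations to fit a curve $y = \frac{c_0}{x} + c_1 \sqrt{x}$.

Solution: Fitting the given equation of the curve

Let
\[ y = \frac{c_0}{x} + c_1 \sqrt{x} \]
be the equation of best fit to a set of n points $(x_i, y_i)$, $i = 1, 2, \dots, n$. Using the principle of least squares, we have to determine the constants $c_0$ and $c_1$ so that
\[ E = \sum \left( y - \frac{c_0}{x} - c_1 \sqrt{x} \right)^2 \]
is minimum. Equating to zero the partial derivatives of E with respect to $c_0$ and $c_1$ separately, we get the normal equations for estimating $c_0$ and $c_1$ as
\[ \frac{\partial E}{\partial c_0} = -2 \sum \frac{1}{x} \left( y - \frac{c_0}{x} - c_1 \sqrt{x} \right) = 0 \]
\[ \frac{\partial E}{\partial c_1} = -2 \sum \sqrt{x} \left( y - \frac{c_0}{x} - c_1 \sqrt{x} \right) = 0 \]
which give
\[ \sum \frac{y}{x} = c_0 \sum \frac{1}{x^2} + c_1 \sum \frac{1}{\sqrt{x}} \]
\[ \sum y \sqrt{x} = c_0 \sum \frac{1}{\sqrt{x}} + c_1 \sum x \]
where each summation is taken over i from 1 to n for the given set of points $(x_i, y_i)$.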


Question 17
Write the formula for rank correlation in the case of tied rank

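Spearman’s Rank Correlation with Tied Ranks
In the case of tied ranks, the formula for Spearman’s rank correlation coefficient ($r_s$) is:
\[ r_s = 1 - \frac{6 \left\{ \sum d_i^2 + \frac{1}{12} \sum m_i (m_i^2 - 1) \right\}}{n(n^2 - 1)} \]

Where:
$d_i$ is the difference between the two ranks of the i-th item, n is the number of pairs, and $m_i$ is the number of times the i-th tied rank is repeated.
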
Question Description (7 Marks)

Question 18
Calculate the first four central moments and also comment upon Skewness and Kurtosis
from the following data:

Class Interval   Frequency
0–10             1
10–20            4
20–30            3
30–40            2

Solution:
Given Data:

Class Interval   Frequency (f)
0–10             1
10–20            4
20–30            3
30–40            2

Calculate the Mean (x̄), taking x as the mid-point of each class:

Class Interval   f    x    fx
0–10             1    5    5
10–20            4    15   60
20–30            3    25   75
30–40            2    35   70
Total            10        210

\[ \bar{x} = \frac{\sum f x}{\sum f} = \frac{210}{10} = 21 \]
Calculation of Central Moments, Skewness, and Kurtosis

Class Interval   f    x    (x − x̄)   f(x − x̄)   f(x − x̄)²   f(x − x̄)³   f(x − x̄)⁴
0–10             1    5    −16       −16        256         −4096       65536
10–20            4    15   −6        −24        144         −864        5184
20–30            3    25   4         12         48          192         768
30–40            2    35   14        28         392         5488        76832
Total            10                  0          840         720         148320

Central Moments:
\[ \mu_1 = \frac{\sum f (x - \bar{x})}{\sum f} = \frac{0}{10} = 0 \]
\[ \mu_2 = \frac{\sum f (x - \bar{x})^2}{\sum f} = \frac{840}{10} = 84 \]
\[ \mu_3 = \frac{\sum f (x - \bar{x})^3}{\sum f} = \frac{720}{10} = 72 \]
\[ \mu_4 = \frac{\sum f (x - \bar{x})^4}{\sum f} = \frac{148320}{10} = 14832 \]
Hence
\[ \beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{72^2}{84^3} = \frac{5184}{592704} \approx 0.0087 \]
\[ \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{14832}{84^2} = \frac{14832}{7056} \approx 2.102 \]
Skewness: Since $\mu_3 > 0$ (and hence $\beta_1 > 0$), the distribution is positively skewed, meaning the tail on the right side is longer than the tail on the left side. Because $\beta_1$ is very small, the skewness is slight.
Kurtosis: Since $\beta_2 < 3$, the distribution has a flatter peak than the normal distribution, that is, it is platykurtic.
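The tabular computation of the central moments can be reproduced with a few lines of Python. This is a sketch under the same assumption as the solution (each class is represented by its mid-point); `grouped_central_moments` is a hypothetical helper name.

```python
def grouped_central_moments(midpoints, freqs):
    """Return the mean and the central moments mu_1..mu_4 of a frequency distribution."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    moments = {
        r: sum(f * (x - mean) ** r for x, f in zip(midpoints, freqs)) / n
        for r in (1, 2, 3, 4)
    }
    return mean, moments


# Class mid-points and frequencies from Question 18.
mean, mu = grouped_central_moments([5, 15, 25, 35], [1, 4, 3, 2])
beta1 = mu[3] ** 2 / mu[2] ** 3
beta2 = mu[4] / mu[2] ** 2
print(mean, mu[2], mu[3], mu[4])
print(round(beta1, 4), round(beta2, 3))
```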
Question 19
Calculate the first four central moments about the mean, Skewness, and Kurtosis for the
following data (2021-22):

x 0 1 2 3 4 5 6 7 8
f 1 8 28 56 70 56 28 8 1

Solution:
Given Data:

x   0   1   2    3    4    5    6    7   8
f   1   8   28   56   70   56   28   8   1

The mean is $\bar{x} = \frac{\sum f x}{\sum f}$.

x       f     fx
0       1     0
1       8     8
2       28    56
3       56    168
4       70    280
5       56    280
6       28    168
7       8     56
8       1     8
Total   256   1024

\[ \bar{x} = \frac{1024}{256} = 4 \]
Now for the central moments:

x       f     (x − x̄)   (x − x̄)²   f(x − x̄)²   f(x − x̄)³   f(x − x̄)⁴
0       1     −4        16         16          −64         256
1       8     −3        9          72          −216        648
2       28    −2        4          112         −224        448
3       56    −1        1          56          −56         56
4       70    0         0          0           0           0
5       56    1         1          56          56          56
6       28    2         4          112         224         448
7       8     3         9          72          216         648
8       1     4         16         16          64          256
Total   256                        512         0           2816

Also, $\sum f (x - \bar{x}) = 0$.
\[ \mu_1 = \frac{\sum f (x - \bar{x})}{\sum f} = \frac{0}{256} = 0 \]
\[ \mu_2 = \frac{\sum f (x - \bar{x})^2}{\sum f} = \frac{512}{256} = 2 \]
\[ \mu_3 = \frac{\sum f (x - \bar{x})^3}{\sum f} = \frac{0}{256} = 0 \]
\[ \mu_4 = \frac{\sum f (x - \bar{x})^4}{\sum f} = \frac{2816}{256} = 11 \]
Hence

Skewness:
\[ \gamma_1 = \sqrt{\beta_1} = \frac{\mu_3}{\mu_2^{3/2}} = 0 \]
Kurtosis:
\[ \gamma_2 = \beta_2 - 3 = \frac{\mu_4}{\mu_2^2} - 3 = \frac{11}{2^2} - 3 = -0.25 \]

Question 20
Compute Skewness and Kurtosis, if the first four moments of a frequency distribution about
the value 4 of the variable are 1, 4, 10, and 45.

Solution:
We are given the first four moments about the value A = 4:
\[ \mu_1' = 1, \quad \mu_2' = 4, \quad \mu_3' = 10, \quad \mu_4' = 45 \]

Hence
\[ \mu_2 = \mu_2' - (\mu_1')^2 = 4 - (1)^2 = 3 \]
\[ \mu_3 = \mu_3' - 3\mu_2' \mu_1' + 2(\mu_1')^3 = 10 - 3(4)(1) + 2(1)^3 = 0 \]
\[ \mu_4 = \mu_4' - 4\mu_3' \mu_1' + 6\mu_2' (\mu_1')^2 - 3(\mu_1')^4 = 45 - 4(10)(1) + 6(4)(1)^2 - 3(1)^4 = 26 \]

Skewness ($\gamma_1$):
\[ \gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} = \frac{0}{(3)^{3/2}} = 0 \]
Since $\gamma_1 = 0$, the distribution is symmetric.
Kurtosis:
\[ \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{26}{(3)^2} = \frac{26}{9} \approx 2.89 \]
The kurtosis is slightly lower than the normal value of 3, indicating a distribution close to normal.
Question 21
The first four moments of a frequency distribution about the value 4 of the variable are -1.5,
17,-30 and 80. Find µ1 , µ2 , µ3 , µ4 about mean. Also find β1 and β2 .

Solution
We are given the first four moments about the value A = 4:
\[ \mu_1' = -1.5, \quad \mu_2' = 17, \quad \mu_3' = -30, \quad \mu_4' = 80 \]

The formulae for the central moments ($\mu_r$) in terms of the moments about A ($\mu_r'$) are:
\[ \mu_1 = 0, \quad \mu_2 = \mu_2' - (\mu_1')^2, \quad \mu_3 = \mu_3' - 3\mu_2' \mu_1' + 2(\mu_1')^3, \quad \mu_4 = \mu_4' - 4\mu_3' \mu_1' + 6\mu_2' (\mu_1')^2 - 3(\mu_1')^4 \]

Step 1: Calculate Central Moments
1. Second Central Moment ($\mu_2$):
\[ \mu_2 = 17 - (-1.5)^2 = 17 - 2.25 = 14.75 \]
2. Third Central Moment ($\mu_3$):
\[ \mu_3 = -30 - 3(17)(-1.5) + 2(-1.5)^3 = -30 + 76.5 + 2(-3.375) = -30 + 76.5 - 6.75 = 39.75 \]
3. Fourth Central Moment ($\mu_4$):
\[ \mu_4 = 80 - 4(-30)(-1.5) + 6(17)(-1.5)^2 - 3(-1.5)^4 = 80 - 180 + 6(17)(2.25) - 3(5.0625) \]
\[ \mu_4 = 80 - 180 + 229.5 - 15.1875 = 114.3125 \]

Step 2: Skewness and Kurtosis

Skewness ($\beta_1$):
\[ \beta_1 = \frac{\mu_3^2}{\mu_2^3} \]
Substitute $\mu_3 = 39.75$ and $\mu_2 = 14.75$:
\[ \beta_1 = \frac{39.75^2}{14.75^3} \approx 0.492 \]
Kurtosis ($\beta_2$):
\[ \beta_2 = \frac{\mu_4}{\mu_2^2} \]
Substitute $\mu_4 = 114.3125$ and $\mu_2 = 14.75$:
\[ \beta_2 = \frac{114.3125}{(14.75)^2} = \frac{114.3125}{217.5625} \approx 0.525 \]
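The conversion from moments about an arbitrary point to central moments is mechanical, so it is easy to check in code. The sketch below applies the three formulae used above; `central_from_raw` is a hypothetical helper name.

```python
def central_from_raw(m1, m2, m3, m4):
    """Convert moments about an arbitrary point A into central moments mu_2, mu_3, mu_4."""
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return mu2, mu3, mu4


# Moments about A = 4 from Question 21.
mu2, mu3, mu4 = central_from_raw(-1.5, 17, -30, 80)
beta1 = mu3 ** 2 / mu2 ** 3
beta2 = mu4 / mu2 ** 2
print(mu2, mu3, mu4)
print(round(beta1, 3), round(beta2, 3))
```

Note that the point A itself does not enter the conversion; it is only needed to recover the mean, $\bar{x} = A + \mu_1'$.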

Question 22
The first four moments of a frequency distribution about the value 2 of the variable are 2,
20, 40 and 50 respectively. Comment upon the skewness and kurtosis of the distribution.

Solution
Analysis of Skewness and Kurtosis
The first four moments about A = 2 are given as:
\[ \mu_1' = 2, \quad \mu_2' = 20, \quad \mu_3' = 40, \quad \mu_4' = 50. \]
The first central moment is always zero: $\mu_1 = 0$.

Central Moments
The central moments are calculated using the formula (with $\mu_0' = 1$):
\[ \mu_r = \sum_{k=0}^{r} \binom{r}{k} (-1)^k \, \mu_{r-k}' \, (\mu_1')^k. \]

Second Central Moment ($\mu_2$):
\[ \mu_2 = \mu_2' - (\mu_1')^2 = 20 - 2^2 = 20 - 4 = 16. \]

Third Central Moment ($\mu_3$):
\[ \mu_3 = \mu_3' - 3\mu_2' \mu_1' + 2(\mu_1')^3 = 40 - 3(20)(2) + 2(2)^3 = 40 - 120 + 16 = -64. \]

Fourth Central Moment ($\mu_4$):
\[ \mu_4 = \mu_4' - 4\mu_3' \mu_1' + 6\mu_2' (\mu_1')^2 - 3(\mu_1')^4 = 50 - 4(40)(2) + 6(20)(2)^2 - 3(2)^4 = 50 - 320 + 480 - 48 = 162. \]

Skewness ($\gamma_1$)
Skewness is calculated as:
\[ \gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} = \frac{-64}{(16)^{3/2}} = \frac{-64}{64} = -1. \]
Interpretation: Since $\gamma_1$ is negative, the distribution is negatively skewed.

Kurtosis ($\beta_2$)
Kurtosis is calculated as:
\[ \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{162}{16^2} = \frac{162}{256} \approx 0.6328. \]
Excess kurtosis is:
\[ \gamma_2 = \beta_2 - 3 = 0.6328 - 3 = -2.367. \]
Interpretation: The negative excess kurtosis indicates that the distribution is platykurtic (flatter than a normal distribution).
Question 23
The first four moments of a frequency distribution about the value 5 of the variable are 1, 2.5, 5.5 and 16 respectively. Find the four central moments, moments about origin and coefficient of skewness.

Solution:
Given:
\[ \mu_1' = 1, \quad \mu_2' = 2.5, \quad \mu_3' = 5.5, \quad \mu_4' = 16. \]
The value of A = 5. The mean is given by:
\[ \bar{x} = A + \mu_1' = 5 + 1 = 6. \]

Step 1: Central Moments

The central moments $\mu_r$ are related to the moments about A ($\mu_r'$) as follows:
\[ \mu_1 = 0, \]
\[ \mu_2 = \mu_2' - (\mu_1')^2, \]
\[ \mu_3 = \mu_3' - 3\mu_2' \mu_1' + 2(\mu_1')^3, \]
\[ \mu_4 = \mu_4' - 4\mu_3' \mu_1' + 6\mu_2' (\mu_1')^2 - 3(\mu_1')^4. \]
Substitute the values:
\[ \mu_2 = 2.5 - 1^2 = 1.5, \]
\[ \mu_3 = 5.5 - 3(2.5)(1) + 2(1)^3 = 5.5 - 7.5 + 2 = 0, \]
\[ \mu_4 = 16 - 4(5.5)(1) + 6(2.5)(1)^2 - 3(1)^4 = 16 - 22 + 15 - 3 = 6. \]

Thus, the central moments are:
\[ \mu_1 = 0, \quad \mu_2 = 1.5, \quad \mu_3 = 0, \quad \mu_4 = 6. \]

Step 2: Moments About the Origin

Writing $\nu_r$ for the r-th moment about the origin, these are related to the central moments $\mu_r$ and the mean $\bar{x}$ as follows:
\[ \nu_1 = \bar{x} = 6, \]
\[ \nu_2 = \mu_2 + \bar{x}^2 = 1.5 + 6^2 = 37.5, \]
\[ \nu_3 = \mu_3 + 3\mu_2 \bar{x} + \bar{x}^3 = 0 + 3(1.5)(6) + 6^3 = 243, \]
\[ \nu_4 = \mu_4 + 4\mu_3 \bar{x} + 6\mu_2 \bar{x}^2 + \bar{x}^4 = 6 + 4(0)(6) + 6(1.5)(6^2) + 6^4 = 1626. \]

Thus, the moments about the origin are:
\[ \nu_1 = 6, \quad \nu_2 = 37.5, \quad \nu_3 = 243, \quad \nu_4 = 1626. \]

Step 3: Coefficient of Skewness
The coefficient of skewness $\gamma_1$ is given by:
\[ \gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} = \frac{0}{(1.5)^{3/2}} = 0. \]

Thus, the distribution is symmetric.


Question 24
Determine the Skewness and Kurtosis for the following data:

Marks             10-20   20-30   30-40   40-50   50-60
No. of students   18      20      30      22      10

Solution
We are given the following frequency distribution:

Class Interval    10-20   20-30   30-40   40-50   50-60
No. of students   18      20      30      22      10

Step 1: Calculate the Mean. The mean x̄ is calculated as:
\[ \bar{x} = \frac{\sum f x}{\sum f} \]
We first calculate fx:

Class Interval   f     Mid Point (x)   fx
10–20            18    15              270
20–30            20    25              500
30–40            30    35              1050
40–50            22    45              990
50–60            10    55              550
Total            100                   3360

Thus, the mean is:
\[ \bar{x} = \frac{3360}{100} = 33.6 \]
Step 2: Calculate Moments

Class Interval   x    f     (x − x̄)   f(x − x̄)   f(x − x̄)²   f(x − x̄)³   f(x − x̄)⁴
10–20            15   18    −18.6     −334.8     6227.28     −115827     2154390
20–30            25   20    −8.6      −172       1479.20     −12721.1    109401.6
30–40            35   30    1.4       42         58.80       82.32       115.248
40–50            45   22    11.4      250.8      2859.12     32593.968   371571.2
50–60            55   10    21.4      214        4579.60     98003.44    2097274
Total                 100             0          15204       2131.2      4732752

\[ \mu_2 = \frac{\sum f (x - \bar{x})^2}{\sum f} = \frac{15204}{100} = 152.04 \]
\[ \mu_3 = \frac{\sum f (x - \bar{x})^3}{\sum f} = \frac{2131.2}{100} = 21.312 \]
\[ \mu_4 = \frac{\sum f (x - \bar{x})^4}{\sum f} = \frac{4732752}{100} = 47327.52 \]
Hence, Skewness
\[ \gamma_1 = \sqrt{\beta_1} = \frac{\mu_3}{\mu_2^{3/2}} = \frac{21.312}{(152.04)^{3/2}} \approx 0.011 \]
Kurtosis
\[ \gamma_2 = \beta_2 - 3 = \frac{\mu_4}{\mu_2^2} - 3 = \frac{47327.52}{152.04^2} - 3 \approx -0.953 \]
Question 25
Find the coefficient of correlation from the following points of observation
(1,3),(2,2),(3,5),(4,4),(5,6).

Solution:
To find the coefficient of correlation r for the given points of observation, we use the Pearson correlation coefficient formula:
\[ r = \frac{n \sum xy - \sum x \sum y}{\sqrt{[n \sum x^2 - (\sum x)^2][n \sum y^2 - (\sum y)^2]}} \]

Given Points of Observation:

(1, 3), (2, 2), (3, 5), (4, 4), (5, 6)

x   y   xy   x²   y²
1   3   3    1    9
2   2   4    4    4
3   5   15   9    25
4   4   16   16   16
5   6   30   25   36
∑x = 15   ∑y = 20   ∑xy = 68   ∑x² = 55   ∑y² = 90

n = 5 (number of points)
Now, substitute the values into the formula for the Pearson correlation coefficient:
\[ r = \frac{5 \times 68 - (15 \times 20)}{\sqrt{[5 \times 55 - (15)^2][5 \times 90 - (20)^2]}} \]
Simplifying each part:
\[ r = \frac{340 - 300}{\sqrt{[275 - 225][450 - 400]}} = \frac{40}{\sqrt{[50][50]}} = \frac{40}{\sqrt{2500}} = \frac{40}{50} = 0.8 \]
Answer: The coefficient of correlation r is 0.8. This indicates a strong positive correlation between X and Y.

Question 26
A random sample of 5 college students is selected and their grades in Mathematics and
Statistics are found to be:

Students 1 2 3 4 5
Mathematics 85 60 73 40 90
Statistics 93 75 65 50 80
Calculate the rank correlation coefficient.

Solution:
Spearman’s Rank Correlation Coefficient
The formula for the rank correlation coefficient ρ is:
\[ \rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)} \]

Where n is the number of data points (in this case, n = 5) and $d_i$ is the difference between the ranks of the corresponding values of Mathematics and Statistics for each student.
Arrange the X and Y series in descending order and assign ranks starting from 1:

X-Series:   90   85   73   60   40
Rank:       1    2    3    4    5
Y-Series:   93   80   75   65   50
Rank:       1    2    3    4    5

Calculate the differences in ranks and square them. Now, we calculate $d_i = \text{Rank}_X - \text{Rank}_Y$ and $d_i^2$:

Student   Rank X   Rank Y   d = Rank X − Rank Y   d²
1         2        1        1                     1
2         4        3        1                     1
3         3        4        −1                    1
4         5        5        0                     0
5         1        2        −1                    1
Total                                             ∑d² = 4

Now, substitute into the formula:
\[ \rho = 1 - \frac{6 \times 4}{5(5^2 - 1)} = 1 - \frac{24}{5(25 - 1)} = 1 - \frac{24}{5 \times 24} = 1 - \frac{24}{120} = 1 - 0.2 = 0.8 \]

The rank correlation coefficient is ρ = 0.8.
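The ranking and the formula $\rho = 1 - 6\sum d_i^2 / (n(n^2 - 1))$ can be sketched in Python as below. The helper names are illustrative, and the ranking function assumes there are no tied values (as in this data set); tied ranks would need the correction term from Question 17.

```python
def ranks_descending(values):
    """Rank 1 goes to the largest value. Assumes all values are distinct."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


def spearman_rho(x, y):
    """Spearman's rank correlation for untied data."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2
                    for rx, ry in zip(ranks_descending(x), ranks_descending(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


mathematics = [85, 60, 73, 40, 90]
statistics_marks = [93, 75, 65, 50, 80]
print(ranks_descending(mathematics))       # [2, 4, 3, 5, 1]
print(ranks_descending(statistics_marks))  # [1, 3, 4, 5, 2]
rho = spearman_rho(mathematics, statistics_marks)
print(round(rho, 3))  # 0.8
```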


Question 27
Calculate the coefficient of correlation for the following heights (in inches) of fathers (X)
and their sons (Y ):

Father’s Height (X)   65   66   67   67   68   69   70   72
Son’s Height (Y)      67   68   65   68   72   72   69   71

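Solution:
The Pearson correlation coefficient r is given by the formula:
\[ r = \frac{n \sum xy - \sum x \sum y}{\sqrt{[n \sum x^2 - (\sum x)^2][n \sum y^2 - (\sum y)^2]}} \]
We are given the data for 8 father-son pairs, so n = 8.

X    Y    X²     Y²     XY
65   67   4225   4489   4355
66   68   4356   4624   4488
67   65   4489   4225   4355
67   68   4489   4624   4556
68   72   4624   5184   4896
69   72   4761   5184   4968
70   69   4900   4761   4830
72   71   5184   5041   5112
∑x = 544   ∑y = 552   ∑x² = 37028   ∑y² = 38132   ∑xy = 37560

Now, substitute the values into the formula:
\[ r = \frac{8 \times 37560 - 544 \times 552}{\sqrt{[8 \times 37028 - (544)^2][8 \times 38132 - (552)^2]}} = \frac{192}{\sqrt{288 \times 352}} \approx 0.603 \]
SHORT-CUT METHOD

X    Y    U = X − 68   V = Y − 69   U²   V²   UV
65   67   −3           −2           9    4    6
66   68   −2           −1           4    1    2
67   65   −1           −4           1    16   4
67   68   −1           −1           1    1    1
68   72   0            3            0    9    0
69   72   1            3            1    9    3
70   69   2            0            4    0    0
72   71   4            2            16   4    8
∑X = 544   ∑Y = 552   ∑U = 0   ∑V = 0   ∑U² = 36   ∑V² = 44   ∑UV = 24

\[ \bar{U} = \frac{1}{n} \sum U = 0, \quad \bar{V} = \frac{1}{n} \sum V = 0 \]
\[ \operatorname{Cov}(U, V) = \frac{1}{n} \sum UV - \bar{U} \bar{V} = \frac{1}{8} \times 24 = 3 \]
\[ \sigma_U^2 = \frac{1}{n} \sum U^2 - \bar{U}^2 = \frac{1}{8} \times 36 = 4.5 \]
\[ \sigma_V^2 = \frac{1}{n} \sum V^2 - \bar{V}^2 = \frac{1}{8} \times 44 = 5.5 \]
Since the correlation coefficient is unaffected by a change of origin,
\[ r = \frac{\operatorname{Cov}(U, V)}{\sigma_U \sigma_V} = \frac{3}{\sqrt{4.5 \times 5.5}} \approx 0.603 \]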
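Both the direct method and the short-cut method can be verified with a small Python sketch of the raw-sums formula. The function name `pearson_r` is an illustrative choice.

```python
import math


def pearson_r(x, y):
    """Karl Pearson's coefficient computed from the raw sums."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    return (n * sxy - sx * sy) / math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))


fathers = [65, 66, 67, 67, 68, 69, 70, 72]
sons = [67, 68, 65, 68, 72, 72, 69, 71]
r = pearson_r(fathers, sons)
print(round(r, 3))  # 0.603

# Shifting the origin (U = X - 68, V = Y - 69) leaves r unchanged.
r_shifted = pearson_r([a - 68 for a in fathers], [b - 69 for b in sons])
print(round(r_shifted, 3))  # 0.603
```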

Question 28
Fit a parabolic curve of second degree to the following data:

x 0 1 2 3 4
y 1 1.8 1.3 2.5 6.3

Solution: The equation of the curve is $y = a + bx + cx^2$. The normal equations are:
\[ \sum y = n a + b \sum x + c \sum x^2 \]
\[ \sum xy = a \sum x + b \sum x^2 + c \sum x^3 \]
\[ \sum x^2 y = a \sum x^2 + b \sum x^3 + c \sum x^4 \]

x   y     x²   x³   x⁴    xy     x²y
0   1     0    0    0     0      0
1   1.8   1    1    1     1.8    1.8
2   1.3   4    8    16    2.6    5.2
3   2.5   9    27   81    7.5    22.5
4   6.3   16   64   256   25.2   100.8
∑x = 10   ∑y = 12.9   ∑x² = 30   ∑x³ = 100   ∑x⁴ = 354   ∑xy = 37.1   ∑x²y = 130.3

Using the normal equations:
\[ 12.9 = 5a + 10b + 30c, \]
\[ 37.1 = 10a + 30b + 100c, \]
\[ 130.3 = 30a + 100b + 354c. \]
Solving these equations, we get:

a = 1.42, b = −1.07, c = 0.55.

Thus the required equation of the second degree parabola is
\[ y = 1.42 - 1.07x + 0.55x^2 \]
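The three normal equations can be built and solved programmatically. The sketch below uses a small Gaussian elimination written for this illustration (a swapped-in solver; the original solution does not say how the system was solved). The helper names are hypothetical.

```python
def solve_linear(matrix, rhs):
    """Solve a small square linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [b] for row, b in zip(matrix, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (m[i][n] - known) / m[i][i]
    return solution


def fit_parabola(xs, ys):
    """Least-squares fit of y = a + b*x + c*x^2 via the normal equations."""
    n = len(xs)
    sx = lambda p: sum(x ** p for x in xs)                    # sum of x^p
    sxy = lambda p: sum(x ** p * y for x, y in zip(xs, ys))   # sum of x^p * y
    matrix = [[n, sx(1), sx(2)],
              [sx(1), sx(2), sx(3)],
              [sx(2), sx(3), sx(4)]]
    return solve_linear(matrix, [sum(ys), sxy(1), sxy(2)])


a, b, c = fit_parabola([0, 1, 2, 3, 4], [1, 1.8, 1.3, 2.5, 6.3])
print(round(a, 2), round(b, 2), round(c, 2))  # 1.42 -1.07 0.55
```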

Question 29
Use the method of least squares to find the curve $y = ab^x$ that best fits the following data:

X 2 3 4 5 6
Y 8.3 15.4 33.1 65.2 127.4

Solution: We assume the equation is of the form $y = ab^x$. Taking the common (base-10) logarithm of both sides:
\[ \log y = \log(ab^x) = \log a + x \log b \]
Let Y = log y, A = log a, and B = log b, so the equation becomes:
\[ Y = A + Bx \]
Now, we apply the method of least squares to the transformed equation.
The normal equations are:
\[ \sum Y = nA + B \sum x \]
\[ \sum xY = A \sum x + B \sum x^2 \]
We compute the required sums:

x   y       Y = log y   xY        x²
2   8.3     0.9191      1.8382    4
3   15.4    1.1875      3.5626    9
4   33.1    1.5198      6.0793    16
5   65.2    1.8142      9.0712    25
6   127.4   2.1052      12.6310   36
∑x = 20   ∑Y = 7.5458   ∑xY = 33.1823   ∑x² = 90

We solve the system of equations:
\[ 7.5458 = 5A + 20B \]
\[ 33.1823 = 20A + 90B \]
Solving this, we find A = 0.3095 and B = 0.2999. Thus,
\[ a = 10^A = 2.0395, \quad b = 10^B = 1.9948 \]
Therefore, the best-fitting curve is:
\[ y = 2.0395 \, (1.9948)^x \]
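The log-linearisation can be checked with a short Python sketch that fits $Y = A + Bx$ on $Y = \log_{10} y$ and transforms back. The function name is illustrative; because the code keeps full floating-point precision, its output agrees with the hand computation only to about two or three decimals.

```python
import math


def fit_exponential_base(xs, ys):
    """Fit y = a * b**x by least squares on log10(y) = log10(a) + x * log10(b)."""
    n = len(xs)
    logs = [math.log10(y) for y in ys]
    sx, sy = sum(xs), sum(logs)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * v for x, v in zip(xs, logs))
    slope = (n * sxy - sx * sy) / (n * sxx - sx ** 2)   # B = log10(b)
    intercept = (sy - slope * sx) / n                   # A = log10(a)
    return 10 ** intercept, 10 ** slope


a, b = fit_exponential_base([2, 3, 4, 5, 6], [8.3, 15.4, 33.1, 65.2, 127.4])
print(round(a, 2), round(b, 3))
```

Minimising the squared error in log y is not the same as minimising it in y itself, so this is a least-squares fit of the transformed model, as in the worked solution.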
Question 30
Use the method of least squares to find the curve $y = ab^x$ that best fits the following data:

x 2 3 4 5 6
y 144 172.8 207.4 248.8 298.5

Solution: We assume the equation is of the form $y = ab^x$. Taking the common (base-10) logarithm of both sides:
\[ \log y = \log(ab^x) = \log a + x \log b \]
Let Y = log y, A = log a, and B = log b, so the equation becomes:
\[ Y = A + Bx \]
Now, we apply the method of least squares to the transformed equation.
The normal equations are:
\[ \sum Y = nA + B \sum x \]
\[ \sum xY = A \sum x + B \sum x^2 \]

x   y       Y = log y   xY        x²
2   144     2.1584      4.3167    4
3   172.8   2.2375      6.7126    9
4   207.4   2.3168      9.2672    16
5   248.8   2.3959      11.9793   25
6   298.5   2.4749      14.8497   36
∑x = 20   ∑Y = 11.5835   ∑xY = 47.1255   ∑x² = 90

We solve the system of equations:
\[ 11.5835 = 5A + 20B \]
\[ 47.1255 = 20A + 90B \]
Solving this, we find A = 2.0001 and B = 0.0791.
Thus, $a = 10^A = 100.0230$ and $b = 10^B = 1.1999$.
Therefore, the best-fitting curve is approximately:
\[ y = 100.02 \, (1.2)^x \]

Problem 14:
Use the method of least squares to fit the curve $y = ax^2 + bx$ to the following data:

x 1 2 3 4 5 6 7 8
y 1 1.2 1.8 2.5 3.6 4.7 6.6 9.1

Solution:
The normal equations of $y = ax^2 + bx$ are:
\[ \sum x^2 y = a \sum x^4 + b \sum x^3 \qquad (1) \]
\[ \sum xy = a \sum x^3 + b \sum x^2 \qquad (2) \]

x   y     x²   x²y     x³    x⁴     xy
1   1     1    1       1     1      1
2   1.2   4    4.8     8     16     2.4
3   1.8   9    16.2    27    81     5.4
4   2.5   16   40      64    256    10
5   3.6   25   90      125   625    18
6   4.7   36   169.2   216   1296   28.2
7   6.6   49   323.4   343   2401   46.2
8   9.1   64   582.4   512   4096   72.8
∑x² = 204   ∑x²y = 1227   ∑x³ = 1296   ∑x⁴ = 8772   ∑xy = 184

The system of equations after substituting the values of the sums is:
\[ 1227 = 8772a + 1296b \]
\[ 184 = 1296a + 204b \]
We solve this system and find a = 0.107 and b = 0.217.
Thus, the best-fitting curve is:
\[ y = 0.217x + 0.107x^2 \]
Question 31
Find the exponential curve of the form $PV^{\gamma} = K$ for the following data using the method of least squares:

V   50    100   150   200
P   135   48    26    17

Solution: We assume the curve is of the form $PV^{\gamma} = K$. Taking the common (base-10) logarithm of both sides:
\[ \log K = \log(PV^{\gamma}) = \log P + \gamma \log V \]

Let Y = log P, A = log K, B = γ, and X = log V. The equation becomes:
\[ A = Y + BX \implies Y = A - BX \]

We apply the method of least squares to this linear equation. The normal equations are:
\[ \sum Y = nA - B \sum X \]
\[ \sum XY = A \sum X - B \sum X^2 \]

V     P     X = log V   Y = log P   XY      X²
50    135   1.698       2.130       3.616   2.883
100   48    2.000       1.681       3.362   4.000
150   26    2.176       1.414       3.077   4.735
200   17    2.301       1.230       2.830   5.295
∑X = 8.175   ∑Y = 6.455   ∑XY = 12.884   ∑X² = 16.911

We substitute these values into the normal equations:
\[ 6.455 = 4A - 8.175B \]
\[ 12.884 = 8.175A - 16.911B \]
Solving this system of equations, we get A = 4.713 and B = 1.516. Once we have A and B, we can compute $K = 10^A = 51641.6$ and γ = B = 1.516.
Thus, the fitted curve is
\[ PV^{1.516} = 51641.6 \]
Question 32
Using the method of least squares, fit the curve $y = c_0 x + \frac{c_1}{\sqrt{x}}$ to the following data:

x   0.2   0.3   0.5   1   2
y   16    14    11    6   3

Solution:
The equation of the curve is $y = c_0 x + \frac{c_1}{\sqrt{x}}$. The normal equations are:
\[ \sum xy = c_0 \sum x^2 + c_1 \sum \sqrt{x} \]
\[ \sum \frac{y}{\sqrt{x}} = c_0 \sum \sqrt{x} + c_1 \sum \frac{1}{x} \]

x     y    √x      x²     xy    y/√x     1/x
0.2   16   0.447   0.04   3.2   35.777   5
0.3   14   0.548   0.09   4.2   25.560   3.333
0.5   11   0.707   0.25   5.5   15.556   2
1     6    1       1      6     6        1
2     3    1.414   4      6     2.121    0.5
∑√x = 4.116   ∑x² = 5.38   ∑xy = 24.9   ∑(y/√x) = 85.015   ∑(1/x) = 11.833

Substituting the values:
\[ 85.015 = 11.833 \, c_1 + 4.116 \, c_0, \]
\[ 24.9 = 4.116 \, c_1 + 5.38 \, c_0. \]
Solving the equations gives:

c1 = 7.60, c0 = −1.18.

The fitted curve is:
\[ y = \frac{7.60}{\sqrt{x}} - 1.18x \]
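Any model of the form $y = c_0 f(x) + c_1 g(x)$ leads to two normal equations of the same shape, so one helper covers this question, Question 16 and the $y = ax^2 + bx$ problem. The sketch below solves the pair by Cramer's rule; the function name `fit_two_terms` and its signature are assumptions made for this illustration.

```python
import math


def fit_two_terms(xs, ys, f, g):
    """Least-squares fit of y = c0*f(x) + c1*g(x) via its two normal equations."""
    sff = sum(f(x) ** 2 for x in xs)
    sfg = sum(f(x) * g(x) for x in xs)
    sgg = sum(g(x) ** 2 for x in xs)
    sfy = sum(f(x) * y for x, y in zip(xs, ys))
    sgy = sum(g(x) * y for x, y in zip(xs, ys))
    # Normal equations: sff*c0 + sfg*c1 = sfy and sfg*c0 + sgg*c1 = sgy.
    det = sff * sgg - sfg ** 2
    c0 = (sfy * sgg - sfg * sgy) / det
    c1 = (sff * sgy - sfg * sfy) / det
    return c0, c1


# Question 32: f(x) = x and g(x) = 1/sqrt(x).
c0, c1 = fit_two_terms([0.2, 0.3, 0.5, 1, 2], [16, 14, 11, 6, 3],
                       lambda x: x, lambda x: 1 / math.sqrt(x))
print(round(c0, 2), round(c1, 2))  # -1.18 7.6
```

Calling the same helper with `f = lambda x: x ** 2` and `g = lambda x: x` reproduces the fit of $y = ax^2 + bx$.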
Question 33

Using the method of least squares, fit the curve $f(x) = a + bx + cx^2$ to the following data:

x 1 2 3 4 5
f (x) 1 1.2 1.8 2.5 3.6

Solution: The equation of the curve is $y = a + bx + cx^2$. The normal equations are:
\[ \sum y = n a + b \sum x + c \sum x^2 \]
\[ \sum xy = a \sum x + b \sum x^2 + c \sum x^3 \]
\[ \sum x^2 y = a \sum x^2 + b \sum x^3 + c \sum x^4 \]

x   y     x²   x³    x⁴    xy    x²y
1   1     1    1     1     1     1
2   1.2   4    8     16    2.4   4.8
3   1.8   9    27    81    5.4   16.2
4   2.5   16   64    256   10    40
5   3.6   25   125   625   18    90
∑x = 15   ∑y = 10.1   ∑x² = 55   ∑x³ = 225   ∑x⁴ = 979   ∑xy = 36.8   ∑x²y = 152

Using the normal equations:
\[ 10.1 = 5a + 15b + 55c, \]
\[ 36.8 = 15a + 55b + 225c, \]
\[ 152 = 55a + 225b + 979c. \]
Solving these equations, we get:

a = 1.02, b ≈ −0.164, c ≈ 0.136.

Thus the required equation of the second degree parabola is
\[ y = 1.02 - 0.164x + 0.136x^2 \]

Question 34
Using the method of least squares, fit a curve of the form:

\[ y = ae^{bx} \]

to the following data:

x 1 2 3 4 5
y 1 1.2 1.8 2.5 3.6

Solution
Taking the natural logarithm of both sides:
\[ y = ae^{bx} \implies \ln y = \ln a + bx \ln e = \ln a + bx \quad (\because \ln e = 1) \]

Let Y = ln y and A = ln a, so:
\[ Y = A + bx. \]
The normal equations are:
\[ \sum Y = nA + b \sum x \]
\[ \sum xY = A \sum x + b \sum x^2 \]

x   y     Y = ln y   x²   xY
1   1     0.000      1    0.000
2   1.2   0.182      4    0.365
3   1.8   0.588      9    1.763
4   2.5   0.916      16   3.665
5   3.6   1.280      25   6.404
∑x = 15   ∑Y = 2.967   ∑x² = 55   ∑xY = 12.197

Using these sums, the normal equations become:
\[ 5A + 15b = 2.967, \]
\[ 15A + 55b = 12.197. \]

Solve for A and b

Solving the equations, we get:

A = −0.3953, b = 0.3296.

Converting A to a:
\[ a = e^A = 0.6735. \]
Fitted Curve

The fitted curve is:
\[ y = 0.6735 \, e^{0.3296x}. \]
Question 35
If 4x − 5y + 33 = 0 and 20x − 9y = 107 are the two lines of regression of y on x and of x on y respectively, find the mean values of x and y, the coefficient of correlation, and the standard deviation of y if the variance of x is 9.

Solution:
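(i) Since both the lines of regression pass through the point $(\bar{X}, \bar{Y})$, we have
\[ 4\bar{X} - 5\bar{Y} = -33, \]
\[ 20\bar{X} - 9\bar{Y} = 107. \]

Solving, we get $\bar{X} = 13$, $\bar{Y} = 17$.

(ii) Let 4X − 5Y + 33 = 0 be the line of regression of Y on X and 20X − 9Y = 107 be the line of regression of X on Y.

These equations can be put in the form:
\[ Y = \frac{4}{5} X + \frac{33}{5}, \]
\[ X = \frac{9}{20} Y + \frac{107}{20}. \]
\[ \therefore \; b_{YX} = \text{Regression coefficient of Y on X} = \frac{4}{5} \]
\[ \text{and} \quad b_{XY} = \text{Regression coefficient of X on Y} = \frac{9}{20} \]
\[ \text{Hence} \quad r^2 = b_{YX} \cdot b_{XY} = \frac{4}{5} \cdot \frac{9}{20} = \frac{9}{25} \]
\[ \therefore \; r = \pm \frac{3}{5} = \pm 0.6 \]
But since both the regression coefficients are positive, we take

r = +0.6

(iii) We have
\[ b_{YX} = r \cdot \frac{\sigma_Y}{\sigma_X} \implies \frac{4}{5} = \frac{3}{5} \times \frac{\sigma_Y}{3} \]
Hence $\sigma_Y = 4$.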
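The three parts of this solution can be retraced in Python. The sketch solves the two line equations exactly with `fractions.Fraction` and Cramer's rule, then applies $r^2 = b_{YX} b_{XY}$ and $b_{YX} = r \sigma_Y / \sigma_X$. The helper name `intersection` is hypothetical.

```python
from fractions import Fraction
import math


def intersection(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 exactly by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    return Fraction(c1 * b2 - c2 * b1, det), Fraction(a1 * c2 - a2 * c1, det)


# 4x - 5y = -33 (y on x) and 20x - 9y = 107 (x on y), as in the solution above.
x_bar, y_bar = intersection(4, -5, -33, 20, -9, 107)

b_yx, b_xy = Fraction(4, 5), Fraction(9, 20)
r = math.sqrt(b_yx * b_xy)            # both coefficients are positive, so r > 0
sigma_x = 3                           # given: variance of x is 9
sigma_y = float(b_yx) * sigma_x / r   # from b_yx = r * sigma_y / sigma_x
print(x_bar, y_bar, round(r, 3), round(sigma_y, 3))
```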

Question 36

Problem Statement
In a partially destroyed laboratory record of an analysis of correlation data, the following
results are legible:
Variance of x: $\sigma_x^2 = 9$.
The regression equations are:

8x − 10y + 66 = 0 and 40x − 18y = 214.

Calculate the following:

(a) The mean values of x and y.

(b) The standard deviation of y.

(c) The coefficient of correlation between x and y.

Solution
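(i) Since both the lines of regression pass through the point $(\bar{X}, \bar{Y})$, we have
\[ 8\bar{X} - 10\bar{Y} = -66, \]
\[ 40\bar{X} - 18\bar{Y} = 214. \]
Solving, we get $\bar{X} = 13$, $\bar{Y} = 17$.
(ii) Let 8X − 10Y + 66 = 0 be the line of regression of Y on X and 40X − 18Y = 214 be the line of regression of X on Y.
These equations can be put in the form:
\[ Y = \frac{8}{10} X + \frac{66}{10}, \]
\[ X = \frac{18}{40} Y + \frac{214}{40}. \]
\[ \therefore \; b_{YX} = \text{Regression coefficient of Y on X} = \frac{8}{10} = \frac{4}{5} \]
\[ \text{and} \quad b_{XY} = \text{Regression coefficient of X on Y} = \frac{18}{40} = \frac{9}{20} \]
\[ \text{Hence} \quad r^2 = b_{YX} \cdot b_{XY} = \frac{4}{5} \cdot \frac{9}{20} = \frac{9}{25} \]
\[ \therefore \; r = \pm \frac{3}{5} = \pm 0.6 \]
But since both the regression coefficients are positive, we take

r = +0.6

(iii) We have
\[ b_{YX} = r \cdot \frac{\sigma_Y}{\sigma_X} \implies \frac{4}{5} = \frac{3}{5} \times \frac{\sigma_Y}{3} \]
Hence $\sigma_Y = 4$.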

Question 37

Two lines of regression are given by 5x − 2y = 52 and 3x − 8y = 12, and $\sigma_x^2 = 12$.


Calculate

(a) The mean value of x and y,

(b) The variance of y,

(c) The coefficient of correlation between x and y.

Solution:
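Rearrange the Regression Equations
If $5x - 2y = 52$ were the line of y on x and $3x - 8y = 12$ the line of x on y, we would get $b_{yx} = \frac{5}{2}$ and $b_{xy} = \frac{8}{3}$, so that $r^2 = b_{yx} \cdot b_{xy} = \frac{20}{3} > 1$, which is impossible. Hence $3x - 8y = 12$ is the line of regression of y on x and $5x - 2y = 52$ is the line of regression of x on y:
\[ 3x - 8y = 12 \implies y = \frac{3}{8}x - \frac{12}{8} \]
and
\[ 5x - 2y = 52 \implies x = \frac{2}{5}y + \frac{52}{5}. \]
Calculate Mean Values of x and y
Both regression lines pass through the point $(\bar{x}, \bar{y})$, so we substitute $x = \bar{x}$ and $y = \bar{y}$ in the regression equations and solve:
\[ 5\bar{x} - 2\bar{y} = 52, \qquad 3\bar{x} - 8\bar{y} = 12. \]
Multiplying the first equation by 4 and subtracting the second gives $17\bar{x} = 196$. Thus, the mean values are:
\[ \bar{x} = \frac{196}{17} \approx 11.53, \qquad \bar{y} = \frac{48}{17} \approx 2.82. \]
Calculate the Coefficient of Correlation
The formula for the correlation coefficient r is given by:
\[ r = \pm\sqrt{b_{xy} \times b_{yx}}, \]
where $b_{xy}$ is the regression coefficient of x on y, $b_{yx}$ is the regression coefficient of y on x, and r takes the common sign of the two coefficients.
From the regression equations: $b_{xy} = \frac{2}{5}$, $b_{yx} = \frac{3}{8}$.
Thus:
\[ r = \sqrt{\frac{2}{5} \times \frac{3}{8}} = \sqrt{\frac{3}{20}} \approx 0.387. \]
So, the coefficient of correlation is approximately $r \approx 0.387$.
Calculate the Variance of y
We are given that the variance of x is $\sigma_x^2 = 12$. The variance of y can be calculated using the formula:
\[ b_{yx} = r\,\frac{\sigma_y}{\sigma_x}. \]
Substitute the known values:
\[ \sigma_y = \frac{b_{yx}\,\sigma_x}{r} = \frac{0.375 \times \sqrt{12}}{0.3873} \approx 3.354. \]
Thus, the variance of y is $\sigma_y^2 \approx 11.25$. Equivalently, $\sigma_y^2 = \sigma_x^2 \cdot \frac{b_{yx}}{b_{xy}} = 12 \times \frac{15}{16} = 11.25$.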
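
Because $r^2 = b_{yx} \cdot b_{xy}$ can never exceed 1, only one reading of the two equations in Question 37 as "y on x" and "x on y" is admissible. The sketch below, an illustration rather than part of the original solution, tries both readings and keeps the valid one; the function name `coefficients` is hypothetical.

```python
import math

def coefficients(line_yx, line_xy):
    """Return (b_yx, b_xy) when line_yx is read as y on x and line_xy as x on y.

    Each line is a tuple (a, b, c) standing for a*x + b*y = c.
    """
    a1, b1, _ = line_yx
    a2, b2, _ = line_xy
    return -a1 / b1, -b2 / a2   # slope dy/dx of the first, dx/dy of the second

line1 = (5, -2, 52)   # 5x - 2y = 52
line2 = (3, -8, 12)   # 3x - 8y = 12

for yx, xy in ((line1, line2), (line2, line1)):
    b_yx, b_xy = coefficients(yx, xy)
    if 0 <= b_yx * b_xy <= 1:
        break   # this reading gives a valid r^2
else:
    raise ValueError("neither reading gives 0 <= r^2 <= 1")

r = math.copysign(math.sqrt(b_yx * b_xy), b_yx)
var_x = 12
var_y = var_x * b_yx / b_xy   # since b_yx / b_xy = var_y / var_x

print(b_yx, b_xy, round(r, 4), round(var_y, 2))
```

The ratio $b_{yx}/b_{xy} = \sigma_y^2/\sigma_x^2$ is used for the variance because it avoids dividing by a rounded value of r.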
Question 38
The following table gives the age (x) in years of cars and annual maintenance cost (y) in
hundred rupees.

x 1 3 5 7 9
y 15 18 21 23 22
Calculate the maintenance cost for a 4-year-old car after finding the regression equation.

Solution
The regression equation is of the form:

y = a + bx

x      y      xy     x²     y²
1      15     15     1      225
3      18     54     9      324
5      21     105    25     441
7      23     161    49     529
9      22     198    81     484
∑x = 25    ∑y = 99    ∑xy = 533    ∑x² = 165    ∑y² = 2003

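Calculate x̄ and ȳ
\[ \bar{x} = \frac{\sum x}{n} = \frac{25}{5} = 5, \qquad \bar{y} = \frac{\sum y}{n} = \frac{99}{5} = 19.8 \]

Calculate byx and bxy
\[ b_{yx} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = \frac{5(533) - (25)(99)}{5(165) - (25)^2} = \frac{190}{200} = 0.95 \]
\[ b_{xy} = \frac{n\sum xy - \sum x \sum y}{n\sum y^2 - (\sum y)^2} = \frac{5(533) - (25)(99)}{5(2003) - (99)^2} = \frac{190}{214} \approx 0.888 \]

Regression Equation y on x
\[ y - \bar{y} = b_{yx}(x - \bar{x}) \implies y - 19.8 = 0.95(x - 5) \implies y = 0.95x + 15.05 \]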

Predict Maintenance Cost for a 4-Year-Old Car (x = 4)


y = 15.05 + 0.95(4) = 18.85 (hundred rupees).
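
The computation of Question 38 can be sketched in a few lines of Python. This is an illustrative check only; the function name `fit_line` is hypothetical and uses the same sum-based formula for $b_{yx}$ as the solution.

```python
def fit_line(xs, ys):
    """Least-squares line y = a + b*x computed from the raw sums."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    a = sy / n - b * sx / n   # the line passes through (x_bar, y_bar)
    return a, b

age = [1, 3, 5, 7, 9]          # x, in years
cost = [15, 18, 21, 23, 22]    # y, in hundred rupees

a, b = fit_line(age, cost)
predicted = a + b * 4          # maintenance cost of a 4-year-old car

print(round(a, 2), round(b, 2), round(predicted, 2))
```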
Question 39
From the following data, determine the equations of the line of regression of y on x and x on
y:

x 6 2 10 4 8
y 9 11 5 8 7

Solution
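The regression equation of y on x is:

\[ y - \bar{y} = b_{yx}(x - \bar{x}), \qquad \text{where } b_{yx} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2}. \]

The regression equation of x on y is:

\[ x - \bar{x} = b_{xy}(y - \bar{y}), \qquad \text{where } b_{xy} = \frac{n\sum xy - \sum x \sum y}{n\sum y^2 - (\sum y)^2}. \]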

Calculate Required Sums
x      y      xy     x²     y²
6      9      54     36     81
2      11     22     4      121
10     5      50     100    25
4      8      32     16     64
8      7      56     64     49
∑x = 30    ∑y = 40    ∑xy = 214    ∑x² = 220    ∑y² = 340

Calculate Means
\[ \bar{x} = \frac{\sum x}{n} = \frac{30}{5} = 6.0, \qquad \bar{y} = \frac{\sum y}{n} = \frac{40}{5} = 8.0 \]

Calculate Regression Coefficients

\[ b_{yx} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = \frac{5 \times 214 - 30 \times 40}{5 \times 220 - 30^2} = \frac{-130}{200} = -0.65 \]
\[ b_{xy} = \frac{n\sum xy - \sum x \sum y}{n\sum y^2 - (\sum y)^2} = \frac{5 \times 214 - 30 \times 40}{5 \times 340 - 40^2} = \frac{-130}{100} = -1.30 \]

Write Regression Equations

Regression of y on x:
\[ y - 8.00 = -0.65(x - 6.00) \implies y = -0.65x + 11.9 \]
Regression of x on y:
\[ x - 6.00 = -1.30(y - 8.00) \implies x = -1.3y + 16.4 \]
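
Both regression coefficients of Question 39 can also be obtained from deviations about the means, which is algebraically equivalent to the sum-based formulas. The following Python sketch is an illustration under that equivalence; the function name `regression_coefficients` is hypothetical.

```python
import math

def regression_coefficients(xs, ys):
    """Return (x_bar, y_bar, b_yx, b_xy) using deviations from the means."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return x_bar, y_bar, sxy / sxx, sxy / syy

x = [6, 2, 10, 4, 8]
y = [9, 11, 5, 8, 7]
x_bar, y_bar, b_yx, b_xy = regression_coefficients(x, y)

# Intercepts of y = c1 + b_yx * x and x = c2 + b_xy * y.
c1 = y_bar - b_yx * x_bar
c2 = x_bar - b_xy * y_bar

# r has the common sign of the two coefficients and r^2 = b_yx * b_xy.
r = math.copysign(math.sqrt(b_yx * b_xy), b_yx)

print(b_yx, b_xy, c1, c2, round(r, 4))
```

Both slopes come out negative here, so the two fitted lines have negative coefficients of x and y respectively, and r is negative as well.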
Question 40
Fit a parabolic curve of regression of y on x to the following data:

x 1.0 1.5 2.0 2.5 3.0 3.5 4.0


y 1.1 1.3 1.6 2.0 2.7 3.4 4.1

Solution
The parabolic regression curve is of the form:

\[ y = a + bx + cx^2 \]

The normal equations for fitting a parabola are:

\[ \sum y = na + b\sum x + c\sum x^2 \]
\[ \sum xy = a\sum x + b\sum x^2 + c\sum x^3 \]
\[ \sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4 \]

x       y       xy       x²       x²y       x³        x⁴
1       1.1     1.1      1        1.1       1         1
1.5     1.3     1.95     2.25     2.925     3.375     5.0625
2       1.6     3.2      4        6.4       8         16
2.5     2       5        6.25     12.5      15.625    39.0625
3       2.7     8.1      9        24.3      27        81
3.5     3.4     11.9     12.25    41.65     42.875    150.0625
4       4.1     16.4     16       65.6      64        256
∑x = 17.5    ∑y = 16.2    ∑xy = 47.65    ∑x² = 50.75    ∑x²y = 154.475    ∑x³ = 161.875    ∑x⁴ = 548.1875

Substitute Sums and Solve Normal Equations

\[ 16.2 = 7a + 17.5b + 50.75c \]
\[ 47.65 = 17.5a + 50.75b + 161.875c \]
\[ 154.475 = 50.75a + 161.875b + 548.1875c \]
Solving these simultaneous equations gives $a \approx 1.036$, $b \approx -0.193$, and $c \approx 0.243$.

Final Parabolic Equation

Substituting these values into the equation:

\[ y = 1.036 - 0.193x + 0.243x^2 \]

As a check, $x = 4$ gives $y \approx 4.15$, close to the observed value 4.1.
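
The normal equations of Question 40 form a 3 × 3 linear system, so the coefficients can be checked by machine. The sketch below is a minimal illustration using only the Python standard library; the helper `solve` (a small Gauss-Jordan elimination) is hypothetical and not part of the original solution.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [v] for row, v in zip(matrix, rhs)]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [u - factor * v for u, v in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

xs = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [1.1, 1.3, 1.6, 2.0, 2.7, 3.4, 4.1]

# s[k] = sum of x^k (so s[0] = n); t[k] = sum of x^k * y.
s = [sum(x ** k for x in xs) for k in range(5)]
t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]

normal = [[s[0], s[1], s[2]],
          [s[1], s[2], s[3]],
          [s[2], s[3], s[4]]]
a, b, c = solve(normal, t)

print(round(a, 3), round(b, 3), round(c, 3))
print(round(a + b * 4 + c * 16, 3))   # fitted value at x = 4, observed 4.1
```

Evaluating the fitted parabola at a tabulated x, as in the last line, is a quick sanity check that a and c have not been interchanged.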

Question 41
Find the multiple regression equation of X1 on X2 and X3 from the data given below:

X1 3 5 6 8 12 10
X2 10 10 5 7 5 2
X3 20 25 15 16 15 2

Solution
The multiple regression equation is given by:

\[ X_1 = a + bX_2 + cX_3 \]

To determine a, b, and c, we use the normal equations:

\[ \sum X_1 = na + b\sum X_2 + c\sum X_3 \]
\[ \sum X_1 X_2 = a\sum X_2 + b\sum X_2^2 + c\sum X_2 X_3 \]
\[ \sum X_1 X_3 = a\sum X_3 + b\sum X_2 X_3 + c\sum X_3^2 \]

X1     X2     X3     X1X2    X2²     X2X3    X1X3    X3²
3      10     20     30      100     200     60      400
5      10     25     50      100     250     125     625
6      5      15     30      25      75      90      225
8      7      16     56      49      112     128     256
12     5      15     60      25      75      180     225
10     2      2      20      4       4       20      4
∑X1 = 44    ∑X2 = 39    ∑X3 = 93    ∑X1X2 = 246    ∑X2² = 303    ∑X2X3 = 716    ∑X1X3 = 603    ∑X3² = 1735

Solve Normal Equations

Substitute the calculated sums into the normal equations and solve for a, b, and c.

\[ 44 = 6a + 39b + 93c \]
\[ 246 = 39a + 303b + 716c \]
\[ 603 = 93a + 716b + 1735c \]

Write Final Equation

Solving gives $a \approx 12.361$, $b \approx -1.399$, and $c \approx 0.262$. Substituting these values into the regression equation:

\[ X_1 = 12.361 - 1.399X_2 + 0.262X_3 \]
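
The coefficients of Question 41 can be checked without solving the full 3 × 3 system: measuring every variable from its mean eliminates a and leaves two equations in b and c. The following Python sketch uses this centred form, which is an equivalent reformulation and not the method written out in the solution; the helper names `mean` and `s` are hypothetical.

```python
x1 = [3, 5, 6, 8, 12, 10]
x2 = [10, 10, 5, 7, 5, 2]
x3 = [20, 25, 15, 16, 15, 2]

def mean(v):
    return sum(v) / len(v)

def s(u, v):
    """Sum of products of deviations of u and v from their means."""
    mu, mv = mean(u), mean(v)
    return sum((p - mu) * (q - mv) for p, q in zip(u, v))

# Centring removes the intercept and leaves:
#   s22*b + s23*c = s12
#   s23*b + s33*c = s13
s22, s23, s33 = s(x2, x2), s(x2, x3), s(x3, x3)
s12, s13 = s(x1, x2), s(x1, x3)

det = s22 * s33 - s23 ** 2
b = (s12 * s33 - s13 * s23) / det
c = (s22 * s13 - s23 * s12) / det
a = mean(x1) - b * mean(x2) - c * mean(x3)   # plane passes through the means

print(round(a, 3), round(b, 3), round(c, 3))
```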

Question 42
For the data given below, determine the lines of regression:

x 2 4 6 8 10
y 5 7 9 8 11

Solution
x      y      xy     x²     y²
2      5      10     4      25
4      7      28     16     49
6      9      54     36     81
8      8      64     64     64
10     11     110    100    121
∑x = 30    ∑y = 40    ∑xy = 266    ∑x² = 220    ∑y² = 340

Calculate Means
\[ \bar{x} = \frac{\sum x}{n} = \frac{30}{5} = 6.0, \qquad \bar{y} = \frac{\sum y}{n} = \frac{40}{5} = 8.0 \]

Regression of y on x:
The regression equation of y on x is:

\[ y - \bar{y} = b_{yx}(x - \bar{x}), \]

where:
\[ b_{yx} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = \frac{5 \times 266 - 30 \times 40}{5 \times 220 - 30^2} = \frac{130}{200} = 0.65 \]

Regression of x on y:
The regression equation of x on y is:

\[ x - \bar{x} = b_{xy}(y - \bar{y}), \]

where:
\[ b_{xy} = \frac{n\sum xy - \sum x \sum y}{n\sum y^2 - (\sum y)^2} = \frac{5 \times 266 - 30 \times 40}{5 \times 340 - (40)^2} = \frac{130}{100} = 1.3 \]

Write the Regression Equations

1. Regression of y on x:
\[ y - 8 = 0.65(x - 6) = 0.65x - 3.9 \implies y = 0.65x + 4.1 \]
2. Regression of x on y:
\[ x - 6 = 1.3(y - 8) = 1.3y - 10.4 \implies x = 1.3y - 4.4 \]
Question 43
If 3x + 2y = 26 and 6x + y = 31 are the two lines of regression, find (i) the mean values of x and y, (ii) the coefficient of correlation between x and y, and (iii) the variance of y if the variance of x is 9. (2024-25) 7 Marks

Solution:
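(i) Since both the lines of regression pass through the point $(\bar{X}, \bar{Y})$, we have
\[ 3\bar{X} + 2\bar{Y} = 26, \qquad 6\bar{X} + \bar{Y} = 31. \]
Solving, we get $\bar{X} = 4$, $\bar{Y} = 7$.
(ii) Let $3X + 2Y = 26$ be the line of regression of Y on X and $6X + Y = 31$ be the line of regression of X on Y. These equations can be put in the form:
\[ Y = 13 - \frac{3}{2}X, \qquad X = \frac{31}{6} - \frac{1}{6}Y. \]
∴ $b_{YX}$ = regression coefficient of Y on X $= -\frac{3}{2}$
and $b_{XY}$ = regression coefficient of X on Y $= -\frac{1}{6}$.
Hence
\[ r^2 = b_{YX} \cdot b_{XY} = \left(-\frac{3}{2}\right)\left(-\frac{1}{6}\right) = \frac{1}{4}, \qquad \therefore\ r = \pm\frac{1}{2} = \pm 0.5. \]
(The opposite assignment would give $b_{YX} = -6$ and $b_{XY} = -\frac{2}{3}$, whose product 4 exceeds 1, so it is not admissible.)
But since both the regression coefficients are negative, we take
\[ r = -0.5. \]
(iii) We have
\[ b_{YX} = r \cdot \frac{\sigma_Y}{\sigma_X} \implies -\frac{3}{2} = -\frac{1}{2} \times \frac{\sigma_Y}{3} \qquad (\because \sigma_X^2 = 9 \implies \sigma_X = 3). \]
Hence $\sigma_Y = 9$, so the variance of y is $\sigma_Y^2 = 81$.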

Question 44

Fit the curve $y = ae^{bx}$ to the following data:

x 2 4 6 8 10
y 4.077 11.084 30.128 81.897 222.62

(2024-25) 7 marks

Taking the natural logarithm of both sides:

\[ y = ae^{bx} \implies \ln y = \ln a + bx \ln e = \ln a + bx \qquad (\because \ln e = 1) \]

Let $Y = \ln y$ and $A = \ln a$, so:
\[ Y = A + bx. \]
The normal equations are:

\[ \sum Y = nA + b\sum x \]
\[ \sum xY = A\sum x + b\sum x^2 \]

x      y          Y = ln y    x²     xY
2      4.077      1.4054      4      2.8107
4      11.084     2.4055      16     9.6220
6      30.128     3.4055      36     20.4327
8      81.897     4.4055      64     35.2437
10     222.62     5.4055      100    54.0547
∑x = 30    ∑Y = 17.0272    ∑x² = 220    ∑xY = 122.1638
Substituting these sums, the normal equations become:

\[ 5A + 30b = 17.0272, \]
\[ 30A + 220b = 122.1638. \]

Solve for A and b

Solving the equations, we get:

\[ A \approx 0.405, \qquad b \approx 0.50. \]

Hence
\[ a = e^{A} \approx 1.50. \]

The required curve is:
\[ y = 1.5e^{0.5x}. \]
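
The log-linearisation used in Question 44 can be sketched in Python with the standard `math` module. This is an illustrative check, not part of the original solution; the variable names are chosen here for readability.

```python
import math

xs = [2, 4, 6, 8, 10]
ys = [4.077, 11.084, 30.128, 81.897, 222.62]

# Transform y = a * exp(b * x) into the straight line Y = A + b * x, Y = ln y.
Y = [math.log(y) for y in ys]

n = len(xs)
sx, sY = sum(xs), sum(Y)
sxx = sum(x * x for x in xs)
sxY = sum(x * v for x, v in zip(xs, Y))

# Solve the two normal equations for the slope b and the intercept A = ln a.
b = (n * sxY - sx * sY) / (n * sxx - sx ** 2)
A = (sY - b * sx) / n
a = math.exp(A)

print(round(A, 3), round(a, 2), round(b, 2))
```

Note that this fit minimises the squared error in ln y rather than in y itself, which is the usual convention when an exponential curve is fitted through the linearised normal equations.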
Question 45
Calculate the first four moments about the mean, and also the skewness and kurtosis, for the following distribution:
Marks 0 − 10 10 − 20 20 − 30 30 − 40 40 − 50 50 − 60 60 − 70
No of Students 1 6 10 15 11 7 10

(2024-25) 7 marks

Solution
We are given the following frequency distribution:

Marks 0 − 10 10 − 20 20 − 30 30 − 40 40 − 50 50 − 60 60 − 70
No of Students 1 6 10 15 11 7 10

The mean x̄ is calculated as:

\[ \bar{x} = \frac{\sum fx}{\sum f} \]

We first calculate fx:
Class Interval    f      Mid Point (x)    fx
0 − 10            1      5                5
10 − 20           6      15               90
20 − 30           10     25               250
30 − 40           15     35               525
40 − 50           11     45               495
50 − 60           7      55               385
60 − 70           10     65               650
Total             60                      2400
Thus, the mean is:
\[ \bar{x} = \frac{2400}{60} = 40 \]
Calculate Moments

Class Interval    x      f      (x − x̄)    f(x − x̄)    f(x − x̄)²    f(x − x̄)³    f(x − x̄)⁴
0 − 10            5      1      −35        −35         1225         −42875       1500625
10 − 20           15     6      −25        −150        3750         −93750       2343750
20 − 30           25     10     −15        −150        2250         −33750       506250
30 − 40           35     15     −5         −75         375          −1875        9375
40 − 50           45     11     5          55          275          1375         6875
50 − 60           55     7      15         105         1575         23625        354375
60 − 70           65     10     25         250         6250         156250       3906250
Total                    60                0           15700        9000         8627500

\[ \mu_1 = \frac{\sum f(x - \bar{x})}{\sum f} = \frac{0}{60} = 0 \]
\[ \mu_2 = \frac{\sum f(x - \bar{x})^2}{\sum f} = \frac{15700}{60} = 261.67 \]
\[ \mu_3 = \frac{\sum f(x - \bar{x})^3}{\sum f} = \frac{9000}{60} = 150 \]
\[ \mu_4 = \frac{\sum f(x - \bar{x})^4}{\sum f} = \frac{8627500}{60} = 143791.67 \]
Hence, Skewness
\[ \gamma_1 = \sqrt{\beta_1} = \frac{\mu_3}{\mu_2^{3/2}} = \frac{150}{(261.67)^{3/2}} = 0.035 \]
Kurtosis
\[ \gamma_2 = \beta_2 - 3 = \frac{\mu_4}{\mu_2^2} - 3 = \frac{143791.67}{(261.67)^2} - 3 = -0.90 \]
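
The central moments of the grouped data in Question 45 can be checked with a short Python sketch. It is an illustration only; the helper name `central_moment` is hypothetical, and the class midpoints are taken as representative values exactly as in the table.

```python
mid = [5, 15, 25, 35, 45, 55, 65]   # class midpoints x
f = [1, 6, 10, 15, 11, 7, 10]       # frequencies

N = sum(f)
mean = sum(fi * xi for fi, xi in zip(f, mid)) / N

def central_moment(k):
    """k-th moment about the mean for the grouped data."""
    return sum(fi * (xi - mean) ** k for fi, xi in zip(f, mid)) / N

mu1, mu2, mu3, mu4 = (central_moment(k) for k in range(1, 5))

gamma1 = mu3 / mu2 ** 1.5      # skewness, sqrt(beta_1) with the sign of mu3
gamma2 = mu4 / mu2 ** 2 - 3    # excess kurtosis, beta_2 - 3

print(mean, mu1, round(mu2, 2), mu3, round(mu4, 2))
print(round(gamma1, 3), round(gamma2, 2))
```

A small positive skewness and a negative excess kurtosis indicate a nearly symmetric, slightly platykurtic distribution.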
