
Midterm Exam 1 Correction

The document consists of exercises related to statistics, including true/false assertions about statistical concepts, data analysis of dog weights, and correlation calculations between two variables. It covers topics such as median, quartiles, standard deviation, and Spearman correlation coefficient. The exercises require the completion of statistical tables, graphical representations, and interpretations of results.

Name:                    Group:

Exercise 1. Among these assertions, specify which ones are true and which ones are false. (20 × 0.25)

1. The median is the score at the 50th percentile. (True)
2. The range is a frequently used measure of central tendency. (False)
3. With ordinal data, the appropriate correlation to use is the Spearman correlation. (True)
4. The STD of a sample can be -1. (False)
5. Increasing all values of a sample by a will increase the mean by a. (True)
6. Increasing all values of a sample by b will increase the STD by b. (False)
7. The mean is equal to 0 for a symmetric histogram. (False)
8. If a value is higher than the mean then it is an outlier. (False)
9. The median is always higher than the mean. (False)
10. The sample mean is always higher than the population mean. (False)
11. $\rho(X, Y)$ is a unitless number between -1 and 1. (True)
12. $\forall a, b, x_0, y_0 \in \mathbb{R}: \mathrm{Cov}(aX + x_0, bY + y_0) < ab\,\mathrm{Cov}(X, Y)$. (False)
13. A parameter is a numerical measure that describes a variable of a population. (True)
14. A sample is a subset of the people or objects in a population. (True)
15. Skewed distributions usually show the familiar bell shape. (False)
16. Descriptive statistics are used to summarize and display the data in a meaningful way. (True)
17. A correlation of 1 or -1 means that the two variables are perfectly linearly related. (True)
18. Correlation implies causation. (False)
19. The interquartile range is the difference between the first and second quartiles. (False)
20. A scatter plot can be used to show the correlation between two continuous variables. (True)

Exercise 2. The table below gives data on the weights (in kilograms) of a sample of 35 dogs in a park:

Class      Frequency   Rel. Frequency   Cum. Frequency   Cum. Rel. Frequency
[5; 10[        4            0.11               4                0.11
[10; 15[       6            0.17              10                0.29
[15; 20[       8            0.23              18                0.51
[20; 25[       5            0.14              23                0.66
[25; 30[       4            0.11              27                0.77
[30; 35[       3            0.09              30                0.86
[35; 40[       5            0.14              35                1
(1)

1. Complete the statistical table above.
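
As an illustration, the columns of the table can be recomputed from the class frequencies alone. This is a minimal Python sketch, not part of the original correction; the variable names are illustrative. The cumulative relative frequencies are computed from the exact cumulative counts, so the last digit may differ from what one gets by summing already rounded relative frequencies.

```python
from itertools import accumulate

# Class frequencies for [5;10[, [10;15[, ..., [35;40[ (taken from the table).
freqs = [4, 6, 8, 5, 4, 3, 5]
n = sum(freqs)                      # sample size

rel = [f / n for f in freqs]        # relative frequencies
cum = list(accumulate(freqs))       # cumulative frequencies
cum_rel = [c / n for c in cum]      # cumulative relative frequencies

# Print one row per class, rounded to two decimals.
for f, r, c, cr in zip(freqs, rel, cum, cum_rel):
    print(f"{f:3d} {r:6.2f} {c:4d} {cr:6.2f}")
```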

2. Where is the median weight located? Give its value.


The median weight $\tilde{X} \in [15; 20[$ and is given by

$$\tilde{X} = L_1 + \frac{\frac{n}{2} - Cf_b}{f_m}\, w = 15 + \frac{\frac{35}{2} - 10}{8}\,(20 - 15) = 19.688$$
(0.75)

3. Give Q1, Q3 and the interquartile range.
The first quartile $Q_1$ is located in the class [10; 15[ and the third quartile $Q_3$ is in [25; 30[. They are given by

$$Q_1 = L_1 + \frac{\frac{n}{4} - Cf_{q_1}}{f_{q_1}}\, w = 10 + \frac{\frac{35}{4} - 4}{6}\,(15 - 10) = 13.958$$

and

$$Q_3 = L_1 + \frac{\frac{3n}{4} - Cf_{q_3}}{f_{q_3}}\, w = 25 + \frac{\frac{105}{4} - 23}{4}\,(30 - 25) = 29.063.$$
(1.5)
The interquartile range is $IQR = Q_3 - Q_1 = 29.063 - 13.958 = 15.105$.
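
The interpolation formula used for the median and for the quartiles has the same shape, so it can be sketched as one function. The following Python sketch is illustrative only; `grouped_quantile` is a hypothetical helper name, and it assumes equal class widths and non-empty classes.

```python
def grouped_quantile(p, lowers, width, freqs):
    """Quantile of order p (0 < p <= 1) for grouped data,
    by linear interpolation inside the class that contains it."""
    n = sum(freqs)
    target = p * n                  # position of the quantile (n/2, n/4, 3n/4, ...)
    cum_before = 0                  # cumulative frequency before the current class
    for lower, f in zip(lowers, freqs):
        if cum_before + f >= target:
            return lower + (target - cum_before) / f * width
        cum_before += f
    raise ValueError("p must be in (0, 1]")

lowers = [5, 10, 15, 20, 25, 30, 35]   # lower class limits
freqs = [4, 6, 8, 5, 4, 3, 5]          # class frequencies

median = grouped_quantile(0.50, lowers, 5, freqs)
q1 = grouped_quantile(0.25, lowers, 5, freqs)
q3 = grouped_quantile(0.75, lowers, 5, freqs)
iqr = q3 - q1
print(median, q1, q3, iqr)
```

Without intermediate rounding the interquartile range comes out as about 15.104; the value 15.105 is obtained when the rounded quartiles are subtracted.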
4. Plot the histogram and the ogive of the above distribution, then determine the mode and the median of the series graphically.

[Figure: histogram of the distribution] (1)

[Figure: ogive (cumulative frequency curve)] (1)

5. Give the modal class, then calculate the mode value.

The highest frequency is that of the third class [15; 20[, which therefore contains the mode $\hat{X}$. It is given by

$$\hat{X} = L_{mo} + \frac{\Delta_1}{\Delta_1 + \Delta_2}\, w = 15 + \frac{(8 - 6)}{(8 - 6) + (8 - 5)}\,(20 - 15) = 17.0$$
(0.75)
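
The mode computation can be sketched in Python as follows. `grouped_mode` is a hypothetical helper, not part of the original correction; it assumes equal class widths and a single modal class whose frequency is strictly larger than at least one neighbour (otherwise the denominator would be zero).

```python
def grouped_mode(lowers, width, freqs):
    """Mode of grouped data by interpolation inside the modal class."""
    k = max(range(len(freqs)), key=freqs.__getitem__)   # index of the modal class
    prev_f = freqs[k - 1] if k > 0 else 0                # frequency of the class before
    next_f = freqs[k + 1] if k < len(freqs) - 1 else 0   # frequency of the class after
    d1 = freqs[k] - prev_f                               # Delta_1
    d2 = freqs[k] - next_f                               # Delta_2
    return lowers[k] + d1 / (d1 + d2) * width

lowers = [5, 10, 15, 20, 25, 30, 35]
freqs = [4, 6, 8, 5, 4, 3, 5]
mode = grouped_mode(lowers, 5, freqs)
print(mode)
```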

6. Calculate the mean, variance, standard deviation and coefficient of variation.
Since the data are grouped, we need to compute the midpoint $x_i$ of each class. With $N = 7$ classes, the formulas to be used are:

$$\mu = \bar{X} = \frac{\sum_{i=1}^{7} f_i x_i}{\sum_{i=1}^{7} f_i} = \frac{752.5}{35} = 21.5$$
(0.25)

$$\sigma^2 = Var(X) = \frac{1}{\sum_{i=1}^{N} f_i - 1} \sum_{i=1}^{N} f_i (x_i - \mu)^2 = \frac{3190}{34} = 93.8235$$
(0.5)

and

$$\sigma = \sqrt{\frac{1}{\sum_{i=1}^{N} f_i - 1} \sum_{i=1}^{N} f_i (x_i - \mu)^2} = \sqrt{93.8235} = 9.6863.$$
(0.25)

The coefficient of variation is given by

$$CV = \frac{\sigma}{\mu} \times 100 = \frac{9.6863}{21.5} \times 100 = 45.053\%$$
(0.25)
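
These four quantities can be checked numerically. The sketch below is illustrative only; it uses the class midpoints 7.5, 12.5, ..., 37.5 and the same $n - 1$ denominator as the formulas above.

```python
import math

mids = [7.5, 12.5, 17.5, 22.5, 27.5, 32.5, 37.5]   # class midpoints x_i
freqs = [4, 6, 8, 5, 4, 3, 5]                      # class frequencies f_i
n = sum(freqs)

mean = sum(f * x for f, x in zip(freqs, mids)) / n
# Sample variance: sum of weighted squared deviations divided by n - 1.
var = sum(f * (x - mean) ** 2 for f, x in zip(freqs, mids)) / (n - 1)
std = math.sqrt(var)
cv = 100 * std / mean                              # coefficient of variation, in percent

print(mean, var, std, cv)
```

Small differences in the last digit of the coefficient of variation can appear depending on whether the standard deviation is rounded before dividing.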

7. Do you think the distribution is symmetric? Justify.
No, the distribution is not symmetric. This is visible in the shape of the histogram (0.25), and it is confirmed by Pearson's coefficient of skewness:

$$sk = \frac{3\,(\mu - \tilde{X})}{\sigma} = \frac{3\,(21.5 - 19.688)}{9.6863} = 0.56121$$
(0.5)
Its value is positive, so we conclude that the distribution is asymmetric (positively skewed, with a longer tail to the right).
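
As a quick numerical check, Pearson's coefficient of skewness can be evaluated from the mean, median and standard deviation found in this exercise. This is a minimal sketch; `pearson_skewness` is a hypothetical helper name.

```python
def pearson_skewness(mean, median, std):
    """Pearson's (second) coefficient of skewness: 3 * (mean - median) / std."""
    return 3 * (mean - median) / std

# Rounded values obtained in questions 2 and 6.
sk = pearson_skewness(21.5, 19.688, 9.6863)
print(round(sk, 3))
```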

Exercise 3. The following results were obtained:

X   170  175  180  185  190  195  200  205  210  215
Y    65   75   70   82   85   86   95   97  100  110

1. Draw the scatter plot of the given data. Describe the relationship between X and Y.

[Figure: scatter plot of Y against X] (1)
(0.5)

2. Calculate the correlation coefficient and interpret the obtained result.
The scatter plot suggests the presence of a linear correlation; hence we choose Pearson's coefficient of correlation to measure it.
We have

$$\sum x_i = 1925, \quad \sum y_i = 865, \quad \sum (x_i - \bar{X})^2 = 2062.5, \quad \sum (y_i - \bar{Y})^2 = 1806.5, \quad \sum (x_i - \bar{X})(y_i - \bar{Y}) = 1887.5.$$

$$\rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X\, \sigma_Y} = \frac{\sum (x_i - \bar{X})(y_i - \bar{Y})}{\sqrt{\sum (x_i - \bar{X})^2}\, \sqrt{\sum (y_i - \bar{Y})^2}} = \frac{1887.5}{\sqrt{2062.5}\, \sqrt{1806.5}} = 0.97785$$
(2)

which indicates the existence of a strong positive linear correlation between the two variables. (0.5)
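
The intermediate sums and the coefficient can be reproduced with a few lines of Python. This is an illustrative sketch using only the data of the exercise; the variable names are not from the original correction.

```python
import math

x = [170, 175, 180, 185, 190, 195, 200, 205, 210, 215]
y = [65, 75, 70, 82, 85, 86, 95, 97, 100, 110]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squared deviations and of cross products.
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# Pearson correlation coefficient.
rho = sxy / math.sqrt(sxx * syy)
print(sxx, syy, sxy, rho)
```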

Exercise 4. The Spearman correlation coefficient is defined as the Pearson correlation coefficient between the rank variables:

$$r_s = \rho(R(X), R(Y)) = \frac{\mathrm{cov}(R(X), R(Y))}{\sigma_{R(X)}\, \sigma_{R(Y)}}.$$

For a sample of size $n$, if all $n$ ranks are distinct integers, show that the above formula can be written as

$$r_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n\,(n^2 - 1)}$$

where $d_i = R(X_i) - R(Y_i)$ is the difference between the two ranks of each observation.
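Since there are no ties, the $R(X_i)$'s and the $R(Y_i)$'s both consist of the integers from 1 to $n$ inclusive, so the two sums of squares under the square root are equal. Hence we can rewrite the denominator:

$$r_s = \frac{\sum_i \left(R(X_i) - \overline{R(X)}\right)\left(R(Y_i) - \overline{R(Y)}\right)}{\sqrt{\sum_i \left(R(X_i) - \overline{R(X)}\right)^2 \sum_i \left(R(Y_i) - \overline{R(Y)}\right)^2}} = \frac{\sum_i \left(R(X_i) - \overline{R(X)}\right)\left(R(Y_i) - \overline{R(Y)}\right)}{\sum_i \left(R(X_i) - \overline{R(X)}\right)^2}$$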

But the denominator is just a function of $n$:

$$\sum_i \left(R(X_i) - \overline{R(X)}\right)^2 = \sum_i R(X_i)^2 - n\,\overline{R(X)}^2$$
i i

In the special case where the $R(X_i)$ and $R(Y_i)$ are distinct ranks, each is a permutation of the same sequence of numbers $1, 2, \ldots, n$. Thus

$$\overline{R(X)} = \overline{R(Y)} = \frac{n + 1}{2} \quad \text{and} \quad \sum_i R(X_i)^2 = \sum_i i^2 = \frac{n(n + 1)(2n + 1)}{6}.$$

Therefore

$$\sum_i \left(R(X_i) - \overline{R(X)}\right)^2 = n(n + 1)\left[\frac{2n + 1}{6} - \frac{n + 1}{4}\right] = n(n + 1)\,\frac{n - 1}{12} = \frac{n\,(n^2 - 1)}{12}$$
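Now let's look at the numerator. We have $R(X_i)\,R(Y_i) = \frac{1}{2}\left[R(X_i)^2 + R(Y_i)^2 - \left(R(X_i) - R(Y_i)\right)^2\right]$, so

$$\sum_i \left(R(X_i) - \overline{R(X)}\right)\left(R(Y_i) - \overline{R(Y)}\right) = \sum_i R(X_i)\,R(Y_i) - n\,\overline{R(X)}\,\overline{R(Y)}$$

$$= \frac{1}{2} \sum_i \left[R(X_i)^2 + R(Y_i)^2 - \left(R(X_i) - R(Y_i)\right)^2\right] - n\,\frac{(n + 1)^2}{4}$$

$$= \frac{n(n + 1)(2n + 1)}{6} - \frac{n(n + 1)^2}{4} - \frac{1}{2} \sum_i d_i^2$$

$$= \frac{n\,(n^2 - 1)}{12} - \frac{1}{2} \sum_i d_i^2.$$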

Finally,

$$r_s = \frac{\frac{n(n^2 - 1)}{12} - \frac{1}{2} \sum_i d_i^2}{\frac{n(n^2 - 1)}{12}} = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n\,(n^2 - 1)}$$
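
The identity can also be checked numerically on data without ties, for instance the X and Y values of Exercise 3. The Python sketch below is illustrative only; `ranks`, `pearson` and `spearman_shortcut` are hypothetical helper names, and `ranks` assumes all values are distinct.

```python
import math

def ranks(values):
    """Ranks 1..n of the values, assuming they are all distinct (no ties)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    r = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        r[idx] = rank
    return r

def pearson(u, v):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def spearman_shortcut(rx, ry):
    """1 - 6 * sum(d_i^2) / (n (n^2 - 1)), valid when there are no ties."""
    n = len(rx)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

x = [170, 175, 180, 185, 190, 195, 200, 205, 210, 215]
y = [65, 75, 70, 82, 85, 86, 95, 97, 100, 110]
rx, ry = ranks(x), ranks(y)

by_definition = pearson(rx, ry)          # Pearson correlation of the ranks
by_shortcut = spearman_shortcut(rx, ry)  # closed-form expression
print(by_definition, by_shortcut)
```

Both values agree up to floating-point error, as the derivation predicts.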

3 pts for a complete answer only. Partial answers are not accepted.
