Exercise 10
2. 4^3 + 5^2 - 3×81
>4^3+5^2-3*81
[1] -154
3. √28+∛547- 47/53
>sqrt(28)+547^(1/3)-47/53
[1] 12.583
4.e^3+12% of 75
>exp(3)+0.12*75
[1] 29.08554
5. ∛729+log(23/42)
>729^(1/3)+log(23/42)
[1] 8.397825
6. (1.01)^6 + (2.67)^3.4 - (3.2)^(-2.1)
>(1.01)^6+(2.67)^3.4-(3.2)^(-2.1)
[1] 29.16739
PRACTICAL 10: EXERCISES USING R PROGRAMMING
EXERCISE-2
1. Assign single values to X and Y as 3 and 4. Then find Z = X + Y; W = X*Y; A = Z + W; B = A^2 + √Y; C = X^3 + Y^3
>x=3;y=4
>z=x+y;z
[1] 7
>w=x*y;w
[1] 12
>a=z+w;a
[1] 19
>b=a^2+sqrt(y);b
[1] 363
>c=x^3+y^3;c
[1] 91
2. Assign a combination of values (of equal length) to X and Y and repeat the above calculations. For example, X = [2, 3, 5, 7] and Y = [11, 13, 17, 19]
>x=c(2,3,5,7);y=c(11,13,17,19)
>z=x+y;z
[1] 13 16 22 26
>w=x*y;w
[1] 22 39 85 133
>a=z+w;a
[1] 35 55 107 159
>b=a^2+sqrt(y);b
[1] 1228.317 3028.606
[3] 11453.123 25285.359
>c=x^3+y^3;c
[1] 1339 2224 5038 7202
EXERCISE-3
1. Use the sequence operator to get a sequence
i. from 1 to 20
ii. from 20 to 10
iii. from 2 to 30 in steps of 2
i. >1:20
[1] 1 2 3 4 5 6 7 8
[9] 9 10 11 12 13 14 15 16
[17] 17 18 19 20
ii. >20:10
[1] 20 19 18 17 16 15 14 13
[9] 12 11 10
iii. >2*1:15
[1] 2 4 6 8 10 12 14 16
[9] 18 20 22 24 26 28 30
2. Assign the value 15 to n and find the difference between 1:n-1 and 1:(n-1)
>n=15
>1:n-1
[1] 0 1 2 3 4 5 6 7
[9] 8 9 10 11 12 13 14
>1:(n-1)
[1] 1 2 3 4 5 6 7 8
[9] 9 10 11 12 13 14
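The two results differ because `:` binds more tightly than the binary minus, so `1:n-1` is read as `(1:n)-1`. A minimal sketch of the comparison (the names `a` and `b` are illustrative, not part of the exercise):

```r
n <- 15

a <- 1:n - 1      # parsed as (1:n) - 1, giving 0, 1, ..., 14
b <- 1:(n - 1)    # the sequence 1, 2, ..., 14

length(a)         # 15 elements
length(b)         # 14 elements
all(a == 0:14)    # TRUE
all(b == 1:14)    # TRUE
```

Writing the parentheses explicitly is the safer habit whenever an arithmetic expression appears on either side of `:`.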
EXERCISE-4
1. Enter the following data using the rep function
a)1,1,1,1,2,2,3,3,3,3,3
b)4,4,4,4,5,5,6,6,6,6,7,8,8,8
c)1,1,2,2,3,3,4,4,5,5,6,6
d)10,10,10,10,11,11,11,11,12,12,12,12
>a=c(rep(1,4),rep(2,2),rep(3,5));a
[1] 1 1 1 1 2 2 3 3 3 3 3
>b=c(rep(4,4),rep(5,2),rep(6,4),7,rep(8,3));b
[1] 4 4 4 4 5 5 6 6 6 6 7 8 8 8
>c=rep(1:6,each=2);c
[1] 1 1 2 2 3 3 4 4 5 5 6 6
>d=rep(10:12,each=4);d
[1] 10 10 10 10 11 11 11 11 12 12 12 12
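Parts (a) and (b) can also be written with a single call, since `rep` accepts a vector for `times` that gives the repeat count of each element. A short sketch (the names `a2` and `b2` are illustrative):

```r
# One repeat count per element of the first argument
a2 <- rep(1:3, times = c(4, 2, 5))
b2 <- rep(4:8, times = c(4, 2, 4, 1, 3))

a2   # 1 1 1 1 2 2 3 3 3 3 3
b2   # 4 4 4 4 5 5 6 6 6 6 7 8 8 8
```

This avoids nesting several `rep` calls inside `c()` when the counts differ from value to value.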
EXERCISE-5
1. Using the data.frame function, construct the following frequency distributions.
>age=11:16;freq=c(5,10,120,22,13,5);d1=data.frame(age,freq);d1
age freq
1 11 5
2 12 10
3 13 120
4 14 22
5 15 13
6 16 5
> marks=c(15,20,25,30,35,40);freq=c(2,2,3,3,3,4);d2=data.frame(marks,freq);d2
marks freq
1 15 2
2 20 2
3 25 3
4 30 3
5 35 3
6 40 4
>variable=c(13,17,19,24,29,33);freq=c(1,1,2,2,3,3);d3=data.frame(variable,freq);d3
variable freq
1 13 1
2 17 1
3 19 2
4 24 2
5 29 3
6 33 3
EXERCISE-6
1. The following is the data set x: 5, 12, 21, 25, 25, 30, 25, 40, 42, 38, 50, 45, 60, 65, 50, 70, 80, 50, 13. Apply the built-in functions
discussed above to the data set x.
>x=c(5, 12, 21, 25, 25, 30, 25, 40, 42, 38, 50, 45, 60, 65, 50,70, 80, 50,13);
>length(x)
[1] 19
>max(x)
[1] 80
>min(x)
[1] 5
>range(x)
[1] 5 80
>quantile(x)
0% 25% 50% 75% 100%
5 25 40 50 80
> IQR(x)
[1] 25
>sum(x)
[1] 746
>cumsum(x)
[1] 5 17 38 63 88 118 143
[8] 183 225 263 313 358 418 483
[15] 533 603 683 733 746
>mean(x)
[1] 39.26316
>median(x)
[1] 40
>var(x)
[1] 428.9825
>sort(x)
[1] 5 12 13 21 25 25 25 30 38 40
[11] 42 45 50 50 50 60 65 70 80
EXERCISE-7
1. For the given data sets:
a. Enter the data set using either the scan function or the c function.
b. Find the index of its maximum and minimum values.
c. Find the summary.
d. Apply all the built-in functions discussed above to this data set.
e. Construct the discrete frequency distribution.
Data set I: 13, 17, 24, 21, 28, 28, 13, 27, 17, 23, 17, 24, 21, 17, 23, 21
>x=c(13, 17, 24, 21, 28, 28, 13, 27, 17, 23, 17, 24, 21, 17, 23, 21)
>max(x)
[1] 28
>min(x)
[1] 13
>summary(x)
Min. 1st Qu. Median Mean
13.00 17.00 21.00 20.88
3rd Qu. Max.
24.00 28.00
>quantile(x)
0% 25% 50% 75% 100%
13 17 21 24 28
>range(x)
[1] 13 28
>mean(x)
[1] 20.875
>median(x)
[1] 21
>var(x)
[1] 23.45
>sum(x)
[1] 334
>cumsum(x)
[1] 13 30 54 75 103 131 144 171 188
[10] 211 228 252 273 290 313 334
>library(moments)
>all.moments(x,order.max=4,central=T);
[1] 1.00000 0.00000 21.98438 -11.00391
[5] 961.68677
> skewness(x)
[1] -0.1067519
> kurtosis(x)
[1] 1.989782
>names(x)
NULL
>table(x)
x
13 17 21 23 24 27 28
2 4 3 2 2 1 2
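Part (b) asks for the index (position) of the maximum and minimum, which `max` and `min` alone do not return. A minimal sketch using base R on Data set I (the result names are illustrative):

```r
x <- c(13, 17, 24, 21, 28, 28, 13, 27, 17, 23, 17, 24, 21, 17, 23, 21)

first_max <- which.max(x)        # position of the first maximum: 5
first_min <- which.min(x)        # position of the first minimum: 1

all_max <- which(x == max(x))    # every position holding the maximum: 5 6
all_min <- which(x == min(x))    # every position holding the minimum: 1 7
```

`which.max` and `which.min` report only the first occurrence, so `which(x == max(x))` is the safer choice when ties are possible, as they are here.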
>sum(x)
[1] 107
>cumsum(x)
[1] 0 1 3 6 10 15 21 27 32
[10] 36 40 45 50 54 58 61 64 67
[19] 70 72 74 76 79 81 84 86 88
[28] 90 91 92 93 93 93 94 94 97
[37] 99 101 103 104 105 106 106 106 107
[46] 107
>quantile(x)
0% 25% 50% 75% 100%
0 1 2 3 6
>mean(x)
[1] 2.326087
>median(x)
[1] 2
>var(x)
[1] 2.84686
>quantile(x,probs=seq(0,1,.1))
0% 10% 20% 30% 40% 50% 60% 70%
0 0 1 1 2 2 3 3
80% 90% 100%
4 5 6
>quantile(x,c(0.67,0.98,0.99))
67% 98% 99%
3 6 6
>skewness(x)
[1] 0.4307074
>kurtosis(x)
[1] 2.3618
>table(x)
x
0 1 2 3 4 5 6
7 9 11 8 5 4 2
EXERCISE-8
1.A psychologist estimates the I.Q. of 60 children. The values are as follows :103, 98, 87, 85, 67, 96, 115, 109, 127, 103,
95, 123, 94, 88, 102, 76, 73, 80, 84, 102, 115, 93, 76, 81, 132, 90, 119, 84, 97, 120, 114, 101, 153, 98, 99, 105, 110, 107,
110, 128, 89, 112, 118, 101, 122, 146, 96, 109, 72, 97, 94, 94, 79, 79, 100, 54, 102, 89, 43, 111.
>x=c(103, 98, 87, 85, 67, 96, 115, 109, 127, 103, 95, 123, 94, 88, 102, 76, 73, 80, 84, 102, 115, 93, 76, 81, 132, 90, 119,
84, 97, 120, 114, 101, 153, 98, 99, 105, 110, 107, 110, 128, 89, 112, 118, 101, 122, 146, 96, 109, 72, 97, 94, 94, 79, 79,
100, 54, 102, 89, 43, 111)
>summary(x)
Min. 1st Qu. Median Mean 3rd Qu. Max.
43.00 87.75 98.50 99.10 110.25 153.00
>(153-43)/5
[1] 22
>seq(43,160,by=22)
[1] 43 65 87 109 131 153
>ci=seq(43,160,by=22)
>length(x)
[1] 60
>range(x)
[1] 43 153
>y=cut(x,ci,right=F);y
[1] [87,109) [87,109) [87,109) [65,87)
[5] [65,87) [87,109) [109,131) [109,131)
[9] [109,131) [87,109) [87,109) [109,131)
[13] [87,109) [87,109) [87,109) [65,87)
[17] [65,87) [65,87) [65,87) [87,109)
[21] [109,131) [87,109) [65,87) [65,87)
[25] [131,153) [87,109) [109,131) [65,87)
[29] [87,109) [109,131) [109,131) [87,109)
[33] <NA> [87,109) [87,109) [87,109)
[37] [109,131) [87,109) [109,131) [109,131)
[41] [87,109) [109,131) [109,131) [87,109)
[45] [109,131) [131,153) [87,109) [109,131)
[49] [65,87) [87,109) [87,109) [87,109)
[53] [65,87) [65,87) [87,109) [43,65)
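The `<NA>` at position 33 arises because the largest observation, 153, equals the last break, and with `right=F` the last interval `[131,153)` excludes its upper end. A minimal sketch of two ways around this, using a small hypothetical sample rather than the full I.Q. data:

```r
x  <- c(43, 70, 98, 120, 153)        # hypothetical sample; 153 is the maximum
ci <- seq(43, 160, by = 22)          # breaks: 43 65 87 109 131 153

y1 <- cut(x, ci, right = FALSE)      # 153 falls outside [131,153)
n_na1 <- sum(is.na(y1))              # one value is lost

# Option 1: extend the breaks by one more class width
ci2 <- seq(43, 175, by = 22)         # adds a break at 175
y2 <- cut(x, ci2, right = FALSE)
n_na2 <- sum(is.na(y2))

# Option 2: keep the breaks and close the last interval
y3 <- cut(x, ci, right = FALSE, include.lowest = TRUE)
n_na3 <- sum(is.na(y3))

table(y2)
```

Either option keeps every observation in the frequency table; which one is preferable depends on whether an extra class or a closed final class suits the presentation better.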
2. The following data on the weights of newborn babies were obtained from the office records of a hospital. Weight (kg):
3.7, 3.4, 4.1, 4.0, 3.7, 4.7, 3.3, 2.4, 3.1, 4.2, 3.8, 3.6, 4.2, 4.3, 2.9, 3.6, 3.3, 4.8, 4.0, 3.9, 3.5, 3.5, 3.8, 3.8, 4.2, 3.9, 4.9, 3.2,
4.0, 3.8, 3.2, 2.7, 3.4, 3.3, 3.0, 3.1, 3.5, 3.7, 3.9, 4.3, 3.8, 3.7, 3.0, 4.4, 4.1, 3.6, 3.7, 3.4, 3.7, 3.3, 3.5, 3.7, 3.0, 2.9, 3.1,
3.3, 4.2.
>x=c(3.7, 3.4, 4.1, 4.0, 3.7, 4.7, 3.3, 2.4, 3.1, 4.2, 3.8, 3.6, 4.2, 4.3, 2.9, 3.6, 3.3, 4.8, 4.0, 3.9, 3.5, 3.5, 3.8, 3.8, 4.2, 3.9,
4.9, 3.2, 4.0, 3.8, 3.2, 2.7, 3.4, 3.3, 3.0, 3.1, 3.5, 3.7, 3.9, 4.3, 3.8, 3.7, 3.0, 4.4, 4.1, 3.6, 3.7, 3.4, 3.7, 3.3, 3.5, 3.7, 3.0, 2.9,
3.1, 3.3, 4.2)
>summary(x)
Min. 1st Qu. Median Mean 3rd Qu. Max.
2.400 3.300 3.700 3.651 4.000 4.900
>(4.9-2.4)/5
[1] 0.5
>ci=seq(2.4,5.5,by=0.5)
>y=cut(x,ci,right=F);y
[1] [3.4,3.9) [3.4,3.9) [3.9,4.4) [3.9,4.4) [3.4,3.9)
[6] [4.4,4.9) [2.9,3.4) [2.4,2.9) [2.9,3.4) [3.9,4.4)
[11] [3.4,3.9) [3.4,3.9) [3.9,4.4) [3.9,4.4) [2.9,3.4)
[16] [3.4,3.9) [2.9,3.4) [4.4,4.9) [3.9,4.4) [3.9,4.4)
[21] [3.4,3.9) [3.4,3.9) [3.4,3.9) [3.4,3.9) [3.9,4.4)
[26] [3.9,4.4) [4.9,5.4) [2.9,3.4) [3.9,4.4) [3.4,3.9)
[31] [2.9,3.4) [2.4,2.9) [3.4,3.9) [2.9,3.4) [2.9,3.4)
[36] [2.9,3.4) [3.4,3.9) [3.4,3.9) [3.9,4.4) [3.9,4.4)
[41] [3.4,3.9) [3.4,3.9) [2.9,3.4) [4.4,4.9) [3.9,4.4)
[46] [3.4,3.9) [3.4,3.9) [3.4,3.9) [3.4,3.9) [2.9,3.4)
EXERCISE-9
1. Access the data set treering, containing tree-ring widths in dimensionless units, from the base package of R. Use R commands to answer the following:
a. How many observations are in the data set?
b. What are the minimum and maximum observations?
c. List the observations greater than 1.8.
d. Find the quartiles of the data set.
e. Find the index of the maximum and minimum values of the data set.
f. Construct an appropriate frequency distribution table.
>data(treering);d=treering;
>length(d)
[1] 7980
>summary(d)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000 0.8370 1.0340 0.9968 1.1970 1.9080
>d[d>1.8]
[1] 1.844 1.850 1.856 1.820 1.884 1.908 1.826 1.802
>length(d[d>1.8])
[1] 8
>min(d)
[1] 0
>max(d)
[1] 1.908
>d[1:5]
[1] 1.345 1.077 1.545 1.319 1.413
>d[7976:7980]
[1] 1.027 1.173 1.471 1.444 1.160
>which(d==.0000)
[1] 1395
>which(d==1.9080)
[1] 2185
>ci=seq(0,2,0.2);ci
[1] 0.0 0.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6 1.8 2.0
>y=cut(d,ci,right=F);fd=cbind(table(y));fd
[,1]
[0,0.2) 121
[0.2,0.4) 254
[0.4,0.6) 473
[0.6,0.8) 914
[0.8,1) 1795
[1,1.2) 2457
[1.2,1.4) 1459
[1.4,1.6) 430
[1.6,1.8) 69
[1.8,2) 8
2. Access the data set rivers from the base package of R. Use R commands to answer the following:
a. How many observations are in the data set?
b. What are the minimum and maximum observations?
c. List the observations greater than the median.
d. Find the quartiles of the data set.
e. Find the index of the maximum and minimum values of the data set.
f. Construct an appropriate frequency distribution table.
>data(rivers);d=rivers
>length(d)
[1] 141
>summary(d)
Min. 1st Qu. Median Mean 3rd Qu. Max.
135.0 310.0 425.0 591.2 680.0 3710.0
>min(d)
[1] 135
>max(d)
[1] 3710
>d[d>425.0]
[1] 735 524 450 1459 465 600 870 906 1000 600
[11] 505 1450 840 1243 890 525 720 850 630 730
[21] 600 710 470 680 570 560 900 625 2348 1171
[31] 3710 2315 2533 780 460 431 760 618 981 1306
[41] 500 696 605 1054 735 435 490 460 1270 545
[51] 445 1885 800 538 1100 1205 610 540 1038 444
[61] 620 652 900 525 529 500 720 430 671 1770
>d[1:5]
[1] 735 320 325 392 524
>which(d==135.0)
[1] 8
>which(d==3710.0)
[1] 68
>(3710-135)/5
[1] 715
>ci=seq(135,3710,by=715)
>y=cut(d,ci,right=F,dig.lab=4);fd=cbind(table(y));fd
[,1]
[135,850) 117
[850,1565) 18
[1565,2280) 2
[2280,2995) 3
[2995,3710) 0
>x=0:4;f=c(6,28,36,25,5);d1=data.frame(x,f);d1
x f
1 0 6
2 1 28
3 2 36
4 3 25
5 4 5
>cf=transform(d1,cfreq=cumsum(f));cf
x f cfreq
1 0 6 6
2 1 28 34
3 2 36 70
4 3 25 95
5 4 5 100
>cf1=transform(d1,rf=f/sum(f));cf1
x f rf
1 0 6 0.06
2 1 28 0.28
3 2 36 0.36
4 3 25 0.25
5 4 5 0.05
4. Access the data set swiss from the base package of R, with variables Fertility, Agriculture, Examination, Education and Catholic. Use R commands to answer the following:
a. Find the mean and variance of Agriculture.
b. Construct a continuous frequency distribution for either Examination or Education.
c. Find the number of observations that have Catholic less than 60.
d. Get all the information for the 6th row.
e. Get all the information for the 6th column.
f. Get all the information for the 5th, 10th, ..., 45th observations.
g. Get all the information for the 1st, 17th, 29th, 33rd and 47th observations.
>data("swiss")
>mean(swiss$Agriculture)
[1] 50.65957
>var(swiss$Agriculture)
[1] 515.7994
>summary(swiss$Examination)
Min. 1st Qu. Median Mean 3rd Qu. Max.
3.00 12.00 16.00 16.49 22.00 37.00
>(37-3)/5
[1] 6.8
>ci=seq(3,45,by=5)
>x=cut(swiss$Examination,ci)
>fd=cbind(table(x));fd
[,1]
(3,8] 6
(8,13] 7
(13,18] 15
(18,23] 9
(23,28] 4
(28,33] 2
(33,38] 2
(38,43] 0
>sum(swiss$Catholic< 60)
[1] 31
>swiss[6, ]
Fertility Agriculture Examination Education
Porrentruy 76.1 35.3 9 7
Catholic Infant.Mortality
Porrentruy 90.57 26.6
>swiss[, 6]
[1] 22.2 22.2 20.2 20.3 20.6 26.6 23.6 24.9 21.0 24.4
[11] 24.5 16.5 19.1 22.7 18.7 21.2 20.0 20.2 10.8 20.0
[21] 18.0 22.4 16.7 15.3 21.0 23.8 18.0 16.3 20.9 22.5
[31] 15.1 19.8 18.3 19.4 20.2 17.8 16.3 18.1 20.3 20.5
[41] 18.9 23.0 20.0 19.5 18.0 18.2 19.3
>specified_observations_info=swiss[c(1,17,29,33,47),]
>print(specified_observations_info)
Catholic Infant.Mortality
Courtelary 9.96 22.2
Grandson 3.30 20.0
Vevey 18.46 20.9
Herens 100.00 18.3
Rive Gauche 58.33 19.3
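Parts (f) and (g) both come down to passing a vector of row numbers to the row index of the data frame. A minimal sketch (the names `every_fifth` and `chosen` are illustrative):

```r
data(swiss)

# Part (f): rows 5, 10, ..., 45, generated with seq()
every_fifth <- swiss[seq(5, 45, by = 5), ]
nrow(every_fifth)                 # 9 rows

# Part (g): an arbitrary set of rows, listed with c()
chosen <- swiss[c(1, 17, 29, 33, 47), ]
rownames(chosen)
```

Leaving the column index empty after the comma returns all columns for the selected rows.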
EXERCISE-10
1.Access the data set cars from the base library of R
Construct Boxplot for the variables in it.
Obtain the summary of the variables
>data(cars);d1=cars;attach(d1)
>dim(d1)
[1] 50 2
>names(d1)
[1] "speed" "dist"
>s=speed
>boxplot(s,xlab="speed")
>d=dist
>boxplot(d,xlab="distance")
>identify(rep(1,length(d)),d);
>d1[49,]
speed dist
49 24 120
2. Access the data cats from the library MASS and plot sexwise boxplot for the variable Hwt(heart weight)
>library(MASS)
> head(cats)
Sex Bwt Hwt
1 F 2.0 7.0
2 F 2.0 7.4
3 F 2.0 9.5
4 F 2.1 7.2
5 F 2.1 7.3
6 F 2.1 7.6
> data(cats);
>attach(cats);names(cats);
The following objects are masked from cats (pos = 3):
Bwt, Hwt, Sex
>boxplot(Hwt~Sex)
> cats[c(47,144),]
Sex Bwt Hwt
47 F 3.0 13.0
144 M 3.9 20.5
3.Access the data set InsectSprays from the base package of R. Construct parallel boxplots for different sprays.
Hint: >boxplot(count~spray)
>data("InsectSprays")
>attach(InsectSprays);names(InsectSprays);
[1] "count" "spray"
>boxplot(count~spray);
4. The following are the body mass index values (kg/m^2) for 14 subjects in a sample:
24.4, 3.04, 21.4, 25.4, 21.3, 23.8, 20.8, 22.9, 23.2, 21.1, 23.0, 20.6, 26.0, 20.9
i) compute mean, median, variance, standard deviation and coefficient of variation
ii) construct box and whisker plot. If outliers are found identify them.
iii)Compute Bowley’s measure of skewness
> x=c(24.4, 3.04, 21.4, 25.4, 21.3, 23.8, 20.8, 22.9, 23.2, 21.1, 23.0, 20.6, 26.0, 20.9)
> mean(x)
[1] 21.27429
> median(x)
[1] 22.15
> var(x)
[1] 30.62987
>sd(x)
[1] 5.534426
> cv=sd(x)/mean(x)*100
> cv
[1] 26.01463
> Q3=quantile(x,0.75)
> Q1=quantile(x,0.25)
> Q2=quantile(x,0.50)
>bowleys_skewness=(Q3+Q1-2*Q2)/(Q3-Q1);bowleys_skewness
75%
0.1111111
>boxplot(x)
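Part (ii) also asks for any outliers to be identified. One non-interactive way, sketched here with base R's `boxplot.stats` (the names `out` and `idx` are illustrative), is to read off the points that lie beyond the whiskers:

```r
x <- c(24.4, 3.04, 21.4, 25.4, 21.3, 23.8, 20.8, 22.9, 23.2, 21.1,
       23.0, 20.6, 26.0, 20.9)

out <- boxplot.stats(x)$out   # values beyond 1.5 * IQR from the hinges
idx <- which(x %in% out)      # their positions in x

out   # 3.04
idx   # 2
```

The value 3.04 is far below the rest of the sample, so it may be a recording error (for example 30.4), although the source does not say so.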
EXERCISE-11
1.Following are the number of accidents that occurred at 60 major intersections in a certain city during a weekend:
0 1 0 2 4 2 5 0 3 0 2 0 1 4
4 4 1 2 1 2 5 0 4 1 0 2 1 1
4 2 5 3 2 0 5 1 1 0 6 3 1 5
0 3 0 0 6 3 2 2 3 1 4 0 3 0
0 1 2 4
Prepare a frequency distribution table and draw a bar chart. Comment on the nature of the distribution.
SOLUTION
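>x=c(0,1,0,2,4,2,5,0,3,0,2,0,1,4,4,4,1,2,1,2,5,0,4,1,0,2,1,1,4,2,5,3,2,0,5,1,1,0,6,3,1,5,0,3,0,0,6,3,2,2,3,1,4,0,3,0,0,1,2,4)
>t=table(x)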
>t
x
0 1 2 3 4 5 6
15 12 11 7 8 5 2
>barplot(t)
This indicates positive skewness (a right-skewed distribution): most major intersections experienced few or no accidents during the weekend,
while a few recorded higher counts, up to 6 accidents.
[1] 90 72 66 42 48 30 12
>pie(accidents,angles,main="PIE CHART of accidents at intersections",col="black")
>pie(accidents,angles,main="PIE CHART of accidents at intersections",col="light blue")
EXERCISE-12
1. Draw a histogram and frequency polygon for the following data.
Height: 0-7 7-14 14-21 21-28 28-35 35-42 42-49 49-56
No. of people: 26 31 35 42 82 71 54 19
> mid=seq(3.5,52.5,7)
> freq=c(26,31,35,42,82,71,54,19)
> y=rep(mid,freq)
> brk=seq(0,56,7)
> hist(y,breaks=brk)
> hist(y,breaks=brk,col="green")
> plot(mid,freq,type="b")
> h=hist(y,breaks=brk,col="light blue");h
$breaks
[1] 0 7 14 21 28 35 42 49 56
$counts
[1] 26 31 35 42 82 71 54 19
$density
[1] 0.010317460 0.012301587 0.013888889 0.016666667 0.032539683 0.028174603
[7] 0.021428571 0.007539683
$mids
[1] 3.5 10.5 17.5 24.5 31.5 38.5 45.5 52.5
$xname
[1] "y"
$equidist
[1] TRUE
attr(,"class")
[1] "histogram"
> lines(c(min(h$breaks),h$mids,max(h$breaks)),lwd=2,c(0,h$counts,0),type="b")
2. Plot the histogram and frequency polygon on the same graph for the given data
Class interval: 20-30 30-40 40-50 50-60 60-70 70-80 80-90
Frequency: 10 24 18 12 8 5 3
> class_intervals=seq(20,80,by=10)
> frequencies=c(10,24,18,12,8,5,3)
> midpoints=class_intervals+5
> data=rep(midpoints,times=frequencies)
> class_intervals=seq(20,90,by=10)
> hist(data,breaks=class_intervals, main="histogram ",xlab="class intervals" ,ylab="frequency", col="light pink",
border="black")
> points(midpoints,frequencies,type="b",col="black",pch=19,lwd=2,cex=1.5)
EXERCISE-13
1. Plot the scatter plot and compute both correlation coefficients for the following data
i)
X 0 4 8 12
Y 8.34 8.89 9.16 9.50
ii)
A 11.1 10.3 12.0 15.1 13.7 18.5 17.3 14.2 14.8 15.3
B 10.9 14.2 13.8 21.5 13.2 21.1 16.4 19.3 17.4 19.0
iii)
C 5.12 6.18 6.77 6.65 6.36 5.90 5.48 6.02 10.34 8.51
D 2.30 2.54 2.95 3.77 4.18 5.31 5.53 8.83 9.48 14.20
>x=c(0,4,8,12)
>y=c(8.34,8.89,9.16,9.50)
>p=plot(x,y)
>cor(x,y,method="spearman")
[1] 1
>a=c(11.1,10.3,12.0,15.1,13.7,18.5,17.3,14.2,14.8,15.3)
>b=c(10.9,14.2,13.8,21.5,13.2,21.1,16.4,19.3,17.4,19.0)
>p=plot(a,b)
>cor(a,b,method="spearman")
[1] 0.6969697
>c=c(5.12,6.18,6.77,6.65,6.36,5.90,5.48,6.02,10.34,8.51)
>d=c(2.30,2.54,2.95,3.77,4.18,5.31,5.53,8.83,9.48,14.20)
>p=plot(c,d)
>cor(c,d,method="spearman")
[1] 0.4181818
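The transcript above reports only the Spearman rank coefficient. Assuming "both" in the question refers to the Pearson and Spearman coefficients, the Pearson value comes from the same `cor` call with its default method. A sketch for data set (i) (the result names are illustrative):

```r
x <- c(0, 4, 8, 12)
y <- c(8.34, 8.89, 9.16, 9.50)

r_pearson  <- cor(x, y)                       # method = "pearson" is the default
r_spearman <- cor(x, y, method = "spearman")  # rank correlation

r_pearson    # close to, but below, 1
r_spearman   # 1, because y increases whenever x does
```

The same two calls apply unchanged to the pairs (A, B) and (C, D).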
2.
X1 Y1 X2 Y2
10 8.04 10 9.14
8 6.95 8 8.14
13 7.58 13 8.74
9 8.81 9 8.77
11 8.33 11 9.26
14 9.96 14 8.10
6 7.24 6 6.13
4 4.26 4 3.10
12 10.84 12 9.13
7 4.82 7 7.26
5 5.68 5 4.78
>x1=c(10,8,13,9,11,14,6,4,12,7,5)
>x2=c(10,8,13,9,11,14,6,4,12,7,5)
>mean(x1)
[1] 9
>mean(x2)
[1] 9
>y1=c(8.04,6.95,7.58,8.81,8.33,9.96,7.24,4.26,10.84,4.82,5.68)
>y2=c(9.14,8.14,8.74,8.77,9.26,8.10,6.13,3.10,9.13,7.26,4.78)
>mean(y1)
[1] 7.500909
>mean(y2)
[1] 7.504545
>cor(x1,y1,method = "spearman")
[1] 0.8181818
>cor(x2,y2,method="spearman")
[1] 0.6909091
> p=plot(x1,y1,main="scatter plot 1")
> p=plot(x2,y2,main="scatter plot 2")
EXERCISE-14
1. The table shows the scores of 10 students on a maths (X) test and a stats (Y) test. The maximum score in each test was 50.
Obtain the line of regression of X on Y.
Print this equation on the graph.
If it is known that a student gets 28 in stats, what would be his/her score in maths?
X 34 37 36 32 32 36 35 34 29 35
Y 37 37 34 34 33 40 39 37 36 35
>x=c(34,37,36,32,32,36,35,34,29,35)
>y=c(37,37,34,34,33,40,39,37,36,35)
>plot(x,y)
>fit=lm(y~x);abline(fit);fit
Call:
lm(formula = y ~ x)
Coefficients:
(Intercept) x
23.7769 0.3654
> text(locator(1),"y=0.3654*x+23.7769")
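The fit above is the regression of Y on X. The question asks for the regression of X on Y, which puts `x` on the left of the formula. A minimal sketch of that fit and of the prediction at Y = 28 (the names `fit_xy` and `pred` are illustrative):

```r
x <- c(34, 37, 36, 32, 32, 36, 35, 34, 29, 35)   # maths scores
y <- c(37, 37, 34, 34, 33, 40, 39, 37, 36, 35)   # stats scores

fit_xy <- lm(x ~ y)          # regression of X on Y
coef(fit_xy)                 # intercept about 18.92, slope about 0.417

# Predicted maths score for a student with 28 in stats
pred <- predict(fit_xy, newdata = data.frame(y = 28))
unname(pred)                 # about 30.58
```

Note that 28 lies below the smallest observed stats score (33), so this is an extrapolation and should be read with caution.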
X: 45 55 56 58 60 65 68 70 75 80 85
Y: 56 50 48 60 62 64 65 70 74 82 90
Plot the line of best fit and Estimate Y when X = 78
>x=c(45,55,56,58,60,65,68,70,75,80,85)
>y=c(56,50,48,60,62,64,65,70,74,82,90)
>cor(x,y)
[1] 0.9188406
>plot(x,y)
>fit=lm(y~x);abline(fit);fit
Call:
lm(formula = y ~ x)
Coefficients:
(Intercept) x
0.9044 0.9917
> text(locator(1),"y=0.9917*x+0.9044")
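The estimate of Y at X = 78 can be read from the fitted line with `predict`, which avoids retyping rounded coefficients. A short sketch (the name `y_hat` is illustrative):

```r
x <- c(45, 55, 56, 58, 60, 65, 68, 70, 75, 80, 85)
y <- c(56, 50, 48, 60, 62, 64, 65, 70, 74, 82, 90)

fit <- lm(y ~ x)

# newdata must be a data frame whose column name matches the predictor
y_hat <- predict(fit, newdata = data.frame(x = 78))
unname(y_hat)    # roughly 78.26
```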
3. Calculate the coefficient of correlation by Karl Pearson's method from the following data relating to overhead expenses
and cost of production.
Overhead expense (1000 Rs.) 80 90 100 110 120 130 140 150 160
Cost of (Rs. 1000) 15 15 16 19 17 18 16 18 19
Plot the line of best fit and estimate X when Y = 22
> x=10*8:16
> y=c(15,15,16,19,17,18,16,18,19)
>cor(x,y)
[1] 0.6928203
>plot(x,y)
> fit=lm(y~x);abline(fit);fit
Call:
lm(formula = y ~ x)
Coefficients:
(Intercept) x
12.20 0.04
> text(locator(1),"y=0.04*x+12.20")
EXERCISE-15
1. The incidence of an occupational disease is such that workers have a 20% chance of catching it. What is the probability that
out of 6 workers chosen (i) 4 or more catch the disease, (ii) at most 2 catch the disease?
> n=6
> p=0.20
> prob_4_or_more=1-pbinom(3,n,p);prob_4_or_more
[1] 0.01696
> prob_at_most_2=pbinom(2,n,p);prob_at_most_2
[1] 0.90112
2. The probability that a patient recovers from a rare blood disease is 0.21. If 15 people are known to have contracted this
disease, what is the probability that: a) at least 10 survive? b) from 3 to 8 survive?
> n=15
> p=0.21
> prob_atleast_10_survive=1-pbinom(9,n,p);prob_atleast_10_survive
[1] 0.0001745072
> prob_3_to_8_survive=pbinom(8,n,p)-pbinom(2,n,p);prob_3_to_8_survive
[1] 0.6373935
3.Find the probability that seven of ten persons will recover from a tropical disease, given that the probability is 0.8, that
any one of these will recover from the disease.
> n=10
> p=0.8
> prob_7_recovers=dbinom(7,n,p);prob_7_recovers
[1] 0.2013266
4. A basketball player hits on seventy-five percent of his shots from the free throw line. What is the probability that he
makes exactly two of his next four free shots?
> n=4
> p=0.75
> prob_making_2=dbinom(2,n,p);prob_making_2
[1] 0.2109375
5. In a certain city, incompatibility is given as the legal reason in 70% of all divorce cases. Find the probability that 5 of
the next 6 divorce cases in this city will blame incompatibility.
> n=6
> p=0.70
> prob_five_blame=dbinom(5,n,p);prob_five_blame
[1] 0.302526
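Questions 3 to 5 ask for the probability of an exact count, which is the mass function `dbinom`; `pbinom` instead gives the cumulative probability P(X ≤ k). A minimal sketch contrasting the two on the data of question 3 (the result names are illustrative):

```r
n <- 10
p <- 0.8

exactly_7  <- dbinom(7, n, p)        # P(X = 7)
at_most_7  <- pbinom(7, n, p)        # P(X <= 7)
at_least_7 <- 1 - pbinom(6, n, p)    # P(X >= 7)

exactly_7    # about 0.2013
at_most_7    # about 0.3222
at_least_7   # equals sum(dbinom(7:10, n, p))
```

A quick check on any "at least k" answer is that the argument of `pbinom` should be k - 1, not k.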
6. An automobile safety engineer claims that one in ten automobile accidents is due to driver fatigue. What is the
probability that at least three of five automobile accidents are due to driver fatigue?
> n=5
> p=0.1
> prob_atleast_three=1-pbinom(2,n,p);prob_atleast_three
[1] 0.00856
7. Seven unbiased coins are tossed and the number of heads is noted. The experiment is repeated 128 times and the
following results are obtained. Fit a binomial distribution and obtain the expected frequencies.
No. of heads (x): 0 1 2 3 4 5 6 7
Frequency: 7 6 17 35 30 23 7 3
> x=c(0,1,2,3,4,5,6,7)
> f=c(7,6,17,35,30,23,7,3)
> fx=f*x
> fx
[1] 0 6 34 105 120 115 42 21
> sum(fx)
[1] 443
> sum(f)
[1] 128
> mean=(sum(fx)/sum(f));mean
[1] 3.460938
> prob=mean/7;prob
[1] 0.4944196
> p=dbinom(x,7,prob,log=FALSE);p
[1] 0.008443672 0.057800941
[3] 0.169574947 0.276385951
[5] 0.270284716 0.158590899
[7] 0.051696666 0.007222208
> exp_freq=(p*sum(f));exp_freq
[1] 1.0807900 7.3985205
[3] 21.7055932 35.3774017
[5] 34.5964436 20.2996351
[7] 6.6171732 0.9244427
> expected_frequency=round(exp_freq);expected_frequency
[1] 1 7 22 35 35 20 7 1
> y=cbind(p,expected_frequency);y
p expected_frequency
[1,] 0.008443672 1
[2,] 0.057800941 7
[3,] 0.169574947 22
[4,] 0.276385951 35
[5,] 0.270284716 35
[6,] 0.158590899 20
[7,] 0.051696666 7
[8,] 0.007222208 1
8. A set of six similar coins is tossed 640 times and the following results are obtained.
No. of heads (x): 0 1 2 3 4 5 6
Frequency: 7 64 140 210 130 75 12
Fit a binomial distribution assuming that the nature of the coin is unknown.
> x=c(0,1,2,3,4,5,6)
> f=c(7,64,140,210,130,75,12)
> fx=f*x;fx
[1] 0 64 280 630 520 375 72
> mean=(sum(fx)/sum(f));mean
[1] 3.04232
> prob=mean/6;prob
[1] 0.5070533
> p=dbinom(x,6,prob,log=FALSE);p
[1] 0.01434828 0.08855329
[3] 0.22771852 0.31231348
[5] 0.24093818 0.09913323
[7] 0.01699502
> exp_freq=(p*sum(f));exp_freq
[1] 9.154202 56.497000
[3] 145.284417 199.255999
[5] 153.718559 63.247000
[7] 10.842822
> expected_frequency=round(exp_freq);expected_frequency
[1] 9 56 145 199 154 63 11
> y=cbind(p,expected_frequency);y
p expected_frequency
[1,] 0.01434828 9
[2,] 0.08855329 56
[3,] 0.22771852 145
[4,] 0.31231348 199
[5,] 0.24093818 154
[6,] 0.09913323 63
[7,] 0.01699502 11
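The steps used in questions 7 and 8 (estimate p from the sample mean, evaluate `dbinom`, scale by the total frequency) can be collected into one small function. The helper `fit_binomial` below is hypothetical, written only to illustrate the procedure on the question 8 data:

```r
# Hypothetical helper: moment fit of Binomial(n, p) to a frequency table
fit_binomial <- function(x, f, n) {
  p_hat <- sum(f * x) / sum(f) / n      # sample mean divided by n
  probs <- dbinom(x, n, p_hat)          # fitted probabilities
  list(p_hat = p_hat, expected = probs * sum(f))
}

res <- fit_binomial(x = 0:6, f = c(7, 64, 140, 210, 130, 75, 12), n = 6)

res$p_hat               # about 0.507
round(res$expected)     # expected frequencies, rounded
sum(res$expected)       # equals sum(f) because x covers 0..n
```

Naming the estimate `p_hat` rather than `mean` also avoids masking R's built-in `mean` function.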
EXERCISE-16
2.Assuming that the chance of a traffic accident in a City of Delhi is 0.001 on how many days out of 1000 days can we
expect no accidents and more than 3 accidents.
> lambda=1
> prob_no_accidents=dpois(0,lambda);prob_no_accidents
[1] 0.3678794
> x=prob_no_accidents*1000;x
[1] 367.8794
> prob_more_than_3_accidents=1-ppois(3,lambda);prob_more_than_3_accidents
[1] 0.01898816
> x=prob_more_than_3_accidents*1000;x
[1] 18.98816
4.If the number of mistakes made by a typist follows a Poisson distribution with mean 3, what is the chance that he/she
i) makes 2 mistakes, ii) makes atleast 2 mistakes
>dpois(2,3,log=F)
[1] 0.2240418
>1-ppois(1,3,lower.tail = T,log.p = F)
[1] 0.8008517
5. The number of accidents occurring in a factory in a year is a Poisson variate with mean 5. Find the probability that
i) more than 2 accidents take place
ii) more than 4 accidents occur in 1 year
>1-ppois(2,5,lower.tail = T,log.p=F)
[1] 0.875348
>1-ppois(4,5,lower.tail = T,log.p=F)
[1] 0.5595067
6.A receptionist at an office receives on an average 3 telephone calls between 10 a.m. and 10.05 a.m. Find the probability
that on a particular day
i) she does not receive any call
ii) she receives atleast 2 calls
>ppois(0,3,lower.tail = T,log.p = F)
[1] 0.04978707
>1-ppois(1,3,lower.tail = T,log.p=F)
[1] 0.8008517
7.At 10.00 a.m. there is a city bus service. The number of passengers getting in at the 1st stop is a Poisson variate with
parameter 6. What is the probability that on a particular day none of them gets in at the bus in the stop? On how many
days of an year would you expect this to happen.
> lambda=6
> prob_no_passengers=dpois(0,lambda);prob_no_passengers
[1] 0.002478752
> day_in_year=365
> x=prob_no_passengers*day_in_year;x
[1] 0.9047445
8.On an average 3 street lights of a municipality fails every day. Find the standard deviation of number of failure per day
and probability that atleast one light fails per day.
>sqrt(3)
[1] 1.732051
>1-ppois(0,3,lower.tail = T,log.p = F)
[1] 0.9502129
9.On an average 1% of the pins are defective. If the box contains 300 pins, find the probability that the box has
i) atleast 1 defective pin
ii) more than 3 defective pins
>1-ppois(0,3,lower.tail = T,log.p = F)
[1] 0.9502129
>1-ppois(3,3,lower.tail = T,log.p = F)
[1] 0.3527681
10. On average, 1 in every 50 valves manufactured by a firm is substandard. If the valves are supplied in packets of 20
each
i) Find the probability that the packets will contain atleast 1 substandard valve
ii) In how many of a lot of 1000 packets would you expect substandard valves.
>lambda=20*(1/50)
> prob_atleast_1_substandard=1-dpois(0,lambda);prob_atleast_1_substandard
[1] 0.32968
> total_packets=1000
> x=total_packets*prob_atleast_1_substandard;x
[1] 329.68
11.Using the following data fit a Poisson distribution and find the expected frequencies
No. of printing mistakes: 0 1 2 3 4 5
No. of days: 42 33 14 6 4 1
> x=c(0,1,2,3,4,5)
> f=c(42,33,14,6,4,1)
> fx=f*x;fx
[1] 0 33 28 18 16 5
> sum(f)
[1] 100
> mean=(sum(fx)/sum(f));mean
[1] 1
> p=dpois(x,mean);p
[1] 0.367879441 0.367879441
[3] 0.183939721 0.061313240
[5] 0.015328310 0.003065662
> exp_freq=p*sum(f);exp_freq
[1] 36.7879441 36.7879441
[3] 18.3939721 6.1313240
[5] 1.5328310 0.3065662
> expected_frequency=round(exp_freq);expected_frequency
[1] 37 37 18 6 2 0
> y=cbind(p,expected_frequency);y
p expected_frequency
[1,] 0.367879441 37
[2,] 0.367879441 37
[3,] 0.183939721 18
[4,] 0.061313240 6
[5,] 0.015328310 2
[6,] 0.003065662 0
12. The following is the distribution of daily sales of television sets in a shop. Fit a Poisson distribution and hence find the
theoretical frequencies.
No. of sets sold: 0 1 2 3 4 5 6
No. of days: 18 43 45 28 12 5 0
> x=0:6
> f=c(18,43,45,28,12,5,0)
> fx=f*x;fx
[1] 0 43 90 84 48 25 0
> sum(f)
[1] 151
> mean=(sum(fx)/sum(f));mean
[1] 1.92053
> p=dpois(x,mean);p
[1] 0.14652931 0.28141391
[3] 0.27023190 0.17299614
[5] 0.08306106 0.03190425
[7] 0.01021218
> exp_freq=p*sum(f);exp_freq
[1] 22.125926 42.493500 40.805016
[4] 26.122417 12.542220 4.817541
[7] 1.542039
> expected_frequency=round(exp_freq);expected_frequency
[1] 22 42 41 26 13 5 2
> y=cbind(p,expected_frequency);y
p expected_frequency
[1,] 0.14652931 22
[2,] 0.28141391 42
[3,] 0.27023190 41
[4,] 0.17299614 26
[5,] 0.08306106 13
[6,] 0.03190425 5
[7,] 0.01021218 2
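Questions 11 and 12 repeat the same three steps: estimate the mean, evaluate `dpois`, and scale by the total frequency. The helper `fit_poisson` below is hypothetical, a sketch of that procedure applied to the question 12 data:

```r
# Hypothetical helper: moment fit of a Poisson distribution to a frequency table
fit_poisson <- function(x, f) {
  lambda_hat <- sum(f * x) / sum(f)     # sample mean estimates lambda
  probs <- dpois(x, lambda_hat)         # fitted probabilities
  list(lambda_hat = lambda_hat, expected = probs * sum(f))
}

res <- fit_poisson(x = 0:6, f = c(18, 43, 45, 28, 12, 5, 0))

res$lambda_hat          # about 1.92
round(res$expected)     # 22 42 41 26 13 5 2
```

Because the Poisson distribution has no upper limit, the expected frequencies over 0 to 6 sum to slightly less than the 151 observed days; the remainder belongs to counts above 6.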
EXERCISE-17
1.Given a normal distribution with mean = 50 and standard deviation = 8. Find the probability that X assumes a value
between 34 and 62.
> mean=50
> sd=8
> x1=34
> x2=62
> z1=(x1-mean)/sd;z1
[1] -2
> z2=(x2-mean)/sd;z2
[1] 1.5
> p1=pnorm(z1);p1
[1] 0.02275013
> p2=pnorm(z2);p2
[1] 0.9331928
> probability=p2-p1;probability
[1] 0.9104427
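The standardisation step is optional in R, since `pnorm` accepts the mean and standard deviation directly. A minimal sketch showing that both routes agree (the names `mu` and `sigma` are illustrative and avoid masking the built-in `mean` and `sd` functions):

```r
mu    <- 50
sigma <- 8

# Direct use of the mean and sd arguments
p_direct <- pnorm(62, mean = mu, sd = sigma) - pnorm(34, mean = mu, sd = sigma)

# Same probability via z-scores
p_z <- pnorm((62 - mu) / sigma) - pnorm((34 - mu) / sigma)

p_direct    # about 0.9104
```

In every "between a and b" calculation the lower-bound probability is subtracted from the upper-bound one, so the result cannot be negative.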
2.For a normal distribution with mean = 200 and S.D. = 25, find the probability that X assumes a value between 200 and
260. Find the probability that X is greater than 240.
> mean=200;
> sd=25
> x1=200
> x2=260
> z1=(x1-mean)/sd;z1
[1] 0
> z2=(x2-mean)/sd;z2
[1] 2.4
> p1=pnorm(z1);p1
[1] 0.5
> p2=pnorm(z2);p2
[1] 0.9918025
> probability=p2-p1;probability
[1] 0.4918025
ii)
> x=240
> z=(x-mean)/sd;z
[1] 1.6
> p=pnorm(z);p
[1] 0.9452007
> probability=1-p;probability
[1] 0.05479929
3.Given a Normal distribution with mean = 50 and S.D. = 13. Find the value of X that has (a) 13% of the area to its left :
b) 14% of the area to its right.
i)
> mean=50
> sd=13
> pleft=0.13
> xleft=qnorm(pleft,mean,sd,lower.tail = TRUE);xleft
[1] 35.35692
ii)
>mean=50
>sd=13
>pright=0.14
>xright=qnorm(pright,mean,sd,lower.tail=FALSE);
>xright
[1] 64.04415
4. The accounts of a certain departmental store have an average balance of Rs. 120/- and S.D. = Rs. 40/-. Assuming that the
account balances are normally distributed, a) what proportion of accounts is over Rs. 150/-; b) what proportion is between
100 and 150; c) between 60 and 90?
a)
> mean=120
> sd=40
> x=150
> z=(x-mean)/sd;z
[1] 0.75
> p=pnorm(z);p
[1] 0.7733726
> probability=1-p;probability
[1] 0.2266274
b)
> x1=100
> x2=150
> z1=(x1-mean)/sd;z1
[1] -0.5
> z2=(x2-mean)/sd;z2
[1] 0.75
> p1=pnorm(z1);p1
[1] 0.3085375
> p2=pnorm(z2);p2
[1] 0.7733726
> probability=p2-p1;probability
[1] 0.4648351
c)
> x1=60
> x2=90
> z1=(x1-mean)/sd;z1
[1] -1.5
> z2=(x2-mean)/sd;z2
[1] -0.75
> p1=pnorm(z1);p1
[1] 0.0668072
> p2=pnorm(z2);p2
[1] 0.2266274
> probability=p2-p1;probability
[1] 0.1598202
5.The distribution of monthly income of 3000 workers of a factory follows normal law with mean = 900 and S.D. = 100.
Find
a) percentage of workers with income greater than Rs. 800
b) percentage of workers having on income less than Rs. 600.
a)
> mean=900
> sd=100
> x=800
> z=(x-mean)/sd;z
[1] -1
> p=pnorm(z);p
[1] 0.1586553
> probability=1-p;probability
[1] 0.8413447
b)
> x=600
> z=(x-mean)/sd;z
[1] -3
> p=pnorm(z);p
[1] 0.001349898
> probability=p;probability
[1] 0.001349898
6. 1200 students took an exam. The mean mark is 53% and S.D. = 15%. Assume a normal distribution of marks.
a) if 50% marks are required for passing, find how many students are expected to score greater than 50%
b) if only 40% of students are required to be promoted what are the marks for promotion.
a)
> mean=0.53
> sd=0.15
> x=0.5
> z=(x-mean)/sd;z
[1] -0.2
> p=pnorm(z);p
[1] 0.4207403
> probability=1-p;probability
[1] 0.5792597
b)
> x=0.4
> probability=qnorm(1-x,0.53,0.15);probability
[1] 0.5680021
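Part (a) asks for a number of students, so the probability has to be multiplied by 1200, and part (b) needs the mark that leaves 40% of the area to its right. A sketch working in percentage marks, assuming "only 40% are promoted" means the top 40% (the variable names are illustrative):

```r
mu         <- 53      # mean mark, in percent
sigma      <- 15      # standard deviation, in percent
n_students <- 1200

# (a) expected number scoring above 50%
p_pass        <- pnorm(50, mu, sigma, lower.tail = FALSE)
expected_pass <- n_students * p_pass       # about 695 students

# (b) mark exceeded by the top 40% of students
cutoff <- qnorm(0.40, mu, sigma, lower.tail = FALSE)   # about 56.8%
```

Using `lower.tail = FALSE` states the right-tail area directly, which avoids the `1 - p` step and the argument mix-ups that come with it.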