Answer Keys to Practices in R Short Course R Basics: Practice 1

This document shows example R code and output for the practices in an R basics short course. Practice 1 extracts the dimensions, individual cells, each column, and one row from a dataset with 150 rows and 5 columns. Practice 2a reads a dataset with 988,346 rows and 38 columns from a CSV file, extracts two columns as new variables, computes the mean, variance, minimum, maximum, and median of one of them, stores the results in a vector, and draws a histogram. Practice 2b draws boxplots and a normal Q-Q plot, counts observations with a for loop, and exports the results to files.

Uploaded by

emad
Answer keys to practices in R short course R basics

This file shows the code that is run and, where there is any, the result that R prints. In the original
document the code lines are highlighted in blue and the result lines in black.

Practice 1
#1. What is the dimension of our dataset?
> dim(mydatacsv)
[1] 150 5
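
As a side note, the two numbers returned by dim() are also available one at a time through nrow() and ncol(). The sketch below assumes that mydatacsv has the same contents as R's built-in iris data, since the values printed in this answer key match it; that stand-in is an assumption and not part of the course files.

```r
# Assumption: the built-in iris data stands in for mydatacsv.
mydatacsv <- iris

d <- dim(mydatacsv)            # c(number of rows, number of columns)
n_rows <- nrow(mydatacsv)      # same value as d[1]
n_cols <- ncol(mydatacsv)      # same value as d[2]
col_names <- names(mydatacsv)  # column names, in column order

print(d)
print(col_names)
```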

#2. Assign the value of the cell [2,3] to the new variable var1
var1=mydatacsv[2,3]

#3. Assign the value of the cell [10,4] to the new variable var2
var2=mydatacsv[10,4]
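
A cell can also be addressed by column name instead of column position, which may be easier to read back later. This is a minimal sketch that again assumes the built-in iris data stands in for mydatacsv; the names var1_by_name and var2_by_name are made up for the illustration.

```r
# Assumption: the built-in iris data stands in for mydatacsv.
mydatacsv <- iris

# Row 2, column 3, by position and by column name.
var1 <- mydatacsv[2, 3]
var1_by_name <- mydatacsv[2, "Petal.Length"]

# Row 10, column 4, by position and by column name.
var2 <- mydatacsv[10, 4]
var2_by_name <- mydatacsv[10, "Petal.Width"]

print(c(var1, var2))
```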

#4. Output the value of each column separately.


> mydatacsv[,1]
[1] 5.1 4.9 4.7 4.6 5.0 5.4 4.6 5.0 4.4 4.9 5.4 4.8 4.8 4.3 5.8 5.7 5.4 5.1 5.7 5.1 5.4
[22] 5.1 4.6 5.1 4.8 5.0 5.0 5.2 5.2 4.7 4.8 5.4 5.2 5.5 4.9 5.0 5.5 4.9 4.4 5.1 5.0 4.5
[43] 4.4 5.0 5.1 4.8 5.1 4.6 5.3 5.0 7.0 6.4 6.9 5.5 6.5 5.7 6.3 4.9 6.6 5.2 5.0 5.9 6.0
[64] 6.1 5.6 6.7 5.6 5.8 6.2 5.6 5.9 6.1 6.3 6.1 6.4 6.6 6.8 6.7 6.0 5.7 5.5 5.5 5.8 6.0
[85] 5.4 6.0 6.7 6.3 5.6 5.5 5.5 6.1 5.8 5.0 5.6 5.7 5.7 6.2 5.1 5.7 6.3 5.8 7.1 6.3 6.5
[106] 7.6 4.9 7.3 6.7 7.2 6.5 6.4 6.8 5.7 5.8 6.4 6.5 7.7 7.7 6.0 6.9 5.6 7.7 6.3 6.7 7.2
[127] 6.2 6.1 6.4 7.2 7.4 7.9 6.4 6.3 6.1 7.7 6.3 6.4 6.0 6.9 6.7 6.9 5.8 6.8 6.7 6.7 6.3
[148] 6.5 6.2 5.9
> mydatacsv[,2]
[1] 3.5 3.0 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 3.7 3.4 3.0 3.0 4.0 4.4 3.9 3.5 3.8 3.8 3.4
[22] 3.7 3.6 3.3 3.4 3.0 3.4 3.5 3.4 3.2 3.1 3.4 4.1 4.2 3.1 3.2 3.5 3.6 3.0 3.4 3.5 2.3
[43] 3.2 3.5 3.8 3.0 3.8 3.2 3.7 3.3 3.2 3.2 3.1 2.3 2.8 2.8 3.3 2.4 2.9 2.7 2.0 3.0 2.2
[64] 2.9 2.9 3.1 3.0 2.7 2.2 2.5 3.2 2.8 2.5 2.8 2.9 3.0 2.8 3.0 2.9 2.6 2.4 2.4 2.7 2.7
[85] 3.0 3.4 3.1 2.3 3.0 2.5 2.6 3.0 2.6 2.3 2.7 3.0 2.9 2.9 2.5 2.8 3.3 2.7 3.0 2.9 3.0
[106] 3.0 2.5 2.9 2.5 3.6 3.2 2.7 3.0 2.5 2.8 3.2 3.0 3.8 2.6 2.2 3.2 2.8 2.8 2.7 3.3 3.2
[127] 2.8 3.0 2.8 3.0 2.8 3.8 2.8 2.8 2.6 3.0 3.4 3.1 3.0 3.1 3.1 3.1 2.7 3.2 3.3 3.0 2.5
[148] 3.0 3.4 3.0
> mydatacsv[,3]
[1] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 1.5 1.6 1.4 1.1 1.2 1.5 1.3 1.4 1.7 1.5 1.7
[22] 1.5 1.0 1.7 1.9 1.6 1.6 1.5 1.4 1.6 1.6 1.5 1.5 1.4 1.5 1.2 1.3 1.4 1.3 1.5 1.3 1.3
[43] 1.3 1.6 1.9 1.4 1.6 1.4 1.5 1.4 4.7 4.5 4.9 4.0 4.6 4.5 4.7 3.3 4.6 3.9 3.5 4.2 4.0
[64] 4.7 3.6 4.4 4.5 4.1 4.5 3.9 4.8 4.0 4.9 4.7 4.3 4.4 4.8 5.0 4.5 3.5 3.8 3.7 3.9 5.1
[85] 4.5 4.5 4.7 4.4 4.1 4.0 4.4 4.6 4.0 3.3 4.2 4.2 4.2 4.3 3.0 4.1 6.0 5.1 5.9 5.6 5.8
[106] 6.6 4.5 6.3 5.8 6.1 5.1 5.3 5.5 5.0 5.1 5.3 5.5 6.7 6.9 5.0 5.7 4.9 6.7 4.9 5.7 6.0
[127] 4.8 4.9 5.6 5.8 6.1 6.4 5.6 5.1 5.6 6.1 5.6 5.5 4.8 5.4 5.6 5.1 5.1 5.9 5.7 5.2 5.0
[148] 5.2 5.4 5.1
> mydatacsv[,4]
[1] 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 0.2 0.2 0.1 0.1 0.2 0.4 0.4 0.3 0.3 0.3 0.2
[22] 0.4 0.2 0.5 0.2 0.2 0.4 0.2 0.2 0.2 0.2 0.4 0.1 0.2 0.2 0.2 0.2 0.1 0.2 0.2 0.3 0.3
[43] 0.2 0.6 0.4 0.3 0.2 0.2 0.2 0.2 1.4 1.5 1.5 1.3 1.5 1.3 1.6 1.0 1.3 1.4 1.0 1.5 1.0
[64] 1.4 1.3 1.4 1.5 1.0 1.5 1.1 1.8 1.3 1.5 1.2 1.3 1.4 1.4 1.7 1.5 1.0 1.1 1.0 1.2 1.6
[85] 1.5 1.6 1.5 1.3 1.3 1.3 1.2 1.4 1.2 1.0 1.3 1.2 1.3 1.3 1.1 1.3 2.5 1.9 2.1 1.8 2.2
[106] 2.1 1.7 1.8 1.8 2.5 2.0 1.9 2.1 2.0 2.4 2.3 1.8 2.2 2.3 1.5 2.3 2.0 2.0 1.8 2.1 1.8
[127] 1.8 1.8 2.1 1.6 1.9 2.0 2.2 1.5 1.4 2.3 2.4 1.8 1.8 2.1 2.4 2.3 1.9 2.3 2.5 2.3 1.9
[148] 2.0 2.3 1.8
> mydatacsv[,5]
[1] setosa setosa setosa setosa setosa setosa setosa
[8] setosa setosa setosa setosa setosa setosa setosa
[15] setosa setosa setosa setosa setosa setosa setosa
[22] setosa setosa setosa setosa setosa setosa setosa
[29] setosa setosa setosa setosa setosa setosa setosa
[36] setosa setosa setosa setosa setosa setosa setosa
[43] setosa setosa setosa setosa setosa setosa setosa
[50] setosa versicolor versicolor versicolor versicolor versicolor versicolor
[57] versicolor versicolor versicolor versicolor versicolor versicolor versicolor
[64] versicolor versicolor versicolor versicolor versicolor versicolor versicolor
[71] versicolor versicolor versicolor versicolor versicolor versicolor versicolor
[78] versicolor versicolor versicolor versicolor versicolor versicolor versicolor
[85] versicolor versicolor versicolor versicolor versicolor versicolor versicolor
[92] versicolor versicolor versicolor versicolor versicolor versicolor versicolor
[99] versicolor versicolor virginica virginica virginica virginica virginica
[106] virginica virginica virginica virginica virginica virginica virginica
[113] virginica virginica virginica virginica virginica virginica virginica
[120] virginica virginica virginica virginica virginica virginica virginica
[127] virginica virginica virginica virginica virginica virginica virginica
[134] virginica virginica virginica virginica virginica virginica virginica
[141] virginica virginica virginica virginica virginica virginica virginica
[148] virginica virginica virginica
Levels: setosa versicolor virginica

#5. Assign the values of Petal.Width to a new variable PW.


PW=mydatacsv[,4]
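
The same column can be pulled out by name, which does not depend on the column order. A short sketch, assuming the built-in iris data stands in for mydatacsv (the three PW_* names are illustrative only):

```r
# Assumption: the built-in iris data stands in for mydatacsv.
mydatacsv <- iris

PW_by_position <- mydatacsv[, 4]                # as in the answer above
PW_by_dollar   <- mydatacsv$Petal.Width         # by name with $
PW_by_brackets <- mydatacsv[["Petal.Width"]]    # by name with [[ ]]

# All three hold the same 150 values.
print(length(PW_by_position))
```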

#6. Output the value of row 15.


mydatacsv[15,]
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
15          5.8           4          1.2         0.2  setosa

Practice 2a.
#1. Read into R the dataset pubfileb.csv.
public<- read.table('pubfileb.csv', sep=',', header=TRUE)

#2. Determine the dimensions of the dataset.


dim(public)
[1] 988346 38
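
For comma-separated files, read.csv() is a wrapper around read.table() whose defaults already include sep="," and header=TRUE. The sketch below uses a small toy file written to a temporary location, not the course's pubfileb.csv; the data frame toy and its column names are hypothetical.

```r
# Hypothetical toy data written to a temporary CSV file.
toy <- data.frame(age = c(12, 40, 7), ms = c(1, 2, 1))
path <- tempfile(fileext = ".csv")
write.csv(toy, path, row.names = FALSE)

# Explicit read.table() call, as in the answer above.
a <- read.table(path, sep = ",", header = TRUE)

# read.csv() uses sep = "," and header = TRUE by default.
b <- read.csv(path)

print(dim(a))
```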

#3. Extract the variable povpct, income as percent of poverty level (column 35) as a new variable.
povpct=public[,35]

#4. Extract the variable ms, marital status (column 5) as a new variable.
ms= public[,5]

#5. Obtain the minimum, maximum, mean, variance, median for the variable povpct and store them in separate variables.
# The names v, mn and mx avoid masking the base functions var(), min() and max().
m=mean(povpct)
v=var(povpct)
mn=min(povpct)
mx=max(povpct)
med=median(povpct)

#6. Create a vector with the stored values from 5.


sumvec=c(m,v,mn,mx,med)
summary(povpct)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   1.00    5.00   10.00   10.27   16.00   21.00
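
One optional refinement is to name the elements of the vector so that each statistic can be looked up by name rather than by position. The sketch below uses five toy values, not the povpct data, and the element names are chosen only for illustration.

```r
# Toy values for illustration only; not the povpct data.
x <- c(1, 5, 10, 16, 21)

# A named numeric vector: each statistic carries its own label.
sumvec <- c(mean   = mean(x),
            var    = var(x),
            min    = min(x),
            max    = max(x),
            median = median(x))

print(sumvec)
print(sumvec["median"])  # look up one statistic by name
```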

#7. Create a histogram of povpct of a different color with 20 breaks.


hist(povpct, main="Histogram of povpct", col="tomato", breaks=20)
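
Note that when breaks is a single number, hist() treats it as a suggestion and chooses "pretty" breakpoints, so the number of bars is not guaranteed to be exactly 20. The sketch below uses simulated toy data, not povpct, and sets plot=FALSE so that only the bin information is computed; passing a vector of breakpoints fixes the bins exactly.

```r
# Toy data for illustration only; not the povpct variable.
set.seed(1)
x <- sample(1:21, 500, replace = TRUE)

# breaks = 20 is a suggestion; inspect how many bins R actually used.
h <- hist(x, breaks = 20, plot = FALSE)
n_bins <- length(h$counts)

# An explicit vector of breakpoints gives exactly one bin per integer 1..21.
h2 <- hist(x, breaks = seq(0.5, 21.5, by = 1), plot = FALSE)

print(n_bins)
print(length(h2$counts))
```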
Practice 2b.
#1. Create a boxplot of povpct of a different color.
boxplot(povpct, main="Boxplot of povpct", col="limegreen", ylab="povpct")
#2. Create a boxplot of povpct by ms with the same color for all boxes.
boxplot(povpct~ms,col="blue", main="Boxplot of povpct by Marital Status")
#3. Create a boxplot of povpct by ms with the same color for the first three boxes and another color for the remaining three boxes.
boxplot(povpct~ms,col=c("blue", "blue", "blue", "yellow", "yellow", "yellow"),
main="Boxplot of povpct by Marital Status")
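
The repeated color names can also be generated with rep(), which is less error-prone when there are many boxes. A minimal sketch on simulated toy data with six groups (the variables g and y are hypothetical and unrelated to the course dataset):

```r
# Toy data: six groups of ten simulated values each.
set.seed(1)
g <- factor(rep(1:6, each = 10))
y <- rnorm(60)

# "blue" three times, then "yellow" three times.
cols <- rep(c("blue", "yellow"), each = 3)

# One color per box, in the order of the factor levels.
boxplot(y ~ g, col = cols, main = "Toy boxplot by group")
```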
#4. Create a normal Q-Q plot for povpct.
qqnorm(povpct, main="Normal QQ Plot povpct")
#5. Using for loops, count how many observations are in a metropolitan area (smsast=1) (col 20) with an age lower than 15 (col 2).
count=0
for(i in 1:dim(public)[1]){
  if (public[i,20]==1){
    if(public[i,2]<15) count=count+1
  }
}
count
[1] 53671
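
The exercise asks for a for loop, but the same count can be cross-checked with a vectorized expression, which is usually much faster on a dataset of this size. The sketch below uses a small hypothetical data frame (toy, with columns age and smsast) rather than the course dataset.

```r
# Hypothetical toy data; not the course dataset.
toy <- data.frame(age    = c(10, 20, 14, 8, 30),
                  smsast = c(1, 1, 2, 1, 1))

# Loop version, mirroring the answer above.
count_loop <- 0
for (i in seq_len(nrow(toy))) {
  if (toy$smsast[i] == 1 && toy$age[i] < 15) count_loop <- count_loop + 1
}

# Vectorized version: TRUE counts as 1 when summed.
count_vec <- sum(toy$smsast == 1 & toy$age < 15)

print(c(count_loop, count_vec))
```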

#6. Export your extracted variables as a .csv file and the dataset as a tab delimited .txt file.
write.csv(povpct,file="povpct.csv")
write.table(public,file="public.txt", sep="\t")
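
By default both write.csv() and write.table() also write the row names as an extra column; row.names=FALSE leaves them out, which makes the file easier to read back unchanged. The round trip below is a sketch on a hypothetical toy data frame written to temporary files, not the course data.

```r
# Hypothetical toy data; not the course dataset.
toy <- data.frame(povpct = c(1, 10, 21), ms = c(1, 2, 5))

csv_path <- tempfile(fileext = ".csv")
txt_path <- tempfile(fileext = ".txt")

# Export without the row-name column.
write.csv(toy, file = csv_path, row.names = FALSE)
write.table(toy, file = txt_path, sep = "\t", row.names = FALSE)

# Read both files back; read.delim() expects tab-separated text with a header.
back_csv <- read.csv(csv_path)
back_txt <- read.delim(txt_path)

print(back_csv)
```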
