L.R.G. GOVERNMENT ARTS COLLEGE FOR WOMEN
(AFFILIATED TO BHARATHIAR UNIVERSITY)
TIRUPUR-641604
DEPARTMENT OF COMPUTER SCIENCE
I M.SC COMPUTER SCIENCE
PRACTICAL-III: DATA MINING USING R
NAME    :
REG. NO :
CERTIFICATE
L.R.G. GOVERNMENT ARTS COLLEGE FOR WOMEN
(AFFILIATED TO BHARATHIAR UNIVERSITY)
TIRUPUR-641604
NAME        :
REGISTER NO :
CLASS       :
This is to certify that this is a bonafide record of practical work done by the above student of I M.SC COMPUTER SCIENCE in PRACTICAL-III: DATA MINING USING R during the academic year 2023-2024.
Staff in Charge                                                      Head of the Department
Submitted for the Bharathiar University Practical Examination held on . . . . . . . . . . . . . . . .
Internal Examiner                                                        External Examiner
CONTENT

INDEX

S.NO   DATE   CONTENTS                                           PAGE NO   SIGN
1             APRIORI ALGORITHM TO EXTRACT ASSOCIATION
              RULES OF DATA MINING
2             K-MEANS CLUSTERING ALGORITHM
3             HIERARCHICAL CLUSTERING
4             CLASSIFICATION ALGORITHM
5             DECISION TREE
6             LINEAR REGRESSION
7             DATA VISUALIZATION
1. APRIORI ALGORITHM TO EXTRACT ASSOCIATION RULES OF DATA MINING
# Loading libraries
library(arules)
library(arulesViz)
library(RColorBrewer)
# Import the Groceries transaction dataset
data('Groceries')
# Mine association rules with apriori() at 1% support and 20% confidence
rules<-apriori(Groceries,parameter=list(supp=0.01,conf=0.2))
# Inspect the first ten rules
inspect(rules[1:10])
# Plot the relative frequency of the 20 most common items
arules::itemFrequencyPlot(Groceries,topN=20,
                          col=brewer.pal(8,'Pastel2'),
                          main='Relative Item Frequency Plot',
                          type='relative',
                          ylab='Item Frequency(Relative)')
OUTPUT:
# Loading Libraries
>library(arules)
Loading required package: Matrix
Attaching package: ‘arules’
The following objects are masked from ‘package:base’:
    abbreviate, write
> library(arulesViz)
> library(RColorBrewer)
> # import dataset
> data('Groceries')
> # using apriori() function
> rules<-apriori(Groceries,parameter=list(supp=0.01,conf=0.2))
Apriori

Parameter specification:
 confidence minval smax arem  aval originalSupport maxtime support minlen maxlen target ext
        0.2    0.1    1 none FALSE            TRUE       5    0.01      1     10  rules TRUE

Algorithmic control:
 filter tree heap memopt load sort verbose
    0.1 TRUE TRUE  FALSE TRUE    2    TRUE
Absolute minimum support count: 98
set item appearances ...[0 item(s)] done [0.00s].
set transactions ...[169 item(s), 9835 transaction(s)] done [0.01s].
sorting and recoding items ... [88 item(s)] done [0.00s].
creating transaction tree ... done [0.01s].
checking subsets of size 1 2 3 4 done [0.00s].
writing ... [232 rule(s)] done [0.00s].
creating S4 object ... done [0.00s].
> # using inspect() function
> inspect(rules[1:10])
     lhs                rhs                support    confidence coverage   lift     count
[1]  {}              => {whole milk}       0.25551601 0.2555160  1.00000000 1.000000 2513
[2]  {hard cheese}   => {whole milk}       0.01006609 0.4107884  0.02450432 1.607682   99
[3]  {butter milk}   => {other vegetables} 0.01037112 0.3709091  0.02796136 1.916916  102
[4]  {butter milk}   => {whole milk}       0.01159126 0.4145455  0.02796136 1.622385  114
[5]  {ham}           => {whole milk}       0.01148958 0.4414062  0.02602949 1.727509  113
[6]  {sliced cheese} => {whole milk}       0.01077783 0.4398340  0.02450432 1.721356  106
[7]  {oil}           => {whole milk}       0.01128622 0.4021739  0.02806304 1.573968  111
[8]  {onions}        => {other vegetables} 0.01423488 0.4590164  0.03101169 2.372268  140
[9]  {onions}        => {whole milk}       0.01209964 0.3901639  0.03101169 1.526965  119
[10] {berries}       => {yogurt}           0.01057448 0.3180428  0.03324860 2.279848  104
> # using itemFrequencyPlot() function
> arules::itemFrequencyPlot(Groceries,topN=20,
+ col=brewer.pal(8,'Pastel2'),
+ main='Relative Item Frequency Plot',
+ type='relative',
+ ylab='Item Frequency(Relative)')
[Plot: Relative Item Frequency Plot]
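As a possible extension (not part of the recorded run), the mined rules can be ranked before inspection. A minimal sketch, assuming the rules object from above; sort(), head() and inspect() for rule sets are standard arules functions, and by="lift" is just one reasonable ranking choice:

# Sketch: rank the mined rules by lift and inspect the strongest ones
top_rules<-sort(rules,by="lift",decreasing=TRUE)  # order rules by lift
inspect(head(top_rules,5))                        # show the five highest-lift rules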
2. K-MEANS CLUSTERING
library(cluster)
df=USArrests
#Number of Rows and Columns of the Actual Dataset
dim(df)
head(df)
#remove rows with missing values
df=na.omit(df)
dim(df)
#scale each variable to have mean 0 and sd 1
df=scale(df)
head(df)
set.seed(1)
#Cluster the dataset with 5 groups
km=kmeans(df,centers=5,nstart=25)
print(km)
plot(df)
points(km$centers,col=1:5,pch=8,cex=2)
cnt=table(km$cluster)
print(cnt)
final_data=cbind(df,cluster=km$cluster)
head(final_data)
plot(final_data,cex=0.6,main="Final Data")
ag=aggregate(final_data,by=list(cluster=km$cluster),mean)
head(ag)
plot(ag,cex=0.6,main="Aggregate")
 OUTPUT:
#K-means Clustering
> #Number of Rows and Columns of the Actual Dataset
> dim(df)
[1] 50 4
> head(df)
           Murder Assault UrbanPop Rape
Alabama      13.2     236       58 21.2
Alaska       10.0     263       48 44.5
Arizona       8.1     294       80 31.0
Arkansas      8.8     190       50 19.5
California    9.0     276       91 40.6
Colorado      7.9     204       78 38.7
> #remove rows with missing values
> dim(df)
[1] 50 4
> #scale each variable to have mean 0 and sd 1
> head(df)
               Murder   Assault   UrbanPop         Rape
Alabama    1.24256408 0.7828393 -0.5209066 -0.003416473
Alaska     0.50786248 1.1068225 -1.2117642  2.484202941
Arizona    0.07163341 1.4788032  0.9989801  1.042878388
Arkansas   0.23234938 0.2308680 -1.0735927 -0.184916602
California 0.27826823 1.2628144  1.7589234  2.067820292
Colorado   0.02571456 0.3988593  0.8608085  1.864967207
> #Cluster the dataset with 5 groups
> print(km)
K-means clustering with 5 clusters of sizes 7, 10, 10, 11, 12

Cluster means:
      Murder    Assault   UrbanPop        Rape
1 1.5803956 0.9662584 -0.7775109 0.04844071
2 -1.1727674 -1.2078573 -1.0045069 -1.10202608
3 -0.6286291 -0.4086988 0.9506200 -0.38883734
4 -0.1642225 -0.3658283 -0.2822467 -0.11697538
5 0.7298036 1.1188219 0.7571799 1.32135653
Clustering vector:
       Alabama         Alaska        Arizona       Arkansas     California 
             1              5              5              4              5 
      Colorado    Connecticut       Delaware        Florida        Georgia 
             5              3              3              5              1 
        Hawaii          Idaho       Illinois        Indiana           Iowa 
             3              2              5              4              2 
        Kansas       Kentucky      Louisiana          Maine       Maryland 
             4              4              1              2              5 
 Massachusetts       Michigan      Minnesota    Mississippi       Missouri 
             3              5              2              1              4 
       Montana       Nebraska         Nevada  New Hampshire     New Jersey 
             4              4              5              2              3 
    New Mexico       New York North Carolina   North Dakota           Ohio 
             5              5              1              2              3 
      Oklahoma         Oregon   Pennsylvania   Rhode Island South Carolina 
             4              4              3              3              1 
  South Dakota      Tennessee          Texas           Utah        Vermont 
             2              1              5              3              2 
      Virginia     Washington  West Virginia      Wisconsin        Wyoming 
             4              3              2              2              4 
Within cluster sum of squares by cluster:
[1] 6.128432 7.443899 9.326266 7.788275 18.257332
(between_SS / total_SS = 75.0 %)
Available components:
[1] "cluster"     "centers"   "totss"            "withinss"    "tot.withinss"
[6] "betweenss"     "size"        "iter"         "ifault"
> plot(df)
> points(km$centers,col=1:5,pch=8,cex=2)
> print(cnt)
 1  2  3  4  5 
 7 10 10 11 12 
> head(final_data)
               Murder   Assault   UrbanPop         Rape cluster
Alabama    1.24256408 0.7828393 -0.5209066 -0.003416473       1
Alaska     0.50786248 1.1068225 -1.2117642  2.484202941       5
Arizona    0.07163341 1.4788032  0.9989801  1.042878388       5
Arkansas   0.23234938 0.2308680 -1.0735927 -0.184916602       4
California 0.27826823 1.2628144  1.7589234  2.067820292       5
Colorado   0.02571456 0.3988593  0.8608085  1.864967207       5
> plot(final_data,cex=0.6,main="Final Data")
> head(ag)
    cluster   Murder Assault UrbanPop       Rape cluster
1       1 1.5803956 0.9662584 -0.7775109 0.04844071        1
2       2 -1.1727674 -1.2078573 -1.0045069 -1.10202608     2
3       3 -0.6286291 -0.4086988 0.9506200 -0.38883734      3
4       4 -0.1642225 -0.3658283 -0.2822467 -0.11697538     4
5       5 0.7298036 1.1188219 0.7571799 1.32135653         5
> plot(ag,cex=0.6,main="Aggregate")
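A common way to justify centers=5 is the elbow method. A minimal, self-contained sketch (illustrative, not part of the recorded output); the upper bound max_k=10 is an assumption:

# Sketch: elbow method for choosing k on the scaled USArrests data
library(cluster)
df<-scale(na.omit(USArrests))
max_k<-10
set.seed(1)
wss<-sapply(1:max_k,function(k) kmeans(df,centers=k,nstart=25)$tot.withinss)
plot(1:max_k,wss,type="b",xlab="Number of clusters k",
     ylab="Total within-cluster SS",main="Elbow Method")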
3. HIERARCHICAL CLUSTERING
#Hierarchical Clustering
library(cluster)
df=USArrests
#remove rows with missing values
df=na.omit(df)
#scale each variable to have a mean of 0 and sd of 1
df=scale(df)
head(df)
d=dist(df,method="euclidean")
#complete-linkage dendrogram
hc1=hclust(d,method="complete")
plot(hc1,cex=0.6,main="Complete Dendrogram",hang=-1)
#average-linkage dendrogram
hc2=hclust(d,method="average")
plot(hc2,cex=0.6,main="Average Dendrogram",hang=-1)
abline(h=3.0,col="green")
groups=cutree(hc2,k=4)
print(groups)
table(groups)
rect.hclust(hc2,k=4,border="red")
final_data=cbind(df,cluster=groups)
head(final_data)
plot(final_data,cex=0.6,main="Final Data")
 OUTPUT:
#Hierarchical Clustering
> #remove rows with missing values
> #scale each variable to have a mean 0 and sd of 1
> head(df)
               Murder   Assault   UrbanPop         Rape
Alabama    1.24256408 0.7828393 -0.5209066 -0.003416473
Alaska     0.50786248 1.1068225 -1.2117642  2.484202941
Arizona    0.07163341 1.4788032  0.9989801  1.042878388
Arkansas   0.23234938 0.2308680 -1.0735927 -0.184916602
California 0.27826823 1.2628144  1.7589234  2.067820292
Colorado   0.02571456 0.3988593  0.8608085  1.864967207
> #complete-linkage dendrogram
> plot(hc1,cex=0.6,main="Complete Dendrogram",hang=-1)
> #average-linkage dendrogram
> print(groups)
       Alabama         Alaska        Arizona       Arkansas     California 
             1              2              3              4              3 
      Colorado    Connecticut       Delaware        Florida        Georgia 
             3              4              4              3              1 
        Hawaii          Idaho       Illinois        Indiana           Iowa 
             4              4              3              4              4 
        Kansas       Kentucky      Louisiana          Maine       Maryland 
             4              4              1              4              3 
 Massachusetts       Michigan      Minnesota    Mississippi       Missouri 
             4              3              4              1              3 
       Montana       Nebraska         Nevada  New Hampshire     New Jersey 
             4              4              3              4              4 
    New Mexico       New York North Carolina   North Dakota           Ohio 
             3              3              1              4              4 
      Oklahoma         Oregon   Pennsylvania   Rhode Island South Carolina 
             4              4              4              4              1 
  South Dakota      Tennessee          Texas           Utah        Vermont 
             4              1              3              4              4 
      Virginia     Washington  West Virginia      Wisconsin        Wyoming 
             4              4              4              4              4 
> table(groups)
groups
 1  2  3  4 
 7  1 12 30 
> rect.hclust(hc2,k=4,border="red")
> final_data=cbind(df,cluster=groups)
> head(final_data)
               Murder   Assault   UrbanPop         Rape cluster
Alabama    1.24256408 0.7828393 -0.5209066 -0.003416473       1
Alaska     0.50786248 1.1068225 -1.2117642  2.484202941       2
Arizona    0.07163341 1.4788032  0.9989801  1.042878388       3
Arkansas   0.23234938 0.2308680 -1.0735927 -0.184916602       4
California 0.27826823 1.2628144  1.7589234  2.067820292       3
Colorado   0.02571456 0.3988593  0.8608085  1.864967207       3
> plot(final_data,cex=0.6,main="Final Data")
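Since the average-linkage plot draws a reference line at h=3.0, the tree can also be cut by height rather than by a fixed k. A minimal sketch, assuming the hc2 object from above; cutree() with h= is part of base R's stats package:

# Sketch: cut the average-linkage dendrogram at height 3.0 instead of k=4
groups_h<-cutree(hc2,h=3.0)  # cluster labels implied by the h=3.0 cut
table(groups_h)              # compare these sizes with table(groups)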
4. CLASSIFICATION ALGORITHM
#Classification Algorithm
library(class)
data(iris)
#Number of Rows and Columns
dim(iris)
head(iris)
rand=sample(1:nrow(iris),0.9*nrow(iris))
head(rand)
#Scale the values using the normalization method
nor<-function(x)
  return((x-min(x))/(max(x)-min(x)))
iris_norm=as.data.frame(lapply(iris[,c(1,2,3,4)],nor))
head(iris_norm)
#Train dataset
iris_train=iris_norm[rand,]
iris_train_target=iris[rand,5]
#Test dataset
iris_test=iris_norm[-rand,]
iris_test_target=iris[-rand,5]
dim(iris_train)
dim(iris_test)
#K-nearest neighbour classification
model1=knn(train=iris_train,test=iris_test,cl=iris_train_target,k=7)
#Confusion Matrix
tab=table(model1,iris_test_target)
print(tab)
accuracy=function(x)
sum(diag(x)/sum(rowSums(x)))*100
cat("Accuracy classifier=",accuracy(tab))
 OUTPUT:
#Classification Algorithm
> #Number of Rows and Columns
> dim(iris)
[1] 150 5
> head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
> head(rand)
[1] 114 107 25 128 14 24
> #Scale the values using Normalization method
> nor<-function(x)
+{
+ return((x-min(x))/(max(x)-min(x)))
+}
> head(iris_norm)
  Sepal.Length Sepal.Width Petal.Length Petal.Width
1   0.22222222   0.6250000   0.06779661  0.04166667
2   0.16666667   0.4166667   0.06779661  0.04166667
3   0.11111111   0.5000000   0.05084746  0.04166667
4   0.08333333   0.4583333   0.08474576  0.04166667
5   0.19444444   0.6666667   0.06779661  0.04166667
6   0.30555556   0.7916667   0.11864407  0.12500000
> #Train dataset
> iris_train=iris_norm[rand,]
> iris_train_target=iris[rand,5]
> #Test dataset
> iris_test=iris_norm[-rand,]
> iris_test_target=iris[-rand,5]
> dim(iris_train)
[1] 135 4
> dim(iris_test)
[1] 15 4
> #K-nearest neighbour classification
> model1=knn(train=iris_train,test=iris_test,cl=iris_train_target,k=7)
> #Confusion Matrix
> tab=table(model1,iris_test_target)
> print(tab)
            iris_test_target
model1       setosa versicolor virginica
  setosa          6          0         0
  versicolor      0          6         1
  virginica       0          0         2
> accuracy=function(x)
+{
+ sum(diag(x)/sum(rowSums(x)))*100
+}
> cat("Accuracy classifier=",accuracy(tab))
Accuracy classifier= 100
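The program fixes k=7; a small sketch like the one below, assuming the iris_train, iris_test and target vectors from this program, would compare accuracy across several candidate k values:

# Sketch: evaluate knn accuracy for a few values of k
for(k in c(1,3,5,7,9)){
  pred<-knn(train=iris_train,test=iris_test,cl=iris_train_target,k=k)
  acc<-sum(pred==iris_test_target)/length(iris_test_target)*100
  cat("k =",k," accuracy =",acc,"%\n")
}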
5. DECISION TREE
#Decision Tree
library(rpart)
data=iris
str(data)
head(data)
#creating the decision tree using regression
dtree=rpart(Sepal.Width~Sepal.Length+Petal.Width+Petal.Length+Species,
            data=iris,method="anova")
plot(dtree,uniform=TRUE,main="Sepal Width Decision Tree Using Regression")
print(dtree)
text(dtree,use.n=TRUE,cex=.7)
#predicting the Sepal Width
adata<-data.frame(Species='versicolor',Sepal.Length=5.1,Petal.Length=4.5,Petal.Width=1.4)
cat("Predicted Value:\n")
pt=predict(dtree,adata,method="anova")
print(pt)
plot(pt)
#creating the decision tree using classification
df=as.data.frame(data)
dt=rpart(Sepal.Width~Sepal.Length+Petal.Width+Petal.Length+Species,
         data=df,method="class")
plot(dt,uniform=TRUE,main="Sepal Width Decision Tree using Classification")
print(dt)
text(dt,use.n=TRUE,cex=.7)
 OUTPUT:
> #Decision Tree
> head(data)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
> #creating the decision tree using regression
> dtree=rpart(Sepal.Width~Sepal.Length+Petal.Width+Petal.Length+Species,
+             data=iris,method="anova")
> plot(dtree,uniform=TRUE,main="Sepal Width Decision Tree Using Regression")
> print(dtree)
n= 150
node), split, n, deviance, yval
    * denotes terminal node
1) root 150 28.3069300 3.057333
 2) Species=versicolor,virginica 100 10.9616000 2.872000
   4) Petal.Length< 4.05 16 0.7975000 2.487500 *
   5) Petal.Length>=4.05 84 7.3480950 2.945238
    10) Petal.Width< 1.95 55 3.4920000 2.860000
     20) Sepal.Length< 6.35 36 2.5588890 2.805556 *
     21) Sepal.Length>=6.35 19 0.6242105 2.963158 *
    11) Petal.Width>=1.95 29 2.6986210 3.106897
     22) Petal.Length< 5.25 7 0.3285714 2.914286 *
     23) Petal.Length>=5.25 22 2.0277270 3.168182 *
 3) Species=setosa 50 7.0408000 3.428000
   6) Sepal.Length< 5.05 28 2.0496430 3.203571 *
   7) Sepal.Length>=5.05 22 1.7859090 3.713636 *
> text(dtree,use.n=TRUE,cex=.7)
> #predicting the Sepal Width
> adata<-data.frame(Species='versicolor',Sepal.Length=5.1,Petal.Length=4.5,Petal.Width=1.4)
> cat("Predicted Value:\n")
Predicted Value:
> pt=predict(dtree,adata,method="anova")
> print(pt)
       1 
2.805556 
> plot(pt)
> #creating the decision tree using classification
> plot(dt,uniform=TRUE,main="Sepal Width Decision Tree using Classification")
> print(dt)
n= 150
node), split, n, loss, yval, (yprob)
    * denotes terminal node
 1) root 150 124 3 (0.0067 0.02 0.027 0.02 0.053 0.033 0.06 0.093 0.067 0.17 0.073 0.087
0.04 0.08 0.04 0.027 0.02 0.04 0.013 0.0067 0.0067 0.0067 0.0067)
  2) Petal.Width>=0.8 100 80 3 (0.01 0.03 0.03 0.03 0.08 0.05 0.09 0.14 0.09 0.2 0.07 0.08
0.04 0.03 0 0.01 0 0.02 0 0 0 0 0)
   4) Sepal.Length< 6.45 65 55 2.8 (0.015 0.046 0.046 0.046 0.11 0.062 0.14 0.15 0.11 0.14
0.015 0.046 0.031 0.046 0 0 0 0 0 0 0 0 0)
    8) Petal.Width< 1.95 56 47 2.7 (0.018 0.054 0.054 0.054 0.11 0.071 0.16 0.11 0.12 0.16
0.018 0.036 0.018 0.018 0 0 0 0 0 0 0 0 0)
     16) Sepal.Length< 5.55 12 9 2.4 (0.083 0 0.17 0.25 0.25 0.083 0.083 0 0 0.083 0 0 0 0
0 0 0 0 0 0 0 0 0) *
     17) Sepal.Length>=5.55 44 36 2.7 (0 0.068 0.023 0 0.068 0.068 0.18 0.14 0.16 0.18
0.023 0.045 0.023 0.023 0 0 0 0 0 0 0 0 0)
       34) Petal.Width< 1.55 29 23 2.9 (0 0.1 0.034 0 0.069 0.1 0.1 0.17 0.21 0.17 0 0.034 0
0 0 0 0 0 0 0 0 0 0)
       68) Sepal.Length>=5.95 15 11 2.9 (0 0.2 0.067 0 0.067 0.067 0 0.2 0.27 0.067 0
0.067 0 0 0 0 0 0 0 0 0 0 0) *
        69) Sepal.Length< 5.95 14 10 3 (0 0 0 0 0.071 0.14 0.21 0.14 0.14 0.29 0 0 0 0 0 0 0
0 0 0 0 0 0) *
      35) Petal.Width>=1.55 15 10 2.7 (0 0 0 0 0.067 0 0.33 0.067 0.067 0.2 0.067 0.067
0.067 0.067 0 0 0 0 0 0 0 0 0) *
       9) Petal.Width>=1.95 9 5 2.8 (0 0 0 0 0.11 0 0 0.44 0 0 0 0.11 0.11 0.22 0 0 0 0 0 0 0 0
0) *
   5) Sepal.Length>=6.45 35 24 3 (0 0 0 0 0.029 0.029 0 0.11 0.057 0.31 0.17 0.14 0.057 0 0
0.029 0 0.057 0 0 0 0 0) *
  3) Petal.Width< 0.8 50 41 3.4 (0 0 0.02 0 0 0 0 0 0.02 0.12 0.08 0.1 0.04 0.18 0.12 0.06
0.06 0.08 0.04 0.02 0.02 0.02 0.02)
   6) Sepal.Length< 4.95 20 15 3 (0 0 0.05 0 0 0 0 0 0.05 0.25 0.2 0.2 0 0.15 0 0.1 0 0 0 0 0
0 0)
    12) Petal.Length< 1.45 13 8 3 (0 0 0.077 0 0 0 0 0 0.077 0.38 0 0.23 0 0.077 0 0.15 0 0 0
0 0 0 0) *
   13) Petal.Length>=1.45 7 3 3.1 (0 0 0 0 0 0 0 0 0 0 0.57 0.14 0 0.29 0 0 0 0 0 0 0 0 0) *
   7) Sepal.Length>=4.95 30 24 3.4 (0 0 0 0 0 0 0 0 0 0.033 0 0.033 0.067 0.2 0.2 0.033 0.1
0.13 0.067 0.033 0.033 0.033 0.033)
    14) Petal.Length< 1.45 11 7 3.5 (0 0 0 0 0 0 0 0 0 0 0 0.091 0.091 0.091 0.36 0.091 0 0
0.091 0.091 0 0.091 0) *
    15) Petal.Length>=1.45 19 14 3.4 (0 0 0 0 0 0 0 0 0 0.053 0 0 0.053 0.26 0.11 0 0.16
0.21 0.053 0 0.053 0 0.053) *
> text(dt,use.n=TRUE,cex=.7)
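The regression tree above is grown with rpart's defaults. As a hedged extension (not part of the recorded output), the complexity table can be inspected and the tree pruned; printcp(), plotcp() and prune() are all standard rpart functions, and picking the cp with the lowest cross-validated error is one common heuristic:

# Sketch: inspect the complexity-parameter table and prune the regression tree
printcp(dtree)    # cross-validated error for each cp value
plotcp(dtree)     # visual aid for picking cp
best_cp<-dtree$cptable[which.min(dtree$cptable[,"xerror"]),"CP"]
pruned<-prune(dtree,cp=best_cp)  # prune at the cp with lowest xerror
plot(pruned,uniform=TRUE,main="Pruned Tree")
text(pruned,use.n=TRUE,cex=.7)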
6. LINEAR REGRESSION
#Linear Regression
setwd("D:/R")
df=read.csv("h2.csv",header=TRUE)
print(df)
lr=lm(height~weight,data=df)
print(lr)
#Linear Regression
plot(df$height,df$weight,col="blue",main="Height_Weight Regression",
     cex=1.3,pch=15,xlab="height",ylab="weight")
print(summary(lr))
print(residuals(lr))
coeff=coefficients(lr)
eq=paste0("y = ",round(coeff[1],1)," + ",round(coeff[2],1),"*x")
print(eq)
#Linear Equation
new.weights=data.frame(weight=c(60,50))
print(new.weights)
df1=predict(lr,newdata=new.weights)
print(df1)
df2=data.frame(df1,new.weights)
names(df2)=c("height","weight")
print(df2)
df3=rbind(df,df2)
print(df3)
write.csv(df3,"h3.csv")
pie(table(df3$height))
 OUTPUT:
> #Linear Regression
> setwd("D:/R")
> df=read.csv("h2.csv",header=TRUE)
> print(df)
  height weight
1    174     80
2    150     70
3    160     75
4    180     85
> lr=lm(height~weight,data=df)
> print(lr)
Call:
lm(formula = height ~ weight, data = df)

Coefficients:
(Intercept)       weight  
       4.80         2.08  
> #Linear Regression
> plot(df$height,df$weight,col="blue",main="Height_Weight Regression",cex=1.3,pch=15,xlab="height",ylab="weight")
> print(summary(lr))
Call:
lm(formula = height ~ weight, data = df)

Residuals:
   1    2    3    4 
 2.8 -0.4 -0.8 -1.6 
 Coefficients:
            Estimate Std. Error t value Pr(>|t|)  
(Intercept)   4.8000    16.4463   0.292   0.7979  
weight        2.0800     0.2117   9.827   0.0102 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 2.366 on 2 degrees of freedom
Multiple R-squared: 0.9797,             Adjusted R-squared: 0.9696
F-statistic: 96.57 on 1 and 2 DF, p-value: 0.0102
> print(residuals(lr))
   1    2    3    4 
 2.8 -0.4 -0.8 -1.6 
> print(eq)
[1] "y = 4.8 + 2.1*x"
> #Linear Equation
> print(new.weights)
  weight
1     60
2     50
> print(df1)
      1     2
129.6 108.8
> print(df2)
height weight
1 129.6           60
2 108.8           50
> df3=rbind(df,df2)
> print(df3)
   height weight
1   174.0     80
2   150.0     70
3   160.0     75
4   180.0     85
11  129.6     60
21  108.8     50
> write.csv(df3,"h3.csv")
> pie(table(df3$height))
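predict() for lm objects also supports interval estimates. A minimal sketch, assuming the lr model and new.weights data frame from this program; interval="confidence" is a standard option of stats::predict.lm:

# Sketch: point predictions with 95% confidence intervals
ci<-predict(lr,newdata=new.weights,interval="confidence",level=0.95)
print(ci)  # columns: fit, lwr, upr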
7. DATA VISUALIZATION
#Data Visualization
X=iris
dim(X)
summary(X)
head(X)
hist(X$Sepal.Length,main='Histogram',col='green')
barplot(X$Sepal.Length[1:10],main='Barplot',col='red',xlab='Sepal.Length')
pie(table(X$Sepal.Length),main='pie-chart')
pairs(X)
plot(X$Sepal.Length,main='plot-chart',col='blue')
boxplot(X,main='Boxplot',col='yellow')
 OUTPUT:
> #Data Visualization
> dim(X)
[1] 150 5
> summary(X)
  Sepal.Length    Sepal.Width     Petal.Length    Petal.Width          Species  
 Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100   setosa    :50  
 1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300   versicolor:50  
 Median :5.800   Median :3.000   Median :4.350   Median :1.300   virginica :50  
 Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199                  
 3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800                  
 Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500                  
> head(X)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
> hist(X$Sepal.Length,main='Histogram',col='green')
> barplot(X$Sepal.Length[1:10],main='Barplot',col='red',xlab='Sepal.Length')
> pie(table(X$Sepal.Length),main='pie-chart')
> pairs(X)
> plot(X$Sepal.Length,main='plot-chart',col='blue')
> boxplot(X,main='Boxplot',col='yellow')
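As one more illustrative plot (not in the recorded output), base graphics can colour the scatter plot by species, since X$Species is a factor. A minimal sketch using only standard graphics functions:

# Sketch: Sepal.Length coloured by Species
plot(X$Sepal.Length,col=X$Species,pch=19,
     main="Sepal Length by Species",ylab="Sepal.Length")
legend("topleft",legend=levels(X$Species),col=1:3,pch=19)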