Multilevel Modeling with R
Spirin Nikita
Dorodnicyn Computing Center of the Russian Academy of Sciences 03.24.2010, Moscow
Packages covered
SAS MySQL Python Mathematica R
Agenda
R programming language and the R paradigm
Basic operations in R
Graphics with R
Statistics with R
Multilevel models and ML with R
8 min. times 5 equals 40 min.
Overview
Free and commercialized
GNU GPL
R core team
http://cran.r-project.org
Interpreter
Concepts
Actions with in-memory objects
function()
function library
Basic Notation
Names may use A-Z, a-z, 0-9, _ and .
Names are case sensitive
Basic operations in R
assign operator
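The slide's example was lost with the image, so here is a minimal sketch of assignment in R. The object names are made up for illustration.

```r
# Three ways to bind a value to a name
x <- 5          # the usual assignment operator
y = 10          # '=' also works at the top level
15 -> z         # rightward assignment

total <- x + y + z
print(total)

# Names are case sensitive: 'Total' and 'total' are different objects
Total <- "a different object"
```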
ls() function
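A short sketch of how ls() lists objects. It uses a separate environment (an assumption made here so the listing is predictable); at the console you would normally call ls() with no arguments.

```r
# Work in a fresh environment so the listing does not depend on the session
e <- new.env()
assign("alpha", 1, envir = e)
assign("beta", "text", envir = e)
assign("gamma_vec", c(1, 2, 3), envir = e)

objs <- ls(envir = e)                     # names of all objects, sorted
print(objs)

objs_a <- ls(envir = e, pattern = "^a")   # only names starting with "a"

rm(list = "beta", envir = e)              # remove one object by name
remaining <- ls(envir = e)
```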
HELP
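The help slide's content did not survive, so the following is only a sketch of the usual ways to look things up. The help pages themselves open in a pager or browser, so those calls are guarded to run only in an interactive session.

```r
# apropos() searches the names of visible objects by regular expression
readers <- apropos("^read\\.")
print(head(readers))

# Opening a help page needs a pager or browser, so do it interactively only
if (interactive()) {
  help("read.table")          # same as ?read.table
  help.search("regression")   # same as ??regression
}

# args() shows a function's signature without opening the help page
print(args(setwd))
```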
Reading Data
getwd() setwd()
read.table() scan() read.fwf()
Sources: ASCII text, Excel, SAS, SPSS, SQL-type databases
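A minimal sketch of the base R readers named above. The file names and contents are invented for illustration and written to a temporary directory first, so the example does not depend on any existing file.

```r
# Create small example files in a temporary directory
tmp_dir <- tempdir()
csv_path <- file.path(tmp_dir, "example.csv")
writeLines(c("id,height,group", "1,1.75,a", "2,1.62,b", "3,1.80,a"), csv_path)

# Switch the working directory, read with a relative path, then switch back
old_wd <- setwd(tmp_dir)
df1 <- read.table("example.csv", header = TRUE, sep = ",")
df2 <- read.csv("example.csv")            # read.table with CSV defaults
setwd(old_wd)

# read.fwf() reads fixed-width fields: 1 character, then 3 characters
fwf_path <- file.path(tmp_dir, "example.txt")
writeLines(c("A123", "B456"), fwf_path)
df3 <- read.fwf(fwf_path, widths = c(1, 3), col.names = c("code", "value"))

# scan() reads a flat sequence of values into a vector
num_path <- file.path(tmp_dir, "numbers.txt")
writeLines("1 2 3 4", num_path)
nums <- scan(num_path, quiet = TRUE)
```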
Saving Data
save.image()
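A small sketch of saving and restoring objects. save.image() writes the whole global workspace; save() and load() do the same for selected objects. The file names are placeholders in a temporary directory.

```r
scores <- c(a = 1.5, b = 2.5)

# Save selected objects to an .RData file
rdata_path <- file.path(tempdir(), "objects.RData")
save(scores, file = rdata_path)

# Load them back into a separate environment to avoid overwriting anything
restored <- new.env()
loaded_names <- load(rdata_path, envir = restored)

# save.image() applies the same idea to everything in the global workspace
image_path <- file.path(tempdir(), "workspace.RData")
save.image(file = image_path)
```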
Generating Data
Cartesian product
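The original examples were on the lost slide images. As a sketch, regular sequences, repetitions, factor levels, and a Cartesian product can be generated as follows. The variable names are illustrative.

```r
s1 <- 1:5                               # integer sequence
s2 <- seq(0, 1, by = 0.25)              # sequence with a fixed step
r1 <- rep(c("a", "b"), times = 2)       # repeat the whole vector
r2 <- rep(c("a", "b"), each = 2)        # repeat each element
f  <- gl(2, 3, labels = c("ctrl", "trt"))  # factor: 2 levels, 3 of each

# expand.grid() builds the Cartesian product of its arguments;
# the first argument varies fastest
grid <- expand.grid(height = c(60, 80), sex = c("M", "F"),
                    stringsAsFactors = FALSE)
print(grid)
```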
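Random data: rfunc(n, p1, p2, ...)
func is the distribution (rnorm, runif, rbinom, ...), n is the number of draws, p1, p2, ... are the distribution parameters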
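A brief sketch of the rfunc(n, p1, p2, ...) pattern with three common distributions. The seed and parameter values are arbitrary choices for illustration.

```r
set.seed(42)                              # make the draws reproducible

z <- rnorm(1000, mean = 10, sd = 2)       # normal: n, mean, sd
u <- runif(5, min = 0, max = 1)           # uniform: n, min, max
b <- rbinom(10, size = 1, prob = 0.3)     # binomial: n, size, prob

print(summary(z))
```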
Of course Matrices
Syntactic sugar
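The matrix examples were lost with the slide images. The sketch below shows basic matrix work in R; treating operators such as %*% as the "syntactic sugar" is an assumption about what the slide meant.

```r
A <- matrix(1:4, nrow = 2)   # filled column by column: rows (1, 3) and (2, 4)
B <- diag(2)                 # 2 x 2 identity matrix

AB <- A %*% B                # matrix product
At <- t(A)                   # transpose
A_inv <- solve(A)            # inverse (A is invertible, det = -2)
elementwise <- A * A         # '*' multiplies element by element

print(A[1, 2])               # row 1, column 2
print(A[, 1])                # the whole first column
```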
Graphics with R
Device paradigm
windows() pdf() X11()
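A minimal sketch of the device paradigm: open a device, draw into it, close it. It uses pdf() because it works on every platform, and the built-in cars dataset. The output file name is a placeholder.

```r
pdf_path <- file.path(tempdir(), "scatter.pdf")

pdf(pdf_path, width = 6, height = 4)   # open a PDF device
plot(cars$speed, cars$dist,
     xlab = "Speed", ylab = "Stopping distance",
     main = "cars dataset")            # all drawing goes to the open device
dev.off()                              # close the device and write the file
```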
Legend for a graph
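The original figure is not available, so here is a sketch of adding a legend to a plot with two curves. The curves and colors are arbitrary.

```r
x <- seq(0, 2 * pi, length.out = 100)
legend_path <- file.path(tempdir(), "legend.pdf")

pdf(legend_path)
plot(x, sin(x), type = "l", col = "blue", lty = 1, ylab = "value")
lines(x, cos(x), col = "red", lty = 2)      # add a second curve
legend("topright",
       legend = c("sin(x)", "cos(x)"),      # labels, in drawing order
       col = c("blue", "red"),
       lty = c(1, 2))
dev.off()
```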
Statistics with R
> library(stats)
Key operator: ~ (the model description operator), used as y ~ model
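As a sketch of the ~ operator in use, the following fits a simple linear regression on the built-in cars dataset. The choice of dataset is an assumption made for illustration.

```r
f <- dist ~ speed                 # a formula is an ordinary R object
print(class(f))

fit <- lm(f, data = cars)         # response on the left, model on the right
print(coef(fit))                  # intercept and slope
```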
Quiz:
y ~ x1 + x2
y ~ I(x1 + x2)
y ~ poly(x, 2)
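One way to work through the quiz is to fit each formula and count the coefficients. The data are simulated purely for illustration, and poly() is applied to x1 here because the simulated data have no variable named x.

```r
set.seed(1)
n <- 50
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- 1 + 2 * d$x1 - d$x2 + rnorm(n, sd = 0.5)

# Two separate predictors: intercept plus one slope each
fit_add <- lm(y ~ x1 + x2, data = d)

# I() protects '+', so the sum x1 + x2 is a single predictor
fit_sum <- lm(y ~ I(x1 + x2), data = d)

# Second-degree orthogonal polynomial in one variable
fit_poly <- lm(y ~ poly(x1, 2), data = d)

print(names(coef(fit_add)))
print(names(coef(fit_sum)))
print(names(coef(fit_poly)))
```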
Multilevel Modeling with R
Why multilevel modeling?
Uses all the data to make inferences for groups with small sample sizes
Predicts an outcome for a new group
Hierarchical models avoid the overfitting of least-squares regression
Yields an accurate measure of predictive uncertainty
fss = c(0,8,15,33,42,45,49,54,98,143,165,175,179,200)
# include the library
library(caTools)
# read training and scoring data
train <- read.csv("C:/Users/Spirinus/Desktop/Final Package/R/S_AUC_Train_1_7500.csv")
score <- read.csv("C:/Users/Spirinus/Desktop/Final Package/R/S_AUC_Train_Test_7501_15000.csv")
# data preparation
train[train$Target == -1, "Target"] <- 0
train$RowID = NULL
# build the model
AUClogistic <- glm(Target ~ ., data = train[1:1000, fss + 1], family = binomial(link = "logit"))
# get predictions on a scoring dataset
test_scores <- predict(AUClogistic, newdata = score[1:1000, ], type = "response")
testY = score[1:1000, ]$Target
# calculate AUC
colAUC(test_scores, testY)
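The CSV files used above are not available, so the following self-contained sketch repeats the same workflow on simulated data. The data, the column names f1 and f2, and the helper auc_by_rank() are all hypothetical. The AUC is computed from ranks in base R rather than with caTools::colAUC, so the example runs without extra packages.

```r
set.seed(7)
n <- 400
sim <- data.frame(f1 = rnorm(n), f2 = rnorm(n))
# Binary target whose log-odds depend on both features
sim$Target <- rbinom(n, size = 1, prob = plogis(1.5 * sim$f1 - sim$f2))

train <- sim[1:200, ]
score <- sim[201:400, ]

# Logistic regression on all remaining columns
fit <- glm(Target ~ ., data = train, family = binomial(link = "logit"))
p <- predict(fit, newdata = score, type = "response")

# AUC via the rank-sum (Mann-Whitney) identity; hypothetical helper
auc_by_rank <- function(scores, labels) {
  r <- rank(scores)
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

auc <- auc_by_rank(p, score$Target)
print(auc)
```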
lmer()
library(lme4) (depends on the Matrix package)
Examples:
lmer(y ~ 1 + (1 | county)) varying intercept, no predictors
lmer(y ~ x + (1 | county)) varying intercept
lmer(y ~ x + (1 + x | county)) varying intercept and slope
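A hedged sketch of the varying-intercept example on simulated data. The data frame radon_sim and all its values are invented for illustration. The fit runs only if the lme4 package is installed, which is why it is guarded with requireNamespace().

```r
set.seed(123)
n_county <- 20
n_per <- 10

# Simulated grouped data: each county has its own intercept
county <- factor(rep(seq_len(n_county), each = n_per))
a_county <- rnorm(n_county, mean = 1.5, sd = 0.5)
x <- rbinom(n_county * n_per, size = 1, prob = 0.5)
y <- a_county[as.integer(county)] - 0.7 * x +
  rnorm(n_county * n_per, sd = 0.8)
radon_sim <- data.frame(y = y, x = x, county = county)

has_lme4 <- requireNamespace("lme4", quietly = TRUE)
if (has_lme4) {
  # Varying intercept by county, common slope for x
  fit_int <- lme4::lmer(y ~ x + (1 | county), data = radon_sim)
  print(lme4::fixef(fit_int))          # overall intercept and slope
  print(head(coef(fit_int)$county))    # partially pooled county intercepts
}
```

Replacing (1 | county) with (1 + x | county) in the same call would let the slope vary by county as well.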
Summary
How R works
Basic objects in R
R graphical capabilities
R for statistical analysis
Multilevel modeling in R
More Information
R for Beginners, Emmanuel Paradis, Institut des Sciences de l'Evolution, Universite Montpellier II, F-34095 Montpellier cedex 05, France
http://cran.r-project.org
Data Analysis Using Regression and Multilevel/Hierarchical Models, A. Gelman and J. Hill
Acknowledgements
Thank you!