Linda Ouchaou
FIFTH LESSON REPORT
WORK PLAN
Introduction
Graphic illustrations
Exercises
2023-11-08
INTRODUCTION: Graphs are a crucial part of a biostatistician's work, as we need them to
better understand our variables and data. In this lesson we were introduced to the main
types of graphs in R. We also installed a new package called tidyverse, which allows us to
organise data more effectively.
demo(graphics)
##
##
## demo(graphics)
## ---- ~~~~~~~~
##
## > # Copyright (C) 1997-2009 The R Core Team
## >
## > require(datasets)
##
## > require(grDevices); require(graphics)
##
## > ## Here is some code which illustrates some of the differences between
## > ## R and S graphics capabilities. Note that colors are generally specified
## > ## by a character string name (taken from the X11 rgb.txt file) and that line
## > ## textures are given similarly. The parameter "bg" sets the background
## > ## parameter for the plot and there is also an "fg" parameter which sets
## > ## the foreground color.
## >
## >
## > x <- stats::rnorm(50)
##
## > opar <- par(bg = "white")
##
## > plot(x, ann = FALSE, type = "n")
##
## > abline(h = 0, col = gray(.90))
##
## > lines(x, col = "green4", lty = "dotted")
##
## > points(x, bg = "limegreen", pch = 21)
##
## > title(main = "Simple Use of Color In a Plot",
## + xlab = "Just a Whisper of a Label",
## + col.main = "blue", col.lab = gray(.8),
## + cex.main = 1.2, cex.lab = 1.0, font.main = 4, font.lab = 3)
##
## > ## A little color wheel. This code just plots equally spaced hues in
## > ## a pie chart. If you have a cheap SVGA monitor (like me) you will
## > ## probably find that numerically equispaced does not mean visually
## > ## equispaced. On my display at home, these colors tend to cluster at
## > ## the RGB primaries. On the other hand on the SGI Indy at work the
## > ## effect is near perfect.
## >
## > par(bg = "gray")
##
## > pie(rep(1,24), col = rainbow(24), radius = 0.9)
##
## > title(main = "A Sample Color Wheel", cex.main = 1.4, font.main = 3)
##
## > title(xlab = "(Use this as a test of monitor linearity)",
## + cex.lab = 0.8, font.lab = 3)
##
## > ## We have already confessed to having these. This is just showing off X11
## > ## color names (and the example (from the postscript manual) is pretty "cute".
## >
## > pie.sales <- c(0.12, 0.3, 0.26, 0.16, 0.04, 0.12)
##
## > names(pie.sales) <- c("Blueberry", "Cherry",
## + "Apple", "Boston Cream", "Other", "Vanilla Cream")
##
## > pie(pie.sales,
## + col = c("purple","violetred1","green3","cornsilk","cyan","white"))
##
## > title(main = "January Pie Sales", cex.main = 1.8, font.main = 1)
##
## > title(xlab = "(Don't try this at home kids)", cex.lab = 0.8, font.lab = 3)
##
## > ## Boxplots: I couldn't resist the capability for filling the "box".
## > ## The use of color seems like a useful addition, it focuses attention
## > ## on the central bulk of the data.
## >
## > par(bg="cornsilk")
##
## > n <- 10
##
## > g <- gl(n, 100, n*100)
##
## > x <- rnorm(n*100) + sqrt(as.numeric(g))
##
## > boxplot(split(x,g), col="lavender", notch=TRUE)
##
## > title(main="Notched Boxplots", xlab="Group", font.main=4, font.lab=1)
##
## > ## An example showing how to fill between curves.
## >
## > par(bg="white")
##
## > n <- 100
##
## > x <- c(0,cumsum(rnorm(n)))
##
## > y <- c(0,cumsum(rnorm(n)))
##
## > xx <- c(0:n, n:0)
##
## > yy <- c(x, rev(y))
##
## > plot(xx, yy, type="n", xlab="Time", ylab="Distance")
##
## > polygon(xx, yy, col="gray")
##
## > title("Distance Between Brownian Motions")
##
## > ## Colored plot margins, axis labels and titles. You do need to be
## > ## careful with these kinds of effects. It's easy to go completely
## > ## over the top and you can end up with your lunch all over the keyboard.
## > ## On the other hand, my market research clients love it.
## >
## > x <- c(0.00, 0.40, 0.86, 0.85, 0.69, 0.48, 0.54, 1.09, 1.11, 1.73, 2.05, 2.02)
##
## > par(bg="lightgray")
##
## > plot(x, type="n", axes=FALSE, ann=FALSE)
##
## > usr <- par("usr")
##
## > rect(usr[1], usr[3], usr[2], usr[4], col="cornsilk", border="black")
##
## > lines(x, col="blue")
##
## > points(x, pch=21, bg="lightcyan", cex=1.25)
##
## > axis(2, col.axis="blue", las=1)
##
## > axis(1, at=1:12, lab=month.abb, col.axis="blue")
##
## > box()
##
## > title(main= "The Level of Interest in R", font.main=4, col.main="red")
##
## > title(xlab= "1996", col.lab="red")
##
## > ## A filled histogram, showing how to change the font used for the
## > ## main title without changing the other annotation.
## >
## > par(bg="cornsilk")
##
## > x <- rnorm(1000)
##
## > hist(x, xlim=range(-4, 4, x), col="lavender", main="")
##
## > title(main="1000 Normal Random Variates", font.main=3)
##
## > ## A scatterplot matrix
## > ## The good old Iris data (yet again)
## >
## > pairs(iris[1:4], main="Edgar Anderson's Iris Data", font.main=4, pch=19)
##
## > pairs(iris[1:4], main="Edgar Anderson's Iris Data", pch=21,
## + bg = c("red", "green3", "blue")[unclass(iris$Species)])
##
## > ## Contour plotting
## > ## This produces a topographic map of one of Auckland's many volcanic "peaks".
## >
## > x <- 10*1:nrow(volcano)
##
## > y <- 10*1:ncol(volcano)
##
## > lev <- pretty(range(volcano), 10)
##
## > par(bg = "lightcyan")
##
## > pin <- par("pin")
##
## > xdelta <- diff(range(x))
##
## > ydelta <- diff(range(y))
##
## > xscale <- pin[1]/xdelta
##
## > yscale <- pin[2]/ydelta
##
## > scale <- min(xscale, yscale)
##
## > xadd <- 0.5*(pin[1]/scale - xdelta)
##
## > yadd <- 0.5*(pin[2]/scale - ydelta)
##
## > plot(numeric(0), numeric(0),
## + xlim = range(x)+c(-1,1)*xadd, ylim = range(y)+c(-1,1)*yadd,
## + type = "n", ann = FALSE)
##
## > usr <- par("usr")
##
## > rect(usr[1], usr[3], usr[2], usr[4], col="green3")
##
## > contour(x, y, volcano, levels = lev, col="yellow", lty="solid", add=TRUE)
##
## > box()
##
## > title("A Topographic Map of Maunga Whau", font= 4)
##
## > title(xlab = "Meters North", ylab = "Meters West", font= 3)
##
## > mtext("10 Meter Contour Spacing", side=3, line=0.35, outer=FALSE,
## + at = mean(par("usr")[1:2]), cex=0.7, font=3)
##
## > ## Conditioning plots
## >
## > par(bg="cornsilk")
##
## > coplot(lat ~ long | depth, data = quakes, pch = 21, bg = "green3")
##
## > par(opar)
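The demo opens with opar <- par(bg = "white") and closes with par(opar), so that its colour changes do not leak into later plots. The sketch below shows the same save-and-restore idea inside a function. The helper name plot_with_bg is hypothetical, and drawing on a null PDF device is an assumption made only so the example needs no window or file.

```r
# Hypothetical helper: draw a histogram on a temporary background colour.
plot_with_bg <- function(x, bg = "cornsilk") {
  old_par <- par(bg = bg)     # par() returns the previous values of what it changes
  on.exit(par(old_par))       # restore them when the function exits, even on error
  hist(x, col = "lavender", main = "")
  title(main = "Normal Random Variates", font.main = 3)
  invisible(NULL)
}

pdf(NULL)                     # assumption: null device, nothing is written to disk
bg_before <- par("bg")
plot_with_bg(rnorm(1000))
bg_after <- par("bg")         # same value as bg_before once the function has returned
invisible(dev.off())
```

Using on.exit() rather than a final par(opar) line is a design choice: the settings are restored even if one of the plotting calls stops with an error.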
# Plot the Nile object (class ts)
plot(Nile)
title(main="A short title", sub="A subtitle")
plot(AirPassengers,lwd=3,type="b",col="skyblue2",col.main=3,main="Nice!",font.main=2)
# Plot the volcano object (class matrix)
par(mfrow=c(1,2))
x=10*1:87
y=10*1:61
image(x,y,volcano,col=terrain.colors(20))
contour(x,y,volcano,col=terrain.colors(15))
dev.new() # Open a new graphics device (portable replacement for windows())
persp(x,y,volcano,phi=40,theta=30,expand=0.75,col="lightgreen")
# Study of the cars object (class data.frame)
# Show the variable names of the data frame
names(cars)
## [1] "speed" "dist"
# Show the row names of the data frame
row.names(cars)
##  [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10" "11" "12" "13" "14" "15"
## [16] "16" "17" "18" "19" "20" "21" "22" "23" "24" "25" "26" "27" "28" "29" "30"
## [31] "31" "32" "33" "34" "35" "36" "37" "38" "39" "40" "41" "42" "43" "44" "45"
## [46] "46" "47" "48" "49" "50"
cars$speed # Show the values of the first variable
##  [1]  4  4  7  7  8  9 10 10 10 11 11 12 12 12 12 13 13 13 13 14 14 14 14 15 15
## [26] 15 16 16 17 17 17 18 18 18 18 19 19 19 20 20 20 20 20 22 23 24 24 24 24 25
cars[,1] # Same output using matrix-style indexing
##  [1]  4  4  7  7  8  9 10 10 10 11 11 12 12 12 12 13 13 13 13 14 14 14 14 15 15
## [26] 15 16 16 17 17 17 18 18 18 18 19 19 19 20 20 20 20 20 22 23 24 24 24 24 25
attach(cars) # Access the variables by the names speed and dist
dev.new() # Open a new graphics device
pairs(cars) # Scatter plot matrix
matplot(cars,type="l") # Plot both variables as lines on the same graph
dev.new() # Open a new device (portable replacement for windows())
layout(matrix(1:4,2,2),widths=c(4,1),heights=c(2,2))
# Cleveland dot charts and box plots of the cars data
dotchart(speed, main="Cleveland speed")
text(5,20,as.expression(substitute(min==value1,list(value1=min(speed)))))
text(5,16,as.expression(substitute(MAX==value2,list(value2=max(speed)))))
dotchart(dist, main="Cleveland dist")
boxplot(speed,main="Boxplot speed")
boxplot(sort(dist), main="Boxplot dist")
# Bar chart and histogram of the speed data
dev.new() # Open a new device (portable replacement for x11())
par(bg="lightgreen",mfrow=c(1,2))
barplot(table(speed), main="Bar chart",col=rainbow(10))
hist(speed, main="Histogram",col="tomato")
rug(speed)
dev.copy2pdf(file="Histogramme.pdf") # Save the last plot as a PDF
## png
## 2
detach(cars) # Detach the cars data
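The same two-panel figure can also be produced without attach() and without an on-screen device. The following is a minimal sketch, assuming the file is written to a temporary folder (the out_file path is an illustration, not part of the lesson): with() looks up speed inside cars, and pdf() followed by dev.off() writes the figure straight to disk.

```r
# Assumption: write to a temporary folder; replace with any path you prefer.
out_file <- file.path(tempdir(), "Histogramme.pdf")

pdf(out_file, width = 8, height = 4)      # open a PDF graphics device
old_par <- par(bg = "lightgreen", mfrow = c(1, 2))

# with() evaluates each expression inside cars, so attach()/detach() is not needed
with(cars, barplot(table(speed), main = "Bar chart", col = rainbow(10)))
with(cars, {
  hist(speed, main = "Histogram", col = "tomato")
  rug(speed)                              # tick marks for the individual observations
})

par(old_par)                              # restore the previous graphical settings
invisible(dev.off())                      # close the device so the file is complete
```

Avoiding attach() means a forgotten detach() cannot leave speed and dist masking other objects later in the session.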
Exercise 01:
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.3     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.4     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(ggplot2)
prestige<-read.table(file.choose(),header=TRUE) # Pick the data file interactively
as_tibble(prestige) # Display the data frame as a tibble
## # A tibble: 102 × 6
## education income women prestige census type
## <dbl> <int> <dbl> <dbl> <int> <chr>
## 1 13.1 12351 11.2 68.8 1113 prof
## 2 12.3 25879 4.02 69.1 1130 prof
## 3 12.8 9271 15.7 63.4 1171 prof
## 4 11.4 8865 9.11 56.8 1175 prof
## 5 14.6 8403 11.7 73.5 2111 prof
## 6 15.6 11030 5.13 77.6 2113 prof
## 7 15.1 8258 25.6 72.6 2133 prof
## 8 15.4 14163 2.69 78.1 2141 prof
## 9 14.5 11377 1.03 73.1 2143 prof
## 10 14.6 11023 0.94 68.8 2153 prof
## # ℹ 92 more rows
ggplot(data=prestige) + geom_histogram(aes(x = prestige))
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
ggplot(data=prestige) + geom_point(aes(x = prestige, y = women))
ggplot(data=prestige) + geom_point(aes(x = prestige, y = women),col=2)
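The stat_bin() message above asks for an explicit bin width. The sketch below shows one way to supply it, and also how a fixed colour is passed outside aes(). Because the prestige file is chosen interactively, the built-in cars data set is used here as a stand-in, and the bin width of 2 is an arbitrary assumption to be tuned to the data.

```r
library(ggplot2)

# Assumption: cars (built into R) stands in for the file chosen with file.choose().
# Histogram with an explicit bin width instead of the default 30 bins.
p_hist <- ggplot(data = cars) +
  geom_histogram(aes(x = speed), binwidth = 2, fill = "tomato", colour = "white")

# Scatter plot: a constant colour goes outside aes(), like col = 2 in the lesson.
p_points <- ggplot(data = cars) +
  geom_point(aes(x = speed, y = dist), colour = "red")
```

Typing p_hist or p_points at the console, or calling print() on them in a script, draws the plot; storing the plot in an object makes it easy to add further layers later.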
Key R function: geom_boxplot() [ggplot2 package]
Key arguments to customize the plot:
width: the width of the box plot.
notch: logical. If TRUE, creates a notched box plot. The notch displays a confidence
interval around the median, normally computed as median +/- 1.58*IQR/sqrt(n). Notches
are used to compare groups; if the notches of two boxes do not overlap, this is strong
evidence that the medians differ.
color, size, linetype: border line color, size and type.
fill: fill color of the box plot area.
outlier.colour, outlier.shape, outlier.size: the color, shape and size of outlying points.
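The notch formula median +/- 1.58*IQR/sqrt(n) can be checked by hand. This is a small sketch in base R, assuming the built-in cars$speed values as example data. It uses the hinges returned by fivenum() as the quartiles, which is what boxplot.stats() does; ggplot2 may compute the quartiles slightly differently, so its notches can differ a little on other data.

```r
x <- cars$speed                  # assumption: built-in data used as an example
n <- sum(!is.na(x))              # number of non-missing observations

five <- fivenum(x)               # minimum, lower hinge, median, upper hinge, maximum
iqr  <- five[4] - five[2]        # spread between the hinges

# Notch limits: median +/- 1.58 * IQR / sqrt(n)
notch <- five[3] + c(-1, 1) * 1.58 * iqr / sqrt(n)
notch

# Base R reports the same interval in the conf element
boxplot.stats(x)$conf
```

When two groups are compared, each group gets its own interval computed this way, and non-overlapping intervals suggest that the medians differ.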
CONCLUSION:
R offers a large collection of graph types that can be customized and modified as needed,
which makes data analysis easier and better organized.