Introduction to Biostatistics
Shamik Sen
Dept. of Biosciences & Bioengineering
              IIT Bombay
                      What is R?
• Software environment for statistical computing and data
  analysis
• R is a GNU package, and its source code is freely available.
• Pre-compiled binary versions are provided for various
  operating systems.
• R has a command-line interface, but many graphical user
  interfaces are also available.
• R can produce publication-quality graphs with
  mathematical symbols
• R is an interpreted language.
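As a small illustration of graphs with mathematical symbols, the sketch below uses base R's expression() to put notation in a plot title and axis label. The data and the label text are invented for this example and are not taken from the slides.

```r
# Illustrative data (made up for this sketch)
theta <- seq(0, 2 * pi, length.out = 100)
y <- sin(theta)

# expression() lets titles and axis labels contain mathematical notation
main_label <- expression(y == sin(theta))
x_label <- expression(theta ~ "(radians)")

# Draw the curve as a line; the title and x-axis label are typeset as math
plot(theta, y, type = "l", main = main_label, xlab = x_label, ylab = "y")
```

Because R is interpreted, these lines can also be typed one at a time at the prompt and each runs immediately.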
                Applications of R
• Mainly used by statisticians and other practitioners
  requiring an environment for statistical computation and
  software development.
• R supports matrix arithmetic and can also operate as a
  general matrix calculation toolbox – with performance
  benchmarks comparable to GNU Octave or MATLAB
• R can be used to perform high-performance statistical
  computation required for statistical analysis of Big Data.
• R is also being used in Business Analytics.
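The matrix arithmetic mentioned above can be tried with a few lines of base R. This is a minimal sketch with made-up 2 x 2 matrices; the names A, B and A_inv are chosen only for illustration.

```r
# matrix() fills values column by column by default
A <- matrix(c(2, 1, 1, 3), nrow = 2)
B <- diag(2)                 # 2 x 2 identity matrix

A + B                        # element-wise addition
A * A                        # element-wise multiplication
A_sq <- A %*% A              # matrix product
t(A)                         # transpose

A_inv <- solve(A)            # inverse of a square, non-singular matrix
A %*% A_inv                  # gives back the identity matrix (up to rounding)
```

Note that `*` works element by element, while `%*%` is the matrix product.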
                 Getting R - 1
• R is an open-source programming language. Owing
  to its popularity, pre-compiled R binaries are also
  available for different platforms.
• Binaries for Windows, Unix, or macOS can be
  downloaded from the R Project website,
  https://www.r-project.org.
• These binaries can be used directly to install R
  on a computer.
                Getting R - 2
• However, R by itself is command-line driven, which
  may not be convenient for beginners.
• For this reason, many graphical user interfaces
  (GUIs) are available for R.
• These GUI-based tools provide a user-friendly
  interface to write, correct, and run R code.
• RStudio is one such widely used GUI for R.
              Getting R - 3
  • RStudio (screenshot): command windows, workspace, and
    additional information
Creating vectors in R
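The code from the original slide is not available here, so the following is only a sketch of common ways to create vectors in base R. All variable names and values are made up for illustration.

```r
# c() combines individual values into a vector
heights <- c(150, 162, 171, 158)        # numeric vector
labels  <- c("A", "B", "C", "D")        # character vector

# Regular sequences and repeated values
one_to_ten <- 1:10                      # integers 1, 2, ..., 10
by_half    <- seq(0, 2, by = 0.5)       # 0.0 0.5 1.0 1.5 2.0
zeros      <- rep(0, times = 5)         # 0 0 0 0 0

length(heights)                         # number of elements: 4
class(labels)                           # "character"
```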
Basic operations on vectors - 1
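The operations shown on the original slide are not recoverable, so this is a hedged sketch of typical vector operations in base R, using made-up vectors x and y.

```r
x <- c(2, 4, 6, 8)
y <- c(1, 2, 3, 4)

# Arithmetic is applied element by element
x_plus_y  <- x + y      # 3 6 9 12
x_times_y <- x * y      # 2 8 18 32
x_squared <- x^2        # 4 16 36 64

# Indexing starts at 1
x[2]                    # second element: 4
x[c(1, 4)]              # first and fourth elements: 2 8
x[-1]                   # everything except the first element: 4 6 8

# Comparisons return logical vectors
x > 4                   # FALSE FALSE TRUE TRUE

sum(x)                  # 20
length(x)               # 4
```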
               Importing data to R - 1
CSV: comma-separated values
[Figure: data shown in .xlsx format and in .csv format]
Importing CSV data to R
[Screenshot of RStudio with the Workspace labelled]
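A minimal sketch of importing CSV data with base R's read.csv() follows. To keep the example self-contained it first writes a small temporary file. The column names and values are invented, and in practice the path would point to your own file.

```r
# Create a small CSV file so the example runs on its own
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,group,weight",
             "1,control,61.5",
             "2,treated,58.2",
             "3,treated,64.0"),
           csv_path)

# Read the file into a data frame; the first row holds the column names
dat <- read.csv(csv_path, header = TRUE)

head(dat)   # first rows of the data
str(dat)    # column types
dim(dat)    # number of rows and columns
```

Once read, the data frame `dat` is an object in the workspace and can be used by later commands.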
Calculating descriptive statistics in R - 1
Finding frequency in categorical data
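Frequencies of a categorical variable can be counted with table(). The sketch below uses a made-up vector of blood groups, since the slide's own data are not available.

```r
# Illustrative categorical data
blood_group <- c("A", "B", "O", "O", "AB", "A", "O")

# Count how often each category occurs
freq <- table(blood_group)
freq

# Convert counts to proportions
prop <- prop.table(freq)
prop
```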
Mean, median, min & max
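These summaries each correspond to a base R function of the same name. A short sketch with an invented numeric vector:

```r
x <- c(12, 15, 11, 18, 14)

m   <- mean(x)     # arithmetic mean: 14
med <- median(x)   # middle value: 14
lo  <- min(x)      # smallest value: 11
hi  <- max(x)      # largest value: 18
range(x)           # min and max together: 11 18

# Missing values (NA) must be dropped explicitly
m_na <- mean(c(x, NA), na.rm = TRUE)
```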
Calculating descriptive statistics in R - 2
Variance and standard deviation
Alternatively
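The slide's own code is not recoverable, so the following is one plausible reading: the built-in var() and sd() functions, and, as an alternative, the same quantities computed directly from the definition of the sample variance (divisor n - 1). The data are made up.

```r
x <- c(12, 15, 11, 18, 14)

# Built-in functions (sample variance and sample standard deviation)
v <- var(x)
s <- sd(x)

# Alternatively, compute them from the definition
n <- length(x)
v_manual <- sum((x - mean(x))^2) / (n - 1)
s_manual <- sqrt(v_manual)

c(v, v_manual)   # both give 7.5
```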
Calculating descriptive statistics in R - 3
  Querying
  Sorting
  which()
  summary()
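The four items above can be sketched on a small data frame. The data frame `dat` and its columns are hypothetical. Note that R is case-sensitive, so the functions are written which() and summary().

```r
dat <- data.frame(
  id     = 1:5,
  group  = c("control", "treated", "treated", "control", "treated"),
  weight = c(61.5, 58.2, 64.0, 70.1, 55.4)
)

# Querying: keep rows that satisfy a condition
heavy   <- dat[dat$weight > 60, ]
treated <- subset(dat, group == "treated")

# Sorting: sort() orders a vector, order() gives row positions for a data frame
sorted_w   <- sort(dat$weight)
dat_sorted <- dat[order(dat$weight, decreasing = TRUE), ]

# which(): positions where a condition is TRUE
idx <- which(dat$weight > 60)    # 1 3 4
which.max(dat$weight)            # position of the largest weight: 4

# summary(): min, quartiles, median, mean, and max in one call
summary(dat$weight)
summary(dat)
```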
Plotting in R - 1
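The plots on the original slide are not available, so this is only a minimal sketch of a base R scatter plot with an added line. The variables dose and response are invented.

```r
# Illustrative data
dose     <- c(1, 2, 3, 4, 5, 6)
response <- c(2.1, 3.9, 6.2, 8.1, 9.8, 12.3)

# Scatter plot with a title and axis labels
plot(dose, response,
     main = "Response versus dose",
     xlab = "Dose",
     ylab = "Response",
     pch  = 19)          # filled circles

# Add a dashed reference line y = 2x to the existing plot
lines(dose, 2 * dose, lty = 2)
```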
Plotting in R - 2
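As a further hedged sketch (the slide's own figures are not recoverable), base R also provides hist(), boxplot(), and barplot() for summarising data graphically. The weight and group values are made up.

```r
# Illustrative data
weight <- c(61.5, 58.2, 64.0, 70.1, 55.4, 66.3, 59.8, 62.7)
group  <- c("control", "treated", "treated", "control",
            "treated", "control", "treated", "control")

# Histogram of a numeric variable
h <- hist(weight, main = "Distribution of weight", xlab = "Weight")

# Box plot of weight for each group
b <- boxplot(weight ~ group, main = "Weight by group",
             xlab = "Group", ylab = "Weight")

# Bar plot of category counts
barplot(table(group), main = "Observations per group", ylab = "Count")
```

Both hist() and boxplot() return the computed values (bin counts, box statistics) invisibly, which is why they can be stored in `h` and `b`.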