UNIT 2: READING AND WRITING FILES
Reading Tabular Data from Files in R Programming
Often, the data to be analysed is already stored in a file
outside the R environment, so importing it into R is a necessary
first step. R can read many formats, including CSV, JSON, Excel,
plain text and XML. Most of the time the data is tabular, that
is, arranged in rows and columns. The functions that read such
data import it and return a data frame. Data frames are
preferred in R because extracting rows and columns from them
for statistical computation is easier than with other data
structures. The most common functions for reading tabular data
into R are read.table(), read.csv(), fromJSON() and read.xlsx().
Reading Data from Text File
The function used for reading tabular data from a text file
is read.table(). Parameters:
file: Specifies the name of the file.
header: A logical flag indicating whether the first line of the
file is a header line (column names) rather than data.
nrows: Specifies the maximum number of rows to read.
skip: Number of lines to skip at the beginning of the file.
colClasses: A character vector which indicates the class of
each column of the data set.
sep: A string indicating how the columns are separated, for
example by commas, spaces, colons or tabs.
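As a rough sketch of how these parameters fit together, the snippet below writes a small table to a temporary file and reads it back with every parameter spelled out. The file contents and column names are made up for illustration and are not data from this unit.

```r
# Create a small sample file; the contents are hypothetical.
path <- tempfile(fileext = ".txt")
writeLines(c(
  "# generated for illustration",
  "name age score",
  "Asha 21 88.5",
  "Ben 23 91.0",
  "Chen 22 79.5"
), path)

# Spell out the arguments instead of letting R guess them.
df <- read.table(
  file = path,
  header = TRUE,        # the first line read holds the column names
  sep = "",             # any run of white space separates fields
  skip = 1,             # skip the leading comment line
  nrows = 3,            # read at most three data rows
  colClasses = c("character", "integer", "numeric")
)

print(df)
str(df)
```

Declaring colClasses and nrows up front spares R from scanning the file to guess them, which matters mainly for large files.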
For small or moderately sized data sets, we can call
read.table() with only the file name. R will automatically
figure out the number of rows, the number of columns and the
classes of the different columns, and will skip lines that start
with # (the comment symbol). Specifying the arguments makes
execution faster and more efficient, but for a small dataset
the difference is negligible.
Example: Let there be a tabular data file GeeksforGeeks.txt
saved in the current directory. It can be read as follows:
read.table("GeeksforGeeks.txt")
How to Extract Rows and Columns From Data Frame
Commands to Extract Rows and Columns
The following commands can be used to extract one or more rows
with one or more columns. Note that the output is a data frame,
except when a single column is selected, in which case R
simplifies it to a vector. This can be checked using the
class() function.
# All Rows and All Columns
df[,]
# First row and all columns
df[1,]
# First two rows and all columns
df[1:2,]
# First and third row and all columns
df[ c(1,3), ]
# First Row and 2nd and third column
df[1, 2:3]
# First and second rows, second and third columns
df[1:2, 2:3]
# Just First Column with All rows
df[, 1]
# First and Third Column with All rows
df[,c(1,3)]
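The commands above assume that a data frame named df already exists. A minimal, self-contained sketch (the values are made up for illustration) that builds one and checks the class and shape of a few extractions:

```r
# A small data frame; the values are hypothetical.
df <- data.frame(
  id = 1:3,
  name = c("Rick", "Dan", "Michelle"),
  salary = c(623.3, 515.2, 611)
)

first_row   <- df[1, ]          # one row, all columns
two_cols    <- df[1:2, 2:3]     # rows 1 and 2, columns 2 and 3
first_col   <- df[, 1]          # a single column
picked_cols <- df[, c(1, 3)]    # columns 1 and 3, all rows

print(class(first_row))     # a data frame
print(dim(two_cols))        # 2 rows, 2 columns
print(class(first_col))     # a plain vector, not a data frame
print(names(picked_cols))
```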
Command to Extract a Column as a Data Frame
The following command extracts a column as a data frame. A
command such as df[,1] returns a vector (numeric in this case).
To get the output as a data frame, pass drop = FALSE inside the
brackets, as below.
# First Column as data frame
df[, 1, drop = FALSE]
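To see the effect of drop = FALSE, the following sketch compares the two forms on a small made-up data frame. The single-bracket form df["id"] is shown as one more option that keeps the data frame structure.

```r
df <- data.frame(id = 1:3, salary = c(623.3, 515.2, 611))  # hypothetical values

as_vector <- df[, 1]                 # simplified to an integer vector
as_frame  <- df[, 1, drop = FALSE]   # stays a one-column data frame
by_name   <- df["id"]                # list-style indexing also keeps a data frame

print(class(as_vector))
print(class(as_frame))
print(names(as_frame))
print(class(by_name))
```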
Command to Extract an Element
The following represents a command which could be used to
extract an element in a particular row and column. It is as
simple as writing a row and a column number, such as the
following:
# Element at 2nd row, third column
df[2,3]
R - CSV Files
In R, we can read data from files stored outside the R
environment. We can also write data into files which will be
stored and accessed by the operating system. R can read and
write into various file formats like csv, excel, xml etc.
In this chapter we will learn to read data from a csv file and
then write data into a csv file. The file should be present in
current working directory so that R can read it. Of course we
can also set our own directory and read files from there.
Getting and Setting the Working Directory
You can check which directory the R workspace is pointing to
using the getwd() function. You can also set a new working
directory using the setwd() function.
# Get and print current working directory.
print(getwd())
# Set current working directory.
setwd("/web/com")
# Get and print current working directory.
print(getwd())
When we execute the above code, it produces the following
result −
[1] "/web/com/1441086124_2016"
[1] "/web/com"
This result depends on your OS and your current directory
where you are working.
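Because setwd() changes state for the rest of the session, it is often worth restoring the previous directory afterwards. A small sketch, using R's temporary directory as a stand-in for a project folder (that choice is an assumption made so the example runs anywhere):

```r
old_dir <- getwd()      # remember where we started
target  <- tempdir()    # stand-in for a project folder

setwd(target)           # switch to the new working directory
print(getwd())

# Relative paths now resolve against the new working directory.
writeLines("hello", "example.txt")
print(file.exists(file.path(target, "example.txt")))

setwd(old_dir)          # restore the original directory
```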
Input as CSV File
The csv file is a text file in which the values in the columns are
separated by a comma. Let's consider the following data
present in the file named input.csv.
You can create this file using Windows Notepad by copying and
pasting this data. Save the file as input.csv using the Save As
"All files (*.*)" option in Notepad.
id,name,salary,start_date,dept
1,Rick,623.3,2012-01-01,IT
2,Dan,515.2,2013-09-23,Operations
3,Michelle,611,2014-11-15,IT
4,Ryan,729,2014-05-11,HR
5,Gary,843.25,2015-03-27,Finance
6,Nina,578,2013-05-21,IT
7,Simon,632.8,2013-07-30,Operations
8,Guru,722.5,2014-06-17,Finance
Reading a CSV File
Following is a simple example of read.csv() function to read a
CSV file available in your current working directory −
data <- read.csv("input.csv")
print(data)
When we execute the above code, it produces the following
result −
  id     name salary start_date       dept
1  1     Rick 623.30 2012-01-01         IT
2  2      Dan 515.20 2013-09-23 Operations
3  3 Michelle 611.00 2014-11-15         IT
4  4     Ryan 729.00 2014-05-11         HR
5  5     Gary 843.25 2015-03-27    Finance
6  6     Nina 578.00 2013-05-21         IT
7  7    Simon 632.80 2013-07-30 Operations
8  8     Guru 722.50 2014-06-17    Finance
Analyzing the CSV File
By default the read.csv() function gives the output as a data
frame. This can be easily checked as follows. Also we can check
the number of columns and rows.
data <- read.csv("input.csv")
print(is.data.frame(data))
print(ncol(data))
print(nrow(data))
When we execute the above code, it produces the following
result −
[1] TRUE
[1] 5
[1] 8
Once we read data into a data frame, we can apply all the
functions applicable to data frames, as shown in the following
sections.
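Before filtering, it can help to check which type R inferred for each column. The sketch below supplies the first three records of input.csv inline through the text argument of read.csv(), so that it runs without the file on disk. stringsAsFactors = FALSE is set explicitly because its default differs between older and newer versions of R.

```r
# The first three records of input.csv, supplied inline.
csv_text <- "id,name,salary,start_date,dept
1,Rick,623.3,2012-01-01,IT
2,Dan,515.2,2013-09-23,Operations
3,Michelle,611,2014-11-15,IT"

data <- read.csv(text = csv_text, stringsAsFactors = FALSE)

print(dim(data))              # rows, then columns
print(sapply(data, class))    # type inferred for each column
print(mean(data$salary))      # numeric columns are ready for arithmetic
```

Note that start_date is read as text, which is why the date examples in this unit wrap it in as.Date().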
Get the maximum salary
# Create a data frame.
data <- read.csv("input.csv")
# Get the max salary from data frame.
sal <- max(data$salary)
print(sal)
When we execute the above code, it produces the following
result −
[1] 843.25
Get the details of the person with max salary
We can fetch rows meeting specific filter criteria similar to a
SQL where clause.
# Create a data frame.
data <- read.csv("input.csv")
# Get the max salary from data frame.
sal <- max(data$salary)
# Get the person detail having max salary.
retval <- subset(data, salary == max(salary))
print(retval)
When we execute the above code, it produces the following
result −
  id name salary start_date    dept
5  5 Gary 843.25 2015-03-27 Finance
Get all the people working in IT department
# Create a data frame.
data <- read.csv("input.csv")
retval <- subset( data, dept == "IT")
print(retval)
When we execute the above code, it produces the following
result –
  id     name salary start_date dept
1  1     Rick  623.3 2012-01-01   IT
3  3 Michelle  611.0 2014-11-15   IT
6  6     Nina  578.0 2013-05-21   IT
Get the persons in IT department whose salary is greater than 600
# Create a data frame.
data <- read.csv("input.csv")
info <- subset(data, salary > 600 & dept == "IT")
print(info)
When we execute the above code, it produces the following
result −
  id     name salary start_date dept
1  1     Rick  623.3 2012-01-01   IT
3  3 Michelle  611.0 2014-11-15   IT
Get the people who joined on or after 2014
# Create a data frame.
data <- read.csv("input.csv")
retval <- subset(data, as.Date(start_date) > as.Date("2014-01-01"))
print(retval)
When we execute the above code, it produces the following
result −
  id     name salary start_date    dept
3  3 Michelle 611.00 2014-11-15      IT
4  4     Ryan 729.00 2014-05-11      HR
5  5     Gary 843.25 2015-03-27 Finance
8  8     Guru 722.50 2014-06-17 Finance
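The comparison above uses a strict >, so a start date of exactly 2014-01-01 would be left out. The following sketch shows the difference between > and >= on a few made-up rows (the "Newhire" record is hypothetical and placed exactly on the cutoff):

```r
# Hypothetical rows; one start date falls exactly on the cutoff.
csv_text <- "id,name,start_date
1,Rick,2012-01-01
2,Newhire,2014-01-01
3,Michelle,2014-11-15"

data <- read.csv(text = csv_text, stringsAsFactors = FALSE)
data$start_date <- as.Date(data$start_date)   # ISO dates parse without a format
cutoff <- as.Date("2014-01-01")

after_cutoff <- subset(data, start_date > cutoff)    # excludes 2014-01-01
on_or_after  <- subset(data, start_date >= cutoff)   # includes 2014-01-01

print(after_cutoff$name)
print(on_or_after$name)
```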
Writing into a CSV File
R can create a CSV file from an existing data frame.
The write.csv() function is used to create the CSV file. This
file gets created in the working directory.
# Create a data frame.
data <- read.csv("input.csv")
retval <- subset(data, as.Date(start_date) > as.Date("2014-01-01"))
# Write filtered data into a new file.
write.csv(retval,"output.csv")
newdata <- read.csv("output.csv")
print(newdata)
When we execute the above code, it produces the following
result −
  X id     name salary start_date    dept
1 3  3 Michelle 611.00 2014-11-15      IT
2 4  4     Ryan 729.00 2014-05-11      HR
3 5  5     Gary 843.25 2015-03-27 Finance
4 8  8     Guru 722.50 2014-06-17 Finance
Here the column X holds the row names of retval, which
write.csv() writes by default. It can be dropped by passing
row.names = FALSE when writing the file.
# Create a data frame.
data <- read.csv("input.csv")
retval <- subset(data, as.Date(start_date) > as.Date("2014-01-01"))
# Write filtered data into a new file.
write.csv(retval,"output.csv", row.names = FALSE)
newdata <- read.csv("output.csv")
print(newdata)
When we execute the above code, it produces the following
result –
  id     name salary start_date    dept
1  3 Michelle 611.00 2014-11-15      IT
2  4     Ryan 729.00 2014-05-11      HR
3  5     Gary 843.25 2015-03-27 Finance
4  8     Guru 722.50 2014-06-17 Finance
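The effect of row.names can also be checked by looking at the header line that write.csv() produces. A self-contained sketch using temporary files and two sample rows (the row names are set by hand to mimic those left over after subsetting):

```r
df <- data.frame(id = c(3, 4), name = c("Michelle", "Ryan"))  # sample rows
rownames(df) <- c("3", "4")   # mimic row names left over from subsetting

with_names    <- tempfile(fileext = ".csv")
without_names <- tempfile(fileext = ".csv")

write.csv(df, with_names)                        # row names become a first column
write.csv(df, without_names, row.names = FALSE)  # row names are not written

print(readLines(with_names)[1])       # header starts with an empty "" field
print(readLines(without_names)[1])

print(names(read.csv(with_names)))    # the unnamed column is read back as X
print(names(read.csv(without_names)))
```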
R - Excel File
Microsoft Excel is the most widely used spreadsheet program; it
stores data in the .xls or .xlsx format. R can read directly
from these files using Excel-specific packages such as
XLConnect, xlsx and gdata. We will be using the xlsx package.
R can also write to Excel files using this package.
Install xlsx Package
You can use the following command in the R console to install
the "xlsx" package. It may ask to install some additional
packages on which this package is dependent. Follow the same
command with required package name to install the additional
packages.
install.packages("xlsx")
Verify and Load the "xlsx" Package
Use the following command to verify and load the "xlsx"
package.
# Verify the package is installed.
any(grepl("xlsx",installed.packages()))
# Load the library into R workspace.
library("xlsx")
When the script is run we get the following output.
[1] TRUE
Loading required package: rJava
Loading required package: methods
Loading required package: xlsxjars
Input as xlsx File
Open Microsoft excel. Copy and paste the following data in the
work sheet named as sheet1.
id name salary start_date dept
1 Rick 623.3 1/1/2012 IT
2 Dan 515.2 9/23/2013 Operations
3 Michelle 611 11/15/2014 IT
4 Ryan 729 5/11/2014 HR
5 Gary 843.25 3/27/2015 Finance
6 Nina 578 5/21/2013 IT
7 Simon 632.8 7/30/2013 Operations
8 Guru 722.5 6/17/2014 Finance
Also copy and paste the following data to another worksheet
and rename this worksheet to "city".
name city
Rick Seattle
Dan Tampa
Michelle Chicago
Ryan Seattle
Gary Houston
Nina Boston
Simon Mumbai
Guru Dallas
Save the Excel file as "input.xlsx". You should save it in the
current working directory of the R workspace.
Reading the Excel File
The input.xlsx is read by using the read.xlsx() function as
shown below. The result is stored as a data frame in the R
environment.
# Read the first worksheet in the file input.xlsx.
data <- read.xlsx("input.xlsx", sheetIndex = 1)
print(data)
When we execute the above code, it produces the following
result −
  id     name salary start_date       dept
1  1     Rick 623.30 2012-01-01         IT
2  2      Dan 515.20 2013-09-23 Operations
3  3 Michelle 611.00 2014-11-15         IT
4  4     Ryan 729.00 2014-05-11         HR
5  5     Gary 843.25 2015-03-27    Finance
6  6     Nina 578.00 2013-05-21         IT
7  7    Simon 632.80 2013-07-30 Operations
8  8     Guru 722.50 2014-06-17    Finance
R - XML Files
XML is a file format used to share both the structure and the
data of documents on the World Wide Web, intranets, and
elsewhere using standard text. It stands for Extensible Markup
Language. Similar to HTML, it contains markup tags. But unlike
HTML, where the markup tags describe the structure of the page,
in XML the markup tags describe the meaning of the data
contained in the file.
You can read an XML file in R using the "XML" package. This
package can be installed using the following command.
install.packages("XML")
Input Data
Create an XML file by copying the data below into a text editor
like Notepad. Save the file with a .xml extension (input.xml is
the name used in the code below), choosing the file type as
All files (*.*).
<RECORDS>
<EMPLOYEE>
<ID>1</ID>
<NAME>Rick</NAME>
<SALARY>623.3</SALARY>
<STARTDATE>1/1/2012</STARTDATE>
<DEPT>IT</DEPT>
</EMPLOYEE>
<EMPLOYEE>
<ID>2</ID>
<NAME>Dan</NAME>
<SALARY>515.2</SALARY>
<STARTDATE>9/23/2013</STARTDATE>
<DEPT>Operations</DEPT>
</EMPLOYEE>
<EMPLOYEE>
<ID>3</ID>
<NAME>Michelle</NAME>
<SALARY>611</SALARY>
<STARTDATE>11/15/2014</STARTDATE>
<DEPT>IT</DEPT>
</EMPLOYEE>
<EMPLOYEE>
<ID>4</ID>
<NAME>Ryan</NAME>
<SALARY>729</SALARY>
<STARTDATE>5/11/2014</STARTDATE>
<DEPT>HR</DEPT>
</EMPLOYEE>
<EMPLOYEE>
<ID>5</ID>
<NAME>Gary</NAME>
<SALARY>843.25</SALARY>
<STARTDATE>3/27/2015</STARTDATE>
<DEPT>Finance</DEPT>
</EMPLOYEE>
<EMPLOYEE>
<ID>6</ID>
<NAME>Nina</NAME>
<SALARY>578</SALARY>
<STARTDATE>5/21/2013</STARTDATE>
<DEPT>IT</DEPT>
</EMPLOYEE>
<EMPLOYEE>
<ID>7</ID>
<NAME>Simon</NAME>
<SALARY>632.8</SALARY>
<STARTDATE>7/30/2013</STARTDATE>
<DEPT>Operations</DEPT>
</EMPLOYEE>
<EMPLOYEE>
<ID>8</ID>
<NAME>Guru</NAME>
<SALARY>722.5</SALARY>
<STARTDATE>6/17/2014</STARTDATE>
<DEPT>Finance</DEPT>
</EMPLOYEE>
</RECORDS>
Reading XML File
The XML file is read by R using the function xmlParse(). The
result is stored in R as an XML document object (class
XMLInternalDocument).
# Load the package required to read XML files.
library("XML")
# Also load the other required package.
library("methods")
# Give the input file name to the function.
result <- xmlParse(file = "input.xml")
# Print the result.
print(result)
When we execute the above code, it produces the following
result −
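<?xml version="1.0"?>
<RECORDS>
  <EMPLOYEE>
    <ID>1</ID>
    <NAME>Rick</NAME>
    <SALARY>623.3</SALARY>
    <STARTDATE>1/1/2012</STARTDATE>
    <DEPT>IT</DEPT>
  </EMPLOYEE>
  <EMPLOYEE>
    <ID>2</ID>
    <NAME>Dan</NAME>
    <SALARY>515.2</SALARY>
    <STARTDATE>9/23/2013</STARTDATE>
    <DEPT>Operations</DEPT>
  </EMPLOYEE>
  <EMPLOYEE>
    <ID>3</ID>
    <NAME>Michelle</NAME>
    <SALARY>611</SALARY>
    <STARTDATE>11/15/2014</STARTDATE>
    <DEPT>IT</DEPT>
  </EMPLOYEE>
  <EMPLOYEE>
    <ID>4</ID>
    <NAME>Ryan</NAME>
    <SALARY>729</SALARY>
    <STARTDATE>5/11/2014</STARTDATE>
    <DEPT>HR</DEPT>
  </EMPLOYEE>
  <EMPLOYEE>
    <ID>5</ID>
    <NAME>Gary</NAME>
    <SALARY>843.25</SALARY>
    <STARTDATE>3/27/2015</STARTDATE>
    <DEPT>Finance</DEPT>
  </EMPLOYEE>
  <EMPLOYEE>
    <ID>6</ID>
    <NAME>Nina</NAME>
    <SALARY>578</SALARY>
    <STARTDATE>5/21/2013</STARTDATE>
    <DEPT>IT</DEPT>
  </EMPLOYEE>
  <EMPLOYEE>
    <ID>7</ID>
    <NAME>Simon</NAME>
    <SALARY>632.8</SALARY>
    <STARTDATE>7/30/2013</STARTDATE>
    <DEPT>Operations</DEPT>
  </EMPLOYEE>
  <EMPLOYEE>
    <ID>8</ID>
    <NAME>Guru</NAME>
    <SALARY>722.5</SALARY>
    <STARTDATE>6/17/2014</STARTDATE>
    <DEPT>Finance</DEPT>
  </EMPLOYEE>
</RECORDS>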
Get Number of Nodes Present in XML File
# Load the packages required to read XML files.
library("XML")
library("methods")
# Give the input file name to the function.
result <- xmlParse(file = "input.xml")
# Extract the root node from the xml file.
rootnode <- xmlRoot(result)
# Find number of nodes in the root.
rootsize <- xmlSize(rootnode)
# Print the result.
print(rootsize)
When we execute the above code, it produces the following
result −
[1] 8
Details of the First Node
Let's look at the first record of the parsed file. It will give us an
idea of the various elements present in the top level node.
# Load the packages required to read XML files.
library("XML")
library("methods")
# Give the input file name to the function.
result <- xmlParse(file = "input.xml")
# Extract the root node from the xml file.
rootnode <- xmlRoot(result)
# Print the result.
print(rootnode[1])
When we execute the above code, it produces the following
result −
$EMPLOYEE
<EMPLOYEE>
  <ID>1</ID>
  <NAME>Rick</NAME>
  <SALARY>623.3</SALARY>
  <STARTDATE>1/1/2012</STARTDATE>
  <DEPT>IT</DEPT>
</EMPLOYEE>
attr(,"class")
[1] "XMLInternalNodeList" "XMLNodeList"
Get Different Elements of a Node
# Load the packages required to read XML files.
library("XML")
library("methods")
# Give the input file name to the function.
result <- xmlParse(file = "input.xml")
# Extract the root node from the xml file.
rootnode <- xmlRoot(result)
# Get the first element of the first node.
print(rootnode[[1]][[1]])
# Get the fifth element of the first node.
print(rootnode[[1]][[5]])
# Get the second element of the third node.
print(rootnode[[3]][[2]])
When we execute the above code, it produces the following
result −
<ID>1</ID>
<DEPT>IT</DEPT>
<NAME>Michelle</NAME>
XML to Data Frame
To handle the data in large files effectively, we read the data
in the XML file as a data frame and then process the data frame
for data analysis.
# Load the packages required to read XML files.
library("XML")
library("methods")
# Convert the input xml file to a data frame.
xmldataframe <- xmlToDataFrame("input.xml")
print(xmldataframe)
When we execute the above code, it produces the following
result −
  ID     NAME SALARY  STARTDATE       DEPT
1  1     Rick  623.3   1/1/2012         IT
2  2      Dan  515.2  9/23/2013 Operations
3  3 Michelle    611 11/15/2014         IT
4  4     Ryan    729  5/11/2014         HR
5  5     Gary 843.25  3/27/2015    Finance
6  6     Nina    578  5/21/2013         IT
7  7    Simon  632.8  7/30/2013 Operations
8  8     Guru  722.5  6/17/2014    Finance
As the data is now available as a data frame, we can use data
frame functions to read and manipulate it. Note that every
value is read as text, so numeric and date columns must be
converted before they are used in calculations.
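Values taken from XML are text, so a conversion step usually follows. A sketch in base R only, in which a hand-built data frame of strings stands in for the result of xmlToDataFrame() (an assumption made so that the example runs without the XML package):

```r
# Stand-in for the data frame returned by xmlToDataFrame(): all text.
xml_like <- data.frame(
  ID = c("1", "2", "3"),
  NAME = c("Rick", "Dan", "Michelle"),
  SALARY = c("623.3", "515.2", "611"),
  STARTDATE = c("1/1/2012", "9/23/2013", "11/15/2014"),
  stringsAsFactors = FALSE
)

# Convert each column to the type needed for analysis.
xml_like$ID        <- as.integer(xml_like$ID)
xml_like$SALARY    <- as.numeric(xml_like$SALARY)
xml_like$STARTDATE <- as.Date(xml_like$STARTDATE, format = "%m/%d/%Y")

str(xml_like)
print(max(xml_like$SALARY))
```

The format string "%m/%d/%Y" tells as.Date() that the dates are written as month/day/four-digit year.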
R - JSON Files
A JSON file stores data as text in a human-readable format.
JSON stands for JavaScript Object Notation. R can read JSON
files using the rjson package.
Install rjson Package
In the R console, you can issue the following command to install
the rjson package.
install.packages("rjson")
Input Data
Create a JSON file by copying the data below into a text editor
like Notepad. Save the file with a .json extension (input.json
is the name used in the code below), choosing the file type as
All files (*.*).
{
   "ID":["1","2","3","4","5","6","7","8"],
   "Name":["Rick","Dan","Michelle","Ryan","Gary","Nina","Simon","Guru"],
   "Salary":["623.3","515.2","611","729","843.25","578","632.8","722.5"],
   "StartDate":["1/1/2012","9/23/2013","11/15/2014","5/11/2014",
      "3/27/2015","5/21/2013","7/30/2013","6/17/2014"],
   "Dept":["IT","Operations","IT","HR","Finance","IT","Operations","Finance"]
}
Read the JSON File
The JSON file is read by R using the function fromJSON(). It is
stored as a list in R.
# Load the package required to read JSON files.
library("rjson")
# Give the input file name to the function.
result <- fromJSON(file = "input.json")
# Print the result.
print(result)
When we execute the above code, it produces the following
result −
$ID
[1] "1" "2" "3" "4" "5" "6" "7" "8"
$Name
[1] "Rick" "Dan" "Michelle" "Ryan" "Gary" "Nina"
"Simon" "Guru"
$Salary
[1] "623.3" "515.2" "611" "729" "843.25" "578" "632.8"
"722.5"
$StartDate
[1] "1/1/2012" "9/23/2013" "11/15/2014" "5/11/2014"
"3/27/2015" "5/21/2013"
"7/30/2013" "6/17/2014"
$Dept
[1] "IT" "Operations" "IT" "HR" "Finance" "IT"
"Operations" "Finance"
Convert JSON to a Data Frame
We can convert the extracted data above to an R data frame for
further analysis using the as.data.frame() function.
# Load the package required to read JSON files.
library("rjson")
# Give the input file name to the function.
result <- fromJSON(file = "input.json")
# Convert JSON file to a data frame.
json_data_frame <- as.data.frame(result)
print(json_data_frame)
When we execute the above code, it produces the following
result −
  ID     Name Salary  StartDate       Dept
1  1     Rick  623.3   1/1/2012         IT
2  2      Dan  515.2  9/23/2013 Operations
3  3 Michelle    611 11/15/2014         IT
4  4     Ryan    729  5/11/2014         HR
5  5     Gary 843.25  3/27/2015    Finance
6  6     Nina    578  5/21/2013         IT
7  7    Simon  632.8  7/30/2013 Operations
8  8     Guru  722.5  6/17/2014    Finance
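Because the JSON file above stores every value as a quoted string, each column of the resulting data frame is text. One possible way to recover the types, sketched here with a hand-built list standing in for the output of fromJSON() (an assumption, so that the example runs without rjson), is type.convert(), which has a data frame method in recent versions of R:

```r
# A list shaped like the parsed JSON: named, equal-length character vectors.
parsed <- list(
  ID = c("1", "2", "3"),
  Name = c("Rick", "Dan", "Michelle"),
  Salary = c("623.3", "515.2", "611")
)

df <- as.data.frame(parsed, stringsAsFactors = FALSE)
print(sapply(df, class))        # every column is still character

# Re-guess the column types; as.is = TRUE keeps strings as character.
df <- type.convert(df, as.is = TRUE)
print(sapply(df, class))
print(sum(df$Salary))
```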
Reading Files in R Programming
So far, the operations in R have been performed at a
prompt/terminal, and their results are not stored anywhere. In
the software industry, however, most programs are written to
store the information they produce. One way to do so is to
store the information in a file. The two most common
operations that can be performed on a file are:
Importing/Reading Files in R
Exporting/Writing Files in R
Reading Files in R Programming Language
When a program terminates, all of its data is lost. Storing
data in a file preserves it even after the program terminates.
Entering a large amount of data by hand takes a lot of time; if
we have a file containing all the data, we can access its
contents with a few commands in R. Files also make it easy to
move data from one computer to another without any changes.
Data may be stored in various formats: in a .txt
(tab-separated values) file, in a .csv (comma-separated values)
file, or on the internet or in the cloud. R provides easy
methods to read all of these.
File reading in R
One of the important formats for storing data is a text file. R
provides various methods for reading data from a text file.
read.delim(): This method is used for reading "tab-separated
value" files (".txt"). By default, a point (".") is used as the
decimal point.
Syntax: read.delim(file, header = TRUE, sep = "\t", dec = ".", ...)
Parameters:
file: the path to the file containing the data to be read into R.
header: a logical value. If TRUE, read.delim() assumes that
your file has a header row, so row 1 is the name of each
column. If that’s not the case, you can add the argument
header = FALSE.
sep: the field separator character. “\t” is used for a tab-
delimited file.
dec: the character used in the file for decimal points.
Example:
# R program reading a text file
# Read a text file using read.delim()
myData = read.delim("geeksforgeeks.txt", header = FALSE)
print(myData)
Output:
1 A computer science portal for geeks.
Note: The above R code, assumes that the file
“geeksforgeeks.txt” is in your current working directory. To
know your current working directory, type the
function getwd() in R console.
read.delim2(): This method is used for reading "tab-separated
value" files (".txt"). By default, a comma (",") is used as the
decimal point.
Syntax: read.delim2(file, header = TRUE, sep = "\t", dec = ",", ...)
Parameters:
file: the path to the file containing the data to be read into R.
header: a logical value. If TRUE, read.delim2() assumes that
your file has a header row, so row 1 is the name of each
column. If that’s not the case, you can add the argument
header = FALSE.
sep: the field separator character. “\t” is used for a tab-
delimited file.
dec: the character used in the file for decimal points.
Example:
# R program reading a text file
# Read a text file using read.delim2
myData = read.delim2("geeksforgeeks.txt", header = FALSE)
print(myData)
Output:
1 A computer science portal for geeks.
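The practical difference between read.delim() and read.delim2() shows up only when the file contains decimal numbers. A sketch with a made-up tab-separated file whose prices use a decimal comma:

```r
# Hypothetical tab-separated data with comma decimals.
path <- tempfile(fileext = ".txt")
writeLines(c("item\tprice", "pen\t1,50", "book\t12,75"), path)

point_style <- read.delim(path)    # dec = ".": "1,50" is not parsed as a number
comma_style <- read.delim2(path)   # dec = ",": "1,50" is parsed as 1.5

print(class(point_style$price))
print(class(comma_style$price))
print(comma_style$price)
```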
file.choose(): In R it is also possible to choose a file
interactively using the function file.choose(). This method is
especially useful for beginners in R programming.
Example:
# R program reading a text file using file.choose()
myFile = read.delim(file.choose(), header = FALSE)
# If you use the code above in RStudio
# you will be asked to choose a file
print(myFile)
Output:
1 A computer science portal for geeks.
read_tsv(): This method also reads tab-separated ("\t") values,
using the readr package.
Syntax: read_tsv(file, col_names = TRUE)
Parameters:
file: the path to the file containing the data to be read into R.
col_names: Either TRUE, FALSE, or a character vector
specifying column names. If TRUE, the first row of the input
will be used as the column names.
Example:
# R program to read text file
# using readr package
# Import the readr library
library(readr)
# Use read_tsv() to read text file
myData = read_tsv("geeksforgeeks.txt", col_names = FALSE)
print(myData)
Output:
# A tibble: 1 x 1
X1
1 A computer science portal for geeks.
Note: You can also use file.choose() with read_tsv() just like
before.
# Read a txt file
myData <- read_tsv(file.choose())
Reading one line at a time
read_lines(): This method is used for reading as many lines as
you choose, whether one, two or ten lines at a time. To use
this method we have to import the readr package.
Syntax: read_lines(file, skip = 0, n_max = -1L)
Parameters:
file: file path
skip: Number of lines to skip before reading data
n_max: Number of lines to read. If n_max is -1, all lines in
the file will be read.
Example:
# R program to read one line at a time
# Import the readr library
library(readr)
# read_lines() to read one line at a time
myData = read_lines("geeksforgeeks.txt", n_max = 1)
print(myData)
# read_lines() to read two lines at a time
myData = read_lines("geeksforgeeks.txt", n_max = 2)
print(myData)
Output:
[1] "A computer science portal for geeks."
[1] "A computer science portal for geeks."
[2] "Geeksforgeeks is founded by Sandeep Jain Sir."
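When readr is not installed, base R's readLines() offers similar behaviour. Note that this is a different function from read_lines(): its argument for the number of lines is n, and it has no skip argument. A sketch on a temporary file with made-up content:

```r
path <- tempfile(fileext = ".txt")
writeLines(c("first line", "second line", "third line"), path)  # sample content

one_line  <- readLines(path, n = 1)   # only the first line
two_lines <- readLines(path, n = 2)   # the first two lines
all_lines <- readLines(path)          # n = -1 by default: every line

# readLines() has no skip argument; drop leading elements instead.
after_skip <- readLines(path)[-1]

print(one_line)
print(two_lines)
print(length(all_lines))
print(after_skip)
```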
Reading the whole file
read_file(): This method is used for reading the whole file. To
use this method we have to import the readr package.
Syntax: read_file(file)
file: the file path
Example:
# R program to read the whole file
# Import the readr library
library(readr)
# read_file() to read the whole file
myData = read_file("geeksforgeeks.txt")
print(myData)
Output:
[1] "A computer science portal for geeks.\r\nGeeksforgeeks is
founded by Sandeep Jain Sir.\r\nI am an intern at this amazing
platform."
Reading a file in a table format
Another popular way to store data is in a tabular format. R
provides various methods for reading data from a tabular data
file.
read.table(): read.table() is a general function that can be
used to read a file in table format. The data will be imported as
a data frame.
Syntax: read.table(file, header = FALSE, sep = "", dec = ".")
Parameters:
file: the path to the file containing the data to be imported
into R.
header: logical value. If TRUE, read.table() assumes that
your file has a header row, so row 1 is the name of each
column. The default for read.table() is header = FALSE, so
add header = TRUE when the file has a header row.
sep: the field separator character
dec: the character used in the file for decimal points.
Example:
# R program to read a file in table format
# Using read.table()
myData = read.table("basic.csv")
print(myData)
Output:
1 Name,Age,Qualification,Address
2 Amiya,18,MCA,BBS
3 Niru,23,Msc,BLS
4 Debi,23,BCA,SBP
5 Biku,56,ISC,JJP
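In the output above each line lands in a single column, because the default separator of read.table() is white space and the default header is FALSE. A sketch of the difference, recreating the first rows of basic.csv in a temporary file:

```r
path <- tempfile(fileext = ".csv")
writeLines(c(
  "Name,Age,Qualification,Address",
  "Amiya,18,MCA,BBS",
  "Niru,23,Msc,BLS"
), path)

default_read <- read.table(path)                            # one column, header kept as data
proper_read  <- read.table(path, header = TRUE, sep = ",")  # four named columns

print(dim(default_read))
print(dim(proper_read))
print(names(proper_read))
```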
read.csv(): read.csv() is used for reading "comma separated
value" files (".csv"). Here too, the data is imported as a
data frame.
Syntax: read.csv(file, header = TRUE, sep = ",", dec = ".", ...)
Parameters:
file: the path to the file containing the data to be imported
into R.
header: logical value. If TRUE, read.csv() assumes that your
file has a header row, so row 1 is the name of each column.
If that’s not the case, you can add the argument header =
FALSE.
sep: the field separator character
dec: the character used in the file for decimal points.
Example:
# R program to read a file in table format
# Using read.csv()
myData = read.csv("basic.csv")
print(myData)
Output:
Name Age Qualification Address
1 Amiya 18 MCA BBS
2 Niru 23 Msc BLS
3 Debi 23 BCA SBP
4 Biku 56 ISC JJP
read.csv2(): read.csv2() is a variant used in countries that
use a comma "," as the decimal point and a semicolon ";" as the
field separator.
Syntax: read.csv2(file, header = TRUE, sep = ";", dec = ",", ...)
Parameters:
file: the path to the file containing the data to be imported
into R.
header: logical value. If TRUE, read.csv2() assumes that your
file has a header row, so row 1 is the name of each column.
If that’s not the case, you can add the argument header =
FALSE.
sep: the field separator character
dec: the character used in the file for decimal points.
Example:
# R program to read a file in table format
# Using read.csv2()
myData = read.csv2("basic.csv")
print(myData)
Output:
Name.Age.Qualification.Address
1 Amiya,18,MCA,BBS
2 Niru,23,Msc,BLS
3 Debi,23,BCA,SBP
4 Biku,56,ISC,JJP
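The output above has a single column because basic.csv is comma-separated, which is not the layout read.csv2() expects. A sketch of a file that does match it, with made-up values (semicolons between fields, commas as decimal marks):

```r
# Hypothetical semicolon-separated data with comma decimals.
path <- tempfile(fileext = ".csv")
writeLines(c("Name;Height", "Amiya;1,62", "Niru;1,75"), path)

people <- read.csv2(path)

print(people)
print(class(people$Height))
print(mean(people$Height))
```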
file.choose(): You can also
use file.choose() with read.csv() just like before.
Example:
# R program to read a file in table format
# Using file.choose() inside read.csv()
myData = read.csv(file.choose())
# If you use the code above in RStudio
# you will be asked to choose a file
print(myData)
Output:
Name Age Qualification Address
1 Amiya 18 MCA BBS
2 Niru 23 Msc BLS
3 Debi 23 BCA SBP
4 Biku 56 ISC JJP
read_csv(): This method also reads comma-separated (",")
values, using the readr package.
Syntax: read_csv(file, col_names = TRUE)
Parameters:
file: the path to the file containing the data to be read into R.
col_names: Either TRUE, FALSE, or a character vector
specifying column names. If TRUE, the first row of the input
will be used as the column names.
Example:
# R program to read a file in table format
# using readr package
# Import the readr library
library(readr)
# Using read_csv() method
myData = read_csv("basic.csv", col_names = TRUE)
print(myData)
Output:
Parsed with column specification:
cols(
Name = col_character(),
Age = col_double(),
Qualification = col_character(),
Address = col_character()
)
# A tibble: 4 x 4
Name Age Qualification Address
1 Amiya 18 MCA BBS
2 Niru 23 Msc BLS
3 Debi 23 BCA SBP
4 Biku 56 ISC JJP
Reading a file from the internet
It’s possible to use the
functions read.delim(), read.csv() and read.table() to
import files from the web.
Example:
# R program to read a file from the internet
# Using read.delim()
myData =
read.delim("http://www.sthda.com/upload/boxplot_format.txt")
print(head(myData))
Output:
Nom variable Group
1 IND1 10 A
2 IND2 7 A
3 IND3 20 A
4 IND4 14 A
5 IND5 14 A
6 IND6 12 A
Writing to Files in R Programming
R is a powerful programming language used especially for data
analytics in various fields. Data analysis involves reading and
writing data from various files such as Excel, CSV and text
files. This section covers various ways of writing data to
different types of files using R.
R – Writing to Files
Writing Data to CSV files in R Programming Language
CSV stands for Comma Separated Values. These files are used
to handle a large amount of statistical data. Following is the
syntax to write to a CSV file:
Syntax:
write.csv(my_data, file = "my_data.csv")
write.csv2(my_data, file = "my_data.csv")
Here,
write.csv() and write.csv2() are functions in R programming.
write.csv() uses "." for the decimal point and a comma (",")
for the separator.
write.csv2() uses a comma (",") for the decimal point and
a semicolon (";") for the separator.
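The two conventions are easiest to compare by writing the same data frame with both functions and looking at the raw lines. A sketch with sample values and temporary files (my_data here is a made-up data frame, not one defined earlier in the unit):

```r
my_data <- data.frame(item = c("pen", "book"), price = c(1.5, 12.75))  # sample values

f1 <- tempfile(fileext = ".csv")
f2 <- tempfile(fileext = ".csv")

write.csv(my_data, file = f1, row.names = FALSE)    # "." decimal, "," separator
write.csv2(my_data, file = f2, row.names = FALSE)   # "," decimal, ";" separator

print(readLines(f1))
print(readLines(f2))
```

A file written with write.csv2() should be read back with read.csv2() so that the decimal commas are interpreted as numbers.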
Writing Data to text files
Text files are commonly used in almost every application in our
day-to-day life as a step towards the "Paperless World".
Writing to .txt files is very similar to writing CSV files.
Following is the syntax to write to a text file:
Syntax:
write.table(my_data, file = "my_data.txt", sep = "\t")
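A sketch of a full round trip with write.table(), using a tab as the separator and a made-up data frame. The quote and row.names arguments are set so that the file contains only the values and a header:

```r
my_data <- data.frame(name = c("Rick", "Dan"), salary = c(623.3, 515.2))  # sample values
path <- tempfile(fileext = ".txt")

# Tab-separated, no quotes around strings, no row-name column.
write.table(my_data, file = path, sep = "\t", quote = FALSE, row.names = FALSE)
print(readLines(path))

# read.delim() expects exactly this layout: a header and tab-separated fields.
round_trip <- read.delim(path)
print(round_trip)
```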
Writing Data to Excel files
To write data to Excel we need to install the "xlsx" package,
a Java-based solution for reading, writing, and committing
changes to Excel files. It can be installed as follows:
install.packages("xlsx")
It can then be loaded and used with the following general
syntax:
library("xlsx")
write.xlsx(my_data, file = "result.xlsx",
sheetName = "my_data", append = FALSE)