Introduction & read.csv
INTRODUCTION TO IMPORTING DATA IN R
Filip Schouwenaars
Instructor, DataCamp
Importing data in R
5 types
Flat files
states.csv
state,capital,pop_mill,area_sqm
South Dakota,Pierre,0.853,77116
New York,Albany,19.746,54555
Oregon,Salem,3.970,98381
Vermont,Montpelier,0.627,9616
Hawaii,Honolulu,1.420,10931
wanted_df
         state    capital pop_mill area_sqm
1 South Dakota     Pierre    0.853    77116
2     New York     Albany   19.746    54555
3       Oregon      Salem    3.970    98381
4      Vermont Montpelier    0.627     9616
5       Hawaii   Honolulu    1.420    10931
utils - read.csv
The utils package is loaded by default when you start R
read.csv("states.csv")
What if the file is in the datasets folder of your home directory (~)?
path <- file.path("~", "datasets", "states.csv")
path
[1] "~/datasets/states.csv"
read.csv(path)
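A minimal sketch of the same pattern that runs on any machine. As an assumption made only for this example, a scratch datasets folder inside tempdir() stands in for ~/datasets, and the sample rows from states.csv are written there first.

```r
# Scratch folder inside R's temporary directory
# (an assumed stand-in for ~/datasets so the example runs anywhere).
data_dir <- file.path(tempdir(), "datasets")
dir.create(data_dir, showWarnings = FALSE)

# file.path() joins the pieces with the platform's path separator.
path <- file.path(data_dir, "states.csv")

# Write the sample data shown on the slide to that location.
writeLines(c(
  "state,capital,pop_mill,area_sqm",
  "South Dakota,Pierre,0.853,77116",
  "New York,Albany,19.746,54555",
  "Oregon,Salem,3.970,98381",
  "Vermont,Montpelier,0.627,9616",
  "Hawaii,Honolulu,1.420,10931"
), path)

# Import the file as a data frame and inspect its structure.
states <- read.csv(path)
str(states)
```

Building the path with file.path() rather than pasting strings by hand keeps the code independent of the operating system.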
Let's practice!
read.delim & read.table
Tab-delimited file
states.txt
state	capital	pop_mill	area_sqm
South Dakota	Pierre	0.853	77116
New York	Albany	19.746	54555
Oregon	Salem	3.970	98381
Vermont	Montpelier	0.627	9616
Hawaii	Honolulu	1.420	10931
read.delim("states.txt")
         state    capital pop_mill area_sqm
1 South Dakota     Pierre    0.853    77116
2     New York     Albany   19.746    54555
3       Oregon      Salem    3.970    98381
4      Vermont Montpelier    0.627     9616
5       Hawaii   Honolulu    1.420    10931
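A self-contained sketch of the tab-delimited import. The file is created in tempdir() (an assumption so the example does not depend on the working directory), with "\t" marking each tab.

```r
# Write a tab-delimited copy of the states data to a temporary file.
path <- file.path(tempdir(), "states.txt")
writeLines(c(
  "state\tcapital\tpop_mill\tarea_sqm",
  "South Dakota\tPierre\t0.853\t77116",
  "New York\tAlbany\t19.746\t54555",
  "Oregon\tSalem\t3.970\t98381",
  "Vermont\tMontpelier\t0.627\t9616",
  "Hawaii\tHonolulu\t1.420\t10931"
), path)

# read.delim() expects a header row and tab-separated fields by default,
# so no extra arguments are needed.
states <- read.delim(path)
print(states)
```

Because the separator is a tab, the space inside "South Dakota" and "New York" stays part of the value.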
Exotic file format
states2.txt
state/capital/pop_mill/area_sqm
South Dakota/Pierre/0.853/77116
New York/Albany/19.746/54555
Oregon/Salem/3.970/98381
Vermont/Montpelier/0.627/9616
Hawaii/Honolulu/1.420/10931
read.table()
Read any tabular file as a data frame
Takes a large number of arguments
# Read data with the first row as column headers
read.table("states2.txt",
           header = TRUE,
           sep = "/")
         state    capital pop_mill area_sqm
1 South Dakota     Pierre    0.853    77116
2     New York     Albany   19.746    54555
3       Oregon      Salem    3.970    98381
4      Vermont Montpelier    0.627     9616
5       Hawaii   Honolulu    1.420    10931
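A short sketch of why header = TRUE matters here, using a copy of states2.txt written to tempdir() (an assumption for portability). In this file every line has the same number of fields, so read.table() does not detect the header on its own.

```r
# Write the slash-separated states data to a temporary file.
path <- file.path(tempdir(), "states2.txt")
writeLines(c(
  "state/capital/pop_mill/area_sqm",
  "South Dakota/Pierre/0.853/77116",
  "New York/Albany/19.746/54555",
  "Oregon/Salem/3.970/98381",
  "Vermont/Montpelier/0.627/9616",
  "Hawaii/Honolulu/1.420/10931"
), path)

# With header = TRUE the first row supplies the column names.
with_header <- read.table(path, header = TRUE, sep = "/")

# Without it, the header row is read as data and the columns
# get the generic names V1, V2, ...
no_header <- read.table(path, sep = "/")

str(with_header)
str(no_header)
```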
Let's practice!
Final Thoughts
Wrappers
read.table() is the main function
read.csv() = wrapper for CSV files
read.delim() = wrapper for tab-delimited files
read.csv
states.csv
state,capital,pop_mill,area_sqm
South Dakota,Pierre,0.853,77116
New York,Albany,19.746,54555
Oregon,Salem,3.970,98381
Vermont,Montpelier,0.627,9616
Hawaii,Honolulu,1.420,10931
Defaults
header = TRUE
sep = ","
read.table("states.csv", header = TRUE, sep = ",")
read.csv("states.csv")
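The equivalence of the two calls can be checked directly. This is a sketch on a temporary copy of states.csv (the tempdir() location is an assumption); it compares the resulting data frames rather than claiming the two functions agree on every possible file, since their other defaults differ.

```r
# Write the sample CSV to a temporary file.
path <- file.path(tempdir(), "states.csv")
writeLines(c(
  "state,capital,pop_mill,area_sqm",
  "South Dakota,Pierre,0.853,77116",
  "New York,Albany,19.746,54555",
  "Oregon,Salem,3.970,98381",
  "Vermont,Montpelier,0.627,9616",
  "Hawaii,Honolulu,1.420,10931"
), path)

# Spell out the arguments for read.table() ...
via_table <- read.table(path, header = TRUE, sep = ",")

# ... or rely on the defaults of the wrapper.
via_csv <- read.csv(path)

# Compare the two results for this file.
same_result <- isTRUE(all.equal(via_table, via_csv))
print(same_result)
```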
read.delim
states.txt
state	capital	pop_mill	area_sqm
South Dakota	Pierre	0.853	77116
New York	Albany	19.746	54555
Oregon	Salem	3.970	98381
Vermont	Montpelier	0.627	9616
Hawaii	Honolulu	1.420	10931
Defaults
header = TRUE
sep = "\t"
read.table("states.txt", header = TRUE, sep = "\t")
read.delim("states.txt")
Documentation
?read.table
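Besides reading the help page, the defaults can be inspected from code. The sketch below uses formals() to pull the default sep and dec of each reader; the helper table named defaults is introduced only for this illustration.

```r
# The import functions documented together on the ?read.table help page.
readers <- c("read.table", "read.csv", "read.csv2",
             "read.delim", "read.delim2")

# formals() returns a function's argument list, including default values.
defaults <- data.frame(
  fun = readers,
  sep = sapply(readers, function(f) formals(get(f))$sep),
  dec = sapply(readers, function(f) formals(get(f))$dec),
  row.names = NULL,
  stringsAsFactors = FALSE
)

print(defaults)
```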
Locale differences
states_aye.csv
state,capital,pop_mill,area_sqm
South Dakota,Pierre,0.853,77116
New York,Albany,19.746,54555
Oregon,Salem,3.970,98381
Vermont,Montpelier,0.627,9616
Hawaii,Honolulu,1.420,10931
states_nay.csv
state;capital;pop_mill;area_sqm
South Dakota;Pierre;0,853;77116
New York;Albany;19,746;54555
Oregon;Salem;3,97;98381
Vermont;Montpelier;0,627;9616
Hawaii;Honolulu;1,42;10931
Locale differences
read.csv(file, header = TRUE, sep = ",", quote = "\"",
         dec = ".", fill = TRUE, comment.char = "", ...)
read.csv2(file, header = TRUE, sep = ";", quote = "\"",
          dec = ",", fill = TRUE, comment.char = "", ...)
read.delim(file, header = TRUE, sep = "\t", quote = "\"",
           dec = ".", fill = TRUE, comment.char = "", ...)
read.delim2(file, header = TRUE, sep = "\t", quote = "\"",
            dec = ",", fill = TRUE, comment.char = "", ...)
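As an illustration of the tab-delimited counterpart, here is a sketch of read.delim2() on a file that uses tabs between fields and commas as decimal marks. The file name states_nay.txt and its contents are hypothetical, made up for this example.

```r
# Hypothetical tab-delimited file with comma decimal marks.
path <- file.path(tempdir(), "states_nay.txt")
writeLines(c(
  "state\tcapital\tpop_mill\tarea_sqm",
  "South Dakota\tPierre\t0,853\t77116",
  "New York\tAlbany\t19,746\t54555",
  "Oregon\tSalem\t3,97\t98381"
), path)

# read.delim2() uses sep = "\t" and dec = ",", so pop_mill is
# converted to a number instead of staying a string.
states <- read.delim2(path)
str(states)
```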
states_nay.csv
read.csv("states_nay.csv")
                      state.capital.pop_mill.area_sqm
South Dakota;Pierre;0                       853;77116
New York;Albany;19                          746;54555
Oregon;Salem;3                               97;98381
Vermont;Montpelier;0                         627;9616
Hawaii;Honolulu;1                            42;10931
read.csv2("states_nay.csv")
         state    capital pop_mill area_sqm
1 South Dakota     Pierre    0.853    77116
2     New York     Albany   19.746    54555
3       Oregon      Salem    3.970    98381
4      Vermont Montpelier    0.627     9616
5       Hawaii   Honolulu    1.420    10931
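The difference between the two calls can also be checked programmatically. This sketch writes states_nay.csv to tempdir() (an assumption for portability) and compares the shape of both results.

```r
# Write the semicolon-separated file with comma decimal marks.
path <- file.path(tempdir(), "states_nay.csv")
writeLines(c(
  "state;capital;pop_mill;area_sqm",
  "South Dakota;Pierre;0,853;77116",
  "New York;Albany;19,746;54555",
  "Oregon;Salem;3,97;98381",
  "Vermont;Montpelier;0,627;9616",
  "Hawaii;Honolulu;1,42;10931"
), path)

# read.csv() splits on the decimal commas: the header collapses into a
# single column name and the first piece of each row becomes a row name.
wrong <- read.csv(path)

# read.csv2() splits on ";" and treats "," as the decimal mark.
right <- read.csv2(path)

print(dim(wrong))
print(dim(right))
str(right)
```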
Let's practice!