15.3 Adding a Column to a Data Frame
15.3.2 Solution
Use mutate() from dplyr to add a new column and assign values to it. This returns a new data frame, which you'll typically want to save over the original.
If you assign a single value to the new column, the entire column will be filled with that value. This adds a column named newcol, filled with NA:
library(dplyr)

ToothGrowth %>%
  mutate(newcol = NA)
#> len supp dose newcol
#> 1 4.2 VC 0.5 NA
#> 2 11.5 VC 0.5 NA
#> ...<56 more rows>...
#> 59 29.4 OJ 2.0 NA
#> 60 23.0 OJ 2.0 NA
You can also assign a vector to the new column:
# Since ToothGrowth has 60 rows, we must create a new vector that has 60 elements
vec <- rep(c(1, 2), 30)

ToothGrowth %>%
  mutate(newcol = vec)
#> len supp dose newcol
#> 1 4.2 VC 0.5 1
#> 2 11.5 VC 0.5 2
#> ...<56 more rows>...
#> 59 29.4 OJ 2.0 1
#> 60 23.0 OJ 2.0 2
Note that the vector being added to the data frame must either have one element, or the same number of elements as the data frame has rows. In the example above we created a new vector that had 60 elements by repeating the values c(1, 2) thirty times.
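The length rule above can be checked directly. The following is a minimal sketch, not part of the original recipe: the names result and tg_alt are made up for illustration, and rep() with length.out together with dplyr's n() is just one way to build a vector of the right length without hard-coding the row count.

```r
library(dplyr)

# A length-2 vector is neither length 1 nor length 60, so mutate() signals
# an error; tryCatch() turns that error into a marker string
result <- tryCatch(
  ToothGrowth %>% mutate(newcol = c(1, 2)),
  error = function(e) "error"
)

# Repeat c(1, 2) out to the number of rows; n() gives the current row count
tg_alt <- ToothGrowth %>%
  mutate(newcol = rep(c(1, 2), length.out = n()))
```

Using length.out means the code keeps working if the number of rows changes, at the cost of silently truncating the pattern when the row count is not a multiple of the pattern length.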
15.3.3 Discussion
Each column of a data frame is a vector. R handles columns in data frames slightly differently from standalone vectors because all the columns in a data frame must have the same length.
To add a column using base R, you can simply assign values into the new column like so:
# Make a copy of ToothGrowth for this example
ToothGrowth2 <- ToothGrowth

# Assign NA's for the whole column
ToothGrowth2$newcol <- NA

# Assign 1 and 2, automatically repeating to fill
ToothGrowth2$newcol <- c(1, 2)
With base R, the vector being assigned into the data frame will automatically be repeated to fill the number of rows in the data frame, provided that the number of rows is an exact multiple of the vector's length.
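To see where base R's automatic repetition applies, here is a small sketch. The names tg and ok are hypothetical and used only for this illustration. It assigns vectors of two different lengths with $<- and records whether each assignment succeeds.

```r
# Work on a copy so the built-in data set is left untouched
tg <- ToothGrowth

# 60 rows is an exact multiple of 3, so the vector is repeated 20 times
tg$newcol <- c(1, 2, 3)

# 60 is not a multiple of 7, so this assignment signals an error;
# ok records whether the assignment went through
ok <- tryCatch({
  tg$bad <- 1:7
  TRUE
}, error = function(e) FALSE)
```

After running this, tg has the newcol column but no bad column, since the failed assignment leaves the data frame unchanged.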