1.7 Chaining Functions Together With %>%, the Pipe Operator
1.7.1 Problem
You want to call one function, then pass the result to another function, and another, in a way that is easily readable.
1.7.2 Solution
Use %>%, the pipe operator. For example:
library(dplyr) # The pipe is provided by dplyr
# Look at the morley data set
morley
#>     Expt Run Speed
#> 001    1   1   850
#> 002    1   2   740
#> 003    1   3   900
#>  ...<94 more rows>...
#> 098    5  18   800
#> 099    5  19   810
#> 100    5  20   870
morley %>%
  filter(Expt == 1) %>%
  summary()
#>       Expt        Run            Speed
#>  Min.   :1   Min.   : 1.00   Min.   : 650
#>  1st Qu.:1   1st Qu.: 5.75   1st Qu.: 850
#>  Median :1   Median :10.50   Median : 940
#>  Mean   :1   Mean   :10.50   Mean   : 909
#>  3rd Qu.:1   3rd Qu.:15.25   3rd Qu.: 980
#>  Max.   :1   Max.   :20.00   Max.   :1070
This takes the morley data set and passes it to the filter() function from dplyr, which keeps only the rows where Expt is equal to 1. That result is then passed to the summary() function, which calculates some summary statistics on the data.
Without the pipe operator, the code above would be written like this:
summary(filter(morley, Expt == 1))
In this code, function calls are evaluated from the inside outward. From a mathematical viewpoint this makes perfect sense, but it can be hard to read, especially when there are many nested function calls.
1.7.3 Discussion
This pattern, with the %>% operator, is widely used with tidyverse packages, because they contain many functions that do relatively small things. The idea is that these functions are building blocks that allow users to compose function calls to produce the desired result.
To illustrate what’s going on, here’s a simpler example of two equivalent pieces of code:
f(x)
# Equivalent to:
x %>% f()
The pipe operator in essence takes the thing that’s on the left, and places it as the first argument of the function call that’s on the right.
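As a small concrete sketch of this rule, the following uses base R's sqrt() in place of the placeholder f(). The values in x are made up for illustration and are not part of the example above.

```r
library(dplyr)  # Attaches %>% (re-exported from magrittr)

x <- c(1, 4, 9)  # Illustrative values only

# Ordinary call: x is the first argument of sqrt()
direct <- sqrt(x)

# Piped call: %>% places x as the first argument of sqrt()
piped <- x %>% sqrt()

identical(direct, piped)
#> [1] TRUE
```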
It can be used for multiple function calls, in a chain:
h(g(f(x)))
# Equivalent to:
x %>%
  f() %>%
  g() %>%
  h()
In a function chain, the lexical ordering of the function calls is the same as the order in which they’re computed.
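To make the ordering concrete, here is a sketch that substitutes the base R functions sqrt(), rev(), and cumsum() for the placeholders f(), g(), and h(). The input values are assumed for illustration.

```r
library(dplyr)  # Attaches %>% (re-exported from magrittr)

x <- c(4, 9, 16)  # Illustrative values only

# Nested form: read from the inside out (sqrt, then rev, then cumsum)
nested <- cumsum(rev(sqrt(x)))

# Piped form: read top to bottom, in the order the calls are evaluated
piped <- x %>%
  sqrt() %>%    # 2 3 4
  rev() %>%     # 4 3 2
  cumsum()      # 4 7 9

piped
#> [1] 4 7 9
identical(nested, piped)
#> [1] TRUE
```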
If you want to store the final result, you can use the <- operator at the beginning. For example, this will replace the original x with the result of the function chain:
x <- x %>%
  f() %>%
  g() %>%
  h()
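A runnable sketch of the same idea, with sqrt() and sum() standing in for the placeholder functions and made-up input values. Because <- has lower precedence than %>%, the entire chain is evaluated before the assignment happens.

```r
library(dplyr)  # Attaches %>% (re-exported from magrittr)

x <- c(4, 9, 16)  # Illustrative values only

# The whole chain on the right is evaluated first, then assigned back to x
x <- x %>%
  sqrt() %>%
  sum()

x
#> [1] 9
```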
If there are additional arguments for the function calls, they will be shifted to the right when the pipe operator is used. Going back to code from the first example, these two are equivalent:
filter(morley, Expt == 1)
morley %>% filter(Expt == 1)
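The same behavior can be sketched with base R functions that take extra arguments. Here round() and mean() are illustrative choices and the input values are assumed. In each pair, only the first argument moves to the left of the pipe, and the remaining arguments stay inside the call.

```r
library(dplyr)  # Attaches %>% (re-exported from magrittr)

# An extra positional argument (digits = 2) stays inside round()
r_direct <- round(3.14159, 2)
r_piped  <- 3.14159 %>% round(2)
r_piped
#> [1] 3.14

# A named extra argument works the same way
y <- c(1, NA, 3)  # Illustrative values only
m_direct <- mean(y, na.rm = TRUE)
m_piped  <- y %>% mean(na.rm = TRUE)
m_piped
#> [1] 2
```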
The pipe operator is actually from the magrittr package, but dplyr imports it and makes it available when you call library(dplyr).
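As a minimal sketch of that point, the pipe can also be used by attaching magrittr on its own, without loading dplyr. The functions and values below are illustrative choices from base R.

```r
library(magrittr)  # Provides %>% directly, without attaching dplyr

total <- c(1, 4, 9) %>%
  sqrt() %>%   # 1 2 3
  sum()        # 6

total
#> [1] 6
```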
1.7.4 See Also
For many more examples of how to use %>% in data manipulation, see Chapter 15.