15  Difference between pipes in R

15.1 Performance comparisons

Code
library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
Code
library(microbenchmark)

15.1.1 Define a function that takes two arguments

Code
add_numbers <- function(a, b) {
  return(a + b)
}

15.1.2 Compare the performance

Code
rand_number <- runif(1, -100, 100)
Code
benchmark_results <- microbenchmark(
  "%>%" = {rand_number %>% add_numbers(rand_number)},
  "|>" = {rand_number |> add_numbers(rand_number)},
  "no pipe" = {add_numbers(rand_number, rand_number)},
  times = 10000
)
Code
benchmark_results |> str() 
Classes 'microbenchmark' and 'data.frame':  30000 obs. of  2 variables:
 $ expr: Factor w/ 3 levels "%>%","|>","no pipe": 3 3 1 2 2 2 3 2 2 3 ...
 $ time: num  26000 1297200 37200 900 500 ...
Code
print(benchmark_results)
Unit: nanoseconds
    expr  min   lq    mean median   uq     max neval
     %>% 1600 1700 1840.99   1800 1900   37200 10000
      |>  300  400  421.57    400  400    1600 10000
 no pipe  300  400  554.56    400  400 1297200 10000
Code
M <- print(benchmark_results)
Unit: nanoseconds
    expr  min   lq    mean median   uq     max neval
     %>% 1600 1700 1840.99   1800 1900   37200 10000
      |>  300  400  421.57    400  400    1600 10000
 no pipe  300  400  554.56    400  400 1297200 10000
Code
base_pipe <- M %>%
  filter(expr == "|>") 

magrittr_pipe <- M %>%
  filter(expr == "%>%") 

no_pipe <- M %>%
  filter(expr == "no pipe") 

# Overhead of each pipe relative to the plain call, run by run (nanoseconds)
diff_no_magrittr <- magrittr_pipe$time - no_pipe$time
diff_no_base <- base_pipe$time - no_pipe$time

# Difference-in-differences, as a percentage of the plain-call time
percent_diff_in_diff <- (diff_no_magrittr - diff_no_base) / no_pipe$time * 100

summary(percent_diff_in_diff)
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
   0.1311  325.0000  350.0000  343.2731  375.0000 2260.0000 
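
As a rough check on the summary above, the same formula can be applied to the median timings from the benchmark table. This is only a sketch: the three values are copied by hand from the printed output, and the median of run-by-run percentages need not equal the percentage computed from medians, although here the two agree.

```r
# Median timings in nanoseconds, copied by hand from the benchmark table above
median_magrittr <- 1800
median_base <- 400
median_no_pipe <- 400

# Same difference-in-differences formula, applied to the medians
pct_from_medians <- ((median_magrittr - median_no_pipe) -
  (median_base - median_no_pipe)) / median_no_pipe * 100
pct_from_medians
```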

Note that the raw timings are in nanoseconds, whereas the difference-in-differences is expressed as a percentage of the no-pipe execution time. For a single call the extra cost of %>% is on the order of a microsecond, but over long chains of pipes, or pipelines that are evaluated many times, the difference in performance can be more pronounced.
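
One likely reason |> matches the no-pipe timing is that R's parser rewrites a native pipe into an ordinary nested call before anything is evaluated, whereas %>% is a regular function that has to be called at run time. A minimal sketch using quote(), where x, f, and y are placeholder names that are never evaluated:

```r
# quote() captures the parsed expression without evaluating it,
# so x, f and y do not need to exist.
native_call <- quote(x |> f(y))
magrittr_call <- quote(x %>% f(y))

# The native pipe has already been rewritten to f(x, y) by the parser.
print(native_call)
identical(native_call, quote(f(x, y)))

# The magrittr pipe is still a call to the function `%>%`.
print(magrittr_call)
identical(magrittr_call[[1]], as.name("%>%"))
```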

15.2 Usage differences

Code
filter_row_number <- function(df, num){
  filter(df, row_number() == num) 
}

Both pipe operators pass the object on the left as the first argument of the function call on the right.

Code
mtcars |> filter_row_number(1)
          mpg cyl disp  hp drat   wt  qsec vs am gear carb
Mazda RX4  21   6  160 110  3.9 2.62 16.46  0  1    4    4
Code
mtcars %>% filter_row_number(1)
          mpg cyl disp  hp drat   wt  qsec vs am gear carb
Mazda RX4  21   6  160 110  3.9 2.62 16.46  0  1    4    4

With %>%, the . placeholder lets you pass the left-hand object to an argument other than the first:

Code
1 %>% filter_row_number(mtcars, .)
          mpg cyl disp  hp drat   wt  qsec vs am gear carb
Mazda RX4  21   6  160 110  3.9 2.62 16.46  0  1    4    4

However, the |> operator does not allow for the . placeholder:

Code
# this will not work
1 |> filter_row_number(mtcars , .)
Error in filter_row_number(1, mtcars, .): unused argument (.)

The |> operator supports the _ placeholder (R 4.2.0 and later), but it must be passed as a named argument:

Code
1 |> filter_row_number(mtcars, num = _)
          mpg cyl disp  hp drat   wt  qsec vs am gear carb
Mazda RX4  21   6  160 110  3.9 2.62 16.46  0  1    4    4
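
The naming requirement is enforced when the code is parsed, not when it runs. The sketch below, which assumes R 4.2.0 or later and uses base R's seq() purely for illustration, captures the parse failure for an unnamed _ and then shows the named form working:

```r
# An unnamed `_` is rejected by the parser, so the text is parsed inside tryCatch().
unnamed_ok <- tryCatch(
  {
    parse(text = "4 |> seq(1, _)")
    TRUE
  },
  error = function(e) FALSE
)
unnamed_ok

# Naming the argument that receives the placeholder is accepted.
named_result <- 4 |> seq(from = 1, to = _)
named_result
```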

Another important difference is that %>% accepts an expression in curly braces on the right-hand side. Inside the braces, . refers to the object on the left, and the object is not inserted as the first argument:

Code
mtcars %>% {subset(.$mpg, .$cyl == 6)}
[1] 21.0 21.0 21.4 18.1 19.2 17.8 19.7

Whereas the |> operator does not allow for the use of curly braces:

Code
mtcars |>  {subset(_$mpg, _$cyl == 6)}
Error: function '{' not supported in RHS call of a pipe (<text>:1:12)
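
A common workaround with |> is to wrap the expression in an anonymous function and call it. A minimal sketch, assuming R 4.1.0 or later for the \(d) shorthand; the argument name d is arbitrary:

```r
# Wrap the expression in an anonymous function and call it with ();
# the left-hand side is passed as the argument d.
six_cyl_mpg <- mtcars |> (\(d) subset(d$mpg, d$cyl == 6))()
six_cyl_mpg
```

This should return the same values as the %>% version with curly braces.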