Faster R code with Rust

Nicola Rennie
Lancaster University

Manchester R
27 June 2024

About Me


Academic background in statistics

Experience in data science consultancy

Lecturer in Health Data Science in Lancaster Medical School.

Research interests: machine learning, reproducible research, data visualisation, R pedagogy …




My R code is slow…

Maybe you should use Python instead!

Why is my R code so slow?

There are lots of ways to do things in R. Some are faster than others…

R is an interpreted language:

  • The interpreter executes instructions one by one at run time, rather than translating the whole program into machine code beforehand.

  • This is great for interactive debugging and user-friendliness…

  • …but bad for speed.

  • Python is also an interpreted language.




If speed is really the problem, Python alone won’t solve it.

So why not use a compiled language?

Compiled languages convert code into machine language before running it, leading to faster execution.

  • Requires recompilation after every change

  • Harder to debug

  • Usually less human-friendly

Using compiled languages with R

  • Identify the bottlenecks in your code (see Rprof()).

  • For things that are very slow, rewrite the R function in a compiled language.

  • Write an R function that calls the new function in the compiled language.

  • Use the now faster R function.
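As an illustration of the second step, here is a minimal sketch of the kind of loop that tends to be slow in R: an exponentially weighted moving average, where each output depends on the previous one and so is not straightforward to vectorise. The function name `ewma` and the parameter `alpha` are hypothetical, chosen for this example only.

```rust
/// Exponentially weighted moving average.
/// Each output depends on the previous output, so the work is inherently a loop.
/// `alpha` is the smoothing weight, assumed to lie in [0, 1] (not checked here).
fn ewma(x: &[f64], alpha: f64) -> Vec<f64> {
    let mut out = Vec::with_capacity(x.len());
    let mut prev: Option<f64> = None;
    for &value in x {
        // The first element is passed through unchanged; later ones are smoothed.
        let smoothed = match prev {
            None => value,
            Some(p) => alpha * value + (1.0 - alpha) * p,
        };
        out.push(smoothed);
        prev = Some(smoothed);
    }
    out
}
```

An R wrapper around a function like this would let everyone else keep calling it from R as usual.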

Using compiled languages with R

Pros:

  • Only one person has to learn the compiled language.
  • Everybody else gets to keep using (faster) R code.

Cons:

  • Somebody has to learn a compiled language.
  • Errors might look confusing to an R user.

C++

  • C++ is a compiled, general purpose programming language.

  • The {Rcpp} R package provides a C++ integration for R.

  • {Rcpp} is used by over 2,500 packages on CRAN alone.

But C++ has a steep learning curve.

Rust

Rust is also a compiled, general purpose programming language.

  • An alternative to C++ for speeding up R code.

  • Rust's compiler enforces stricter checks than C++ (e.g. ownership and borrowing rules), so more mistakes are caught at compile time.

  • Rust is more memory-safe than C++.

  • It includes a package manager (Cargo) for installing and publishing libraries.

  • It felt easier to learn (with very helpful error messages!)
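A small sketch of what the compile-time checks look like in practice. The function names `total`, `scale` and `demo` are made up for illustration. The commented-out line is the sort of mistake the compiler rejects (using a value after it has been moved).

```rust
/// Borrows the data read-only: the caller keeps ownership of the vector.
fn total(values: &[f64]) -> f64 {
    values.iter().sum()
}

/// Takes ownership of the vector and returns a new, scaled one.
fn scale(values: Vec<f64>, factor: f64) -> Vec<f64> {
    values.into_iter().map(|v| v * factor).collect()
}

/// Hypothetical helper showing both calls in sequence.
fn demo() -> (f64, Vec<f64>) {
    let data = vec![1.0, 2.0, 3.0];
    let sum = total(&data); // `data` is only borrowed, so it is still usable
    let doubled = scale(data, 2.0); // `data` is moved into `scale` here
    // total(&data); // would not compile: `data` was moved on the line above
    (sum, doubled)
}
```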


Writing functions in R and Rust

R

abs_value_r <- function(x) {
  abs_x <- if (x >= 0) { x } else { -x }
  return(abs_x)
}


Rust

fn abs_value(x: f64) -> f64 {
    let abs_x = if x>=0.0 { x } else { -1.0 * x };
    return abs_x;
}


A trivial example since abs() exists…
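For comparison, a sketch of the same function using Rust's built-in `f64::abs`. The name `abs_value_builtin` is hypothetical. The last expression of a Rust function is its return value, so `return` can be omitted.

```rust
/// Hand-written version, as on the slide.
fn abs_value(x: f64) -> f64 {
    let abs_x = if x >= 0.0 { x } else { -1.0 * x };
    return abs_x;
}

/// Same result using the standard library method on f64.
fn abs_value_builtin(x: f64) -> f64 {
    x.abs() // final expression without a semicolon is the return value
}
```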

Using Rust functions in R

The {rextendr} package allows you to call Rust code from R:

  • in scripts
  • in R packages
  • in R Markdown / Quarto documents with {extendr} code chunks


Using Rust functions in R

library(rextendr)
rust_function(
  "fn abs_value_rust(x: f64) -> f64 {
    let abs_x = if x>=0.0 { x } else { -1.0 * x };
    return abs_x;
  }"
)


abs_value_rust(-3.45)
[1] 3.45

Using Rust functions in R

You might want to use multiple Rust functions in R:

code <- r"(
#[extendr]
fn abs_value_rust(x: f64) -> f64 {
  let number = if x>=0.0 { x } else { -1.0 * x };
  return number;
}

// More Rust functions go here, each preceded by #[extendr]
)"

rust_source(code = code)

You can also supply a file:

rust_source(file = "abs_value_rust.rs")

Applying functions multiple times

Although a lot of functions in R are vectorised, it isn’t always the case.

Options for applying a function to each element of a vector x:

  • for loops:
abs_x <- numeric(length = length(x))
for (i in seq_along(x)) {
  abs_x[i] <- abs_value_r(x[i])
}
  • apply family of functions: sapply(x, abs_value_r)
  • {purrr} package: purrr::map_vec(x, abs_value_r)

Vectorising with Rust

fn abs_value_rust(x: f64) -> f64 {
    let abs_x = if x>=0.0 { x } else { -1.0 * x };
    return abs_x;
}

fn abs_value_iter(v: Vec<f64>) -> Vec<f64> {
    let abs_v: Vec<_> = v.iter().map(|x| abs_value_rust(*x)).collect();
    return abs_v;
}
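Two possible variations on the iterator version, as a sketch only: one borrows a slice instead of taking ownership of the vector, and one overwrites the values in place. The names `abs_value_slice` and `abs_value_in_place` are hypothetical, and whether a given signature can be exported to R depends on the argument types extendr supports, which is not checked here.

```rust
fn abs_value_rust(x: f64) -> f64 {
    if x >= 0.0 { x } else { -1.0 * x }
}

/// Borrowing variant: reads from a slice and returns a new vector.
fn abs_value_slice(v: &[f64]) -> Vec<f64> {
    v.iter().copied().map(abs_value_rust).collect()
}

/// In-place variant: overwrites each element, avoiding a second allocation.
fn abs_value_in_place(v: &mut [f64]) {
    for x in v.iter_mut() {
        *x = abs_value_rust(*x);
    }
}
```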

Vectorising with Rust in R

code <- r"(
#[extendr]
fn abs_value_rust(x: f64) -> f64 {
  let number = if x>=0.0 { x } else { -1.0 * x };
  return number;
}

#[extendr]
fn abs_value_iter(v: Vec<f64>) -> Vec<f64> {
    let abs_v: Vec<_> = v.iter().map(|x| abs_value_rust(*x)).collect();
    return abs_v;
}
)"

rust_source(code = code)

Vectorising with Rust

set.seed(20240221)
z <- rnorm(20)
abs_value_iter(z)

Comparison of speed

Line chart of speed improvements showing Rust is consistently faster
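The slides do not say how the timings were collected. As a rough sketch of how the Rust side alone could be timed natively, assuming a hypothetical helper `mean_time_per_call` (a comparison against R would normally be run from R with a benchmarking package instead):

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

fn abs_value_iter(v: &[f64]) -> Vec<f64> {
    v.iter()
        .map(|&x| if x >= 0.0 { x } else { -1.0 * x })
        .collect()
}

/// Runs `abs_value_iter` `reps` times and returns the mean time per call.
fn mean_time_per_call(v: &[f64], reps: u32) -> Duration {
    assert!(reps > 0, "reps must be at least 1");
    let start = Instant::now();
    for _ in 0..reps {
        // black_box stops the optimiser from discarding the unused result.
        black_box(abs_value_iter(black_box(v)));
    }
    start.elapsed() / reps
}
```

Timings from a sketch like this vary by machine and build settings, so it should be compiled in release mode before drawing any conclusions.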

Using Rust in an R package

  • Create an R package as normal.

  • Use rextendr::use_extendr() to set up scaffolding for using Rust.

  • Edit src/rust/src/lib.rs to add your Rust functions.

  • Run rextendr::document() to compile the Rust code and generate the R wrapper functions.

  • Run devtools::load_all() (or install and call library() as normal)
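A sketch of what src/rust/src/lib.rs might contain for a hypothetical package called `mypkg`, reusing the functions from earlier slides. The extendr-specific lines are shown as comments because they need the extendr-api crate; in a real package they would be uncommented.

```rust
// Sketch of src/rust/src/lib.rs for a hypothetical package `mypkg`.
// Lines that depend on the extendr-api crate are commented out here.

// use extendr_api::prelude::*;

/// Absolute value of a single number.
/// @export
// #[extendr]
fn abs_value_rust(x: f64) -> f64 {
    if x >= 0.0 { x } else { -1.0 * x }
}

/// Absolute value of each element of a vector.
/// @export
// #[extendr]
fn abs_value_iter(v: Vec<f64>) -> Vec<f64> {
    v.iter().map(|x| abs_value_rust(*x)).collect()
}

// Registers the exported functions with R. The module name is assumed
// to match the package name.
// extendr_module! {
//     mod mypkg;
//     fn abs_value_rust;
//     fn abs_value_iter;
// }
```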

You might already be using Rust…

…without knowing it.

  • {dialrs}: R package for parsing phone numbers

  • {arcgisgeocode}: R package that provides access to ArcGIS geocoding services

  • Typst: an alternative to LaTeX for creating PDFs with Quarto

  • Polars: a high-performance DataFrame library for Python

Closing thoughts

  • Rust is fast (and user-friendly-ish…)

  • R is a more user-friendly language, so an R wrapper means that users don’t have to learn Rust to get the speed increase.

  • {rextendr} also enables the inclusion of Rust code in R packages.

  • Similar approaches are also available in Python.

Resources