Faster R code with Rust
Nicola Rennie
Lancaster University
Manchester R
27 June 2024
Academic background in statistics
Experience in data science consultancy
Lecturer in Health Data Science in Lancaster Medical School.
Research interests: machine learning, reproducible research, data visualisation, R pedagogy …
Maybe you should use Python instead!
There are lots of ways to do things in R. Some are faster than others…
R is an interpreted language:
It executes instructions directly line by line, i.e. translates code into machine language during execution.
This is great for interactive debugging and user-friendliness…
…but bad for speed.
Python is also an interpreted language.
Maybe you should use Python instead!
If speed is really the problem, Python alone won’t solve your problems.
Compiled languages convert code into machine language before running it, leading to faster execution.
Requires recompilation after every change
Harder to debug
Usually less human-friendly
Identify the bottlenecks in your code (see Rprof()).
For things that are very slow, rewrite the R function in a compiled language.
Write an R function that calls the new function in the compiled language.
Use the now faster R function.
Pros:
Cons:
C++ is a compiled, general purpose programming language.
The {Rcpp} R package provides a C++ integration for R.
{Rcpp} is used by over 2,500 packages on CRAN alone.
But C++ has a steep learning curve.
Rust is also a compiled, general purpose programming language.
An alternative to C++ for speeding up R code.
Rust has more rigorous code validation measures than C++.
Rust is more memory-safe than C++.
It includes a package manager for installing and publishing libraries.
It felt easier to learn (with very helpful error messages!)
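As a small illustration of the validation and memory-safety points above, here is a minimal sketch using only the Rust standard library. The function names `element_at` and `element_or_default` are hypothetical and chosen just for this example. Reading past the end of a slice is checked, and the "missing" case has to be handled before the value can be used as a number.

```rust
/// Return the element at position `i`, or `None` if `i` is out of range.
/// `element_at` is a hypothetical helper used only for illustration.
fn element_at(values: &[f64], i: usize) -> Option<f64> {
    // `get` does a bounds check and returns an Option instead of
    // reading memory beyond the end of the slice.
    values.get(i).copied()
}

/// Fall back to `default` when the index is out of range.
/// The compiler will not let the Option be used as an f64 until
/// both cases have been dealt with.
fn element_or_default(values: &[f64], i: usize, default: f64) -> f64 {
    match element_at(values, i) {
        Some(x) => x,
        None => default,
    }
}
```

Forgetting the `None` arm in the `match` is a compile-time error, which is one example of the kind of helpful feedback the compiler gives.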
A trivial example, since abs() exists…
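A minimal sketch of what such a trivial Rust function might look like. The name `abs_value` is an assumption, chosen to mirror the R function `abs_value_r` used elsewhere in these slides. When exposed to R through extendr, a function like this would normally carry the `#[extendr]` attribute; that is shown only as a comment here so the snippet compiles with the standard library alone.

```rust
/// Absolute value of a single number.
/// `abs_value` is an assumed name for illustration.
// #[extendr]   // would be added when exposing the function to R
fn abs_value(x: f64) -> f64 {
    // Flip the sign of negative inputs, leave everything else unchanged.
    if x < 0.0 {
        -x
    } else {
        x
    }
}
```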
The {rextendr} package allows you to call Rust code from R:
{extendr} code chunks
You might want to use multiple Rust functions in R:
Although a lot of functions in R are vectorised, it isn’t always the case.
Options for applying a function to each element of a vector x:
for loops
apply family of functions: sapply(x, abs_value_r)
{purrr} package: purrr::map_vec(x, abs_value_r)
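Another option, in keeping with the theme of this talk, is to move the loop itself into Rust so that R makes a single call for the whole vector. The sketch below assumes a hypothetical function name, `abs_values`, and uses plain slices and vectors from the standard library; the exact argument and return types accepted by extendr may differ.

```rust
/// Apply the absolute value to every element and return a new vector.
/// `abs_values` is a hypothetical name used for illustration.
fn abs_values(xs: &[f64]) -> Vec<f64> {
    // The iteration happens in compiled code, so there is no
    // per-element call back into the R interpreter.
    xs.iter().map(|x| x.abs()).collect()
}
```

The design choice here is to cross the R/Rust boundary once per vector rather than once per element, which is usually where the speed gain comes from.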
Create an R package as normal.
Use rextendr::use_extendr() to set up scaffolding for using Rust.
Edit src/rust/src/lib.rs to add your Rust functions.
Run rextendr::document() to compile the Rust code and generate the R wrapper functions.
Run devtools::load_all() (or install and call library() as normal).
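To make the "edit src/rust/src/lib.rs" step more concrete, here is a rough sketch of what that file might contain. The function `rescale` and the package name `mypkg` are hypothetical. The extendr-specific lines (`use extendr_api::prelude::*;`, `#[extendr]`, and `extendr_module!`) are commented out so the snippet compiles on its own; in a real package they would be active, and the types extendr accepts at the R boundary may differ from the plain slices used here.

```rust
// Sketch of src/rust/src/lib.rs (hypothetical contents).

// use extendr_api::prelude::*;

/// Rescale a numeric vector to the range [0, 1].
/// @export
// #[extendr]
fn rescale(xs: &[f64]) -> Vec<f64> {
    // Find the smallest and largest values in one pass each.
    let min = xs.iter().cloned().fold(f64::INFINITY, f64::min);
    let max = xs.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let range = max - min;

    // A constant vector has zero range; return zeros to avoid dividing by zero.
    if range == 0.0 {
        return vec![0.0; xs.len()];
    }

    xs.iter().map(|x| (x - min) / range).collect()
}

// Registers the exported functions with R.
// extendr_module! {
//     mod mypkg;   // hypothetical package name
//     fn rescale;
// }
```

After running rextendr::document(), a function registered this way would be callable from R like any other function in the package.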
…without knowing it.
{dialrs}: R package for parsing phone numbers
{arcgisgeocode}: R package that provides access to ArcGIS geocoding services
Typst: an alternative to LaTeX for creating PDFs with Quarto
Polars: a high-performance DataFrame library for Python
…
Rust is fast (and user-friendly-ish…)
R is a more user-friendly language, so an R wrapper means that users don’t have to learn Rust to get the speed increase.
{rextendr} also enables the inclusion of Rust code in R packages.
Similar approaches are also available in Python.
{rextendr} documentation: extendr.github.io/rextendr
The Rust Programming Language book: doc.rust-lang.org/book
See also github.com/dbdahl/cargo-framework.