library(ggplot2)
plot_df <- data.frame(x = LETTERS[1:4], y = 1:4)
ggplot(data = plot_df,
       mapping = aes(x = x, y = y, fill = x)) +
  geom_col()
Data Visualisation
The Art of Visualisation with ggplot2
Each chapter of The Art of Visualisation with ggplot2 covers a different data visualization, showing: the data exploration process; the choice of data visualization type; the initial design ideas with hand-drawn sketches; the first build of a plot; and the iterative process of styling plots. For each plot, full R code is provided and explained for each step of the creative process.
None of the visualizations you’ll see were created specifically for the book - each one is an original data visualization created using real TidyTuesday datasets over a series of years. Think of each chapter as a case study starting with a new data set you’ve never seen before, and working through a process to get to an insightful, artistic visualization.
The following charts are examples from Chapter 7 - Cats: data-driven annotations with ggtext, which begins by highlighting the use of data visualisation as a method of exploring and quality checking the data you have.
The book encourages starting out with no or low-code tools to sketch ideas.
After deciding on how the data will be visualised, the remainder of the chapter explains how to implement it using ggplot2 in R. It starts by creating a basic version of the chart using the default settings, before introducing the reader to the various extension packages and tricks that elevate the design.
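As a rough illustration of that workflow (and not code taken from the book), the sketch below starts with a default chart and then adds an annotation whose text is computed from the data. The data frame `cats` and its values are made up for this example, and the styling choices are assumptions; `element_markdown()` comes from the {ggtext} package.

```r
library(ggplot2)
library(ggtext)

# Made-up illustrative data (an assumption, not a TidyTuesday dataset)
cats <- data.frame(
  name = c("Tommy", "Athena", "Ares"),
  hours_tracked = c(120, 95, 60)
)

# Step 1: a basic version of the chart using the ggplot2 defaults
p_basic <- ggplot(cats, aes(x = hours_tracked, y = name)) +
  geom_col()

# Step 2: build the annotation text from the data itself,
# so it stays correct if the data changes
top_cat <- cats$name[which.max(cats$hours_tracked)]
subtitle_text <- paste0(
  "<span style='color:#D55E00;'>**", top_cat, "**</span>",
  " was tracked for the longest: ", max(cats$hours_tracked), " hours"
)

# Step 3: refine the design, rendering the subtitle as markdown/HTML
p_styled <- ggplot(cats, aes(x = hours_tracked, y = name,
                             fill = name == top_cat)) +
  geom_col(show.legend = FALSE) +
  scale_fill_manual(values = c("TRUE" = "#D55E00", "FALSE" = "grey70")) +
  labs(title = "Hours tracked per cat",
       subtitle = subtitle_text,
       x = "Hours tracked", y = NULL) +
  theme_minimal() +
  theme(plot.subtitle = element_markdown())
```

Printing `p_basic` and then `p_styled` shows the jump from the default chart to the annotated version, with the highlighted bar and the coloured name in the subtitle driven by the same value.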
Best Practices for Data Visualisation guidance for the Royal Statistical Society
Alongside Andreas Krause (Idorsia Pharmaceuticals) and Brian Tarran (Royal Statistical Society), I co-authored the Royal Statistical Society’s Best Practices for Data Visualisation guidance, published in July 2023. The guide covers both the art and the science of data visualisation. It is written primarily for contributors to Royal Statistical Society publications – chiefly, Significance magazine, the Journal of the Royal Statistical Society Series A, and Real World Data Science – but the information and advice within is of broad relevance and use to any data visualisation task. The guide can be viewed at rss.org.uk/datavisguide.
Developing an R package
The guide also contains technical instructions for authors on how to edit the settings of their visualisation software to better style their graphics. The guide covers examples of doing so in R and Python (with examples added by the community for other software), with R being used often (both base R and {ggplot2}). To help authors with styling their charts to fit in with the guidance in this document, we have developed an R package, {RSSthemes}.
This R package:
- Stores and loads font files
- Defines colour palettes with RSS colours
- Creates theme_*() and scale_*() functions for {ggplot2}
- Sets (and unsets) different base R defaults
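To give a feel for what these pieces can look like, here is a minimal sketch of a palette, a scale function, a theme function, and a base R default that is set and then restored. This is not the {RSSthemes} source code: the names `my_brand_colours`, `scale_fill_brand_d()`, and `theme_brand()` are hypothetical, and the colours are placeholders rather than the RSS palette.

```r
library(ggplot2)

# Placeholder colours (an assumption, NOT the actual RSS colours)
my_brand_colours <- c("#1B9E77", "#D95F02", "#7570B3", "#E7298A")

# Hypothetical discrete fill scale that wraps the palette
scale_fill_brand_d <- function(...) {
  scale_fill_manual(values = my_brand_colours, ...)
}

# Hypothetical theme function that builds on a built-in ggplot2 theme
theme_brand <- function(base_size = 12) {
  theme_minimal(base_size = base_size) +
    theme(
      legend.position = "top",
      plot.title = element_text(face = "bold"),
      panel.grid.minor = element_blank()
    )
}

# Set a base R default: palette() returns the previous palette,
# so it can be stored and restored later
old_palette <- palette(my_brand_colours)

# ... base R plotting with the new palette would go here ...

# Unset: restore the palette that was active before
palette(old_palette)
```

Returning the previous settings when a default is changed is what makes the "unset" step straightforward, since the caller can hand the stored value back later.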
Let’s see an example in action! Let’s say you have a simple data set that you want to use to make a bar chart with {ggplot2}.
You can restyle it, using the {RSSthemes} package and two lines of code:
library(RSSthemes)
ggplot(data = plot_df,
       mapping = aes(x = x, y = y, fill = x)) +
  geom_col() +
  scale_fill_rss_d(palette = "signif_qual") +
  theme_significance()
Before and after styling with {RSSthemes}
As of March 2024, the guide has been viewed by over 14,000 users, in 152 countries.
Significance magazine
Significance is a bi-monthly magazine aiming to communicate statistics and data-related stories in an accessible manner. I’m a member of the editorial board, specifically focusing on data visualisation.
Data visualisation challenges
Although I primarily use R for data visualisation, I also have experience of using Python, ObservableJS, D3.js, Tableau, rawgraphs, Inkscape, and Figma.
TidyTuesday
TidyTuesday is a weekly data challenge aimed at the R community. Every week a new dataset is posted alongside a chart or article related to that dataset, and participants are asked to explore the data. You can access the data and find out more here.
Below is a selection of my contributions to the TidyTuesday challenge. All contributions and the code used to generate them can be found on GitHub. You can also browse through all contributions using this dashboard.
Inspired by this New York Times visualisation
View the scrollytelling version.
Plotnine contest
Plotnine is a visualisation library that brings the Grammar of Graphics to Python, and the 2024 Plotnine Contest aims to bring the community together to create and share great plotnine examples. You can read the blog post about how I created this annotated area chart for my entry into the Plotnine contest. This visualisation was a runner-up in the contest.
#30DayChartChallenge
The #30DayChartChallenge is a data visualisation challenge organised by Cédric Scherer and Dominic Royé. Participants make one chart each day of the challenge, inspired by the daily prompt. The prompts are also split across 5 different categories. Check out the challenge at twitter.com/30DayChartChall. I used a range of tools to create my charts, including R, Python, Tableau, rawgraphs, Inkscape, and Figma.
#30DayMapChallenge
The #30DayMapChallenge is a data visualisation challenge organised by Topi Tjukanov. Participants make one map each day of the challenge, inspired by the daily prompt. Check out the challenge on GitHub, or see contributions on the official website.
Tableau Public
You can find examples of my visualisations created using Tableau on my Tableau Public profile.
Other challenges
Viz For Social Good (VFSG) volunteers create informative and impactful data visualizations for mission-driven organizations across the globe. Check out the organisation here.