More effective data visualisation

Dr Nicola Rennie

Welcome!

Who am I?

Data visualisation specialist.


Mainly working with R, Python, and D3.


Background in statistics, operational research, and data science consultancy.

NR logo

What to expect during this workshop

  • The workshop combines slides, examples, and discussions for you to participate in.

  • Ask questions throughout!

I hope you end up with more questions than answers after this workshop!


Stranger Things questions gif

Source: giphy.com

Workshop resources

Course website: nrennie.rbind.io/training-data-visualisation

Screenshot of course website

What’s this session about?

In this session we will cover…

  • why you should visualise data;

  • choosing a chart type;

  • some guidelines for making better charts;

  • examples of good and bad charts!

The role of visualisation

Why visualise data?

Data visualisation has two main purposes:

  • Exploratory data analysis and identifying data issues
  • Communicating insights and results

book shelf cartoon

Exploratory data visualisation

Statistic                Value
Mean(x)                  54.26
Mean(y)                  47.83
Standard deviation(x)    16.77
Standard deviation(y)    26.94
Correlation(x, y)        -0.06

Because summary statistics aren’t enough…
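A minimal sketch of this point, using Anscombe's quartet rather than the data behind the table above (a swapped-in, well-known example). The four small datasets share almost identical means, standard deviations, and correlations, yet look completely different when plotted. The helper name `pearson` is an illustrative choice.

```python
from statistics import mean, stdev

# Anscombe's quartet: sets I-III share the same x values, set IV has its own.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(x, y):
    """Pearson correlation coefficient, computed from first principles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


# One row of summary statistics per dataset: the rows are nearly identical.
summary = {
    name: (mean(x), mean(y), stdev(x), stdev(y), pearson(x, y))
    for name, (x, y) in quartet.items()
}
for name, stats in summary.items():
    print(name, [round(s, 2) for s in stats])
```

Only a scatter plot of each dataset reveals that one is roughly linear, one is curved, and two are dominated by a single outlying point.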

Communicating insights with data visualisation

Grab attention

Visualisations stand out. If a reader is short on time or uncertain about whether a document is of interest, an attention-grabbing visualisation may entice them to start reading.

Improve access to information

Textual descriptions can be lengthy and hard to read, and are frequently less precise than a visual depiction showing data points and axes.

Summarise content

Visual displays allow for summarising complex textual content, aiding the reader in memorising key points.

John Snow collected data on cholera deaths and created a visualisation where the number of deaths was represented by the height of a bar at the corresponding address in London.

This visualisation showed that the deaths clustered around Broad Street, which helped identify the source of the outbreak: the Broad Street water pump.

Snow. 1854.

John Snow cholera map

Choosing a chart type

What are you trying to communicate?

Data visualisations must serve a purpose.

Ask yourself:

  • What is the purpose?
  • Does the visualisation support the purpose?
  • Is it quick, accurate, and intuitive?

Common relationships

  • Correlation: The relationship between two variables.

  • Deviation: The difference between a value and an average or another value.

  • Distribution: How data values are spread for a variable.

  • Geography: The pattern of data across different locations or areas.

  • Magnitude: The size of values.

  • Parts of a whole: The relative sizes of components within a whole.

  • Ranking: The position of data within a hierarchy or scale.

  • Time: How a value changes over time.

Why do pie charts have a bad reputation?

Why do 3D charts have a bad reputation?

On the plot on the left, how tall is the bar?

Two 3D bar charts

Choosing a chart type

Screenshot of FT chart type poster

Data visualisation best practices

Elements of charts

  • Layout
  • Aspect ratio
  • Lines
  • Points
  • Colours
  • Axes
  • Symbols
  • Legends
  • Orientation
  • Auxiliary elements
  • Dimensionality

Layouts, aspect ratios, and axes

Longer labels are best on the y-axis, horizontally.

Should the axes start at 0?

They don’t always have to start at zero…

Order categories appropriately…

Badly ordered chart of covid cases

Source: Georgia Department of Public Health

Default:

Magnitude ordered:

Naturally ordered:
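The three orderings can be sketched in a few lines of Python. The survey counts below are invented purely for illustration, and "default" is assumed to mean alphabetical order, which is what many plotting libraries do with text categories.

```python
# Hypothetical survey counts, invented for illustration only.
counts = {
    "Agree": 42,
    "Disagree": 17,
    "Neutral": 25,
    "Strongly agree": 9,
    "Strongly disagree": 7,
}

# Default: alphabetical order, which says nothing about the data.
default_order = sorted(counts)

# Magnitude ordered: largest value first, useful for nominal categories.
magnitude_order = sorted(counts, key=counts.get, reverse=True)

# Naturally ordered: follow the inherent order of the response scale.
scale = ["Strongly disagree", "Disagree", "Neutral", "Agree", "Strongly agree"]
natural_order = [category for category in scale if category in counts]

for label, order in [
    ("Default", default_order),
    ("Magnitude", magnitude_order),
    ("Natural", natural_order),
]:
    print(f"{label}: {order}")
```

For categories with an inherent order, such as a response scale, age bands, or months, the natural order is usually the better choice; magnitude ordering suits categories with no inherent order.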

Plotting multiple values

Avoid spaghetti plots!

Effectively plotting multiple values

Alternatives to spaghetti:

  • Show a smaller number of lines (e.g. compare a few countries to the average)
  • Use facets (AKA small multiples)
  • Use colour only to highlight lines
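As a rough sketch of the last alternative, the code below draws every line in grey and reserves colour for the lines of interest. The function names, colours, and data are assumptions made for illustration; the plotting step assumes matplotlib is installed.

```python
def line_styles(names, highlight, accent="#d95f02", muted="#bdbdbd"):
    """Return matplotlib keyword arguments for each line.

    Highlighted lines get the accent colour, a thicker stroke, and a higher
    z-order so they are drawn on top; all other lines fade into grey.
    """
    styles = {}
    for name in names:
        if name in highlight:
            styles[name] = {"color": accent, "linewidth": 2.5, "zorder": 3}
        else:
            styles[name] = {"color": muted, "linewidth": 1.0, "zorder": 1}
    return styles


def plot_highlighted(series, highlight):
    """Draw every series, using colour only for the highlighted ones."""
    # Imported here so that line_styles() can be used without matplotlib.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for name, style in line_styles(series, highlight).items():
        ax.plot(series[name], label=name, **style)
    return fig, ax


# Hypothetical yearly values for five countries, invented for illustration.
series = {
    "A": [1, 2, 3, 4],
    "B": [2, 2, 3, 3],
    "C": [1, 3, 5, 8],
    "D": [3, 3, 2, 2],
    "E": [2, 3, 3, 4],
}
styles = line_styles(series, highlight={"C"})
```

Calling `plot_highlighted(series, {"C"})` would then show country C in colour against a grey background of the other four.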

Plotting multiple variables

Effectively plotting multiple variables

Some alternatives:

  • Separate plots, each with their own axis, and place the plots side-by-side.
  • Plot different variables on the x- and y-axes.
  • Rescale the variables, rather than the axis.
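One way to read the last bullet is to index each variable to a common base, so that two variables in very different units can share a single axis. This is a sketch under that assumption; the function name `index_to_base` and the figures are hypothetical.

```python
def index_to_base(values, base=100):
    """Rescale a series so that its first observation equals `base`."""
    first = values[0]
    if first == 0:
        raise ValueError("cannot index a series whose first value is zero")
    return [base * v / first for v in values]


# Hypothetical yearly figures in very different units, invented for illustration.
house_price = [200_000, 210_000, 230_000, 260_000]
earnings = [25_000, 25_500, 26_000, 27_000]

# Both series now start at 100 and show relative change on the same scale.
price_index = index_to_base(house_price)
earnings_index = index_to_base(earnings)
print(price_index)
print(earnings_index)
```

Both indexed series can then be drawn on one shared y-axis, where the gap between the lines reflects a difference in relative growth rather than an arbitrary choice of axis limits.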

Lines aren’t always appropriate

  • Lines suggest an order
  • Lines suggest continuity

Colours

Why use colours in data visualisation?

  • Colours should serve a purpose, e.g. distinguishing groups of data.

  • Colours can highlight or emphasise parts of your data.

  • Colour is not always the most effective choice, e.g. for communicating differences between variables.

Different types of colour palettes for different types of data.

Examples of sequential, diverging, and qualitative palettes
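A rough rule of thumb for matching palette type to data type can be written as a small helper. The function name and its parameters are hypothetical; the colormap names are existing matplotlib colormaps, used here only as one example of each type.

```python
# Matplotlib colormap names, used as one example of each palette type.
PALETTES = {
    "sequential": "viridis",
    "diverging": "RdBu",
    "qualitative": "tab10",
}


def palette_type(ordered, has_midpoint=False):
    """Suggest a palette type for a variable.

    ordered:      True if the values run from low to high.
    has_midpoint: True if there is a meaningful centre (e.g. zero or an
                  average) that values deviate from in both directions.
    """
    if not ordered:
        return "qualitative"
    return "diverging" if has_midpoint else "sequential"


# Illustrative variables: categories, a quantity, and a deviation from zero.
species = palette_type(ordered=False)
rainfall = palette_type(ordered=True)
temperature_anomaly = palette_type(ordered=True, has_midpoint=True)
print(species, rainfall, temperature_anomaly)
```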

Is this a good choice of colours?

Are intuitive colours always best?

Example: red and blue used to show hot and cold

Tip: never switch to the opposite meaning!

Are intuitive colours always best?

Example: pink and blue used to show women and men

Tip: think about colour associations.

Colours

Screenshot of colorbrewer website

Legends

  • Should not use up valuable space for data
  • May be integrated into the figure e.g. direct labels
  • Should follow the order of the data
  • Should not be the only way to match colour to categories
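Direct labelling can be sketched as follows: each line is labelled at its final point, in the line's own colour, so no separate legend is needed. The function names and data are assumptions for illustration, and the plotting step assumes matplotlib is installed.

```python
def end_labels(series):
    """Return (x, y, name) for the last point of each series, top to bottom."""
    labels = [(len(ys) - 1, ys[-1], name) for name, ys in series.items()]
    return sorted(labels, key=lambda label: label[1], reverse=True)


def plot_with_direct_labels(series):
    """Plot each series and label it at its final point instead of a legend."""
    # Imported here so that end_labels() can be used without matplotlib.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for x, y, name in end_labels(series):
        (line,) = ax.plot(series[name])
        # Place the label just to the right of the last point, in the line's colour.
        ax.annotate(
            name,
            xy=(x, y),
            xytext=(5, 0),
            textcoords="offset points",
            va="center",
            color=line.get_color(),
        )
    return fig, ax


# Hypothetical values for three regions, invented for illustration.
series = {"North": [1, 3, 5], "South": [2, 2, 4], "East": [0, 4, 7]}
labels = end_labels(series)
print(labels)
```

Because the label text sits next to the line it describes, colour is no longer the only way to match a line to its category.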

Legends: direct labels

Legends: ordering

What do you think about this chart?

In groups, discuss the following chart. What is good and bad about it?


Source: commonslibrary.parliament.uk/general-election-2019-how-many-women-were-elected available under Open Parliament Licence.

Bar chart

Discussion

Key points

  • Charts should have a purpose, and the chart type should support that purpose.

  • Actively design visualisations with your audience in mind.

  • Every rule should be broken for some visualisations.

Good charts don’t have to be boring!

Stacked diverging bar chart of lego colours

Cara Thompson (cararthompson.com)

Small multiples area charts of college basketball

Cedric Scherer (cedricscherer.com)

Supreme court justice chart

Tanya Shapiro (tanyaviz.com)

Sloped area chart

Dan Oehm (gradientdescending.com)
