Using Color

Rei Sanchez-Arias, Ph.D.

Using colors to distinguish, represent data, and highlight

1 / 50

Using Color

2 / 50

Using Color

Color is an important tool in designing visualizations. It allows you to encode another variable or split the data into groups.

However, there are a lot of issues you need to consider when choosing which colors to use and how to apply them.

Because color carries so much weight in a visualization, it has to be used appropriately to have the largest impact.

3 / 50

Power of Color

Consider the following plot of GDP vs. life expectancy:

GDP per capita vs life expectancy across countries in 2010.
You see a general upward trend.

4 / 50

Power of Color (2)

GDP vs life expectancy, now colored by region. You can easily see where countries in different regions group together.

5 / 50

Power of Color (3)

Adding color provided much more information:

  • By coloring each country by the region it belongs to, we can see how the regions of the world are distributed on this plot.

  • The low-GDP countries are almost all in sub-Saharan Africa, and on the high end you see Europe and North America.

  • The orange in between is the Commonwealth of Independent States (former Soviet states).

6 / 50

Color Palettes

7 / 50

Color Palettes

A color palette is the range of colors used to encode the data values.

You will have different sorts of palettes for quantitative and qualitative data.

Choosing the correct palette is extremely important.

8 / 50

A refreshed color palette for charts in R 4.0.0

https://blog.revolutionanalytics.com/2020/04/r-400-is-released.html

Jet

The default color palette in visualization software such as MATLAB and Python's matplotlib library used to be jet (fortunately, both have updated to new palettes).

You might also know this as the rainbow palette. The jet palette goes from a dark blue to a dark red, shifting to green and yellow along the way.

A Better Default Colormap for Matplotlib | SciPy 2015 | Nathaniel Smith and Stéfan van der Walt introducing viridis

We show the palette below both in color and converted to gray-scale to show the luminance along the spectrum.

9 / 50

Jet (2)

The jet/rainbow palette is flawed because the luminance does not transition smoothly from one end to the other. The yellow is much brighter than the rest of the colors which can make some data seem more important than it really is.

not a smooth gradient

10 / 50

Jet (3)

You can see that the short cyan and yellow regions are much stronger perceptually than the other regions. Data that falls into these regions will be overemphasized.

Looking at the gray-scale versions, it's apparent that the cyan and yellow regions pop out due to their greater luminance compared to the rest of the palette. This happens due to the way our brains perceive color.

Our eyes are more sensitive to green than red, and more sensitive to red than blue.
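
The uneven luminance can be checked numerically. The following is a minimal base R sketch under two assumptions: the nine anchor colors are a common approximation of jet (not the exact MATLAB definition), and the Rec. 601 luma weights are used as a rough stand-in for perceived brightness.

```r
# Approximate jet with nine anchor colors (assumption: a common approximation,
# not the exact MATLAB colormap).
jet_cols <- colorRampPalette(c("#00007F", "blue", "#007FFF", "cyan", "#7FFF7F",
                               "yellow", "#FF7F00", "red", "#7F0000"))(9)

# A simple sequential ramp for comparison.
seq_cols <- colorRampPalette(c("white", "darkred"))(9)

# Rec. 601 luma: green is weighted most, then red, then blue.
luma <- function(cols) {
  as.vector(c(0.299, 0.587, 0.114) %*% col2rgb(cols))
}

luma_jet <- luma(jet_cols)
luma_seq <- luma(seq_cols)

# Jet rises to a peak at yellow and then falls again;
# the sequential ramp changes in one direction only.
round(luma_jet)
round(luma_seq)
```

The weights in `luma()` reflect the ordering stated above: green contributes most to brightness, then red, then blue.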

11 / 50

Perceptual Importance

You can see how jet distorts perceptual importance in the example below.

The yellow and cyan regions are much brighter and more eye-catching than the red and blue regions, which are actually the interesting (extreme) parts.

Our brains are interpreting higher luminance as more important. You can see this in the gray-scale versions where there are luminance spikes at the edges where it's colored yellow and cyan.

12 / 50

Diverging Palette

Instead, we could use a palette whose luminance changes linearly toward each extreme, with a smooth transition between them. Here we use a diverging palette going from red to light yellow to green.

13 / 50

Diverging Palette (2)

With this palette, the transitions between the bands, and between the positive and negative regions, are smooth, and the red and green regions have equal luminance.

14 / 50

Sequential Palettes

There are two basic types of linear luminance palettes: sequential and diverging.

Sequential palettes have a smooth transition from light to dark or dark to light. These are great for continuous data that is all positive, so low values are light and high values are dark (or the other way around if you prefer).

Here's an example of a sequential palette going from light to dark red:

You can also use palettes that shift hues as well as luminance:

Sequential palettes applied to two Gaussian blobs

15 / 50

Diverging Palettes

When you work with data that has some breakpoint, values that go from negative to positive for example, it is typically best to use a diverging palette.

Diverging palettes transition from one color to another, passing through a light (or dark) color with the luminance shifting linearly through the palette.

16 / 50

Diverging Palettes (2)

Here is what it looks like applied to the same Gaussians as before, but one is negative. The jet palette is also included so you can see how the cyan and yellow make rings around the blobs, while in the other palettes it is a smooth transition between them.
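
A comparable picture can be sketched in base R. This is an illustration under assumptions, not the code behind the slide: the grid, the blob positions, and the "Blue-Red" palette from `hcl.colors()` (R 3.6.0 or later) are all arbitrary choices.

```r
# Two Gaussian blobs on a grid, one positive and one negative.
x <- seq(-3, 3, length.out = 101)
y <- seq(-2, 2, length.out = 81)
blob <- function(cx) outer(x, y, function(x, y) exp(-((x - cx)^2 + y^2)))
z <- blob(1) - blob(-1)

# Diverging palette with an odd number of colors so the middle one is neutral.
div_cols <- hcl.colors(11, "Blue-Red")

# Symmetric limits keep the neutral midpoint of the palette at zero.
lim <- max(abs(z))
image(x, y, z, col = div_cols, zlim = c(-lim, lim),
      main = "Diverging palette centered at zero")
```

Setting `zlim` symmetrically matters: without it, the neutral color lands at the middle of the data range rather than at the breakpoint.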


17 / 50

Palettes for Qualitative Data

For qualitative data, you are often comparing data from different groups or categories. For this you need to choose colors that are as visually separate as possible.

"I want hue" is a great tool for building optimally distinct palettes.


We show a scatter plot of the sepal lengths and petal widths of a sample of irises.
By coloring the species differently, you can easily see the data falls into three fairly distinct clusters.
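
One simple way to get visually separate colors is to space hues evenly while holding chroma and luminance fixed. The base R sketch below does this for the iris scatter plot; the chroma and luminance values are assumptions, chosen to resemble what ggplot2's default discrete hue scale is believed to use.

```r
# Evenly spaced hues at fixed chroma and luminance, so no single color dominates.
# Assumption: c = 100 and l = 65 are illustrative settings.
n_groups <- nlevels(iris$Species)
hues <- seq(15, 375, length.out = n_groups + 1)[seq_len(n_groups)]
species_cols <- hcl(h = hues, c = 100, l = 65)
names(species_cols) <- levels(iris$Species)

# Sepal length vs. petal width, one color per species.
plot(iris$Sepal.Length, iris$Petal.Width,
     col = species_cols[as.character(iris$Species)], pch = 19,
     xlab = "Sepal length", ylab = "Petal width")
legend("topleft", legend = names(species_cols), col = species_cols, pch = 19)
```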

18 / 50

Color Contrast

Check: color.adobe.com

Triad
Monochromatic
Complementary
Split complementary
19 / 50

Perceptually uniform colors

Typical palettes
Traditional palettes vs. viridis
Deuteranopic palettes
Traditional palettes vs. viridis as seen with deuteranopia
20 / 50

Perceptually uniform colors example

Florida counties filled by area, jet (rainbow) palette (not good)
Florida counties filled by area, viridis::viridis palette
Florida counties filled by area, viridis::inferno palette

Built using the albersusa, tidycensus and sf packages
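
The viridis family is also available directly in ggplot2. The sketch below is not the Florida map from the slide; it uses the `faithfuld` dataset that ships with ggplot2 as a stand-in, and the name `p_viridis` is only illustrative.

```r
library(ggplot2)

# faithfuld ships with ggplot2: a 2D density estimate of Old Faithful eruptions.
p_viridis <- ggplot(faithfuld,
                    aes(x = waiting, y = eruptions, fill = density)) +
  geom_raster() +
  # Other options include "viridis", "magma", and "plasma".
  scale_fill_viridis_c(option = "inferno") +
  labs(fill = "Density")

p_viridis
```

For a discrete variable, `scale_fill_viridis_d()` and `scale_color_viridis_d()` play the same role.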

21 / 50

Color Blindness

Humans perceive color through signals produced by cells in the retina called cones.

Light comes into the eye, hits the cones, and the cones send off electrical signals to the brain. There are (typically) three types of cones: short (S), medium (M), and long (L). They are sensitive to different frequencies (colors) of light.

22 / 50

Color Blindness (2)

Short cones prefer blue, medium prefer green, and long prefer red.

  • Roughly 8% of men and 0.5% of women have mutations that affect these cones and produce what is known as color blindness.

  • The most common form is red-green color blindness, typically caused by the medium cones shifting sensitivity towards red light, a mutation called deuteranomaly.

23 / 50

Color Blindness (3)

People with deuteranomaly have difficulty distinguishing between red and green, as shown in this image of a red and a green apple.

The bottom row is how you would see these two apples if you had red-green color blindness.

24 / 50

Color Scales

25 / 50

Color as a Tool to Distinguish

From: "Fundamentals of Data Visualization" by Claus O. Wilke

  • We use color to distinguish discrete items or groups that do not have an intrinsic order, such as different countries on a map or different manufacturers of a certain product.

  • In this case, we use a qualitative color scale. Such a scale contains a finite set of specific colors that are chosen to look clearly distinct from each other while also being equivalent to each other. The second condition requires that no one color should stand out relative to the others.

  • The colors should not create the impression of an order, as would be the case with a sequence of colors that get successively lighter.

26 / 50

Color as a Tool to Distinguish (2)

From: "Fundamentals of Data Visualization" by Claus O. Wilke

27 / 50

Color to Represent Data Values

Color can also be used to represent data values, such as income, temperature, or speed. In this case, we use a sequential color scale.

Such a scale contains a sequence of colors that clearly indicate

  • which values are larger or smaller than which other ones

  • how distant two specific values are from each other (the color scale needs to vary uniformly across its entire range).

Sequential scales can be based on a single hue (e.g., from dark blue to light blue) or on multiple hues (e.g., from dark red to light yellow).
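
Both kinds of sequential scale can be generated in base R with `hcl.colors()` (R 3.6.0 or later). The palette names below are just two of the built-in options, picked for illustration.

```r
# Single-hue sequential scale (blues).
single_hue <- hcl.colors(7, "Blues 3")

# Multi-hue sequential scale (yellow through orange to red).
multi_hue <- hcl.colors(7, "YlOrRd")

# Draw both as rows of swatches: bottom row single hue, top row multi hue.
old_par <- par(mar = c(1, 6, 1, 1))
image(x = 1:7, y = 1:2, z = matrix(1:14, nrow = 7),
      col = c(single_hue, multi_hue), axes = FALSE, xlab = "", ylab = "")
axis(2, at = 1:2, labels = c("single hue", "multi hue"), las = 1, tick = FALSE)
par(old_par)
```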

28 / 50

Color to Represent Data Values (2)

The ColorBrewer Blues scale is a monochromatic scale that varies from dark to light blue. The Heat and Viridis scales are multi-hue scales that vary from dark red to light yellow and from dark blue via green to light yellow, respectively.

29 / 50

Color to Represent Data Values (3)

Representing data values as colors is particularly useful when we want to show how the data values vary across geographic regions. We can draw a map of the geographic regions and color them by the data values. Such maps are called choropleths.

30 / 50

Diverging Scales

Diverging scales can be thought of as two sequential scales stitched together at a common midpoint color. Common color choices for diverging scales include brown to greenish blue, pink to yellow-green, and blue to red.

31 / 50

Color as a Tool to Highlight

  • Color can also be an effective tool to highlight specific elements in the data. There may be specific categories or values in the dataset that carry key information about the story we want to tell, and we can strengthen the story by emphasizing the relevant figure elements to the reader.

  • An easy way to achieve this emphasis is to color these figure elements in a color or set of colors that vividly stand out against the rest of the figure.

  • This effect can be achieved with accent color scales, which are color scales that contain both a set of subdued colors and a matching set of stronger, darker, and/or more saturated colors.

32 / 50

Color as a Tool to Highlight (2)

When working with accent colors, it is critical that the baseline colors do not compete for attention.
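
A minimal ggplot2 sketch of this idea, using the iris data: the baseline groups share one subdued gray and a single group gets a saturated accent. The choice of `virginica` as the highlighted group and the specific colors are arbitrary assumptions.

```r
library(ggplot2)

# Subdued gray for the baseline groups, one saturated accent for the focus group.
# The choice of "virginica" as the group to highlight is arbitrary.
highlight_cols <- c(setosa = "gray75", versicolor = "gray75",
                    virginica = "#D55E00")

p_highlight <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width,
                                color = Species)) +
  geom_point(size = 2) +
  scale_color_manual(values = highlight_cols) +
  theme_minimal()

p_highlight
```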

33 / 50

Practical Rules and Guidelines

34 / 50

Some Guidelines

Stephen Few's article "Practical Rules for Using Color in Charts" gives the following guidelines:

  • If you want different objects of the same color in a table or graph to look the same, make sure that the background is consistent.

  • If you want objects in a table or graph to be easily seen, use a background color that contrasts sufficiently with the object.

  • Use color only when needed to serve a particular communication goal.

  • Use different colors only when they correspond to differences of meaning in the data.

35 / 50

Some Guidelines (2)

  • Use soft, natural colors to display most information and bright and/or dark colors to highlight information that requires greater attention.

  • When using color to encode a sequential range of quantitative values, stick with a single hue (or a small set of closely related hues) and vary intensity from pale colors for low values to increasingly darker and brighter colors for high values.

  • Non-data components of tables and graphs should be displayed just visibly enough to perform their role, but no more so, for excessive salience could cause them to distract attention from the data.

  • To guarantee that most people who are colorblind can distinguish groups of data that are color coded, avoid using a combination of red and green in the same display.

36 / 50

Colors in ggplot2

37 / 50

Use Color to Your Advantage

From: "Data Visualization" by Kieran Healy


Choose a color palette based on its ability to express the data you are plotting.

  • An unordered categorical variable like Country or Sex requires distinct colors that will not be easily confused with one another.

  • An ordered categorical variable like Level of Education requires a graded color scheme of some kind running from less to more or earlier to later.

  • If your variable is ordered, is your scale centered on a neutral midpoint with departures to extremes in each direction, as in a Likert scale?

38 / 50

Use Color to Your Advantage (2)

In general, the default color palettes that ggplot makes available are well-chosen for their perceptual properties and aesthetic qualities. We can also use color and color layers as a device for emphasis, to highlight particular data points or parts of the plot, perhaps in conjunction with other features.

We choose color palettes for mappings through one of the scale_ functions for color or fill.

You can use the RColorBrewer package to make a wide range of named color palettes available to you, and choose from those. When used in conjunction with ggplot, you access these colors by specifying the scale_color_brewer() or scale_fill_brewer() functions, depending on the aesthetic you are mapping.

39 / 50

RColorBrewer

Diverging palettes

Sequential palettes

Qualitative palettes
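
The three families can also be listed and inspected from the console. This sketch uses the RColorBrewer package directly; the variable names are illustrative.

```r
library(RColorBrewer)

# brewer.pal.info has one row per palette, with its category and size.
qual_names <- rownames(subset(brewer.pal.info, category == "qual"))
seq_names  <- rownames(subset(brewer.pal.info, category == "seq"))
div_names  <- rownames(subset(brewer.pal.info, category == "div"))

# Hex codes for a specific palette, here five colors from "Set2".
set2 <- brewer.pal(5, "Set2")

# Swatches for every palette of one type.
display.brewer.all(type = "div")
```

The `colorblind` column of `brewer.pal.info` flags which palettes are considered colorblind-friendly.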

40 / 50

RColorBrewer Example

library(tidyverse)

# Base plot: sepal dimensions, colored by species
p <- ggplot(data = iris,
            mapping = aes(x = Sepal.Length, y = Sepal.Width,
                          color = Species))

41 / 50

RColorBrewer Example (2)

p +
  geom_point(size = 2) +
  scale_color_brewer(palette = "Set2") +
  theme(legend.position = "top")



Available qualitative palettes:

Accent, Dark2, Paired, Pastel1, Pastel2, Set1, Set2, Set3

42 / 50

RColorBrewer Example (3)

p +
  geom_point(size = 2) +
  scale_color_brewer(palette = "Pastel2") +
  theme(legend.position = "top")

43 / 50

RColorBrewer Example (4)

p +
  geom_point(size = 2) +
  scale_color_brewer(palette = "Dark2") +
  theme(legend.position = "top")

44 / 50

Setting Colors Manually

You can also specify colors manually, via scale_color_manual() or scale_fill_manual(). These functions take a values argument that can be specified as a vector of color names or color values that R knows about.

R knows many color names (like red, and green, and cornflowerblue). Try running demo('colors') in the console for an overview.

Alternatively, color values can be specified via their hexadecimal RGB value.

45 / 50

Hexadecimal RGB value

A hexadecimal RGB value encodes a color in the RGB color space, where each channel can take a value from 0 to 255. A color hex value begins with a hash or pound character, #, followed by three pairs of hexadecimal or "hex" numbers.

Hex values are in Base 16, with the first six letters of the alphabet standing for the numbers 10 to 15. This allows a two-character hex number to range from 0 to 255.

You read them as #rrggbb, where rr is the two-digit hex code for the red channel, gg for the green channel, and bb for the blue channel. So #CC55DD translates in decimal to CC = 204 (red), 55 = 85 (green), and DD = 221 (blue).
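
The same conversion can be checked in base R, in both directions:

```r
# Hex pair to decimal, one channel at a time.
strtoi(c("CC", "55", "DD"), base = 16L)   # 204 85 221

# All three channels at once (named red, green, blue).
channels <- col2rgb("#CC55DD")[, 1]

# And back from decimal channels to a hex string.
hex <- rgb(204, 85, 221, maxColorValue = 255)   # "#CC55DD"

# Named colors go through the same conversion.
col2rgb("cornflowerblue")
```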

46 / 50

Custom Palette

Introduce a palette that is friendly to color-blind viewers:

cb_palette <- c("#999999", "#E69F00",
                "#56B4E9", "#009E73",
                "#F0E442", "#0072B2",
                "#D55E00", "#CC79A7")

p +
  geom_point(size = 2) +
  scale_color_manual(values = cb_palette) +
  theme(legend.position = "top")
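
As a cross-check, these hex values appear to match the Okabe-Ito palette that base R has shipped since version 4.0.0 (with gray in place of black). The sketch below compares the two and also shows a named vector, which ties each color to a specific factor level; the particular color-to-species assignment is an arbitrary assumption.

```r
# The palette from the slide.
cb_palette <- c("#999999", "#E69F00", "#56B4E9", "#009E73",
                "#F0E442", "#0072B2", "#D55E00", "#CC79A7")

# Base R's built-in Okabe-Ito palette (requires R >= 4.0.0).
okabe_ito <- palette.colors(palette = "Okabe-Ito")

# Check how the two relate, ignoring letter case and names.
toupper(cb_palette) %in% toupper(unname(okabe_ito))

# Naming the values ties each color to a factor level, whatever the level order.
# Assumption: this particular assignment is only an example.
species_cols <- c(setosa = "#E69F00", versicolor = "#56B4E9",
                  virginica = "#009E73")
```

Passing `species_cols` to `scale_color_manual(values = species_cols)` matches colors to species by name rather than by position.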

47 / 50

Other Good References

5 tips on designing colorblind-friendly visualizations

https://www.tableau.com/about/blog/2016/4/examining-data-viz-rules-dont-use-red-green-together-53463

R Graphics Cookbook, 2nd edition, Winston Chang

https://r-graphics.org/

48 / 50

Resources

49 / 50

Resources (2)

  • Colours Cafe: Instagram account for color inspiration

  • Vischeck: Simulate how your images look for people with different forms of colorblindness (web-based)

  • Coolors.co: hundreds of color palettes and beautiful color schemes.

scales::show_col(c("#532D8E", "#B1B3B5", "#5CB8B2", "#AF95D3",
                   "#202C61", "#FFE9E9", "#EBC1F4", "#AC62D2"), ncol = 8)

50 / 50
