5 Multi-part Plots

5.1 Interaction plots

Interaction plots are commonly used to help display or interpret a factorial design. Just as with the bar chart of means, interaction plots represent data summaries and so they are built up with a series of calls to stat_summary().

shape acts much like fill in previous plots, except that rather than producing different colour fills for each level of the IV, the data points are given different shapes.
size lets you change the size of lines and points. If you want different groups to be different sizes (for example, the sample size of each study when showing the results of a meta-analysis or population of a city on a map), set this inside the aes() function; if you want to change the size for all groups, set it inside the relevant geom_*() function.
scale_color_manual() works much like scale_color_discrete() except that it lets you specify the colour values manually, instead of them being automatically applied based on the palette. You can specify RGB colour values or a list of predefined colour names -- all available options can be found by running colours() in the console. Other manual scales are also available, for example, scale_fill_manual().

ggplot(dat_long, aes(x = condition, y = rt, 
                     shape = language,
                     group = language,
                     color = language)) +
  stat_summary(fun = "mean", geom = "point", size = 3) +
  stat_summary(fun = "mean", geom = "line") +
  stat_summary(fun.data = "mean_se", geom = "errorbar", width = .2) +
  scale_color_manual(values = c("blue", "darkorange")) +
  theme_classic()

Figure 5.1: Interaction plot.

You can use redundant aesthetics, such as indicating the language groups using both colour and shape, in order to increase accessibility for colourblind readers or when images are printed in greyscale.
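The difference between setting size inside and outside aes() can be sketched as follows. The studies data frame and its values are invented purely for illustration and are not part of the tutorial dataset.

```r
library(ggplot2)

# Hypothetical meta-analysis data: one row per study (values invented)
studies <- data.frame(
  study  = c("A", "B", "C", "D"),
  effect = c(0.20, 0.35, 0.10, 0.28),
  n      = c(40, 120, 60, 200)
)

# size inside aes(): point size is mapped to each study's sample size
p_mapped <- ggplot(studies, aes(x = study, y = effect)) +
  geom_point(aes(size = n))

# size outside aes(): every point is drawn at the same fixed size
p_fixed <- ggplot(studies, aes(x = study, y = effect)) +
  geom_point(size = 3)

p_mapped
p_fixed
```

Because size is mapped to a variable in the first plot, ggplot2 adds a size legend; the second plot has no legend because size is a constant.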

5.2 Combined interaction plots

A more complex interaction plot can be produced that takes advantage of the layers to visualise not only the overall interaction, but the change across conditions for each participant.

This code is more complex than all prior code because it does not use a universal mapping of the plot aesthetics. In our code so far, the aesthetic mapping (aes) of the plot has been specified in the first line of code because all layers used the same mapping. However, it is also possible for each layer to use a different mapping -- we encourage you to build up the plot by running each line of code sequentially to see how it all combines.

The first call to ggplot() sets up the default mappings of the plot that will be used unless otherwise specified - the x, y and group variable. Note the addition of shape, which will vary the shape of the geom according to the language variable.
geom_point() overrides the default mapping by setting its own colour to draw the data points from each language group in a different colour. alpha is set to a low value to aid readability.
Similarly, geom_line() overrides the default grouping variable so that a line is drawn to connect the individual data points for each participant (group = id) rather than each language group, and also sets the colours.
Finally, the calls to stat_summary() remain largely as they were, with the exception of setting colour = "black" and size = 2 so that the overall means and error bars can be more easily distinguished from the individual data points. Because they do not specify an individual mapping, they use the defaults (e.g., the lines are connected by language group).

ggplot(dat_long, aes(x = condition, y = rt, 
                     group = language, shape = language)) +
  # adds raw data points in each condition
  geom_point(aes(colour = language),alpha = .2) +
  # add lines to connect each participant's data points across conditions
  geom_line(aes(group = id, colour = language), alpha = .2) +
  # add data points representing cell means
  stat_summary(fun = "mean", geom = "point", size = 2, colour = "black") +
  # add lines connecting cell means by condition
  stat_summary(fun = "mean", geom = "line", colour = "black") +
  # add errorbars to cell means
  stat_summary(fun.data = "mean_se", geom = "errorbar", 
               width = .2, colour = "black") +
  # change colours and theme
  scale_color_brewer(palette = "Dark2") +
  theme_minimal()

Figure 5.2: Interaction plot with by-participant data.
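If you do not have dat_long to hand, the layer-by-layer build-up can be tried on simulated data. In this rough sketch, dat_sim, the condition levels "word" and "nonword", and all values are assumptions made for illustration; only the column names follow the code above.

```r
library(ggplot2)
set.seed(1)

# Hypothetical stand-in for dat_long: 20 participants, 2 conditions each
dat_sim <- data.frame(
  id        = rep(1:20, each = 2),
  language  = rep(c("monolingual", "bilingual"), each = 20),
  condition = rep(c("word", "nonword"), times = 20),
  rt        = rnorm(40, mean = 400, sd = 30)
)

# Default mappings shared by every layer unless a layer overrides them
base <- ggplot(dat_sim, aes(x = condition, y = rt,
                            group = language, shape = language))

# Layer 1 adds its own colour mapping on top of the defaults
step1 <- base + geom_point(aes(colour = language), alpha = .2)

# Layer 2 overrides the default group so lines connect each participant
step2 <- step1 + geom_line(aes(group = id, colour = language), alpha = .2)

# Layer 3 has no mapping of its own, so it uses group = language
step3 <- step2 + stat_summary(fun = "mean", geom = "line", colour = "black")

# Print step1, step2 and step3 in turn to see each layer being added
step3
```

Saving each intermediate step to its own object makes it easy to compare the plots and see which layer is responsible for which visual element.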

5.3 Facets

So far we have produced single plots that display all the desired variables. However, there are situations in which it may be useful to create separate plots for each level of a variable. This can also help with accessibility when used instead of or in addition to group colours. The below code is an adaptation of the code used to produce the grouped scatterplot (see Figure 4.8) in which it may be easier to see how the relationship changes when the data are not overlaid.

Rather than using colour = condition to produce different colours for each level of condition, this variable is instead passed to facet_wrap().
Set the number of rows with nrow or the number of columns with ncol. If you don't specify this, facet_wrap() will make a best guess.

ggplot(dat_long, aes(x = rt, y = age)) +
  geom_point() +
  geom_smooth(method = "lm") +
  facet_wrap(facets = vars(condition), nrow = 2)

Figure 5.3: Faceted scatterplot

As another example, we can use facet_wrap() as an alternative to the grouped violin-boxplot (see Figure 4.9) in which the variable language is passed to facet_wrap() rather than fill. Using the tilde (~) to specify which factor is faceted is an alternative to using facets = vars(factor) like above. You may find it helpful to translate ~ as by, e.g., facet the plot by language.

ggplot(dat_long, aes(x = condition, y= rt)) +
  geom_violin() +
  geom_boxplot(width = .2, fatten = NULL) +
  stat_summary(fun = "mean", geom = "point") +
  stat_summary(fun.data = "mean_se", geom = "errorbar", width = .1) +
  facet_wrap(~language) +
  theme_minimal()

Figure 5.4: Faceted violin-boxplot

Finally, note that one way to edit the labels for faceted variables involves converting the language column into a factor. This allows you to set the order of the levels and the labels to display.

ggplot(dat_long, aes(x = condition, y= rt)) +
  geom_violin() +
  geom_boxplot(width = .2, fatten = NULL) +
  stat_summary(fun = "mean", geom = "point") +
  stat_summary(fun.data = "mean_se", geom = "errorbar", width = .1) +
  facet_wrap(~factor(language, 
                     levels = c("monolingual", "bilingual"),
                     labels = c("Monolingual participants", 
                                "Bilingual participants"))) +
  theme_minimal()

Figure 5.5: Faceted violin-boxplot with updated labels
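The same relabelling can be done once in the data rather than inside facet_wrap(), which keeps the plotting code shorter if several plots use the same labels. This is a sketch on simulated data: dat_sim, the new column language_label, the condition levels, and the values are all assumptions for illustration.

```r
library(ggplot2)
set.seed(1)

# Hypothetical stand-in for dat_long
dat_sim <- data.frame(
  id        = rep(1:20, each = 2),
  language  = rep(c("monolingual", "bilingual"), each = 20),
  condition = rep(c("word", "nonword"), times = 20),
  rt        = rnorm(40, mean = 400, sd = 30)
)

# Convert language to a factor once, setting both level order and labels
dat_sim$language_label <- factor(
  dat_sim$language,
  levels = c("monolingual", "bilingual"),
  labels = c("Monolingual participants", "Bilingual participants")
)

# Facet by the new column; panels follow the order of the factor levels
p_facet <- ggplot(dat_sim, aes(x = condition, y = rt)) +
  geom_violin() +
  facet_wrap(~language_label) +
  theme_minimal()

p_facet
```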

5.4 Storing plots

Just like with datasets, plots can be saved to objects. The below code saves the histograms we produced for reaction time and accuracy to objects named p1 and p2. These plots can then be viewed by calling the object name in the console.

p1 <- ggplot(dat_long, aes(x = rt)) +
  geom_histogram(binwidth = 10, color = "black")

p2 <- ggplot(dat_long, aes(x = acc)) +
  geom_histogram(binwidth = 1, color = "black")

Importantly, layers can then be added to these saved objects. For example, the below code adds a theme to the plot saved in p1 and saves it as a new object p3. This is important because many of the examples of ggplot2 code you will find in online help forums use the p + format to build up plots but fail to explain what this means, which can be confusing to beginners.

p3 <- p1 + theme_minimal()
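To see what the p + format does, it can help to check that adding a layer returns a new plot and leaves the stored one unchanged. The sketch below uses simulated reaction times (rt_sim is an assumption, not the tutorial data) and inspects the list of layers stored in each plot object.

```r
library(ggplot2)
set.seed(1)

# Hypothetical reaction times standing in for dat_long$rt
rt_sim <- data.frame(rt = rnorm(100, mean = 400, sd = 30))

# Store a histogram in an object
p <- ggplot(rt_sim, aes(x = rt)) +
  geom_histogram(binwidth = 10, colour = "black")

# Add a layer to the stored plot and save the result under a new name
p_line <- p + geom_vline(xintercept = mean(rt_sim$rt), linetype = "dashed")

# The original object still has one layer; the new object has two
length(p$layers)
length(p_line$layers)
```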

5.5 Saving plots as images

In addition to saving plots to objects for further use in R, the function ggsave() can be used to save plots as images on your hard drive. The only required argument for ggsave is the file name of the image file you will create, complete with file extension (this can be "eps", "ps", "tex", "pdf", "jpeg", "tiff", "png", "bmp", "svg" or "wmf"). By default, ggsave() will save the last plot displayed. However, you can also specify a specific plot object if you have one saved.

ggsave(filename = "my_plot.png") # save last displayed plot
ggsave(filename = "my_plot.png", plot = p3) # save plot p3

The width, height and resolution of the image can all be manually adjusted. Fonts will scale with these sizes, and may look different to the preview images you see in the Viewer tab. The help documentation is useful here (type ?ggsave in the console to access the help).
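As a sketch of adjusting these settings, the code below saves a plot at a fixed physical size and resolution using the width, height, units and dpi arguments of ggsave(). The simulated data, the file name, and the particular dimensions are assumptions chosen for illustration; the file is written to a temporary folder so that nothing in your working directory is overwritten.

```r
library(ggplot2)
set.seed(1)

# Hypothetical reaction times standing in for dat_long$rt
rt_sim <- data.frame(rt = rnorm(100, mean = 400, sd = 30))

p <- ggplot(rt_sim, aes(x = rt)) +
  geom_histogram(binwidth = 10, colour = "black")

# Write to a temporary folder; replace with your own path as needed
out_file <- file.path(tempdir(), "rt_histogram.png")

# 16 x 10 cm at 300 dots per inch
ggsave(filename = out_file, plot = p,
       width = 16, height = 10, units = "cm", dpi = 300)
```

Try saving the same plot at a smaller width and height: the text will look larger relative to the plot area because font sizes are fixed in points rather than scaled to the image.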

5.6 Multiple plots

As well as creating separate plots for each level of a variable using facet_wrap(), you may also wish to display multiple different plots together. The patchwork package provides an intuitive way to do this. Once it is loaded with library(patchwork), you simply need to save the plots you wish to combine to objects as above and use the operators +, /, () and | to specify the layout of the final figure.

5.6.1 Combining two plots

Two plots can be combined side-by-side or stacked on top of each other. These combined plots could also be saved to an object and then passed to ggsave.

p1 + p2 # side-by-side

Figure 5.6: Side-by-side plots with patchwork

p1 / p2 # stacked

Figure 5.7: Stacked plots with patchwork
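Saving a combined figure works in the same way as saving a single plot. In this minimal sketch, the toy data, the object names and the file name are assumptions for illustration, and the image is written to a temporary folder.

```r
library(ggplot2)
library(patchwork)
set.seed(1)

# Hypothetical data standing in for the rt and acc columns of dat_long
sim <- data.frame(rt  = rnorm(100, mean = 400, sd = 30),
                  acc = rnorm(100, mean = 90, sd = 3))

hist_rt  <- ggplot(sim, aes(x = rt)) +
  geom_histogram(binwidth = 10, colour = "black")
hist_acc <- ggplot(sim, aes(x = acc)) +
  geom_histogram(binwidth = 1, colour = "black")

# Save the side-by-side layout to an object
combined <- hist_rt + hist_acc

# Pass the combined object to ggsave() like any other plot
out_file <- file.path(tempdir(), "combined_histograms.png")
ggsave(filename = out_file, plot = combined,
       width = 20, height = 10, units = "cm")
```

A wider image is used here because two plots share the horizontal space; for a stacked layout you would typically increase the height instead.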

5.6.2 Combining three or more plots

Three or more plots can be combined in a number of ways. The patchwork syntax is relatively easy to grasp with a few examples and a bit of trial and error. The exact layout of your plots will depend upon a number of factors. Create three plots named p1, p2 and p3 and try running the examples below. Adjust the use of the operators to see how they change the layout. Each line of code will draw a different figure.

p1 / p2 / p3
(p1 + p2) / p3
p1 | p2 / p3
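One reason layouts can be surprising is that patchwork relies on R's ordinary operator precedence: / is evaluated before + and |, so parentheses are needed whenever you want a different grouping. The sketch below illustrates this with three toy plots; the data and the names pa, pb and pc are assumptions for illustration.

```r
library(ggplot2)
library(patchwork)

# Toy data and three simple plots, labelled so the layout is easy to read
toy <- data.frame(x = 1:10, y = (1:10)^2)
pa <- ggplot(toy, aes(x, y)) + geom_point() + labs(title = "A")
pb <- ggplot(toy, aes(x, y)) + geom_line()  + labs(title = "B")
pc <- ggplot(toy, aes(x, y)) + geom_col()   + labs(title = "C")

# Without parentheses, / is applied first:
# A on the left, B stacked on top of C on the right
layout_default <- pa | pb / pc

# Parentheses change the grouping:
# A and B side by side on the top row, C across the bottom
layout_grouped <- (pa | pb) / pc

layout_default
layout_grouped
```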

5.7 Customisation part 4

5.7.1 Axis labels

Previously when we edited the main axis labels we used the scale_* functions. These functions are useful to know because they allow you to customise many aspects of the scale, such as the breaks and limits. However, if you only need to change the main axis name, there is a quicker way to do so using labs(). The below code adds a layer to the plot that changes the axis labels for the histogram saved in p1 and adds a title and subtitle. The title and subtitle do not conform to APA standards (more on APA formatting in the additional resources), but they can be useful for presentations and social media.

p1 + labs(x = "Mean reaction time (ms)",
          y = "Number of participants",
          title = "Distribution of reaction times",
          subtitle = "for 100 participants")

Figure 5.8: Plot with edited labels and title

You can also use labs() to remove axis labels, for example, try adjusting the above code to x = NULL.

5.7.2 Redundant aesthetics

So far when we have produced plots with colours, the colours were the only way that different levels of a variable were indicated, but it is sometimes preferable to indicate levels with both colour and other means, such as facets or x-axis categories.

The code below adds fill = language to violin-boxplots that are also faceted by language. We adjust alpha and use the brewer colour palette to customise the colours. Specifying a fill variable means that by default, R produces a legend for that variable. However, the use of colour is redundant with the facet labels, so you can remove this legend with the guides function.

ggplot(dat_long, aes(x = condition, y= rt, fill = language)) +
  geom_violin(alpha = .4) +
  geom_boxplot(width = .2, fatten = NULL, alpha = .6) +
  stat_summary(fun = "mean", geom = "point") +
  stat_summary(fun.data = "mean_se", geom = "errorbar", width = .1) +
  facet_wrap(~factor(language, 
                     levels = c("monolingual", "bilingual"),
                     labels = c("Monolingual participants", 
                                "Bilingual participants"))) +
  theme_minimal() +
  scale_fill_brewer(palette = "Dark2") +
  guides(fill = "none")

Figure 5.9: Violin-boxplot with redundant facets and fill.

5.8 Activities 4

Before you go on, do the following:

Rather than mapping both variables (condition and language) to a single interaction plot with individual participant data, instead produce a faceted plot that separates the monolingual and bilingual data. All visual elements should remain the same (colours and shapes) and you should also take care not to have any redundant legends.

ggplot(dat_long, aes(x = condition, y = rt, group = language, shape = language)) +
  geom_point(aes(colour = language),alpha = .2) +
  geom_line(aes(group = id, colour = language), alpha = .2) +
  stat_summary(fun = "mean", geom = "point", size = 2, colour = "black") +
  stat_summary(fun = "mean", geom = "line", colour = "black") +
  stat_summary(fun.data = "mean_se", geom = "errorbar", width = .2, colour = "black") +
  theme_minimal() +
  facet_wrap(~language) +
  guides(shape = "none", colour = "none")

# this wasn't easy so if you got it, well done!

Choose your favourite three plots you've produced so far in this tutorial, tidy them up with axis labels, your preferred colour scheme, and any necessary titles, and then combine them using patchwork. If you're feeling particularly proud of them, post them on Twitter using #PsyTeachR.
