Consider Image-Based Waffle Charts for Your Next Data Story

How this was made

show_code = "hidden"

Code

html`To learn more about how the output in this article was created you can choose to show the code I used. Currently the code in this article is &nbsp;<span class="dropdown-container" style="width: 5rem;">${viewof show_code}</span>.`

To learn more about how the output in this article was created you can choose to show the code I used. Currently the code in this article is .

In this article I share how to create an image based waffle chart. With some creativity I believe these types of waffle charts can be imactful, while retaining the simplicity of waffle charts. Typically waffle charts will either be made up of tiles or repeating symbols or images. However, the flavour of waffle chart I present in this article is made by cropping images to be in the desired proportion and superimposing a grid.

To demonstrate my approach I use data about electricity production in Australia. The proportion of Australian electricity generated by each energy source is tabulated in Table 1.

Code

library(tidyverse)
library(patchwork)
library(magick)
d <- read_csv('data.csv', show_col_types = FALSE)
waffle_df <- d |>
  drop_na() |>
  filter(!str_detect(Fuel,'Total')) |>
  mutate(category = case_when(
    Fuel == "Black coal" ~ 'black-coal',
    Fuel == "Brown coal" ~ "brown-coal",
    Fuel == "Natural gas" ~ "natural-gas",
    Fuel %in% c("Large-scale solar PV", "Small-scale solar PV") ~ "solar",
    Fuel == "Hydro" ~ "hydro",
    Fuel == "Wind" ~ 'wind',
    TRUE ~ "other"
  )) |>
  group_by(category) |>
  summarise(tot = sum(Quantity)) |>
  mutate(prop = tot/sum(tot) * 100) |>
  arrange(desc(prop)) |>
  mutate(cumulative = cumsum(prop)) |>
  mutate(start = cumulative - prop) |>
  rowid_to_column() |>
  mutate(category_upper = str_to_title(str_replace(category, '-', ' ')),
         category_label = paste0(category_upper, ' (', scales::label_percent(.1)(prop/100),')'))

Code

waffle_df |>
  select(category_upper, tot, prop) |>
  mutate(prop = scales::label_percent(.1)(prop/100),
         tot = scales::label_number(1)(tot)) |>
  knitr::kable(col.names = c('Category', 'Energy (GWh)', 'Proportion'),
               align = 'lrr')

Table 1: Breakdown of electricity production in Australia, from DCCEEW (2024) Australian Energy Statistics, Table O.

Category	Energy (GWh)	Proportion
Black Coal	96 174	35.0%
Natural Gas	48 865	17.8%
Solar	41 969	15.3%
Brown Coal	31 459	11.5%
Wind	31 385	11.4%
Hydro	16 666	6.1%
Other	7 957	2.9%

There are various libraries that make it easier to create standard waffle charts, but I am not aware of an approach that quite gets the look I am going for. In this article I propose an approach that uses ggplot2, patchwork, and magick.

Code

get_xy <- function(image) {
  matrix_data <- image |>
    as.raster() |>
    as.matrix()
  row_indices <- row(matrix_data)
  col_indices <- col(matrix_data)
  df <- data.frame(
    x = as.vector(col_indices),
    y = max(row_indices)-as.vector(row_indices),
    z = as.vector(matrix_data)
  )
  df
}
apply_target <- function(image,target_width) {
  image_info <- image_info(image)
  image_width <- image_info$width
  num_repeats <- ceiling(target_width / image_width)
  repeated_images <- image_append(rep(image,num_repeats))
  final_image <- image_crop(repeated_images, geometry = paste0(target_width, "x"))
  final_image
}
waffle_img <- waffle_df |>
  mutate(image_name = paste0(category,'.png')) |>
  mutate(target_width = round(prop/100*1000*10)) |>
  mutate(pixel_start = cumsum(target_width) - target_width) |>
  rowwise() |>
  mutate(image = list(image_read(image_name) |>
           image_resize(geometry = "x100") |>
           apply_target(target_width) |>
           get_xy())) |>
  unnest(image) |>
  mutate(x=x + pixel_start) |>
  mutate(y=y/100-0.495,
         x=x/100)

I identified suitable images and cropped sections with usable patterns. I modified the images using magick such that their height is 100 pixels. Next I repeated the images and cropped them to the desired widths. For example, if the desired width is 700 pixels for each, then the images would look like in Figure 1. However, for the waffle plot the desired width is determined by the data.

Code

waffle_df |>
  mutate(image_name = paste0(category, '.png')) |>
  mutate(target_width = round(prop / 100 * 400 * 10)) |>
  mutate(pixel_start = cumsum(target_width) - target_width) |>
  rowwise() |>
  mutate(image = list(
    image_read(image_name) |>
      image_resize(geometry = "x100") |>
      apply_target(700) |>
      get_xy()
  )) |>
  unnest(image) |>
  arrange(prop) |>
  (\(x) {
    ggplot(x,aes(
      y = as.numeric(fct_inorder(category_upper)) * 100 + y,
      x = x,
      fill = z
    )) +
      geom_raster() +
      geom_label(
        aes(
          y = as.numeric(fct_inorder(category_upper)) * 100 + 50,
          x = 350,
          label = category_upper
        ),
        data = waffle_df |> arrange(prop),
        fontface = 'bold',
        label.size = NA,
        label.r = unit(0.5, 'mm'),
        color = 'white',
        fill = alpha('black', 0.5)
      ) +
      scale_fill_identity() +
      theme_void() +
      labs(fill = '') +
      theme(legend.position = 'none') +
      coord_equal()
  })()

Figure 1: Some images of patterns that I use to represent the different energy sources. The images were created by cropping pictures found on Wikimedia Commons.

After making the sizes correspond to their contribution to electricity generation, I arrange the images horizontally on a single axis. Next I create 10 variants of the plot which show the x-axis area between 0-10%, 10-20%, …, 90-100% using coord_equal with different xlim parameters. These plots are then stacked on top of each other using patchwork. Finally, I place a very small plot under all the others that I use to label on top of the waffle chart. The reason for this is so that I can set clip='off'. If we set clip='off' for any of the other plots then the xlim trick would fail. The resulting waffle chart is shown in Figure 2.

Code

p <- waffle_img |>
  ggplot(aes(x, y, fill = z)) +
  geom_raster() +
  scale_fill_identity() +
  theme_void() +
  labs(fill = '') +
  theme(legend.position = 'none')
get_waffle <- function(p,text_between=c(0,100)) {
  p_combine <- list()
  for (i in 0:9) {
    start_pos <- i * 10
    p1 <- p +
      coord_equal(
        xlim = c(start_pos, start_pos + 10),
        ylim = c(-0.5, 0.5),
        expand = F
      ) +
      geom_vline(
        xintercept = (start_pos + 1):(start_pos + 9),
        color = 'white',
        lwd = 0.5
      ) +
      theme(plot.background = element_rect(fill='white',color='white',
                                           linewidth = 0),
            plot.margin=margin(t=0.5))
    # if (i == 0)
    #   p1 <- p1 + theme(plot.margin = margin(t = 12))
    p_combine[i+1] <- list(p1)
  }
  reduce(p_combine, `+`) + plot_layout(ncol = 1)
}

# to account for margins I have to adjust position slightly
fix_pos <- \(y) {return(y*1.005)} 

get_waffle(p) +
  (ggplot() +
  geom_label(
    aes(y = fix_pos(10-floor(start/10)), x = start%%10, label = category_label),
    vjust = 0,
    angle = 0,
    hjust = 0,
    nudge_x = 0.1,
    nudge_y = -0.4,
    size = 3.5,
    fontface = 'bold',
    label.size = NA,
    label.r = unit(0.5, 'mm'),
    color='white',
    fill = alpha('black', 0.5),
    data = waffle_df
  ) +
  scale_x_continuous(breaks=1:10) +
  coord_equal(
        xlim = c(0,10),
        ylim = c(-0.5/10, 0.5/10),
        expand = F,
        clip = 'off'
      ) +
  theme_void()) +
  plot_layout(heights = c(rep(10,10),1),
              widths = 15) +
  plot_annotation(title='Australian Electricity Energy Sources',
                  theme=theme(plot.title = element_text(hjust=0.5)))