Exercise 02

Modified

November 20, 2024

Practice these skills
  • Build a layered plot using ggplot(), aes(), and different geom_ functions
  • Explore the difference between discrete and continuous variables
  • Use coord_sf() to modify a plot created with geom_sf()
Think about these questions
  • What makes a scale better or worse at visualizing data?
  • How do you write a clear and accurate title, legend, or caption?
  • When is it appropriate to modify the geometry of your data when making a map?

1 Setup

This exercises uses the {ggplot2} and {dplyr} packages (both from the tidyverse family of packages) and the {sf} package:

library(ggplot2)
library(dplyr)
library(sf)

For this week’s exercise, we are also going to use data from the {rnaturalearth} package (which downloads data from the Natural Earth data project) and the {smoothr} package that works with sf objects. If these packages are not installed already, make sure to install those packages and re-start your session:

pak::pkg_install(c("rnaturalearth", "rnaturalearthdata", "smoothr"))
library(rnaturalearth)

We are going to use ne_download() to download the countries dataset and then use st_centroid() to make a version of this dataset where the features show the center of each country instead of the boundaries:

# Download data from Natural Earth
countries_ne <- ne_download(scale = "medium", type = "countries")

countries <- countries_ne |> 
  # Drop unused columns
  select(
    !starts_with(c("ISO", "ADM0", "FCLASS", "NAME", "MAPCOLOR"))
  ) |> 
  # Exclude Antarctica
  filter(
    SOVEREIGNT != "Antarctica"
  ) |>
  st_transform(crs = 3857)

countries_center <- st_centroid(countries)

glimpse(countries)

2 Exercises

2.1 Plotting a single variable

Find a discrete variable in countries and then create a plot with geom_bar():

ggplot(data = countries) +
  geom_bar(mapping = aes(x = ____))

Next, find a continuous variable and make a plot with geom_histogram():

ggplot(data = countries) +
  geom_histogram(mapping = aes(x = ____))

Now, let’s make a map! Use countries_center and geom_sf() to make a map with a continuous variable mapped to size:

ggplot(data = countries_center) +
  geom_sf(aes(size = ____))

Next, make a map with geom_sf() with one discrete variable mapped to color:

ggplot(data = countries) +
  geom_sf(aes(color = ____))

Is that the map you expected? Try it again with the discrete variable mapped to fill:

ggplot(data = countries) +
  geom_sf(aes(fill = ____))

Now, try mapping the data using the same variable but use facet_wrap() to create a set of small maps for each region:

ggplot(data = countries) +
  geom_sf(aes(fill = ____)) +
  facet_wrap(~ ____)

Now, make a plot using any geom function of your choice:

ggplot(data = countries) +
  ____

Explain in plain language. What does your map or plot show? ____

Render, commit, and push your changes to GitHub with the commit message “Added answers on plotting a single variable”.

Make sure to commit and push all changed files so that your Git pane is empty afterwards.

2.2 Plotting two variables

For this next section, we are also going to use the nc data from the {sf} package since it has a larger number of continuous variables:

nc <- read_sf(system.file("shape/nc.shp", package = "sf"))

First, find two continuous variables and create a scatter plot with geom_point():

ggplot(data = nc, mapping = aes(____)) +
  geom_point()

Next, look in your data for one discrete and one continuous variable then use aes() to set those variables for geom_col(). The geom_col() function is similar to geom_bar() but you must provide both an x and a y variable:

ggplot(data = nc, mapping = aes(____)) +
  geom_col()

2.3 Using scales and colors

{ggplot2} uses naming conventions to organize the scale functions. This isn’t the same for every function but they look something like: “scale_”. So, scale_fill_viridis_d() applies the Viridis color scale to a discrete variable mapped to the fill aesthetic.

Use the data to create a map and take a look at the colors set when you use scale_fill_viridis_c():

ggplot(data = nc) +
  geom_sf(aes(fill = ____)) +
  scale_fill_viridis_c()

The ColorBrewer scales are designed for use with thematic maps. Use ?scale_color_brewer() to pull up the documentation for this function and review the information on the type and palette parameters.

Now, map a variable to the color aesthetic for geom_sf() and assign an appropriate type and palette value:

ggplot(data = countries_center) +
  geom_sf(aes(color = ____)) +
  scale_color_brewer(type = ____, palette = ____)

Switching from color to fill, try it again with a different type and palette value:

ggplot(data = countries) +
  geom_sf(aes(fill = ____)) +
  scale_fill_brewer(type = ____, palette = ____)

One last time, but we’re using scale_fill_distiller(). Note that this scale_fill_distiller() scale only works with continuous values. If you get an error, you may need to map a different variable to fill:

ggplot(data = nc) +
  geom_sf(aes(fill = ____)) +
  scale_fill_distiller(type = ____, palette = ____)

2.4 Adding labels, legends, and themes

Set the data for ggplot() and then use the labs() function to apply a title and caption that make sense:

ggplot(data = ____) +
  geom_sf(color = "black", fill = NA) +
  labs(
    title = ____,
    caption = ____
  )

Now, map fill to a variable in your data using aes() and then use labs() to assign a label for fill:

ggplot(data = ____) +
  geom_sf(aes(fill = ____)) +
  labs(
    ____
  )

Finally, put all of these elements together with a theme function. theme_minimal() and theme_void() are good themes to use for maps but you can explore all of the options in the ggplot2 documentation.

ggplot(data = ____) +
  geom_sf(mapping = aes(____)) +
  labs(
    title = ____,
    caption = ____,
    ____
  ) +
  ____

Now is another good time to render, commit, and push your changes to GitHub with a meaningful commit message.

Once again, make sure to commit and push all changed files so that your Git pane is empty afterwards.

2.5 Map making with {ggplot2}

By default, any map created with geom_sf() will show the graticulates on the map and axis labels with the coordinate values. You can also hide or change graticules and axis title. Add data to this map and then hide these graticules by adding theme_void():

ggplot(data = ____) +
  geom_sf(color = "black", fill = NA) +
  ____

Try setting the panel.grid argument to element_blank() to hide the grid:

ggplot(data = ____) +
  geom_sf(color = "black", fill = NA) +
  theme(
    panel.grid = ____
  )

Regardless of the selected theme, when you are using geom_sf(), you can also modify or suppress the graticule and axis labels using the label_graticule and label_axes parameters for coord_sf():

ggplot(data = countries) +
  geom_sf() +
  coord_sf(
    label_graticule = "----",
    label_axes = "----"
  ) +
  theme_minimal()

Now, try “zooming” into a selected area of your map using the xlim and ylim arguments for coord_sf():

ggplot(data = ____) +
  geom_sf(color = "black", fill = NA) +
  coord_sf(
    xlim = ____,
    ylim = ____
  )

If you have difficulty with this one, look back at our week 2 slides for an example showing how to use sf::st_bbox() to get xmin, xmax, ymin, and ymax values for the xlim and ylim parameter.

There are some cases when you need to modify the geometry of your data as part of the process of making a map. The st_simplify() function is one way to do that. Try setting dTolerance to a low value, e.g. dTolerance = 10, and run the code block. Then try to run it again with dTolerance = 100000.

usa <- filter(countries, NAME == "United States of America")

simple_usa <- st_simplify(x = usa, dTolerance = ____)

ggplot() +
  geom_sf(
    data = usa,
    color = "orange"
    ) +
  geom_sf(
    data = simple_usa,
    color = "purple"
  ) +
  theme_void()

What happens when you increase the value of dTolerance? ____

Now, let’s try to same thing but smoothing features with smoothr::smooth() instead of simplifying with sf::st_simplify(). Start by setting smoothness to a small number, smoothness = 0.5, and then run again with higher and higher numbers:

smooth_usa <- smoothr::smooth(x = usa, method = "ksmooth", smoothness = ____)

ggplot() +
  geom_sf(
    data = usa,
    color = "orange"
    ) +
  geom_sf(
    data = smooth_usa,
    color = "purple"
  ) +
  theme_void()

What happens when you increase the value of smoothness? ____

Check the documentation for st_simplify() or smoothr::smooth() for more information on how these functions work to modify the geometry. The rmapshaper::ms_simplify() is another function for the simplification of polygons in simple feature objects. In contrast to the other two examples, this function is topologically aware and preserves existing boundaries between contiguous polygons.

2.6 Optional: Creating maps with {tmap}

Pick one of the maps you created in the prior questions of this exercise and create a similar version using the {tmap} package.

You can install {tmap} the same as any other package:

# pak::pkg_install("tmap")

Then load the library:

library(tmap)

And make a map using data from {rnaturalearth} or another source of your choice:

____

What is the same about making a map with {tmap} compared to {ggplot2}? ____

What is different about making a map with {tmap} compared to {ggplot2}? ____

Do you have any preference between the two? ____

Render, commit, and push your final changes to GitHub with a meaningful commit message.

Make sure to commit and push all changed files so that your Git pane is empty afterwards.