library(ggplot2)
library(dplyr)
library(sf)
Exercise 02
- Build a layered plot using ggplot(), aes(), and different geom_ functions
- Explore the difference between discrete and continuous variables
- Use coord_sf() to modify a plot created with geom_sf()
- What makes a scale better or worse at visualizing data?
- How do you write a clear and accurate title, legend, or caption?
- When is it appropriate to modify the geometry of your data when making a map?
1 Setup
This exercise uses the {ggplot2} and {dplyr} packages (both from the tidyverse family of packages) and the {sf} package, which are loaded at the top of this document.
For this week’s exercise, we are also going to use data from the {rnaturalearth} package (which downloads data from the Natural Earth data project) and the {smoothr} package, which works with sf objects. If these packages are not installed already, install them and restart your session:
pak::pkg_install(c("rnaturalearth", "rnaturalearthdata", "smoothr"))
library(rnaturalearth)
We are going to use ne_download() to download the countries dataset and then use st_centroid() to make a version of this dataset where the features show the center of each country instead of the boundaries:
# Download data from Natural Earth
countries_ne <- ne_download(scale = "medium", type = "countries")

countries <- countries_ne |>
  # Drop unused columns
  select(
    !starts_with(c("ISO", "ADM0", "FCLASS", "NAME", "MAPCOLOR"))
  ) |>
  # Exclude Antarctica
  filter(
    SOVEREIGNT != "Antarctica"
  ) |>
  # Project to Web Mercator (EPSG:3857)
  st_transform(crs = 3857)

# Replace each country's boundary with its center point
countries_center <- st_centroid(countries)

glimpse(countries)
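To see what st_centroid() changes, it can help to compare geometry types before and after. The sketch below is only an illustration: it assumes the nc sample data bundled with {sf} as a stand-in so that it runs without a download, and the object names nc_projected and nc_center are made up for this example. The same check should work on countries and countries_center.

```r
library(sf)

# Sample data shipped with {sf}; a stand-in for the countries data
nc <- read_sf(system.file("shape/nc.shp", package = "sf"))

# Project first, as in the setup code above
nc_projected <- st_transform(nc, crs = 3857)

# st_centroid() may warn that attributes are assumed constant over each geometry
nc_center <- st_centroid(nc_projected)

# Compare the geometry types of the two objects
unique(st_geometry_type(nc_projected))
unique(st_geometry_type(nc_center))
```

The attribute columns are carried over unchanged; only the geometry column differs between the two objects.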
2 Exercises
2.1 Plotting a single variable
Find a discrete variable in countries and then create a plot with geom_bar():
ggplot(data = countries) +
  geom_bar(mapping = aes(x = ____))
Next, find a continuous variable and make a plot with geom_histogram():
ggplot(data = countries) +
  geom_histogram(mapping = aes(x = ____))
Now, let’s make a map! Use countries_center and geom_sf() to make a map with a continuous variable mapped to size:
ggplot(data = countries_center) +
  geom_sf(aes(size = ____))
Next, make a map with geom_sf() with one discrete variable mapped to color:
ggplot(data = countries) +
  geom_sf(aes(color = ____))
Is that the map you expected? Try it again with the discrete variable mapped to fill:
ggplot(data = countries) +
  geom_sf(aes(fill = ____))
Now, try mapping the data using the same variable but use facet_wrap() to create a set of small maps for each region:
ggplot(data = countries) +
  geom_sf(aes(fill = ____)) +
  facet_wrap(~ ____)
Now, make a plot using any geom function of your choice:
ggplot(data = countries) +
  ____
Explain in plain language. What does your map or plot show? ____
Render, commit, and push your changes to GitHub with the commit message “Added answers on plotting a single variable”.
Make sure to commit and push all changed files so that your Git pane is empty afterwards.
2.2 Plotting two variables
For this next section, we are also going to use the nc data from the {sf} package since it has a larger number of continuous variables:
nc <- read_sf(system.file("shape/nc.shp", package = "sf"))
First, find two continuous variables and create a scatter plot with geom_point():
ggplot(data = nc, mapping = aes(____)) +
  geom_point()
Next, look in your data for one discrete and one continuous variable, then use aes() to set those variables for geom_col(). The geom_col() function is similar to geom_bar() but you must provide both an x and a y variable:
ggplot(data = nc, mapping = aes(____)) +
  geom_col()
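The difference between the two geoms is easiest to see on data that is already summarized. The following is a minimal sketch using a made-up data frame (toy, region, and total are hypothetical names, not part of the exercise data): geom_bar() counts rows for each x value, while geom_col() takes bar heights from the y variable.

```r
library(ggplot2)

# Hypothetical toy data: one row per region with a precomputed total
toy <- data.frame(
  region = c("East", "West", "North"),
  total = c(12, 7, 3)
)

# geom_bar() counts the rows for each region
p_bar <- ggplot(data = toy, mapping = aes(x = region)) +
  geom_bar()

# geom_col() uses the values in total as the bar heights
p_col <- ggplot(data = toy, mapping = aes(x = region, y = total)) +
  geom_col()

# layer_data() returns the values ggplot2 computed for each layer
bar_heights <- layer_data(p_bar)$count
col_heights <- layer_data(p_col)$y
```

Because each region appears in exactly one row of toy, every geom_bar() bar has the same height, whereas the geom_col() bars follow the total column.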
2.3 Using scales and colors
{ggplot2} uses naming conventions to organize the scale functions. This isn’t the same for every function but they look something like scale_<aesthetic>_<palette>_<data type>. For example, scale_fill_viridis_d() applies the Viridis color scale to a discrete variable mapped to the fill aesthetic.
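As an illustration of how the function name encodes the aesthetic and the type of data, the sketch below pairs two Viridis scales with the color aesthetic. It assumes the mpg dataset that ships with {ggplot2} rather than the exercise data, and the object names are made up for this example.

```r
library(ggplot2)

# mpg ships with {ggplot2}: class is discrete, hwy is continuous

# color + viridis + d: a discrete variable mapped to color
viridis_discrete <- scale_color_viridis_d()
p_discrete <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy, color = class)) +
  geom_point() +
  viridis_discrete

# color + viridis + c: a continuous variable mapped to color
viridis_continuous <- scale_color_viridis_c()
p_continuous <- ggplot(data = mpg, mapping = aes(x = displ, y = cty, color = hwy)) +
  geom_point() +
  viridis_continuous
```

If the suffix does not match the variable (for example, a _c scale with a discrete variable), ggplot2 reports an error when the plot is drawn, which is a useful hint that the scale and the variable type disagree.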
Use the nc data to create a map and take a look at the colors set when you use scale_fill_viridis_c():
ggplot(data = nc) +
  geom_sf(aes(fill = ____)) +
  scale_fill_viridis_c()
The ColorBrewer scales are designed for use with thematic maps. Use ?scale_color_brewer()
to pull up the documentation for this function and review the information on the type and palette parameters.
Now, map a variable to the color aesthetic for geom_sf() and assign an appropriate type and palette value:
ggplot(data = countries_center) +
  geom_sf(aes(color = ____)) +
  scale_color_brewer(type = ____, palette = ____)
Switching from color to fill, try it again with a different type and palette value:
ggplot(data = countries) +
  geom_sf(aes(fill = ____)) +
  scale_fill_brewer(type = ____, palette = ____)
One last time, but this time use scale_fill_distiller(). Note that scale_fill_distiller() only works with continuous values. If you get an error, you may need to map a different variable to fill:
ggplot(data = nc) +
  geom_sf(aes(fill = ____)) +
  scale_fill_distiller(type = ____, palette = ____)
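If it is unclear which variables count as continuous, one possible check is to test each attribute column with is.numeric(). This is a sketch rather than a required step, and the object names nc_attributes and is_continuous are made up for this example.

```r
library(sf)

nc <- read_sf(system.file("shape/nc.shp", package = "sf"))

# Drop the geometry column so only the attribute columns are checked
nc_attributes <- st_drop_geometry(nc)

# TRUE for numeric columns, FALSE for character columns
is_continuous <- vapply(nc_attributes, is.numeric, logical(1))

# Candidates for a continuous scale
names(nc_attributes)[is_continuous]

# Columns that need a discrete scale instead
names(nc_attributes)[!is_continuous]
```

A numeric type is only a first hint: identifier columns stored as numbers pass this check but are not meaningful as continuous measurements.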
2.4 Adding labels, legends, and themes
Set the data for ggplot() and then use the labs() function to apply a title and caption that make sense:
ggplot(data = ____) +
  geom_sf(color = "black", fill = NA) +
  labs(
    title = ____,
    caption = ____
  )
Now, map fill to a variable in your data using aes() and then use labs() to assign a label for fill:
ggplot(data = ____) +
  geom_sf(aes(fill = ____)) +
  labs(
    ____
  )
Finally, put all of these elements together with a theme function. theme_minimal() and theme_void() are good themes to use for maps but you can explore all of the options in the ggplot2 documentation.
ggplot(data = ____) +
  geom_sf(mapping = aes(____)) +
  labs(
    title = ____,
    caption = ____,
    ____
  ) +
  ____
Now is another good time to render, commit, and push your changes to GitHub with a meaningful commit message.
Once again, make sure to commit and push all changed files so that your Git pane is empty afterwards.
2.5 Map making with {ggplot2}
By default, any map created with geom_sf() will show the graticules on the map and axis labels with the coordinate values. You can also hide or change the graticules and axis titles. Add data to this map and then hide the graticules by adding theme_void():
ggplot(data = ____) +
  geom_sf(color = "black", fill = NA) +
  ____
Try setting the panel.grid argument to element_blank() to hide the grid:
ggplot(data = ____) +
  geom_sf(color = "black", fill = NA) +
  theme(
    panel.grid = ____
  )
Regardless of the selected theme, when you are using geom_sf(), you can also modify or suppress the graticule and axis labels using the label_graticule and label_axes parameters for coord_sf():
ggplot(data = countries) +
  geom_sf() +
  coord_sf(
    label_graticule = "----",
    label_axes = "----"
  ) +
  theme_minimal()
Now, try “zooming” into a selected area of your map using the xlim and ylim arguments for coord_sf():
ggplot(data = ____) +
  geom_sf(color = "black", fill = NA) +
  coord_sf(
    xlim = ____,
    ylim = ____
  )
If you have difficulty with this one, look back at our week 2 slides for an example showing how to use sf::st_bbox() to get xmin, xmax, ymin, and ymax values for the xlim and ylim parameters.
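The general pattern of passing bounding box values to coord_sf() can be sketched as follows. This is not the example from the slides: it assumes the nc sample data from {sf}, and Wake County is an arbitrary feature picked for illustration.

```r
library(ggplot2)
library(sf)

nc <- read_sf(system.file("shape/nc.shp", package = "sf"))

# Pick one feature to zoom to (an arbitrary choice for this sketch)
wake <- nc[nc$NAME == "Wake", ]

# st_bbox() returns a named vector with xmin, ymin, xmax, and ymax
bbox <- st_bbox(wake)

# The limits are in the same coordinate reference system as the data
p_zoom <- ggplot(data = nc) +
  geom_sf(color = "black", fill = NA) +
  coord_sf(
    xlim = c(bbox[["xmin"]], bbox[["xmax"]]),
    ylim = c(bbox[["ymin"]], bbox[["ymax"]])
  )

p_zoom
```

Note the order of the values: st_bbox() stores them as xmin, ymin, xmax, ymax, so selecting them by name is less error-prone than selecting them by position.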
There are some cases when you need to modify the geometry of your data as part of the process of making a map. The st_simplify() function is one way to do that. Try setting dTolerance to a low value, e.g. dTolerance = 10, and run the code block. Then try to run it again with dTolerance = 100000.
# The NAME columns were dropped during setup, so filter on ADMIN instead
usa <- filter(countries, ADMIN == "United States of America")

simple_usa <- st_simplify(x = usa, dTolerance = ____)

ggplot() +
  geom_sf(
    data = usa,
    color = "orange"
  ) +
  geom_sf(
    data = simple_usa,
    color = "purple"
  ) +
  theme_void()
What happens when you increase the value of dTolerance? ____
Now, let’s try the same thing but smoothing features with smoothr::smooth() instead of simplifying with sf::st_simplify(). Start by setting smoothness to a small number, smoothness = 0.5, and then run again with higher and higher numbers:
smooth_usa <- smoothr::smooth(x = usa, method = "ksmooth", smoothness = ____)

ggplot() +
  geom_sf(
    data = usa,
    color = "orange"
  ) +
  geom_sf(
    data = smooth_usa,
    color = "purple"
  ) +
  theme_void()
What happens when you increase the value of smoothness? ____
Check the documentation for st_simplify() or smoothr::smooth() for more information on how these functions work to modify the geometry. rmapshaper::ms_simplify() is another function for simplifying polygons in simple feature objects. In contrast to the other two examples, this function is topologically aware and preserves existing boundaries between contiguous polygons.
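Alongside the visual comparison, one way to check the effect of dTolerance is to count the coordinates in the geometry before and after simplifying. The sketch below makes several assumptions: it uses a single county from the nc sample data instead of the countries data, count_vertices() is a hypothetical helper written for this example, and the two tolerance values are arbitrary.

```r
library(sf)

nc <- read_sf(system.file("shape/nc.shp", package = "sf"))

# Project to EPSG:3857, as in the setup code, so dTolerance is in projected units
wake <- st_transform(nc[nc$NAME == "Wake", ], crs = 3857)

# Hypothetical helper: number of coordinate pairs in the geometry
count_vertices <- function(x) nrow(st_coordinates(x))

wake_fine <- st_simplify(x = wake, dTolerance = 10)
wake_coarse <- st_simplify(x = wake, dTolerance = 5000)

# Compare the three counts side by side
c(
  original = count_vertices(wake),
  fine = count_vertices(wake_fine),
  coarse = count_vertices(wake_coarse)
)
```

The same helper can be applied to usa and simple_usa to put numbers next to the orange and purple outlines in the maps above.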
2.6 Optional: Creating maps with {tmap}
Pick one of the maps you created in the prior questions of this exercise and create a similar version using the {tmap} package.
You can install {tmap} the same as any other package:
# pak::pkg_install("tmap")
Then load the library:
library(tmap)
And make a map using data from {rnaturalearth} or another source of your choice:
____
What is the same about making a map with {tmap} compared to {ggplot2}? ____
What is different about making a map with {tmap} compared to {ggplot2}? ____
Do you have any preference between the two? ____
Render, commit, and push your final changes to GitHub with a meaningful commit message.
Make sure to commit and push all changed files so that your Git pane is empty afterwards.