library(tidyverse)
library(sf)
Exercise 04
ℹ️ See week 4 for related slides and readings
1 Overview
This week’s exercises are excerpted from Ch. 3, Ch. 4, and Ch. 5 in Geocomputation with R. These exercises build on our last exercise using {dplyr}
and include some of the same skills including:
- Filtering rows or observations
- Grouping and summarizing data by variable
New skills you will practice with this exercise include:
- Using non-spatial joins for data frames
- Computing geometric measurements
- Using spatial filters
- Using geometric operations on a simple feature geometry set
- Using geometric operations on pairs of simple feature geometries
2 Setup
This exercise uses the sf
and tidyverse
packages:
We are also going to use the us_states
and us_states_df
data from the {spData}
package:
library(spData)
Note that the us_states
loaded for this exercise is different than the us_states
we created during class with the tigris::states()
function. For this exercise, the bonus exercises are mixed in with the other questions but you are welcome to skip them if you do not want go for the bonus part of the exercise.
3 Exercises
3.1 Filtering data
Find all states that belong to the West region, have an area below 250,000 km2and in 2015 a population greater than 5,000,000 residents (Hint: you may need to use the function units::set_units()
or as.numeric()
).
|>
us_states ____
Find all states that belong to the South region, had an area larger than 150,000 km2 or a total population in 2015 larger than 7,000,000 residents.
|>
us_states ____
3.2 Joining and summarizing data
What was the total population in 2015 in the us_states
dataset? What was the minimum and maximum total population in 2015?
|>
us_states ____
Add variables from us_states_df
to us_states
, and create a new object called us_states_stats
.
- What function did you use and why?
- Which variable is the key in both datasets?
- What is the class of the new object?
Tip: we are covering joins in more detail next week—check out the R for Data Science chapter on Joins for more information.
<- us_states |>
us_states_stats ____
us_states_df
has two more rows than us_states
. How can you find them? Hint: try to use the dplyr::anti_join()
function.
____(us_states, us_states_df)
How much has population density changed between 2010 and 2015 in each state?
Calculate the change in percentages and map them with plot()
or geom_sf()
:
Calculate the change in the number of residents living below the poverty level between 2010 and 2015 for each state. Hint: See ?us_states_df
for documentation on the poverty level columns.
Bonus: Calculate the change in the percentage of residents living below the poverty level in each state.
What was the minimum, average and maximum state’s number of people living below the poverty line in 2015 for each region?
Bonus: What is the region with the largest increase in people living below the poverty line?
3.3 Spatial operations
Section 4.2 (in Geocomputation with R) established that Canterbury was the region of New Zealand containing most of the 100 highest points in the country. How many of these high points does the Canterbury region contain?
<- nz |>
canterbury filter(Name == "Canterbury")
|>
nz_height ____
Bonus: plot the result using the ggplot2::geom_sf()
function to show all of New Zealand, canterbury
region highlighted in yellow, high points in Canterbury represented by red crosses (Hint: try using shape = 7
) and high points in other parts of New Zealand represented by blue circles.
See the help page ?ggplot2::shape
and run the examples to see an illustration of different shape
values.
Which region has the second highest number of nz_height
points, and how many does it have?
|>
nz_height ____
Generalizing the question to all regions: how many of New Zealand’s 16 regions contain points which belong to the top 100 highest points in the country? Which regions?
Bonus: create a table listing these regions in order of the number of points and their name. Hint: use dplyr::slice_max()
and gt::gt()
.
Using st_buffer()
, how many points in nz_height
are within 100 km of Canterbury?
|>
nz_height st_buffer(____, dist = ____)
3.4 Spatial predicates
Test your knowledge of spatial predicates by finding out and plotting how US states relate to each other and other spatial objects.
The starting point of this part of the exercise is to create an object representing Maryland state in the USA using the filter()
function and plot the resulting object in the context of US states.
<- filter(____, ____)
maryland
ggplot() +
geom_sf(data = us_states) +
geom_sf(data = ____)
Create a new object representing all the states that geographically intersect with Maryland and plot the result (hint: the most concise way to do this is with the subsetting method [
but you can also use sf::st_filter()
).
<- ____ states_intersecting_md
Create another object representing all the objects that touch (have a shared boundary with) Maryland and plot the result (hint: remember you can use the argument op = st_intersects
when subsetting with base R or .predicate = st_intersects
when using st_filter()
).
<- ____ states_touching_md
Bonus: create a straight line from the centroid of Maryland to the centroid of California near the West coast of the USA (hint: functions st_centroid()
, st_union()
and st_cast()
described in Chapter 5 may help) and identify which states this long East-West line crosses.
How far is the geographic centroid of Maryland from the geographic centroid of Canterbury, New Zealand?
Calculate the length of the boundary lines of US states in meters. Which state has the longest border and which has the shortest? Hint: The st_length
function computes the length of a LINESTRING
or MULTILINESTRING
geometry.
|>
us_states ____