I'm the Map

Introduction

In a nutshell, choropleth comes from the Greek choros (area) and pleth (value). They distribute colors the map regions based on the measure which then allows for comparisons using the relative value associated with each color of one region to another.

Preliminary Items

First Things First! Set your Working Directory

Your working directory is simply where your script will look for anything it needs like external data sets. There are a few ways to go about doing this which we will cover. However for now, just do the following:

  1. Open up a new script by going to File > New File > R Script.
  2. Save it in a preferably empty folder as whatever you want.
  3. Go to the menu bar and select Session > Set Working Directory > To Source File Location.

Download the script

Copying and pasting syntax from a browser can cause problems. To avoid this issue, please download a script with all of the needed code presented in this walkthrough.

Loading libraries

Go ahead and load these or install and then load them.

library(tidyverse)
library(tidycensus)
library(viridis)
library(tools)

Setting up your API

As before, you will utilize the Census API. You’ll likely receive a prompt to save it using

To install your API key for use in future sessions, run this function with install = TRUE.

I suggest rerunning the command and do just that!

# Input your Census key
census_api_key("YOUR CENSUS API KEY")

# Reload the .Renviron marker
readRenviron("~/.Renviron")

Walkthrough

There are two major functions implemented in tidycensus::get_decennial(), which grants access to the 1990, 2000, and 2010 decennial US Census APIs, and get_acs, which grants access to the 5-year American Community Survey APIs. While limited in comparison to the other method we used, what it lacks in breadth, it soon makes up in depth which we will see soon!

Getting variables from the Census or ACS requires knowing the variable ID - and there are thousands of these IDs across the different Census files. To search for variables without pulling your hair out, use the load_variables function. The function takes two required arguments:

  1. the year of the Census or endyear of the ACS sample, and

  2. the dataset - one of sf1, sf3, or acs5.

For ideal functionality, it is recommended that you assign the result of this function to a variable, setting cache = TRUE to store the result on your computer for future access, and using the View function in RStudio to interactively browse for variables. However, this is not required. However, as an example, we will take the 2016 American Community Survey data set and save it.

acs16 <- load_variables(2016, "acs5", cache = TRUE) 
head(acs16) # or View(acs16) if you prefer
## # A tibble: 6 x 3
##   name       label                           concept                            
##   <chr>      <chr>                           <chr>                              
## 1 B00001_001 Estimate!!Total                 UNWEIGHTED SAMPLE COUNT OF THE POP…
## 2 B00002_001 Estimate!!Total                 UNWEIGHTED SAMPLE HOUSING UNITS    
## 3 B01001_001 Estimate!!Total                 SEX BY AGE                         
## 4 B01001_002 Estimate!!Total!!Male           SEX BY AGE                         
## 5 B01001_003 Estimate!!Total!!Male!!Under 5… SEX BY AGE                         
## 6 B01001_004 Estimate!!Total!!Male!!5 to 9 … SEX BY AGE

In this basic example, let’s look at US income by county using the data set B19013_001 (such a clear and obvious name)

title <- acs16 %>% # use the acs_16 data set we created
  filter(name == "B19013_001") %>% # filter everything else out except where we find B19013_001
  select(concept) # select the column concept that provides a very brief description

# Convert to title case. We have to go in a funny way simply due to the encoding of text within the Census API
title <- tolower(title) %>%
          toTitleCase()

title
## [1] "Median Household Income in the Past 12 Months (in 2016 Inflation-Adjusted Dollars)"

Then let’s get some income level data from that set…

us_county_income <-  get_acs(geography = "county", 
          variables = "B19013_001", 
          shift_geo = TRUE, 
          geometry = TRUE)
## Getting data from the 2015-2019 5-year ACS
## Using feature geometry obtained from the albersusa package
## Please note: Alaska and Hawaii are being shifted and are not to scale.

…and plot!

ggplot(us_county_income) + 
  geom_sf(aes(fill = estimate), color = NA) + 
  # Removes the graticule lines from the plot
  coord_sf(datum = NA) + 
  theme_minimal() + 
  scale_fill_viridis()

Looks good. ow let’s get some West Virginia population data like we did earlier in the term.

wv_pop <- get_acs(geography = "county", 
                     variables = "B01003_001", 
                     state = "WV",
                     geometry = TRUE)
## Getting data from the 2015-2019 5-year ACS
## Downloading feature geometry from the Census website.  To cache shapefiles for use in future sessions, set `options(tigris_use_cache = TRUE)`.
## 
  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |===                                                                   |   4%
  |                                                                            
  |===                                                                   |   5%
  |                                                                            
  |====                                                                  |   5%
  |                                                                            
  |====                                                                  |   6%
  |                                                                            
  |=====                                                                 |   7%
  |                                                                            
  |=====                                                                 |   8%
  |                                                                            
  |======                                                                |   8%
  |                                                                            
  |======                                                                |   9%
  |                                                                            
  |=======                                                               |   9%
  |                                                                            
  |=======                                                               |  10%
  |                                                                            
  |=======                                                               |  11%
  |                                                                            
  |========                                                              |  11%
  |                                                                            
  |========                                                              |  12%
  |                                                                            
  |=========                                                             |  12%
  |                                                                            
  |=========                                                             |  13%
  |                                                                            
  |=========                                                             |  14%
  |                                                                            
  |==========                                                            |  14%
  |                                                                            
  |==========                                                            |  15%
  |                                                                            
  |===========                                                           |  15%
  |                                                                            
  |===========                                                           |  16%
  |                                                                            
  |============                                                          |  16%
  |                                                                            
  |============                                                          |  17%
  |                                                                            
  |============                                                          |  18%
  |                                                                            
  |=============                                                         |  18%
  |                                                                            
  |=============                                                         |  19%
  |                                                                            
  |==============                                                        |  19%
  |                                                                            
  |==============                                                        |  20%
  |                                                                            
  |==============                                                        |  21%
  |                                                                            
  |===============                                                       |  21%
  |                                                                            
  |===============                                                       |  22%
  |                                                                            
  |================                                                      |  22%
  |                                                                            
  |================                                                      |  23%
  |                                                                            
  |=================                                                     |  24%
  |                                                                            
  |=================                                                     |  25%
  |                                                                            
  |==================                                                    |  25%
  |                                                                            
  |==================                                                    |  26%
  |                                                                            
  |===================                                                   |  27%
  |                                                                            
  |===================                                                   |  28%
  |                                                                            
  |====================                                                  |  28%
  |                                                                            
  |====================                                                  |  29%
  |                                                                            
  |=====================                                                 |  29%
  |                                                                            
  |=====================                                                 |  30%
  |                                                                            
  |=====================                                                 |  31%
  |                                                                            
  |======================                                                |  31%
  |                                                                            
  |======================                                                |  32%
  |                                                                            
  |=======================                                               |  32%
  |                                                                            
  |=======================                                               |  33%
  |                                                                            
  |========================                                              |  34%
  |                                                                            
  |========================                                              |  35%
  |                                                                            
  |=========================                                             |  35%
  |                                                                            
  |=========================                                             |  36%
  |                                                                            
  |==========================                                            |  37%
  |                                                                            
  |==========================                                            |  38%
  |                                                                            
  |===========================                                           |  38%
  |                                                                            
  |===========================                                           |  39%
  |                                                                            
  |============================                                          |  39%
  |                                                                            
  |============================                                          |  40%
  |                                                                            
  |============================                                          |  41%
  |                                                                            
  |=============================                                         |  41%
  |                                                                            
  |=============================                                         |  42%
  |                                                                            
  |==============================                                        |  42%
  |                                                                            
  |==============================                                        |  43%
  |                                                                            
  |==============================                                        |  44%
  |                                                                            
  |===============================                                       |  44%
  |                                                                            
  |===============================                                       |  45%
  |                                                                            
  |================================                                      |  45%
  |                                                                            
  |================================                                      |  46%
  |                                                                            
  |=================================                                     |  47%
  |                                                                            
  |=================================                                     |  48%
  |                                                                            
  |==================================                                    |  48%
  |                                                                            
  |==================================                                    |  49%
  |                                                                            
  |===================================                                   |  49%
  |                                                                            
  |===================================                                   |  50%
  |                                                                            
  |===================================                                   |  51%
  |                                                                            
  |====================================                                  |  51%
  |                                                                            
  |====================================                                  |  52%
  |                                                                            
  |=====================================                                 |  52%
  |                                                                            
  |=====================================                                 |  53%
  |                                                                            
  |=====================================                                 |  54%
  |                                                                            
  |======================================                                |  54%
  |                                                                            
  |======================================                                |  55%
  |                                                                            
  |=======================================                               |  55%
  |                                                                            
  |=======================================                               |  56%
  |                                                                            
  |========================================                              |  57%
  |                                                                            
  |========================================                              |  58%
  |                                                                            
  |=========================================                             |  58%
  |                                                                            
  |=========================================                             |  59%
  |                                                                            
  |==========================================                            |  59%
  |                                                                            
  |==========================================                            |  60%
  |                                                                            
  |==========================================                            |  61%
  |                                                                            
  |===========================================                           |  61%
  |                                                                            
  |===========================================                           |  62%
  |                                                                            
  |============================================                          |  62%
  |                                                                            
  |============================================                          |  63%
  |                                                                            
  |=============================================                         |  64%
  |                                                                            
  |=============================================                         |  65%
  |                                                                            
  |==============================================                        |  65%
  |                                                                            
  |==============================================                        |  66%
  |                                                                            
  |===============================================                       |  67%
  |                                                                            
  |===============================================                       |  68%
  |                                                                            
  |================================================                      |  68%
  |                                                                            
  |================================================                      |  69%
  |                                                                            
  |=================================================                     |  69%
  |                                                                            
  |=================================================                     |  70%
  |                                                                            
  |=================================================                     |  71%
  |                                                                            
  |==================================================                    |  71%
  |                                                                            
  |==================================================                    |  72%
  |                                                                            
  |===================================================                   |  72%
  |                                                                            
  |===================================================                   |  73%
  |                                                                            
  |====================================================                  |  74%
  |                                                                            
  |====================================================                  |  75%
  |                                                                            
  |=====================================================                 |  75%
  |                                                                            
  |=====================================================                 |  76%
  |                                                                            
  |======================================================                |  77%
  |                                                                            
  |======================================================                |  78%
  |                                                                            
  |=======================================================               |  78%
  |                                                                            
  |=======================================================               |  79%
  |                                                                            
  |========================================================              |  79%
  |                                                                            
  |========================================================              |  80%
  |                                                                            
  |========================================================              |  81%
  |                                                                            
  |=========================================================             |  81%
  |                                                                            
  |=========================================================             |  82%
  |                                                                            
  |==========================================================            |  82%
  |                                                                            
  |==========================================================            |  83%
  |                                                                            
  |===========================================================           |  84%
  |                                                                            
  |===========================================================           |  85%
  |                                                                            
  |============================================================          |  85%
  |                                                                            
  |============================================================          |  86%
  |                                                                            
  |=============================================================         |  86%
  |                                                                            
  |=============================================================         |  87%
  |                                                                            
  |=============================================================         |  88%
  |                                                                            
  |==============================================================        |  88%
  |                                                                            
  |==============================================================        |  89%
  |                                                                            
  |===============================================================       |  89%
  |                                                                            
  |===============================================================       |  90%
  |                                                                            
  |===============================================================       |  91%
  |                                                                            
  |================================================================      |  91%
  |                                                                            
  |================================================================      |  92%
  |                                                                            
  |=================================================================     |  92%
  |                                                                            
  |=================================================================     |  93%
  |                                                                            
  |=================================================================     |  94%
  |                                                                            
  |==================================================================    |  94%
  |                                                                            
  |==================================================================    |  95%
  |                                                                            
  |===================================================================   |  95%
  |                                                                            
  |===================================================================   |  96%
  |                                                                            
  |====================================================================  |  97%
  |                                                                            
  |====================================================================  |  98%
  |                                                                            
  |===================================================================== |  98%
  |                                                                            
  |===================================================================== |  99%
  |                                                                            
  |======================================================================|  99%
  |                                                                            
  |======================================================================| 100%
head(wv_pop) # or View(wv_pop) if you prefer
## Simple feature collection with 6 features and 5 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -82.64474 ymin: 37.84242 xmax: -80.16639 ymax: 39.72189
## geographic CRS: NAD83
##   GEOID                           NAME   variable estimate moe
## 1 54039  Kanawha County, West Virginia B01003_001   183279  NA
## 2 54007  Braxton County, West Virginia B01003_001    14190  NA
## 3 54033 Harrison County, West Virginia B01003_001    67908  NA
## 4 54085  Ritchie County, West Virginia B01003_001     9844  NA
## 5 54103   Wetzel County, West Virginia B01003_001    15436  NA
## 6 54099    Wayne County, West Virginia B01003_001    40303  NA
##                         geometry
## 1 MULTIPOLYGON (((-81.90454 3...
## 2 MULTIPOLYGON (((-80.98495 3...
## 3 MULTIPOLYGON (((-80.60255 3...
## 4 MULTIPOLYGON (((-81.32879 3...
## 5 MULTIPOLYGON (((-80.94378 3...
## 6 MULTIPOLYGON (((-82.6432 38...

So far we haven’t done anything out of left field, in that it the process and results are pretty similar to what came before. It is at this point you should note that there is something much more interesting coming. Let’s redo the plot above using a very cool package called leaflet. Let’s lod (or install and load) some new libraries:

library(leaflet) # a JavaScript library for interactive maps
library(stringr) # Well I lied...this isn't new!
library(sf) # A very nice package that allows us to us a standardized way to encode spatial vector data, aka spatial objects.
## Linking to GEOS 3.8.1, GDAL 3.1.4, PROJ 6.3.1

I cheated a little and know how many colors we’ll need (10). Below is one of many ways to construct a continuous scale.

pal <- colorQuantile(palette = "viridis", domain = wv_pop$estimate, n = 10)

Now it gets interesting (well for me anyways…yes I find maps fun). Below we’re going to construct an interactive plot. You can click on the individual counties to see their names:

wv_pop %>%
  st_transform(crs = "+init=epsg:4326") %>% 
  leaflet(width = "100%") %>%
  addProviderTiles(provider = "CartoDB.Positron") %>%
  addPolygons(popup = ~ str_extract(NAME, "^([^,]*)"),
              stroke = FALSE,
              smoothFactor = 0,
              fillOpacity = 0.7,
              color = ~ pal(estimate)) %>%
  addLegend("bottomright", 
            pal = pal, 
            values = ~ estimate,
            title = "Population percentiles",
            opacity = 1)
## Warning in CPL_crs_from_input(x): GDAL Message 1: +init=epsg:XXXX syntax is
## deprecated. It might return a CRS with a non-EPSG compliant axis order.

Want an explanation of the code? If so, keep reading. If you have fallen asleep, maybe move on to the next batch of codes.

Note about geographic information system (GIS)

What the heck is st_transform(crs = "+init=epsg:4326")? Well without getting too into GIS because that is an entire semesters worth of material in itself, the epsg part specifically refers to an optimal coordinate system first developed by the European Petroleum Survey Group, an organization involved in best practices for surveying and applied geodesy. The EPSG was absorbed into the International Association of Oil & Gas Producers in 2005 but the code epsg has become a standard.

If you want to know more about GIS in general, let me know as I am considering teaching a section. Again, maps are fun! Inn the meantime, to know more about GIS in general, consider checking out the ersi homepage: https://www.esri.com/en-us/what-is-gis/overview. They make the most popular software package for GIS called ArcGIS and thought it is expensive (and I mean expensive) in the outside world, much like Tableau, it is free for students and educators alike. Moreover, companies and organizations that use ArcGIS use it heavily and are more likely than not well aware of the investment. Moreover, mapping geometries (and by this I mean anything from GPS to LiDAR to measuring distances on your phone and more) is a lucrative area to be in. Take a look at this recent report by National Geographic on how wrong we were about the size and population of the Mayan people which absolutely blew my mind: https://news.nationalgeographic.com/2018/02/maya-laser-lidar-guatemala-pacunam/. That is simply one amazing facet of GIS!

OK what were we doing? Oh yeah. OK back to our maps.

Let’s now look at one of our border states - Pennsylvania. In this case, we are going to use data regarding Pennsylvania’s population by county and proportional estimates by pulling from the ACS data set B01003_001.

# Use the acs_16 data set we created
acs16 %>% 
  # filter everything else out except where we find B01003_001
  filter(name == "B01003_001") %>% 
  # select the column concept that provides a very brief description
  select(concept) 
## # A tibble: 1 x 1
##   concept         
##   <chr>           
## 1 TOTAL POPULATION
# Pretty much the exact same pull we did above...
pa_pop <- get_acs(geography = "county", 
                    variables = "B01003_001", 
                    state = "PA",
                    geometry = TRUE)
## Getting data from the 2015-2019 5-year ACS
## Downloading feature geometry from the Census website.  To cache shapefiles for use in future sessions, set `options(tigris_use_cache = TRUE)`.
# ... but this time we'll color it differently.
pal <- colorNumeric(palette = "plasma", 
                    domain = pa_pop$estimate)
# GIS things. Look at the code. Most of it is repetitive and for the most part, I think you should be able to figure out what's generally going on. 
pa_pop %>%
  st_transform(crs = "+init=epsg:4326") %>%
  leaflet(width = "100%") %>%
  addProviderTiles(provider = "CartoDB.Positron") %>%
  addPolygons(popup = ~ str_extract(NAME, "^([^,]*)"),
              stroke = FALSE,
              smoothFactor = 0,
              fillOpacity = 0.7,
              color = ~ pal(estimate)) %>%
  addLegend("bottomright", 
            pal = pal, 
            values = ~ estimate,
            title = "County Populations",
            opacity = 1)

Well the output makes sense. PA is a pretty rural state barring areas close to Washington DC and just Pittsburgh. Let’s compare it to an a state like Florida that is overtly populated:

# Again, use the acs_16 data set we created
fl_pop <- get_acs(geography = "county", 
                  variables = "B01003_001", 
                  state = "FL",
                  geometry = TRUE)
## Getting data from the 2015-2019 5-year ACS
## Downloading feature geometry from the Census website.  To cache shapefiles for use in future sessions, set `options(tigris_use_cache = TRUE)`.
# Assign out color palette for comparison
pal <- colorNumeric(palette = "plasma", 
                    domain = fl_pop$estimate)
# GIS things again. Notice that the base code is consistent.
fl_pop %>%
  st_transform(crs = "+init=epsg:4326") %>%
  leaflet(width = "100%") %>%
  addProviderTiles(provider = "CartoDB.Positron") %>%
  addPolygons(popup = ~ str_extract(NAME, "^([^,]*)"),
              stroke = FALSE,
              smoothFactor = 0,
              fillOpacity = 0.7,
              color = ~ pal(estimate)) %>%
  addLegend("bottomright", 
            pal = pal, 
            values = ~ estimate,
            title = "County Populations",
            opacity = 1)