A Map to the Past

Introduction

This is a deeper exploration of mapping with a purpose. In this case, we are going to map the results of the 2016 Presidential Election. For a look at the maps we are going to do, take a look at this Washington Post article.

Preliminary Items

First Things First! Download the script and data set

Please download all of the materials needed for this walkthrough and put them all in a folder by themselves.

Week 7 materials

Set your Working Directory

Your working directory is simply where your script will look for anything it needs like external data sets. There are a few ways to go about doing this which we will cover. However for now, just do the following:

  1. Open up the included script by going to File > Open File or double click the file itself if RStudio is your default program for opening .R files.
  2. To set your working directory:
  • Go to the menu bar and select Session > Set Working Directory > To Source File Location OR
  • run setwd(dirname(rstudioapi::getActiveDocumentContext()$path))

Loading libraries

Go ahead and load these or install and then load them.

library(tidyverse)
library(patchwork) 
library(RColorBrewer) 
library(scales)
library(ggpubr)

We should first take a look at the new ones being used here

Library | Description | Repository | Example
tidyverse | Nope | GitHub | Tutorial
patchwork | Allows for the differing layouts of completed visualizations | GitHub | Vignette
RColorBrewer | Predefined palette originally based on map drawings | GitHub | Interactive
scales | Provides greater functionality to manipulate scales within plots | GitHub | Vignette
ggpubr | Gives predefined settings for publication quality plots | GitHub | Vignette

Loading files

We’ll be using an R data file which, for historical reasons, ends with either .rdata or .rda. The nice aspect of using this format in lieu of a regular data set is that multiple objects, or even a full global environment, can be saved. Generally this ensures that you or someone else is able to pick up a session exactly as it was before with all of the variables, data sets, etc. ready to go. Think of it as a snapshot of the objects in an RStudio session at the moment it was saved as the data file. Loaded packages are not part of the snapshot, so the library() calls above are still needed.

To avoid any conflicts, let’s empty and refresh your Global Environment which you can access by clicking the Environment tab in the top window on the right. Now if there is something in there you absolutely need and cannot get back, then skip this step. If not, please clean it out using

rm(list = ls())

We’ll be using the file election2016.RData. To load it up, run the following

load("election2016.RData")

If that does not work, go to the actual data set, right click and open it with RStudio.

Again if you have familiarity with the structure of your directory, then use the appropriate path. Notice that you don’t have to say something like variable <- load("/data/election2016.RData") because you aren’t just loading up a data set!
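To see why no assignment is needed, here is a minimal sketch of a save and load round trip. The object names (toy_votes, toy_note) and the temporary file are made up for illustration and are not part of the course materials.

```r
# Toy objects standing in for a saved session (hypothetical names)
toy_votes <- data.frame(st = c("AL", "AK"), votes = c(9L, 3L))
toy_note  <- "a second object saved alongside the data"

# Save both objects into one temporary .RData file
tmp <- tempfile(fileext = ".RData")
save(toy_votes, toy_note, file = tmp)

# Remove them, then bring them back with load()
rm(toy_votes, toy_note)
restored <- load(tmp)

# load() returns only the NAMES of what it restored;
# the objects themselves reappear in the environment under their old names
restored
exists("toy_votes")
```

This is why variable <- load(...) is rarely what you want: variable would hold a character vector of names rather than the data.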

Left field reminder to save your API key

We will not be using the Census API this time around but please save your key for future data pulls. You’ll likely receive a prompt to save it using

# Input your Census key generally

# Input your Census key for the tidycensus package
# Load the library
library(tidycensus)

# Save your key to .Renviron
census_api_key("YOUR CENSUS API KEY", install = TRUE)

# Reload the .Renviron marker
readRenviron("~/.Renviron")
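If you want to confirm later that the key was saved, a quick check along these lines should work. It assumes the key is stored under the environment variable name CENSUS_API_KEY, which is the name tidycensus uses when install = TRUE.

```r
# Read the key from the environment (assumed variable name: CENSUS_API_KEY)
key <- Sys.getenv("CENSUS_API_KEY")

# Sys.getenv() returns an empty string when the variable is not set
if (nzchar(key)) {
  message("Census key found.")
} else {
  message("No Census key found; run census_api_key() with install = TRUE first.")
}
```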

Walkthrough

Notes

  1. The very first thing to mention here is that the states of Maine and Nebraska split electoral college votes. In 2016, Trump took all 5 of Nebraska’s votes while Maine was split with 3 for Clinton and 1 for Trump. As a result, the difference is used in the data set.

  2. The columns that are used within the data set were pulled from the Federal Election Commission (FEC) API. Generally if you know how to use one API, most others follow the same structure so the FEC API is fair game for those of you who got somewhat comfortable with the Census API. For an ever growing list of available US government APIs, data.gov is a great place to go to.

Getting Ready

First take a look at the variables within the data set itself

names(election2016)
##  [1] "state"        "votes"        "st"           "fips"         "total_vote"  
##  [6] "vote_margin"  "winner"       "party"        "pct_margin"   "r_points"    
## [11] "d_points"     "pct_clinton"  "pct_trump"    "pct_johnson"  "pct_other"   
## [16] "clinton_vote" "trump_vote"   "johnson_vote" "other_vote"   "ev_dem"      
## [21] "ev_rep"       "ev_oth"       "census"       "region"

or view the entire set in a spreadsheet like form

View(election2016) 

Since we’re looking at the general election, let’s go ahead and define the two major parties with colors

partyColors <- c("#2E74C0", "#CB454A") 
scales::show_col(partyColors)

Part I: Lollipops

A lollipop plot, or Cleveland dot plot, is essentially a barplot’s cousin where the bar is transformed into a line with a dot at the end. Let’s draw one using the data.

Notice the ! here. It means not, and in this case R reads it like

  1. look at all values in the election2016 data set and then
  2. do not include the data point DC from the st, or state, column

ggplot(data = subset(election2016, !st %in% "DC"), 
       mapping = aes(x = r_points,
                     xend=r_points,
                     y = reorder(state, r_points),
                     color = party)) +
  geom_vline(xintercept = 0, color = "gray30") +
  geom_point(size = 3) +
  scale_color_manual(values = partyColors)
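The ! and %in% combination in the subset() call above can be checked on its own. This is a small sketch using a made-up three-row data frame (toy), not the real election2016 data.

```r
# Made-up data: three rows, one of them DC
toy <- data.frame(st = c("AL", "DC", "WY"),
                  r_points = c(10, -80, 40),
                  stringsAsFactors = FALSE)

toy$st %in% "DC"    # FALSE  TRUE FALSE
!toy$st %in% "DC"   #  TRUE FALSE  TRUE (%in% is evaluated first, then negated)

# subset() keeps only the rows where the condition is TRUE
no_dc <- subset(toy, !st %in% "DC")
no_dc$st            # "AL" "WY"
```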

That’s not bad but it is pretty bland so let’s make it nicer

ggplot(data = subset(election2016, !st %in% "DC"), 
       mapping = aes(x = r_points,
                     xend=r_points,
                     y = reorder(state, r_points),
                     color = party)) + 
  geom_vline(xintercept = 0, color = "gray30") +
  geom_segment(aes(x = 0, y = state, xend = r_points, 
                   yend = state), color = "grey70") +
  geom_point(shape = 21, size = 3, fill="white", stroke = 1) + 
  scale_color_manual(values = partyColors) +
  scale_x_continuous(labels = function(x) paste0(abs(x), "%"), limits = c(-35, 50)) +
  theme_minimal() +
  theme(
    legend.position = "none",
    panel.border = element_blank(),
    axis.ticks = element_blank()
  ) +
  xlab(NULL) +
  ylab("State")

Oh no, what happened? The states are in reverse alphabetical order. Think about how you can fix it to display the data in a better way, try it out and then see a solution below.

Better representation
ggplot(data = subset(election2016, !st %in% "DC"),  
       mapping = aes(x = r_points,  
                     xend=r_points,  
                     y = reorder(state, r_points),  
                     color = party)) +  
  geom_vline(xintercept = 0, color = "gray30") +  
  geom_segment(aes(x = 0, y = reorder(state, r_points), xend = r_points, 
                   yend = reorder(state, r_points)), color = "grey70") +  
  # Here is the fix but what do you think happened?  
  geom_point(shape = 21, size = 3, fill="white", stroke = 1) +  
  scale_color_manual(values = partyColors) +  
  scale_x_continuous(labels = function(x) paste0(abs(x), "%"), limits = c(-35, 50)) +  
  theme_minimal() +  
  theme(  
    legend.position = "none",  
    panel.border = element_blank()  
  ) +  
  xlab("Party Affiliation") +  
  ylab("State")

If you thought that the fix was reordering the state column by r_points inside geom_segment() as well, then you were correct and win eleventy billion points.
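To see what reorder() actually changes, compare the factor levels with and without it. This sketch uses a made-up three-state data frame (toy); the numbers are illustrative only.

```r
# Made-up margins for three states
toy <- data.frame(state = c("Alabama", "California", "Wyoming"),
                  r_points = c(28, -30, 46),
                  stringsAsFactors = FALSE)

# Without reorder(): levels are alphabetical
alpha_levels <- levels(factor(toy$state))
alpha_levels   # "Alabama" "California" "Wyoming"

# With reorder(): levels follow r_points from lowest to highest
value_levels <- levels(reorder(toy$state, toy$r_points))
value_levels   # "California" "Alabama" "Wyoming"
```

A discrete axis is drawn in level order, so when one layer uses the reordered version and another uses the plain column, the two layers disagree about where each state belongs.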

Side Quest

We can make this plot prettier. First load (or install and load) the following library.

library(ggthemes)

For more information about themes, please see the ggthemes vignette. Anyway, if you are aware of Nate Silver, the data analyst and all-around predictive guru who uses analytics and data visualizations for multiple purposes, then the website FiveThirtyEight may be familiar. We’ll redraw the plot above using a theme commonly found on the FiveThirtyEight website using the options scale_color_fivethirtyeight() and theme_fivethirtyeight().

ggplot(data = subset(election2016, !st %in% "DC"),
       mapping = aes(x = r_points,
                     xend = r_points,
                     y = reorder(state, r_points),
                     color = party)) + 
  geom_vline(xintercept = 0, color = "gray30") +
  geom_segment(aes(x = 0, y = reorder(state, r_points), xend = r_points, 
                   yend = reorder(state, r_points)), color = "#5e3c58") +
  geom_point(shape = 21, size = 3, fill="white", stroke = 1) + 
  scale_x_continuous(labels = function(x) paste0(abs(x), "%"), 
                     expand = c(0, 0), limits = c(-35, 50)) +
  xlab("Party Affiliation") +
  ylab("State") + 
  scale_color_fivethirtyeight() +
  # Apply the complete theme first; a complete theme added last would
  # overwrite the theme() tweaks below
  theme_fivethirtyeight() +
  theme(
    legend.position = "none",
    panel.border = element_blank(),
    panel.grid.major = element_blank(), 
    panel.grid.minor = element_blank()
  )

Part II: Choropleths

In a nutshell, choropleth comes from the Greek choros (area) and plethos (multitude). Choropleths assign colors to map regions based on a measure. The colors are then used to compare one region to another through the relative value each color represents.

With that in mind, let’s get some map data…

library(maps)
## 
## Attaching package: 'maps'
## The following object is masked from 'package:purrr':
## 
##     map

…for all of the states…

state <- map_data("state")

…and take a look.

head(state)
##        long      lat group order  region subregion
## 1 -87.46201 30.38968     1     1 alabama      <NA>
## 2 -87.48493 30.37249     1     2 alabama      <NA>
## 3 -87.52503 30.37249     1     3 alabama      <NA>
## 4 -87.53076 30.33239     1     4 alabama      <NA>
## 5 -87.57087 30.32665     1     5 alabama      <NA>
## 6 -87.58806 30.32665     1     6 alabama      <NA>

That looks pretty big! We can check the dimensions by

dim(state)
## [1] 15537     6

This time we’re going to plot map data using a different set of commands…well, sort of, in that we’re not flattening the plot; rather, it’s being viewed as if on a globe.

ggplot(data = state,
            aes(x = long, y = lat,
                group = group, fill = region)) +
  geom_polygon(color = "gray90", size = 0.1) +
  coord_map(projection = "albers", lat0 = 39, lat1 = 45) +
  guides(fill = "none") # "none" replaces the deprecated FALSE

You probably noticed the lat0 and lat1 values we defined, but you may not know about albers. I won’t get into the history, but albers refers to the Albers equal-area conic projection, a common choice for maps of the contiguous United States; lat0 and lat1 are its two standard parallels. Play around with some of the other projections, some of which were noted in last week’s assignments page.

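Anyway, we need to get our own data on to the map. The map data stores state names in lowercase in its region column, so first let’s create a matching lowercase region column from the state names

election2016$region <- tolower(election2016$state) # lowercase names to match state$region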

In this case we want the state-level map data to be the primary source since the election2016 data is dependent on states.

state_elec <- left_join(state, election2016) 
## Joining, by = "region"
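Because left_join() keeps every row of the left table (state) and only attaches matches from the right, it is worth checking that the join key lines up. The sketch below uses made-up name vectors (map_regions, election_states) rather than the real data.

```r
# Made-up stand-ins for state$region and election2016$state
map_regions     <- c("alabama", "district of columbia", "wyoming")
election_states <- c("Alabama", "District of Columbia", "Wyoming", "Alaska")

# Same transformation as in the walkthrough
election_regions <- tolower(election_states)

# Map regions with no election row would end up with NA after a left join
unmatched_map <- setdiff(map_regions, election_regions)
unmatched_map        # character(0): every map region has a match

# Election rows with no polygons are simply not drawn
unmatched_election <- setdiff(election_regions, map_regions)
unmatched_election   # "alaska"
```

On the real data, the same two setdiff() calls on unique(state$region) and election2016$region show which names, if any, fail to match.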

Take a look at the result

head(state_elec)
##        long      lat group order  region subregion   state votes st fips
## 1 -87.46201 30.38968     1     1 alabama      <NA> Alabama     9 AL    1
## 2 -87.48493 30.37249     1     2 alabama      <NA> Alabama     9 AL    1
## 3 -87.52503 30.37249     1     3 alabama      <NA> Alabama     9 AL    1
## 4 -87.53076 30.33239     1     4 alabama      <NA> Alabama     9 AL    1
## 5 -87.57087 30.32665     1     5 alabama      <NA> Alabama     9 AL    1
## 6 -87.58806 30.32665     1     6 alabama      <NA> Alabama     9 AL    1
##   total_vote vote_margin winner      party pct_margin r_points d_points
## 1    2123372      588708  Trump Republican      27.73    27.72   -27.72
## 2    2123372      588708  Trump Republican      27.73    27.72   -27.72
## 3    2123372      588708  Trump Republican      27.73    27.72   -27.72
## 4    2123372      588708  Trump Republican      27.73    27.72   -27.72
## 5    2123372      588708  Trump Republican      27.73    27.72   -27.72
## 6    2123372      588708  Trump Republican      27.73    27.72   -27.72
##   pct_clinton pct_trump pct_johnson pct_other clinton_vote trump_vote
## 1       34.36     62.08        2.09      1.46       729547    1318255
## 2       34.36     62.08        2.09      1.46       729547    1318255
## 3       34.36     62.08        2.09      1.46       729547    1318255
## 4       34.36     62.08        2.09      1.46       729547    1318255
## 5       34.36     62.08        2.09      1.46       729547    1318255
## 6       34.36     62.08        2.09      1.46       729547    1318255
##   johnson_vote other_vote ev_dem ev_rep ev_oth census
## 1        44467      31103      9      0      0  South
## 2        44467      31103      9      0      0  South
## 3        44467      31103      9      0      0  South
## 4        44467      31103      9      0      0  South
## 5        44467      31103      9      0      0  South
## 6        44467      31103      9      0      0  South

Here’s a basic test plot.

ggplot(data = state_elec,
            aes(x = long, y = lat,
                group = group, fill = party)) +
      geom_polygon(color = "gray90", size = 0.1) +
      coord_map(projection = "albers", lat0 = 39, lat1 = 45) 

Let’s try it with the colors we assigned for each party.

ggplot(data = state_elec,
             mapping = aes(x = long, y = lat,
                           group = group, fill = party)) +
        geom_polygon(color = "gray90", size = 0.1) +
        coord_map(projection = "albers", lat0 = 39, lat1 = 45) +
        scale_fill_manual(values = partyColors) +
        labs(title = "2016 Presidential Election Results", fill = NULL) +
        theme_map() 

Or better yet, why don’t we go crazy and use the FiveThirtyEight theme!

ggplot(data = state_elec,
       mapping = aes(x = long, y = lat,
                     group = group, fill = party)) +
  geom_polygon(color = "gray90", size = 0.1) +
  coord_map(projection = "albers", lat0 = 39, lat1 = 45) +
  scale_fill_manual(values = partyColors) +
  labs(title = "2016 Presidential Election Results", fill = NULL) +
  theme_map()+
  theme_fivethirtyeight() 

Now that you’ve recovered from that, there are many avenues we could explore using this data set. Here we’ll look at the individual candidates and their winnings by the popular vote.

Let’s look for Trump’s winnings…

tr1 <- ggplot(data = state_elec,
    mapping = aes(x = long, y = lat, group = group, fill = pct_trump)) +
    geom_polygon(color = "gray90", size = 0.1) +
    coord_map(projection = "albers", lat0 = 39, lat1 = 45) +
    theme_map() + 
    theme_fivethirtyeight() +
    scale_fill_gradient(low = "white", high = "#CB454A") +
    labs(title = "Trump vote") + 
    guides(fill = "none") # "none" replaces the deprecated FALSE

…and then for Clinton’s

cl1 <- ggplot(data = state_elec,
             mapping = aes(x = long, y = lat, group = group, fill = pct_clinton)) +
  geom_polygon(color = "gray90", size = 0.1) +
  coord_map(projection = "albers", lat0 = 39, lat1 = 45) + 
  theme_map() + 
  theme_fivethirtyeight() +
  scale_fill_gradient(low = "white", high = "#2E74C0") +
  labs(title = "Clinton vote") +
  guides(fill = "none") # "none" replaces the deprecated FALSE

And finally we’ll plot them both

tr1 + cl1

That looks pretty clean, but something is off, or maybe it is because I’m partially colorblind. The dense color distribution on the Trump map makes it appear that he won nearly everywhere, and since we are looking at popular vote data, that is misleading: Clinton won

nrow(filter(election2016, winner == "Clinton"))  

states. We found this by counting the number of rows after filtering the election2016 data set where the winner was Clinton. Now let’s try to take a look at the legends to get an idea of what may be going wrong. Again here are Trump’s winnings

tr2 <- ggplot(data = state_elec,
             mapping = aes(x = long, y = lat, group = group, fill = pct_trump)) +
  geom_polygon(color = "gray90", size = 0.1) +
  coord_map(projection = "albers", lat0 = 39, lat1 = 45) + 
  theme_map() + 
  theme_fivethirtyeight() +
   theme(
    axis.ticks = element_blank(),
    axis.text = element_blank(),
    panel.border = element_blank(),
    panel.grid.major = element_blank(), 
    panel.grid.minor = element_blank(),
    panel.background = element_rect(fill = "#FFFFFF", color = "#FFFFFF"),
    plot.background = element_rect(fill = "#FFFFFF"),
    legend.background = element_rect(fill = "#FFFFFF")) +
  labs(title = "Trump vote", fill = "Percent") + 
  scale_fill_gradient(low = "white", high = "#CB454A")

and Clinton’s winnings

# Clinton's winnings
cl2 <- ggplot(data = state_elec,
             mapping = aes(x = long, y = lat, group = group, fill = pct_clinton)) +
  geom_polygon(color = "gray90", size = 0.1) +
  coord_map(projection = "albers", lat0 = 39, lat1 = 45) + 
  theme_map() + 
  theme_fivethirtyeight() +
  theme(
    axis.ticks = element_blank(),
    axis.text = element_blank(),
    panel.border = element_blank(),
    panel.grid.major = element_blank(), 
    panel.grid.minor = element_blank(),
    panel.background = element_rect(fill = "#FFFFFF", color = "#FFFFFF"),
    plot.background = element_rect(fill = "#FFFFFF"),
    legend.background = element_rect(fill = "#FFFFFF")) +
  labs(title = "Clinton vote", fill = "Percent") + 
  scale_fill_gradient(low = "#FFFFFF", high = "#2E74C0")

And together

tr2 + cl2

Ah, the percents differ! The range shown in the legend of Trump’s map is narrower than Clinton’s, which yields the false representation. To compare the maps, both should use the same range. To do this, we first check the minimum and maximum values for each former candidate.

trrange <- range(election2016$pct_trump, na.rm = TRUE) # Check the range of Trump's percentages
clrange <- range(election2016$pct_clinton, na.rm = TRUE) # Check the range of Clinton's percentages
range(trrange, clrange) # Check the range of both
## [1]  4.09 90.86

Well it looks like the standard 0-100% will probably work the best here.

tr3 <- ggplot(data = state_elec,
              mapping = aes(x = long, y = lat, group = group, fill = pct_trump)) +
  geom_polygon(color = "gray90", size = 0.1) +
  coord_map(projection = "albers", lat0 = 39, lat1 = 45) + 
  theme_map() + 
  theme_fivethirtyeight() +
  theme(
    axis.ticks = element_blank(),
    axis.text = element_blank(),
    panel.border = element_blank(),
    panel.grid.major = element_blank(), 
    panel.grid.minor = element_blank(),
    panel.background = element_rect(fill = "#FFFFFF", color = "#FFFFFF"),
    plot.background = element_rect(fill = "#FFFFFF"),
    legend.background = element_rect(fill = "#FFFFFF")) +
  labs(title = "Trump vote", fill = "Percent") + 
  scale_fill_gradient(limits=c(0, 100), low = "#FFFFFF", high = "#CB454A")
cl3 <- ggplot(data = state_elec,
                mapping = aes(x = long, y = lat, group = group, fill = pct_clinton)) +
  geom_polygon(color = "gray90", size = 0.1) +
  coord_map(projection = "albers", lat0 = 39, lat1 = 45) + 
  theme_map() + 
  theme_fivethirtyeight() +
   theme(
    axis.ticks = element_blank(),
    axis.text = element_blank(),
    panel.border = element_blank(),
    panel.grid.major = element_blank(), 
    panel.grid.minor = element_blank(),
    panel.background = element_rect(fill = "#FFFFFF", color = "#FFFFFF"),
    plot.background = element_rect(fill = "#FFFFFF"),
    legend.background = element_rect(fill = "#FFFFFF")) +
  labs(title = "Clinton vote", fill = "Percent") + 
  scale_fill_gradient(limits=c(0, 100), low = "#FFFFFF", high = "#2E74C0")
# Plot them both
tr3 + cl3
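Why do shared limits matter? Roughly speaking, a continuous fill scale places each value on the gradient according to where it falls between the lower and upper limit. The helper below (gradient_position, a hypothetical name) is a simplified sketch of that linear rescaling with made-up numbers; it is not the ggplot2 internals.

```r
# Position of a value on a gradient: 0 = lightest color, 1 = darkest color
gradient_position <- function(x, limits) {
  (x - limits[1]) / (limits[2] - limits[1])
}

# The same made-up 60% share under three different sets of limits
narrow <- gradient_position(60, limits = c(30, 70))   # 0.75
wide   <- gradient_position(60, limits = c(20, 100))  # 0.5
shared <- gradient_position(60, limits = c(0, 100))   # 0.6

c(narrow = narrow, wide = wide, shared = shared)
```

The same 60% is drawn darker when the limits are narrow, which is how two maps with different default ranges can give different impressions of similar numbers.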

Now compare all four plots:

tr2 + cl2 + tr3 + cl3 + plot_layout(nrow = 2)

There! That looks better. Now that’s all fine and dandy, but individual maps only tell us about the performance of each candidate amongst the states, not against each other. So what we really need to care about is the margin of victory. To do this, we can pick either the column d_points or r_points. Since the Democrats lost the general election in 2016 and there was a great deal of coverage regarding the popular vote, let’s choose the Republican points as the comparative baseline and use d_points.

lighter_map <- 
  ggplot(data = state_elec,
         mapping = aes(x = long, y = lat, group = group, fill = d_points)) + 
  geom_polygon(color = "gray90", size = 0.1) +
  coord_map(projection = "albers", lat0 = 39, lat1 = 45) + 
  scale_fill_gradient2(
    low = "#CB454A", mid = scales::muted("purple"),
    high = "#2E74C0", breaks = c(-100, -75, -50,-25, 0, 25, 50, 75, 100), 
    labels = c(100, 75, 50, 25, 0, 25, 50, 75, 100)) +
  theme_map() +
  theme_fivethirtyeight() +
  theme(
    axis.ticks = element_blank(),
    axis.text = element_blank(),
    panel.border = element_blank(),
    panel.grid.major = element_blank(), 
    panel.grid.minor = element_blank()
  ) +
  labs(title = "Winning margins",
       subtitle = "(The Lighter Map)",
       fill = "Percent")
lighter_map

Take a moment to look at the map and code. Most of the items are probably familiar by now, barring scale_fill_gradient2. There are many ways to apply gradients within ggplot. If you want to know more, look at the tidyverse page on scale gradients. Anyway, it seems a little hard to discern the difference between the choice of reds and blues, especially with purple being the middle color. Let’s try taking true reds and blues rather than a shade of them.

darker_map <- 
ggplot(data = state_elec,
       mapping = aes(x = long, y = lat, group = group, fill = d_points)) + 
  geom_polygon(color = "gray90", size = 0.1) +
  coord_map(projection = "albers", lat0 = 39, lat1 = 45) + 
  scale_fill_gradient2(low = "red", mid = scales::muted("purple"),
                       high = "blue", breaks = c(-100, -75, -50,-25, 0, 25, 50, 75, 100), 
                       labels = c(100, 75, 50, 25, 0, 25, 50, 75, 100)) +
  theme_map() +
  theme_fivethirtyeight() +
  theme(
    axis.ticks = element_blank(),
    axis.text = element_blank(),
    panel.border = element_blank(),
    panel.grid.major = element_blank(), 
    panel.grid.minor = element_blank(),
    panel.background = element_rect(fill = "#FFFFFF", color = "#FFFFFF"),
    plot.background = element_rect(fill = "#FFFFFF"),
    legend.background = element_rect(fill = "#FFFFFF")) +
  labs(title = "Winning margins",
       subtitle = "(The Darker Map)",
       fill = "Percent")
darker_map

In this case, the range we have includes everyone, so there is no reason to stretch out the percentages, which would make the coloring lighter. However, I still think that the colors for The Darker Map are not bright enough. Maybe the data set is skewing our results. Let’s find out.

# Grab the outliers
outliers = boxplot(election2016$d_points, plot=FALSE)$out 
# Extract the row information from the original data frame
election2016[election2016$d_points %in% outliers,] 
## # A tibble: 1 x 24
##   state        votes st     fips total_vote vote_margin winner party  pct_margin
##   <chr>        <int> <chr> <int>      <int>       <int> <chr>  <chr>       <dbl>
## 1 District of…     3 DC       11     311268      270107 Clint… Democ…       86.8
## # … with 15 more variables: r_points <dbl>, d_points <dbl>, pct_clinton <dbl>,
## #   pct_trump <dbl>, pct_johnson <dbl>, pct_other <dbl>, clinton_vote <int>,
## #   trump_vote <int>, johnson_vote <int>, other_vote <int>, ev_dem <int>,
## #   ev_rep <int>, ev_oth <int>, census <chr>, region <chr>

Well clearly Washington D.C. is the outlier here. What if we remove it…

darker_map_NODC <- ggplot(data = subset(state_elec,
                           !region %in% "district of columbia"),
             aes(x = long, y = lat, group = group, fill = d_points)) +
  geom_polygon(color = "gray90", size = 0.1) +
  coord_map(projection = "albers", lat0 = 39, lat1 = 45) + 
  scale_fill_gradient2( low = "red", mid = scales::muted("purple"),
                        high = "blue", breaks = c(-100, -75, -50,-25, 0, 25, 50, 75, 100), 
                        labels = c(100, 75, 50, 25, 0, 25, 50, 75, 100)) +
  theme_map() +
  theme_fivethirtyeight() +
  theme(
    axis.ticks = element_blank(),
    axis.text = element_blank(),
    panel.border = element_blank(),
    panel.grid.major = element_blank(), 
    panel.grid.minor = element_blank(),
    panel.background = element_rect(fill = "#FFFFFF", color = "#FFFFFF"),
    plot.background = element_rect(fill = "#FFFFFF"),
    legend.background = element_rect(fill = "#FFFFFF")) +  
  labs(title = "Winning margins",
                subtitle =str_wrap("(The Darker Map without Washington D.C.)",30),
       fill = "Percent")

…and replot both darker maps to compare.

darker_map + darker_map_NODC

That made a huge difference, mainly because DC was so extreme an outlier that it stretched the color distribution. You can see evidence of this by looking at the legends below each plot.
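The stretching effect can be sketched with a handful of made-up margins, where the last value plays the role of DC. The code uses boxplot.stats(), which applies the same 1.5 x IQR rule that boxplot() uses to flag outliers.

```r
# Made-up margins; the last value stands in for an extreme case like DC
margins <- c(-28, -15, -5, 2, 10, 20, 87)

# Values flagged by the boxplot rule
out <- boxplot.stats(margins)$out
out              # 87

# Range the color scale has to cover with and without the outlier
full_range    <- range(margins)
trimmed_range <- range(margins[!margins %in% out])
full_range       # -28  87
trimmed_range    # -28  20
```

With the outlier included, most values are squeezed into a small part of the gradient; without it, the same values spread across the whole gradient.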

Part III: Choropleths with Blocks

For this part, please load the following libraries

library(devtools)
## Loading required package: usethis
library(statebins)
library(RColorBrewer)

Then run devtools::install_github("hrbrmstr/statebins") and

library(statebins)

On a side note, there is a version of the package statebins on CRAN but it lacks the functionality we need. More about the new version of statebins here. You do not have to comment out the install line after the first installation; read the message if you attempt to run it again. However, at some point there will be an official version posted to CRAN, so it is recommended that you then uninstall this one using remove.packages("statebins"). If you get an odd error about ggplot2 being corrupt, restart RStudio (make sure to save your workspace first!).

The RColorBrewer package

The RColorBrewer package is one of the most widely used collections of color palettes in R. There is a lot of color theory which amazingly is quite interesting and grounded in science and history, but we simply do not have the bandwidth to cover that in this course. But if you’re interested, check out the PBS Eons clip below which addresses the Purple Earth Hypothesis.

For resources, a few sites you may wish to check out are listed below

You can see the entire set of palettes using

display.brewer.all()

which probably looks better in RStudio than here.

Statebins

We’re going to be using the package statebins but the basic command does not come with a great deal of flexibility. For example

statebins(election2016, value_col="pct_trump", palette = "Reds", round=TRUE) +
  theme_statebins()

is built with the default color palette within R. While it’s fine, without hacking or overwriting the basic commands, we are unable to make changes like expanding the limits on the legend. For that, we need the command geom_statebins which does not exist in the CRAN version at the time of this writing. With that command, we have the ability to work within ggplot2. You should also be aware that ggplot really does not like the base R palette! Below, we’re going to use brewer.pal("all", "Reds") which calls the Reds palette from RColorBrewer

basicRedBrews <- brewer.pal("all", "Reds")
## Warning in brewer.pal("all", "Reds"): n too large, allowed maximum for palette Reds is 9
## Returning the palette you asked for with that many colors

Did you catch the Warning? The RColorBrewer package is nice but limited. We have measures from 50 states but only have 9 colors to account for the differences.

show_col(basicRedBrews)

Now if we apply it to the plot below…

trumpNineRed <- ggplot(election2016, aes(state=state, fill=pct_trump)) +
             geom_statebins() +
             coord_equal() +
             theme_statebins() +
             scale_fill_gradientn(limits = c(0, 100), colors = brewer.pal("all", "Reds")) + 
             labs(title = str_wrap("Trump 2016 Popular Vote Percent Distribution", 30), 
             # Notice the text wrap via the stringr package.
             subtitle ="(Nine Shades of Red)", 
             # cough cough proper APA formatting cough cough
             fill = NULL)
## Warning in brewer.pal("all", "Reds"): n too large, allowed maximum for palette Reds is 9
## Returning the palette you asked for with that many colors

…we get the same warning and the plot looks like

trumpNineRed

Now sometimes it is a good idea to have a small palette even if you have a large data set which I’ll address later on. However, let’s assume we want 50 colors, and in this case 50 shades of red (not to be confused with another color and what I can only assume is a terrible movie). Unless you really enjoy constructing palettes manually (which many people do!), there is a way to have R do it for you - enter the colorRampPalette command.

# Define a range of 50 colors based on the RColorBrewer Reds palette.
fiftyRedBrews <- colorRampPalette(basicRedBrews)(50) 
show_col(fiftyRedBrews)

Now that’s more like it! So let’s try the plot from earlier again:

trumpFiftyRed <- ggplot(election2016, aes(state=state, fill=pct_trump)) +
  geom_statebins() +
  coord_equal() +
  theme_statebins() +
  scale_fill_gradientn(limits = c(0, 100), colors = fiftyRedBrews) +     
  labs(title = str_wrap("Trump 2016 Popular Vote Percent Distribution", 30),
       subtitle ="(50 Shades of Red)",
       fill = NULL)
trumpFiftyRed
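To see what colorRampPalette() does on a smaller scale, here is a sketch that interpolates between just two named colors instead of the nine Brewer reds. The two-color ramp is an illustration only, not the palette used in the plots.

```r
# colorRampPalette() returns a FUNCTION that generates n interpolated colors
ramp <- colorRampPalette(c("white", "red"))

# Ask that function for five colors
five <- ramp(5)
five

length(five)     # 5
five[c(1, 5)]    # "#FFFFFF" "#FF0000": the endpoints are the colors we supplied
```

That is the reason for the doubled parentheses in colorRampPalette(basicRedBrews)(50): the first call builds the generator and the second asks it for 50 colors.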

Now compare the other nine color plot versus this newer version with 50.

trumpNineRed + trumpFiftyRed

Did it make a difference? Well that depends on whom you ask. Even if I cannot see one, that does not imply that others cannot. With that said, there are situations where having a smaller palette makes sense. One example of this is when you want to explore groups

threeRedBrews <- brewer.pal(3, "Reds") # We're taking three evenly distributed colors
show_col(threeRedBrews)

trumpThreeRed <- ggplot(election2016, aes(state=state, fill=pct_trump)) +
  geom_statebins() + 
  coord_equal() + # If you run only the ggplot and geom_statebins lines, it looks rectangular.
  theme_statebins() + # This is akin to the base ggplot2 theme_minimal()
  scale_fill_gradientn(limits = c(0, 100), colors = threeRedBrews) +
  labs(title = str_wrap("Trump 2016 Popular Vote Percent Distribution",30),
       subtitle ="(Three Shades of Red)",
       fill = NULL) 
trumpThreeRed

From the plot, the shading indicates that the Midwest and South went more for Trump than the coastal areas. This brings us to a good point: your choice of palette, in both color and range, is situationally dependent. In a nutshell, never plot blindly and always think about what you want people to see! Remember, you (should) know your data inside and out, but

  • does that understanding translate over into your visualization and

  • is the result visually appealing?
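One caveat when using a small palette to explore groups: scale_fill_gradientn() still interpolates between the colors it is given, so three colors produce a smooth gradient rather than three distinct bins. If explicit groups are what you want people to see, one option is to bin the values first. The sketch below uses made-up percentages and hypothetical cut points (45 and 55); neither comes from the election2016 data.

```r
# Made-up vote percentages
pct <- c(28, 41, 47, 52, 63, 68)

# Bin into three labelled groups; the cut points are an arbitrary choice
groups <- cut(pct,
              breaks = c(0, 45, 55, 100),
              labels = c("low", "middle", "high"))

groups
table(groups)    # two values fall in each group
```

A factor like groups could then be mapped to fill with a discrete scale such as scale_fill_brewer(palette = "Reds").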

Also, never assume that the same ranges will give you similar results, as we’ll see below. First, however, let’s perform the same type of plot for Clinton.

basicBlueBrews <- brewer.pal("all", "Blues")
## Warning in brewer.pal("all", "Blues"): n too large, allowed maximum for palette Blues is 9
## Returning the palette you asked for with that many colors
show_col(basicBlueBrews)

fiftyBlueBrews <- 
  colorRampPalette(basicBlueBrews)(50)
show_col(fiftyBlueBrews)

clintonFiftyBlue <- 
  ggplot(election2016, aes(state=state, fill=pct_clinton)) +
  geom_statebins() +
  coord_equal() +
  theme_statebins() +
  scale_fill_gradientn(limits = c(0, 100), colors = fiftyBlueBrews) +
  labs(title = str_wrap("Clinton 2016 Popular Vote Percent Distribution", 30),
       subtitle ="(50 Shades of Blue)",
       fill = NULL)
clintonFiftyBlue

threeBlueBrews <- brewer.pal(3, "Blues")
show_col(threeBlueBrews)

clintonThreeBlue <- ggplot(election2016, aes(state=state, fill=pct_clinton)) +
  geom_statebins() +
  coord_equal() +
  theme_statebins() +
  scale_fill_gradientn(limits = c(0, 100), colors = threeBlueBrews) +
  labs(title = str_wrap("Clinton 2016 Popular Vote Percent Distribution", 30),
       subtitle ="(Three Shades of Blue)",
       fill = NULL)
clintonThreeBlue

Now to view them all

trumpFiftyRed + clintonFiftyBlue + trumpThreeRed + clintonThreeBlue + plot_layout(ncol = 2)

Looking at these, it appears that regional differences are best shown by clintonFiftyBlue and trumpThreeRed, which clearly do not have the same number of colors or the same corresponding distributions.

trumpThreeRed + clintonFiftyBlue

How can we know for sure? Let’s check out the distributions for each candidate, beginning with Trump.

ggplot(election2016) +
  geom_bar(mapping = aes(x=state, y=pct_trump, fill = pct_trump), stat = "identity") +
  scale_fill_gradientn(limits = c(0, 100), colors = threeRedBrews) +
  facet_grid(~ census, drop = TRUE, scales = "free_x") +
  theme_bw() +
  theme(
    axis.text = element_blank(),
    axis.ticks = element_blank(),
    panel.grid.major = element_blank(), 
    panel.grid.minor = element_blank(),
    legend.position = "bottom"
  ) +
  xlab("Regions") +
  ylab("") + 
  labs(title = "2016 Presidential Results for Trump", 
       subtitle ="Distribution by US Region",
       fill = NULL)

Trump’s darker reds are relatively consistent in the Midwest and South but split in the West. If we plot Clinton’s pattern, you would expect to see the opposite and that is exactly what is plotted.

ggplot(election2016) +
  geom_bar(mapping = aes(x=state, y=pct_clinton, fill = pct_clinton), stat = "identity") +
  scale_fill_gradientn(limits = c(0, 100), colors = fiftyBlueBrews) +
  facet_grid(~ census, drop = TRUE, scales = "free_x") +
  theme_bw() +
  theme(
    axis.text = element_blank(),
    axis.ticks = element_blank(),
    panel.grid.major = element_blank(), 
    panel.grid.minor = element_blank(),
    legend.position = "bottom"
  ) +
  xlab("Regions") +
  ylab("") + 
  labs(title = "2016 Presidential Results for Clinton", 
       subtitle ="Distribution by US Region",
       fill = NULL)

Clinton’s darker blues are relatively consistent and concentrated in the Northeast but split in the West. Both of these make sense since Trump won handily in the Midwest and South while Clinton dominated the Northeast. But just to be sure, we’ll look at the comparative data set without DC (since it was an outlier).

ggplot(data = subset(election2016,
                       !region %in% "district of columbia"),
         aes(state = state, fill = d_points)) +
  geom_statebins() +
  coord_equal() +
  theme_statebins() +
  scale_fill_gradient2(low = "red", mid = scales::muted("purple"),
                        high = "blue", breaks = c(-100, -75, -50,-25, 0, 25, 50, 75, 100), 
                        labels = c(100, 75, 50, 25, 0, 25, 50, 75, 100)) +
  labs(fill = NULL)

Sure, this is a method to see the states that are primarily Democrat or Republican. But it’s also a way to view how mixed (or purple) we are as a nation in terms of our voting.
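
The breaks and labels in the diverging scale above were typed out by hand. As a small sketch of an alternative, they can be generated so the two vectors cannot drift out of step. The names margin_breaks and margin_labels are made up for this illustration.

```r
# Breaks for a diverging scale running from -100 to 100 in steps of 25
margin_breaks <- seq(-100, 100, by = 25)

# Legend labels: show the size of the margin without the sign
margin_labels <- abs(margin_breaks)

margin_breaks
margin_labels
```

These could then be handed to the scale as breaks = margin_breaks and labels = margin_labels.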

Part IV: Augmenting Maps

Now, staying with the 2016 election, we’re going to try to produce a better version of the last plot in Part III. What’s the problem? Well, many people concentrate on the popular vote. Regardless of your stance on the electoral college, that representative system is what we currently live under, and it did shape the outcome of the 2016 presidential election. Recall that Clinton amassed

sum(election2016$clinton_vote)
## [1] 65853625

votes while Trump garnered

sum(election2016$trump_vote)
## [1] 62985106

votes which told us that Clinton won the popular vote by

sum(election2016$clinton_vote)-sum(election2016$trump_vote)
## [1] 2868519

votes. To explore the electoral college and its impact, a static ggplot map does not allow for more than three variables as we saw in a prior session. So we have to figure out another way to show size. Why don’t we do it by changing the size of the states?

First let’s get a relatively underutilized set of data from the maps package.

mapbase <- map_data("state.vbm") %>% 
  rename(state = region) %>% # Rename the column region to state
  select_if(~sum(!is.na(.)) > 0) # Drop any columns where the entire set is NA

This .vbm data set ships with the maps package as a map database (map_data() turns it into a regular data frame); in the real world, it is known as a Visibility Base Map. What is important here is that the state-level longitudinal and latitudinal coordinates have been adjusted so that the states are as close to equal in size as possible while maintaining the overall shape of the United States. The original point of this was to have all the states be of equal size so they could be utilized as symbols. However, someone much more intelligent than me thought about using it to display demographic data as well. The set looks a bit different than the typical maps data set you have seen before, but it’s not far off:

head(mapbase) # or View(mapbase)
##   long lat group order   state
## 1   89  34     1     1 Alabama
## 2   96  34     1     2 Alabama
## 3   98  20     1     3 Alabama
## 4   91  20     1     4 Alabama
## 5   91  16     1     5 Alabama
## 6   89  17     1     6 Alabama
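
The select_if() condition used when building mapbase keeps a column only if it has at least one non-missing value. Here is a base R sketch of the same idea on a tiny toy data frame; the values and the names toy, non_missing, and kept are made up for this illustration.

```r
# Toy data frame (made-up values): the `subregion` column is entirely NA
toy <- data.frame(
  long      = c(89, 96, 98),
  lat       = c(34, 34, 20),
  subregion = c(NA, NA, NA),
  region    = c("Alabama", "Alabama", "Alabama")
)

# Count the non-missing values in each column ...
non_missing <- colSums(!is.na(toy))

# ... and keep only the columns that have at least one
kept <- toy[, non_missing > 0, drop = FALSE]
names(kept)
```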

Take a look at the states now:

map("state.vbm", fill=T, col="white")

Or if you prefer a bit of color in your life:

RViridisD <- viridis_pal(option = "D")(50) # To view use show_col(RViridisD)
map("state.vbm", fill=T, col=RViridisD)

We could have used some other versions of the Viridis set. Try the above for options A, B, C, D, or E. Alternatively, you can use a site like this to generate small or large color palettes.

However, I urge you to follow elements of proper color theory as outlined here with examples.
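
If you would rather compare the Viridis options side by side than rerun the line one letter at a time, something along these lines should work. It is only a sketch: it assumes the scales package is installed, and the names option_letters and palettes are made up for this illustration.

```r
library(scales)

# One five-color palette per viridis option; list names follow the option letters
option_letters <- c("A", "B", "C", "D", "E")
palettes <- lapply(option_letters, function(opt) viridis_pal(option = opt)(5))
names(palettes) <- option_letters

palettes$D   # inspect one; show_col(palettes$D) draws the swatches
```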

This next data set, state.vbm.center, is made up of centroids. These provide the coordinates of the centers of the 50 states (in alphabetical order).

data(state.vbm.center) # This is a list. Do View(state.vbm.center) to see it.

Now we have multiple data files, so we have to join them somehow. First we’ll look at the set that was just pulled and use only what we need. Extra columns and general data points that we don’t need just add to the amount of time it takes for R to get things done.
Below we are going to convert the list state.vbm.center to a data frame using pipes. Recall that all pipes do is provide shortcuts for multiple regular R commands. The steps are as follows:

  • Call the state.vbm.center data set

  • Force (aka coerce) the list to a data frame

  • Add in state names which are given by the variable state that we renamed earlier.

  • Pull everything into a variable called truncated_map

truncated_map <- state.vbm.center %>% 
                    as.data.frame() %>% 
                    mutate(state = unique(mapbase$state))

Let’s take a look at what we did.

head(truncated_map) # or View(truncated_map)
##    x  y      state
## 1 93 27    Alabama
## 2 10 19     Alaska
## 3 32 34    Arizona
## 4 76 33   Arkansas
## 5 13 42 California
## 6 46 51   Colorado
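
One thing worth keeping in mind about the conversion above is that the state names are attached purely by position, so it only works because both sources are in the same (alphabetical) order. Here is a base R sketch of the same steps using the first three centroids shown above; the names centers_toy, names_toy, and centers_df are made up for this illustration.

```r
# The first three centroids from the output above, stored as a list
centers_toy <- list(x = c(93, 10, 32), y = c(27, 19, 34))
names_toy   <- c("Alabama", "Alaska", "Arizona")

# Coerce the list to a data frame: one column per list element
centers_df <- as.data.frame(centers_toy)

# The names are attached by position, so the lengths have to agree
stopifnot(nrow(centers_df) == length(names_toy))
centers_df$state <- names_toy

centers_df
```

A length check like the one here will not catch a wrong ordering, but it does catch the case where one source has more entries than the other.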

Remember that truncated_map is simply a conversion of the list state.vbm.center to a data frame.
Now let’s merge some data sets by using joins. There are many ways to accomplish the end goal, which is to put mapbase (state level map data), election2016 (our current data set with various types of election information by state except for electoral college votes), and truncated_map (the centroid data set) together into one cohesive data frame (or file if you prefer). You could arguably download all three and do it in, say, Excel, but it would take quite a while and, unless there has been a change in a recent version, it appears you have to do it by coding in a language called Visual Basic within the software anyway. Joins are a great way to have computers handle that nonsense and, in general, if you ever come across files that you need merged, this type of logic and command structure applies in that situation too! OK, here we go!
Step 1: Join mapbase (state level map data) with election2016 (current election data set) and call it merged_electoral

merged_electoral <- 
  election2016 %>% 
  # Since DC is not a state, it is not included in the map data set,
  # so we need to get rid of it.
  filter(st != "DC") %>% 
  # It's a right join since the mapbase data has the coordinates to draw our map
  right_join(mapbase) %>% 
  # Drop any columns where the entire set is NA
  select_if(~sum(!is.na(.)) > 0) 
## Joining, by = "state"

Let’s take a look at what we did.

head(merged_electoral) # or View(merged_electoral)
## # A tibble: 6 x 28
##   state   votes st     fips total_vote vote_margin winner party      pct_margin
##   <chr>   <int> <chr> <int>      <int>       <int> <chr>  <chr>           <dbl>
## 1 Alabama     9 AL        1    2123372      588708 Trump  Republican       27.7
## 2 Alabama     9 AL        1    2123372      588708 Trump  Republican       27.7
## 3 Alabama     9 AL        1    2123372      588708 Trump  Republican       27.7
## 4 Alabama     9 AL        1    2123372      588708 Trump  Republican       27.7
## 5 Alabama     9 AL        1    2123372      588708 Trump  Republican       27.7
## 6 Alabama     9 AL        1    2123372      588708 Trump  Republican       27.7
## # … with 19 more variables: r_points <dbl>, d_points <dbl>, pct_clinton <dbl>,
## #   pct_trump <dbl>, pct_johnson <dbl>, pct_other <dbl>, clinton_vote <int>,
## #   trump_vote <int>, johnson_vote <int>, other_vote <int>, ev_dem <int>,
## #   ev_rep <int>, ev_oth <int>, census <chr>, region <chr>, long <dbl>,
## #   lat <dbl>, group <dbl>, order <int>
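
Notice that every Alabama row repeats the same election values. That is what a right join does when the right-hand table has many rows per state. Here is a small sketch with toy tables, assuming dplyr is installed; results_toy, outline_toy, and joined_toy are made-up names, and the winner values are only there to have something to carry across.

```r
library(dplyr)

# Toy stand-ins for election2016 (one row per state) and
# mapbase (several outline points per state)
results_toy <- tibble(
  state  = c("Alabama", "Alaska", "District of Columbia"),
  winner = c("Trump", "Trump", "Clinton")
)
outline_toy <- tibble(
  state = c("Alabama", "Alabama", "Alabama", "Alaska", "Alaska"),
  long  = c(89, 96, 98, 5, 12),
  lat   = c(34, 34, 20, 15, 15)
)

# right_join() keeps every row of the right-hand table (the outline points)
# and copies the matching left-hand columns onto each of them
joined_toy <- results_toy %>%
  right_join(outline_toy, by = "state")

nrow(joined_toy)           # one row per outline point
unique(joined_toy$state)   # rows with no outline points do not appear
```

Passing by = "state" explicitly, as done here, also silences the "Joining, by" message.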

Step 2: Join merged_electoral (state level map data with election data) with truncated_map (the centroid data set) and call it merged_map

# It's a left join since, again, the merged_electoral data has the
# coordinates to draw our map
merged_map <- merged_electoral %>% 
  left_join(truncated_map)
## Joining, by = "state"

OK we’re done and have our full data set. Again, let’s take a look at what we did

head(merged_map) # or View(merged_map)
## # A tibble: 6 x 30
##   state   votes st     fips total_vote vote_margin winner party      pct_margin
##   <chr>   <int> <chr> <int>      <int>       <int> <chr>  <chr>           <dbl>
## 1 Alabama     9 AL        1    2123372      588708 Trump  Republican       27.7
## 2 Alabama     9 AL        1    2123372      588708 Trump  Republican       27.7
## 3 Alabama     9 AL        1    2123372      588708 Trump  Republican       27.7
## 4 Alabama     9 AL        1    2123372      588708 Trump  Republican       27.7
## 5 Alabama     9 AL        1    2123372      588708 Trump  Republican       27.7
## 6 Alabama     9 AL        1    2123372      588708 Trump  Republican       27.7
## # … with 21 more variables: r_points <dbl>, d_points <dbl>, pct_clinton <dbl>,
## #   pct_trump <dbl>, pct_johnson <dbl>, pct_other <dbl>, clinton_vote <int>,
## #   trump_vote <int>, johnson_vote <int>, other_vote <int>, ev_dem <int>,
## #   ev_rep <int>, ev_oth <int>, census <chr>, region <chr>, long <dbl>,
## #   lat <dbl>, group <dbl>, order <int>, x <dbl>, y <dbl>

Notice that we have two sets of coordinates now: the ones for the states (long and lat), which differ from row to row since they are used to draw the outline, and the centroids (x and y), which do not differ except by state and are simply the center point of each state. We need to “scale” the standard outline of each state, given by long and lat, by reducing its size proportionally. To do that, we need a starting point and an ending point.

Well, we have the starting point, which is just the standard outline of each state that we saw earlier:

ggplot(merged_map, aes(long, lat, group = group, fill=state)) + 
  geom_polygon(fill="#FFFFFF", color="#000000") +
  coord_fixed() + 
  theme_minimal() +
  theme(
    axis.ticks = element_blank(),
    axis.text = element_blank(),
    panel.border = element_blank(),
    panel.grid.major = element_blank(), 
    panel.grid.minor = element_blank()
  ) +
  guides(fill = "none") +  
  xlab(NULL) +
  ylab(NULL)

and we certainly have an ending point, which is given by the centroids.

ggplot(merged_map, aes(x, y, group = group, fill=state)) + 
  geom_point(color="#8b0a50", size=1) +
  coord_fixed() + 
  theme_minimal() +
  theme(
    axis.ticks = element_blank(),
    axis.text = element_blank(),
    panel.border = element_blank(),
    panel.grid.major = element_blank(), 
    panel.grid.minor = element_blank()
  ) +
  guides(fill = "none") +  
  xlab(NULL) +
  ylab(NULL)

Again, these denote the center point of each state, as overlaying them on the outlines shows:

ggplot(merged_map, aes(group = group, fill=state)) + 
  geom_polygon(aes(long, lat), fill="#FFFFFF", color="#000000") +
  geom_point(aes(x, y), color="#8b0a50", size=1) +
  coord_fixed() + 
  theme_minimal() +
  theme(
    axis.ticks = element_blank(),
    axis.text = element_blank(),
    panel.border = element_blank(),
    panel.grid.major = element_blank(), 
    panel.grid.minor = element_blank()
  ) +
  guides(fill = "none") +  
  xlab(NULL) +
  ylab(NULL)

This implies that the largest any state representation can be is the outline and the smallest is the centroid. So within each state, we scale the long and lat points to be closer to the centroid by some factor, which in this case is based on the state’s electoral votes.
First we’ll do it for Trump…

Trump_rescaledByEC <- merged_map %>% 
  group_by(state) %>% 
  filter(party == "Republican") %>% # Only include where Republicans won
  # Scaling factor: number of electoral votes x 10 divided by the total electoral votes
  mutate(longscale = (votes*10/538)*(long - x) + x,
         latscale = (votes*10/538)*(lat - y) + y) 

…then view the result…

head(Trump_rescaledByEC) # or View(Trump_rescaledByEC)
## # A tibble: 6 x 32
## # Groups:   state [1]
##   state   votes st     fips total_vote vote_margin winner party      pct_margin
##   <chr>   <int> <chr> <int>      <int>       <int> <chr>  <chr>           <dbl>
## 1 Alabama     9 AL        1    2123372      588708 Trump  Republican       27.7
## 2 Alabama     9 AL        1    2123372      588708 Trump  Republican       27.7
## 3 Alabama     9 AL        1    2123372      588708 Trump  Republican       27.7
## 4 Alabama     9 AL        1    2123372      588708 Trump  Republican       27.7
## 5 Alabama     9 AL        1    2123372      588708 Trump  Republican       27.7
## 6 Alabama     9 AL        1    2123372      588708 Trump  Republican       27.7
## # … with 23 more variables: r_points <dbl>, d_points <dbl>, pct_clinton <dbl>,
## #   pct_trump <dbl>, pct_johnson <dbl>, pct_other <dbl>, clinton_vote <int>,
## #   trump_vote <int>, johnson_vote <int>, other_vote <int>, ev_dem <int>,
## #   ev_rep <int>, ev_oth <int>, census <chr>, region <chr>, long <dbl>,
## #   lat <dbl>, group <dbl>, order <int>, x <dbl>, y <dbl>, longscale <dbl>,
## #   latscale <dbl>
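
Since longscale and latscale are cut off in the printout, here is the arithmetic behind them worked through for a single point. This is only a sketch: shrink_toward() is a made-up helper that repeats the expression inside mutate(), and the inputs are Alabama's first outline point (long 89, lat 34), its centroid (93, 27), and its 9 electoral votes from the outputs above.

```r
# Shrink a coordinate toward a centroid by a factor based on electoral votes.
# Same arithmetic as the mutate() call; the function name is made up here.
shrink_toward <- function(coord, center, votes, total = 538, boost = 10) {
  (votes * boost / total) * (coord - center) + center
}

# First Alabama outline point, pulled toward the Alabama centroid
shrink_toward(89, 93, votes = 9)   # scaled longitude
shrink_toward(34, 27, votes = 9)   # scaled latitude

# Scaling factors for 9 and 55 electoral votes
c(Alabama = 9, California = 55) * 10 / 538
```

A factor of 0 would collapse a state onto its centroid and a factor of 1 would leave the outline untouched. With the multiplier of 10, any state with more than 53.8 electoral votes gets a factor slightly above 1, so California (55) is drawn marginally larger than its original outline.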

…and finally plot.

TrumpByState_EC <- ggplot(Trump_rescaledByEC, aes(long, lat, group = group, fill = pct_trump)) + 
#  geom_path() + # Enable if you want to see the original outline
  geom_polygon(aes(longscale, latscale)) +
  coord_fixed() + 
  theme_void() + 
  scale_fill_gradientn(limits = c(0, 100), colors = threeRedBrews) +
  labs(title = str_wrap("Trump 2016 Popular Vote Resized by Electoral Vote For States Won", 30),
       fill = "Percent") 
TrumpByState_EC

Now let’s do the same for Clinton…

Clinton_rescaledByEC <- merged_map %>% 
  group_by(state) %>% 
  filter(party == "Democrat") %>% # Only include where Democrats won
  mutate(longscale = (votes*10/538)*(long - x) + x,
         latscale = (votes*10/538)*(lat - y) + y) 

…then view the result…

head(Clinton_rescaledByEC) # or View(Clinton_rescaledByEC)
## # A tibble: 6 x 32
## # Groups:   state [1]
##   state      votes st     fips total_vote vote_margin winner  party   pct_margin
##   <chr>      <int> <chr> <int>      <int>       <int> <chr>   <chr>        <dbl>
## 1 California    55 CA        6   14237884     4269978 Clinton Democr…       30.0
## 2 California    55 CA        6   14237884     4269978 Clinton Democr…       30.0
## 3 California    55 CA        6   14237884     4269978 Clinton Democr…       30.0
## 4 California    55 CA        6   14237884     4269978 Clinton Democr…       30.0
## 5 California    55 CA        6   14237884     4269978 Clinton Democr…       30.0
## 6 California    55 CA        6   14237884     4269978 Clinton Democr…       30.0
## # … with 23 more variables: r_points <dbl>, d_points <dbl>, pct_clinton <dbl>,
## #   pct_trump <dbl>, pct_johnson <dbl>, pct_other <dbl>, clinton_vote <int>,
## #   trump_vote <int>, johnson_vote <int>, other_vote <int>, ev_dem <int>,
## #   ev_rep <int>, ev_oth <int>, census <chr>, region <chr>, long <dbl>,
## #   lat <dbl>, group <dbl>, order <int>, x <dbl>, y <dbl>, longscale <dbl>,
## #   latscale <dbl>

…and finally plot.

ClintonByState_EC <- ggplot(Clinton_rescaledByEC, aes(long, lat, group = group, fill = pct_clinton)) + 
  #  geom_path() + # Enable if you want to see the original outline
  geom_polygon(aes(longscale, latscale)) +
  coord_fixed() + 
  theme_void() + 
  scale_fill_gradientn(limits = c(0, 100), colors = fiftyBlueBrews) +
  labs(title = str_wrap("Clinton 2016 Popular Vote Resized by Electoral Vote by State", 30),
       fill = "Percent")
ClintonByState_EC

If we look at them both

TrumpByState_EC + ClintonByState_EC

then we find indicators of how Trump won. Again this is nice but a side-by-side map is only good when assessing each candidate individually and comparisons take effort. I would personally rather be lazy so let’s view them both on the same map.
Here we simply do not subset the data by party.

All_rescaledByEC <- merged_map %>% 
  group_by(state) %>% 
  mutate(longscale = (votes*10/538)*(long - x) + x,
         latscale = (votes*10/538)*(lat - y) + y)

Here’s what it looks like…

head(All_rescaledByEC) # or View(All_rescaledByEC)
## # A tibble: 6 x 32
## # Groups:   state [1]
##   state   votes st     fips total_vote vote_margin winner party      pct_margin
##   <chr>   <int> <chr> <int>      <int>       <int> <chr>  <chr>           <dbl>
## 1 Alabama     9 AL        1    2123372      588708 Trump  Republican       27.7
## 2 Alabama     9 AL        1    2123372      588708 Trump  Republican       27.7
## 3 Alabama     9 AL        1    2123372      588708 Trump  Republican       27.7
## 4 Alabama     9 AL        1    2123372      588708 Trump  Republican       27.7
## 5 Alabama     9 AL        1    2123372      588708 Trump  Republican       27.7
## 6 Alabama     9 AL        1    2123372      588708 Trump  Republican       27.7
## # … with 23 more variables: r_points <dbl>, d_points <dbl>, pct_clinton <dbl>,
## #   pct_trump <dbl>, pct_johnson <dbl>, pct_other <dbl>, clinton_vote <int>,
## #   trump_vote <int>, johnson_vote <int>, other_vote <int>, ev_dem <int>,
## #   ev_rep <int>, ev_oth <int>, census <chr>, region <chr>, long <dbl>,
## #   lat <dbl>, group <dbl>, order <int>, x <dbl>, y <dbl>, longscale <dbl>,
## #   latscale <dbl>

…and here’s the plot.

AllByState_EC <- ggplot(All_rescaledByEC, aes(long, lat, group=group, fill=winner)) + 
  #  geom_path() + # Enable if you want to see the original outline
  geom_polygon(aes(longscale, latscale)) +
  coord_fixed() + 
  theme_void() + 
  scale_fill_manual(values= partyColors) +
  labs(title = "States Won by Each Party",
       subtitle ="with each resized by electoral votes",
       fill = NULL) +
  theme_map()+
  theme_statebins() +
  theme(
    axis.ticks = element_blank(),
    axis.text = element_blank(),
    panel.border = element_blank(),
    panel.grid.major = element_blank(), 
    panel.grid.minor = element_blank()
  ) +
  xlab("") +
  ylab("")
AllByState_EC

But what about the margins, you ask? No, you didn’t ask. Well, here is a plot of those anyway.

MarginsByState_EC <- ggplot(data = All_rescaledByEC, aes(long, lat, group=group, fill = d_points)) + 
  geom_polygon(aes(longscale, latscale)) +
 # geom_polygon(color = "gray90", size = 0.1) +
  coord_fixed() + 
# coord_map(projection = "albers", lat0 = 39, lat1 = -20) + 
  scale_fill_gradient2( low = "red", mid = scales::muted("purple"),
                        high = "blue", breaks = c(-100, -75, -50,-25, 0, 25, 50, 75, 100), 
                        labels = c(100, 75, 50, 25, 0, 25, 50, 75, 100)) +
  theme_map() +
  theme_statebins() +
  theme(
    axis.ticks = element_blank(),
    axis.text = element_blank(),
    panel.border = element_blank(),
    panel.grid.major = element_blank(), 
    panel.grid.minor = element_blank()
  ) +      
  labs(title = "Winning margins by Popular Vote",
       subtitle ="with each state resized by electoral votes",
       fill = "Percent")  +
  xlab("") +
  ylab("")
MarginsByState_EC

And just to compare both for fun…yes I said fun:

AllByState_EC + MarginsByState_EC


  1. Consider just pasting it at the top of your script and leaving it there. Please note that this will not work in an Rmarkdown file or Shiny app.↩︎