Police Killings in Western U.S. During 2015

Kelsey Graczyk, Cydni Mullikin, and Tia Kinilau


This vignette is based on data collected for the 538 story entitled “Where Police Have Killed Americans in 2015” by Ben Casselman available here.

First, we load the required packages to reproduce the analysis.


It is important to note that the data used in our graphs come from the Western states, which include: Oregon, Washington, Idaho, Montana, Wyoming, California, Hawaii, Alaska, Colorado, New Mexico, Utah, Arizona, and Nevada. These graphs do not represent the entire United States. Our data only represents the population proportions of the people in the Western portion of the United States.

Filter the data frame to only the Western U.S.

The data frame shown comes from the fivethirtyeight package using the police_killings data set. This original data frame both includes and focuses on where– state, city, longitude/latitude, and street address– and how the killings occurred by police officers in the U.S. in the year 2015. The police_killings_final data frame that we created filters out only the western portion of the United States and specifies the variables of race/ethnicity, cause of death, armed, and age in order to focus on our interests.

west_states <- c("OR", "WA", "ID", "MT", "WY", "CA", "HI", "AK", "CO", "NM", "UT", "AZ", "NV")
police_killings_final <- police_killings %>%
  filter(state %in% west_states) %>%
  select(raceethnicity, cause, armed, age)

Group by cause of death and armed

This data frame takes the data from police_killings_final and summarizes the total count based off the cause of death as well as the type of weapon the victim was armed with, if any at all. The “count” in this data frame refers to the number of people that fit the specific criteria of each category. The inclusion of tidyr::complete was made in order to have the bars in the plot that follows be of the same width.

police_complete <- police_killings_final %>% 
  group_by(cause, armed) %>% 
  summarize(count = n()) %>% 
  ungroup() %>% 
  tidyr::complete(cause, armed, fill = list(count = 0))
cause armed count
Death in custody Disputed 0
Death in custody Firearm 1
Death in custody Knife 3
Death in custody No 2
Death in custody Non-lethal firearm 0
Death in custody Other 0
Death in custody Vehicle 0
Gunshot Disputed 1
Gunshot Firearm 74
Gunshot Knife 23
Gunshot No 27
Gunshot Non-lethal firearm 6
Gunshot Other 5
Gunshot Vehicle 6
Struck by vehicle Disputed 0
Struck by vehicle Firearm 2
Struck by vehicle Knife 1
Struck by vehicle No 0
Struck by vehicle Non-lethal firearm 0
Struck by vehicle Other 0
Struck by vehicle Vehicle 0
Taser Disputed 0
Taser Firearm 4
Taser Knife 0
Taser No 1
Taser Non-lethal firearm 0
Taser Other 0
Taser Vehicle 0

Relationship between the cause of death and victims being armed

For this plot we wondered what it would look like if we compared the causes of death from a police officer with the type of weapon the victims were armed with, if at all.

police_complete <- police_killings_final %>% 
  group_by(cause, armed) %>% 
  summarize(count = n()) %>% 
  ungroup() %>% 
  tidyr::complete(cause, armed, fill = list(count = 0))
ggplot(data = police_complete,
       mapping = aes(x = cause, y = count))+
  geom_col(aes(fill = armed), position = "dodge", color = "white") +
  labs(x = "Cause of Death", y = "Count", title = "Cause of Death vs Armed Victims")

This plot shows the relationship between the cause of death and the type of weapon the victim was armed with, if any. From this graph we can see that those who were killed by a gunshot had a high count of having a firearm on them during the incident. Furthermore, in the gunshot category, nearly a quarter of the victims were unarmed in general. It is apparent that significantly more deaths occurred by gunshot than any other category– death in custody, struck by vehicle, or by taser. In the instance of categories other than gunshot, it is interesting to note that the victims were only armed with a firearm, knife, or nothing at all.

Now let’s take a zoomed-in look at deaths caused by gunshot

From the previous graph, it was apparent that the most prevalent cause of death was by gunshot. We decided to give you a better look at the people and what types of weapons they were armed with, if any.

police_zoom <- police_killings_final %>%
  filter( cause == "Gunshot")
ggplot(data = police_zoom, mapping = aes(x = armed)) +
  geom_bar(position = "dodge", color = "white", aes(fill = armed))+
  labs(x = "Armed?", y = "Count", title = "Death by Gunshot" )

This graph shows us that those who were killed by gunshot were mostly armed with a firearm. Although this graph serves as a visual representation, it does not give a numerical summary of the data. Below, we created a table that depicts this data more in depth.

Now let’s take a look at the proportion of those killed by gunshot

This data frame focuses on those who were killed by gunshot. It displays the actual proportions of the different weapons carried by those who were shot.

police_proportion <- police_killings_final %>%
  filter( cause == "Gunshot") %>%
  group_by(armed) %>%
  summarize( count = n()) %>%
  mutate(prop = count / sum(count))
armed count prop
Disputed 1 0.0070423
Firearm 74 0.5211268
Knife 23 0.1619718
No 27 0.1901408
Non-lethal firearm 6 0.0422535
Other 5 0.0352113
Vehicle 6 0.0422535

From this data frame, we can see that more than 50% of those who were killed by gunshot were also carrying a firearm, while about 20% were not armed at all.

Now let’s see how race/ethnicity ties in with our previous findings

For this plot, we wondered what it would look like if we compared the causes of death with the races/ethnicities of those who were killed by police officers. We also used the na.omit function to exclude all rows that had entries with missing NA values.

police_killings_omit <- police_killings_final %>%
  na.omit() %>%
  group_by(cause, raceethnicity) %>%
  summarize(count = n()) %>%
  ungroup() %>%
  tidyr::complete(cause, raceethnicity, fill = list(count = 0))
ggplot(data = police_killings_omit, 
       mapping = aes(x = cause, y = count)) +
  geom_col(aes(fill = raceethnicity), position = "dodge", color = "white") +
  labs(x = "Cause of Death", y = "Count", title = "Cause of Death vs Race/Ethnicity")

This second plot shows the relationship between the causes of death and the races/ethnicities of those who were killed by police officers. It is clear that whites had the highest count of death by gunshot in the Western United States. However, this may be due to the fact that the Western U.S. is predominantly made up of the Caucasian race. It may be of importance to note that in today’s media African Americans are suggested to be more susceptible to gunfire than whites– something contested in our graph. It is also interesting to point out that only the white race was killed by being struck by a vehicle. In the case of being killed by a taser, the number of deaths were evenly distributed throughout all races and ethnicities.

Focus on race/ethnicity within the gunshot variable

For this plot, we decided to focus on the races/ethnicities of the victims who were killed by gunshot because this category had the most prominent number of individuals.

police_zoom_RE <- police_killings_final %>%
  na.omit() %>% 
  filter(cause == "Gunshot")
ggplot(data = police_zoom_RE, mapping = aes(x = raceethnicity)) +
  geom_bar(aes(fill = raceethnicity), position = "dodge", color = "white") +   
  labs(x = "Race", y = "Count", title = "Death by Gunshot vs Race/Ethnicity")

This graphic displays that white people have the highest count of deaths by gunshot, followed by Hispanics/Latinos and then African Americans. The two lowest counts came from Asians/Pacific Islanders as well as Native Americans. While it shows that more white people were killed by gunshot in the U.S. in 2015, it is also the case that whites make up the vast majority of the Western Region of the United States.

Concluding thoughts

Considering we have focused on data based on the Western region of the U.S., it’s impossible to say that our information is representative of the entire country. When looking at cause of death and ethnicity, our graphs depict whites being killed by gunshot more than any other race which might not accurately portray the ethnicities and their proportions throughout the entire United States. We were interested in this data frame because of social media’s portrayal of the discrimination against particular ethnicities throughout the country, yet it is clear from the data on the western region that the media has a skewed perspective. In order to have more accurate data and interpretations we would need the numbers from the rest of the U.S. and its parallel population ratios of ethnicities. On a side note, our graph depicting cause of death and armed victims has a value including “other” armed options. We assumed that these options included items such as bats, brass knuckles, crowbars, and other blunt objects able to inflict serious damage.

We also looked up the demographics of the Western U.S. in order to get a better understanding of the data being displayed. On the Kaiser Family Foundation website, we found proportions of races/ethnicities in the Western U.S. that corresponded to the information we mentioned earlier.