Skip to content

Instantly share code, notes, and snippets.

@klmr
Last active May 5, 2022 11:19
Show Gist options
  • Star 2 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save klmr/23ed79f973c75b11b0b5 to your computer and use it in GitHub Desktop.
Save klmr/23ed79f973c75b11b0b5 to your computer and use it in GitHub Desktop.
Gun ownership vs gun deaths reanalysis
---
author: "<a href=\"http://twitter.com/klmr\">@klmr</a>"
title: "Gun deaths and gun ownership in the USA"
date: 2015-06-19
output:
html_document:
theme: readable
---
```{r echo=FALSE}
library(dplyr, warn.conflicts = FALSE, quietly = TRUE)
library(ggplot2, warn.conflicts = FALSE, quietly = TRUE)
library(ggthemes, warn.conflicts = FALSE, quietly = TRUE)
fte = theme_fivethirtyeight() %+replace%
theme(axis.title = element_text(),
axis.title.y = element_text(angle = 90),
rect = element_rect(fill = 'white', colour = 'white', size = 0.5, linetype = 1),
panel.margin.x = NULL,
panel.margin.y = NULL)
theme_set(fte)
scale_colour_discrete = scale_colour_fivethirtyeight
scale_fill_discrete = scale_fill_fivethirtyeight
library(XML, warn.conflicts = FALSE, quietly = TRUE)
options(stringsAsFactors = FALSE)
```
On Twitter, somebody claimed that
<blockquote class="twitter-tweet" lang="en"><p lang="en" dir="ltr">guess what, gun ownership correlates with gun deaths in the US <a href="https://t.co/PmOMEOnMy1">https://t.co/PmOMEOnMy1</a> <a href="http://t.co/LYFtpHGeNy">pic.twitter.com/LYFtpHGeNy</a></p>&mdash; Arthur Charpentier (@freakonometrics) <a href="https://twitter.com/freakonometrics/status/611818369931722752">June 19, 2015</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
But looking at the data, I wasn’t convinced:
<blockquote class="twitter-tweet" data-conversation="none" lang="en"><p lang="en" dir="ltr"><a href="https://twitter.com/freakonometrics">@freakonometrics</a> <a href="https://twitter.com/neilhall_uk">@neilhall_uk</a> Uhm, no. Clearly two clusters, and next to no correlation within each cluster.</p>&mdash; Konrad Rudolph (@klmr) <a href="https://twitter.com/klmr/status/611826272881233920">June 19, 2015</a></blockquote>
I decided to actually look at the data and verify that the “clusters” with Gun
ownership < 20% and > 20%, respectively, were indeed internally uncorrelated.
## Plotting the data
As a first step, I retrieve the data and put it into a usable format. The first
table answers the question,
> Are any firearms now kept in or around your home? Include those kept in a
> garage, outdoor storage area, car, truck, or other motor vehicle?
```{r}
gun_ownership_url = 'http://www.washingtonpost.com/wp-srv/health/interactives/guns/ownership.html'
gun_ownership = readHTMLTable(gun_ownership_url, header = TRUE, which = 1)
gun_ownership = gun_ownership[-1, ]
parse_num = function (x) as.numeric(sub(',', '', x))
gun_ownership = select(gun_ownership, State = 1, Total = 2, Yes = 3,
`Yes %` = 4, No = 5, `No %` = 6) %>%
mutate_each(funs(parse_num), -State)
head(gun_ownership)
```
The “\*” annotation next to a state name indicates that the state possesses
child access prevention laws, according to the website. Let’s annotate this
properly.
```{r}
gun_ownership = gun_ownership %>%
mutate(`Child access prevention` = grepl('\\*$', State),
State = sub('\\*$', '', State))
```
DC is called “The District”. Let’s fix that.
```{r}
gun_ownership[gun_ownership$State == 'The District', 'State'] = 'District of Columbia'
```
Next, here’s the firearms deaths data.
```{r}
# Website actively prevents scraping, but allows downloading data.
#gun_deaths_url = 'http://kff.org/other/state-indicator/firearms-death-rate-per-100000/'
#gun_deaths = readHTMLTable(gun_deaths_url)
# Use manually downloaded CSV output.
gun_deaths = read.csv('raw_data.csv', skip = 3) %>%
select(State = 1, `Deaths per 100000` = 2)
head(gun_deaths)
```
Now merge the data.
```{r}
data = inner_join(gun_ownership, gun_deaths, by = 'State') %>%
select(State,
`Gun ownership (%)` = `Yes %`,
`Child access prevention`,
`Deaths per 100000`)
head(data)
```
Plot the data to verify that it’s roughly the same as that published.
```{r}
plot = ggplot(data, aes(x = `Gun ownership (%)`, y = `Deaths per 100000`)) +
geom_point()
plot
```
The general shape is certainly correct. A few states however have very
different positions. Let’s add state names.
```{r}
state_names_url = 'http://www.infoplease.com/ipa/A0110468.html'
state_names = readHTMLTable(state_names_url, header = TRUE, which = 1) %>%
{.[-1, ]} %>%
select(State = 1, Code = 3)
state_names[state_names$State == 'Dist. of Columbia', 'State'] = 'District of Columbia'
data = inner_join(data, state_names, by = 'State')
plot = plot %+% data
plot + geom_text(aes(x = `Gun ownership (%)` + 2, label = Code), size = 3)
```
This shows the correspondence better. The only clearly moved state is “Arizona”
(AZ). Let’s look at the correlation now.
```{r}
model = `Deaths per 100000` ~ `Gun ownership (%)`
fit = lm(model, data)
p = function (fit) format.pval(anova(fit)$`Pr(>F)`[1], digits = 10)
plot + geom_smooth(method = lm) +
annotate('text', x = 40, y = 3,
label = sprintf('R^2 == %0.2f', summary(fit)$r.squared), parse = TRUE)
```
So, a clear correlation at $P$ = `r p(fit)` significance.
## Clusters
Now let’s examine these clusters. I’m using a gun ownership cut-off of 20%
because that’s the clusters I initially perceived:
```{r}
plot + aes(color = `Gun ownership (%)` < 20)
```
Here I’ll first look at their correlation and then try to explain the clusters.
```{r}
data = data %>%
mutate(`Bottom left` = `Gun ownership (%)` < 20)
bottom_left_fit = lm(model, filter(data, `Bottom left`))
top_right_fit = lm(model, filter(data, ! `Bottom left`))
```
These correlate much less clearly (bottom left: $R^2$ =
`r summary(bottom_left_fit)$r.squared`, top right: $R^2$ =
`r summary(top_right_fit)$r.squared`) but the correlation in
the top right cluster is still significant: $P$ = `r p(top_right_fit)` (the
bottom left cluster isn’t, $P$ = `r p(bottom_left_fit)`).
Let’s look at the clusters on a map:
```{r fig.width=10}
us_map = map_data('state')
map = ggplot(data) +
geom_map(aes(map_id = tolower(State)), map = us_map) +
expand_limits(x = us_map$long, y = us_map$lat) +
coord_fixed()
map + aes(fill = `Gun ownership (%)` < 20)
```
… all states in the Northeast. In fact, if you don’t like guns, or don’t want
to get murdered, simply don’t live in the Midwest:
```{r fig.width=10}
map + aes(fill = `Gun ownership (%)`)
map + aes(fill = `Deaths per 100000`)
```
## Child access prevention laws
Finally let’s look at the states with child access prevention laws.
```{r}
plot + geom_point(aes(color = `Child access prevention`))
```
… unsurprisingly, those are the states with low gun ownership, and fewer
deaths. Unfortunately this once again doesn’t tell us whether these laws
actually contribute to the lower number of guns (the exact opposite is in fact
likely: states with a previously lower number of guns per capita have less
incentive to oppose gun control laws, because they are less vested in gun
rights).
## What does this tell us?
Not much. The correlation between gun possession and gun deaths is real, and
even when excluding the exceptionally sane Northeast states, the correlation
persists (contrary to what I had assumed). What remains unclear is the cause.
Whenever a similar plot gets posted online, the implicit claim is that there’s
a *causal* relationship. But the correlation shows nothing of the sort.
A more comprehensive look at the evidence can be found on [Skeptics Stack
Exchange](http://skeptics.stackexchange.com/q/976/82). The discussion there
explains some of the problems with merely looking at such correlations. The
studies linked there show one thing very clearly: **there is no consensus**.
And if there’s any effect at all, [it will be small in comparison to other
factors such as poverty level
etc.](http://www.jstor.org/stable/3487348?seq=1#page_scan_tab_contents)
This isn’t to say that (much) stricter gun control laws aren’t desirable, nor
that they wouldn’t be effective. However, they are only a very small part of
the story, and cross-state correlation is not good evidence to show their
effectiveness.
In the end, I think a more compelling argument in favour of gun control laws
is the following consideration: would you rather live in a society where
citizens (up to 60%!) place so little trust in the system that they fully
expect to have to defend themselves using extreme violence, or would you
rather live in a society that is peaceful and civilised so that citizens can
live with the expectation that risks to their lives are negligible and they do
not constantly have to fear bodily harm? For me, the answer is clear.
---
[Source as a Gist](https://gist.github.com/23ed79f973c75b11b0b5)
@halisile3
Copy link

Can I use this script for my product page of CCI 500 primers for sale which is hosted on WordPress?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment