@diegovalle
Last active December 31, 2015 14:09
Changes in homicide rates
========================================================
Look, another example of using the [mxmortalitydb](https://github.com/diegovalle/mxmortalitydb) package! Changes and trends in homicide rates in the most violent metro areas or big municipios (note the log scales). For comparison, the homicide rate in the Chicago metro area was 8.2.
```{r}
library(mxmortalitydb)
library(stringr)
library(plyr)
library(ggplot2)
library(grid) ## needed for arrow
```
```{r}
plotMetro <- function(metro.name, metro.areas) {
## Plot the homicide counts in a metro area or municipio
## metro.name - name of the metro area to plot
## metro.areas - data frame containing a list of metro areas in the same format as the metro.area dataframe from mxmortalitydb
## data.frame metro.areas contains the 2010 CONAPO metro areas
df <- merge(injury.intent, metro.areas, by.x = c("state_reg", "mun_reg"),
by.y = c("state_code", "mun_code"))
## Yearly homicides in the selected metro area, by year of registration
df2 <- ddply(subset(df, metro_area == metro.name & intent.imputed == "Homicide"), .(year_reg),
summarise, count = length(state_reg))
ggplot(df2, aes(year_reg, count)) + geom_line() +
labs(title = str_c("Homicides (plus deaths of unknown intent classified as homicide) in\n", metro.name)) +
ylim(0, max(df2$count)) +
ylab("homicide count") +
xlab("year of registration") +
theme_bw()
}
plotChanges <- function(df, metro.areas, country.rate, years) {
## Plot of rates and trends
## df - injury.intent data frame
## metro.areas - data frame containing a list of metro areas in the same format as the metro.areas data frame from mxmortalitydb
## country.rate - rate to show as a gray dotted line
## years - start and end year to compare changes
## When the municipio where the death occurred is unknown, use
## the municipio where it was registered as the place of occurrence
df[df$mun_occur_death == 999, ]$state_occur_death <-
df[df$mun_occur_death == 999, ]$state_reg
df[df$mun_occur_death==999, ]$mun_occur_death <-
df[df$mun_occur_death==999, ]$mun_reg
## Counts of homicide by state and municipio
df <- ddply(subset(df, year_reg %in% years & intent.imputed == "Homicide"),
.(state_occur_death, mun_occur_death, year_reg),
summarise, count = length(state_reg))
## Merge the counts with our fake metro areas
df <- merge(df, metro.areas, by.x = c("state_occur_death", "mun_occur_death"),
by.y = c("state_code", "mun_code"))
## Now get the counts by metro area (which may contain more than one municipio)
df <- ddply(df, .(metro_area, year_reg), summarise,
count = sum(count),
population = sum(mun_population_2010),
rate = count / population * 10^5)
## We are only interested in metro areas that had a homicide rate above 15 in at least one of the years
df <- subset(df, metro_area %in% subset(df, rate > 15)$metro_area)
## Make sure the dataframe is ordered by metro and year
df <- df[order(df$metro_area, df$year_reg),]
## Order the chart by homicide rate in the end year (2012 in the calls below)
df$metro_area <- reorder(df$metro_area, df$rate, function(x) x[[2]])
## Data frame for the arrow structure
arrows <- ddply(df, .(metro_area), summarise,
start = rate[1],
end = rate[2],
metro_area = metro_area[1],
change = ifelse(rate[1] >= rate[2], "decrease" ,"increase"))
ggplot(df,
aes(rate, metro_area, group = as.factor(year_reg), color = as.factor(year_reg))) +
geom_point(aes(size = log(count))) +
labs(title = "Homicide (plus deaths of unknown intent classified as homicide) rates and trends") +
scale_size("number\nof\nhomicides", breaks = c(log(50),log(500),log(3000)),
labels = c(50, 500, 3000)) +
geom_segment(data = arrows, aes(x= start , y = metro_area,
xend = end, yend = metro_area,
group = change, color = change),
arrow=arrow(length=unit(0.3,"cm")), alpha = .8) +
scale_color_manual("year\nand\ntrend",
values = c("gray", "black", "blue", "red")) +
ylab("metro area or municipio") +
xlab("homicide rate") +
#scale_x_log10()+
geom_vline(xintercept = country.rate, linetype = 2, color = "#666666") +
annotate("text", y = "Tapachula", x = 25, label = "country\naverage\n2012",
hjust = -0.1, size = 4, color = "#666666") +
theme_bw()
}
```
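The fallback at the top of `plotChanges` (use the place of registration when the municipio of occurrence is coded 999) can be tried in isolation. This is only a minimal sketch in base R: the `deaths` data frame and every value in it are invented, including the state code paired with the unknown municipio, and only the column names follow `injury.intent`.

```r
## Toy death records; every value here is made up for illustration
deaths <- data.frame(
  state_reg         = c(2, 2, 8, 8),
  mun_reg           = c(4, 4, 37, 37),
  state_occur_death = c(2, 99, 8, 8),   # 99 is just a placeholder
  mun_occur_death   = c(4, 999, 37, 999)
)

## Rows whose municipio of occurrence is coded 999 (unknown)
unknown <- deaths$mun_occur_death == 999

## Compute the mask once, then overwrite both columns with the
## place of registration
deaths$state_occur_death[unknown] <- deaths$state_reg[unknown]
deaths$mun_occur_death[unknown]   <- deaths$mun_reg[unknown]

## Count deaths by (imputed) place of occurrence
counts <- aggregate(n ~ state_occur_death + mun_occur_death,
                    data = transform(deaths, n = 1), FUN = sum)
print(counts)
```

Storing the mask in `unknown` first is a small design choice: the function above recomputes the condition for each assignment, which works because the state column is overwritten before the municipio column.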
```{r}
## Let's treat the big municipalities which are not part of a metro area
## as if they were one
## rename big.municipios to merge with metro.areas
big.municipios2 <- big.municipios
names(big.municipios2) <- c("state_code", "mun_code",
"mun_population_2010", "metro_area")
metro.areas.fake <- rbind.fill(metro.areas, big.municipios2)
```
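Treating a big municipio as a one-municipio metro area works because `plotChanges` sums counts and populations over every row that shares a `metro_area` name, so a single row simply aggregates to itself. Here is a rough sketch of that step in base R; the `areas` and `counts` tables and all their numbers are hypothetical, and `aggregate` stands in for the `ddply` call used above.

```r
## Hypothetical lookup table: one two-municipio metro area plus one
## big municipio treated as its own metro area (all numbers invented)
areas <- data.frame(
  state_code = c(1, 1, 2),
  mun_code   = c(1, 2, 5),
  mun_population_2010 = c(600000, 400000, 300000),
  metro_area = c("Metro A", "Metro A", "Municipio B"),
  stringsAsFactors = FALSE
)

## Hypothetical homicide counts by municipio for a single year
counts <- data.frame(
  state_code = c(1, 1, 2),
  mun_code   = c(1, 2, 5),
  count      = c(170, 30, 30)
)

## Attach the metro area to each municipio, then sum within metro areas
merged   <- merge(counts, areas, by = c("state_code", "mun_code"))
by_metro <- aggregate(cbind(count, mun_population_2010) ~ metro_area,
                      data = merged, FUN = sum)

## Homicides per 100,000 inhabitants
by_metro$rate <- by_metro$count / by_metro$mun_population_2010 * 10^5

## Keep only areas above the threshold of 15 used in plotChanges
violent <- subset(by_metro, rate > 15)
print(by_metro)
```

In this made-up example only "Metro A" passes the filter, since the lone municipio has a rate of about 10.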
And changes from 2011 to 2012 and from 2006 to 2012:
```{r fig.width=8, fig.height=9.6}
plotChanges(injury.intent, metro.areas.fake, 24.5, c(2011,2012))
ggsave("change2011-2012.svg", dpi = 100, width = 8, height = 9.60)
plotChanges(injury.intent, metro.areas.fake, 24.5, c(2006,2012))
ggsave("change2006-2012.svg", dpi = 100, width = 8, height = 9.60)
```
Interesting that Tijuana had about the same homicide rate in 2012 as in 2006. The rest of the violent metro areas/big municipios which saw decreases are in Michoacán. Sadly, it doesn’t look like this pattern will hold in 2013: according to [crimenmexico](http://crimenmexico.diegovalle.net/), Michoacán is experiencing a surge of violence, which as of October was at its maximum.
Do note that the charts were made using the 2010 CONAPO population estimates that come with mxmortalitydb, so homicide rates are overestimated by a little bit in 2012 and underestimated by a little bit in 2006. Also, rather than using the raw homicide numbers, I adjusted them by classifying deaths of unknown intent.
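To see which way the fixed 2010 denominator pushes the rates, here is a small sketch with invented numbers for a growing metro area. The populations and homicide counts are hypothetical; only the formula (count / population * 10^5) comes from the code above.

```r
## Homicides per 100,000 inhabitants
rate <- function(count, population) count / population * 10^5

## Hypothetical populations for a growing metro area (invented numbers)
pop       <- c("2006" = 900000, "2010" = 1000000, "2012" = 1050000)
homicides <- c("2006" = 180, "2012" = 210)

## Rates using each year's own population vs. the fixed 2010 population
true_rate  <- rate(homicides, pop[names(homicides)])
fixed_rate <- rate(homicides, pop[["2010"]])

print(round(rbind(true_rate, fixed_rate), 1))
```

With these numbers the 2010 denominator is too large for 2006 and too small for 2012, so the fixed-denominator rate comes out lower than the true rate in 2006 and higher in 2012. For an area that lost population the bias would run the other way.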
```{r fig.width=7, fig.height=6}
ll <- list("Acapulco", "Nuevo Laredo", "La Laguna",
"Chihuahua", "Tecomán", "Juárez", "Culiacán",
"Victoria", "Hidalgo del Parral", "Zihuatanejo de Azueta",
"El Mante",
"Ciudad Valles",
"Durango", "Cuernavaca", "Zacatecas-Guadalupe",
"Monterrey",
"Piedras Negras", "Mazatlán", "Veracruz",
"Tijuana", "Guadalajara",
"Tepic", "Coatzacoalcos")
names(ll) <- ll # make lapply print the names of the metro areas
lapply(ll, plotMetro, metro.areas.fake)
```
Check out the [source code](https://gist.github.com/diegovalle/7998868) as an R markdown file.