Skip to content

Instantly share code, notes, and snippets.

@cokelly cokelly/USA Census Data
Last active May 1, 2018

Embed
What would you like to do?
---
title: Child Poverty and Race, USA 2016
author: Ciaran
date: '2018-05-01'
slug: child-poverty-and-race-usa
categories:
- rstats
tags:
- tidy_tuesday
- inequality
- race
header:
caption: ''
image: ''
---
```{r setup, echo=FALSE, include=FALSE}
knitr::opts_chunk$set(cache = TRUE,
echo = FALSE)
library(tidyverse)
library(cowplot)
library(kableExtra)
```
This week's [Tidy Tuesday](https://github.com/rfordatascience/tidytuesday) invited us to examine county-level census data from 2016 in the United States (sourced [here](https://factfinder.census.gov/faces/nav/jsf/pages/index.xhtml)). I have been poking around in the data and discovered an interesting visual artefact in the following otherwise unsurprising quick plot:
```{r poverty_unemployment}
county_data_with_region <- county_data %>% # Divide states by regions
mutate(Region = case_when(State == "Connecticut" |
State == "Maine" |
State == "Massachusetts" |
State == "New Hampshire" |
State == "Rhode Island" |
State == "Vermont" |
State == "New Jersey" |
State == "New York" |
State == "Pennsylvania" ~ "Northeast",
State == "Illinois" |
State == "Indiana" |
State == "Michigan" |
State == "Ohio" |
State == "Wisconsin" |
State == "Iowa" |
State == "Kansas" |
State == "Minnesota" |
State == "Missouri" |
State == "Nebraska" |
State == "North Dakota" |
State == "South Dakota" ~ "Midwest",
State == "Delaware" |
State == "Florida" |
State == "Georgia" |
State == "Maryland" |
State == "North Carolina" |
State == "South Carolina" |
State == "Virginia" |
State == "District of Columbia" |
State == "West Virginia" |
State == "Alabama" |
State == "Kentucky" |
State == "Mississippi" |
State == "Tennessee" |
State == "Arkansas" |
State == "Louisiana" |
State == "Oklahoma" |
State == "Texas" ~ "South",
State == "Arizona" |
State == "Colorado" |
State == "Idaho" |
State == "Montana" |
State == "Nevada" |
State == "New Mexico" |
State == "Utah" |
State == "Wyoming" |
State == "Alaska" |
State == "California" |
State == "Hawaii" |
State == "Oregon" |
State == "Washington" ~ "West",
TRUE ~ "Puerto Rico")) %>%
filter(Region != "Puerto Rico")
```
```{r make_plot}
poverty_plot <- ggplot(county_data_with_region, aes(White, ChildPoverty)) + # Compare Child Poverty to total white people, by region
geom_point(na.rm = TRUE, alpha = 0.2, colour = "lightblue") +
geom_smooth(method = loess, na.rm = TRUE) +
scale_color_brewer(type = "seq", palette = "Blues") +
facet_grid(. ~ Region) +
background_grid(major = "xy", minor = "none") +
ggtitle("Child poverty falls as the proportion of white people increases.", subtitle = "Note the 'tipping point' at about 75White (%) in the South and Northeast") +
xlab("White (%)") +
ylab("% children in poverty")
poverty_plot
```
I split the USA's `r prettyNum(nrow(county_data_with_region), big.mark = ",")` counties (excluding Puerto Rico) into regions, based on the US Census Bureau's [four statistical regions](https://www2.census.gov/geo/pdfs/maps-data/maps/reference/us_regdiv.pdf) (pdf). I excluded Puetro Rico because its racial breakdown does not fit the exercise here. And I chose to classify counties by White (%) because that seems the best way to capture the percentage of the population from racial minorities. No doubt this is a hugely crude way to go about the exercise.
Anyway, even with my basic familiarity with the United States I am hardly surprised to see child poverty correlate negatively with the proportion of the population that is white. What _is_ surprising I think is the way that child poverty seems higher in the Northeast and especially in the South in more homogeneous counties.
Still, we have to be careful. When we look at the Southern counties what we are seeing here _might_ relate to a combination of [rural poverty](https://www.prb.org/childpoverty/) and an uneven distribution of people by race. Indeed, it is very striking _how_ segregated the United States are.
```{r segregation}
county_data_south <- county_data_with_region %>% filter(Region == "South")
segregation <- county_data_south %>%
mutate(segregated = case_when(Hispanic >= 75 | #Test for any county having a 75% of more racial share
White >= 75 |
Black >= 75 |
Native >= 75 |
Asian >= 75 |
Pacific >= 75 ~ "Yes", TRUE ~ "No")) %>%
group_by(segregated, State) %>% # Group by whether segregated plus by state
summarise(Segregated2 = n()) %>%
tidyr::spread(., segregated, Segregated2) %>% # Spread to "yes" or "no" columns
mutate_all(funs(replace(., is.na(.), 0))) %>% # Replace NAs
mutate(Counties = (Yes+No)) %>% # Total number of counties
mutate(`Segregated (%)` = round((Yes/Counties)*100, digits = 0)) %>% #Get segregated %
select(State, Counties, Segregated = Yes, `Not Segregated` = No, `Segregated (%)`) %>%
arrange(desc(`Segregated (%)`))
knitr::kable(segregation, format = "html", align = "l", caption = "Counties with population composed of 75% or more from one racial group") %>%
kable_styling(full_width = FALSE, bootstrap_options = "striped")
```
But note also that each state (and I imagine at more granular levels than that) has its own population distributions:
```{r plot2, warning=FALSE}
south_plot <- ggplot(county_data_south, aes(White, ChildPoverty)) +
geom_point(na.rm = TRUE, alpha = 0.2, colour = "lightblue") +
geom_smooth(method = loess, na.rm = TRUE) +
facet_wrap(~ State, ncol = 4) +
scale_color_brewer(type = "seq", palette = "Blues") +
background_grid(major = "xy", minor = "none") +
xlab("White (%)") +
ylab("% children in poverty") +
theme(axis.ticks.y = element_blank(),
axis.text.y = element_blank())
south_plot
```
So, as always, more study required.
Gist [here](https://gist.github.com/cokelly/0f6a6cc73d372989408d0fab73c7de04).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.