@kpq
Last active December 25, 2015 20:19
in class data cleaning
library(maptools)
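# return the second piece of a split string (the part after the parenthesis)
get_second_element <- function(item) {
  return(item[2])
}
# return the first piece of a split string (the part before the parenthesis)
get_first_element <- function(item) {
  return(item[1])
}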
# load the data
data <- read.delim("http://shancarter.github.io/ucb-dataviz-fall-2013/classes/data-practice/county-data.txt", header=FALSE, stringsAsFactors=FALSE)
# rename it like a human
names(data) <- c("county_orig", "guns_orig")
# split each entry at the opening parenthesis
split <- strsplit(data$county_orig, split="\\(")
#make a new field for state
data$state_clean <- sapply(split, get_second_element)
# strip the closing parenthesis from the state
data$state_clean <- gsub("\\)", "", data$state_clean)
# make a new county field, trimming any whitespace left before the parenthesis
data$county_clean <- trimws(sapply(split, get_first_element))
#clean guns
data$guns_clean <- as.numeric(gsub(",", "", data$guns_orig))
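
A quick way to sanity-check the cleaning steps above without downloading the file is to run them on a couple of made-up rows. This is only a sketch: the `toy` data frame and its values are hypothetical, and it assumes each entry looks like `County Name (ST)` with comma-separated thousands in the count column. It uses `vapply` with `[` in place of the two helper functions, which is a swapped-in variation rather than the approach used in class.

```r
# Hypothetical sample rows; the layout of the real file is an assumption here.
toy <- data.frame(
  county_orig = c("Alameda County (CA)", "Kings County (NY)"),
  guns_orig = c("1,234", "56"),
  stringsAsFactors = FALSE
)

# Split at the opening parenthesis: piece 1 is the county, piece 2 is "ST)".
parts <- strsplit(toy$county_orig, split = "\\(")

# vapply with `[` picks the n-th piece and enforces a character result.
toy$county_clean <- trimws(vapply(parts, `[`, character(1), 1))
toy$state_clean <- gsub("\\)", "", vapply(parts, `[`, character(1), 2))

# Remove thousands separators before converting to numbers.
toy$guns_clean <- as.numeric(gsub(",", "", toy$guns_orig))

print(toy[, c("county_clean", "state_clean", "guns_clean")])
```

If a row has no parenthesis at all, the second piece is `NA`, so checking `sum(is.na(toy$state_clean))` afterwards is a cheap way to spot entries that do not match the assumed layout.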