@Tai-Pach
Created May 7, 2018 03:46
Prepare the Kickstarter dataset for analysis
# load data.table, which provides fread()
library(data.table)
# read the file
kickstarter <- fread(file = "./data/ks-projects-201801.csv", header = TRUE)
# remove 7 observations that have incorrect launch dates (year says "1970")
# note: these row indices are specific to this version of the CSV
kickstarter <- kickstarter[-c(2843, 48148, 75398, 94580, 247914, 273780, 319003), ]
# convert deadline values to Date type
kickstarter$deadline <- as.Date(kickstarter$deadline, format = "%Y-%m-%d")
# convert launched values to Date type (the time of day is dropped)
kickstarter$launched <- as.Date(kickstarter$launched, format = "%Y-%m-%d %H:%M:%S")
# add a new column for project duration
kickstarter$project_duration_days <- kickstarter$deadline - kickstarter$launched
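# Optional sanity checks: a minimal sketch, not part of the original script.
# The hard-coded row indices dropped above are assumed to match this exact CSV;
# counting the launch dates still in 1970 is one way to confirm that
# (the count should be 0 if the indices lined up).
sum(format(kickstarter$launched, "%Y") == "1970")
# inspect the range of launch dates and the distribution of durations in days
range(kickstarter$launched)
summary(as.numeric(kickstarter$project_duration_days, units = "days"))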