Created Jan 18, 2013
# install and load additional packages
# install.packages("zoo")  # run once if zoo is not already installed
library(zoo)
# read in data
dt <- read.delim("movieReports/20130109_to_20120113.tsv",header=TRUE,sep="\t",comment.char="",quote="",na.strings="",fileEncoding="UTF-8")
# convert the tweet column from a factor (default) to a character string
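# The conversion code for this step is not shown above. A minimal sketch follows,
# assuming the tweet text lives in a column named "body" (a hypothetical name;
# substitute whatever names(dt) reports). column_to_character is an illustrative
# helper, not part of the original script.
column_to_character <- function(df, col = "body") {
  # leave the data frame untouched if the assumed column is absent
  if (col %in% names(df)) df[[col]] <- as.character(df[[col]])
  df
}
# usage, once the column name is confirmed:
# dt <- column_to_character(dt, "body")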
# add a dummy variable for counts
dt<-transform(dt, dummy=1)
# create a zoo object (after many unsuccessful tries)
z <- zoo(dt$dummy,as.POSIXct(as.character(dt$postedTime),format="%Y-%m-%dT%H:%M:%OSZ",tz="GMT"))
# aggregate by passing a function that uses another R time functionality
aggregate(z,function(x) as.POSIXct(trunc(x,"hour")) )
# plot the time series (by default this will open a popup window)
plot(aggregate(z,function(x) as.POSIXct(trunc(x,"hour")) ))
# this will shift the times from GMT to Pacific Standard Time; POSIXct arithmetic is in seconds, so 8 hours is 8*3600 (perhaps there is a better way)
plot(aggregate(z,function(x) as.POSIXct(trunc(x-8*3600,"hour")) ))
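# One possible "better way", sketched under assumptions: convert each timestamp
# to a named time zone before truncating, instead of shifting by a fixed offset.
# This assumes the system time zone database knows "America/Los_Angeles";
# to_pacific_hour is an illustrative helper, not part of the original script.
to_pacific_hour <- function(x) {
  # as.POSIXlt re-expresses the same instant in Pacific time (DST-aware),
  # trunc drops minutes and seconds, as.POSIXct keeps the Pacific tz attribute
  as.POSIXct(trunc(as.POSIXlt(x, tz = "America/Los_Angeles"), "hours"))
}
# usage:
# plot(aggregate(z, to_pacific_hour))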
#this will save the
# I was interested in a spike that I saw at 2013-01-13 19:00:00:
spike<-dt[as.POSIXct(trunc(time(z),"hour"))==as.POSIXct("2013-01-13 19:00:00"),]
# this sorts the tweets from the spike by retweet count
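# The sorting code for this step is not shown above. A minimal sketch follows,
# assuming the retweet count is in a column named "retweetCount" (a hypothetical
# name). Rows are sorted in ascending order, so the most retweeted tweets come last.
# sort_by_retweets is an illustrative helper, not part of the original script.
sort_by_retweets <- function(df, count_col = "retweetCount") {
  stopifnot(count_col %in% names(df))
  df[order(df[[count_col]]), , drop = FALSE]
}
# usage, once the column name is confirmed:
# spike <- sort_by_retweets(spike, "retweetCount")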
# output of top retweetedTweets:
# [23340] "RT @goldenglobes: Best Actress in a Motion Picture - Drama - Jessica Chastain - Zero Dark Thirty - #GoldenGlobes"
# [23341] "RT @vuecinemas: Help stop the mob with #GangsterSquad on Jan 10th! To win one of 5 movie packs, follow us and retweet this message by 5p ..."
# [23342] "Woot! RT @Bad_Wobot1013: Awesome Jessica Chastain!! "