@ryancatalani
Last active December 16, 2015 22:09
Some methods to analyze your Twitter history.
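The script reads `tweets.csv` by numeric column position. Here is a minimal sketch of the layout it assumes; the header names are guesses based on the 2013-era Twitter archive export, and only the column positions actually matter to the script:

```ruby
require 'csv'

# Sketch of the assumed tweets.csv layout. Header names are guesses
# from the old Twitter archive export; the script indexes by position.
sample = <<~TWEETS
  tweet_id,in_reply_to_status_id,in_reply_to_user_id,retweeted_status_id,retweeted_status_user_id,timestamp,source,text,expanded_urls
  "1","","","","","2013-05-01 12:00:00 +0000","web","hello world",""
TWEETS

row = CSV.parse(sample, headers: true).first
puts row[5]  # the timestamp column, as indexed below
puts row[6]  # the source (client) column
```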
require 'csv'
require 'net/http'
require 'uri'
require 'json'
require 'time'
replies = 0
replies_to = []
retweets = 0
retweets_from = []
sources = []
times = []
urls = []
words = []
source_web_times = []
source_tweetbot_times = []
source_twitter_app_times = []
csv = CSV.parse(File.read("tweets.csv"), {:headers => true})
csv.each do |row|
  # .to_s guards against nil when a CSV field is empty and unquoted
  replies += 1 if row[1].to_s.length > 0
  replies_to << row[2] if row[2].to_s.length > 0
  retweets += 1 if row[3].to_s.length > 0
  retweets_from << row[4] if row[4].to_s.length > 0
  times << row[5]
  sources << row[6]
  words << row[7].to_s.downcase.split(' ')
  urls << row[8] if row[8] && row[8].length > 0

  # ---
  # Uncomment this section to analyze when certain Twitter
  # clients - in this case, Tweetbot and the Twitter apps -
  # were used.
  # ---
  # source = row[6]
  # time = row[5]
  # case
  # when source == "web"
  #   source_web_times << time
  # when source.include?("Tweetbot")
  #   source_tweetbot_times << time
  # when source.include?("Twitter for")
  #   source_twitter_app_times << time
  # end
end
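The commented-out client classification above can be exercised in isolation. This is a minimal sketch; the `classify` helper and the symbols it returns are illustrative, not part of the original script:

```ruby
# Sketch of the case/when client classification used above.
# The classify helper and its symbols are illustrative only.
def classify(source)
  case
  when source == "web" then :web
  when source.include?("Tweetbot") then :tweetbot
  when source.include?("Twitter for") then :twitter_app
  else :other
  end
end

puts classify("Tweetbot for iOS")   # => tweetbot
puts classify("Twitter for iPhone") # => twitter_app
```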
# ---
# Uncomment this section to analyze the most used significant words.
# ---
# words.flatten!
# dont_count = ["rt","the","be","to","of","and","a","in","that","have","I","i","it","for","not","on","with","he","as","you","do","at","this","but","his","by","from","they","we","say","her","she","or","an","will","my","one","all","would","there","their","what","so","up","out","if","about","who","get","which","go","me","when","make","can","like","time","no","just","him","know","take","person","into","year","your","good","some","could","them","see","other","than","then","now","look","only","come","its","over","think","also","back","after","use","two","how","our","work","first","well","even","new","want","because","any","these","give","day","most","us","are","is","were","was","has","having","had","did","does","doing","done","said","says","saying","goes","going","went","gone","made","making","could","likes","liked","liking","knew","known","knowing","sees","seeing","saw","seen","looks","looked","looking","came","coming","thought","thinking","gave","given","giving","find","found","finding","finds","tell","told","tells","telling","ask","asks","asking","asked","works","working","worked","seem","seems","seemed","seeming","feel","felt","feels","feeling","try","tries","trying","tried","leave","left","leaves","leaving","call","calling","called","calls","last","long","great","little","own","old","right","big","high","different","small","large","next","early","young","important","few","public","bad","same","able","many","beneath","under","above"]
# words = words.delete_if{|i| dont_count.include?(i) }
# puts words.group_by(&:inspect).map{|k,v| [k, v.length]}.sort_by{|k,v| v}
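The group-and-count idiom above can be tried on a toy list. This sketch uses `group_by(&:itself)` rather than `inspect` so the keys stay as plain, unquoted strings; the sort is ascending, as in the script:

```ruby
# Sketch of the word-frequency idiom above (itself instead of
# inspect keeps the grouped keys as plain strings).
words = %w[twitter ruby twitter csv twitter ruby]
counts = words.group_by(&:itself)
              .map { |k, v| [k, v.length] }
              .sort_by { |_, v| v }
p counts # => [["csv", 1], ["ruby", 2], ["twitter", 3]]
```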
# ---
# Uncomment this section to analyze who you retweeted the most.
# Or switch retweets_from to replies_to to see who you replied to the most.
# ---
# id_redirect_url = "https://twitter.com/account/redirect_by_id?id="
# puts retweets_from.group_by(&:inspect).map{|k,v| ["#{id_redirect_url}#{k.gsub('"','')}", v.length]}.sort_by{|k,v| v}
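The ranking pipeline above can be sketched with placeholder user ids (the numeric ids below are made up). Because `group_by(&:inspect)` produces quoted-string keys, the `gsub` strips the quotes back out:

```ruby
# Sketch of the retweet ranking above, with placeholder user ids.
id_redirect_url = "https://twitter.com/account/redirect_by_id?id="
retweets_from = %w[12345 678 12345]
ranked = retweets_from.group_by(&:inspect)
                      .map { |k, v| ["#{id_redirect_url}#{k.gsub('"', '')}", v.length] }
                      .sort_by { |_, v| v }
p ranked.last # the most-retweeted account lands last (ascending sort)
```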
# ---
# Uncomment this section to analyze what websites you tweeted about the most.
# ---
# hosts = []
# urls.each do |url|
#   hosts << URI.parse(url).host.split('.').last(2).join('.').downcase rescue next
# end
# puts hosts.group_by(&:inspect).map{|k,v| [k, v.length]}.sort_by{|k,v| v}
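The host-extraction step above reduces a full URL to a rough domain name by keeping the last two dot-separated labels of the hostname (a heuristic that misfires on country-code domains like `.co.uk`):

```ruby
require 'uri'

# Sketch of the host-extraction step above: keep the last two
# dot-separated labels of the hostname as a rough domain name.
url = "http://blog.example.com/post/1"
host = URI.parse(url).host.split('.').last(2).join('.').downcase
puts host # => example.com
```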
# ---
# Uncomment this section to analyze when you used certain Twitter apps.
# ---
# months = {}
# ["May '08", "Jun '08", "Jul '08", "Aug '08", "Sep '08", "Oct '08", "Nov '08", "Dec '08", "Jan '09", "Feb '09", "Mar '09", "Apr '09", "May '09", "Jun '09", "Jul '09", "Aug '09", "Sep '09", "Oct '09", "Nov '09", "Dec '09", "Jan '10", "Feb '10", "Mar '10", "Apr '10", "May '10", "Jun '10", "Jul '10", "Aug '10", "Sep '10", "Oct '10", "Nov '10", "Dec '10", "Jan '11", "Feb '11", "Mar '11", "Apr '11", "May '11", "Jun '11", "Jul '11", "Aug '11", "Sep '11", "Oct '11", "Nov '11", "Dec '11", "Jan '12", "Feb '12", "Mar '12", "Apr '12", "May '12", "Jun '12", "Jul '12", "Aug '12", "Sep '12", "Oct '12", "Nov '12", "Dec '12", "Jan '13", "Feb '13", "Mar '13", "Apr '13", "May '13"].each do |m|
#   months[m] = 0
# end
# source_twitter_app_times.reverse.each do |t|
#   time = Time.strptime(t,"%Y-%m-%d %H:%M:%S %z")
#   month = time.strftime("%b '%y").to_s
#   months[month] ||= 0
#   months[month] += 1
# end
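The month-bucketing step above can be sketched without pre-seeding every month label, by giving the hash a default count of 0 (the timestamps below are made up):

```ruby
require 'time'

# Sketch of the month-bucketing step above, using a Hash with a
# default of 0 instead of pre-seeding every "Mon 'YY" label.
stamps = ["2013-05-01 10:00:00 +0000",
          "2013-05-15 11:00:00 +0000",
          "2013-04-01 09:00:00 +0000"]
months = Hash.new(0)
stamps.each do |t|
  months[Time.strptime(t, "%Y-%m-%d %H:%M:%S %z").strftime("%b '%y")] += 1
end
p months
```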
# ---
# Uncomment this section to analyze at what hours you use Twitter.
# ---
# hours = ["12 am", "1 am", "2 am", "3 am", "4 am", "5 am", "6 am", "7 am", "8 am", "9 am", "10 am", "11 am", "12 pm", "1 pm", "2 pm", "3 pm", "4 pm", "5 pm", "6 pm", "7 pm", "8 pm", "9 pm", "10 pm", "11 pm"]
# hours_res = Array.new(24, 0)
# times.reverse.each do |t|
#   time = Time.strptime(t,"%Y-%m-%d %H:%M:%S %z")
#   # I used this to make a rough adjustment of the times before I moved to Boston.
#   if time < Time.new(2011,9,1)
#     time -= 4*3600
#   end
#   hour = time.strftime("%l %P").to_s.strip
#   hours_res[hours.index(hour)] ||= 0
#   hours_res[hours.index(hour)] += 1
# end
# p hours, hours_res
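The hour-label step above hinges on two `strftime` directives: `%l` is the blank-padded 12-hour clock and `%P` the lowercase meridian, which is why the result needs a `strip`:

```ruby
require 'time'

# Sketch of the hour-label step above: %l is blank-padded, so the
# formatted string needs strip before matching the hours list.
t = Time.strptime("2013-05-02 14:30:00 +0000", "%Y-%m-%d %H:%M:%S %z")
label = t.strftime("%l %P").strip
puts label # => 2 pm
```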
# ---
# Uncomment this section to unshorten the URLs you tweeted.
# ---
# unshortened_urls = []
# to_retry = []
# urls.each_slice(50) do |url_slice|
#   begin
#     res = Net::HTTP.get_response(URI.parse("http://api.unshort.tk/index.php?u=#{url_slice.join(';')}")).body
#     puts 'http 50 done'
#     json = JSON.parse(res)
#     json.each do |k,v|
#       unshortened_urls << v
#     end
#   rescue
#     to_retry << url_slice
#   end
# end
# to_retry.flatten!
# to_retry.each_slice(10) do |url_slice|
#   res = Net::HTTP.get_response(URI.parse("http://api.unshort.tk/index.php?u=#{url_slice.join(';')}")).body
#   puts 'http 10 done'
#   json = JSON.parse(res)
#   json.each do |k,v|
#     unshortened_urls << v
#   end
# end
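The slice-and-retry pattern above can be sketched offline (the api.unshort.tk service appears defunct); the `resolve` stub below stands in for the unshortening HTTP call and is entirely hypothetical:

```ruby
# Offline sketch of the slice-and-retry pattern above. The resolve
# stub is hypothetical; it pretends batches over 2 items fail, the
# way a large HTTP batch might, to exercise the retry path.
def resolve(batch)
  raise "batch too large" if batch.size > 2
  batch.map { |u| "resolved:#{u}" }
end

unshortened = []
to_retry = []
%w[a b c d e].each_slice(3) do |slice|
  begin
    unshortened.concat(resolve(slice))
  rescue
    to_retry << slice # fall back to smaller batches below
  end
end
to_retry.flatten.each_slice(2) do |slice|
  unshortened.concat(resolve(slice))
end
p unshortened.sort
```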