Thanks to Kyle Mulka for scraping the data and putting it in a nice .json format!
Want to use Ruby to hack on H&H data? Good. I will teach you. Some assumptions I'm making about your computer:
- You have Ruby
- You have RubyGems
- You're on OS X or Linux
If you don't have any of those things, google around. Now let's get going!
First thing you need to do is run mkdir hnh-data somewhere on your computer. We're going to keep all project files in this directory. Then cd hnh-data and make sure the hackersandhustlers.json file that Mulka uploaded to Hackers & Hustlers is in that directory. Then run gem install json (on Ruby 1.9 and later the json library already ships with Ruby, so this step is optional).
Go into the Ruby REPL by typing irb at the command line. If it works, you should see a different-looking prompt than the one you had before. Type require 'json' to load the JSON library we're going to use to parse the data. Also run require 'pp' so we can pretty-print the JSON data. Then do the following:
>> raw_hnhdata = File.read('hackersandhustlers.json')
>> hnh = JSON.parse(raw_hnhdata)
Now you've got all the data in a good format to play with.
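If you'd rather not retype those lines every time you open irb, the same steps fit in a small script. This is only a sketch: load_posts is a helper name I made up, and it assumes the JSON file sits in the current directory like in the steps above.

```ruby
require 'json'

# Hypothetical helper (not part of any library): read a JSON file
# and return the parsed Ruby object.
def load_posts(path)
  JSON.parse(File.read(path))
end

path = 'hackersandhustlers.json'
if File.exist?(path)
  hnh = load_posts(path)
  puts "Loaded #{hnh.length} posts"
else
  # Assumption: you meant to run this from inside the hnh-data directory.
  puts "Couldn't find #{path}; run this from inside hnh-data"
end
```

Save it as something like load.rb inside hnh-data and run it with ruby load.rb.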
Woot! We've made good progress! To see how many posts there have been in H&H, type hnh.length. This works because hnh is an array of posts. To see what one of the post objects looks like, just do pp hnh.first. The posts are ordered newest first, so to see what the first post ever was, just type pp hnh.last. It should look like this:
>> pp hnh.last
{"from"=>{"name"=>"Nathan Bashaw", "id"=>"1026690007"},
"actions"=>
[{"link"=>"http://www.facebook.com/160284404008688/posts/160285960675199",
"name"=>"Comment"},
{"link"=>"http://www.facebook.com/160284404008688/posts/160285960675199",
"name"=>"Like"}],
"updated_time"=>"2010-11-08T20:53:36+0000",
"to"=>
{"data"=>
[{"version"=>1, "name"=>"Hackers & Hustlers", "id"=>"160284404008688"}]},
"likes"=>
{"count"=>6,
"data"=>
[{"name"=>"Thomas Stewart", "id"=>"2350502"},
{"name"=>"Eric Jorgenson", "id"=>"1254810238"},
{"name"=>"Sean Moening", "id"=>"1154534495"},
{"name"=>"Trista Kempa", "id"=>"1267140058"}]},
"created_time"=>"2010-11-08T20:53:36+0000",
"message"=>
"More details coming soon - just wanted to have a place where people who are interested can start talking! After I pitched Hackers & Hustlers at Startup Weekend, I got a lot of interest, so I figured it was a good idea to put something on the internets!",
"type"=>"status",
"id"=>"160284404008688_160285960675199",
"comments"=>{"count"=>0}}
But you can do a lot more than just that.
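For starters, each post is a plain Ruby Hash with string keys, so you can pull out individual fields instead of printing the whole thing. Here's a rough sketch using a trimmed-down copy of the post above, so it runs even without the data file. summarize is a made-up helper name, and I'm assuming some posts might be missing the "likes" or "comments" keys, which I haven't verified against the real file.

```ruby
# Trimmed-down copy of the post shown above, so this runs without the file.
post = {
  "from"         => { "name" => "Nathan Bashaw", "id" => "1026690007" },
  "likes"        => { "count" => 6 },
  "created_time" => "2010-11-08T20:53:36+0000",
  "type"         => "status",
  "comments"     => { "count" => 0 }
}

# Hypothetical helper: build a one-line summary of a post hash.
def summarize(post)
  author   = post['from']['name']
  # Assumption: a post may lack these keys, so fall back to 0.
  likes    = (post['likes'] || {})['count'] || 0
  comments = (post['comments'] || {})['count'] || 0
  "#{author} (#{post['created_time']}): #{likes} likes, #{comments} comments"
end

puts post.keys.inspect   # the top-level fields available on this post
puts summarize(post)
```

In irb, once hnh is loaded, you can call summarize(hnh.last) on the real data the same way.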
We can find out cool things like how many comments there have been on average for each post. How? A couple of easy lines of Ruby:
>> post_comment_count = hnh.map { |post| post['comments']['count'] }
>> post_comment_count.inject(0) { |sum, el| sum + el }.to_f / post_comment_count.size
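If you want to reuse the average, you could wrap it in a method. A minimal sketch: average_comments is a name I made up, the sample posts are invented (not from the real file), and the fallback to 0 assumes some posts may have no "comments" key, which I haven't checked.

```ruby
# Hypothetical helper: average comment count across an array of post hashes.
def average_comments(posts)
  return 0.0 if posts.empty?   # avoid dividing by zero on an empty array
  counts = posts.map { |post| (post['comments'] || {})['count'] || 0 }
  counts.inject(0) { |sum, n| sum + n }.to_f / counts.size
end

# Made-up sample data in the same shape as the real posts.
sample = [
  { 'comments' => { 'count' => 4 } },
  { 'comments' => { 'count' => 0 } },
  { 'type' => 'status' }   # no 'comments' key at all, counted as 0
]

puts average_comments(sample)
```

With the real data loaded in irb, that would be average_comments(hnh).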
There's a lot more you can do. I may edit this gist as I think of more examples. If you have cool stuff, tell me and I'll put it in here.
Thanks!