@patio11 patio11/slurper.rb
Last active Nov 15, 2015

require 'rubygems'
require 'httparty'
require 'fileutils'
require 'json'
USERNAME = ARGV[0] || "patio11"
MAX_TO_FETCH = ARGV[1]
puts "Username: #{USERNAME} max to fetch: #{MAX_TO_FETCH || "all"}"
user_url = "https://hacker-news.firebaseio.com/v0/user/#{USERNAME}.json"
comment_url = "https://hacker-news.firebaseio.com/v0/item/$ID.json"
user_results = HTTParty.get(user_url).parsed_response
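# "submitted" lists every item the user has posted (stories and polls as well as comments).
# Fall back to an empty list if the field is missing.
comment_ids = user_results["submitted"] || []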
puts comment_ids.inspect
FileUtils::mkdir_p "comments/#{USERNAME}"
to_fetch = MAX_TO_FETCH ? MAX_TO_FETCH.to_i : comment_ids.size
to_fetch = [to_fetch, comment_ids.size].min
# first(n) returns an empty list when n is 0; comment_ids[0..-1] would have returned everything.
sample_ids = comment_ids.first(to_fetch)
count = 0
cached = 0
increment = (sample_ids.size / 1000.0 + 0.5).to_i
increment = 1 if increment < 1
sample_ids.each do |id|
  path = "comments/#{USERNAME}/#{id}"
  if File.exist?(path)
    # Already downloaded on a previous run.
    cached += 1
    next
  end
  comment_url_to_get = comment_url.sub("$ID", id.to_s)
  response = HTTParty.get(comment_url_to_get).parsed_response #rescue nil
  sleep 0.2 # pause between requests to go easy on the API
  next unless response
  count += 1
  # File.write opens, writes, and closes the file in one call.
  File.write(path, response.to_json)
  puts "Downloaded #{count} comments of #{sample_ids.size}. Cached: #{cached}" if count % increment == 0
end
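
The script above leaves one JSON file per item under comments/USERNAME/. As a minimal sketch of reading that cache back, something like the following could work. The helper name `load_cached_items` is hypothetical and not part of the original script; it assumes each file holds one item as the API returned it, including the API's `time` field.

```ruby
require 'json'

# Hypothetical helper (not in the original script): load every item cached
# under dir and return the parsed hashes, oldest first.
def load_cached_items(dir)
  items = Dir.glob(File.join(dir, "*")).map do |path|
    next unless File.file?(path)
    # Treat unreadable or non-JSON files as missing rather than aborting.
    JSON.parse(File.read(path)) rescue nil
  end
  # Keep only real items (hashes); drop nils from skipped or broken files.
  items = items.compact.select { |item| item.is_a?(Hash) }
  # "time" is the item's creation time in Unix seconds.
  items.sort_by { |item| item["time"].to_i }
end
```

For example, `load_cached_items("comments/patio11").select { |i| i["type"] == "comment" }` would keep only comments, since the cached items can also include stories and polls.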
groovemonkey commented Oct 20, 2014

Thanks for this -- I adapted this script to make PDFs from people's comments, because I wanted to read them like a book. https://github.com/groovemonkey/hackernews_books -- also released into the public domain.
