@sxua
Created October 8, 2011 16:54
Fetch posts from a VK wall. Usage: ./get.rb remixsid_cookie_value group_id
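#!/usr/bin/env ruby
require 'rubygems'
require 'mechanize'
require 'fastercsv'
require 'progressbar'

# Command-line arguments: the remixsid session cookie value and the group id.
cookie, gid = ARGV
url = URI.parse("http://vkontakte.ru/wall-#{gid}")

agent = Mechanize.new
agent.user_agent_alias = 'Mac Safari'
# Reuse an existing browser session by adding its remixsid cookie to the jar.
Mechanize::Cookie.parse(url, "remixsid=" + cookie) { |c| agent.cookie_jar.add(url, c) }

# The last pagination link carries the highest offset; each page holds 20 posts.
pages = agent.get(url.to_s).search('#fw_summary_wrap .pg_lnk:last').attr('href').value.split('=').last.to_i / 20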
FasterCSV.open('export_vk_wall.csv', 'w') do |csv|
  csv << ['name', 'message', 'time']
  # Exclusive range: the page at offset pages * 20 is not fetched.
  (0...pages).to_a.each_with_progressbar('Progress') do |p|
    url.query = "offset=#{p * 20}"
    page = agent.get(url.to_s).search("#page_wall_posts")
    page.search('.post .info').each do |post|
      csv << [
        post.search('a.author').text,
        post.search('.wall_text div div').text,
        post.search('.rel_date').text
      ]
    end
    # Pause 5 to 15 seconds between requests.
    sleep(5 + rand(11))
  end
end
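
The script derives its page count from the offset in the last pagination link. As a minimal sketch of that arithmetic in isolation, the hypothetical helper below (`wall_offsets` and `POSTS_PER_PAGE` are illustrative names, not part of the original script) turns such a link into the list of offsets to request. It assumes the link has the form `...?offset=N` and that a page holds 20 posts, as the script's own arithmetic implies. Unlike the script's exclusive range, this sketch includes the final offset.

```ruby
require 'uri'

# Page size assumed from the script's "offset = p * 20" arithmetic.
POSTS_PER_PAGE = 20

# Hypothetical helper: given the href of the last pagination link,
# return every offset from 0 up to and including the one in that link.
def wall_offsets(last_href, per_page = POSTS_PER_PAGE)
  query = URI.parse(last_href).query.to_s
  # A missing "offset" parameter is treated as a single-page wall.
  last_offset = URI.decode_www_form(query).to_h.fetch('offset', '0').to_i
  (0..last_offset).step(per_page).to_a
end

wall_offsets('/wall-1?offset=100').each do |offset|
  puts "offset=#{offset}"
end
```

Parsing the query with `URI.decode_www_form` rather than splitting on `=` keeps the lookup correct if the link ever carries more than one parameter.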