@fabianuribe
Created July 24, 2013 22:31
Scraper (incomplete)
require 'nokogiri'
# Read the saved page into a string so no file handle is left open.
doc = Nokogiri::HTML(File.read('post.html'))
# Print and return the username of every commenter
# (the first link inside each comment header).
def extract_usernames(doc)
  doc.search('.comhead > a:first-child').map do |element|
    p element.inner_text
  end
end
# Title
p doc.search('.title > a:first-child').inner_text
# Post Id
p doc.search('.subtext > a:nth-child(3)').map{|link| link['href'] }.join.sub('item?id=', '')
# Post URL
p doc.search('.title > a:first-child').map{|link| link['href'] }.join
# Number of votes
p doc.search('.subtext > span').inner_text.sub(' points', '' )
# extract_usernames(doc)
# Author username
p doc.search('.subtext > a:nth-child(2)').inner_text
# Url for the author
p doc.search('.subtext > a:nth-child(2)').map{|link| link['href'] }.join
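The script above cleans up the vote count and post id with plain `sub` calls, which leaves both values as strings and misses a singular "1 point". A minimal sketch of two helpers that normalize those strings is shown below. The names `parse_points` and `parse_item_id` are hypothetical and not part of the original gist, and the input formats are assumed from the `sub` calls above.

```ruby
# Hypothetical helpers, not part of the original gist.

# Turn text such as "42 points" or "1 point" into an Integer.
# Returns nil when the text contains no digits.
def parse_points(text)
  match = text.to_s.match(/\d+/)
  match && match[0].to_i
end

# Pull the numeric id out of an href such as "item?id=12345".
# Returns the id as a String, or nil when the href has no id parameter.
def parse_item_id(href)
  match = href.to_s.match(/item\?id=(\d+)/)
  match && match[1]
end
```

These could be applied to the values the script already extracts, for example `parse_points(doc.search('.subtext > span').inner_text)`.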