Created
February 13, 2013 12:23
# An attempt to scrape students.flatironschool.com and retrieve student names,
# links, taglines, and excerpts, then build a Hash out of that information.
require "open-uri"
require "nokogiri"

url = "http://students.flatironschool.com"
# URI.open is provided by open-uri; Kernel#open no longer accepts URLs as of Ruby 3.0
doc = Nokogiri::HTML(URI.open(url))

# CSS selectors for each piece of student data
names    = "h2"
hrefs    = ".one_third a"
tags     = ".position"
excerpts = ".excerpt"

# Collect the text (or the href attribute) of every matching node
students = doc.css(names).map { |name| name.text }
links    = doc.css(hrefs).map { |href| href.attr("href") }
taglines = doc.css(tags).map { |tag| tag.text }
intros   = doc.css(excerpts).map { |exc| exc.text }
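
# Map each student name to that student's link, tagline, and excerpt.
# This assumes the four arrays line up by index, which the selectors do not guarantee.
flatiron = {}
students.each_with_index do |student, i|
  flatiron[student] = {
    :link => links[i],
    :tagline => taglines[i],
    :intro => intros[i]
  }
end

puts flatiron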
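
The hash-building step at the end of the script can be tried in isolation, without hitting the network. The following is a minimal sketch, not the gist author's method: `build_student_hash` is a hypothetical helper, and the names, links, taglines, and intros passed to it are made-up sample values standing in for the scraped arrays. It assumes the four arrays are parallel, so that index `i` in each one describes the same student.

```ruby
# Hypothetical helper: pairs up parallel arrays of scraped values by index
# and returns a Hash keyed by student name.
def build_student_hash(names, links, taglines, intros)
  names.zip(links, taglines, intros).each_with_object({}) do |(name, link, tagline, intro), result|
    # strip removes the whitespace that often surrounds scraped text
    result[name.strip] = { :link => link, :tagline => tagline, :intro => intro }
  end
end

# Made-up sample data (assumption: not taken from the real site)
sample = build_student_hash(
  ["Ada Example", "Bob Sample"],
  ["students/ada.html", "students/bob.html"],
  ["Developer", "Designer"],
  ["Ada writes Ruby.", "Bob draws things."]
)

puts sample["Ada Example"][:link]
```

Because `zip` pads with `nil` when one of the other arrays is shorter than `names`, a student whose tagline or excerpt is missing ends up with a `nil` value rather than raising an error. If the arrays drift out of alignment, though, values will be attached to the wrong student, so selecting each student's container element first and reading the fields inside it would be a sturdier approach.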