@crm114
Last active August 29, 2015 13:57
SpaceApps Challenges Ruby scraper
# gem install nokogiri
require 'nokogiri'
require 'open-uri'

base_uri = "https://2014.spaceappschallenge.org/"

# Fetch and parse a page. URI.join resolves the path against base_uri without
# producing a double slash, and URI#open (from open-uri) replaces the bare
# Kernel#open call, which stopped accepting URLs in Ruby 3.0.
fetch = ->(path) { Nokogiri::HTML(URI.join(base_uri, path).open) }

# Collect the href of every category link on the challenge index page.
category_links = fetch.call('challenge').css('#category_array a').map { |l| l['href'] }

category_hashes = {}
category_links.each do |link|
  # The last path segment of the category link serves as the category name.
  category = link.split('/').last
  challenge_links = fetch.call(link).css('#challenge_array a').map { |l| l['href'] }
  # Map each challenge name to the inner HTML of its description tab.
  category_challenges = challenge_links.map do |challenge_link|
    challenge_html = fetch.call(challenge_link)
    { challenge_html.at_css('h2').text => challenge_html.at_css('#descriptionTab').inner_html }
  end
  category_hashes[category] = { links: challenge_links, challenges: category_challenges }
end
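
The script leaves its results in `category_hashes` and does not persist them. As a rough sketch of one way to keep the data, the snippet below serializes a hash of the same shape to JSON using only the standard library. The `sample` data, the `earth-watch` category name, and the `dump_challenges` helper are assumptions made for illustration; none of them come from the original gist.

```ruby
require 'json'

# Hypothetical sample shaped like the scraper's category_hashes:
# { category => { links: [...], challenges: [{ name => description_html }, ...] } }
sample = {
  'earth-watch' => {
    links: ['/challenge/example-challenge'],
    challenges: [{ 'Example Challenge' => '<p>Description HTML</p>' }]
  }
}

# Hypothetical helper: serialize a category hash to pretty-printed JSON.
# Symbol keys such as :links become plain strings in the output.
def dump_challenges(category_hashes)
  JSON.pretty_generate(category_hashes)
end

json = dump_challenges(sample)

# Parsing the JSON back yields string keys throughout.
restored = JSON.parse(json)
puts restored['earth-watch']['challenges'].first.keys.first
```

Passing the real `category_hashes` to the helper and writing the returned string with `File.write` would store the scrape on disk, so later runs would not need to hit the site again.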