@stevenhuey
Created May 2, 2013 20:57
Given a page, use the Nokogiri gem to find all the HREFs (links) on the page.
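# Given a page, use the Nokogiri gem to find all the HREFs (links) on the page.
require 'nokogiri'
require 'open-uri'

def getAllHrefsInPage(page)
  # Use URI.open: with open-uri on Ruby 3.0+, Kernel#open no longer opens URLs.
  doc = Nokogiri::HTML(URI.open(page))
  # Collect the href of every <a> element; anchors without one yield "".
  hrefs = doc.css('a').map { |link| link['href'].to_s }
  # Drop empty values, remove duplicates, and return the list sorted.
  hrefs.reject(&:empty?).uniq.sort
end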
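
The hrefs returned above are whatever strings appear in the markup, so many of them may be relative (`/about`, `contact.html`, `#top`). The following is a minimal sketch of one way to resolve them against the page's own URL using only the standard library. `absolutize_hrefs` and the example URLs are hypothetical and not part of the original gist.

```ruby
require 'uri'

# Hypothetical helper (an assumption, not part of the original gist):
# resolve each href against the URL of the page it was found on.
def absolutize_hrefs(hrefs, base_url)
  resolved = hrefs.filter_map do |href|
    begin
      URI.join(base_url, href).to_s
    rescue URI::Error
      nil # skip hrefs that cannot be parsed as URIs
    end
  end
  resolved.uniq
end

# Example input standing in for the output of getAllHrefsInPage.
hrefs = ['/about', 'contact.html', 'https://example.org/', '#top']
puts absolutize_hrefs(hrefs, 'https://example.com/docs/index.html')
```

Skipping unparseable hrefs rather than raising is a design choice for this sketch. A stricter caller might prefer to log or collect them instead.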