@moserrya
Last active August 29, 2015 14:02
The Baconator
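require 'nokogiri'
require 'open-uri'
require 'set'

# A page in the search tree. `value` is the page's URI path; `next` points
# back to the node for the page it was reached from (nil for the start page).
Node = Struct.new(:value, :next)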
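module WikiScraper
  extend self

  # Returns the href of every internal /wiki/ link on the page at `uri`.
  def links(uri)
    snippets(uri).map do |link|
      link.attributes["href"].value
    end
  end

  # Selects the <a> elements whose href starts with /wiki/.
  def snippets(uri)
    document(uri).search('a').select do |link|
      if href = link.attributes["href"]
        href.value =~ /\A\/wiki\//
      end
    end
  end

  # Fetches and parses the page. URI.open is used because Kernel#open
  # no longer opens URLs as of Ruby 3.0.
  def document(uri)
    Nokogiri::HTML(URI.open("http://wikipedia.org#{uri}"))
  end
end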
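module Erdos
  extend self

  def links(page)
    WikiScraper.links(page)
  end

  # Breadth-first search from starting_uri. Returns the node for the first
  # page found that links to target_uri (follow `next` back to the start),
  # or nil if the queue empties without finding the target.
  def search(starting_uri, target_uri)
    node = Node.new(starting_uri, nil)
    queue = [node]
    visited = Set.new << starting_uri
    while node = queue.shift
      current_page = node.value
      links(current_page).each do |link|
        next unless visited.add?(link)
        return node if link == target_uri
        queue << Node.new(link, node)
      end
    end
  end
end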
if __FILE__ == $0
  target_uri = "/wiki/Kevin_Bacon"
  starting_uri = "/wiki/Secondhand_Lions:_A_New_Musical"
  p Erdos.search(starting_uri, target_uri)
end
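
The script prints the node returned by the search, which is a chain of back-pointers rather than a readable path. A minimal sketch of how that chain could be turned into an ordered list of pages follows. `path_from` is a hypothetical helper that is not part of the gist, and the `/wiki/A`, `/wiki/B`, `/wiki/C` paths are made-up placeholders so the example runs without touching the network.

```ruby
# Same shape as the gist's Node: `next` points at the page we arrived from.
Node = Struct.new(:value, :next)

# Hypothetical helper (not in the gist): walk the back-pointers from the
# node returned by the search and list the pages in start-to-finish order.
# The search returns the page that links to the target, so the target
# itself is appended at the end when it is given.
def path_from(node, target_uri = nil)
  path = []
  while node
    path.unshift(node.value)
    node = node.next
  end
  path << target_uri if target_uri
  path
end

# Hand-built chain standing in for a real search result.
start  = Node.new("/wiki/A", nil)
middle = Node.new("/wiki/B", start)

p path_from(middle, "/wiki/C")
# => ["/wiki/A", "/wiki/B", "/wiki/C"]
```

With the real search, the call would look like `path_from(Erdos.search(starting_uri, target_uri), target_uri)`, and the number of hops is the length of the returned array minus one.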