@jakeonrails
Forked from Sjors/robot_user_agents.rb
Created October 15, 2012 21:37
Recognize search engines and spammers using user-agents.org
require 'net/http'
require 'uri'
require 'xmlsimple' # from the xml-simple gem

# Fetch the full user-agent list published by user-agents.org.
url = "http://www.user-agents.org/allagents.xml"
xml_data = Net::HTTP.get_response(URI.parse(url)).body
data = XmlSimple.xml_in(xml_data)

# Keep agents typed "R" (robot/crawler) or "S" (spam bot).
agents = data['user-agent'].select do |agent|
  type = agent['Type'].to_a.first.to_s # guard in case an entry has no Type
  type.include?('R') || type.include?('S')
end

# Collect just the user-agent strings themselves.
agent_names = agents.collect { |agent| agent['String'].first }
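
Once agent_names is built, one natural use is to match an incoming request's User-Agent against it. Below is a minimal sketch, not part of the original gist: it assumes exact string matching against the list, and incoming_agent (with its Googlebot value) is a hypothetical example; in a Rails app it would typically come from request.user_agent.

# Hypothetical usage sketch (not in the original gist): flag a request
# whose User-Agent string appears verbatim in the harvested list.
incoming_agent = "Googlebot/2.1 (+http://www.google.com/bot.html)" # example value
puts "robot or spammer" if agent_names.include?(incoming_agent)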