@mmb
Created July 21, 2009 02:32
show recent anonymous edits to a Wikipedia article from interesting domains
#!/usr/bin/ruby
# show recent anonymous edits to a Wikipedia article from interesting domains
require 'hpricot'
require 'open-uri'
require 'socket'
require 'time' # Time.parse
require 'yaml'
article = 'Henry_Paulson'
boring = [
  'comcastbusiness.net',
  'comcast.net',
  'cox.net',
  'coxfiber.net',
  'rr.com',
]
name_cache_file = '/tmp/name_cache'
# File.exists? was removed in Ruby 3.2; File.exist? works on all versions
name_cache = File.exist?(name_cache_file) ?
  YAML::load_file(name_cache_file) : {}
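# Kernel#open no longer fetches URLs as of Ruby 3.0; open-uri provides URI.open
URI.open("http://en.wikipedia.org/w/index.php?title=#{article}&action=history") do |f|
  doc = Hpricot(f)
  (doc/'ul#pagehistory li').each do |li|
    # timestamp of the edit, in the form "HH:MM, D Month YYYY"
    whenn = Time.parse(li.inner_html.match(
      /(\d\d:\d\d, \d{1,2} (?:January|February|March|April|May|June|July|August|September|October|November|December) \d{4})/)[1])
    # user name or IP address, taken from the "talk" link
    who = (li/'a[text()="talk"]')[0]['href'].match(/User_talk:(.*?)(&|$)/)[1]
    # anonymous edits are attributed to an IPv4 address
    if who.match(/^[\d\.]+$/)
      # reverse lookup, cached across runs; the last argument asks for a
      # hostname, since Ruby 1.9.2+ skips the reverse lookup by default
      name_cache[who] ||= Socket.getaddrinfo(who, nil, nil, nil, nil, nil, true)[0][2]
      host = name_cache[who]
      # skip addresses that did not resolve to a name
      if host[/[a-z]/]
        domain = host.match(/^.*?\.?([^.]+\.[^.]+)$/)[1]
        puts "#{whenn.asctime} - #{who} (#{host}) (#{domain})" unless boring.include?(domain)
      end
    end
  end
end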
open(name_cache_file, 'w') { |f| f.write(YAML::dump(name_cache)) }
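
The domain filter in the script can be tried on its own, without fetching Wikipedia or doing any DNS lookups. The sketch below reuses the script's regex and `boring` list. The helper names `registered_domain` and `interesting?` and the sample hostnames are made up for illustration and are not part of the original gist.

```ruby
# Same list as the script's `boring` array.
BORING = %w[comcastbusiness.net comcast.net cox.net coxfiber.net rr.com].freeze

# Hypothetical helper: return the last two dot-separated labels of a
# hostname, using the regex from the script, or nil if there is no dot.
def registered_domain(host)
  m = host.match(/^.*?\.?([^.]+\.[^.]+)$/)
  m && m[1]
end

# Hypothetical helper: true when the host has a two-label domain that is
# not in the BORING list.
def interesting?(host)
  domain = registered_domain(host)
  !domain.nil? && !BORING.include?(domain)
end

# Sample hostnames, invented for this example.
['c-1-2-3-4.hsd1.ca.comcast.net', 'gw.example.gov', 'localhost'].each do |h|
  puts "#{h} -> #{registered_domain(h).inspect} interesting=#{interesting?(h)}"
end
```

Because the regex keeps only the last two labels, a host under a two-part suffix such as `bbc.co.uk` is reduced to `co.uk`. A `boring` entry for that kind of domain would therefore have to be written as the suffix itself.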