@bigblue
Created July 17, 2009 12:19
require 'rubygems'
require 'anemone'
require 'builder'

# Build the sitemap XML into an in-memory string.
sitemap = ""
xml = Builder::XmlMarkup.new(:target => sitemap, :indent => 2)
xml.instruct!
xml.urlset(:xmlns => 'http://www.sitemaps.org/schemas/sitemap/0.9') {
  Anemone.crawl("http://www.example.com/", :discard_page_bodies => true) do |anemone|
    # must be a better solution than this regex
    # Skip non-HTML resources; the dot is escaped so only real file extensions match.
    anemone.skip_links_like(/\.(csv|docx?|gif|jpe?g|js|mp3|mp4|mpe?g|pdf|png|ppt|rss|swf|txt|xlst?|xml)$/i)
    anemone.on_every_page do |page|
      xml.url {
        xml.loc(page.url.to_s)
        # lastmod is the crawl time, not the page's actual modification time.
        xml.lastmod(Time.now.utc.strftime("%Y-%m-%dT%H:%M:%S+00:00"))
        xml.changefreq('weekly')
      }
    end
  end
}

File.open('sitemap.xml', 'w') do |f|
  f.write sitemap
end
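
The comment in the crawl block asks for a better solution than the extension regex. One possible alternative, offered only as a sketch, is to parse each URL and look its file extension up in a set. The names `SKIPPED_EXTENSIONS` and `skippable?` are hypothetical and not part of Anemone; the extension list simply mirrors the one in the regex.

```ruby
require 'set'
require 'uri'

# Hypothetical constant: the same extensions the regex in the script lists.
SKIPPED_EXTENSIONS = Set.new(%w[
  .csv .doc .docx .gif .jpg .jpeg .js .mp3 .mp4 .mpg .mpeg
  .pdf .png .ppt .rss .swf .txt .xls .xlst .xml
]).freeze

# Hypothetical helper: true when the URL's path ends in a skipped extension.
# Only the path is inspected, so query strings do not affect the result.
def skippable?(url)
  path = URI.parse(url.to_s).path.to_s
  SKIPPED_EXTENSIONS.include?(File.extname(path).downcase)
rescue URI::InvalidURIError
  # Unparseable URLs are left for the crawler to handle.
  false
end
```

The helper accepts a String or a URI. Anemone's `focus_crawl` block, which selects the links to follow on each page, looks like one place to call it, for example by rejecting links for which `skippable?` returns true, though that wiring is an assumption and is untested here.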