@kinopyo
Created February 2, 2012 06:26
sitemap_generator with Heroku and Amazon S3 problem: Internal server error
# In the Gemfile I have these lines
gem 'sitemap_generator', '2.0.1.pre1'
gem 'carrierwave'
gem 'fog'

Run `bundle install` first, then run `rake sitemap:install` to create config/sitemap.rb.

# Set the host name for URL creation
SitemapGenerator::Sitemap.default_host = "http://MYSITE"
SitemapGenerator::Sitemap.sitemaps_host = "http://s3.amazonaws.com/MYBUCKET/"
SitemapGenerator::Sitemap.public_path = 'tmp/'
SitemapGenerator::Sitemap.sitemaps_path = 'sitemaps/'
SitemapGenerator::Sitemap.adapter = SitemapGenerator::WaveAdapter.new
SitemapGenerator::Sitemap.create do
  # Put links creation logic here.
  #
  # The root path '/' and sitemap index file are added automatically for you.
  # Links are added to the Sitemap in the order they are specified.
  #
  # Usage: add(path, options={})
  #        (default options are used if you don't specify)
  #
  # Defaults: :priority => 0.5, :changefreq => 'weekly',
  #           :lastmod => Time.now, :host => default_host
  #
  # Examples:
  #
  # Add '/articles'
  #
  #   add articles_path, :priority => 0.7, :changefreq => 'daily'
  #
  # Add all articles:
  #
  #   Article.find_each do |article|
  #     add article_path(article), :lastmod => article.updated_at
  #   end
  Classroom.find_each do |classroom|
    add classroom_path(classroom), lastmod: classroom.updated_at
  end
  add '/about', changefreq: 'monthly'
  add '/inquiry', changefreq: 'monthly'
  add '/privacy', changefreq: 'never'
  add '/terms', changefreq: 'never'
end
# in config/initializers/carrierwave.rb
CarrierWave.configure do |config|
  config.cache_dir = "#{Rails.root}/tmp/"
  config.storage = :fog
  config.permissions = 0666
  config.fog_credentials = {
    :provider => 'AWS',
    :aws_access_key_id => ENV['AWS_ACCESS_KEY_ID'],
    :aws_secret_access_key => ENV['AWS_SECRET_ACCESS_KEY'],
  }
  config.fog_directory = 'MYBUCKET'
end
heroku config:add AWS_ACCESS_KEY_ID=aaaaaa AWS_SECRET_ACCESS_KEY=bbbbbbbbbbbbbbbbb
# See http://www.robotstxt.org/wc/norobots.html for documentation on how to use the robots.txt file
#
# To ban all spiders from the entire site uncomment the next two lines:
# User-Agent: *
# Disallow: /
Sitemap: http://s3.amazonaws.com/MYBUCKET/sitemaps/sitemap_index.xml.gz
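The Sitemap URL above is the `sitemaps_host` and `sitemaps_path` values from config/sitemap.rb followed by the index file name. As a rough sketch of how the pieces line up, the helper below joins them; `sitemap_index_url` is a hypothetical name used only for illustration and is not part of sitemap_generator.

```ruby
# Hypothetical helper (not part of sitemap_generator): joins the host, path and
# index file name so the result can be compared with the robots.txt entry.
def sitemap_index_url(sitemaps_host, sitemaps_path, filename = "sitemap_index.xml.gz")
  [sitemaps_host, sitemaps_path, filename]
    .map { |part| part.gsub(%r{\A/+|/+\z}, "") } # trim slashes at both ends of each piece
    .reject(&:empty?)
    .join("/")
end

puts sitemap_index_url("http://s3.amazonaws.com/MYBUCKET/", "sitemaps/")
# prints http://s3.amazonaws.com/MYBUCKET/sitemaps/sitemap_index.xml.gz
```

If the printed URL differs from the one in robots.txt, one of the two settings probably needs adjusting.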
# Test locally
# This should generate the XML and upload it to S3 if everything is set up correctly
export AWS_ACCESS_KEY_ID=aaaaaa
export AWS_SECRET_ACCESS_KEY=bbbbbbbbbbbbbbbbb
rake sitemap:refresh:no_ping
# After pushing to Heroku...
# generate sitemap and ping search engines
heroku run rake sitemap:refresh
If you use Google Webmaster Tools and want to specify the URL of your sitemap, note that it only accepts a path relative to your host, so you may need to create a route and controller that point to the sitemap XML on S3.
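One possible way to do that, sketched here as an assumption rather than the approach the gist actually used, is a small Rack endpoint that redirects a path on your own host to the file on S3. `SITEMAP_S3_URL` and `SitemapRedirect` are hypothetical names, and the routes line in the comment is assumed wiring. Whether Webmaster Tools accepts a redirected sitemap is something to verify for your setup.

```ruby
# Hypothetical constant: the public S3 URL of the sitemap index, matching the
# sitemaps_host and sitemaps_path settings in config/sitemap.rb.
SITEMAP_S3_URL = "http://s3.amazonaws.com/MYBUCKET/sitemaps/sitemap_index.xml.gz"

# A minimal Rack endpoint: an object that responds to call(env) and returns
# [status, headers, body]. It answers every request with a redirect to S3.
SitemapRedirect = lambda do |_env|
  headers = { "location" => SITEMAP_S3_URL, "content-type" => "text/plain" }
  [302, headers, ["Redirecting to #{SITEMAP_S3_URL}"]]
end

# Assumed wiring in config/routes.rb (Rails routes accept a Rack endpoint):
#   get "/sitemap_index.xml.gz", to: SitemapRedirect
```

With a route like this in place, the relative path `/sitemap_index.xml.gz` can be entered in Webmaster Tools while the file itself stays on S3.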
Further reading:
http://www.billrowell.com/2012/02/01/create-an-xml-sitemap-on-heroku-via-amazon-s3/
Docs:
https://github.com/kjvarga/sitemap_generator
https://github.com/kjvarga/sitemap_generator/wiki/Generate-Sitemaps-on-read-only-filesystems-like-Heroku