Skip to content

Instantly share code, notes, and snippets.

@timcheadle
Last active January 26, 2023 00:56
Show Gist options
  • Save timcheadle/3761844 to your computer and use it in GitHub Desktop.
Save timcheadle/3761844 to your computer and use it in GitHub Desktop.
Make /robots.txt aware of the Rails environment

Make /robots.txt aware of the Rails environment

You probably don't want Google crawling your development staging app. Here's how to fix that.

$ mv public/robots.txt config/robots.production.txt
$ cp config/robots.production.txt config/robots.development.txt

Now edit config/routes.rb to add a route for /robots.txt, and add the controller code.

def robots
robots = File.read(Rails.root + "config/robots.#{Rails.env}.txt")
render :text => robots, :layout => false, :content_type => "text/plain"
end
# (moved from public/robots.txt)
#
# See http://www.robotstxt.org/wc/norobots.html for documentation on how to use the robots.txt file
#
# To ban all spiders from the entire site uncomment the next two lines:
User-Agent: *
Disallow: /
# (moved from public/robots.txt)
#
# See http://www.robotstxt.org/wc/norobots.html for documentation on how to use the robots.txt file
#
# To ban all spiders from the entire site uncomment the next two lines:
# User-Agent: *
# Disallow: /
get '/robots.txt' => 'home#robots'
@chenillen
Copy link

It cause that all robots request will be handled by rails server not http server?

ok with the cache

@tomfuertes
Copy link

robots.development.rb should be renamed robots.development.txt*

@elbartostrikesagain
Copy link

Your development staging app should be using the production environment, not development environment. I'd recommend setting a separate environment variable like DISABLE_ROBOTS=true and then using the following instead

class HomeController < ApplicationController
caches_page :robots

def robots
robot_type = ENV["DISABLE_ROBOTS"] == "true" ? "staging" : "production"
robots = File.read(Rails.root + "config/robots/robots.#{robot_type}.txt")
render :text => robots, :layout => false, :content_type => "text/plain"
end

end

In rails 4 you'll need this for page caching https://github.com/rails/actionpack-page_caching

@olliebennett
Copy link

This approach failed for me after updating to Rails 5.1, but the code below worked instead:

  # Robots.txt
  def robots
    robots = File.read(Rails.root.join('config', "robots.#{Rails.env}.txt"))
    render plain: robots
  end

@abhinavmathur
Copy link

abhinavmathur commented Mar 22, 2018

Instead of modifying the robots.txt file, we can simply use something like in the head section of application.html.erb. This will ban spiders from crawling the entire staging site.

<% if Rails.env.staging? %>
<meta name="robots" content="noindex,nofollow">
<% end %>

@coorasse
Copy link

I created a gem that uses a rack middleware: https://github.com/renuo/norobots

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment