Make /robots.txt aware of the Rails environment

You probably don't want Google crawling your development staging app. Here's how to fix that.

$ mv public/robots.txt config/robots.production.txt
$ cp config/robots.production.txt config/robots.development.txt

Now edit config/routes.rb to add a route for /robots.txt, and add the controller code.

config/routes.rb:

    get '/robots.txt' => 'home#robots'

app/controllers/home_controller.rb:

    class HomeController < ApplicationController
      def robots
        robots = File.read(Rails.root + "config/robots.#{Rails.env}.txt")
        render :text => robots, :layout => false, :content_type => "text/plain"
      end
    end

config/robots.development.txt:

    # (moved from public/robots.txt)
    #
    # See http://www.robotstxt.org/wc/norobots.html for documentation on how to use the robots.txt file
    #
    # To ban all spiders from the entire site uncomment the next two lines:
    User-Agent: *
    Disallow: /

config/robots.production.txt:

    # (moved from public/robots.txt)
    #
    # See http://www.robotstxt.org/wc/norobots.html for documentation on how to use the robots.txt file
    #
    # To ban all spiders from the entire site uncomment the next two lines:
    # User-Agent: *
    # Disallow: /
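The per-environment lookup is just string interpolation over the environment name. A minimal plain-Ruby sketch of the same idea, outside Rails (the `robots_body` helper is hypothetical, standing in for the controller action; `root` and `env` stand in for `Rails.root` and `Rails.env`):

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: reads config/robots.<env>.txt under the given root,
# mirroring what the controller action does with Rails.root and Rails.env.
def robots_body(root, env)
  File.read(File.join(root, "config", "robots.#{env}.txt"))
end

Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, "config"))
  File.write(File.join(root, "config", "robots.development.txt"),
             "User-Agent: *\nDisallow: /\n")
  File.write(File.join(root, "config", "robots.production.txt"),
             "# Disallow: /\n")

  puts robots_body(root, "development") # the blocking rules
  puts robots_body(root, "production")  # the permissive (commented-out) rules
end
```

Whichever environment the app boots in selects the matching file, so development and staging can serve a blanket Disallow while production stays open.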
chenillen Oct 8, 2013

Doesn't this mean that all requests for robots.txt will be handled by the Rails server rather than the HTTP server?

OK with the cache, though.

tomfuertes Nov 5, 2013

robots.development.rb should be renamed robots.development.txt*

elbartostrikesagain Feb 26, 2014

Your development/staging app should be using the production environment, not the development environment. I'd recommend setting a separate environment variable like DISABLE_ROBOTS=true and then using the following instead:

    class HomeController < ApplicationController
      caches_page :robots

      def robots
        robot_type = ENV["DISABLE_ROBOTS"] == "true" ? "staging" : "production"
        robots = File.read(Rails.root + "config/robots/robots.#{robot_type}.txt")
        render :text => robots, :layout => false, :content_type => "text/plain"
      end
    end

In Rails 4 you'll need this for page caching: https://github.com/rails/actionpack-page_caching
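Wiring up the linked gem looks roughly like this (a sketch, not from the gist: the gem name comes from the repo above, and page_cache_directory is the setting that gem documents; confirm the exact form against its README for your Rails version):

```ruby
# Gemfile
gem "actionpack-page_caching"

# config/environments/production.rb (or an initializer)
Rails.application.configure do
  # Where caches_page writes the static files it serves on later requests.
  config.action_controller.page_cache_directory = Rails.root.join("public")
end
```

With that in place, caches_page :robots writes the rendered response to disk on the first hit, so subsequent robots.txt requests never reach the Rails app.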


olliebennett Jan 12, 2018

This approach failed for me after updating to Rails 5.1 (render :text is no longer supported there), but the code below worked instead:

  # Robots.txt
  def robots
    robots = File.read(Rails.root.join('config', "robots.#{Rails.env}.txt"))
    render plain: robots
  end

abhinavmathur Mar 22, 2018

Instead of modifying the robots.txt file, we can simply use something like the following in the head section of application.html.erb. This will ban spiders from indexing the entire staging site.

    <% if Rails.env.staging? %>
      <meta name="robots" content="noindex,nofollow">
    <% end %>
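A variant of the same idea at the HTTP level: an X-Robots-Tag response header applies to non-HTML responses (images, PDFs) as well as pages. A hypothetical sketch as a tiny Rack-style middleware (the NoIndexHeader class name is made up; in a Rails app you might enable it only for staging, e.g. with config.middleware.use in config/environments/staging.rb):

```ruby
# Rack-style middleware that stamps every response with an X-Robots-Tag
# header telling crawlers not to index it or follow its links.
class NoIndexHeader
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    headers["X-Robots-Tag"] = "noindex, nofollow"
    [status, headers, body]
  end
end

# Minimal usage without a web server: wrap a lambda app and call it directly.
app = NoIndexHeader.new(->(_env) { [200, { "Content-Type" => "text/plain" }, ["ok"]] })
status, headers, _body = app.call({})
puts headers["X-Robots-Tag"]  # noindex, nofollow
```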
