ydnar (owner)

gist: 212977
Description:
Test robots.txt with Cucumber & Robots
Public Clone URL: git://gist.github.com/212977.git
robot_steps.rb
require 'robots'
 
# Cucumber steps for robots.txt
# Requires the robots gem: http://github.com/fizx/robots
# Note: url_for() must return a fully-qualified URL for a page name,
# in the same way that path_to() returns a path
 
# Set the user agent that later steps check against robots.txt
Given /^I am a robot named (.+)$/ do |user_agent|
  @robots = Robots.new user_agent
end
 
# be_true/be_false are RSpec 1/2 matchers; RSpec 3 renamed them
# be_truthy/be_falsey
Then /^I should be allowed to request (.+)$/ do |page_name|
  url = url_for(page_name)
  @robots.allowed?(url).should be_true
end
 
Then /^I should not be allowed to request (.+)$/ do |page_name|
  url = url_for(page_name)
  @robots.allowed?(url).should be_false
end
 
robots.txt.feature
Feature: robots.txt
  In order to limit access to certain pages by Google and other crawlers
  I want to have a robots.txt file with certain URLs masked out
 
  Scenario Outline: Allowed URLs
    Given I am a robot named Google
    Then I should be allowed to request <resource>
 
    Examples:
      | resource      |
      | the home page |
      | /foo-bar      |
      | /favicon.ico  |
 
  Scenario Outline: Disallowed URLs
    Given I am a robot named Google
    Then I should not be allowed to request <resource>
 
    Examples:
      | resource      |
      | the dashboard |
      | the edit page |
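
The steps in robot_steps.rb call url_for(), which the gist does not define. The following is a minimal sketch of one way such a helper could look, not the author's implementation. The module name RobotUrlHelpers, the BASE_URL value, and the paths mapped to "the dashboard" and "the edit page" are all hypothetical; substitute the routes of the application under test. Resources that already start with "/" (such as /foo-bar in the feature) are treated as literal paths.

```ruby
require 'uri'

# Hypothetical helper module; none of these names come from the gist.
module RobotUrlHelpers
  # Assumed address of the running app whose robots.txt is being checked.
  BASE_URL = 'http://www.example.com'

  # Assumed mapping from the page names used in the feature to paths.
  NAMED_PATHS = {
    'the home page' => '/',
    'the dashboard' => '/dashboard',
    'the edit page' => '/edit'
  }.freeze

  # Resolve a page name or a literal path to a fully-qualified URL.
  def url_for(page_name)
    path = NAMED_PATHS.fetch(page_name) do
      # Not a known page name: accept it only if it looks like a path.
      unless page_name.start_with?('/')
        raise ArgumentError, "no path mapping for #{page_name.inspect}"
      end
      page_name
    end
    URI.join(BASE_URL, path).to_s
  end
end

# In a Cucumber support file (e.g. features/support/), the module can be
# mixed into the step definitions' World:
# World(RobotUrlHelpers)
```

Registering the module with World() rather than defining a top-level method avoids clashing with any url_for that a framework may already provide. As far as I understand the robots gem, allowed? looks up robots.txt on the host named in the URL, so the host in BASE_URL would need to be serving the application while the feature runs.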