@budparr
Last active August 13, 2019 14:21
Jekyll robots.txt template that disallows the pages and documents excluded from the sitemap
---
layout: null
permalink: robots.txt
---
# collect pages and documents whose front matter sets sitemap: false
{% assign noindexPages = site.pages | where: 'sitemap', false %}
{% assign noindexDocuments = site.documents | where: 'sitemap', false %}
User-agent: *
# robotstxt.org - unless the production variable in _config.yml is true, all robots are disallowed.
{% if site.production != true %}
Disallow: /
{% else %}
{% if noindexPages.size > 0 or noindexDocuments.size > 0 %}
{% for node in noindexPages %}
Disallow: {{ node.url }}
{% endfor %}
{% for node in noindexDocuments %}
Disallow: {{ node.url }}
{% endfor %}
{% else %}
Disallow:
{% endif %}
{% endif %}
@budparr
Author

budparr commented Apr 7, 2016

In _config.yml:

production: [true/false]
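
For illustration, here is a minimal sketch of the two settings the template reads. It assumes a standard `_config.yml` and that excluded pages mark themselves with `sitemap: false` in their front matter, which is the key the `where` filters above look for. The values shown are examples, not taken from the gist.

```yaml
# _config.yml: set to true only for the deployed site.
# Any other value (or leaving the key out) makes robots.txt emit "Disallow: /".
production: true

# In the front matter of each page or document to hide (between its --- lines):
# sitemap: false
```

One possible way to keep local builds non-indexable is a second config file that sets `production: false` and is passed after the main one, e.g. `jekyll build --config _config.yml,_config_dev.yml`; the name `_config_dev.yml` is hypothetical.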

Also, in your document head:

{% if site.production == true and page.sitemap != false %}
  <META NAME="ROBOTS" CONTENT="INDEX, FOLLOW">
{% else %}
  <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
{% endif %}

@sylvaindeschenes

Would it be a good idea to add this line at the top of robots.txt to indicate the location of the sitemap?

Sitemap: {{ site.url }}/sitemap.xml
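
As a hedged sketch of that suggestion: the `Sitemap:` directive is not tied to any `User-agent` group, so it can go at the top or bottom of the template. The variant below assumes a sitemap is actually generated at `/sitemap.xml` (for example by the jekyll-sitemap plugin) and uses Jekyll's `absolute_url` filter so that a non-empty `baseurl` is included as well.

```liquid
# Sketch: assumes a sitemap is generated at /sitemap.xml
Sitemap: {{ '/sitemap.xml' | absolute_url }}
```

With `url` set in `_config.yml` and an empty `baseurl`, this should render the same line as the `site.url` version above.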
