Forked from petskratt/robots.txt
Last active May 23, 2018 13:28
Magento 1.9.x - robots.txt
# robots.txt for Magento 1.9.x / v0.1 2018-05-23 / Simone Bussoni / inspired by Peeter Marvet
# # based on:
# comment and clone at
# Keep in mind that by the original standard robots.txt should NOT contain empty lines, except between User-agent (UA) blocks!
# Sitemap (uncomment and adjust; add language- or shop-specific sitemaps as needed. If running on multiple domains,
# keep in mind that a sitemap can only point to its own domain, so something like sitemapindex.php is needed)
# Sitemap:
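# Example only (www.example.com and sitemap.xml are placeholders, not taken from the original file):
# Sitemap: https://www.example.com/sitemap.xml
# For a multi-domain setup, a per-domain index such as the sitemapindex.php mentioned above might look like:
# Sitemap: https://www.example.com/sitemapindex.php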
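# Google Image Crawler Setup - a crawler-specific section makes that crawler ignore the generic one (i.e. *)
# The empty Disallow below allows everything and keeps this block separate from the next User-agent line
User-agent: Googlebot-Image
Disallow: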
# Yandex tends to be rather aggressive, may be worth keeping them at arm's length
User-agent: YandexBot
Crawl-delay: 20
# Crawlers Setup
User-agent: *
# Allow paging (unless paging inside a listing with more params, as disallowed below)
Allow: /*?p=
# Directories
Disallow: /404/
Disallow: /app/
Disallow: /cgi-bin/
Disallow: /downloader/
Disallow: /errors/
Disallow: /includes/
Disallow: /magento/
#Disallow: /media/
Disallow: /media/captcha/
#Disallow: /media/catalog/
Disallow: /media/customer/
Disallow: /media/dhl/
Disallow: /media/downloadable/
Disallow: /media/import/
Disallow: /media/pdf/
Disallow: /media/sales/
Disallow: /media/tmp/
#Disallow: /media/wysiwyg/
Disallow: /media/xmlconnect/
Disallow: /pkginfo/
Disallow: /report/
Disallow: /scripts/
Disallow: /shell/
#Disallow: /skin/
Disallow: /stats/
Disallow: /var/
# Paths (if using shop id in URL must prefix with * or copy for each)
Disallow: */index.php/
Disallow: */catalog/product_compare/
Disallow: */catalog/category/view/
Disallow: */catalog/product/view/
Disallow: */catalog/product/gallery/
Disallow: */catalogsearch/
Disallow: */control/
Disallow: */contacts/
Disallow: */customer/
Disallow: */customize/
Disallow: */newsletter/
Disallow: */poll/
Disallow: */review/
Disallow: */sendfriend/
Disallow: */tag/
Disallow: */wishlist/
Disallow: */checkout/
Disallow: */onestepcheckout/
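# Illustration of the note above (the store codes /en/ and /it/ are hypothetical):
# with store codes in the URL, a wildcard rule such as "Disallow: */checkout/" covers
# /checkout/, /en/checkout/ and /it/checkout/ in one line. The "copy for each" alternative,
# in case a crawler does not handle a leading wildcard, would be one rule per store code:
# Disallow: /en/checkout/
# Disallow: /it/checkout/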
# Files
Disallow: /cron.php
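# Disallow: /   (file name missing from this rule; a bare "Disallow: /" would block the entire site, so it is commented out)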
Disallow: /error_log
Disallow: /install.php
Disallow: /LICENSE.html
Disallow: /LICENSE.txt
Disallow: /LICENSE_AFL.txt
Disallow: /STATUS.txt
# Do not crawl subcategory pages that are sorted or filtered.
# Blocking every query string is very broad and could hurt (including SEO), so it is left commented out:
# Disallow: /*?*
# These are more specific, pick what you need - and do not forget to add your custom filters!
Disallow: /*?dir*
Disallow: /*?limit*
Disallow: /*?mode*
Disallow: /*?___from_store=*
Disallow: /*?___store=*
Disallow: /*?cat=*
Disallow: /*?q=*
Disallow: /*?price=*
Disallow: /*?availability=*
Disallow: /*?brand=*
Disallow: /*?min=*
Disallow: /*?max=*
Disallow: /*?formato=*
Disallow: /*?fragranza=*
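# Sketch of a custom filter rule (the attribute code "color" is hypothetical, use your own layered navigation codes):
# Disallow: /*?color=*
# Rules of this shape only match when the parameter comes first after "?". To also catch it later
# in the query string, a second pattern anchored on "&" could be added:
# Disallow: /*&color=*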
# Paths that can be safely ignored (no clean URLs)
Disallow: /*?p=*&
Disallow: /*.php$
Disallow: /*?SID=
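# How the paging rules above interact, assuming a crawler that lets the longest matching rule win
# (as Google does and RFC 9309 specifies). The /shoes URLs are hypothetical:
#   /shoes?p=2           -> allowed: matches "Allow: /*?p=" and no Disallow rule
#   /shoes?p=2&limit=36  -> blocked: "Disallow: /*?p=*&" also matches and is longer than "Allow: /*?p="
#   /shoes?limit=36&p=2  -> blocked: matches "Disallow: /*?limit*", while "Allow: /*?p=" does not match ("&p=", not "?p=")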