Official prerender.io .htaccess for Apache.
# Change YOUR_TOKEN to your prerender token and uncomment that line if you want to cache urls and view crawl stats
# Change http://example.com (at the end of the last RewriteRule) to your website url
<IfModule mod_headers.c>
#RequestHeader set X-Prerender-Token "YOUR_TOKEN"
</IfModule>
<IfModule mod_rewrite.c>
RewriteEngine On
<IfModule mod_proxy_http.c>
RewriteCond %{HTTP_USER_AGENT} baiduspider|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|quora\ link\ preview|showyoubot|outbrain|pinterest|slackbot|vkShare|W3C_Validator [NC,OR]
RewriteCond %{QUERY_STRING} _escaped_fragment_
# Only proxy the request to Prerender if it's a request for HTML
RewriteRule ^(?!.*?(\.js|\.css|\.xml|\.less|\.png|\.jpg|\.jpeg|\.gif|\.pdf|\.doc|\.txt|\.ico|\.rss|\.zip|\.mp3|\.rar|\.exe|\.wmv|\.doc|\.avi|\.ppt|\.mpg|\.mpeg|\.tif|\.wav|\.mov|\.psd|\.ai|\.xls|\.mp4|\.m4a|\.swf|\.dat|\.dmg|\.iso|\.flv|\.m4v|\.torrent|\.ttf|\.woff))(.*) http://service.prerender.io/http://example.com/$2 [P,L]
</IfModule>
</IfModule>
@benceg

A warning to anyone using CodeIgniter (or any PHP script that warrants a RewriteRule on index.php): you will need to add a capture group to the proxy rule, for index.php.

The line will read as follows:

RewriteRule ^(?!.*?(\.js|\.css|\.xml|\.less|\.png|\.jpg|\.jpeg|\.gif|\.pdf|\.doc|\.txt|\.ico|\.rss|\.zip|\.mp3|\.rar|\.exe|\.wmv|\.doc|\.avi|\.ppt|\.mpg|\.mpeg|\.tif|\.wav|\.mov|\.psd|\.ai|\.xls|\.mp4|\.m4a|\.swf|\.dat|\.dmg|\.iso|\.flv|\.m4v|\.torrent))(index\.php)?(.*) http://service.prerender.io/%{REQUEST_SCHEME}://%{HTTP_HOST}/$3 [P,L]
@thoop
Owner

Thanks @benceg!

@dobesv

For me, I had to change the proxy target to remove an extra "/", so

http://service.prerender.io/http://example.com$2

This is because the pattern being matched has a leading '/' already.
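
Whether the extra slash is needed most likely depends on where the rule lives. In a .htaccess file (per-directory context) mod_rewrite strips the leading slash before matching, while in server or <VirtualHost> context the matched path keeps it. A minimal sketch of both variants, with the extension list shortened for readability and example.com as a placeholder (use one variant, not both):

```apache
# Variant A: rule in .htaccess (per-directory context).
# The matched path has no leading slash, so the target supplies one before $2.
RewriteRule ^(?!.*?(\.js|\.css|\.png))(.*) http://service.prerender.io/http://example.com/$2 [P,L]

# Variant B: rule in httpd.conf or a <VirtualHost> block (server context).
# The matched path already starts with "/", so the target omits it.
# RewriteRule ^(?!.*?(\.js|\.css|\.png))(.*) http://service.prerender.io/http://example.com$2 [P,L]
```

If the same rule has to work in both places, substituting %{REQUEST_URI} for the back-reference may be simpler, since that variable always starts with a slash.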

@baki250

There is an issue where the service is caching the / home page for all URLs. Any idea what might be causing this?

Thanks

@thoop
Owner

@baki250 just saw your comment. Following up with you via email.

@ioloie

@baki250 I had a similar issue except it was always caching /index.html. The cause was that I had another rewrite for pushstate that always returned index.html:

        RewriteCond %{REQUEST_FILENAME} !-f
        RewriteCond %{REQUEST_FILENAME} !-d
        RewriteRule .* index.html [L]

The problem was that if this rule is carried out first, then the second rule receives /index.html as its URI. So prerender is always being asked for example.com/index.html. The solution is to switch the order around.
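
A minimal sketch of the corrected ordering, assuming the rules live in a .htaccess file; the user-agent and extension lists are shortened and example.com is a placeholder:

```apache
RewriteEngine On

# 1. Proxy crawler requests first, while the URI is still the one the client sent.
RewriteCond %{HTTP_USER_AGENT} baiduspider|facebookexternalhit|twitterbot [NC,OR]
RewriteCond %{QUERY_STRING} _escaped_fragment_
RewriteRule ^(?!.*?(\.js|\.css|\.png))(.*) http://service.prerender.io/http://example.com/$2 [P,L]

# 2. Only then fall back to index.html for pushState routes
#    that do not exist as a file or directory.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule .* index.html [L]
```

With this order a crawler request is proxied before the fallback can replace its path, so Prerender is asked for the original URL rather than /index.html.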

@thoop
Owner

Thanks @iolo-matchbook!

@andykillen

You're missing "visionutils" and "Facebot" for Facebook user agents.
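
As a sketch, the two names could be added to the user-agent condition like this. The list is shortened here, and it is an assumption taken from the comment above that both strings appear in the crawlers' User-Agent headers:

```apache
# [NC] makes the match case-insensitive, so "facebot" also matches "Facebot".
RewriteCond %{HTTP_USER_AGENT} baiduspider|facebookexternalhit|facebot|visionutils|twitterbot [NC,OR]
RewriteCond %{QUERY_STRING} _escaped_fragment_
# ...followed by the existing proxy RewriteRule, unchanged.
```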

@ianmstew

The unescaped "quora link preview" on line 12 causes the error "RewriteCond: bad flag delimiters". Escaping the spaces fixes the problem: quora\ link\ preview
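
The error most likely arises because Apache splits directive arguments on whitespace, so the words after the first space are read as the flags argument. A short sketch with an abbreviated user-agent list:

```apache
# Fails with "bad flag delimiters": the spaces split the pattern into extra arguments.
# RewriteCond %{HTTP_USER_AGENT} embedly|quora link preview|showyoubot [NC,OR]

# Works: each space is escaped with a backslash.
RewriteCond %{HTTP_USER_AGENT} embedly|quora\ link\ preview|showyoubot [NC,OR]
RewriteCond %{QUERY_STRING} _escaped_fragment_

# Alternative: wrap the whole pattern in double quotes instead of escaping.
# RewriteCond %{HTTP_USER_AGENT} "embedly|quora link preview|showyoubot" [NC,OR]
```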

@thoop
Owner

Thanks @ianmstew!

@ianmstew

Hey @thoop, I ran into an issue with this .htaccess that had me stumped for days, until I realized it was the issue with proxying the server root specifically using mod_rewrite, described here and here. Both solutions recommended using the ProxyPass directive instead, which does not work in our case because we need special rewrite conditions.

I discovered a workaround, however. This solves the server root case where, by the time mod_rewrite rules are evaluated, the incoming blank ("/") root request has already been converted to "index.html" by Apache. I would love to find a more elegant solution, but for now this has solved my problem.

RewriteCond %{HTTP_USER_AGENT} baiduspider|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|quora\ link\ preview|showyoubot|outbrain|pinterest [NC,OR]
RewriteCond %{QUERY_STRING} _escaped_fragment_
RewriteCond %{REQUEST_URI} ^/index\.html$
# Proxy the server root
RewriteRule .* http://service.prerender.io/http://example.com/ [P,L]

RewriteCond %{HTTP_USER_AGENT} baiduspider|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|quora\ link\ preview|showyoubot|outbrain|pinterest [NC,OR]
RewriteCond %{QUERY_STRING} _escaped_fragment_
# Only proxy the request to Prerender if it's a request for HTML
RewriteRule ^(?!.*?(\.js|\.css|\.xml|\.less|\.png|\.jpg|\.jpeg|\.gif|\.pdf|\.doc|\.txt|\.ico|\.rss|\.zip|\.mp3|\.rar|\.exe|\.wmv|\.doc|\.avi|\.ppt|\.mpg|\.mpeg|\.tif|\.wav|\.mov|\.psd|\.ai|\.xls|\.mp4|\.m4a|\.swf|\.dat|\.dmg|\.iso|\.flv|\.m4v|\.torrent))(.*) http://service.prerender.io/http://example.com/$2 [P,L]
@thoop
Owner

Hmm, interesting. Is this just because you have your server behind a path and not the root?

@davidkcyip

@baki250 @homerjam Thanks! I've been stumped on this for the past week and both of your solutions have helped me greatly! It's now indexing not just the index.

@vinceMeens

I couldn't get it working the first time round, for several reasons. The instructions on most sites were for HTML5 push state, while we used #! routing. So I tested pretty much every setting and configuration, which resulted in the following overview and solutions for both #! and HTML5 push state.

SEO / Prerender.io (Angular) etc.

How the solution works in general:

  1. The search engine notices that your page is rendered using JavaScript instead of on the server side.
  2. The search engine requests your pages with a modified URL instead of the original one (_escaped_fragment_).
  3. You return the prerendered HTML (thanks to prerender.io) to the crawler.

Instructions depending on your routing setup:

Option 1: # Hash routing

Example:

  # http://www.example.com/#/user/123
  #! http://www.example.com/#!/user/123

Problem:
Nothing after the # (hash) in the url gets sent to your server.

Solution:
In angular:

      $locationProvider.hashPrefix('!');
      $locationProvider.html5Mode(false);

In the HTML, remove the following meta tag if it is present:

    <head>
      <!-- REMOVE --> <meta name="fragment" content="!"> <!-- /REMOVE -->
    </head>

Every time a search engine finds a URI like this:

  http://www.example.com/#!/user/123

It will send a request like this:

http://www.example.com/?_escaped_fragment_=/user/123

Configure Apache:

  RewriteEngine On
    # If requested resource exists as a file or directory
      # (REQUEST_FILENAME is only relative in virtualhost context, so not usable)
        RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -f [OR]
        RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -d
      # Only exception is /index.htm, /index.html
        RewriteCond %{REQUEST_URI} !/index\.html?
      # Go to it as is
        RewriteRule ^ - [L]
    # If non existent
      # If path ends with / and is not just a single /, redirect to without the trailing /
        RewriteCond %{REQUEST_URI} ^.*/$
        RewriteCond %{REQUEST_URI} !^/$
        RewriteRule ^(.*)/$ $1 [R,QSA,L]
      # If path that is not empty or / or /index.htm or /index.html, redirect to /#!/path
        RewriteCond %{REQUEST_URI} !(/index\.html?|/|)$
        RewriteRule ^(.*)$ /#!$1 [R,QSA,NE,L]
      # If not /, redirect to it.
        RewriteCond %{REQUEST_URI} !^/$
        RewriteRule ^ / [R,QSA,L]

  # Handle Prerender.io
    RequestHeader set X-Prerender-Token "YOUR_TOKEN"

    RewriteCond %{HTTP_USER_AGENT} baiduspider|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|quora\ link\ preview|showyoubot|outbrain|pinterest [NC,OR]
    RewriteCond %{QUERY_STRING} _escaped_fragment_
    RewriteCond %{QUERY_STRING} _escaped_fragment_=([^&]*)

  # Proxy the request
    RewriteRule ^ http://service.prerender.io/http://%{HTTP_HOST}/?_escaped_fragment_=%1 [P,L]

Option 2: HTML5 push state routing

Example:

http://www.example.com/user/123

Problem:
You need to tell the search engine that your HTML5 push state page uses JavaScript to generate its content.

Solution:

In angular:

$locationProvider.html5Mode(true);

In html, add this meta header:

<head>
    <meta name="fragment" content="!">
</head>

Every time a search engine finds a URI like this:

http://www.example.com/user/123

It will send a request like this:

http://www.example.com/user/123?_escaped_fragment_= 

Configure Apache:

  RewriteEngine On
# If requested resource exists as a file or directory
  # (REQUEST_FILENAME is only relative in virtualhost context, so not usable)
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -f [OR]
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -d
    # Go to it as is
    RewriteRule ^ - [L]

  # If non existent
    # If path ends with / and is not just a single /, redirect to without the trailing /
      RewriteCond %{REQUEST_URI} ^.*/$
      RewriteCond %{REQUEST_URI} !^/$
      RewriteRule ^(.*)/$ $1 [R,QSA,L]      

  # Handle Prerender.io
    RequestHeader set X-Prerender-Token "YOUR_TOKEN"

    RewriteCond %{HTTP_USER_AGENT} baiduspider|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|quora\ link\ preview|showyoubot|outbrain|pinterest [NC,OR]
    RewriteCond %{QUERY_STRING} _escaped_fragment_

    # Proxy the request
    RewriteRule ^(.*)$ http://service.prerender.io/http://%{HTTP_HOST}$1 [P,L]

  # If non existent
    # Accept everything on index.html
    RewriteRule ^ /index.html

How prerender.io requests pages from your server once a URL is submitted to it:

http://service.prerender.io/http://www.example.com/user/123 -> http://www.example.com/user/123
http://service.prerender.io/http://www.example.com/?_escaped_fragment_=/user/123 -> http://www.example.com/index.html#!/user/123
http://service.prerender.io/http://www.example.com/?_escaped_fragment_=/user/123&var1=val1&val2=val2 -> http://www.example.com/index.html?var1=val1&val2=val2#!/user/123

Other resources

http://www.prerender.io
http://scotch.io/tutorials/javascript/angularjs-seo-with-prerender-io
@amergin

You may want to omit .ttf and .woff files as well (fonts). I noticed font-awesome was being requested from the prerender service.

@stevendeeds

I'm having trouble getting Googlebot (and maybe others) to cache my pages.

I'm using EmberJS and Prerender.io.

Here is my Apache .htaccess

AddOutputFilterByType DEFLATE text/plain
AddOutputFilterByType DEFLATE text/html
AddOutputFilterByType DEFLATE text/xml
AddOutputFilterByType DEFLATE text/css
AddOutputFilterByType DEFLATE application/xml
AddOutputFilterByType DEFLATE application/xhtml+xml
AddOutputFilterByType DEFLATE application/rss+xml
AddOutputFilterByType DEFLATE application/javascript
AddOutputFilterByType DEFLATE application/x-javascript


RequestHeader set X-Prerender-Token "FyKfYC3YYBXiBJoUBmlY"


RewriteEngine On

<IfModule mod_proxy_http.c>
    RewriteCond %{HTTP_USER_AGENT} baiduspider|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|quora\ link\ preview|showyoubot|outbrain|pinterest|slackbot [NC,OR]
    RewriteCond %{QUERY_STRING} _escaped_fragment_

    # Only proxy the request to Prerender if it's a request for HTML
    RewriteRule ^(?!.*?(\.js|\.css|\.xml|\.less|\.png|\.jpg|\.jpeg|\.gif|\.pdf|\.doc|\.txt|\.ico|\.rss|\.zip|\.mp3|\.rar|\.exe|\.wmv|\.doc|\.avi|\.ppt|\.mpg|\.mpeg|\.tif|\.wav|\.mov|\.psd|\.ai|\.xls|\.mp4|\.m4a|\.swf|\.dat|\.dmg|\.iso|\.flv|\.m4v|\.torrent))(.*) http://service.prerender.io/http://dev6.lovelionstudio.com/$2 [P,L]
</IfModule>

And here is my index file head:



...

I have included the sitemap.xml file that Prerender generates, and have cached all the main pages of this site.

Any ideas on why this isn't working?
Here is the site for reference http://lovelionstudio.com/

Thanks for your help ahead of time.

@thoop
Owner

Send me an email at support@prerender.io if anyone has any issues. I don't get notified when someone comments on this gist.

@PdotRudy

Why is google bot not included?

@SystemDisc

Because I'm using CodeIgniter, I needed the URL that's passed to prerender.io to omit the hashbang and to carry the value of the _escaped_fragment_ parameter in the URL path itself.

In other words, instead of pages looking like the following in "Cached Pages"

http://www.example.com/#!/user/123

I needed them to look like this

http://www.example.com/user/123

I used this:

RewriteCond %{HTTP_USER_AGENT} baiduspider|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|quora\ link\ preview|showyoubot|outbrain|pinterest|slackbot|vkShare|W3C_Validator [NC,OR]
RewriteCond %{QUERY_STRING} _escaped_fragment_=(\%2F|/)*(.*)

RewriteRule ^(?!.*?(\.js|\.css|\.xml|\.less|\.png|\.jpg|\.jpeg|\.gif|\.pdf|\.doc|\.txt|\.ico|\.rss|\.zip|\.mp3|\.rar|\.exe|\.wmv|\.doc|\.avi|\.ppt|\.mpg|\.mpeg|\.tif|\.wav|\.mov|\.psd|\.ai|\.xls|\.mp4|\.m4a|\.swf|\.dat|\.dmg|\.iso|\.flv|\.m4v|\.torrent|\.ttf|\.woff))(.*) http://service.prerender.io/https://%{HTTP_HOST}/%2? [P,NE,L]
  • The first RewriteCond checks whether one of the listed user agents is requesting the page
  • The second RewriteCond checks for _escaped_fragment_ and puts its value, excluding any prefixed forward-slashes, into %2
  • The RewriteRule sends the request to http://service.prerender.io/https://HOST_NAME/VALUE_OF_ESCAPED_FRAGMENT and removes any query string

For reference, my full .htaccess for a site with CodeIgniter + AngularJS is as follows:

<IfModule mod_headers.c>
    # Change YOUR_TOKEN to your prerender token
    RequestHeader set X-Prerender-Token "YOUR_TOKEN"
</IfModule>

<IfModule mod_rewrite.c>
    RewriteEngine On
    # !IMPORTANT! Set your RewriteBase here and don't forget trailing and leading
    # slashes.
    # If your page resides at
    # http://www.example.com/mypage/test1
    # then use
    # RewriteBase /mypage/test1/
    RewriteBase /

    RewriteRule ^index\.php/(.*)$ /$1 [R=302,L]

    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule ^(.*)$ index.php?/$1 [L]

    <IfModule mod_proxy_http.c>
        RewriteCond %{HTTP_USER_AGENT} baiduspider|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|quora\ link\ preview|showyoubot|outbrain|pinterest|slackbot|vkShare|W3C_Validator [NC,OR]
        RewriteCond %{QUERY_STRING} _escaped_fragment_=(\%2F|/)*(.*)

        # Only proxy the request to Prerender if it's a request for HTML
        RewriteRule ^(?!.*?(\.js|\.css|\.xml|\.less|\.png|\.jpg|\.jpeg|\.gif|\.pdf|\.doc|\.txt|\.ico|\.rss|\.zip|\.mp3|\.rar|\.exe|\.wmv|\.doc|\.avi|\.ppt|\.mpg|\.mpeg|\.tif|\.wav|\.mov|\.psd|\.ai|\.xls|\.mp4|\.m4a|\.swf|\.dat|\.dmg|\.iso|\.flv|\.m4v|\.torrent|\.ttf|\.woff))(.*) http://service.prerender.io/https://%{HTTP_HOST}/%2? [P,NE,L]
    </IfModule>
</IfModule>

<IfModule !mod_rewrite.c>
    # If we don't have mod_rewrite installed, all 404's
    # can be sent to index.php, and everything works as normal.
    # Submitted by: ElliotHaughin

    ErrorDocument 404 /index.php
</IfModule>

I have my location provider set as:

$locationProvider.hashPrefix('!');
$locationProvider.html5Mode(true);
@em0xi0nx

I'm currently working on a project, and this is my first time using AngularJS. I think it's pretty powerful, but there's a problem: SEO. When I first used Angular, I didn't know that its content is not visible to search engines.

Right now I'm hosting my own Prerender service, but whenever I request a URL with a path after the #!, such as

http-://prerender.host.com/http-://host.com/#!/something

it always renders the http-://host.com/ route and not http-://host.com/#!/something.

(Read "http-://" as "http://". I wrote it that way because I'm limited in how many links I can post.)

Please help me! I don't really know what's happening. I also tried html5Mode with a fragment meta tag, but the result is the same.

SETUP
HASHBANG

http-://host.com/#!/

ROUTES

I do set hashPrefix, and I explicitly turn off html5Mode even though I know it's off by default (I thought that might be the problem).

$routeProvider
    .when('/', {
        templateUrl: '../views/tutorials.html',
        controller: 'TutorialController'
    })
    .when('/tutorial/:id', {
        templateUrl: '../views/tutorial_detail.html',
        controller: 'TutorialDetailController'
    })
    .when('/add-tutorial', {
        templateUrl: '../views/tutorial_add.html',
        controller: 'TutorialAddController'
    })
    .otherwise('/');

$locationProvider.hashPrefix('!');
$locationProvider.html5Mode(false);

APACHE

I'm using the settings from the Prerender Apache guide.

<IfModule mod_rewrite.c>
RewriteEngine On



<IfModule mod_proxy_http.c>
    # Enable prerendering for .html and directory index files
    RewriteCond %{HTTP_USER_AGENT} Googlebot|bingbot|Googlebot-Mobile|Baiduspider|Yahoo|YahooSeeker [NC,OR]
    RewriteCond %{QUERY_STRING} _escaped_fragment_
    RewriteRule ^ http://prerender.em0xi0nx.com/http://tutorial.em0xi0nx.com/%{REQUEST_URI} [P,L]
</IfModule>
</IfModule>

Do I need to do something else, or is there anything wrong with my setup? I've been trying to figure this out for 3 days and now I really need help.

Any suggestions? I really need to finish this for our defense.

@thoop
Owner

Googlebot, bingbot, and many others are not included in the config because they support the _escaped_fragment_ parameter, which is checked for in the config. More information can be found here: https://developers.google.com/webmasters/ajax-crawling/docs/getting-started

@sgasser

Google Snippet (for sharing on Google+) is missing.

Add this: Google\ \(\+https:\/\/developers.google.com\/\+\/web\/snippet\/\)

@avin77

Sorry if this question seems stupid.

I have deployed Prerender on my localhost and can open it at localhost:3000/http://www.abc.com, but if I try to open http://www.abc.in/?_escaped_fragment_=/ or http://www.abc.in/?_escaped_fragment_= it shows the same home page with exactly the same HTML.

Can anyone tell me what to write in the RewriteRule in .htaccess for a localhost deployment?

@sirviejo

Just something I learned today while trying to make this work with an Angular app: Apache by default tries to redirect your request to /index.php, so you will need to set DirectoryIndex index.html before the request is proxied. Hope this saves others some time.

This is how we have it now:

DirectoryIndex index.html

RequestHeader set X-Prerender-Token "MYPRERENDERIOTOKEN"


RewriteEngine On

<IfModule mod_proxy_http.c>
    RewriteCond %{HTTP_USER_AGENT} baiduspider|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|quora\ link\ preview|showyoubot|outbrain|pinterest|slackbot|vkShare|W3C_Validator [NC,OR]
    RewriteCond %{QUERY_STRING} _escaped_fragment_

    # Only proxy the request to Prerender if it's a request for HTML
    RewriteRule ^(?!.*?(\.js|\.css|\.xml|\.less|\.png|\.jpg|\.jpeg|\.gif|\.pdf|\.doc|\.txt|\.ico|\.rss|\.zip|\.mp3|\.rar|\.exe|\.wmv|\.doc|\.avi|\.ppt|\.mpg|\.mpeg|\.tif|\.wav|\.mov|\.psd|\.ai|\.xls|\.mp4|\.m4a|\.swf|\.dat|\.dmg|\.iso|\.flv|\.m4v|\.torrent|\.ttf|\.woff))(.*) http://service.prerender.io/http://myprittieangularapp.com$2 [P,L]
</IfModule>

@philippbosch

You might want to add applebot to the list like here

@thelakshya

Hey Guys,

I have access to httpd.conf (/etc/httpd/conf/httpd.conf). Can I add those lines to this file directly? If so, do I need to change any other configuration as well, such as AllowOverride?

I have other conf files with VirtualHost configuration:
<VirtualHost *:80>
<Proxy *>
Order deny,allow
Allow from all
</Proxy>

ProxyPass / http://localhost:8080/ retry=0
ProxyPassReverse / http://localhost:8080/
ProxyPreserveHost on

LogFormat "%h (%{X-Forwarded-For}i) %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\""
ErrorLog /var/log/httpd/elasticbeanstalk-error_log
TransferLog /var/log/httpd/elasticbeanstalk-access_log
</VirtualHost>

PS: it's an elastic beanstalk machine with tomcat stack and httpd server (httpd server is just there to forward request on port 80 to port 8080 of tomcat)
