@thoop /nginx.conf
Official prerender.io nginx.conf for nginx
# Change YOUR_TOKEN to your prerender token and uncomment that line if you want to cache urls and view crawl stats
# Change example.com (server_name) to your website url
# Change /path/to/your/root to the correct value
server {
    listen 80;
    server_name example.com;

    root /path/to/your/root;
    index index.html;

    location / {
        try_files $uri @prerender;
    }

    location @prerender {
        #proxy_set_header X-Prerender-Token YOUR_TOKEN;

        set $prerender 0;
        if ($http_user_agent ~* "baiduspider|twitterbot|facebookexternalhit|rogerbot|linkedinbot|embedly|quora link preview|showyoubot|outbrain|pinterest|slackbot|vkShare|W3C_Validator") {
            set $prerender 1;
        }
        if ($args ~ "_escaped_fragment_") {
            set $prerender 1;
        }
        if ($http_user_agent ~ "Prerender") {
            set $prerender 0;
        }
        if ($uri ~ "\.(js|css|xml|less|png|jpg|jpeg|gif|pdf|doc|txt|ico|rss|zip|mp3|rar|exe|wmv|doc|avi|ppt|mpg|mpeg|tif|wav|mov|psd|ai|xls|mp4|m4a|swf|dat|dmg|iso|flv|m4v|torrent|ttf|woff)") {
            set $prerender 0;
        }

        #resolve using Google's DNS server to force DNS resolution and prevent caching of IPs
        resolver 8.8.8.8;

        if ($prerender = 1) {
            #setting prerender as a variable forces DNS resolution since nginx caches IPs and doesn't play well with load balancing
            set $prerender "service.prerender.io";
            rewrite .* /$scheme://$host$request_uri? break;
            proxy_pass http://$prerender;
        }
        if ($prerender = 0) {
            rewrite .* /index.html break;
        }
    }
}
@pvolyntsev

The condition

    if ($uri ~ ".js|.css|.xml|.less|.png|.jpg|.jpeg|.gif|.pdf|.doc|.txt|.ico|.rss|.zip|.mp3|.rar|.exe|.wmv|.doc|.avi|.ppt|.mpg|.mpeg|.tif|.wav|.mov|.psd|.ai|.xls|.mp4|.m4a|.swf|.dat|.dmg|.iso|.flv|.m4v|.torrent") {

is incorrect because

    $uri ~ ".ico"

means "any single character followed by 'ico'", which also matches the URL /icon/, and that is wrong.

I suggest this one:

    if ($uri ~ "\.(js|css|xml|less|png|jpg|jpeg|gif|pdf|doc|txt|ico|rss|zip|mp3|rar|exe|wmv|doc|avi|ppt|mpg|mpeg|tif|wav|mov|psd|ai|xls|mp4|m4a|swf|dat|dmg|iso|flv|m4v|torrent)$") {

which means "the string ends with a dot followed by one of 'js', 'css', etc."

@thoop
Owner

You're right, thanks. I'll update that.

@AttilaSATAN

What about AngularJS html5mode? Does prerender support html5mode, and would it be enough to add googlebot to the if statement?

        if ($http_user_agent ~* "baiduspider|twitterbot|facebookexternalhit|googlebot") {
            set $prerender 1;
        }
@thoop
Owner

Google's recommended way is to use escaped_fragment. In theory, yes we should add it to the user agent check, but in reality we wouldn't want Google to request the same URL using another user agent and then think the user is cloaking because the content is different. So we try to stay on the safe side.

@sansmischevia

Confused... why isn't googlebot listed?

@Michal-sk

@sansmischevia : because googlebot uses the escaped_fragment, which is listed.

@jamiel

If users are using HTML5 pushState, surely Google will request the URLs without escaped_fragment?

@toamitkumar

Search bots look for <meta name="fragment" content="!" /> in the head tag. Read https://developers.google.com/webmasters/ajax-crawling/docs/specification, in particular this section:
Pages without hash fragments
It may be impossible or undesirable for some pages....

@sentient

I'm not too sure on

  rewrite .* /$scheme://example.com$request_uri? break;

Am I only replacing example.com? Could we not use the $server_name variable for this?

  rewrite .* /$scheme://$server_name$request_uri? break;
@akoumjian

If our "location /" block is where we typically specify a reverse proxy with proxy_pass, should I assume we would essentially add that to the "if ($prerender = 0)" section?

@thoop
Owner

@sentient good idea :)

@akoumjian yes, in that case you would do your own proxy_pass in "if ($prerender = 0)"

@thoop
Owner

I added this since we were seeing issues where nginx was caching IPs and hitting servers that might have been taken out of our load balancer rotation:

#resolve using Google's DNS server
resolver 8.8.8.8;

#setting prerender as a variable forces DNS resolution since nginx caches IPs and doesn't play well with load balancing
set $prerender "service.prerender.io";
rewrite .* /$scheme://$server_name$request_uri? break;
proxy_pass http://$prerender;
@patrickng

With nginx 1.4.6 (ubuntu) I get an error when I try to restart nginx service or reload the configuration.

nginx: [emerg] "resolver" directive is not allowed here in /etc/nginx/sites-enabled/nginx.conf

I had to move resolver out of the conditional for it to work but this isn't ideal. Any ideas?

@fergusg

I'm wondering if $host is not a better choice than $server_name in

rewrite .* /$scheme://$server_name$request_uri? break;

A single server{...} can list multiple names in server_name, but $host is the one from the HTTP request (and falls back to the server name anyway).

@cjroebuck

@thoop @patrickng I too have the same issue ('resolver' directive is not allowed here)

@thoop
Owner

Sorry, I wish github would send notifications for comments on gists :(

@fergusg we used to have it set to $host but recently changed it to $server_name.

I'll look into a better solution for the resolver in the if-conditional.

@evityuk

The VKontakte social network uses the user agent "Mozilla/5.0 (compatible; vkShare; +http://vk.com/dev/Share)" for its sharing functionality, so you need to add 'vkshare' to the detection list:

pinterest|vkshare

See the embedding docs. Sharing uses the same user agent.

@evandhoffman

If behind a load balancer, the $scheme var may not be set right - if the LB is doing SSL termination, the scheme on the machine behind the box may be http. Prerender service would then try to access http://, but get bounced to https://, which in my case did not work. I had to hard-code https:// in there.

@leorue

Any update on @patrickng's resolver issue? I took out the conditional as well.

@thoop
Owner

@leorue can you try the new nginx config? I just updated it to move the resolver outside of the "if" statement.
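
For reference, a minimal sketch of the placement described here, reusing the directives from the gist. nginx accepts `resolver` only in the http, server, and location contexts, which is why it was rejected inside the `if` block. The user-agent and extension checks are left out for brevity.

```nginx
location @prerender {
    set $prerender 0;
    # ... user-agent, _escaped_fragment_ and file-extension checks go here ...

    # resolver sits at location level; it is not allowed inside "if"
    resolver 8.8.8.8;

    if ($prerender = 1) {
        # using a variable in proxy_pass makes nginx resolve the name at request time
        set $prerender "service.prerender.io";
        rewrite .* /$scheme://$host$request_uri? break;
        proxy_pass http://$prerender;
    }
}
```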

@thoop
Owner

I just changed $server_name back to $host. Hopefully that clears up any issues with the server name not being the actual url of your site.

@shirokoweb

Hi, do I have to install something on my webserver running nginx as frontend HTTP server, or do I simply need to add this snippet to my vhost .conf ?

@thoop
Owner

You should just be able to add this snippet to your .conf file. Email me at todd@prerender.io if you're having any problems with it. Github doesn't send notifications on gists so I'll be able to help you more quickly over email.

@varuzhnikov

http://wiki.nginx.org/IfIsEvil

Directive if has problems when used in location context, in some cases it doesn't do what you expect but something completely different instead. In some cases it even segfaults. It's generally a good idea to avoid it if possible.

The only 100% safe things which may be done inside if in location context are:

return ...;
rewrite ... last;
Anything else may possibly cause unpredictable behaviour, including potential SIGSEGV.

It is important to note that the behaviour of if is not inconsistent, given two identical requests it will not randomly fail on one and work on the other, with proper testing and understanding ifs can be used. The advice to use other directives where available still very much apply, though.

@intellix

Google doesn't use ?_escaped_fragment_= for all of its services. It might for its indexer, but when I use "Fetch as Google" from the Webmaster tools, for instance, it renders the page correctly, yet the HTML it received was the version from before rendering:

66.249.75.21 - - [08/Dec/2014:12:11:51 +0000] "GET /owner/ HTTP/1.1" 200 6292 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.75.37 - - [08/Dec/2014:12:11:51 +0000] "GET /owner/?_escaped_fragment_= HTTP/1.1" 200 9401 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

In fact, I am ALWAYS seeing two requests from Googlebot: one with and one without the escaped fragment.
I'm seeing other services without the escaped fragment as well: Google Page Speed Insights, Google-StructuredDataTestingTool, Google Web Preview.

A few that I added myself: (tumblr|slackbot|xml-sitemaps|google-structureddatatestingtool). Definitely need tumblr :) It's a shame we have to go with a whitelist approach, because there are so many services where my site is just invisible.

@csbenjamin

@intellix I guess that on the first request Google finds the <meta name="fragment" content="!"> tag and then makes a second request with escaped_fragment. Before the first request it did not know that it needed to use escaped_fragment.

@thoop
Owner

@varuzhnikov if we could refactor to remove if statements, that would be ideal. Any idea on the best way to do that? We haven't had any problems with this configuration yet.

@intellix @csbenjamin is correct. The first request sees your tag and re-requests the page with the escaped fragment parameter. "Fetch as Google" does not follow the escaped fragment, and that is a known bug. If you are seeing services that do not send two requests (one with escaped fragment), then we should add them to the whitelist of user agents.
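
On the question of removing the if statements: one possible approach, given here as an untested sketch and not as the official configuration, is to move the checks into `map` blocks in the http context and keep a single `if` that only does `rewrite ... last`, which the IfIsEvil page lists as safe. The variable names ($prerender_ua, $prerender_query, $prerender_not_self, $prerender_flag, $prerender_host) and the /prerender-internal location are hypothetical. One difference from the gist is that the bot check runs before try_files rather than after it.

```nginx
# http context: map is not allowed inside server or location blocks

# 2 = the prerender service itself, 1 = known bot, 0 = everything else.
# Regexes are tried in order of appearance, so the Prerender entry comes first.
map $http_user_agent $prerender_ua {
    default 0;
    "~Prerender" 2;
    "~*baiduspider|twitterbot|facebookexternalhit|rogerbot|linkedinbot|embedly|quora link preview|showyoubot|outbrain|pinterest|slackbot|vkShare|W3C_Validator" 1;
}

# an _escaped_fragment_ query string also asks for a prerendered page
map $args $prerender_query {
    default $prerender_ua;
    "~_escaped_fragment_" 1;
}

# never prerender requests that come from the prerender service
map $prerender_ua $prerender_not_self {
    default $prerender_query;
    2 0;
}

# never prerender static files
map $uri $prerender_flag {
    default $prerender_not_self;
    "~\.(js|css|xml|less|png|jpg|jpeg|gif|pdf|doc|txt|ico|rss|zip|mp3|rar|exe|wmv|doc|avi|ppt|mpg|mpeg|tif|wav|mov|psd|ai|xls|mp4|m4a|swf|dat|dmg|iso|flv|m4v|torrent|ttf|woff)" 0;
}

server {
    listen 80;
    server_name example.com;
    root /path/to/your/root;
    index index.html;

    location / {
        # the only "if": an internal rewrite with "last"
        if ($prerender_flag = 1) {
            rewrite ^ /prerender-internal last;
        }
        try_files $uri /index.html;
    }

    location = /prerender-internal {
        internal;
        #proxy_set_header X-Prerender-Token YOUR_TOKEN;
        resolver 8.8.8.8;
        # variable in proxy_pass forces DNS resolution at request time
        set $prerender_host "service.prerender.io";
        # $request_uri still holds the original client request
        rewrite .* /$scheme://$host$request_uri? break;
        proxy_pass http://$prerender_host;
    }
}
```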

@bastijan

I use a self-hosted prerender with an nginx server and an AngularJS HTML application, but when I try to share content on Facebook or LinkedIn I always get the root page (index.html).

nginx conf
set $prerender "85.10.211.83:4000";

rewrite .* /$scheme://$host$request_uri? break;
proxy_pass http://$prerender;

just for review:
http://85.10.211.83:4000/http://novi.bktvnews.com:3030/#!/
and single pages give the same result:
http://85.10.211.83:4000/http://novi.bktvnews.com:3030/#!/grafika-iz-dalarne-na-putu-vas

in app.js
I have

$locationProvider.hashPrefix('!');

and in index.html template

@lethaldose

We had an AWS nginx proxy setup and were having timeout issues: service.prerender.io could not be resolved (60: Operation timed out). We were able to get around this by using an nginx upstream:

upstream pre-render {
   server service.prerender.io;
}

and the if in location block changed to:

if ($prerender = 1) {
   rewrite ^(.*)$ /https://$server_name..... break;
   proxy_pass http://pre-render;
}
@creatorkuang

I use the official server "service.prerender.io" and it works, but when I try to use my own server, it doesn't. I tested my server with http://myserver.com/http://www.google.com and it works, which I think means the server itself is fine. But when I replace "service.prerender.io" with "myserver.com", it doesn't work and I get a 502 Bad Gateway error. Does anyone know why?

@geolart

I use nginx as a reverse proxy and I would like to use prerender. I don't know how to adapt the nginx.conf, especially this section:

location / {
try_files $uri @prerender;
}

I can't use the "try_files" directive because I use the "proxy_pass" directive.

Do you know how to do this? Thank you

@rkulla

@creatorkuang it'd be hard to know without seeing your configuration. myserver.com needs to be running the open-source prerender node server and on the right port (by default it's 3000), and your nginx prerender middleware has to be configured to proxy there if the prerender variable = 1. E.g. proxy_pass http://myserver.com; But I've only tried the open-source server from localhost:3000.

@rkulla

@geolart I think some people above commented that you can do that by moving your proxy_pass line into the location @prerender block if the prerender variable = 0 (instead of rewrite .* /index.html break;). Let me know if it doesn't work for you though.

@geolart

@rkulla Thanks to you, it works for me!
Thank you, rkulla!

@rkulla

When also using nginx as a reverse proxy via proxy_pass http://myupstream, I had an issue where if I do proxy_set_header X-Prerender-Token XXX in the location @prerender block, it reset my other proxy_set_header lines, causing 'http://myupstream' to be treated as a literal URL. However, things work fine if I redefine my proxy_set_header 'Host', etc in the same block as X-Prerender-Token -- either all in the location / or all in the location @prerender, but not divided.
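
The behaviour described here matches how nginx inherits `proxy_set_header`: the directives from the outer level are inherited only if the current level defines none of its own. A minimal sketch of the workaround, assuming the upstream needs the original Host header as in the comment above (the Host line and its value are taken from that description, not from the gist):

```nginx
location @prerender {
    # Defining any proxy_set_header at this level means none are inherited
    # from the server or http level, so repeat every header still needed.
    proxy_set_header X-Prerender-Token YOUR_TOKEN;

    # Without this line nginx falls back to its default, Host: $proxy_host,
    # which is "myupstream" for proxy_pass http://myupstream.
    proxy_set_header Host $host;

    # ... $prerender checks and proxy_pass directives as in the gist ...
}
```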

@thoop
Owner

Send me an email at support@prerender.io if anyone has any issues. I don't get notified when someone comments on this gist.

@BenjaminPrice

As per Facebook official documentation, you should also add the user agent 'facebot' to $http_user_agent

https://developers.facebook.com/docs/sharing/best-practices#crawl

@mrgamer

I have seen many requests that come with Googlebot and AdsBot-Google-Mobile as the User-Agent, so I added them to my list.

Is there an exhaustive list of those User-Agents somewhere?

@jstoiko

You might want to add 'svg' to the list of extensions not prerendered.

@ermakovich

@evandhoffman we have the same issue when hosting on Heroku, which in turn seems to be using Amazon ELB. $scheme is always HTTP. As an alternative to hardcoding we decided to use $http_x_forwarded_proto instead of $scheme.
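
A sketch of that substitution in the gist's final `if` block. It assumes the load balancer sets the X-Forwarded-Proto request header on every request; if the header is missing, the variable is empty and the rewritten URL is invalid.

```nginx
if ($prerender = 1) {
    set $prerender "service.prerender.io";
    # use the scheme the client used with the load balancer, not the
    # scheme of the connection between the load balancer and nginx
    rewrite .* /$http_x_forwarded_proto://$host$request_uri? break;
    proxy_pass http://$prerender;
}
```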

@thoop
Owner

@mrgamer you don't want to add Googlebot (or any other crawlers that support the escaped fragment protocol) to the user agent list. You could get penalized for cloaking. You want Google to continue using the escaped fragment protocol

@ermakovich great idea!

Send me an email at support@prerender.io if anyone has any issues. I don't get notified when someone comments on this gist.

@ashishgupta2

For nginx proxy_pass users: do two things.

  1. Comment out the following two lines in the server block:

    #root /path/to/your/root;
    #index index.html;

  2. In the last if block ("if ($prerender = 0) {....."):

replace #rewrite .* /index.html break; with your proxy_pass as shown below (my application is running on port 3009).

proxy_pass http://127.0.0.1:3009;
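
Put together, the two steps could look like the sketch below. It reuses the gist's structure and the 127.0.0.1:3009 address from the comment above; the checks that set $prerender are left out for brevity and are assumed to be unchanged.

```nginx
server {
    listen 80;
    server_name example.com;
    #root /path/to/your/root;
    #index index.html;

    location / {
        try_files $uri @prerender;
    }

    location @prerender {
        set $prerender 0;
        # ... user-agent, _escaped_fragment_ and file-extension checks go here ...

        resolver 8.8.8.8;

        if ($prerender = 1) {
            set $prerender "service.prerender.io";
            rewrite .* /$scheme://$host$request_uri? break;
            proxy_pass http://$prerender;
        }
        if ($prerender = 0) {
            # regular visitors go to the application instead of index.html
            proxy_pass http://127.0.0.1:3009;
        }
    }
}
```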
