This was originally posted on 2011-07-11 to http://andrewho.co.uk/weblog/clean-urls-on-jekyll-apache
I use a static site generator, specifically Jekyll, to transform some templates into a set of static *.html files. However, I like to keep the URLs looking clean and not display the .html extension, both because I think it looks better and so that the URLs purely reflect the content and not the underlying files or CMS used to serve that content. In short, whilst the file being served might be $DOCUMENT_ROOT/weblog/title.html, the canonical URL for that resource should be /weblog/title. Here's how I do that in .htaccess.
The first step is to turn on mod_rewrite:
RewriteEngine On
Create 404.html and let .htaccess know about it:
ErrorDocument 404 /404.html
All resources should be accessed via the main domain name, not a subdomain:
RewriteCond %{HTTP_HOST} ^[^\.]+\.andrewho\.co\.uk$ [NC]
RewriteRule ^(.*)$ http://andrewho.co.uk/$1 [R=301,L]
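To see which hostnames that condition actually catches, here is a small Python sketch that reuses the same pattern. It assumes Python's re module treats this particular expression the same way Apache's regex engine does, with re.IGNORECASE standing in for [NC]; the function name needs_host_redirect is made up for illustration.

```python
import re

# Same pattern as the RewriteCond; re.IGNORECASE stands in for the [NC] flag.
SUBDOMAIN_HOST = re.compile(r"^[^\.]+\.andrewho\.co\.uk$", re.IGNORECASE)


def needs_host_redirect(host: str) -> bool:
    """Return True if this Host value would trigger the 301 to the main domain."""
    return SUBDOMAIN_HOST.search(host) is not None


for host in ("www.andrewho.co.uk", "WWW.AndrewHo.co.uk",
             "andrewho.co.uk", "a.b.andrewho.co.uk"):
    print(host, needs_host_redirect(host))
```

Because [^\.]+ cannot cross a dot, only single-level subdomains such as www match; the bare domain and a nested name like a.b.andrewho.co.uk are left alone by this rule.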
Remove trailing slashes (note that mod_dir fiddles around with them so disable that behaviour here too):
RewriteRule ^(.+)/$ http://%{HTTP_HOST}/$1 [R=301,L]
Options -Indexes
DirectorySlash Off
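As a rough check of what the trailing-slash pattern does, the sketch below applies it in Python. It assumes the path is given the way a RewriteRule in .htaccess sees it, that is, with the leading slash already stripped; strip_trailing_slash is a hypothetical helper name, not anything Apache provides.

```python
import re
from typing import Optional

# Same pattern as the trailing-slash RewriteRule.
TRAILING_SLASH = re.compile(r"^(.+)/$")


def strip_trailing_slash(path: str) -> Optional[str]:
    """Return the path to redirect to, or None if the rule would not match.

    `path` is the per-directory form, e.g. "weblog/" for the URL /weblog/.
    """
    match = TRAILING_SLASH.search(path)
    if match is None:
        return None
    return "/" + match.group(1)


for path in ("weblog/", "weblog/title/", "weblog/title", ""):
    print(repr(path), strip_trailing_slash(path))
```

The empty string corresponds to a request for the site root, which the pattern never matches because (.+) needs at least one character, so / itself is not redirected.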
Hide the *.html files by redirecting all requests for foo.html to foo:
RewriteCond %{THE_REQUEST} ^(GET|HEAD)\ /.+\.html\ HTTP
RewriteRule ^(.+)\.html$ http://%{HTTP_HOST}/$1 [R=301,L]
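The condition tests THE_REQUEST, which holds the request line exactly as the client sent it and is not changed by internal rewrites. That is presumably why it is used here: a path that has been internally rewritten to foo.html is not redirected back to foo, which would otherwise loop. The Python sketch below checks a few request lines against the same pattern; is_external_html_request is an illustrative name, and Python's re module is assumed to agree with Apache for this expression.

```python
import re

# Same pattern as the RewriteCond on THE_REQUEST ("\ " is an escaped space).
HTML_REQUEST = re.compile(r"^(GET|HEAD)\ /.+\.html\ HTTP")


def is_external_html_request(request_line: str) -> bool:
    """Return True if the client itself asked for a path ending in .html."""
    return HTML_REQUEST.search(request_line) is not None


for line in ("GET /weblog/title.html HTTP/1.1",
             "HEAD /weblog/title.html HTTP/1.1",
             "GET /weblog/title HTTP/1.1",
             "POST /weblog/title.html HTTP/1.1"):
    print(line, is_external_html_request(line))
```

Only GET and HEAD requests that literally name a .html path satisfy the condition; a request for /weblog/title, even if it ends up being served from title.html, does not.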
If the client requests /foo but that doesn't exist, then try /foo.html:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^.+$ %{REQUEST_FILENAME}.html [L]
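The two conditions amount to a pair of file-existence checks. The following Python sketch mimics them against a temporary directory; it is only a model of the logic, not of how Apache maps URLs to files, and the resolve function and the file names are invented for the example.

```python
import os
import tempfile


def resolve(request_filename: str) -> str:
    """Mimic the two RewriteCond checks: !-f on the path, -f on path + '.html'."""
    html_candidate = request_filename + ".html"
    if not os.path.isfile(request_filename) and os.path.isfile(html_candidate):
        return html_candidate  # internal rewrite: the client's URL stays the same
    return request_filename  # rule does not apply; the path is left unchanged


with tempfile.TemporaryDirectory() as root:
    title = os.path.join(root, "title")
    missing = os.path.join(root, "missing")
    with open(title + ".html", "w") as handle:
        handle.write("<p>hello</p>")

    # /title has no file of its own but title.html exists, so it is rewritten.
    extensionless_is_rewritten = resolve(title) == title + ".html"
    # title.html is already a regular file, so the first condition fails.
    existing_file_is_untouched = resolve(title + ".html") == title + ".html"
    # Neither missing nor missing.html exists, so the request falls through.
    missing_is_untouched = resolve(missing) == missing

print(extensionless_is_rewritten, existing_file_is_untouched, missing_is_untouched)
```

A request that matches neither check is left as it was, so it ends up at the 404 page configured earlier.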
Throw that all into .htaccess (in that order) and you should have clean URLs.
There is an error in your .htaccess, as it is missing an argument:
RewriteRule ^(.+)/$ http://%{HTTP_HOST}/$1 [R=301,]