
File hash GET parameters for Django staticfiles

README.md

About

A class for the static files app in Django that invalidates outdated browser cache.

How to use

You add the line

STATICFILES_STORAGE = 'path.to.hashpathstaticfilesstorage.HashPathStaticFilesStorage'

to your settings.py, and everywhere you use the 'static' template tag it will append a hash calculated from the contents of the file as a GET parameter at the end of the file's URL. Example:

{% static "path/to/file.txt" %} -> /static/path/to/file.txt?4e1243

This means that every time you update a static file, whether it is an image, a CSS file or anything else, its URL changes, so browsers fetch the new version of the file instead of using their cached copy.

Remember to change the import path to match your setup (replace the 'path.to' with your actual import path).
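The idea behind the hashed URL can be sketched without Django. This is only an illustration: the function `hashed_url` and the sample file contents are made up for this example and are not part of the class, but the hashing and the `?`/`&` handling mirror what the storage does.

```python
from hashlib import sha1


def hashed_url(base_url, content, accuracy=6):
    """Append the first `accuracy` hex digits of the SHA-1 of `content`.

    `hashed_url` is a hypothetical helper used only for illustration.
    """
    digest = sha1(content).hexdigest()[:accuracy]
    # Use "&" when the URL already carries a query string.
    separator = "&" if "?" in base_url else "?"
    return "%s%s%s" % (base_url, separator, digest)


# Two versions of the same file yield two different URLs, so a browser
# that cached the first one has to fetch the second one.
old_url = hashed_url("/static/path/to/file.txt", b"first version")
new_url = hashed_url("/static/path/to/file.txt", b"second version")
print(old_url)
print(new_url)
```

As long as the contents stay the same, the URL stays the same too, so browsers can keep using their cached copy between updates.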

Caching

You can reduce the time it takes to return the URL by caching the hashes when they are first calculated. The cache framework that ships with Django is used to store the hashes, so make sure you configure that before caching your hashes.

When you start using the cache, and every time after that when you have to invalidate the cache, you simply create an object of the type HashPathStaticFilesStorage and call the method 'invalidate_cache' on that object:

h = HashPathStaticFilesStorage()
h.invalidate_cache()

This should happen every time you alter or update a file in one of your /static folders.
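Calling 'invalidate_cache' does not delete anything. It stores a new timestamp that becomes part of every cache key, so hashes stored under the old timestamp are simply never looked up again. A minimal sketch of that scheme, assuming a plain dict as a stand-in for Django's cache (the `now` argument is hypothetical and only there to make the example deterministic):

```python
import time

cache = {}  # stand-in for Django's cache backend (assumption for this sketch)
KEY_PREFIX = "staticfiles_hash"


def invalidate_cache(now=None):
    """Store a new generation value; entries under older values become unreachable."""
    cache["%s:prefix" % KEY_PREFIX] = int(time.time()) if now is None else now


def cache_key(name):
    """Build the cache key for a file, or None while caching is not enabled."""
    generation = cache.get("%s:prefix" % KEY_PREFIX)
    if not generation:
        return None  # invalidate_cache() has not been called yet
    return "%s:%s:%s" % (KEY_PREFIX, generation, name)


invalidate_cache(now=1000)
first_key = cache_key("css/site.css")
cache[first_key] = "4e1243"        # hash cached under the first generation

invalidate_cache(now=2000)
second_key = cache_key("css/site.css")
print(first_key)
print(second_key)
print(cache.get(second_key))       # nothing cached under the new generation yet
```

Because the timestamp has one-second resolution, two invalidations within the same second produce the same key prefix.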

Configuration

This class introduces two new variables that you can put in your settings.py to configure its behaviour:

  • STATICFILES_HASH_ACCURACY: An integer that controls the number of characters used in the hash. This should not be set below 3, as that increases the possibility of a collision. The hash can be at most 40 characters long, so values above 40 have the same effect as 40. Defaults to 6.
  • STATICFILES_HASH_KEY_PREFIX: A prefix for the keys in the cache. Defaults to 'staticfiles_hash'.
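To get a feeling for the accuracy setting, here is a rough estimate of how likely it is that a changed file ends up with the same truncated hash as before, which would leave browsers on the stale copy. It assumes SHA-1 output is uniformly distributed and that the accuracy is not negative; `update_collision_chance` is a hypothetical helper, not part of the class.

```python
def update_collision_chance(accuracy):
    """Chance that a changed file keeps the same truncated hash.

    Each hex character has 16 possible values, and a SHA-1 hex digest
    has 40 characters, so larger accuracy values behave like 40.
    """
    digits = min(accuracy, 40)
    return 1.0 / 16 ** digits


for accuracy in (1, 2, 3, 6):
    print(accuracy, 16 ** accuracy, update_collision_chance(accuracy))
```

With an accuracy of 2 there are only 256 possible hashes, while the default of 6 gives 16777216.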

Requirements

Django 1.4 or newer is required.

hashpathstaticfilesstorage.py
# -*- coding: utf-8 -*-

import time

from hashlib import sha1

from django.conf import settings
from django.core.cache import cache
from django.contrib.staticfiles.storage import StaticFilesStorage


# Number of hash characters appended to the URL.
try:
    ACCURACY = settings.STATICFILES_HASH_ACCURACY
except AttributeError:
    ACCURACY = 6

# Prefix for all keys this storage writes to the cache.
try:
    KEY_PREFIX = settings.STATICFILES_HASH_KEY_PREFIX
except AttributeError:
    KEY_PREFIX = 'staticfiles_hash'


class HashPathStaticFilesStorage(StaticFilesStorage):
    """A static file storage that returns a unique url based on the contents
    of the file. When a static file is changed the url will also change,
    forcing all browsers to download the new version of the file.

    The uniqueness of the url is a GET parameter added to the end of it. It
    contains the first 6 characters (by default) of the SHA1 sum of the
    contents of the file.

    Example: {% static "image.jpg" %} -> /static/image.jpg?4e1243

    The accuracy of the hash (number of characters used) can be set in
    settings.py with STATICFILES_HASH_ACCURACY. Setting this value too low
    (1 or 2) can cause different files to get the same hash and is not
    recommended. SHA1 hashes are 40 characters long so all accuracy values
    above 40 have the same effect as 40.

    The values can be cached for faster performance. All keys in the cache have
    the prefix specified in STATICFILES_HASH_KEY_PREFIX in settings.py. This
    value defaults to 'staticfiles_hash'.
    """

    @property
    def prefix_key(self):
        return "%s:%s" % (KEY_PREFIX, 'prefix')

    def invalidate_cache(self, nocache=False):
        """Invalidates the cache. Run this when one or more static files change.
        If called with nocache=True the cache will not be used.
        """
        # The timestamp becomes part of every cache key, so entries stored
        # under an older timestamp are never read again.
        value = int(time.time())
        if nocache:
            value = None
        cache.set(self.prefix_key, value)

    def get_cache_key(self, name):
        hash_prefix = cache.get(self.prefix_key)
        if not hash_prefix:
            # No prefix stored: caching is disabled.
            return None
        key = "%s:%s:%s" % (KEY_PREFIX, hash_prefix, name)
        return key

    def set_cached_hash(self, name, the_hash):
        key = self.get_cache_key(name)
        if key:
            cache.set(key, the_hash)

    def get_cached_hash(self, name):
        key = self.get_cache_key(name)
        if not key:
            return None
        the_hash = cache.get(key)
        return the_hash

    def calculate_hash(self, name):
        path = self.path(name)
        try:
            # Read in binary mode so the hash covers the exact bytes on disk
            # (sha1() also requires bytes on Python 3).
            with open(path, 'rb') as the_file:
                the_hash = sha1(the_file.read()).hexdigest()[:ACCURACY]
        except IOError:
            # Missing or unreadable file: fall back to an empty hash.
            return ""
        return the_hash

    def get_hash(self, name):
        the_hash = self.get_cached_hash(name)
        if the_hash:
            return the_hash
        the_hash = self.calculate_hash(name)
        self.set_cached_hash(name, the_hash)
        return the_hash

    def url(self, name):
        base_url = super(HashPathStaticFilesStorage, self).url(name)
        the_hash = self.get_hash(name)
        # Append with "&" if the base url already has a query string.
        if "?" in base_url:
            return "%s&%s" % (base_url, the_hash)
        return "%s?%s" % (base_url, the_hash)

I don't like the idea of keeping a lot of outdated copies around in my static files directory, but that's just my opinion. Each to his own, I guess :)

That definitely makes sense, thanks for clarifying, I didn't mean to imply that your solution is bad. I chose to have the hash in the filename in staticfiles storage backend because many CDNs and caching proxies (e.g. Amazon CloudFront) ignore the querystring and only look at the cache headers. Having separate files makes that easy enough to handle and allows serving old page caches gracefully.

That said, if you want, you could pre-populate the cache when running collectstatic by hooking into the post_process method in your storage, see https://docs.djangoproject.com/en/dev/ref/contrib/staticfiles/#django.contrib.staticfiles.storage.StaticFilesStorage.post_process. That's the generic hook that CachedStaticFilesStorage also uses, but it can be used for other things as well.

Good point, I'll have to take a look at the hooks.

To be quite honest I had forgotten about the CachedStaticFilesStorage when I wrote this. I originally didn't like the idea of keeping many copies of the files around, but it doesn't really matter because the static files directory should be a "black box" managed by Django, I shouldn't have to be looking around there much. Maybe I'll just switch to using your solution (which is really better, as it's an officially supported part of Django).

Thanks for the comments though, feedback is always nice :)

I have used static file URLs like /media/css/style.css?v={TIMESTAMP}, which tells the browser to always get a new version of the resource rather than the cached one.
Isn't that the right way?

Then your users have to fetch the resources again every time they reload your site. This adds extra load on your servers and increases the time your users have to wait for the website to load, so I recommend that you change to using a counter that you increase every time you change your resources, or better, use the computed hash of the resources with CachedStaticFilesStorage or HashPathStaticFilesStorage.
