@baldurthoremilsson
Created July 30, 2012 14:29
File hash GET parameters for Django staticfiles

About

A class for the static files app in Django that invalidates outdated browser cache.

How to use

You add the line

STATICFILES_STORAGE = 'path.to.hashpathstaticfilesstorage.HashPathStaticFilesStorage'

to your settings.py, and everywhere you use the 'static' template tag, a hash calculated from the contents of the file is appended as a GET parameter at the end of the file's URL. Example:

{% static "path/to/file.txt" %} -> /static/path/to/file.txt?4e1243

Because the URL changes whenever the contents of the file change, browsers fetch the new version of an updated static file, whether it is an image, a CSS file or anything else, instead of using their cached copy.

Remember to change the import path to match your setup (replace the 'path.to' with your actual import path).
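
As a rough sketch of what happens to each URL, the standalone snippet below derives the suffix the same way the class does: it takes the SHA-1 digest of the file contents, keeps the first few hex characters and appends them as a query string. The helper name `hashed_url` and the sample bytes are made up for illustration; the real class reads the file from disk through the storage API.

```python
from hashlib import sha1


def hashed_url(base_url, content, accuracy=6):
    """Append a truncated SHA-1 of `content` (bytes) to `base_url`.

    `hashed_url` is a hypothetical helper, not part of the gist.
    """
    digest = sha1(content).hexdigest()[:accuracy]
    # Use "&" if the URL already carries a query string, "?" otherwise.
    separator = "&" if "?" in base_url else "?"
    return "%s%s%s" % (base_url, separator, digest)


# Two versions of the same file produce two different URLs.
old = hashed_url("/static/path/to/file.txt", b"version 1")
new = hashed_url("/static/path/to/file.txt", b"version 2")
print(old)
print(new)
```

Since the suffix depends only on the contents, an unchanged file keeps its URL and stays cached in the browser.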

Caching

You can reduce the time it takes to return the URL by caching the hashes when they are first calculated. The cache framework that ships with Django is used to store the hashes, so make sure you configure that before caching your hashes.

When you start using the cache, and every time after that when you have to invalidate the cache, you simply create an object of the type HashPathStaticFilesStorage and call the method 'invalidate_cache' on that object:

h = HashPathStaticFilesStorage()
h.invalidate_cache()

This should happen every time you alter or update a file in one of your /static folders.
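
The invalidation works by changing a "generation" value that is part of every cache key, so old entries are never looked up again. The sketch below mimics that scheme with a plain dict standing in for Django's cache; the dict, the module-level function names and the `now` parameter are assumptions made so the example runs without Django.

```python
import time

cache = {}  # stand-in for Django's cache backend (assumption)
KEY_PREFIX = "staticfiles_hash"
PREFIX_KEY = "%s:prefix" % KEY_PREFIX


def invalidate_cache(now=None):
    """Store a new generation marker; older hash entries become unreachable."""
    cache[PREFIX_KEY] = int(time.time()) if now is None else now


def cache_key(name):
    """Build the per-file key, or None when caching has not been enabled."""
    generation = cache.get(PREFIX_KEY)
    if not generation:
        return None
    return "%s:%s:%s" % (KEY_PREFIX, generation, name)


print(cache_key("css/site.css"))  # None: invalidate_cache() not called yet
invalidate_cache(now=1000)
print(cache_key("css/site.css"))  # staticfiles_hash:1000:css/site.css
invalidate_cache(now=2000)
print(cache_key("css/site.css"))  # staticfiles_hash:2000:css/site.css
```

Note that the marker has one-second resolution, so two invalidations within the same second would produce the same keys.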

Configuration

This class introduces two new variables that you can put in your settings.py to configure its behaviour:

  • STATICFILES_HASH_ACCURACY An integer that controls the number of characters used in the hash. This should not be set below 3, as that increases the possibility of a collision. The hash can be at most 40 characters long, so values above 40 have the same effect as 40. Defaults to 6.
  • STATICFILES_HASH_KEY_PREFIX A prefix for the keys in the cache. Defaults to 'staticfiles_hash'.
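
To get a feel for the accuracy setting, the sketch below estimates how likely it is that two different file contents end up with the same truncated hash. It assumes SHA-1 output behaves like uniformly random hex digits; the setting values and the helper `same_hash_chance` are illustrative, not part of the gist.

```python
# Example values as they might appear in settings.py (illustrative only).
STATICFILES_HASH_ACCURACY = 8
STATICFILES_HASH_KEY_PREFIX = "myproject_static"


def same_hash_chance(accuracy):
    """Chance that two different contents share a truncated hex hash."""
    effective = min(accuracy, 40)  # SHA-1 hex digests have 40 characters
    return 1.0 / 16 ** effective


for n in (2, 3, 6, STATICFILES_HASH_ACCURACY):
    print(n, same_hash_chance(n))
```

With 2 characters there are only 256 possible values, while the default of 6 gives 16,777,216, which is why very low settings are discouraged.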

Requirements

Django 1.4 or newer is required.

# -*- coding: utf-8 -*-
import time
from hashlib import sha1

from django.conf import settings
from django.core.cache import cache
from django.contrib.staticfiles.storage import StaticFilesStorage

# Optional settings, with defaults when they are not defined.
try:
    ACCURACY = settings.STATICFILES_HASH_ACCURACY
except AttributeError:
    ACCURACY = 6

try:
    KEY_PREFIX = settings.STATICFILES_HASH_KEY_PREFIX
except AttributeError:
    KEY_PREFIX = 'staticfiles_hash'


class HashPathStaticFilesStorage(StaticFilesStorage):
    """A static file storage that returns a unique url based on the contents
    of the file. When a static file is changed the url will also change,
    forcing all browsers to download the new version of the file.

    The uniqueness of the url is a GET parameter added to the end of it. It
    contains the first 6 characters (by default) of the SHA1 sum of the
    contents of the file.

    Example: {% static "image.jpg" %} -> /static/image.jpg?4e1243

    The accuracy of the hash (number of characters used) can be set in
    settings.py with STATICFILES_HASH_ACCURACY. Setting this value too low
    (1 or 2) can cause different files to get the same hash and is not
    recommended. SHA1 hashes are 40 characters long so all accuracy values
    above 40 have the same effect as 40.

    The values can be cached for faster performance. All keys in the cache have
    the prefix specified in STATICFILES_HASH_KEY_PREFIX in settings.py. This
    value defaults to 'staticfiles_hash'.
    """

    @property
    def prefix_key(self):
        return "%s:%s" % (KEY_PREFIX, 'prefix')

    def invalidate_cache(self, nocache=False):
        """Invalidates the cache. Run this when one or more static files change.
        If called with nocache=True the cache will not be used.
        """
        value = int(time.time())
        if nocache:
            value = None
        cache.set(self.prefix_key, value)

    def get_cache_key(self, name):
        # The stored timestamp is part of every key, so changing it makes
        # all previously cached hashes unreachable.
        hash_prefix = cache.get(self.prefix_key)
        if not hash_prefix:
            return None
        key = "%s:%s:%s" % (KEY_PREFIX, hash_prefix, name)
        return key

    def set_cached_hash(self, name, the_hash):
        key = self.get_cache_key(name)
        if key:
            cache.set(key, the_hash)

    def get_cached_hash(self, name):
        key = self.get_cache_key(name)
        if not key:
            return None
        the_hash = cache.get(key)
        return the_hash

    def calculate_hash(self, name):
        path = self.path(name)
        try:
            # Read in binary mode so that images and other non-text files
            # hash correctly regardless of the environment's encoding.
            with open(path, 'rb') as the_file:
                the_hash = sha1(the_file.read()).hexdigest()[:ACCURACY]
        except IOError:
            return ""
        return the_hash

    def get_hash(self, name):
        the_hash = self.get_cached_hash(name)
        if the_hash:
            return the_hash
        the_hash = self.calculate_hash(name)
        self.set_cached_hash(name, the_hash)
        return the_hash

    def url(self, name):
        base_url = super(HashPathStaticFilesStorage, self).url(name)
        the_hash = self.get_hash(name)
        # Append with "&" when the base url already has a query string.
        if "?" in base_url:
            return "%s&%s" % (base_url, the_hash)
        return "%s?%s" % (base_url, the_hash)

@jezdez

jezdez commented Jul 30, 2012

What about the CachedStaticFilesStorage included in Django 1.4? https://docs.djangoproject.com/en/dev/ref/contrib/staticfiles/#cachedstaticfilesstorage

@baldurthoremilsson
Author

I don't like the idea of keeping a lot of outdated copies around in my static files directory, but that's just my opinion. Each to his own, I guess :)

@jezdez

jezdez commented Jul 30, 2012

That definitely makes sense, thanks for clarifying, I didn't mean to imply that your solution is bad. I chose to have the hash in the filename in staticfiles storage backend because many CDNs and caching proxies (e.g. Amazon CloudFront) ignore the querystring and only look at the cache headers. Having separate files makes that easy enough to handle and allows serving old page caches gracefully.

That said, in case you want, you could pre-populate the cache when running collectstatic by hooking up the post_process method in your storage; see https://docs.djangoproject.com/en/dev/ref/contrib/staticfiles/#django.contrib.staticfiles.storage.StaticFilesStorage.post_process. That's the generic hook that the CachedStaticFilesStorage also uses, but it can be used for different things.

@baldurthoremilsson
Author

Good point, I'll have to take a look at the hooks.

To be quite honest I had forgotten about the CachedStaticFilesStorage when I wrote this. I originally didn't like the idea of keeping many copies of the files around, but it doesn't really matter because the static files directory should be a "black box" managed by Django, I shouldn't have to be looking around there much. Maybe I'll just switch to using your solution (which is really better, as it's an officially supported part of Django).

Thanks for the comments though, feedback is always nice :)

@shehmed

shehmed commented Jul 31, 2012

I have used static file URLs like /media/css/style.css?v={TIMESTAMP}, which tells the browser to always get a new version of the resource rather than the cached one.
Isn't that the right way?

@baldurthoremilsson
Author

Then your users have to fetch the resources again every time they reload your site. This adds extra load on your servers and increases the time your users have to wait for the website to load, so I recommend that you change to using a counter that you increase every time you change your resources, or better, use the computed hash of the resources with CachedStaticFilesStorage or HashPathStaticFilesStorage.

@benjaminsavoy

Unearthing this - but I just came across it and I think it is very nice.

I will see if I get time to update it for the latest versions of, well, everything, as well as making a version for S3 backends.

@Yoone

Yoone commented May 15, 2016

Thank you for this snippet, it's been very useful! I just wanted to share a small edit I made on my side at line 79. I needed to load images using the static template tag. I added the b flag to open files as binaries instead of relying on the environment's encoding (which can and most likely will break for non-textual files): the_file = open(path, 'rb').
