jbyers/README.md

## README.md

      
Taking back your Mandrill click-tracking links

My company, like many, has recently switched away from using Mandrill
for our transactional email.
We'd been using Mandrill's click-tracking feature,
and became worried about what would happen to all those old emailed links
once we cancelled our Mandrill account.
Why should we care about links in emails sent a month or more ago?
Well, turns out a lot of our users treat emails as de facto bookmarks.
To get back to our site, they dig up an old email from us and click a
link in it. (This is more common than you might expect, particularly
for consumer-oriented sites.)
Mandrill hasn't stated what will happen to click-tracking redirects
in cancelled accounts. There's no reason to believe they'd break them,
but to be safe, we decided to take back our click-tracking domain and
handle those redirects ourselves.
And you can too.
Prerequisite: custom tracking domain

This only works if you had set up your own custom tracking domain
in Mandrill -- because you'll be able to repoint that domain to your own server.
If all your old emails have links directly to mandrillapp.com, there's nothing
you can do about that now.
Say you'd been using click.example.com as your custom tracking domain, and you'd
CNAMEd that to mandrillapp.com. Our goal is to write our own server code that can handle
Mandrill's redirect links. We'll then change click.example.com
to point at our code, and we'll have no more dependencies on Mandrill.
Decoding a Mandrill tracking link

To do our own redirects, we'll need to figure out how Mandrill was encoding links.
(Not interested in the gory details? Just skip to the code.)
Here's an actual Mandrill tracking link I extracted from one of my old emails:
http://go.planapple.com/track/click/11003603/www.planapple.com?p=eyJzIjoiZGk1ZDNtM2tHaFBjaXJvRWZKU2w3LXhqRnBzIiwidiI6MSwicCI6IntcInVcIjoxMTAwMzYwMyxcInZcIjoxLFwidXJsXCI6XCJodHRwczpcXFwvXFxcL3d3dy5wbGFuYXBwbGUuY29tXFxcL3N1cHBvcnRcXFwvP3V0bV9tZWRpdW09ZW1haWwmdXRtX3NvdXJjZT10cmFuc2FjdGlvbmFsJnV0bV9jYW1wYWlnbj1wYXNzd29yZF9yZXNldFwiLFwiaWRcIjpcIjk5ZGIyYjNiOTM1MzQ4Mjc5OTg1ZDY4ZGI3MWU4ODI0XCIsXCJ1cmxfaWRzXCI6W1wiY2U2OTJhMTlkMmUyMjc5OWJiM2E2YzU5OGNlN2NkMmNmMWYxYzQ2ZFwiXX0ifQ

In the visible parts of that URL:

go.planapple.com is our custom tracking domain
11003603 was (I'm guessing) our Mandrill account id
www.planapple.com is the host portion (only) of the target link.
(I'm guessing Mandrill includes it so users can see some hint about
where the link will lead.)
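
A minimal sketch of pulling those visible parts out of a link with Python's standard library. The `split_tracking_link` name and the path regex are assumptions inferred from this single sample link, not anything Mandrill documents, and the `p` value is cut short here for readability.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Assumed path shape, inferred from the one sample link above
TRACK_PATH = re.compile(r"^/track/click/(?P<account>\d+)/(?P<host>[A-Za-z0-9.-]+)/?$")


def split_tracking_link(link):
    """Split a click-tracking link into its visible parts (hypothetical helper)."""
    parts = urlsplit(link)
    match = TRACK_PATH.match(parts.path)
    if match is None:
        raise ValueError("not a click-tracking path: %r" % parts.path)
    query = parse_qs(parts.query)
    if "p" not in query:
        raise ValueError("missing p parameter")
    return {
        "tracking_domain": parts.netloc,
        "account": int(match.group("account")),
        "target_host": match.group("host"),
        "p": query["p"][0],
    }


# The p value is truncated here; a real link carries the full blob
link = "http://go.planapple.com/track/click/11003603/www.planapple.com?p=eyJzIjoi"
print(split_tracking_link(link))
```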

But we want the full target link, not just the host. It must be in
that p parameter, which looks like it might be base64 (without
the trailing-equals padding). Sure enough, decoding
it gives a JSON string:
{
  "s": "di5d3m3kGhPciroEfJSl7-xjFps",
  "v": 1,
  "p": "{\"u\":11003603,\"v\":1,\"url\":\"https:\\\/\\\/www.planapple.com\\\/support\\\/?utm_medium=email&utm_source=transactional&utm_campaign=password_reset\",\"id\":\"99db2b3b935348279985d68db71e8824\",\"url_ids\":[\"ce692a19d2e22799bb3a6c598ce7cd2cf1f1c46d\"]}"
}
This appears to be some sort of signed JSON blob, where the real JSON of interest
is in the p payload field, and the s field is a signature Mandrill can
use to validate the payload.
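
The base64 step can be sketched in plain Python as below. This assumes the URL-safe base64 alphabet (which is also what the Django view later in this README relies on); `decode_signed_blob` is a hypothetical name, and the sample blob is made up to match the shape shown above rather than taken from a real email.

```python
import base64
import json


def decode_signed_blob(p_param):
    """Decode the unpadded base64 `p` query value into the outer JSON object."""
    # base64 input must be a multiple of 4 chars long, so restore the stripped '='
    padded = p_param + "=" * (-len(p_param) % 4)
    raw = base64.urlsafe_b64decode(padded)
    return json.loads(raw.decode("utf-8"))


# Made-up sample in the same shape as the blob above (not real Mandrill data)
sample = {
    "s": "not-a-real-signature",
    "v": 1,
    "p": json.dumps({"v": 1, "url": "https://www.example.com/"}),
}
encoded = base64.urlsafe_b64encode(json.dumps(sample).encode("utf-8"))
encoded = encoded.decode("ascii").rstrip("=")  # Mandrill's links omit the padding

outer = decode_signed_blob(encoded)
print(sorted(outer))
```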
Validation is important. We don't want to create an open redirect vulnerability
on our site. We won't be able to verify Mandrill's signature (since we don't have their secret),
so we'll take a different approach. More about that later.
Parsing the JSON from the p payload gives us the actual redirect params
we were hoping for:
{
  "u": 11003603,
  "v": 1,
  "url": "https://www.planapple.com/support/?utm_medium=email&utm_source=transactional&utm_campaign=password_reset",
  "id": "99db2b3b935348279985d68db71e8824",
  "url_ids": [
    "ce692a19d2e22799bb3a6c598ce7cd2cf1f1c46d"
  ]
}

u is our Mandrill account ID, again
v is probably the version of the parameters format.
(I've only seen v:1, on recent emails. Hoping there wasn't a v:0 earlier.)
url is the target url we were looking for.
(It even has the Google Analytics params Mandrill added for us!)
id is the Mandrill message uuid -- which you could use
if you wanted to keep logging click tracking stats
from old emails. (We're not going to bother with that.)
url_ids is... I'm not sure. (Maybe related to Mandrill's url-tagging
feature. We're going to ignore it.)
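
Unwrapping that second layer might look something like this. It's a sketch under the assumption that `v` really is a format version and that only `v:1` exists; `extract_redirect_params` is a hypothetical name, and the sample uses an abridged version of the payload above (shorter url, no `url_ids`).

```python
import json


def extract_redirect_params(outer):
    """Parse the JSON string stored in the outer blob's 'p' field."""
    if outer.get("v") != 1:
        raise ValueError("unexpected signed-blob version: %r" % outer.get("v"))
    # The payload is JSON inside a JSON string, so it needs a second parse
    params = json.loads(outer["p"])
    if not isinstance(params, dict) or params.get("v") != 1:
        raise ValueError("unexpected params format")
    return params


# Abridged from the decoded blob above; note the escaped slashes ("\/"),
# which are legal JSON and come out as plain "/" after parsing
outer = {
    "s": "di5d3m3kGhPciroEfJSl7-xjFps",
    "v": 1,
    "p": r'{"u":11003603,"v":1,"url":"https:\/\/www.planapple.com\/support\/",'
         r'"id":"99db2b3b935348279985d68db71e8824"}',
}

params = extract_redirect_params(outer)
print(params["url"])
```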

Handling in Django

Now that we've decoded Mandrill's redirect data format, handling it is easy.
The code below is for Django, but it shouldn't be hard to adapt to other
environments.
The only (somewhat) tricky part is validating the target to make sure we don't
create an open redirect vulnerability on our site.
In my case, our emails only contained links back to our own site, so I can
simply check the targets using Django's is_safe_url helper (renamed
url_has_allowed_host_and_scheme in Django 3.0) with
my site's hostname. If your emails linked to a variety of domains,
you'll need to come up with some other way to validate the redirect targets.
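
For the multi-domain case, one possible approach is a fixed allowlist of hosts you know you linked to. The sketch below is an assumption about how that could look, not something from Mandrill or Django: `ALLOWED_TARGET_HOSTS` and `is_allowed_target` are hypothetical names, and the check is deliberately strict rather than a full substitute for Django's helper.

```python
from urllib.parse import urlsplit

# Hypothetical list: every host your old emails could have linked to
ALLOWED_TARGET_HOSTS = {"www.example.com", "blog.example.com"}


def is_allowed_target(url, allowed_hosts=ALLOWED_TARGET_HOSTS):
    """Return True only for http(s) URLs whose host is on the allowlist."""
    # Browsers treat backslashes like slashes and may strip control characters,
    # which can make a URL parse differently there than it does here
    if "\\" in url or any(ord(ch) < 33 for ch in url):
        return False
    try:
        parts = urlsplit(url)
    except ValueError:  # e.g. a malformed IPv6 literal
        return False
    return parts.scheme in ("http", "https") and parts.hostname in allowed_hosts


print(is_allowed_target("https://www.example.com/support/"))
print(is_allowed_target("https://evil.example.net/"))
```

Matching on the exact hostname (rather than a suffix like `example.com`) keeps lookalike hosts such as `www.example.com.evil.test` off the list.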
Here's a simple Django view to handle the redirects:
import json
from django.core.exceptions import SuspiciousOperation
from django.http import HttpResponseRedirect
from django.utils.http import is_safe_url, urlsafe_base64_decode

TARGET_HOSTNAME="www.example.com"  # You expect all redirects to go here

def legacy_mandrill_click_tracking_redirect(request, mandrill_account=None, target_host=None):
    """Handle a Mandrill click-tracking redirect link"""

    try:
        b64payload = request.GET['p']
        # (Django's urlsafe_base64_decode handles missing '=' padding)
        payload = json.loads(urlsafe_base64_decode(b64payload))
        assert payload['v'] == 1  # we've only seen v:1 signed payloads
        params = json.loads(payload['p'])
        assert params['v'] == 1  # we've only seen v:1 params
        target = params['url']
    except (AssertionError, KeyError, TypeError, ValueError):
        # Missing/unparseable query params/payload format
        raise SuspiciousOperation("tried to redirect with garbled payload")

    # Verify we're only redirecting to our own site (don't be an open redirect server):
    if not is_safe_url(target, TARGET_HOSTNAME):
        raise SuspiciousOperation("tried to redirect to unsafe url '%s'" % target)
        
    # If you want to be extra-paranoid, you could also check that:
    #   mandrill_account == params['u']
    #   target_host == urlparse(target).netloc

    # Want to actually log the click for your own metrics?
    # This would be a good place to do it.
    
    return HttpResponseRedirect(target)
Add this view to your Django urlpatterns with something like this:
from django.conf.urls import patterns, url
from yourapp.views import legacy_mandrill_click_tracking_redirect

urlpatterns = patterns('',
    # ...
    url(r'^track/click/(?P<mandrill_account>(\d+))/(?P<target_host>([a-z0-9.-]+))/?$',
        legacy_mandrill_click_tracking_redirect),
    # ...
)
You may also need to add your Mandrill click-tracking domain to Django's
ALLOWED_HOSTS setting.
This is a good time to do some local testing with redirect urls from your own Mandrill
messages. Edit your /etc/hosts file to point your Mandrill click-tracking domain to your
dev server, then try clicking some links in old Mandrill emails. (Don't forget to
edit /etc/hosts back when you're done.)
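
If you don't have an old email handy, you could also fabricate a link in the same format for testing. This is only a sketch: `make_test_tracking_link`, the `click.example.com` domain, and the account number are placeholders, and the signature is a dummy value (which is fine here, since the view above never checks it).

```python
import base64
import json
from urllib.parse import urlsplit


def make_test_tracking_link(target_url, tracking_domain="click.example.com",
                            account_id=12345):
    """Build a Mandrill-style click-tracking link for local testing."""
    params = {"u": account_id, "v": 1, "url": target_url}
    blob = {"s": "dummy-signature", "v": 1, "p": json.dumps(params)}
    encoded = base64.urlsafe_b64encode(json.dumps(blob).encode("utf-8"))
    p = encoded.decode("ascii").rstrip("=")  # drop padding, like the real links
    host = urlsplit(target_url).netloc
    return "http://%s/track/click/%d/%s?p=%s" % (tracking_domain, account_id, host, p)


print(make_test_tracking_link("https://www.example.com/support/"))
```

Generating a link with a target outside your own site is a quick way to confirm the view refuses to redirect there.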
Once it's working, you can edit your DNS to change your old Mandrill
click-tracking CNAME from mandrillapp.com to your live Django app.
Bonus: open-tracking

If you were using Mandrill open-tracking, its tracking pixels will now be loaded from
your site. You can just ignore them (users won't notice a 404 error in a 1x1
pixel image). Or you could write some code to handle them and return a transparent
image (and even log the open metrics, if you wanted).
The format of the open-tracking pixel is:
http://go.planapple.com/track/open.php?u=11003603&id=99db2b3b935348279985d68db71e8824

where:

go.planapple.com is your custom tracking domain
the u param is your Mandrill account id
the id param is the Mandrill message uuid

Code to serve a transparent GIF on /track/open.php is "left as an exercise to the reader."
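
For anyone who'd rather not do the exercise, here is a rough starting point using only the standard library. The base64 string is a commonly used 42-byte transparent 1x1 GIF; `open_pixel_response` and the choice of headers are assumptions, and the Django wiring is only indicated in a comment.

```python
import base64
import struct

# A commonly used 42-byte transparent 1x1 GIF
TRANSPARENT_GIF = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)


def open_pixel_response():
    """Return (body, headers) for the open-tracking pixel (hypothetical helper)."""
    headers = {
        "Content-Type": "image/gif",
        "Content-Length": str(len(TRANSPARENT_GIF)),
        # Discourage caching so a repeat open still reaches your server
        "Cache-Control": "no-store",
    }
    return TRANSPARENT_GIF, headers


# In a Django view this would be roughly:
#   return HttpResponse(TRANSPARENT_GIF, content_type="image/gif")
body, headers = open_pixel_response()
width, height = struct.unpack("<HH", body[6:10])  # GIF stores size little-endian
print(len(body), width, height)
```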

  
## urls.py
from django.conf.urls import patterns, url
from yourapp.views import legacy_mandrill_click_tracking_redirect

urlpatterns = patterns('',
    # ...
    url(r'^track/click/(?P<mandrill_account>(\d+))/(?P<target_host>([a-z0-9.-]+))/?$',
        legacy_mandrill_click_tracking_redirect),
    # ...
)

## views.py
import json
from django.core.exceptions import SuspiciousOperation
from django.http import HttpResponseRedirect
from django.utils.http import is_safe_url, urlsafe_base64_decode

TARGET_HOSTNAME="www.example.com"  # You expect all redirects to go here

def legacy_mandrill_click_tracking_redirect(request, mandrill_account=None, target_host=None):
    """Handle a Mandrill click-tracking redirect link"""

    try:
        b64payload = request.GET['p']
        # (Django's urlsafe_base64_decode handles missing '=' padding)
        payload = json.loads(urlsafe_base64_decode(b64payload))
        assert payload['v'] == 1  # we've only seen v:1 signed payloads
        params = json.loads(payload['p'])
        assert params['v'] == 1  # we've only seen v:1 params
        target = params['url']
    except (AssertionError, KeyError, TypeError, ValueError):
        # Missing/unparseable query params/payload format
        raise SuspiciousOperation("tried to redirect with garbled payload")

    # Verify we're only redirecting to our own site (don't be an open redirect server):
    if not is_safe_url(target, TARGET_HOSTNAME):
        raise SuspiciousOperation("tried to redirect to unsafe url '%s'" % target)

    # If you want to be extra-paranoid, you could also check that:
    #   mandrill_account == params['u']
    #   target_host == urlparse(target).netloc

    # Want to actually log the click for your own metrics?
    # This would be a good place to do it.

    return HttpResponseRedirect(target)