@briankung
Last active December 4, 2020 12:13
[RFC] Making sense of secure token authentication in Rails

/ be me

/ be working on new Rails API web app

/ ...I can't. I hate this format

...so I was looking up how to do simple, secure authentication in Rails for my super secret project which you definitely can't find by looking at recent activity on my GitHub account (Hail Mammon[0]) and realized that I hadn't implemented authentication in Rails in a long time. The closest I'd come was throwing Devise into my app and grumpily remembering to configure the mailer after the fact.

This will seem unrelated, but bear with me: I learned how to calculate interest rates on loans in something like 6th grade. I have since forgotten how to do that without assistance. Much like how I would have been much better prepared to choose a mortgage back then, I would in some ways have been much better prepared to figure out authentication a few years ago when I was just starting out in web dev - and had just implemented it from the Hartl tutorial. That said, just like how I now know more of what I'm looking for in a house, I now know more of what I'm looking for in an API.

I wanted an API-only Rails app for a mobile project I'm working on with my cousin. That meant that a great deal of Devise would be mostly useless to me, so I began to look for alternatives, which led me down a rabbit hole. After all, what was the issue with rolling your own authentication, again?


Why You Shouldn’t Roll Your Own Authentication

Basically the problems with rolling your own authentication are a microcosm of the problems with rolling your own crypto, but not as severe. Don't get me wrong - a compromised system is a compromised system. But rolling your own authentication doesn't have quite the same level of difficulty as rolling your own crypto library, of which there are certain levels:

  1. Using cryptographic libraries - probably fine, given you know a few key points.
  2. Choosing your encryption algorithm - check that it's not already broken; once someone has a reliable crack for an encryption scheme, the whole thing is generally dead in the water. It's like being a little dead or a little pregnant - there's really no such thing.
  3. Implementing an encryption library or algorithm - you really have to know your stuff to write a safe, performant encryption library. There are a few layers involved in this level, but they're all subject to the same sensitivities:
    1. Wrappers around well-known libraries which are often implemented in low level languages
    2. Native (to the language you're using) implementations of well-known encryption algorithms and libraries to use them, and
    3. Mixed language implementations, often in a target language and assembly for performance and reliability. Apparently, compilers, smart little devils that they are, will change the assembly they emit, often for good reasons like optimization. However, that means that the resulting assembly code is unpredictable - and possibly insecure, as crypto researcher DJ Bernstein notes here. And no, DJ Bernstein does not drop phat beats, but his crypto is 🔥🔥🔥
  4. Designing an encryption algorithm - probably just apply, be accepted into, and enroll in a PhD program in cryptography to get started on this one, so probably not super approachable for most people

Rolling your own authentication, on the other hand, mostly just sits at level 1 of the list above, if that. A few things to keep in mind:

  1. Never store plaintext passwords
  2. Don't pick broken encryption algorithms - better yet, don't get to the point where you're picking algorithms at all, use a peer reviewed community standard
  3. Know what a salt is for: adding per-user random data to the password before hashing it, so that identical passwords hash differently and precomputed dictionary and rainbow-table attacks stop working
  4. Be familiar with the idea of timing attacks - fun fact, the Meltdown and Spectre attacks both rely on timing side channels
  5. Know when you're in over your head - for me, that's anything over level 0.5 in the previous list.
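To make point 3 concrete, here's a minimal sketch of what a salt buys you. It uses PBKDF2 from Ruby's bundled OpenSSL bindings rather than bcrypt, purely so it runs without extra gems; `derive_hash`, `ITERATIONS`, and the sample password are made up for illustration, and the iteration count is not a recommendation.

```ruby
require "openssl"
require "securerandom"

# Illustrative work factor only; not a tuned or recommended value.
ITERATIONS = 100_000

# Derive a hex digest from a password and a salt using PBKDF2-HMAC-SHA256.
def derive_hash(password, salt)
  digest = OpenSSL::Digest.new("SHA256")
  OpenSSL::PKCS5.pbkdf2_hmac(password, salt, ITERATIONS, 32, digest).unpack1("H*")
end

# Two users who happen to pick the same password get different random salts...
salt_a = SecureRandom.random_bytes(16)
salt_b = SecureRandom.random_bytes(16)

hash_a = derive_hash("hunter2", salt_a)
hash_b = derive_hash("hunter2", salt_b)

# ...so their stored hashes differ, and a table precomputed without the salt is useless.
puts hash_a == hash_b

# Verifying a login means re-deriving with the stored salt for that user.
puts derive_hash("hunter2", salt_a) == hash_a
```

In practice a library like bcrypt generates and stores the salt for you, which is a good part of why you'd use one.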

Clearly, I'm not terribly confident, but that's because I shouldn't be.

About that API...

Now, where was I? Ah yes, writing an API with token authentication. That's where I started and I ended up somewhere in between Just Enough Crypto for the Web and deciding to implement my own crypto library...someday. Maybe. Put it on the list of things to do, anyway.

But what I wanted to do was write a simple API. So what options did I have?

  1. Roll my own authentication using bcrypt
  2. Use Devise, but with a sprinkle of customization
  3. Use a Devise-based derivative like simple_token_authentication or devise_token_auth

Since I don't trust myself with security, I ruled out #1. I'm not against becoming proficient enough someday, but today's another day to find me shying away (🎶)...that is, I'm not ready to commit to taking that path yet.

Besides security, however, there was an additional consideration. Solutions 2 and 3 are based off of Jose Valim's 2nd example, wherein you supply both an API key and the user's email address, which is mildly distasteful to me. Distasteful enough to get me moderately over my distrust of my own security chops.

The reason Jose Valim recommends that the client supply a user email and a token to the server is to prevent timing attacks - slight differences in execution time that would result in measurable differences when comparing, for example, a completely wrong password to one that is partially correct:

# Given a password:
"ABCDE"
# And an attempted password:
"12345"
# In most languages, a string comparison would compare the first characters
# first and then halt execution:
"A" == "1" # False
# And then return false right away, essentially. The only problem, from a
# security standpoint, is that it takes less time to do that than to compare two
# almost-identical strings:
# Given passwords "ABCDE" and "ABCD5"
"A" == "A" # True
"B" == "B" # True
"C" == "C" # True
"D" == "D" # True
"E" == "5" # False
# The latter example takes roughly 5 times as long as the first example - good
# evidence that the password the attacker supplied was almost right!
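The same idea as a runnable sketch: `comparisons_before_exit` and `constant_time_equal?` are hypothetical helper names, not part of any library. The step count stands in for elapsed time, and the XOR loop is similar in spirit to what helpers like `Devise.secure_compare` do, not a copy of them.

```ruby
# Early-exit comparison: returns how many characters were inspected
# before the first mismatch (or the full length if none was found).
def comparisons_before_exit(secret, guess)
  secret.chars.each_with_index do |char, i|
    return i + 1 if char != guess[i]
  end
  secret.length
end

# Constant-time comparison: always walks every byte and only reports
# the verdict at the end, so the work done doesn't depend on where
# (or whether) the inputs differ.
def constant_time_equal?(a, b)
  return false unless a.bytesize == b.bytesize

  diff = 0
  a.bytes.zip(b.bytes) { |x, y| diff |= x ^ y }
  diff.zero?
end

comparisons_before_exit("ABCDE", "12345") # => 1
comparisons_before_exit("ABCDE", "ABCD5") # => 5

constant_time_equal?("ABCDE", "ABCD5")    # => false
constant_time_equal?("ABCDE", "ABCDE")    # => true
```

Note that the length check still returns early, so this hides the contents of the secret but not its length.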

In this case, the timing attack revolves around the #find_by method in Jose Valim's gist. I've copied the offending bits from both the insecure and the secure implementations in the aforementioned gist:

# I've included both insecure and secure versions in this gist. In Ruby, this
# would lead to only the last implementation actually working, so it's just for
# purposes of illustration that they're both in the same class.
class ApplicationController < ActionController::Base
  # Note: `before_filter` was removed in Rails 5.1; newer apps use `before_action`.
  before_filter :authenticate_user_from_token!
  # ...

  private

  # Insecure version in which user_token is the only authentication parameter:
  def authenticate_user_from_token!
    user_token = params[:user_token].presence
    # Note that `find_by_<attr_name>` still works, but the preferred style
    # now is `find_by(attr_name: value)`.
    user = user_token && User.find_by_authentication_token(user_token.to_s)
    if user
      # ...
    end
  end

  # Secure version in which user_token and user_email are both used as auth params:
  def authenticate_user_from_token!
    user_email = params[:user_email].presence
    user = user_email && User.find_by_email(user_email)
    if user && Devise.secure_compare(user.authentication_token, params[:user_token])
      # ...
    end
  end
end
# Note the operations here that will take a variable amount of time:
User.find_by_authentication_token(user_token.to_s)
# and
User.find_by_email(user_email)
# Meanwhile, Devise.secure_compare is implemented so that it will run in constant time,
# thereby giving our attacker no hints as to how correct their guess is. But why is the
# first solution weak to timing attacks? Or to put it another way, why is the second
# example resistant to timing attacks?

Simply put, the first example has a high degree of timing variability when it comes to the result of User.find_by_authentication_token(user_token.to_s).

But Brian, doesn't the second example do a User lookup with User.find_by_email(user_email), as well?

The answer is yes, but:

  1. It's a required parameter, so you must know the email ahead of time
  2. You can throttle fishy requests based on the user email
  3. It doesn't matter because you can discover emails by trying to sign up for them, anyway

As @danielfone says (emphasis mine),

@dnagir the intent is to specifically protect against the timing attacks on User.find_by_authentication_token.

If we look up the record with a known key (in this case email) and then do a safe compare on the token, the attacker can't gain any information about the validity of the token by studying timings. This is the only attack this is mitigating.

The important part when protecting the token is to do a constant time comparison between the supplied token and an actual known token.
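Stripped of Rails, that flow might look something like the sketch below. The `USERS` hash is a hypothetical stand-in for the users table and `authenticate` is a made-up name; `OpenSSL.secure_compare` is assumed to be available, which it is in the OpenSSL bindings that ship with Ruby 3.0 and later.

```ruby
require "openssl"

# Hypothetical in-memory stand-in for the users table, keyed by email.
USERS = {
  "alice@example.com" => { authentication_token: "s3cr3t-token-for-alice" }
}.freeze

# Returns the user record on success, nil otherwise.
def authenticate(email, token)
  # Step 1: look the user up by the known, non-secret key (the email).
  user = USERS[email.to_s]
  return nil unless user && token

  # Step 2: the secret token is only ever touched by a constant-time
  # comparison, so timings shouldn't reveal how close a guess was.
  return nil unless OpenSSL.secure_compare(user[:authentication_token], token.to_s)

  user
end

authenticate("alice@example.com", "s3cr3t-token-for-alice") # => the user hash
authenticate("alice@example.com", "s3cr3t-token-for-alicf") # => nil
authenticate("mallory@example.com", "anything")             # => nil
```

The lookup in step 1 can still take a variable amount of time, but all it can leak is whether the email exists, which is the trade-off described above.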

At some point, I thought I was clever and wanted to supply a separate (either global default or a newly generated) token to compare against if the user couldn't be found. But that actually doesn't work. It's easier to visualize if we flatten the logic into steps.

| Secure method | Insecure method | "Clever" insecure method |
| --- | --- | --- |
| 1. Look up user by email | 1. Look up user by token | 1. Look up user by token |
| 2. Constant time comparison of tokens | 2. ¯\_(ツ)_/¯ | 2. Generate new or fetch global token if lookup fails |
| | | 3. Constant time comparison of tokens just to look busy |

And now that I think about it, it's even more broken than I thought. Generating a token or fetching another token just to run a secure compare introduces more detectable timing differences, not to mention the fact that it doesn't even defend against the initial problem, which is that the initial user lookup by token has a runtime that's directly related to the validity of the token.


Still, the idea of requiring clients to provide an email and a token was gross to me. If I could get around it, I would.

My other idea was throttling. If I could deny an attacker the opportunities to get a fine-grained resolution between attack attempts, or at least make it extremely painful to do so, then I could have a good-enough defense against rogue scripts, for this kind of attack, anyway.
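As a rough illustration of the throttling idea, here's a tiny fixed-window counter in plain Ruby. `FixedWindowThrottle` is a hypothetical class written for this example, it keeps state in a single process's memory, and the limit and period are arbitrary; a real app would lean on a maintained middleware and a shared store instead.

```ruby
# Allows at most `limit` requests per `period` seconds for each key
# (for example a client IP or a user email).
class FixedWindowThrottle
  def initialize(limit:, period:)
    @limit = limit
    @period = period
    @counters = Hash.new(0)
  end

  # Returns true when the request should be let through.
  # `now` is injectable so the behaviour can be checked without sleeping.
  def allow?(key, now: Time.now.to_i)
    window = now / @period

    # Forget counters that belong to windows which have already ended.
    @counters.delete_if { |(_key, counted_window), _count| counted_window < window }

    count = (@counters[[key, window]] += 1)
    count <= @limit
  end
end

throttle = FixedWindowThrottle.new(limit: 2, period: 60)
throttle.allow?("alice@example.com", now: 0)  # => true
throttle.allow?("alice@example.com", now: 10) # => true
throttle.allow?("alice@example.com", now: 20) # => false (third try in the window)
throttle.allow?("alice@example.com", now: 60) # => true  (a new window has started)
```

The point is only that an attacker who needs tens of thousands of samples gets a couple per minute instead.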

I started by looking around to see how many requests you would need to pull off a successful timing attack. The first response I found gave me a number (49,000 requests for 15 nanosecond resolution), but skimming the paper made me less sure about how much resolution was actually necessary to distinguish between control flow paths in a typical Rails server (to pick a totally random technology).

A flaw in my plans? Haven't seen any round these parts. Moving on...😬

I googled around a bit and found Kickstarter's rack-attack middleware. It seemed active, and since it was backed by a big company, I figured it was fine. And that's where I am at the time of writing. I'm probably going to throw something insecure together involving using just a token, or doing something silly like two-way hashing the user email and token together (not assuming that's secure) on token generation to hand off to the API user and then reversing the hash during the API request to run the same constant-time algorithm that Valim proposes.
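For what it's worth, the "combine the email and token into one thing and take it apart again on the server" idea could be sketched as below. This is reversible encoding (Base64), not hashing, and it adds no secrecy at all; it only lets the client pass a single opaque string while the server still gets an email to look up and a token to compare. `pack_credential` and `unpack_credential` are hypothetical names, and the sketch assumes tokens never contain a colon.

```ruby
# Bundle email and token into one Base64 string to hand to the API client.
def pack_credential(email, token)
  ["#{email}:#{token}"].pack("m0")
end

# Split a credential back into [email, token]; returns nil if it is malformed.
def unpack_credential(credential)
  decoded = credential.to_s.unpack1("m0").force_encoding(Encoding::UTF_8)

  # Split on the last colon, since the token is assumed not to contain one.
  email, _separator, token = decoded.rpartition(":")
  return nil if email.empty? || token.empty?

  [email, token]
rescue ArgumentError
  # Raised by strict Base64 decoding when the input isn't valid Base64.
  nil
end

credential = pack_credential("alice@example.com", "abc123")
unpack_credential(credential) # => ["alice@example.com", "abc123"]
unpack_credential("%%%")      # => nil
```

From there the server would look up the user by the decoded email and run the constant-time comparison on the decoded token, same as before.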

"I figured it was fine" is probably fine, in the end - this project is unlikely to go anywhere, and at the end of the day I still have to actually write the damn thing. My threat model doesn't include...anyone, currently. But it was a fine rabbit hole to find myself down, and I'm glad I took the time to write it up for future reference.

galliani commented Dec 4, 2020

Thanks for this very-detailed educative gist
