@knwang
Last active December 14, 2015 03:28
How Does a Web Application Keep Track of User Identity?

How Does a Web Application Keep Track of its Users?

If you have a Gmail account, you have probably noticed that when you visit http://www.gmail.com, sometimes it asks for your username and password, and sometimes it takes you directly to your inbox. Similarly, when you visit http://www.amazon.com, you may be surprised to see your first name on the front page! How do websites like Google and Amazon keep track of user identity and information? In this article, we are going to explore this topic and explain how web applications use cookies to track their users over time.

Web applications do not track users by default

The core technologies that power the Internet are designed such that when you make a request to a web server, for example for Google's front page at http://www.google.com, the web server does not know who you are, nor does it have any memory of its past interactions with you. The protocol that governs the transfer of data between your browser and Google's server, called HTTP (HyperText Transfer Protocol), is designed so that the server focuses only on the current request, without any memory of past interactions with the user. In other words, every request is handled on its own, and nothing in the protocol ties one request to the next.

This "stateless" nature of the protocol allows web servers to handle a lot of traffic. However, what if the web server really needs to know which user it is interacting with? For example, an online banking application has to identify every user so that they can only move their own money!

Cookies as a solution to track users

One popular solution to this challenge is for the server to create a "cookie", which is a small piece of data that contains the user's information, send it to the user's browser, and have the browser store it on the user's computer. Then every time the user visits the same site, the browser sends the cookie along with the request, and the server can extract the data stored in the cookie, including the user's identity.
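To make the round trip concrete, here is a minimal sketch in Python using the standard library's http.cookies module. The cookie name user_token and the value abc123 are made up for illustration; a real application would choose its own names and would send and receive these strings as HTTP headers.

```python
from http.cookies import SimpleCookie


def build_set_cookie_header(name, value):
    """Server side: build the value of a Set-Cookie response header."""
    cookie = SimpleCookie()
    cookie[name] = value
    return cookie[name].OutputString()


def parse_cookie_header(header_value):
    """Server side: read the Cookie request header the browser sent back."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return {key: morsel.value for key, morsel in cookie.items()}


# First response: the server hands the browser a cookie.
set_cookie = build_set_cookie_header("user_token", "abc123")

# Later requests: the browser returns the same pair in a Cookie header.
received = parse_cookie_header("user_token=abc123")
```

The server never keeps a connection open to remember the user. Everything it needs to recognize the user arrives again with each request.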

We are going to use a user's interaction with Gmail as an example to walk through this process and explain the related technologies in that context.

A case study with Gmail

User visits Gmail and signs in

Let's say I want to check my emails in Gmail. I open up my browser and request http://www.gmail.com. All modern browsers support cookies, so my browser will look through my local cookies to see if any of them match this domain. If it finds any, they will be sent along with my request to Google. Here, because I do not have a cookie from Google yet, I am redirected to https://accounts.google.com to sign in.
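The lookup the browser performs can be sketched roughly as follows. This is a simplified illustration, not how any particular browser is implemented: real browsers also check the path, the expiration date, and flags such as Secure. The jar contents and cookie names below are invented for the example.

```python
def domain_matches(request_host, cookie_domain):
    """Return True if a cookie set for cookie_domain applies to request_host."""
    request_host = request_host.lower()
    cookie_domain = cookie_domain.lower().lstrip(".")
    # Exact match, or the host is a subdomain of the cookie's domain.
    return request_host == cookie_domain or request_host.endswith("." + cookie_domain)


def cookies_for_request(jar, request_host):
    """Collect every stored cookie whose domain matches the requested host."""
    matching = {}
    for cookie_domain, cookies in jar.items():
        if domain_matches(request_host, cookie_domain):
            matching.update(cookies)
    return matching


# A hypothetical local cookie store, keyed by the domain that set each cookie.
jar = {
    "google.com": {"SID": "opaque-token"},
    "amazon.com": {"session-id": "xyz"},
}

to_send = cookies_for_request(jar, "mail.google.com")
nothing_stored = cookies_for_request(jar, "www.gmail.com")
```

In this sketch a request to mail.google.com carries the google.com cookie, while a request to a host with no stored cookies carries none, which is the situation that leads to the sign-in page.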

Google sets the cookie

If I put in the wrong username or password, Google will just redirect me back to the sign-in page with an error message, without sending along any cookies. But if I put in the correct username and password, Google will send me back a cookie that contains a token that uniquely identifies me.
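A sign-in handler along these lines could look like the sketch below. It is only an illustration of the idea and says nothing about how Google actually does it: the USERS and SESSIONS dictionaries, the session cookie name, and the /signin and /inbox paths are all assumptions, and a real application would use a database and a per-user salt.

```python
import hashlib
import hmac
import secrets


def hash_password(password, salt):
    """Derive a password hash so plain passwords are never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


SALT = b"demo-salt"  # hypothetical; real systems use a random salt per user
USERS = {"alice": hash_password("correct horse", SALT)}
SESSIONS = {}  # token -> username, kept on the server


def sign_in(username, password):
    """Return (status, headers). A cookie is set only on success."""
    expected = USERS.get(username)
    supplied = hash_password(password, SALT)
    if expected is None or not hmac.compare_digest(expected, supplied):
        # Wrong credentials: back to the sign-in page, no cookie.
        return 302, {"Location": "/signin?error=1"}
    token = secrets.token_urlsafe(32)  # unguessable random token
    SESSIONS[token] = username
    return 302, {
        "Location": "/inbox",
        "Set-Cookie": f"session={token}; HttpOnly; Secure",
    }
```

The token itself carries no meaning. It only identifies the user because the server remembers which user it was issued to.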

(screen shot of the cookie)

As you can see, the data in the cookie is encrypted by Google to ensure that nobody other than Google itself can interpret or modify it. When setting this cookie, Google has the option of setting an expiration date to tell my browser when this cookie should be invalidated. Cookies without an expiration date are called "transient cookies" (also known as session cookies), and they are destroyed when the user closes the browser. Cookies with an expiration date are kept on the user's computer beyond a single browsing session, and they are called "persistent cookies". In our example, Google sets the cookie to be a persistent cookie.
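The difference between the two kinds of cookies comes down to a single attribute on the Set-Cookie header. The sketch below uses Python's http.cookies module; the cookie name session, the token value, and the two-week lifetime are arbitrary choices for illustration.

```python
from http.cookies import SimpleCookie

TWO_WEEKS_IN_SECONDS = 14 * 24 * 60 * 60


def make_cookie_header(token, persistent):
    """Build a Set-Cookie value, with a lifetime only if persistent is True."""
    cookie = SimpleCookie()
    cookie["session"] = token
    if persistent:
        # A lifetime tells the browser to keep the cookie after it is closed.
        cookie["session"]["max-age"] = TWO_WEEKS_IN_SECONDS
    # With no Max-Age or Expires attribute, the cookie is transient.
    return cookie["session"].OutputString()


persistent_header = make_cookie_header("opaque-token-123", persistent=True)
transient_header = make_cookie_header("opaque-token-123", persistent=False)
```

Max-Age is used here instead of an explicit expiration date. Both attributes tell the browser how long to keep the cookie.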

I work with my emails

Once I am signed in with Gmail, I am going to check some emails, reply to some and compose some new ones. Those are going to be a series of interactions between me and Google. Because all subsequent requests from me are now carrying the cookie that Google set for me, Google knows exactly who I am and it does not require me to sign in every single time. When I compose a new email and send it out, it will show that it's from me.
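On the server, recognizing the user on each request can be as small as the following sketch. It assumes a hypothetical SESSIONS table mapping tokens to usernames and a cookie named session; both are invented for this example.

```python
from http.cookies import SimpleCookie

# Hypothetical server-side table of tokens issued at sign-in.
SESSIONS = {"opaque-token-123": "alice"}


def current_user(cookie_header):
    """Return the username for the request's cookie, or None if unknown."""
    if not cookie_header:
        return None
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session")
    if morsel is None:
        return None
    return SESSIONS.get(morsel.value)


def handle_request(path, cookie_header):
    """Serve the page to a recognized user, otherwise ask them to sign in."""
    user = current_user(cookie_header)
    if user is None:
        return 302, "redirect to sign-in page"
    return 200, f"{path} for {user}"


known = handle_request("/inbox", "session=opaque-token-123")
unknown = handle_request("/inbox", None)
```

Every request repeats the same lookup, so the server still treats each request independently even though the user experiences one continuous session.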

I close my browser window

When I finish working with my emails, I close my browser window. Because the cookie that Google sent me is a persistent cookie, it stays on my disk and will be used in the future. Not all applications behave this way. For example, most banking applications only set transient cookies, so that their customers have to sign in every time before making a financial transaction, for extra safety.

I visit Gmail again

This time, because I have a cookie stored locally, my browser will find it and send it along with my request. Google reads this data, knows who I am, and lets me into my inbox immediately.

I sign out from Gmail

When I explicitly sign out of Gmail, Google will set my local cookie to expire and redirect me to the sign-in page. My browser then destroys the local cookie. Now if I make a request to Gmail, I will be challenged to sign in again.
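A sign-out handler can be sketched as below. As before, the SESSIONS table, the session cookie name, and the /signin path are assumptions made for the example. Setting the lifetime to zero and the expiration date in the past is a common way to ask the browser to delete a cookie.

```python
# Hypothetical server-side table of tokens issued at sign-in.
SESSIONS = {"opaque-token-123": "alice"}


def sign_out(token):
    """Forget the token on the server and tell the browser to drop the cookie."""
    SESSIONS.pop(token, None)  # the token is no longer accepted
    expired_cookie = (
        "session=; Max-Age=0; Expires=Thu, 01 Jan 1970 00:00:00 GMT; Path=/"
    )
    return 302, {"Location": "/signin", "Set-Cookie": expired_cookie}


status, headers = sign_out("opaque-token-123")
```

Removing the token on the server as well means that a copy of the old cookie would no longer be accepted, even if the browser failed to delete it.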

Conclusion:

In this article, we explained the stateless nature of HTTP and how a web application can still keep track of its users across requests. We walked through the cookie-based approach in particular, where the data that identifies the user is stored on the client, which is a very popular way to implement user sessions.
