@kevinburke
Last active August 29, 2015 14:02
Clicky / Google Analytics Traffic Discrepancy

Why does Clicky report more traffic than Google Analytics?

(I'm not sure how familiar you are with this, so I thought I would provide some background. Feel free to skip ahead if this is familiar.) Analytics tracking consists of three main tasks:

  1. Download the tracking Javascript from the tracker's servers, in this case http://www.google-analytics.com/ga.js and http://static.getclicky.com/js. This code runs on your site and gathers various pieces of information about the user.

  2. Set a cookie in the user's browser so that multiple page views can be tracked to the same user

  3. Make a request back to the data warehouse with tracking data. For example, Google Analytics (GA) requests a 1x1 pixel image. But included in the request is a whole bunch of tracking data - the URL, the Google keywords used to get to the page, etc. Here are the parameters Google sends on first page load.

A handy guide to these keywords can be found here
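
To make task 3 concrete, here is a minimal sketch of how a tracking-pixel request can carry page data in its query string. The parameter names (utmwv, utmn, utmhn, utmp, utmdt, utmr, utmac) follow the legacy ga.js `__utm.gif` request as I understand it; the account ID, hostname, and version string are placeholders, and real requests carry more fields (cookie values, screen size and so on). Treat it as an illustration, not a description of exactly what GA sends.

```python
import random
from urllib.parse import urlencode

# Assumed endpoint of the legacy ga.js tracking pixel.
GA_PIXEL = "http://www.google-analytics.com/__utm.gif"


def build_tracking_url(account, hostname, page, title, referrer="-"):
    """Build a 1x1-pixel request URL whose query string holds the tracking data."""
    params = {
        "utmwv": "5.4.3",                         # tracker version (placeholder value)
        "utmn": random.randint(0, 2**31 - 1),     # random number to defeat caching
        "utmhn": hostname,                        # host name of the tracked site
        "utmp": page,                             # path of the page being viewed
        "utmdt": title,                           # page title
        "utmr": referrer,                         # referrer, "-" when there is none
        "utmac": account,                         # analytics account ID
    }
    return GA_PIXEL + "?" + urlencode(params)


# Hypothetical site and account, for illustration only.
url = build_tracking_url("UA-XXXXX-Y", "www.example.org", "/about", "About us")
print(url)
```

The browser only has to request this URL as an image; the server ignores the image itself and logs the query string.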

Usual causes of traffic discrepancies

  • Latency between a page loading in a user's browser and the event getting to GA/Clicky. These all have to happen to register a page view:

    • The browser has to download the tracking code
    • The browser has to execute the tracking code
    • When the tracking code is ready to phone home with the request data, the browser has to figure out where to send it, translating google-analytics.com into an IP address (DNS lookup)
    • The browser has to send the data (HTTP request)

    If someone navigates away while a request is in progress, GA/Clicky may not register a page view.

  • How trackers handle automated traffic (crawlers, bots). Bots are getting more sophisticated and some (including Google's search crawler) can run Javascript. Trackers may vary in their ability to detect and filter automated traffic.

  • How trackers handle users without Javascript. Roughly 1% of people browse the web with Javascript disabled, and tracking tools try, with varying degrees of success, to track these users. I believe GA does not try to track them at all, whereas Clicky does.

  • How trackers handle site administrators. Clicky, for example, discards views from users who are simultaneously logged into Clicky and visiting Givewell. There are also plugins that provide this functionality for GA.
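
The latency point in the first bullet can be sketched as a simple rule: a tracker counts a page view only if its request has completed before the visitor leaves. This is a simplification (the text above only says an in-flight request may be lost), and the `Beacon` type and function name are made up for illustration. The finish times are taken from the first attempt of each timing experiment reported later in this document; the GA cold-cache figure is an upper bound, since that request began at about 23s and finished in under a second.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Beacon:
    tracker: str
    finished_at_s: float  # seconds after page load began when the tracking request completed


def trackers_registering(dwell_s, beacons):
    """Return the trackers whose request completed before the visitor left.

    Assumption: a request still in flight when the visitor leaves is never counted.
    """
    return [b.tracker for b in beacons if b.finished_at_s <= dwell_s]


# Slow connection, empty cache (first experiment, attempt 1).
cold_cache = [Beacon("Clicky", 22.0), Beacon("GA", 24.0)]
# Slow connection, ga.js already cached (second experiment, attempt 1).
warm_cache = [Beacon("GA", 1.3), Beacon("Clicky", 1.6)]

print(trackers_registering(23.0, cold_cache))  # ['Clicky']
print(trackers_registering(1.5, warm_cache))   # ['GA']
```

Under this rule, which tracker "wins" depends entirely on how long the visitor stays and whether ga.js is already cached.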

Possible reasons for discrepancies

Roughly, from what I believe to be the most likely to the least likely.

  • GA is discarding pageviews that Clicky counts. A lot of automated traffic hits any website. In particular, Clicky tries to get browsers without Javascript (including bots) to request a tracking image, so that these visitors are counted in overall page view reports, whereas Google Analytics does not. However, Clicky states that it filters bot traffic. One way to test this would be to set up a test account in Clicky, pretend to be Googlebot or similar, and request the tracking image. If Clicky registers these image requests as page views, this may explain some of the variance in traffic.
  • A Drupal plugin hides the GA tracking code from logged-in Givewell staff, but does not hide the Clicky code. In this case the extra Clicky traffic could be due to staff visits that count toward Clicky but not toward GA. A way to test this may be to

  • The Google Analytics (GA) loading code is located near the beginning of Givewell's web page, while Clicky is loaded at the end, so people who don't load the whole page only register for GA. The theory is that people on slow connections, or who only visit the page for a short period of time, register a visit with Google Analytics but don't fully download the contents of the page, so Clicky does not register a hit. However, this hypothesis is probably false: it predicts more traffic in GA, and it is Clicky that has registered more traffic.

  • Givewell page architecture has changed. It is possible that the page layout has changed, e.g. in February 2014 someone moved the GA tracking code to the head and Clicky to the footer, or something similar. This may have had an effect on analytics for the reason stated above.

  • Clicky is "quicker to the draw" than GA, somehow, and capturing people who visit/leave the page quickly. I tested this by setting the Internet connection on my computer to be extremely slow, clearing cookies and loading Givewell 3 times.

    Attempt 1:

    • Clicky begins request at ~13s, finishes at 22s.
    • GA begins at ~23s, finishes in under 1 second.

    Attempt 2:

    • Clicky begins at 9s, finishes at 19s
    • GA begins at 21s, finishes at 22s

    Attempt 3:

    • Clicky begins at 11s, finishes at 21s
    • GA begins at 24s, finishes at 25s

    Why this big discrepancy? Google's Javascript component (see task 1 above) is very large and takes 12 seconds to download on my fake-slow machine. However, many sites load ga.js, and once it has been downloaded on any site you visit, you don't need to download it again.

    I tried a second experiment where I visited the Givewell page once to prime the cache, then clicked on the "About" page and measured request timings on that second page.

    Attempt 1:

    • GA starts at 0.7s, ends at 1.3s
    • Clicky starts at 1.2s, ends at 1.6s

    Attempt 2:

    • GA starts at 0.7s, ends at 1.2s
    • Clicky starts at 1.2s, ends at 1.7s

    Attempt 3:

    • GA starts at 0.6s, ends at 1.2s
    • Clicky starts at 1.2s, ends at 1.6s

    So Clicky may have a slight advantage for users loading Givewell without a ga.js file in their cache. However, GA is so prevalent around the web that I don't believe this to be a very big factor. When both files were downloaded and in the user cache, GA was quicker to send tracking requests than Clicky.

    Note: I had to discard several attempts at Experiment 2: in one case I mis-measured, and in another the first page load failed to download the entire ga.js file into the cache. I am reasonably confident, though, that more observations would confirm the data shown above.

  • The Clicky tracking code is installed on some pages where the GA code is not. I don't have the data to test this hypothesis, short of crawling the entire Givewell site and checking for the code.

  • Some visitors may block requests to the Google Analytics tracking domain, but not to Clicky. There are browser plugins that block analytics traffic, for example the Ghostery plugin for Firefox, which has 1.1 million users. As GA is more common than Clicky, presumably more of the people who are interested in blocking trackers would have GA blocked.
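
The "crawl the site and check for the code" idea could be sketched roughly as below. It assumes a page loads a tracker if and only if its HTML contains the script URL mentioned at the top of this document, which is a crude heuristic: a page could load the script some other way and be missed. Fetching the pages is left out so the sketch runs offline; the `pages` mapping and its HTML are made-up sample data, not real Givewell pages.

```python
# Substrings assumed to identify each tracker's snippet in a page's HTML.
TRACKER_MARKERS = {
    "GA": "google-analytics.com/ga.js",
    "Clicky": "static.getclicky.com/js",
}


def trackers_in_html(html):
    """Return the set of trackers whose marker string appears in the HTML."""
    return {name for name, marker in TRACKER_MARKERS.items() if marker in html}


def missing_trackers(pages):
    """Map each URL to the sorted list of trackers it lacks, skipping complete pages.

    `pages` maps a URL to that page's HTML source.
    """
    report = {}
    for url, html in pages.items():
        missing = sorted(set(TRACKER_MARKERS) - trackers_in_html(html))
        if missing:
            report[url] = missing
    return report


# Hypothetical crawl results for illustration.
pages = {
    "/": "<script src='//www.google-analytics.com/ga.js'></script>"
         "<script src='//static.getclicky.com/js'></script>",
    "/about": "<script src='//static.getclicky.com/js'></script>",
}
print(missing_trackers(pages))  # {'/about': ['GA']}
```

Any page that shows up with only GA missing would be counted by Clicky alone, which is the pattern this hypothesis needs.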

Summary

There are lots of reasons that page view counts can differ. I have outlined some above, though I did not find any smoking guns. It is hard to measure web traffic, in the same way that you might take a census of a village or the USA multiple times and come up with different counts each time, or find that individuals differ in how they measure the world. Frankly, looking at the numbers from the two tools, I was surprised that the numbers were so close together, especially in the last six months, and I wouldn't be too worried about the discrepancy between them.
