
@karlcow
Last active August 29, 2015 14:06
Tracked by redirections from a simple tweet.

How the tracking works.

Let's start with a first request.

→ http HEAD http://t.co/b6O9bSv5XO
HTTP/1.1 301 Moved Permanently
cache-control: private,max-age=300
content-length: 0
date: Wed, 17 Sep 2014 21:30:09 UTC
expires: Wed, 17 Sep 2014 21:35:09 GMT
location: http://trib.al/Lp10bGH
server: tsa_a
set-cookie: muc=bee6c71e-25e4-4652-8cec-e04bd00d4339; Expires=Mon, 29 Aug 2016 21:30:09 GMT; Domain=t.co
x-connection-hash: 6dc8fbd4b7f6288e6448551b9b2c5788

Let's follow.

→ http HEAD http://trib.al/Lp10bGH
HTTP/1.1 301 Moved Permanently
Connection: keep-alive
Content-Type: text/html;charset=utf-8
Content-length: 497
Date: Wed, 17 Sep 2014 21:30:24 GMT
Location: http://rss.feedsportal.com/c/499/f/413865/s/3e908114/sc/37/l/0L0Slesechos0Bfr0Ctech0Emedias0Chightech0C0A20A37836387270Edeux0Eans0Eapres0Ele0Etriste0Ebilan0Edu0Ecloud0Efrancais0E10A438330Bphp0Dxtor0FRSS0E20A65/story01.htm
Server: nginx/1.4.4
Set-Cookie: tribal="F2KiViWNQzm729I0Kweh8w=="; expires=Fri, 06 Jan 2034 23:13:40 GMT; Path=/; Version=2
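
To get a feel for how long such a tracking cookie lives, the `Date` and `expires` values from the response above can be subtracted. This is only a small sketch: `cookie_lifetime` is a helper name made up for this note, and it assumes both values are in the RFC 822 style date format that `email.utils.parsedate_to_datetime` understands.

```python
from email.utils import parsedate_to_datetime


def cookie_lifetime(date_header, expires_value):
    """Return the time between the response Date and the cookie's expiry.

    Hypothetical helper, not part of any library. Both arguments are
    HTTP date strings such as 'Wed, 17 Sep 2014 21:30:24 GMT'.
    """
    return parsedate_to_datetime(expires_value) - parsedate_to_datetime(date_header)


# Values copied from the trib.al response above.
lifetime = cookie_lifetime(
    "Wed, 17 Sep 2014 21:30:24 GMT",
    "Fri, 06 Jan 2034 23:13:40 GMT",
)
print(lifetime.days)                      # 7051
print(round(lifetime.days / 365.25, 1))   # 19.3 (years, approximately)
```

So a single click on a shortened link leaves a cookie that is meant to last roughly 19 years.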

Following the thread, my excitement starts to…

→ http HEAD http://rss.feedsportal.com/c/499/f/413865/s/3e908114/sc/37/l/0L0Slesechos0Bfr0Ctech0Emedias0Chightech0C0A20A37836387270Edeux0Eans0Eapres0Ele0Etriste0Ebilan0Edu0Ecloud0Efrancais0E10A438330Bphp0Dxtor0FRSS0E20A65/story01.htm
HTTP/1.1 301 OK
Connection: close
Content-Length: 0
Content-Type: text/plain; charset=iso-8859-1
Date: Wed, 17 Sep 2014 21:30:53 GMT
Location: http://da.feedsportal.com/c/499/f/413865/s/3e908114/l/0L0Slesechos0Bfr0Ctech0Emedias0Chightech0C0A20A37836387270Edeux0Eans0Eapres0Ele0Etriste0Ebilan0Edu0Ecloud0Efrancais0E10A438330Bphp0Dxtor0FRSS0E20A65/ia1.htm
Server: FeedsPortal
Set-Cookie: MF2=17kyltn1q1dka; domain=.feedsportal.com; expires=Fri, 16-Sep-16 21:30:54 GMT; path=/

OK, on the road again.

→ http HEAD http://da.feedsportal.com/c/499/f/413865/s/3e908114/l/0L0Slesechos0Bfr0Ctech0Emedias0Chightech0C0A20A37836387270Edeux0Eans0Eapres0Ele0Etriste0Ebilan0Edu0Ecloud0Efrancais0E10A438330Bphp0Dxtor0FRSS0E20A65/ia1.htm
HTTP/1.1 301 OK
Connection: close
Content-Length: 0
Content-Type: text/plain; charset=iso-8859-1
Date: Wed, 17 Sep 2014 21:31:26 GMT
Location: http://www.lesechos.fr/tech-medias/hightech/0203783638727-deux-ans-apres-le-triste-bilan-du-cloud-francais-1043833.php?xtor=RSS-2065
Server: FeedsPortal

Really!?

→ http HEAD http://www.lesechos.fr/tech-medias/hightech/0203783638727-deux-ans-apres-le-triste-bilan-du-cloud-francais-1043833.php?xtor=RSS-2065
HTTP/1.1 200 OK
Cache-Control: no-cache, must-revalidate
Content-Encoding: gzip
Content-Length: 20
Content-Type: text/html
Date: Wed, 17 Sep 2014 21:31:41 GMT
Expires: 0
Last-Modified: Wed, 17 Sep 2014 21:31:41 GMT
Pragma: no-cache
Server: Apache
Set-Cookie: lastpage=%2Ftech-medias%2Fhightech; expires=Wed, 17-Sep-2014 21:37:41 GMT; path=/; domain=.lesechos.fr
Set-Cookie: pw201409=MQ%3D%3D%7C4c05c42a93e25ec85a1bd30e30f036fd; expires=Sat, 18-Oct-2014 21:31:41 GMT; path=/; domain=lesechos.fr
Vary: Accept-Encoding
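
The manual hop-by-hop walk above could be automated along these lines. This is a rough sketch, not what was used for the requests in this note: `trace_redirects` and `http_head` are names invented here, and the fetching function is passed in as a parameter so that the tracing logic can be tried without touching the network.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def trace_redirects(url, head, max_hops=10):
    """Follow Location headers one hop at a time.

    `head` is any callable taking a URL and returning (status, headers_dict).
    Returns one record per hop with the URL, the status and the Set-Cookie value.
    """
    hops = []
    for _ in range(max_hops):
        status, headers = head(url)
        # Header names are case-insensitive, so normalize them.
        headers = {name.lower(): value for name, value in headers.items()}
        hops.append({
            "url": url,
            "status": status,
            "set_cookie": headers.get("set-cookie"),
        })
        location = headers.get("location")
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    return hops


def http_head(url):
    """Send one HEAD request with the standard library, without following redirects.

    Simplification: dict() keeps only the last value when a header such as
    Set-Cookie is sent several times.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, dict(response.getheaders())
    finally:
        conn.close()


# Usage (performs real network requests, and the 2014 URLs may no longer resolve):
# for hop in trace_redirects("http://t.co/b6O9bSv5XO", http_head):
#     print(hop["status"], hop["url"], hop["set_cookie"])
```

Each hop that answers with a `Set-Cookie` header is one more party that gets to recognize the reader later.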

Phew… done. Hmm… wait, Content-Length: 20?

Let's make a GET instead of a HEAD.

→ http --print hH GET http://www.lesechos.fr/tech-medias/hightech/0203783638727-deux-ans-apres-le-triste-bilan-du-cloud-francais-1043833.php?xtor=RSS-2065

GET /tech-medias/hightech/0203783638727-deux-ans-apres-le-triste-bilan-du-cloud-francais-1043833.php?xtor=RSS-2065 HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Host: www.lesechos.fr
User-Agent: HTTPie/0.8.0

Ah, gotcha. Their servers are misconfigured: they do not send the same headers for HEAD and GET. The HEAD response announced Content-Length: 20, the GET response announces 23215.

HTTP/1.1 200 OK
Cache-Control: no-cache, must-revalidate
Content-Encoding: gzip
Content-Length: 23215
Content-Type: text/html
Date: Wed, 17 Sep 2014 21:38:07 GMT
Expires: 0
Last-Modified: Wed, 17 Sep 2014 21:38:07 GMT
Pragma: no-cache
Server: Apache
Set-Cookie: lastpage=%2Ftech-medias%2Fhightech; expires=Wed, 17-Sep-2014 21:44:07 GMT; path=/; domain=.lesechos.fr
Set-Cookie: pw201409=MQ%3D%3D%7C34d96bd36ac0c280beb40f2a6c2c9a94; expires=Sat, 18-Oct-2014 21:38:07 GMT; path=/; domain=lesechos.fr
Vary: Accept-Encoding
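
The HEAD/GET mismatch can be spotted mechanically by comparing the two sets of headers. The sketch below assumes the headers are already available as dictionaries; `diff_headers` is a name invented here, and the default list of ignored headers (the ones expected to change from one request to the next) is a judgment call rather than a rule. For reference, RFC 7231 says a server should send the same header fields for HEAD as it would for GET, although some payload header fields may be omitted.

```python
def diff_headers(head_headers, get_headers,
                 ignore=("date", "last-modified", "set-cookie")):
    """Return {header: (head_value, get_value)} for headers that differ.

    Hypothetical helper. Header names are compared case-insensitively,
    and headers listed in `ignore` are skipped because they are expected
    to vary between two requests.
    """
    head = {name.lower(): value for name, value in head_headers.items()}
    get = {name.lower(): value for name, value in get_headers.items()}
    differences = {}
    for name in sorted(set(head) | set(get)):
        if name in ignore:
            continue
        if head.get(name) != get.get(name):
            differences[name] = (head.get(name), get.get(name))
    return differences


# A subset of the headers from the two www.lesechos.fr responses above.
head_response = {
    "Cache-Control": "no-cache, must-revalidate",
    "Content-Encoding": "gzip",
    "Content-Length": "20",
    "Content-Type": "text/html",
    "Date": "Wed, 17 Sep 2014 21:31:41 GMT",
}
get_response = {
    "Cache-Control": "no-cache, must-revalidate",
    "Content-Encoding": "gzip",
    "Content-Length": "23215",
    "Content-Type": "text/html",
    "Date": "Wed, 17 Sep 2014 21:38:07 GMT",
}

mismatch = diff_headers(head_response, get_response)
print(mismatch)  # {'content-length': ('20', '23215')}
```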

What about the content?

  • 456 HTTP requests
  • 47.11 s
  • 8,149 KB

The body of the page is 23,215 chars, but this is before the scripts kick in and modify the DOM.

  •  23,215 chars: raw HTTP body
  • 128,136 chars: after DOM initialization
  •   5,781 chars: useful content of the article (text + HTML)

Which means:

  • The article content is 4.5% of the page once the DOM is initialized.
  • The article content is 0.07% of the total transferred content (counting every byte requested over HTTP for this single article).
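
The two percentages can be rechecked from the figures listed above. One assumption in this sketch: 1 KB is taken as 1,000 bytes (with 1,024 the second result still rounds to 0.07). The variable and function names are only illustrative.

```python
# Figures taken from the measurements above.
dom_size = 128_136        # page size after DOM initialization
article_size = 5_781      # useful article content (text + HTML)
transferred_kb = 8_149    # total transferred over 456 HTTP requests

BYTES_PER_KB = 1_000      # assumption: decimal kilobytes


def share(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100 * part / whole


dom_share = share(article_size, dom_size)
transfer_share = share(article_size, transferred_kb * BYTES_PER_KB)

print(round(dom_share, 1))       # 4.5
print(round(transfer_share, 2))  # 0.07
```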