How tracking works.
Let's start with a first request.
→ http HEAD http://t.co/b6O9bSv5XO
HTTP/1.1 301 Moved Permanently
cache-control: private,max-age=300
content-length: 0
date: Wed, 17 Sep 2014 21:30:09 UTC
expires: Wed, 17 Sep 2014 21:35:09 GMT
location: http://trib.al/Lp10bGH
server: tsa_a
set-cookie: muc=bee6c71e-25e4-4652-8cec-e04bd00d4339; Expires=Mon, 29 Aug 2016 21:30:09 GMT; Domain=t.co
x-connection-hash: 6dc8fbd4b7f6288e6448551b9b2c5788
Let's follow.
→ http HEAD http://trib.al/Lp10bGH
HTTP/1.1 301 Moved Permanently
Connection: keep-alive
Content-Type: text/html;charset=utf-8
Content-length: 497
Date: Wed, 17 Sep 2014 21:30:24 GMT
Location: http://rss.feedsportal.com/c/499/f/413865/s/3e908114/sc/37/l/0L0Slesechos0Bfr0Ctech0Emedias0Chightech0C0A20A37836387270Edeux0Eans0Eapres0Ele0Etriste0Ebilan0Edu0Ecloud0Efrancais0E10A438330Bphp0Dxtor0FRSS0E20A65/story01.htm
Server: nginx/1.4.4
Set-Cookie: tribal="F2KiViWNQzm729I0Kweh8w=="; expires=Fri, 06 Jan 2034 23:13:40 GMT; Path=/; Version=2
Following the thread, my excitement starts to…
→ http HEAD http://rss.feedsportal.com/c/499/f/413865/s/3e908114/sc/37/l/0L0Slesechos0Bfr0Ctech0Emedias0Chightech0C0A20A37836387270Edeux0Eans0Eapres0Ele0Etriste0Ebilan0Edu0Ecloud0Efrancais0E10A438330Bphp0Dxtor0FRSS0E20A65/story01.htm
HTTP/1.1 301 OK
Connection: close
Content-Length: 0
Content-Type: text/plain; charset=iso-8859-1
Date: Wed, 17 Sep 2014 21:30:53 GMT
Location: http://da.feedsportal.com/c/499/f/413865/s/3e908114/l/0L0Slesechos0Bfr0Ctech0Emedias0Chightech0C0A20A37836387270Edeux0Eans0Eapres0Ele0Etriste0Ebilan0Edu0Ecloud0Efrancais0E10A438330Bphp0Dxtor0FRSS0E20A65/ia1.htm
Server: FeedsPortal
Set-Cookie: MF2=17kyltn1q1dka; domain=.feedsportal.com; expires=Fri, 16-Sep-16 21:30:54 GMT; path=/
OK, on the road again.
→ http HEAD http://da.feedsportal.com/c/499/f/413865/s/3e908114/l/0L0Slesechos0Bfr0Ctech0Emedias0Chightech0C0A20A37836387270Edeux0Eans0Eapres0Ele0Etriste0Ebilan0Edu0Ecloud0Efrancais0E10A438330Bphp0Dxtor0FRSS0E20A65/ia1.htm
HTTP/1.1 301 OK
Connection: close
Content-Length: 0
Content-Type: text/plain; charset=iso-8859-1
Date: Wed, 17 Sep 2014 21:31:26 GMT
Location: http://www.lesechos.fr/tech-medias/hightech/0203783638727-deux-ans-apres-le-triste-bilan-du-cloud-francais-1043833.php?xtor=RSS-2065
Server: FeedsPortal
Really!?
→ http HEAD http://www.lesechos.fr/tech-medias/hightech/0203783638727-deux-ans-apres-le-triste-bilan-du-cloud-francais-1043833.php?xtor=RSS-2065
HTTP/1.1 200 OK
Cache-Control: no-cache, must-revalidate
Content-Encoding: gzip
Content-Length: 20
Content-Type: text/html
Date: Wed, 17 Sep 2014 21:31:41 GMT
Expires: 0
Last-Modified: Wed, 17 Sep 2014 21:31:41 GMT
Pragma: no-cache
Server: Apache
Set-Cookie: lastpage=%2Ftech-medias%2Fhightech; expires=Wed, 17-Sep-2014 21:37:41 GMT; path=/; domain=.lesechos.fr
Set-Cookie: pw201409=MQ%3D%3D%7C4c05c42a93e25ec85a1bd30e30f036fd; expires=Sat, 18-Oct-2014 21:31:41 GMT; path=/; domain=lesechos.fr
Vary: Accept-Encoding
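The hop-by-hop chase above can be automated. The following is a minimal sketch, not the tool used in this post: it sends one HEAD request per hop, does not follow redirects automatically, and records every Set-Cookie header met on the way. The names `head`, `trace` and `REDIRECT_CODES` are made up for this illustration, and the `fetch` parameter exists only so that the logic can be exercised without touching the network.

```python
import http.client
from urllib.parse import urlsplit, urljoin

# Status codes treated as redirects in this sketch.
REDIRECT_CODES = {301, 302, 303, 307, 308}


def head(url):
    """Send a single HEAD request (no automatic redirect following).

    Returns (status, headers) where headers is a list of (name, value) pairs.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheaders()
    finally:
        conn.close()


def trace(url, fetch=head, max_hops=10):
    """Follow Location headers one hop at a time, collecting cookies."""
    hops = []
    for _ in range(max_hops):
        status, headers = fetch(url)
        cookies = [v for k, v in headers if k.lower() == "set-cookie"]
        location = next((v for k, v in headers if k.lower() == "location"), None)
        hops.append({"url": url, "status": status, "cookies": cookies})
        if status not in REDIRECT_CODES or location is None:
            break
        # Location may be relative; resolve it against the current URL.
        url = urljoin(url, location)
    return hops

# Example call (needs network access, and the 2014 chain may no longer exist):
#     for hop in trace("http://t.co/b6O9bSv5XO"):
#         print(hop["status"], hop["url"], len(hop["cookies"]), "cookie(s)")
```

Each entry in the returned list corresponds to one of the request/response pairs shown above, so the number of parties able to set a cookie before the article loads is simply the length of the list.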
Phew… done. Hmm… wait, Content-Length: 20.
Let's make a GET instead of a HEAD.
→ http --print hH GET http://www.lesechos.fr/tech-medias/hightech/0203783638727-deux-ans-apres-le-triste-bilan-du-cloud-francais-1043833.php?xtor=RSS-2065
GET /tech-medias/hightech/0203783638727-deux-ans-apres-le-triste-bilan-du-cloud-francais-1043833.php?xtor=RSS-2065 HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Host: www.lesechos.fr
User-Agent: HTTPie/0.8.0
Ah, gotcha. Their servers are misconfigured and cannot send the same headers for HEAD and GET.
HTTP/1.1 200 OK
Cache-Control: no-cache, must-revalidate
Content-Encoding: gzip
Content-Length: 23215
Content-Type: text/html
Date: Wed, 17 Sep 2014 21:38:07 GMT
Expires: 0
Last-Modified: Wed, 17 Sep 2014 21:38:07 GMT
Pragma: no-cache
Server: Apache
Set-Cookie: lastpage=%2Ftech-medias%2Fhightech; expires=Wed, 17-Sep-2014 21:44:07 GMT; path=/; domain=.lesechos.fr
Set-Cookie: pw201409=MQ%3D%3D%7C34d96bd36ac0c280beb40f2a6c2c9a94; expires=Sat, 18-Oct-2014 21:38:07 GMT; path=/; domain=lesechos.fr
Vary: Accept-Encoding
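The HEAD/GET mismatch is easier to spot with a small comparison than by eye. Here is a minimal sketch under a few assumptions: the helper name `differing_headers` and the set of headers treated as volatile (`VOLATILE`) are choices made for this illustration, and the header values are copied by hand from the two lesechos.fr responses above.

```python
# Headers expected to change between any two requests, so not worth reporting.
VOLATILE = {"date", "last-modified", "set-cookie"}


def differing_headers(head_headers, get_headers, ignore=VOLATILE):
    """Return {name: (head_value, get_value)} for headers that differ."""
    a = {k.lower(): v for k, v in head_headers.items()}
    b = {k.lower(): v for k, v in get_headers.items()}
    diff = {}
    for name in sorted(set(a) | set(b)):
        if name in ignore:
            continue
        if a.get(name) != b.get(name):
            diff[name] = (a.get(name), b.get(name))
    return diff


# Values shared by both responses shown above.
COMMON = {
    "Cache-Control": "no-cache, must-revalidate",
    "Content-Encoding": "gzip",
    "Content-Type": "text/html",
    "Expires": "0",
    "Pragma": "no-cache",
    "Server": "Apache",
    "Vary": "Accept-Encoding",
}
HEAD_RESPONSE = {**COMMON, "Content-Length": "20",
                 "Date": "Wed, 17 Sep 2014 21:31:41 GMT"}
GET_RESPONSE = {**COMMON, "Content-Length": "23215",
                "Date": "Wed, 17 Sep 2014 21:38:07 GMT"}

print(differing_headers(HEAD_RESPONSE, GET_RESPONSE))
# → {'content-length': ('20', '23215')}
```

Only Content-Length differs once the volatile headers are set aside, which matches what the two transcripts show.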
What about the content?
- 456 HTTP requests
- 47.11 s
- 8,149 KB
The body of the page is 23,215 characters, but this is before the scripts kick in and modify the DOM.
- 23,215: HTTP raw body
- 128,136: after DOM initialization
- 5,781: useful content of the article (text+html)
Which means:
- 4.5% of the page, once the DOM is initialized, is the article content.
- 0.07% of the total transferred content (all bytes from every HTTP request made for this single article) is the article content.
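For the record, the two percentages can be recomputed from the figures given above. This is a small sketch with two assumptions that are mine: 1 KB is taken as 1,000 bytes, and the 5,781 count for the article is compared directly with byte totals, as the text does. The variable names are illustrative.

```python
# Figures quoted in the text above.
ARTICLE = 5_781          # useful content of the article (text+html)
DOM = 128_136            # page size after DOM initialization
TRANSFERRED_KB = 8_149   # total transferred over all HTTP requests

# Assumption: 1 KB = 1,000 bytes. With 1,024 the rounded result is the same.
BYTES_PER_KB = 1_000


def share(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


dom_share = share(ARTICLE, DOM)
total_share = share(ARTICLE, TRANSFERRED_KB * BYTES_PER_KB)

print(f"{dom_share:.1f}% of the DOM")                    # → 4.5% of the DOM
print(f"{total_share:.2f}% of everything transferred")   # → 0.07% of everything transferred
```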