@rgrove
Created August 9, 2011 23:00
A feeble attempt to improve the HTTP Archive's detection code for YUI
/*
I have a *lot* of issues with the way HTTP Archive detects JavaScript libraries,
but this is a feeble attempt to at least improve its detection of YUI so that it
covers both YUI 3.x and YUI 2.x while minimizing false positives.

I haven't touched the patterns for other libraries, but I think they're very
broken as well. The jQuery pattern, for instance, assumes that any URL
containing the string "jquery" is the jQuery JavaScript library, which means it
will return false positives for plugins ("jquery-pluginname.js"), paths
("/jquery/bogus-file.js"), and a variety of other URLs.

The same problem applies to the patterns for Dojo, Quantcast, Twitter, and
ShareThis.

The original HTTP Archive JS library detection code can be found at:
http://code.google.com/p/httparchive/source/browse/trunk/interesting-images.js#331
*/
// Old code:
$hCond = array();
$hCond["jQuery"] = "rt.url like '%jquery%'";
$hCond["YUI"] = "rt.url like '%/yui/%'";
$hCond["Dojo"] = "rt.url like '%dojo%'";
$hCond["Google Analytics"] = "(rt.url like '%/ga.js%' or rt.url like '%/urchin.js%')";
$hCond["Quantcast"] = "rt.url like '%quant.js%'";
$hCond["AddThis"] = "rt.url like '%addthis.com%'";
$hCond["Facebook"] = "(rt.url like '%facebook.com/plugins/%' or rt.url like '%facebook.com/widgets/%' or rt.url like '%facebook.com/connect/%')";
$hCond["Google +1"] = "rt.url like '%google.com/js/plusone.js%'";
$hCond["Twitter"] = "rt.url like '%twitter%'";
$hCond["ShareThis"] = "rt.url like '%sharethis%'";
// New code (better YUI detection):
$hCond = array();
$hCond["jQuery"] = "rt.url like '%jquery%'";
$hCond["YUI"] = "(rt.url like '%/yui-min.js%' or rt.url like '%/yui.js%' or rt.url like '%/yui-debug.js%' or rt.url like '%/yui-base-min.js%' or rt.url like '%/yui-base.js%' or rt.url like '%/yui-core-min.js%' or rt.url like '%/yui-core.js%' or rt.url like '%/simpleyui.js%' or rt.url like '%/simpleyui-min.js%' or rt.url like '%/yahoo.js%' or rt.url like '%/yahoo-min.js%' or rt.url like '%/yahoo-debug.js%' or rt.url like '%/yahoo-dom-event.js%' or rt.url like '%/yuiloader-dom-event.js%')";
$hCond["Dojo"] = "rt.url like '%dojo%'";
$hCond["Google Analytics"] = "(rt.url like '%/ga.js%' or rt.url like '%/urchin.js%')";
$hCond["Quantcast"] = "rt.url like '%quant.js%'";
$hCond["AddThis"] = "rt.url like '%addthis.com%'";
$hCond["Facebook"] = "(rt.url like '%facebook.com/plugins/%' or rt.url like '%facebook.com/widgets/%' or rt.url like '%facebook.com/connect/%')";
$hCond["Google +1"] = "rt.url like '%google.com/js/plusone.js%'";
$hCond["Twitter"] = "rt.url like '%twitter%'";
$hCond["ShareThis"] = "rt.url like '%sharethis%'";
@manuel-jasso

Do you think it is enough to detect YUI seed files only? How about detecting '%/yui.yahooapis.com/%' for any YUI file, at least coming from Yahoo's CDN?

@rgrove
Author

rgrove commented Aug 9, 2011

Not everyone loads YUI from the Yahoo! CDN. Detecting only seed files ensures that we don't treat every YUI module as a single use of YUI, which would be unfair since there could be multiple modules requested on a single page.
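
To illustrate the counting concern, here is a minimal sketch that emulates `LIKE '%...%'` with a substring test and compares the old `/yui/` pattern against a few of the seed-file patterns. The request URLs and the helper names (`urlContains`, `countMatches`) are hypothetical, not taken from HTTP Archive, and the case-insensitive comparison is an assumption about the database collation.

```php
<?php
// Emulates SQL "url LIKE '%needle%'" with a substring test.
// Case-insensitive matching is an assumption about the database collation.
function urlContains($url, $needle) {
    return stripos($url, $needle) !== false;
}

// Counts how many requests match at least one of the given substrings.
function countMatches(array $urls, array $needles) {
    $count = 0;
    foreach ($urls as $url) {
        foreach ($needles as $needle) {
            if (urlContains($url, $needle)) {
                $count++;
                break; // count each request at most once
            }
        }
    }
    return $count;
}

// Hypothetical requests made by a single page (not real HTTP Archive data).
$requests = array(
    "http://example.com/static/yui/3.4.0/build/yui/yui-min.js",
    "http://example.com/static/yui/3.4.0/build/node/node-min.js",
    "http://example.com/static/yui/3.4.0/build/event/event-min.js",
);

$oldCount = countMatches($requests, array("/yui/"));
$newCount = countMatches($requests, array("/yui-min.js", "/yui.js", "/simpleyui-min.js"));

echo "old pattern: $oldCount, seed-file patterns: $newCount\n";
```

With these made-up URLs, the old pattern matches all three requests, while the seed-file patterns match only the seed.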

@manuel-jasso

Fair enough. So to be accurate, this snippet would have to be adjusted to detect the seed file(s) for the other libraries, which you mentioned already. Thanks!

@lsmith

lsmith commented Aug 10, 2011

New with 3.4.0:

or rt.url like '%/yui-core-min.js%' or rt.url like '%/yui-core.js%'
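
Since the list of seed files grows over time, one option is to build the YUI condition from an array of filenames so that adding a file is a one-line change. This is only a sketch: `buildLikeCondition` is a hypothetical helper, not part of the HTTP Archive code, and it does no escaping, so it is suitable only for trusted, hard-coded filenames like the ones in the gist.

```php
<?php
// Seed filenames taken from the patterns in the gist's "new code" section.
$yuiSeedFiles = array(
    "yui-min.js", "yui.js", "yui-debug.js",
    "yui-base-min.js", "yui-base.js",
    "yui-core-min.js", "yui-core.js",
    "simpleyui.js", "simpleyui-min.js",
    "yahoo.js", "yahoo-min.js", "yahoo-debug.js",
    "yahoo-dom-event.js", "yuiloader-dom-event.js",
);

// Hypothetical helper: turns filenames into
// "(rt.url like '%/a.js%' or rt.url like '%/b.js%')".
// No escaping is done, so pass only trusted constant strings.
function buildLikeCondition(array $filenames, $column = "rt.url") {
    $clauses = array();
    foreach ($filenames as $filename) {
        $clauses[] = "$column like '%/$filename%'";
    }
    return "(" . implode(" or ", $clauses) . ")";
}

$hCond = array();
$hCond["YUI"] = buildLikeCondition($yuiSeedFiles);

echo $hCond["YUI"], "\n";
```

The generated string has the same shape as the hand-written condition, with one `like` clause per filename joined by `or`.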