@remusao
Created February 16, 2019 19:13

Answer to comments on uBlock Origin thread: https://github.com/gorhill/uBlock/commit/5733439f629da948cfc3cae74afa519f6cff7b7f as it seems I do not have permission to comment.

Hi,

First of all, I'd like to personally thank you for all the work you do on uBlock Origin and other extensions, the source code of which has been an inspiration to me many times in the past.

I am also really excited that there are multiple people pushing for more accurate measurements of the efficiency of content-blockers and I think sharing methodologies, data and results is a great start!

It is interesting that the results you obtained diverge from the study published yesterday. If I understand correctly you got similar timings for uBlock Origin itself, but the numbers for Adblock Plus do not seem to match (45µs instead of ~19µs). I'd really like to understand where this difference could come from.

The setup we used for the (synthetic) benchmark was the following:

  1. The version of uBlock Origin we used was commit 29b10d215184aef1a9a12b715b47de9656ecdc3c
  2. The version of Adblock Plus we used was commit 34c49bbf029e586226220c067c50cec6e8bf8842 of the adblockpluscore repository
  3. The code used to run the benchmark for Adblock Plus is the following: https://github.com/cliqz-oss/adblocker/blob/master/bench/comparison/adblockplus.js

We initialized an instance of the CombinedMatcher class using all the network filters (as seems to be the case in the extension), then used the matchesAny method of the matcher as an entry point. Moreover, the parsing of the URLs was performed using tldts and was not included in the measurement. It could be that the parsing and preparation of requests in Adblock Plus is less efficient than in uBlock Origin (which I know is extremely efficient).
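As a rough illustration of the measurement approach described above (requests prepared ahead of time, only the matching call timed), here is a minimal Node.js sketch. Everything in it is an assumption made for illustration: `makeStubMatcher`, `prepareRequests`, `timeMatches` and `median` are hypothetical names, the stub matcher stands in for a real engine, the built-in `URL` class stands in for tldts, and reporting a median is an arbitrary choice. It is not the code used in the study, which is linked above.

```javascript
// Hypothetical stand-in for a blocker's matching entry point.
// A real run would wrap the engine under test (for Adblock Plus, the
// matchesAny method mentioned above); this stub only checks a hostname set.
function makeStubMatcher(blockedHostnames) {
  const blocked = new Set(blockedHostnames);
  return { match: (request) => blocked.has(request.hostname) };
}

// Parse every URL up front so that parsing cost stays out of the timings.
// The built-in URL class is used here purely as a placeholder parser.
function prepareRequests(urls) {
  return urls.map((url) => ({ url, hostname: new URL(url).hostname }));
}

// Time each match call individually and return the durations in microseconds.
function timeMatches(matcher, requests) {
  const timings = [];
  for (const request of requests) {
    const start = process.hrtime.bigint();
    matcher.match(request);
    const end = process.hrtime.bigint();
    timings.push(Number(end - start) / 1000);
  }
  return timings;
}

// Median of a list of numbers (does not mutate the input).
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

const requests = prepareRequests([
  'https://ads.example.com/banner.js',
  'https://www.example.org/index.html',
]);
const timings = timeMatches(makeStubMatcher(['ads.example.com']), requests);
console.log(`median: ${median(timings).toFixed(3)}µs over ${timings.length} requests`);
```

Timing each call separately makes it possible to look at the distribution of timings rather than only a total, at the cost of some timer overhead per request.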

The focus of the study was specifically on the network matching engine of the content-blockers, and it seems likely that other parts of the extensions introduce overhead. That's why I really like the in-browser measurement you have set up in uBlock Origin. In the end I guess all of these can be valuable in some way.

remusao commented Apr 27, 2019

@gorhill, that is an interesting point. I was a bit reluctant to warm up the different blockers too much since some of them implement internal caching in one form or another (Adblock Plus will cache the results of matching for the last N requests, for example). If warming up is applied, should we clear caches before running the benchmark? Should caching be left as is and considered representative enough of a normal workload? Should warming up before benchmarking be considered representative?
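To make the three options concrete (no warm-up, warm-up with caches kept, warm-up with caches cleared), here is a small sketch. It assumes a hypothetical matcher with a result cache and a `clearCache` method; `CachingStubMatcher` and `runScenario` are made-up names and do not reflect the internals of any real blocker. Counting cache hits during the measured pass shows how much of the work each policy lets the cache absorb.

```javascript
// Hypothetical matcher that memoizes its results, loosely mimicking a
// blocker with an internal result cache. Not modeled on any real engine.
class CachingStubMatcher {
  constructor(blockedHostnames) {
    this.blocked = new Set(blockedHostnames);
    this.cache = new Map();
    this.cacheHits = 0;
  }

  match(hostname) {
    if (this.cache.has(hostname)) {
      this.cacheHits += 1;
      return this.cache.get(hostname);
    }
    const result = this.blocked.has(hostname);
    this.cache.set(hostname, result);
    return result;
  }

  clearCache() {
    this.cache.clear();
  }
}

// Run the measured pass under one warm-up policy and report cache hits
// and elapsed time (nanoseconds, as a BigInt) for that pass only.
function runScenario(hostnames, { warmUp, clearCacheAfterWarmUp }) {
  const matcher = new CachingStubMatcher(['ads.example.com']);
  if (warmUp) {
    for (const hostname of hostnames) matcher.match(hostname);
    if (clearCacheAfterWarmUp) matcher.clearCache();
  }
  matcher.cacheHits = 0; // count only hits from the measured pass
  const start = process.hrtime.bigint();
  for (const hostname of hostnames) matcher.match(hostname);
  const elapsedNs = process.hrtime.bigint() - start;
  return { cacheHits: matcher.cacheHits, elapsedNs };
}

const hostnames = ['ads.example.com', 'www.example.org', 'cdn.example.net'];
const cold = runScenario(hostnames, { warmUp: false, clearCacheAfterWarmUp: false });
const warmCached = runScenario(hostnames, { warmUp: true, clearCacheAfterWarmUp: false });
const warmCleared = runScenario(hostnames, { warmUp: true, clearCacheAfterWarmUp: true });
console.log(cold.cacheHits, warmCached.cacheHits, warmCleared.cacheHits);
```

With a dataset of distinct hostnames, only the "warm-up with caches kept" scenario serves the measured pass from the cache, which is why that policy could make a caching blocker look faster than it would be on fresh requests.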

With regard to uBO, would there be a way to trigger the initialization work artificially before benchmarking without having to run through the dataset once?

Thanks for taking the time to look into the benchmark in more detail; your opinions and feedback are much appreciated!

Edit 1: By the way, congrats on the recent optimizations added to uBO.

gorhill commented Apr 27, 2019

@remusao I edited my comment above. Thinking more about it, I now think it's fine to measure the lazy initialization because of the sheer number of URLs being matched in the benchmark. If the lazy initialization ever becomes such an issue that it significantly affects the benchmark timings, then it's the responsibility of the developers to improve that part.
