chutten/e4ccd0d5a46b782bae53
Firefox users' SSE2 support, with detractors grouped by their OS
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### SSE2 on Release Firefox"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Part of the [environment block](http://gecko.readthedocs.org/en/latest/toolkit/components/telemetry/telemetry/environment.html) on Telemetry pings is a list of CPU extensions we detected on the user's machine.\n",
"\n",
"How many Firefox users have SSE2?"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": false,
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Populating the interactive namespace from numpy and matplotlib\n"
]
}
],
"source": [
"import ujson as json\n",
"import matplotlib.pyplot as plt\n",
"import pandas as pd\n",
"import numpy as np\n",
"import plotly.plotly as py\n",
"\n",
"from moztelemetry import get_pings, get_pings_properties, get_one_ping_per_client, get_clients_history\n",
"\n",
"%pylab inline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"I sampled 25% of the pings reported on January 21, 2016. Why the 21st? It was a Thursday, and I never could get the hang of Thursdays."
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"pings = get_pings(sc, app=\"Firefox\", channel=\"release\", submission_date=\"20160121\", fraction=0.25)"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"subset = get_pings_properties(pings, [\"clientId\",\n",
" \"environment/build/version\",\n",
" \"environment/system/os/name\",\n",
" \"environment/system/os/version\",\n",
" \"environment/system/cpu/extensions\"])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To prevent pseudoreplication, let's consider only a single submission for each client. As this step requires a distributed shuffle, it should always be run only after extracting the attributes of interest with *get_pings_properties*."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"collapsed": false,
"scrolled": false
},
"outputs": [],
"source": [
"subset = get_one_ping_per_client(subset)"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"cached = subset.cache()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"How many pings are we looking at?"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"332289"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"cached.count()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Alrighty, time to find out who has SSE2. (Note that the absence of hasSSE2 doesn't necessarily mean that the client lacks SSE2, so this count is only a lower bound on SSE2 support.)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"330737"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"hasSSE2 = cached.filter(lambda p: \"hasSSE2\" in p[\"environment/system/cpu/extensions\"])\n",
"hasSSE2.count()"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"0.9953293669065181"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"hasSSE2.count() * 1.0 / cached.count()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Okay, let's abandon statistical rigour and see what the distribution of OSs is like on the pings that came across without \"hasSSE2\". A missing \"hasSSE2\" doesn't prove that the client lacks SSE2, so this isn't strictly a breakdown of SSE2-less clients by OS version. It might approximate one, though we can't prove that either."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"defaultdict(int,\n",
" {u'Linux3.11.10-29-default': 2,\n",
" u'Linux4.4.0-03lx': 1,\n",
" u'Windows_NT4.0': 4,\n",
" u'Windows_NT5.0': 4,\n",
" u'Windows_NT5.1': 1472,\n",
" u'Windows_NT6.0': 5,\n",
" u'Windows_NT6.1': 64})"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"noSSE2 = cached.filter(lambda p: \"hasSSE2\" not in p[\"environment/system/cpu/extensions\"])\n",
"noSSE2.count()\n",
"version_to_ping = noSSE2.map(lambda p: (p[\"environment/system/os/name\"] + p[\"environment/system/os/version\"], p))\n",
"version_to_ping.countByKey()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.10"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
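
The steps in the notebook (one ping per client, the share of clients reporting hasSSE2, then an OS breakdown of the rest) can be tried out without a Spark cluster. What follows is a minimal plain-Python sketch, not the moztelemetry code the notebook runs: the sample pings and the helpers `one_ping_per_client`, `has_sse2`, and `sse2_summary` are made up for illustration, and treating a missing extensions list as "not reported" is an assumption.

```python
from collections import Counter

EXT = "environment/system/cpu/extensions"
OS_NAME = "environment/system/os/name"
OS_VERSION = "environment/system/os/version"


def make_ping(client_id, os_name, os_version, extensions):
    """Build a hypothetical ping dict shaped like the notebook's subset."""
    return {"clientId": client_id, OS_NAME: os_name,
            OS_VERSION: os_version, EXT: extensions}


# Made-up sample data; client "a" submitted twice.
sample_pings = [
    make_ping("a", "Windows_NT", "6.1", ["hasMMX", "hasSSE", "hasSSE2"]),
    make_ping("a", "Windows_NT", "6.1", ["hasMMX", "hasSSE", "hasSSE2"]),
    make_ping("b", "Windows_NT", "5.1", ["hasMMX"]),
    make_ping("c", "Linux", "4.4.0", None),
    make_ping("d", "Windows_NT", "5.1", ["hasMMX", "hasSSE"]),
]


def one_ping_per_client(pings):
    """Keep the first ping seen for each clientId (an arbitrary choice)."""
    seen = {}
    for ping in pings:
        seen.setdefault(ping["clientId"], ping)
    return list(seen.values())


def has_sse2(ping):
    """True if the ping lists hasSSE2; a missing list counts as not reported."""
    return "hasSSE2" in (ping.get(EXT) or [])


def sse2_summary(pings):
    """Return (share of clients reporting SSE2, OS counts for the rest)."""
    unique = one_ping_per_client(pings)
    without = [p for p in unique if not has_sse2(p)]
    share = (len(unique) - len(without)) / float(len(unique))
    by_os = Counter(p[OS_NAME] + p[OS_VERSION] for p in without)
    return share, by_os


share, by_os = sse2_summary(sample_pings)
print(share)
print(dict(by_os))
```

As in the notebook, the share is only a lower bound on SSE2 support, because a ping without hasSSE2 may still come from a machine that has it.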