@chutten
Created Jan 29, 2016
SSE2 support in Firefox Desktop release-channel users on January 21, 2016.
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### SSE2 on Release Firefox"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Part of the [environment block](http://gecko.readthedocs.org/en/latest/toolkit/components/telemetry/telemetry/environment.html) on Telemetry pings is a list of CPU extensions we detected on the user's machine.\n",
"\n",
"How many Firefox users have SSE2?"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false,
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Unable to parse whitelist (/home/hadoop/anaconda2/lib/python2.7/site-packages/moztelemetry/bucket-whitelist.json). Assuming all histograms are acceptable.\n",
"Populating the interactive namespace from numpy and matplotlib\n"
]
}
],
"source": [
"import ujson as json\n",
"import matplotlib.pyplot as plt\n",
"import pandas as pd\n",
"import numpy as np\n",
"import plotly.plotly as py\n",
"\n",
"from moztelemetry import get_pings, get_pings_properties, get_one_ping_per_client, get_clients_history\n",
"\n",
"%pylab inline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"I sampled 20% of the pings submitted on January 21, 2016. Why the 21st? It was a Thursday, and I never could get the hang of Thursdays."
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"pings = get_pings(sc, app=\"Firefox\", channel=\"release\", submission_date=\"20160121\", fraction=0.2)"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"subset = get_pings_properties(pings, [\"clientId\",\n",
" \"environment/build/version\",\n",
" \"environment/system/cpu/extensions\"])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To prevent pseudoreplication, let's consider only a single submission for each client. As this step requires a distributed shuffle, it should only be run after extracting the attributes of interest with *get_pings_properties*, so that less data has to be shuffled."
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"collapsed": false,
"scrolled": false
},
"outputs": [],
"source": [
"subset = get_one_ping_per_client(subset)"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"cached = subset.cache()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"How many pings are we looking at?"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"264898"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"cached.count()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Alrighty, time to find out who has SSE2. (Note that the absence of hasSSE2 doesn't necessarily mean that the client doesn't have SSE2. This just puts a lower bound on things.)"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"263612"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"hasSSE2 = cached.filter(lambda p: \"hasSSE2\" in (p[\"environment/system/cpu/extensions\"] or []))\n",
"hasSSE2.count()"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"0.9951453012102771"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"hasSSE2.count() * 1.0 / cached.count()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Okay, let's abandon statistical rigour and look at the distribution of versions among the pings that came across without \"hasSSE2\". A missing flag doesn't prove that a client lacks SSE2, so this isn't strictly a breakdown of the versions run by SSE2-less clients. It might approximate one, though we can't prove that either."
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"defaultdict(int,\n",
" {u'40.0.2': 3,\n",
" u'40.0.3': 10,\n",
" u'41.0': 2,\n",
" u'41.0.1': 8,\n",
" u'41.0.2': 30,\n",
" u'42.0': 106,\n",
" u'43.0': 3,\n",
" u'43.0.1': 147,\n",
" u'43.0.2': 12,\n",
" u'43.0.3': 4,\n",
" u'43.0.4': 961})"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"noSSE2 = cached.filter(lambda p: \"hasSSE2\" not in (p[\"environment/system/cpu/extensions\"] or []))\n",
"# Key each ping by its build version, then count the pings per version.\n",
"version_to_ping = noSSE2.map(lambda p: (p[\"environment/build/version\"], p))\n",
"version_to_ping.countByKey()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.10"
}
},
"nbformat": 4,
"nbformat_minor": 0
}