Saptarshi Guha saptarshiguha

saptarshiguha / f.sql
Created November 9, 2019 00:26
c.sql
CREATE OR REPLACE FUNCTION analysis.sg_histogram_aggregate(r ARRAY<STRUCT<v STRING>>) RETURNS ARRAY<STRUCT<k STRING, v INT64>>
LANGUAGE js AS """
// Called as sg_histogram_aggregate(ARRAY_AGG(STRUCT(JSON_EXTRACT(histogram, '$.values') AS v))),
// where `histogram` stands for any JSON histogram column.
var arrayLength = r.length;
if( arrayLength == 0) {
return( [] );
}
var d = 0;
var accum = { };
for (var i = 0; i < arrayLength; i++) {
output:
  html_document:
    keep_md: true

Approaches to Estimating Sample Sizes and Power

Background

As a data scientist at Mozilla, one of your responsibilities will be to analyze

w=spark.sql("""
select
submission_date_s3,
client_id as cid,
sum(coalesce(scalar_parent_browser_engagement_total_uri_count,0)) as turi,
case when sum(coalesce(scalar_parent_browser_engagement_total_uri_count,0)) >=5 then 1 else 0 end as adau,
cast(sum(coalesce(scalar_parent_browser_engagement_total_uri_count,0))/(sum(active_ticks*5.0/3600)) as float) as turihr
from main_summary
where submission_date_s3>='20180701' and submission_date_s3<='20180707'
(custom-set-variables '(package-archives
(quote
(
;; ("marmalade" . "http://marmalade-repo.org/packages/")
("melpa" . "http://melpa.org/packages/")
("gnu" . "http://elpa.gnu.org/packages/")))))
(setq package-enable-at-startup nil)
(package-initialize)
---
title: Detecting Changes in Histograms
description: |
  Detecting Useful Changes in Histograms with small to many bins
author:
  - name: Saptarshi Guha
    affiliation: Product Metrics
date: "`r Sys.Date()`"
output:
  radix::radix_article:
---
title: Have we missed profiles on ESR and Linux?
author: Saptarshi Guha <joy@mozilla.com>
date: "`r format(Sys.time(), '%H:%M %B %d, %Y',tz='America/Los_Angeles',usetz=TRUE)`"
output:
  html_document:
    mathjax: default
    self_contained: false
    theme: readable
    highlight: haddock
```{r}
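# Assumed imports: fromJSON(file=) matches the rjson package; rbindlist() and
# data.table() come from data.table.
library(rjson)
library(data.table)
# Build a table of major Firefox releases with the date range each one was current.
releases <- local({
  x <- fromJSON(file='https://product-details.mozilla.org/1.0/firefox_history_major_releases.json')
  f <- rbindlist(Map(function(a, b) {
    data.table(version=a, from=b)
  }, names(x), x))
  f$from <- as.Date(f$from)
  # Each release ends the day before the next one starts; the latest release
  # gets a placeholder end date.
  f$to <- c(tail(f$from, -1) - 1, as.Date('2025-01-01'))
  f
})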
## For Session Hours Per Day Per Profile
### For profiles on 56
with a as (
select client_id, submission_date_s3, sum(subsession_length)/3600 as hours
from main_summary
where app_name='Firefox'
and normalized_channel='release'
and substring(app_version,1,2)='56'
## For Session Hours Per Day Per Profile
### For profiles on 56
with a as (
select client_id, submission_date_s3, sum(subsession_length)/3600 as hours
from main_summary
where app_name='Firefox'
and substring(app_version,1,2)='56'
and submission_date_s3 >= '20170925'