import matplotlib.pyplot as plt
import numpy
from scipy import stats as sts

def tippett_plot(lrs_on_target, lrs_off_target):
    lr_sorted_on_target = sorted(lrs_on_target)  # sort for nice plotting
    lr_sorted_off_target = sorted(lrs_off_target)
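The preview above stops after the sorting step. As a rough illustration of what usually comes next, here is a minimal sketch of how the curve for one set of likelihood ratios in a Tippett plot might be computed, assuming the common convention of plotting, for each log10(LR) value, the proportion of LRs greater than or equal to it. The helper name `tippett_curve` is hypothetical and is not taken from the gist; it uses only the standard library.

```python
import bisect
import math


def tippett_curve(lrs):
    """Return (log10_lr, proportion) pairs for one set of likelihood ratios.

    proportion is the share of LRs that are >= the given value
    (an assumed convention; some authors plot the strict '>' instead).
    """
    log_lrs = sorted(math.log10(lr) for lr in lrs)
    n = len(log_lrs)
    # bisect_left gives the number of values strictly below x,
    # so n minus that count is the number of values >= x (ties included).
    return [(x, (n - bisect.bisect_left(log_lrs, x)) / n) for x in log_lrs]


# Illustrative, made-up likelihood ratios
curve = tippett_curve([1, 10, 100, 1000])
proportions = [p for _, p in curve]
print(proportions)  # [1.0, 0.75, 0.5, 0.25]
```

Calling the helper once for the on-target LRs and once for the off-target LRs would give the two curves that are normally drawn on the same axes.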
mlopatka / gist:dc1c75015bd5ce27131913517a4bb28b
Created July 9, 2019 10:35
parsing and operating the alexa list
import numpy as np
from matplotlib import pyplot as plt

a = []
with open('/Users/mlopatka/Documents/backup_downloads/alexa-top-1m.csv', 'rb') as f:
    for line in f:
        if len(line) == 43:
            print(line)
        a.append(len(line)+3)
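The snippet above measures raw line lengths in bytes. A minimal alternative sketch, assuming the file uses the `rank,domain` row layout of the Alexa top-1m CSV, parses each row and measures only the domain name. The helper name `domain_lengths` and the sample rows are hypothetical and for illustration only.

```python
import csv
import io


def domain_lengths(lines):
    """Yield the character length of the domain in each 'rank,domain' row."""
    for row in csv.reader(lines):
        if len(row) >= 2:  # skip blank or malformed rows
            yield len(row[1])


# Made-up sample rows standing in for the real file
sample = io.StringIO("1,google.com\n2,youtube.com\n3,facebook.com\n")
print(list(domain_lengths(sample)))  # [10, 11, 12]
```

To use it on a real file, open the file in text mode (for example `open(path, newline='')`) and pass the file object in place of `sample`.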
mlopatka / gist:99c90fbbbeecfb20efb07a0620b72eae
Created July 9, 2019 10:11
Facebook.com web requests log
200 | GET | www.facebook.com | / | document | html | 125.25 KB | 1.35 MB | 54 ms
200 | GET | static.xx.fbcdn.net | o7MAb-mcGwh.css | stylesheet | css | 1.56 KB (raced) | 3.41 KB | 15 ms
200 | GET
w = spark.sql("""
    select
        submission_date_s3,
        client_id as cid,
        sum(coalesce(scalar_parent_browser_engagement_total_uri_count,0)) as turi,
        case when sum(coalesce(scalar_parent_browser_engagement_total_uri_count,0)) >=5 then 1 else 0 end as adau,
        cast(sum(coalesce(scalar_parent_browser_engagement_total_uri_count,0))/(sum(active_ticks*5.0/3600)) as float) as turihr
    from main_summary
    where submission_date_s3>='20180701' and submission_date_s3<='20180707'
mlopatka / human-web-overview.md
Created October 18, 2017 07:55 — forked from solso/human-web-overview.md
Human Web Overview

HumanWeb Overview

Konark Modi, Alex Catarineu, Philipp Claßen and Josep M. Pujol at Cliqz

München, October 2016 [edited in October 2017]

Motivation

Human Web is a methodology and system developed by Cliqz to collect data from users while protecting their privacy and anonymity.