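def deduplicate(list_of_objects, key_function):
    """Return the first object seen for each distinct key."""
    uniques = dict()
    for o in list_of_objects:
        key = key_function(o)
        # keep only the first object that produces this key
        if key not in uniques:
            uniques[key] = o
    # list() gives a real list on Python 3, where values() is a view
    return list(uniques.values())
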
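A short usage sketch may help show what `deduplicate` keeps. The sample records and the `id` key below are hypothetical, and the function is repeated only so the example runs on its own. It assumes Python 3.7 or later, where dicts preserve insertion order.

```python
def deduplicate(list_of_objects, key_function):
    uniques = {}
    for o in list_of_objects:
        key = key_function(o)
        if key not in uniques:
            uniques[key] = o
    return list(uniques.values())


# Hypothetical sample data: two records share the same id.
editors = [
    {"id": 1, "name": "alice"},
    {"id": 2, "name": "bob"},
    {"id": 1, "name": "alice (duplicate)"},
]

# Deduplicate on the id field; the first record per id is kept.
unique_editors = deduplicate(editors, lambda e: e["id"])
print(unique_editors)
```

The first object seen for each key is kept and later ones with the same key are dropped, so the order of the input decides which duplicate survives.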
# set u and p to the proper username and password
username=u
password=p
htuser=u
htpass=p
curl --data "username=$username&password=$password" "https://$htuser:$htpass@metrics.wikimedia.org/login" -c ~/umapi.session
for cohort in test e2_aft5_cta4 e3_ob2b_gettingstarted_page-impression e3_ob4b_gettingstarted-addlinks_page-impression e3_ob4b_gettingstarted-clarify_page-impression e3_ob4b_gettingstarted-copyedit_page-impression
do

/srv/debugging.wmflabs.org/
/srv/dev-reportcard.wmflabs.org/
/srv/ee-dashboard.wmflabs.org/
/srv/gerrit-stats.wmflabs.org/
/srv/gp.wmflabs.org/
/srv/mobile-reportcard-dev.wmflabs.org/
/srv/mobile-reportcard.wmflabs.org/
/srv/test-reportcard.wmflabs.org/

REGISTER 'kraken-pig-0.0.2-SNAPSHOT.jar'
REGISTER 'kraken-generic-0.0.2-SNAPSHOT-jar-with-dependencies.jar'
REGISTER 'geoip-1.2.5.jar'
IMPORT 'include/load_webrequest.pig';
SET default_parallel 2;
DEFINE TO_HOUR org.wikimedia.analytics.kraken.pig.ConvertDateFormat('yyyy-MM-dd\'T\'HH:mm:ss', 'yyyy-MM-dd_HH');
DEFINE EXTRACT org.apache.pig.builtin.REGEX_EXTRACT_ALL();
DEFINE ZERO org.wikimedia.analytics.kraken.pig.Zero();
LOG_FIELDS = LOAD_WEBREQUEST('/wmf/raw/webrequest/webrequest-wikipedia-mobile/dt=2013-05-01*');
LOG_FIELDS = FILTER LOG_FIELDS BY (x_cs != '-');

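The `TO_HOUR` definition above appears to truncate a timestamp to its hour by converting between two date patterns. A minimal Python sketch of that conversion follows, assuming the two arguments to `ConvertDateFormat` are Java-style input and output patterns; the function name `to_hour` and the sample timestamp are illustrative, not part of the Pig script.

```python
from datetime import datetime


def to_hour(timestamp):
    """Convert 'yyyy-MM-ddTHH:mm:ss' to 'yyyy-MM-dd_HH' (hour bucket)."""
    # %Y-%m-%dT%H:%M:%S is the strptime counterpart of yyyy-MM-dd'T'HH:mm:ss
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S")
    # %Y-%m-%d_%H is the strftime counterpart of yyyy-MM-dd_HH
    return parsed.strftime("%Y-%m-%d_%H")


hour_bucket = to_hour("2013-05-01T14:37:09")
print(hour_bucket)  # 2013-05-01_14
```

Bucketing by hour this way gives every request in the same hour an identical key, which is what a later grouping step would need.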
self.create_test_cohort(
    editor_count=4,
    revisions_per_editor=3,
    revision_timestamps=[
        [
            datetime(2012, 12, 31, 23, 0, 0),
            datetime(2013, 1, 1, 0, 30, 0),
            datetime(2013, 1, 1, 1, 0, 0),
        ],
        [

DROP TABLE IF EXISTS milimetric_pagecounts_daily;
CREATE TABLE IF NOT EXISTS milimetric_pagecounts_daily(
    project string,
    page string,
    views int,
    bytes int,
    year int,
    month int,
    day int
)

from datetime import datetime


def diff_datewise(left, right, left_format=None, right_format=None):
    """
    Parameters
        left            : a list of datetime strings or objects
        right           : a list of datetime strings or objects
        left_format     : None if left contains datetimes, or strptime format
        right_format    : None if right contains datetimes, or strptime format

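The preview above is cut off before the function body, so its actual return value is not known. As an illustration only, here is one way a date-wise diff with these parameters could work, assuming it reports which dates appear in only one of the two lists. The names `diff_datewise_sketch` and `_to_datetimes`, and the returned pair of lists, are hypothetical.

```python
from datetime import datetime


def _to_datetimes(values, fmt):
    # Hypothetical helper: parse strings with fmt, or pass datetimes through.
    if fmt is None:
        return set(values)
    return {datetime.strptime(value, fmt) for value in values}


def diff_datewise_sketch(left, right, left_format=None, right_format=None):
    # Assumed behavior: return (dates only in left, dates only in right).
    left_set = _to_datetimes(left, left_format)
    right_set = _to_datetimes(right, right_format)
    return sorted(left_set - right_set), sorted(right_set - left_set)


left = ["2013-05-01", "2013-05-02", "2013-05-03"]
right = [datetime(2013, 5, 2), datetime(2013, 5, 4)]
only_left, only_right = diff_datewise_sketch(left, right, left_format="%Y-%m-%d")
print(only_left)
print(only_right)
```

Normalizing both sides to `datetime` objects before comparing is what lets a list of strings be diffed against a list of datetimes.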
select p1.page_title as child_title
      ,p1.page_id as child_id
      ,p2.page_title as parent_title
      ,p2.page_id as parent_id
  from categorylinks cl
            inner join
       page p1              on p1.page_id = cl.cl_from
            inner join
       page p2              on p2.page_title = cl.cl_to
                            and p2.page_id <> cl.cl_from

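To see what the visible part of this query returns, it can be run against a tiny in-memory SQLite database. This is a sketch under assumptions: the two tables are reduced to the columns the query uses, the sample pages are made up, and any clauses that follow in the truncated original are not represented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Minimal stand-ins for the MediaWiki tables, with hypothetical rows.
conn.executescript("""
    CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT);
    CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT);
    INSERT INTO page VALUES (1, 'Animals'), (2, 'Cats'), (3, 'Siamese');
    -- (1, 'Animals') is a self-link that the <> condition should drop
    INSERT INTO categorylinks VALUES (2, 'Animals'), (3, 'Cats'), (1, 'Animals');
""")

QUERY = """
select p1.page_title as child_title
      ,p1.page_id as child_id
      ,p2.page_title as parent_title
      ,p2.page_id as parent_id
  from categorylinks cl
       inner join page p1 on p1.page_id = cl.cl_from
       inner join page p2 on p2.page_title = cl.cl_to
                         and p2.page_id <> cl.cl_from
"""

# Sort in Python so the demo output does not depend on join order.
rows = sorted(conn.execute(QUERY).fetchall(), key=lambda row: row[1])
for row in rows:
    print(row)
```

Each row pairs a child page with the page whose title matches the category it links to, and the `<>` condition filters out a page that is categorized under its own title.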
*swp

daemonize yes
pidfile /var/run/redis.pid
port 6379
timeout 0
loglevel debug
logfile /var/log/redis/redis-server.log
databases 16
save 900 1
save 300 10
save 60 20

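Each `save <seconds> <changes>` line asks Redis to write a snapshot once at least that many seconds have passed and at least that many writes have occurred since the last save. The rule check can be sketched in Python as below. This only models the documented rule; it is not Redis's own code, and `should_snapshot` is a hypothetical name.

```python
# (seconds, changes) pairs taken from the three save lines above.
SAVE_RULES = [(900, 1), (300, 10), (60, 20)]


def should_snapshot(seconds_since_save, changes_since_save, rules=SAVE_RULES):
    """Return True if any save rule is satisfied (simplified model)."""
    return any(
        seconds_since_save >= seconds and changes_since_save >= changes
        for seconds, changes in rules
    )


print(should_snapshot(61, 25))   # busy server: the 60 s / 20 changes rule fires
print(should_snapshot(61, 5))    # too few changes for any rule yet
print(should_snapshot(901, 1))   # quiet server: the 900 s / 1 change rule fires
```

Listing several rules means a busy server snapshots often while a quiet one still persists its occasional write within fifteen minutes.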