Dan Andreescu milimetric

  • Wikimedia Foundation
  • New York, NY
milimetric /
Created March 9, 2022 17:02
Thoughts on DAG generation
projectview_ready = HiveTriggeredHQLTaskFactory(...)
archive = ArchiveTaskFactory(...)
projectview_ready.sensors() >> projectview_ready.etl() >> archive()
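The chaining in the last line can be illustrated with a small, self-contained sketch. Everything in it is hypothetical: `Task` and `TaskFactory` are stand-ins invented for illustration, not the gist's `HiveTriggeredHQLTaskFactory` or `ArchiveTaskFactory`, and not Airflow's actual implementation. It only shows one way `>>` could link the tasks that factories hand out.

```python
class Task:
    """Hypothetical stand-in for a DAG task; only records its downstream tasks."""

    def __init__(self, name):
        self.name = name
        self.downstream = []

    def __rshift__(self, other):
        # `a >> b` records b as downstream of a and returns b,
        # so `a >> b >> c` links a -> b -> c from left to right.
        self.downstream.append(other)
        return other


class TaskFactory:
    """Hypothetical factory that hands out tasks named after itself."""

    def __init__(self, name):
        self.name = name

    def sensors(self):
        return Task(f"{self.name}_sensors")

    def etl(self):
        return Task(f"{self.name}_etl")

    def __call__(self):
        return Task(self.name)


projectview_ready = TaskFactory("projectview_ready")
archive = TaskFactory("archive")

first = projectview_ready.sensors()
last = first >> projectview_ready.etl() >> archive()
```

Returning the right-hand operand from `__rshift__` is the design choice that lets a whole pipeline be written as a single expression.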
milimetric / query.sql
Created February 22, 2019 16:51
example query to mess around with
use wmf;
-- new data
select coalesce(, g.country_code) as country,
sum(edit_count) as edits,
sum(namespace_zero_edit_count) as namespace_zero_edits
from geoeditors_edits_monthly g
inner join
(select distinct dbname
select ar_id, ar_namespace, ar_title, NULL as ar_text, NULL as ar_comment, NULL as ar_comment_id,
case when ar_deleted&4 != 0 then null when ar_actor = 0
then ar_user else COALESCE( actor_user, 0 ) END AS ar_user,
case when ar_deleted&4 != 0 then null when ar_actor = 0
then ar_user_text else actor_name END AS ar_user_text,
if(ar_deleted&4 <> 0,0,ar_actor) as ar_actor, ar_timestamp, ar_minor_edit, NULL as ar_flags, ar_rev_id,
case when ar_deleted&1 != 0 then null when content_id is NULL then ar_text_id
else content_id end as ar_text_id,
ar_deleted, if(ar_deleted&1 <> 0,null,ar_len) as ar_len,
milimetric / basic signal
Last active December 2, 2023 23:13
Set a timeout for executing python code in a with statement
import signal
import re
class TimeoutError(Exception):
class timeout:
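The preview above cuts off before the class bodies. As a minimal sketch of the technique the description names, assuming a Unix system, the main thread, and whole-second limits, a `SIGALRM`-based context manager could look like the following. The class names match the preview, but the bodies, the `message` parameter, and the `run_with_limit` helper are assumptions, not the gist's code.

```python
import signal
import time


class TimeoutError(Exception):
    """Raised when the guarded block exceeds its limit (shadows the builtin, as in the gist)."""


class timeout:
    """Interrupt the enclosed block after `seconds` using SIGALRM (Unix, main thread only)."""

    def __init__(self, seconds, message="timed out"):
        self.seconds = seconds
        self.message = message

    def _handle(self, signum, frame):
        # Runs in the main thread when the alarm fires.
        raise TimeoutError(self.message)

    def __enter__(self):
        # Install our handler, remembering the previous one, then arm the alarm.
        self._previous = signal.signal(signal.SIGALRM, self._handle)
        signal.alarm(self.seconds)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Always cancel the alarm and restore the previous handler.
        signal.alarm(0)
        signal.signal(signal.SIGALRM, self._previous)
        return False  # do not swallow exceptions


def run_with_limit(seconds, work_seconds):
    """Hypothetical helper: sleep for work_seconds under a timeout of `seconds`."""
    try:
        with timeout(seconds):
            time.sleep(work_seconds)
        return "finished"
    except TimeoutError:
        return "timed out"
```

Because it relies on `signal.alarm`, this approach does not work on Windows or from a worker thread, and it cannot interrupt code that blocks inside a C extension that ignores signals.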
milimetric / survive-1.sql
Created June 13, 2018 20:28
history query example
with users_with_revisions as (
select event_user_id,
from mediawiki_history
where event_entity = 'revision'
and event_type = 'create'
and snapshot = '2018-05'
and wiki_db = 'enwiki'
milimetric / OojsUiCheckBoxInputWidget.vue
Created May 24, 2017 21:14
A quick example that shows how to wrap an oojs-ui component in a Vue component. It's nasty because oojs-ui is not componentized, but it's just a proof of concept.
<!-- In Vue, $el is this root element defined in the template section -->
// the script-loader webpack plugin has to be used to hack the oojs-ui files directly into script tags
// because they have no modularization whatsoever (AMD, ES6, etc.)
import 'script-loader!oojs/dist/oojs.jquery'
import 'oojs-ui/dist/oojs-ui-core'
"dataSources" : [
"spec" : {
"dataSchema" : {
"dataSource" : "pageviews-hourly",
"metricsSpec" : [
"name" : "view_count",
"type" : "longSum",
# NOTE: required for the following to work:
# !pip install pymysql
# !git clone
# !cd mediawiki-config && git pull origin master
import pymysql
import ipaddress
import os
connection = pymysql.connect(
# download /srv/reportupdater/output/metrics/sessions in the working folder as sessions.old, then run:
# python
import csv
from path import glob
from collections import OrderedDict, defaultdict
from datetime import datetime, timedelta
milimetric / get daily edits and pages created.sql
Last active September 27, 2016 16:53
Queries to get simple metrics from mediawiki_history
select substring(event_timestamp, 0, 8) day,
count(*) `All namespaces`,
sum(if( page_namespace_latest = 0
,1, 0)) `Namespace Zero`,
sum(if( page_namespace_latest = 0
and revision_deleted_timestamp is null
,1, 0)) `Namespace Zero not Deleted`
from milimetric.mediawiki_history
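The `sum(if(condition, 1, 0))` pattern in the query above counts only the rows that satisfy each condition, all in a single pass. A rough Python equivalent over in-memory rows is sketched below. It assumes `event_timestamp` is a string whose first eight characters identify the day (for example `yyyyMMddHHmmss`), and that the truncated query groups by that day. The function name and output keys are made up for illustration.

```python
from collections import defaultdict


def daily_counts(events):
    """Count events per day, mirroring the three conditional sums in the query."""
    counts = defaultdict(lambda: {"all": 0, "ns0": 0, "ns0_not_deleted": 0})
    for event in events:
        # Assumption: the first 8 characters of the timestamp identify the day.
        day = event["event_timestamp"][:8]
        row = counts[day]
        row["all"] += 1
        if event["page_namespace_latest"] == 0:
            row["ns0"] += 1
            if event["revision_deleted_timestamp"] is None:
                row["ns0_not_deleted"] += 1
    return dict(counts)


# Made-up sample rows, for illustration only.
sample = [
    {"event_timestamp": "20160927101500", "page_namespace_latest": 0,
     "revision_deleted_timestamp": None},
    {"event_timestamp": "20160927111500", "page_namespace_latest": 0,
     "revision_deleted_timestamp": "20160928000000"},
    {"event_timestamp": "20160927121500", "page_namespace_latest": 4,
     "revision_deleted_timestamp": None},
    {"event_timestamp": "20160928121500", "page_namespace_latest": 0,
     "revision_deleted_timestamp": None},
]
result = daily_counts(sample)
```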