Skip to content

Instantly share code, notes, and snippets.

@wlach
Created August 12, 2021 13:50
Show Gist options
  • Save wlach/9d12e4bed0bd85ea025e6341eb981071 to your computer and use it in GitHub Desktop.
Save wlach/9d12e4bed0bd85ea025e6341eb981071 to your computer and use it in GitHub Desktop.
# To generate an sqlite3 table from a GLAM CSV, do something like:
# cat glam-file-XXX.csv.gz |gunzip | sqlite3 -csv -separator ',' glam4.db '.import /dev/stdin glam'
# This script runs *much* faster if you create an index on the table:
# create index metrics_index on glam(metric, channel, process, os)
import json
import os
import sqlite3
sqliteConnection = sqlite3.connect('glam.db')
cursor1 = sqliteConnection.cursor()
cursor2 = sqliteConnection.cursor()
for (metric, channel, process, operating_system) in cursor1.execute("SELECT distinct metric, channel, process, os from glam"):
outdir = os.path.join('data', metric, channel, process, operating_system)
print(outdir)
os.makedirs(outdir, exist_ok=True)
aggregates = cursor2.execute("SELECT app_build_id, aggregates from glam where metric=? AND channel=? AND process=? AND os=?", (metric, channel, process, operating_system)).fetchall()
open(os.path.join(outdir, 'data.json'), 'w').write(json.dumps(aggregates))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment