@kohlmeier
kohlmeier / tasks_per_mission.py
Last active January 4, 2016 15:49
Compute number of MasteryTask updates per day per mission type
"""This script is a hack to get a quick idea of how many mastery challenges
are being done under each mission. It's hacky for a lot of reasons. I
don't have time to list them all, but here are a few. :)
*) The user_mission (and thus the mission) associated with each LearningTask
is the mission at the time of the creation. For MasteryChallenges, this
may not be a problem-- I'm not sure. But if this was extended to work
on, say, PracticeTasks, a user could create the task in one mission, then
switch missions and actually do the problems in another mission. This
script would not understand that.
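The preview above stops before any of the counting logic, so the following is only a minimal sketch of the idea in the gist's title: tallying task updates per day per mission. The record layout (an update timestamp paired with a mission name), the sample data, and the function name `count_tasks_per_day_per_mission` are all hypothetical and are not taken from the original script.

```python
import collections
import datetime

# Hypothetical task records: (last-updated timestamp, mission name).
# The real script reads LearningTask entities; this layout is an assumption.
tasks = [
    (datetime.datetime(2014, 1, 20, 9, 30), "algebra"),
    (datetime.datetime(2014, 1, 20, 17, 5), "algebra"),
    (datetime.datetime(2014, 1, 20, 11, 0), "arithmetic"),
    (datetime.datetime(2014, 1, 21, 8, 15), "algebra"),
]


def count_tasks_per_day_per_mission(records):
    """Return a Counter keyed by (date, mission)."""
    counts = collections.Counter()
    for updated, mission in records:
        # The mission here is whatever was recorded on the task, which, as
        # the caveat above notes, is the mission at task-creation time.
        counts[(updated.date(), mission)] += 1
    return counts


counts = count_tasks_per_day_per_mission(tasks)
for (day, mission), n in sorted(counts.items()):
    print(day, mission, n)
```

Keying the counter on the `(date, mission)` pair keeps the tally to a single pass over the records.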
kohlmeier / compressed_features.py
Last active March 2, 2024 18:08
Example of computing compressed features. NOTE: If you want to create such features consistently across processes, you will need to persist the random components. That's easy enough, but I've also written the code for it here: https://github.com/Khan/analytics/blob/master/map_reduce/py/random_features.py
import collections
import numpy as np
class CompressedFeatures:
    def __init__(self, num_features=50):
        self.random_components = collections.defaultdict(
            self._generate_component)
        self.num_features = num_features
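The preview ends before `_generate_component` or any method that uses the components, so here is a rough sketch of how such a class might work, not the gist's actual code. It assumes each feature key lazily gets a standard-normal random vector and that a sparse `{key: value}` dict is compressed by summing value-weighted components (a random projection). The class name `CompressedFeaturesSketch`, the `compress` method, and the `seed` parameter are hypothetical. Seeding is a simpler stand-in for the persistence approach the description mentions, and it only reproduces the same components when keys are first seen in the same order.

```python
import collections

import numpy as np


class CompressedFeaturesSketch:
    """Random-projection sketch; not the gist's full implementation."""

    def __init__(self, num_features=50, seed=0):
        self.num_features = num_features
        # Assumption: a seeded generator stands in for persisted components.
        self._rng = np.random.RandomState(seed)
        # Each distinct key gets its own random vector on first access.
        self.random_components = collections.defaultdict(
            self._generate_component)

    def _generate_component(self):
        # Assumption: components are drawn from a standard normal.
        return self._rng.randn(self.num_features)

    def compress(self, sparse_features):
        """Map a {key: value} dict to a dense vector of length num_features."""
        output = np.zeros(self.num_features)
        for key, value in sparse_features.items():
            output += value * self.random_components[key]
        return output


cf = CompressedFeaturesSketch(num_features=8)
v1 = cf.compress({"exercise_a": 1.0, "exercise_b": 2.0})
v2 = cf.compress({"exercise_a": 1.0, "exercise_b": 2.0})
print(v1.shape, np.allclose(v1, v2))
```

Because the `defaultdict` caches each component after its first use, repeated calls with the same keys give the same output within one process. Reproducing them in another process is what persisting the components is for.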
kohlmeier / ka_bnet_pandas.py
Created March 26, 2012 22:32
Bayes net example (Pandas version)
#!/usr/bin/env python
from pandas import DataFrame, Series
import numpy as np
import math
import random
import copy
NaN_Flag = -1 # pandas uses np.nan, but that coerces ints to floats :(
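The comment on `NaN_Flag` refers to a pandas behavior that is easy to check: putting `np.nan` into an integer Series forces a float dtype, while an integer sentinel keeps the data integral. The snippet below is a small illustration of that trade-off. The variable names are made up for this example, and the masking at the end is only one possible way to use such a sentinel.

```python
import numpy as np
from pandas import Series

NaN_Flag = -1  # integer sentinel for missing values, as in the gist

with_nan = Series([1, 2, np.nan])     # np.nan forces a float dtype
with_flag = Series([1, 2, NaN_Flag])  # the sentinel keeps an integer dtype

print(with_nan.dtype, with_flag.dtype)

# With a sentinel, missing entries have to be masked out explicitly.
is_missing = with_flag == NaN_Flag
observed_mean = with_flag[~is_missing].mean()
print(observed_mean)
```

The cost of the sentinel is that pandas will not skip it automatically, so every aggregate needs an explicit mask like the one above.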
kohlmeier / ka_bnet_numpy.py
Created March 26, 2012 21:59
Bayes net example in Python with Khan Academy data
#!/usr/bin/env python
from numpy import asmatrix, asarray, ones, zeros, mean, sum, arange, prod, dot, loadtxt
from numpy.random import random, randint
import pickle
MISSING_VALUE = -1 # a constant I will use to denote missing integer values
def impute_hidden_node(E, I, theta, sample_hidden):