The Delta Method and Autograd

@CamDavidsonPilon · Last active September 13, 2019

One of the reasons I'm really excited about autograd is that it lets me transform my abstract parameters into business logic. Let me explain with an example. Suppose I am modeling customer churn, and I have fitted a Weibull survival model using maximum likelihood estimation. I have two parameter estimates, lambda-hat and rho-hat, and their covariance matrix, which tells me how much uncertainty is present in the estimates (in lifelines, this is under the variance_matrix_ property). From this I can plot the survival curve, which is fine, but what I really want is a measure of lifetime value. Customers pay us $10 each time period for the first 3 time periods, and $30 each time period afterwards. For a single user, the average LTV (up to timeline) calculation might look like:

from autograd import numpy as np
from lifelines import WeibullFitter

# create a Weibull model with fake data
wf = WeibullFitter().fit(np.arange(1, 100))

def LTV(params, timeline):
    lambda_, rho_ = params
    # Weibull survival function evaluated along the timeline
    sf = np.exp(-(timeline / lambda_) ** rho_)
    # $10 per period for the first 3 periods, $30 per period afterwards,
    # weighted by the probability of still being a customer
    clv = 10 * (timeline <= 3) * sf + 30 * (timeline > 3) * sf
    return clv.sum()

timeline = np.arange(10)
params = np.array([wf.lambda_, wf.rho_])

LTV(params, timeline)
# 214.76

This gives me a point estimate, but I also want the variance, since there is uncertainty in lambda-hat and rho-hat. There is a technique called the delta method that allows me to transform variance from one domain to another. To use it, I need the gradient of LTV with respect to lambda and rho. For this problem I could probably write out the gradient by hand, but it would be messy and error-prone. Instead, I can use autograd.

from autograd import grad

# grad differentiates LTV with respect to its first argument, params
gradient_LTV = grad(LTV)

gradient_LTV(params, timeline)
# array([ 0.15527043, 10.78900217])
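
If I wanted to double-check autograd's answer, a central finite-difference approximation should reproduce these numbers to several decimal places. A minimal sketch (finite_diff_grad is a hypothetical helper, not part of autograd or lifelines):

import numpy as onp  # plain NumPy, to keep the check independent of autograd

def finite_diff_grad(f, params, timeline, eps=1e-6):
    # central differences: (f(x + eps) - f(x - eps)) / (2 * eps), per coordinate
    g = onp.zeros_like(params)
    for i in range(len(params)):
        bump = onp.zeros_like(params)
        bump[i] = eps
        g[i] = (f(params + bump, timeline) - f(params - bump, timeline)) / (2 * eps)
    return g

finite_diff_grad(LTV, params, timeline)
# should closely match array([ 0.15527043, 10.78900217])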

The output above is the gradient evaluated at the params value. Very easy. Now, if I want the variance of the LTV, the delta method tells me to sandwich the covariance matrix between the gradient and the gradient's transpose, i.e. Var(LTV) ≈ ∇LTVᵀ Σ ∇LTV (a useful check: this should return a scalar, not a vector or matrix):

# delta method: gradientᵀ · covariance matrix · gradient
var = gradient_LTV(params, timeline).dot(wf.variance_matrix_).dot(gradient_LTV(params, timeline))
# 3.127
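
One subtlety worth noting: the order of params, [lambda_, rho_], must match the row and column order of variance_matrix_; if they were swapped, the quadratic form would silently pair each gradient entry with the wrong variance.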

So my best estimate of the LTV at timeline=10 is 214.76, with a variance of 3.127 (standard error ≈ 1.77). From this, we can build confidence intervals, etc.
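
For example, a normal-approximation 95% confidence interval would look like the following quick sketch (1.96 is the usual z-value):

ltv = LTV(params, timeline)  # 214.76
se = np.sqrt(var)            # ≈ 1.77
(ltv - 1.96 * se, ltv + 1.96 * se)
# approximately (211.3, 218.2)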
