Skip to content

Instantly share code, notes, and snippets.

@AustinRochford
Last active September 10, 2024 00:21
Show Gist options
  • Save AustinRochford/afe6862e622c31494b2f to your computer and use it in GitHub Desktop.
Save AustinRochford/afe6862e622c31494b2f to your computer and use it in GitHub Desktop.
Bayesian Survival analysis with PyMC3
Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
@ppope
Copy link

ppope commented Apr 19, 2016

Same here! @AustinRochford any advice on how to get this to work?

@gkalman1
Copy link

I get the same. Updated all installs, etc. Looks like a bug now is not being addressed.
Values in trace['beta'] all come out to be 0.0
Interestingly, trace['mu_beta'] and trace['tau'] all look to have values. So, seemingly, the model description is not being processed correctly.
Anyway, something looks to be quite broken.

@ibarrien
Copy link

ibarrien commented Jan 14, 2017

I managed to reproduce all posted results, including:

np.exp(trace['beta'].mean())
1.6455921480844671

Some info:
Spyder Python 2.7 on a Mac OS X
package, version:
numpy, 1.11.2
pymc3, 3.0
statsmodels, 0.6.1
theano, 0.8.2
(pandas, 0.19.1)

@LeoHdez
Copy link

LeoHdez commented Jul 28, 2017

Same problem:
np.exp(trace['beta'].mean()) = 1.0404104237122036
Beta plot, autocorrelation plot, Cumulative hazard and Survival function are different from your notebook (although consistent with each other.
All results from section "Time varying effects" are identical to yours

python version: 2.7.13 |Anaconda custom (x86_64)| (default, Dec 20 2016, 23:05:08)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)]
numpy version: 1.12.1
pymc3 version: 3.0
statsmodels version: 0.8.0
theano version: 0.9.0

@jacoblevine
Copy link

First of all, thank you @AustinRochford for a wonderful demo.

Like others, I also had trouble reproducing the result for the first hazard regression (i.e. not the time-varying coefficient model, which I found to be reproducible). I was able to get similar results with a different model specification that is perhaps more typical for Bayesian regression:

with pm.Model() as model:
    lambda0 = pm.Gamma('lambda0', 0.01, 0.01, shape=n_intervals)

    tau = pm.Gamma('tau', 10., 10.)
    mu_beta = pm.Normal('mu_beta', 0., 10 ** -2)
    beta = pm.Normal('beta', mu_beta, tau)
    
    lambda_ = pm.Deterministic('lambda_', T.outer(T.exp(beta * df.metastized), lambda0))
    mu = pm.Deterministic('mu', exposure * lambda_)
    
    obs = pm.Poisson('obs', mu, observed=death)

The gamma prior on tau produces a distribution for beta that looks reasonable for a regression model (i.e. centered on zero with a fair amount of density between -2 and 2).

Since beta is generated by a Gaussian random walk with fixed tau=1 in the time-varying model, this can explain why someone running this demo could have a problem with the first but not second example. Incidentally, a gamma distribution with parameters (a=10, b=10) as in the code written above produces a distribution for tau with a mean of 1.

Of course, what I can't explain is why the model specification as it appears in the notebook worked in the first place. Anyway, hope this helps anyone else struggling with it.

And FWIW I'm using Python 3.6.1 on Mac OS X, pymc3 3.1, Theano 0.9.0, numpy 1.12.1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment