Skip to content

Instantly share code, notes, and snippets.

@rawkintrevo
Created March 11, 2015 01:54
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save rawkintrevo/31cb13fc017a723ccf33 to your computer and use it in GitHub Desktop.
Save rawkintrevo/31cb13fc017a723ccf33 to your computer and use it in GitHub Desktop.
Single Variate Hierarchical- PyMC
"""
Bayesian Statistics and Marketing: Rossi, Allenby, McCullough
Section 3.7
This is a remarkably poorly defined model. Almost better off going
straight for the deep wizardry on his code than dicking with the
text. (At least for model definition, because the code is crap too.)
D:
The 'cheese' data set has 5555 observations of 88 retailers.
For each observation there are VOLUME, DISP, PRICE, this is
all weekly and (theoretically) there are 65 weeks for each
store.
VOLUME: Presumable units of volume
DISP: The percentage of units on display
PRICE: The unit price
Model:
Y = ln(VOLUME)
X = ln(PRICE) and DISPLAY
Given X, Y
Y[i] = X_i * beta_i + err_i
err_i ~ N( 0, sigma2_i )
sigma2_i = the std_dev of the ith group
I_n_i ~ Uniform(0, 15) (intercept?)
beta_i ~ delta[i]*X[i] + v_i
v_i ~ N( 0, Var( group ) )
delta_i ~ Normal(delta_bar, delta_var)
delta_var ~ Uniform(0, 15) <- NO LOGIC TO THESE PRIORS RIGHT NOW
delta_bar ~ Uniform(0, 15)
In other words: You have a big regression, but each group has its own
unique beta estimate (variance is V).
"""
import pymc as pm
import numpy as np
import matplotlib.pyplot as plt
from pprint import pprint
import pandas as pd
import rpy2.robjects as robjects
r= robjects.r
#r("install.packages('bayesm')")
import pandas.rpy.common as com
cheese = com.load_data("cheese", package='bayesm')
X_data = np.log(cheese['PRICE'])
Y_data = np.log(cheese['VOLUME'])
stores = list(cheese['RETAILER'].unique())
N = len(cheese['RETAILER'])
fixed_fx = pm.Normal('fixed_fx', mu=0, tau= X_data.var() )
X = pm.Normal('X', mu= X_data.mean(), tau= X_data.var(), value=X_data.values, observed=True )
random_fx = np.empty(N, dtype=object)
local_err = np.empty(N, dtype=object)
for store in stores:
index = (cheese['RETAILER']==store).values
local_var= X_data[index].var()
random_fx[index]= pm.Normal('random_fx_%s' % store, mu= 0, tau= local_var)
local_err[index]= pm.Uniform('local_err_%s' % store, lower= 0, upper=500, value= local_var)
@pm.deterministic
def beta(fixed_fx= fixed_fx, random_fx= random_fx):
return fixed_fx + random_fx
@pm.deterministic
def y_hat(beta= beta, X= X, local_err= local_err):
return beta * X + local_err
err = pm.Uniform('err', lower=0, upper=500)
Y = pm.Normal('Y', mu=y_hat, tau=err, value=Y_data.values, observed=True) #
model = pm.MCMC([Y, beta, local_err, random_fx, X, fixed_fx, err])
model.sample(100)
binwidth=.005
plot_data= model.trace('local_err_ROANOKE (NEW) - KROGER CO')[:]
plot_data= model.trace('random_fx_ROANOKE (NEW) - KROGER CO')[:]
plt.hist(plot_data, color='c', bins=np.arange(min(plot_data), max(plot_data) + binwidth, binwidth))
plt.show()
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment