-
-
Save jessegrabowski/ccda08b8a758f882f5794b8b89ace07a to your computer and use it in GitHub Desktop.
Yes, the idea is to model both the regression and the GARCH
sigmas simultaneously. Then the model "ideally" attempts to explain as much as possible the observed data with the regression, and then (hopefully) uses the GARCH sigma
to absorb any unexplained variance.
I have had good success with this type of model (in particular for say modeling sales at a retailer): prior to COVID19, the GARCH sigmas
are really small, since the regression has the "right" features to explain the observed data, but during COVID19, (or at least the start of it) you see a large increase in the GARCH std
because the model "doesn't know" how to explain the observed data (the features don't explain it).
I just want to ensure that I am proceeding on sound footing here. (The reasoning behind a lot of this is that I am in the process of converting my Stan model into PYMC, and want to replicate what I have in loops in Stan as scan
in the PYMC code)
That said, I really appreciate your comments!
Ah ok! Yeah it definitely makes sense to have some kind of COVID adjustment. I saw a paper on LinkedIn that did this kind of adjustment in a Bayesian VAR (I dug around for the post but couldn't find it). Basically they just used an indicator variable for COVID/not COVID, then modeled the residuals of that regression with a normal VAR. You would basically be doing the same thing in a GARCH framework by having your exog_data
be a COVID indicator?
So we had no indicator originally. And COVID is a really simple example (we absolutely knew what was going on) but generically, if you see large std
in certain time regimes, it is a motivator to go see why the std
is growing (i.e. why aren't the features capturing the variance?). Then you can apply the COVID indicator (for example) and (hopefully) see the variance diminish during the same time period (when retrained). Then you can be confident that the uncaptured variance was in fact due to COVID (and not a bunch of other things).
Alright, sorry to keep pestering you @jessegrabowski but one last question (maybe):
Here is my implementation, but I don't think things are correct for some reason (actually for two reasons: when I try to run with JAX
backend, it is complaining about numpy generators, and if I don't run with JAX
, it is using a bunch of samplers other than NUTS for different parameters (which should all be able to be sampled by NUTS since they are continuous)).
Anyway, here is the relevant part of the model (I dropped the part that computes obs_mean
):
# GARCH parameters
# Inital values for scan
sigma_sq_init = pm.Exponential('sigma_sq_init', 1)
epsilon_init = pm.Normal('epsilon_init')
omega = pm.Uniform('omega',0,1)
alpha1 = pm.Uniform('alpha1',0,1)
beta1 = pm.Uniform('beta1',0, (1-alpha))
def step(*args):
epsilon_tm1, sigma_sq_tm1, omega, alpha1, beta1 = args
# GARCH process
sig2 = omega + alpha * at.square(epsilon_tm1) + beta1 * sigma_sq_tm1
return sig2, collect_default_updates(args, [sig2])
sigmas, updates = pytensor.scan(
fn=step,
sequences=[{'input':at.concatenate([epsilon_init[None], (obs_data-obs_mean)]), 'taps':[-1]}],
outputs_info=[sigma_sq_init],
non_sequences=[omega, alpha1, beta1],
strict=True,
mode = get_mode(None) # use get_mode("JAX") if you try to sample with numpyro
)
sig = pm.Deterministic('sig',at.sqrt(sigmas))
lik = pm.Normal(name='lik', mu=obs_mean, sigma=sig, observed=obs_data, shape=obs_data.shape)
where obs_data
is the observed data and obs_mean
is the calculated mean from the model (same shape as obs_data
).
Can you see anything amiss here?
They way you've written it, I don't think you need to collect updates at all in the scan function. This is because you're not actually making any random variables inside the scan, so you don't need to do anything special with the underlying random number generator (this is what collect_default_updates
does.
If you did do the regression first, the mean of the residuals will be zero by construction, so you could just omit the mean model from inside the scan entirely and replace
mu
with0
everywhere. You wouldn't need the 0th tap fory_true
, you could just use the -1 tap asepsilon_tm1
directly. There would be some other subtleties with setting up the likelihood too, because you can't use the residuals from the first regression as observed in the second regression. You would need to provide the actual data as observed, with mu=regression_mu
and sigma=GARCH_sigmas
. Not positive, you'll have to play around with it