data {
  int N;               // number of observations
  int M;               // truncation point for the sum over Poisson counts
  real<lower=0> Y[N];  // observed values (exact zeros allowed)
}
parameters {
  real<lower=0> mu;
  real<lower=0> phi;
  real<lower=1, upper=2> theta;
}
transformed parameters {
  // compound Poisson-gamma parameters implied by (mu, phi, theta)
  real lambda = 1/phi*mu^(2-theta)/(2-theta);
  real alpha = (2-theta)/(theta-1);
  real beta = 1/phi*mu^(1-theta)/(theta-1);
}
model {
  mu ~ cauchy(0, 5);
  phi ~ cauchy(0, 5);
  for (n in 1:N) {
    if (Y[n] == 0) {
      // zero Poisson events: log P(Y = 0) = -lambda
      target += -lambda;
    } else {
      // sum out the number of Poisson events m = 1..M
      vector[M] ps;
      for (m in 1:M)
        ps[m] = poisson_lpmf(m | lambda) + gamma_lpdf(Y[n] | m*alpha, beta);
      target += log_sum_exp(ps);
    }
  }
}
library(rstan)
library(tweedie)

stanmodel <- stan_model(file='model/model.stan')

# Simulation 1: power (theta) = 1.3
N1 <- 200
M1 <- 30
Y1 <- rtweedie(N1, power=1.3, mu=1, phi=1)
data1 <- list(N=N1, M=M1, Y=Y1)
fit1 <- sampling(stanmodel, data=data1)

# Simulation 2: power (theta) = 1.01, close to the lower bound
N2 <- 1000
M2 <- 30
Y2 <- rtweedie(N2, power=1.01, mu=3, phi=1)
data2 <- list(N=N2, M=M2, Y=Y2)
fit2 <- sampling(stanmodel, data=data2)
Mr vchernat, did you find the answer? I have the same question and I'm confused by it.
I would appreciate any suggestions or help.
Many thanks
Sorry for my late reply. Here is the explanation:
http://statmodeling.hatenablog.com/entry/tweedie-distribution
Could you read it using a translation tool?
I was finally able to translate the text, but I couldn't understand this part:
"Comparing the distribution of the data we want to fit with the density plots above, suppose we judge that the Tweedie parameter theta is likely to be 1.3 or less. Then the parameter lambda is around 15, so considering M from 1 to 30 should give a sufficiently good approximation."
Here the author says that "M" should be fixed to speed up estimation, but I couldn't work out how.
Many thanks in advance for your help
I'm not Mr. Vchernat. I'm Kentaro Matsuura, the author of that blog.
I assume you are familiar with summing out (marginalizing) a discrete parameter. If not, please read the "Log Sum of Exponentials" section in the Stan manual.
Under the assumption that `theta < 1.3`, `lambda` will be less than `15`. When `m` exceeds `30`, `poisson_lpmf(m | lambda)` becomes very small. So even if we ignore `m > 30`, `log_sum_exp(ps)` will change very little.
For your information, `dpois(1:50, lambda=15)` (equivalent to `exp(poisson_lpmf(m | 15))`) shows:
R> dpois(1:50, lambda=15)
[1] 4.589e-06 3.441e-05 1.721e-04 6.453e-04 1.936e-03
[6] 4.839e-03 1.037e-02 1.944e-02 3.241e-02 4.861e-02
[11] 6.629e-02 8.286e-02 9.561e-02 1.024e-01 1.024e-01
[16] 9.603e-02 8.474e-02 7.061e-02 5.575e-02 4.181e-02
[21] 2.986e-02 2.036e-02 1.328e-02 8.300e-03 4.980e-03
[26] 2.873e-03 1.596e-03 8.551e-04 4.423e-04 2.211e-04
[31] 1.070e-04 5.016e-05 2.280e-05 1.006e-05 4.311e-06
[36] 1.796e-06 7.282e-07 2.874e-07 1.105e-07 4.146e-08
[41] 1.517e-08 5.417e-09 1.890e-09 6.442e-10 2.147e-10
[46] 7.002e-11 2.235e-11 6.983e-12 2.138e-12 6.413e-13
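The effect of the truncation can also be checked numerically. The following is a minimal sketch in base R, not part of the gist: the helper `log_lik_trunc` is hypothetical and only mirrors the `ps` / `log_sum_exp` lines of the Stan model for a single positive observation, and the values of `y`, `alpha` and `beta` are arbitrary illustration values.

```r
# Poisson mass that is dropped when the sum stops at m = 30 (lambda = 15)
tail_mass <- ppois(30, lambda = 15, lower.tail = FALSE)

# Hypothetical helper mirroring the Stan model for one positive y:
# ps[m] = log P(m | lambda) + log gamma density of y given m events
log_lik_trunc <- function(y, lambda, alpha, beta, M) {
  m  <- 1:M
  ps <- dpois(m, lambda, log = TRUE) +
    dgamma(y, shape = m * alpha, rate = beta, log = TRUE)
  max(ps) + log(sum(exp(ps - max(ps))))  # numerically stable log_sum_exp
}

# Illustration values only (assumed, not estimated from any data)
ll_30  <- log_lik_trunc(y = 10, lambda = 15, alpha = 2, beta = 3, M = 30)
ll_200 <- log_lik_trunc(y = 10, lambda = 15, alpha = 2, beta = 3, M = 200)

print(tail_mass)       # Poisson mass beyond m = 30
print(ll_200 - ll_30)  # change in the log likelihood term from truncating
```

If the second number is negligible for the `y` values and parameter ranges of interest, stopping at M = 30 loses essentially nothing while keeping the inner loop short.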
Mr. Matsuura, I'm so sorry for my mistake. Your comments and your blog post were really helpful to me.
Could you recommend a book or an article with a more complete explanation of this sentence: "Under the assumption that theta < 1.3, lambda will be less than 15"?
Thanks a million
First, check the distribution of your data. Next, see which of the following figures it resembles.
https://cdn-ak.f.st-hatena.com/images/fotolife/S/StatModeling/20201106/20201106201645.png
Then you can see the approximate range of parameters.
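Once rough values of mu, phi and theta have been read off the figures, the formulas in the `transformed parameters` block of the Stan model give the corresponding lambda directly. A minimal sketch in base R; the function name `tweedie_to_cpg` is hypothetical, and the two calls simply reuse the simulation settings from the R script in this gist.

```r
# Compound Poisson-gamma parameters implied by Tweedie (mu, phi, theta),
# using the same formulas as the transformed parameters block.
tweedie_to_cpg <- function(mu, phi, theta) {
  stopifnot(mu > 0, phi > 0, theta > 1, theta < 2)
  list(
    lambda = mu^(2 - theta) / (phi * (2 - theta)),  # Poisson rate
    alpha  = (2 - theta) / (theta - 1),             # gamma shape per event
    beta   = mu^(1 - theta) / (phi * (theta - 1))   # gamma rate
  )
}

p1 <- tweedie_to_cpg(mu = 1, phi = 1, theta = 1.3)   # settings used for Y1
p2 <- tweedie_to_cpg(mu = 3, phi = 1, theta = 1.01)  # settings used for Y2
print(unlist(p1))
print(unlist(p2))
```

For these two settings lambda comes out at about 1.4 and about 3.0, both well below 15, so M = 30 leaves a wide margin. A useful sanity check is that `lambda * alpha / beta` equals `mu`.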
I got it, many thanks.
How should M1 or M2 be chosen? What is the meaning of the vector ps of length M?
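One possible way to pick M, offered as a sketch rather than the author's rule: take a Poisson quantile at the largest lambda considered plausible, so that the probability mass beyond M stays below a chosen tolerance. The function name `choose_M` and the tolerance `eps` are assumptions made for illustration. As for `ps`: in the model, `ps[m]` holds the log of the joint probability of m Poisson events and the observed `Y[n]`, and `log_sum_exp(ps)` adds those M terms on the log scale.

```r
# Hypothetical helper: smallest M with P(m > M | lambda_max) <= eps
choose_M <- function(lambda_max, eps = 1e-3) {
  as.integer(qpois(1 - eps, lambda_max))
}

M_15 <- choose_M(15)  # lambda_max = 15, as assumed in the thread
print(M_15)
print(ppois(M_15, 15, lower.tail = FALSE))  # mass left out beyond M_15
```

A smaller `eps` gives a larger M and a slower model, so the tolerance is a trade-off between approximation error and run time.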