We were unable to load Disqus. If you are a moderator please see our troubleshooting guide.

Bob Carpenter • 6 years ago

Thanks for sharing. I have a few suggestions on improving the speed of the Stan code.

1. Declare mu0 in the transformed data block and set with one-liner

transformed data {
vector[T] mu0 = rep_vector(0, T);

That'll save autodiffing through the parameter mu0, which isn't needed because it's constant.

2. In defining Sigma, it would help to define local variables for both sigmalev1^2 and rho * sigmalev1^2 and then reuse them in the loops. That will saves a whole bunch of duplicate autodiff by using only two variables in Sigma rather than T^2. Then if Sigma is big, it helps to fill column major as the data's stored. All in, this will look like:

{
real sigmalev1_sq = sigmalev1^2;
real rho_sigmalev1_sq = rho * sigmalev1^2;
for (j in 1:T) {
for (i in 1:(j - 1))
Sigma[i, j] = sigmalev1_sq
Sigma[j, j] = rho_sigmalev1_sq
for (i in (j+1):T)
Sigma[i, j] = sigmalev1_sq
}
}

The bracs make the variables sigmalev1_sq and rho_sigmalev1_sq local variables so they're not saved. Though it looks like you want to save sgiamlev1_sq as sigmalev12 later, so it could be declared as a block variable in the transformed parameters block.

3. This is the big one. The multi-normal can be vectorized, which means Sigma will only have to be factored once. This is by far the most expensive operation in your program. The way to do this is redeclare ran and then just vectorize,

parameters {
vector[T] ran[J];
...
model {
...
ran ~ multinormal(mu0, Sigma);

4. For convenience, those generated quantities can be declared and defined in one line (we added that fairly late in the game, so people still don't know about it),

real sigma2 = sigma^2;
real sigmalev12 = sigmalev1^2;
real iccInPeriod = sigmalev12 / (sigmalev12 + sigma2);
real iccBetPeriod = rho * iccInPeriod;

5. I really wish we could vectorize the yhat definition, but we don't have a way to deal with the indexing for `ran`. What we need to add is something like a select function that'd let us define yhat as x * beta + select(ran, jj, tt); I don't have a good idea of what to call that function, but it'd be defined as:

vector select(vector[] ran, int[] ii, int[] jj) {
vector[size(jj)] y;
for (n in 1:size(jj)) y[i] = ran[ii[n], jj[n]];
return y;
}

6. If you have an interval prior as you have for sigma ~ uniform(0, 10), the variable sigma should be declared with lower = 0, upper = 10. The upper = 10 is missing in this model. The reason here is that every set of parameter values that satisfy the declared constraints should have support in the model.

Having said that, if you do it that way, initialization will be around the midpoint, as we initialize in (-2, 2) and then transform to the uniform interval declared for the variable. If there's only a lower bound, we transform using exp(), which puts the median at 1 and the range between roughly 1/8 and 8.

Finally, we don't recommend hard interval constraints unless motivated by logical constraints, such as rho being a correlation which must be between -1 and 1. We recommend soft constraints to represent the scale of the prior, whic is what I think is being indicated with uniform(0, 10). So rather than that, we'd recommend something much softer like normal(5, 5); if you think a value of 15 or 20 is not out of the question, or normal(5, 2). Even so, normal priors like this aren't that strong. The main advantage is that they avoid the problem of probability mass piling up at a boundary and pushing posterior means and medians away from the boundary where the likelihood is providing support. If the values don't get anywhere near 10, you won't have problems with the way you've specified it, at least with our default initializations and the NUTS sampler.

Keith Goldfeld • 6 years ago

Thanks so much. Hopefully sometime soon, I will be able implement your suggestions and when I do, I will report the results.