Skip to contents

Introduction

When Bayesian models are estimated with a Markov-Chain Monte Carlo (MCMC) sampler, the model estimation doesn’t stop when it has achieved some convergence criteria. It will run as long as desired (determined by the burnin and sample arguments), and then you need to evaluate the convergence and efficiency of the estimated posterior distributions. You should only analyze the results if convergence has been achieved, as judged by the metrics described below.

For this example we will use the Industrialization and Political Democracy example (Bollen 1989).

model <- '
  # latent variable definitions
     ind60 =~ x1 + x2 + x3
     dem60 =~ a*y1 + b*y2 + c*y3 + d*y4
     dem65 =~ a*y5 + b*y6 + c*y7 + d*y8

  # regressions
    dem60 ~ ind60
    dem65 ~ ind60 + dem60

  # residual correlations
    y1 ~~ y5
    y2 ~~ y4 + y6
    y3 ~~ y7
    y4 ~~ y8
    y6 ~~ y8
'

fit <- bsem(model, data=PoliticalDemocracy,
            std.lv=T, meanstructure=T, n.chains=3,
            burnin=500, sample=1000)

Convergence

The primary convergence diagnostic is R̂\hat{R}, which compares the between- and within-chain samples of model parameters and other univariate quantities of interest (Vehtari et al. 2021). If chains have not mixed well (ie, the between- and within-chain estimates don’t agree), R̂\hat{R} is larger than 1. We recommend running at least three chains by default and only using the posterior samples if R̂<1.05\hat{R} < 1.05 for all the parameters.

blavaan presents the R̂\hat{R} reported by the underlying MCMC program, either Stan or JAGS (Stan by default). We can obtain the R̂\hat{R} from the summary() function, and we can also extract it with the blavInspect() function

blavInspect(fit, "rhat")
##   ind60=~x1   ind60=~x2   ind60=~x3           a           b           c 
##   1.0006454   1.0022602   1.0008597   1.0009582   0.9996429   1.0001085 
##           d           a           b           c           d dem60~ind60 
##   1.0011349   1.0009582   0.9996429   1.0001085   1.0011349   1.0001846 
## dem65~ind60 dem65~dem60      y1~~y5      y2~~y4      y2~~y6      y3~~y7 
##   0.9998377   1.0003543   1.0020154   0.9997783   0.9994687   1.0004405 
##      y4~~y8      y6~~y8      x1~~x1      x2~~x2      x3~~x3      y1~~y1 
##   0.9993376   1.0009761   1.0005309   1.0026742   1.0002919   1.0009539 
##      y2~~y2      y3~~y3      y4~~y4      y5~~y5      y6~~y6      y7~~y7 
##   0.9997066   1.0001153   0.9998728   1.0001351   0.9999265   1.0016508 
##      y8~~y8        x1~1        x2~1        x3~1        y1~1        y2~1 
##   1.0019534   1.0009929   1.0000789   0.9995749   1.0022080   1.0016835 
##        y3~1        y4~1        y5~1        y6~1        y7~1        y8~1 
##   1.0014417   1.0017104   1.0025350   1.0018565   1.0019824   1.0028073

With large models it can be cumbersome to look over all of these entries. We can instead find the largest R̂\hat{R} to see if they are all less than 1.051.05

max(blavInspect(fit, "psrf"))
## [1] 1.002807

If all R̂<1.05\hat{R} < 1.05 then we can establish that the MCMC chains have converged to a stable solution. If the model has not converged, you might increase the number of burnin iterations

fit <- bsem(model, data=PoliticalDemocracy,
            std.lv=T, meanstructure=T, n.chains=3,
            burnin=1000, sample=1000)

and/or change the model priors with the dpriors() function. These address issues where the model failed to converge due to needing more iterations or due to a model misspecification (such as bad priors). As a rule of thumb, we seldom see a model require more than 1,000 burnin samples in Stan. If your model is not converging after 1,000 burnin samples, it is likely that the default prior distributions clash with your data. This can happen, e.g., if your variables contain values in the 100s or 1000s.

Efficiency

We should also evaluate the efficiency of the posterior samples. Effective sample size (ESS) is a useful measure for sampling efficiency, and is well defined even if the chains do not have finite mean or variance (Vehtari et al. 2021).

In short, the posterior samples produced by MCMC are autocorrelated. This means that, if you draw 500 posterior samples, you do not have 500 independent pieces of information about the posterior distribution, because the samples are autocorlated. The ESS metric is like a currency conversion, telling you how much your autocorrelated samples are worth if we were to convert them to independent samples. In blavaan we can print it from the summary function with the neff argument

summary(fit, neff=T)

We can also extract only those with the blavInspect() function

blavInspect(fit, "neff")
##   ind60=~x1   ind60=~x2   ind60=~x3           a           b           c 
##   1440.3331   1391.0584   1467.8560   1853.4166   2060.4794   1974.7427 
##           d           a           b           c           d dem60~ind60 
##   1503.7848   1853.4166   2060.4794   1974.7427   1503.7848   2088.5744 
## dem65~ind60 dem65~dem60      y1~~y5      y2~~y4      y2~~y6      y3~~y7 
##   2888.5675   2598.2084   1910.1173   2355.8862   3329.6436   2361.1681 
##      y4~~y8      y6~~y8      x1~~x1      x2~~x2      x3~~x3      y1~~y1 
##   2283.7571   1672.0076   2122.9198   1520.9724   3080.7677   2272.7977 
##      y2~~y2      y3~~y3      y4~~y4      y5~~y5      y6~~y6      y7~~y7 
##   3341.2752   3149.6557   2092.2675   2132.3200   2690.6789   2309.5735 
##      y8~~y8        x1~1        x2~1        x3~1        y1~1        y2~1 
##   1690.2879   1017.6287    983.1243   1104.6522   1235.2910   1469.8196 
##        y3~1        y4~1        y5~1        y6~1        y7~1        y8~1 
##   1250.4709   1099.6591   1021.8526   1033.0079    952.9122    911.5821

ESS is a sample size, so it should be at least 100 (optimally, much more than 100) times the number of chains in order to be reliable and to indicate that estimates of the posterior quantiles are reliable. In this example, because we have 3 chains, we would want to see at least neff=300 for every parameter.

And we can easily find the lowest ESS with the min() function:

min(blavInspect(fit, "neff"))
## [1] 911.5821

References

Bollen, Kenneth A. 1989. Structural Equations with Latent Variables. Wiley Series in Probability and Mathematical Statistics. John Wiley & Sons, Inc.
Vehtari, Aki, Andrew Gelman, Daniel Simpson, Bob Carpenter, and Paul-Christian Bürkner. 2021. Rank-Normalization, Folding, and Localization: An Improved R̂\widehat{R} for Assessing Convergence of MCMC (with Discussion).” Bayesian Analysis 16 (2): 667–718. https://doi.org/10.1214/20-BA1221.