Model Estimation • blavaan

Models are specified using lavaan syntax, and prior distribution specification has already been covered on the previous page. The specified model can then be estimated via the bsem() command, with other models being estimated via bcfa(), bgrowth(), or blavaan(). Regardless of the command, there are many arguments that allow you to tailor the model estimation to your needs. We discuss here some of the most popular arguments, as well as some easy-to-miss arguments.

Primary arguments

Primary arguments to the model estimation commands include burnin, sample, n.chains, and target. The burnin and sample arguments are used to specify the desired number of burn-in iterations and posterior samples for each of n.chains chains (and the burnin argument controls the warm-up iterations in Stan). The target argument, on the other hand, is used to specify the MCMC strategy used for estimation. The default, target = "stan", tends to be fastest and most efficient. Other options are slightly more flexible, including target = "stanclassic" and target = "jags". Both of these approaches sample latent variables as if they are model parameters, whereas target = "stan" marginalizes out the latent variables. For more detail of these approaches, see the JSS paper.

Secondary arguments

Noteworthy secondary arguments include save.lvs, mcmcfile, mcmcextra, and inits.

The save.lvs argument controls whether or not latent variables are sampled during model estimation. It defaults to FALSE because the latent variable sampling can take a large amount of memory, and can slow down some post-estimation summaries. But setting save.lvs = TRUE allows for model summaries of latent variables and observed variable predictions using blavPredict() and other functions.

By setting mcmcfile = TRUE, users can obtain the Stan (JAGS) code and data for the specified model. These files are written to the lavExport folder within a user’s working directory. One file has extension .jag or .stan, and the second file is an R data file (extension .rda). The rda file can be loaded in R (via load()) and will be a list including elements data, monitors, and inits. These elements can be supplied to stan() for model estimation outside of blavaan.

The mcmcextra argument is used to supply extra information to Stan or JAGS. Users can supply a list with element names monitor, data, syntax, or llnsamp. These elements are respectively used to specify extra parameters to monitor, extra data to pass to the model estimation, extra syntax to include in the model file (JAGS only), and the number of importance samples for likelihood approximation (which is only relevant to models with ordinal variables).

The inits argument is used to control the starting values for MCMC estimation. It can sometimes salvage a model that immediately crashes. The default, inits = "simple", initializes model parameters to 0 and 1 in fashion similar to lavaan’s use of this argument. A second option, inits = "prior", draws initial values from the prior distributions. The user can also specify a list of their own initial values via this argument, though the required list format is somewhat cumbersome. We recommend exporting the model and data using mcmcfile = TRUE, loading the resulting rda file, and looking at the format of the initial values that blavaan created there.

Parallelization

Speed is always an issue when we sample via MCMC, especially using software like Stan or JAGS. For computers with multiple cores, the estimation can be sped up by sending each MCMC chain to a separate core. This is accomplished with the bcontrol argument, which is a list whose elements correspond to stan() or run.jags() arguments. For parallelizing the chains in Stan, we would want to use the argument bcontrol = list(cores = 3). Many other arguments are available here to control other aspects of estimation; see ?stan or ?run.jags for all the possibilities.

Parallelization can also be helpful to speed up post-estimation computations. The future package controls this parallelization, which requires an extra command prior to estimation. The most common commands would be

library("future")
plan("multicore") ## mac or linux
plan("multisession") ## windows