Posterior Predictive Model Checks
ppmc.RdThis function allows users to conduct a posterior predictive model check to assess the global or local fit of a latent variable model using any discrepancy function that can be applied to a lavaan model.
Usage
ppmc(object, thin = 1, fit.measures = c("srmr","chisq"), discFUN = NULL,
conditional = FALSE)
# S4 method for class 'blavPPMC'
summary(object, ...)
# S3 method for class 'ppmc'
summary(object, discFUN, dist = c("obs","sim"),
central.tendency = c("mean","median","mode"),
hpd = TRUE, prob = .95, to.data.frame = FALSE, diag = TRUE,
sort.by = NULL, decreasing = FALSE)
# S3 method for class 'blavPPMC'
plot(x, ..., discFUN, element, central.tendency = "",
hpd = TRUE, prob = .95, nd = 3)
# S3 method for class 'blavPPMC'
hist(x, ..., discFUN, element, hpd = TRUE, prob = .95,
printLegend = TRUE, legendArgs = list(x = "topleft"),
densityArgs = list(), nd = 3)
# S3 method for class 'blavPPMC'
pairs(x, discFUN, horInd = 1:DIM, verInd = 1:DIM,
printLegend = FALSE, ...)Arguments
- object,x
An object of class
blavaan.- thin
Optional
integerindicating how much to thin each chain. Default is1L, indicating not to thin the chains inobject.- fit.measures
charactervector indicating the names of global discrepancy measures returned byfitMeasures. Ignored unlessdiscFUNisNULL, but users may includefitMeasuresin thelistof discrepancy functions indiscFUN. For ordinal models, the"logl"or"chisq"computations are done via lavaan.- discFUN
function, or alistof functions, that can be called on an object of class lavaan. Each function must return an object whosemodeisnumeric, but may be avector,matrix, or multidimensionalarray. In thesummaryandplotmethods,discFUNis acharacterindicating which discrepancy function to summarize.- conditional
logicalindicating whether or not, during artificial data generation, we should condition on the estimated latent variables. Requires the model to be estimated withsave.lvs = TRUE.- element
numericorcharacterindicating the index (in eachdimension of thediscFUNoutput, if multiple) to plot.- horInd,verInd
Similar to
element, but anumericorcharactervector indicating the indices of amatrixto plot in a scatterplot matrix. IfhorInd==verInd, histograms will be plotted in the upper triangle.- dist
characterindicating whether to summarize the distribution ofdiscFUNon either theobserved orsimulated data.- central.tendency
characterindicating which statistics should be used to characterize the location of the posterior (predictive) distribution. By default, all 3 statistics are returned for thesummarymethod, but none for theplotmethod. The posterior mean is labeledEAPfor expected a posteriori estimate, and the mode is labeledMAPfor modal a posteriori estimate.- hpd
logicalindicating whether to calculate the highest posterior density (HPD) credible interval fordiscFUN.- prob
The "confidence" level of the credible interval(s).
- nd
The number of digits to print in the scatter
plot.- to.data.frame
logicalindicating whether thesummaryof a symmetric 2-dimensionalmatrixreturned bydiscFUNshould have its unique elements stored in rows of adata.framethat can be sorted for convenience of identifying large discrepancies. IfdiscFUNreturns an asymmetric 2-dimensionalmatrix, the list of matrices returned by thesummarycan also be converted to adata.frame.- diag
Passed to
lower.triifto.data.frame=TRUE.- sort.by
character. Ifsummaryreturns adata.frame, it can be sorted by this column name usingorder. Note that ifdiscFUNreturns an asymmetric 2-dimensionalmatrix, eachdata.framein the returnedlistwill be sorted independently, so the rows are unlikely to be consistent across summary statistics.- decreasing
Passed to
orderif!is.null(sort.by).- ...
Additional
graphical parametersto be passed toplot.default.- printLegend
logical. IfTRUE(default), a legend will be printed with the histogram- legendArgs
listof arguments passed to thelegendfunction. The default argument is a list placing the legend at the top-left of the figure.- densityArgs
listof arguments passed to thedensityfunction, used to obtain densities for thehistmethod.
Value
An S4 object of class blavPPMC consisting of 5 list slots:
@discFUNThe user-supplied
discFUN, or thecalltofitMeasuresthat returnsfit.measures.@dimsThe dimensions of the object returned by each
discFUN.@PPPThe posterior predictive p value for each
discFUNelement.@obsDistThe posterior distribution of realize values of
discFUNapplied to observed data.@simDistThe posterior predictive distribution of values of
discFUNapplied to data simulated from the posterior samples.
The summary() method returns a numeric vector if discFUN
returns a scalar, a data.frame with one discrepancy function per row
if discFUN returns a numeric vector, and a list with
one summary statistic per element if discFUN returns a matrix
or multidimensional array.
The plot and pairs methods invisibly return NULL,
printing a plot (or scatterplot matrix) to the current device.
The hist method invisibly returns a list or arguments that can
be passed to the function for which the list element is named. Users
can edit the arguments in the list to customize their histograms.
Author
Terrence D. Jorgensen (University of Amsterdam; TJorgensen314@gmail.com)
References
Levy, R. (2011). Bayesian data–model fit assessment for structural equation modeling. Structural Equation Modeling, 18(4), 663–685. doi:10.1080/10705511.2011.607723
Examples
if (FALSE) { # \dontrun{
data(HolzingerSwineford1939, package = "lavaan")
HS.model <- ' visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9 '
## fit single-group model
fit <- bcfa(HS.model, data = HolzingerSwineford1939,
n.chains = 2, burnin = 1000, sample = 500)
## fit multigroup model
fitg <- bcfa(HS.model, data = HolzingerSwineford1939,
n.chains = 2, burnin = 1000, sample = 500, group = "school")
## Use fit.measures as a shortcut for global fitMeasures only
## - Note that indices calculated from the "df" are only appropriate under
## noninformative priors, such that pD approximates the number of estimated
## parameters counted under ML estimation; incremental fit indices
## introduce further complications)
AFIs <- ppmc(fit, thin = 10, fit.measures = c("srmr","chisq","rmsea","cfi"))
summary(AFIs) # summarize the whole vector in a data.frame
hist(AFIs, element = "rmsea") # only plot one discrepancy function at a time
plot(AFIs, element = "srmr")
## define a list of custom discrepancy functions
## - (global) fit measures
## - (local) standardized residuals
discFUN <- list(global = function(fit) {
fitMeasures(fit, fit.measures = c("cfi","rmsea","srmr","chisq"))
},
std.cov.resid = function(fit) lavResiduals(fit, zstat = FALSE,
summary = FALSE)$cov,
std.mean.resid = function(fit) lavResiduals(fit, zstat = FALSE,
summary = FALSE)$mean)
out1g <- ppmc(fit, discFUN = discFUN)
## summarize first discrepancy by default (fit indices)
summary(out1g)
## some model-implied correlations look systematically over/underestimated
summary(out1g, discFUN = "std.cov.resid", central.tendency = "EAP")
hist(out1g, discFUN = "std.cov.resid", element = c(1, 7))
plot(out1g, discFUN = "std.cov.resid", element = c("x1","x7"))
## For ease of investigation, optionally export summary as a data.frame,
## sorted by size of average residual
summary(out1g, discFUN = "std.cov.resid", central.tendency = "EAP",
to.data.frame = TRUE, sort.by = "EAP")
## or sorted by size of PPP
summary(out1g, discFUN = "std.cov.resid", central.tendency = "EAP",
to.data.frame = TRUE, sort.by = "PPP_sim_LessThan_obs")
## define a list of custom discrepancy functions for multiple groups
## (return each group's numeric output using a different function)
disc2g <- list(global = function(fit) {
fitMeasures(fit, fit.measures = c("cfi","rmsea","mfi","srmr","chisq"))
},
cor.resid1 = function(fit) lavResiduals(fit, zstat = FALSE,
type = "cor.bollen",
summary = FALSE)[[1]]$cov,
cor.resid2 = function(fit) lavResiduals(fit, zstat = FALSE,
type = "cor.bollen",
summary = FALSE)[[2]]$cov)
out2g <- ppmc(fitg, discFUN = disc2g, thin = 2)
## some residuals look like a bigger problem in one group than another
pairs(out2g, discFUN = "cor.resid1", horInd = 1:3, verInd = 7:9) # group 1
pairs(out2g, discFUN = "cor.resid2", horInd = 1:3, verInd = 7:9) # group 2
## print all to file: must be a LARGE picture. First group 1 ...
png("cor.resid1.png", width = 1600, height = 1200)
pairs(out2g, discFUN = "cor.resid1")
dev.off()
## ... then group 2
png("cor.resid2.png", width = 1600, height = 1200)
pairs(out2g, discFUN = "cor.resid2")
dev.off()
} # }