Posterior Predictive Model Checks
ppmc.Rd
This function allows users to conduct a posterior predictive model check to assess the global or local fit of a latent variable model using any discrepancy function that can be applied to a lavaan model.
Usage
ppmc(object, thin = 1, fit.measures = c("srmr","chisq"), discFUN = NULL,
conditional = FALSE)
# S4 method for class 'blavPPMC'
summary(object, ...)
# S3 method for class 'ppmc'
summary(object, discFUN, dist = c("obs","sim"),
central.tendency = c("mean","median","mode"),
hpd = TRUE, prob = .95, to.data.frame = FALSE, diag = TRUE,
sort.by = NULL, decreasing = FALSE)
# S3 method for class 'blavPPMC'
plot(x, ..., discFUN, element, central.tendency = "",
hpd = TRUE, prob = .95, nd = 3)
# S3 method for class 'blavPPMC'
hist(x, ..., discFUN, element, hpd = TRUE, prob = .95,
printLegend = TRUE, legendArgs = list(x = "topleft"),
densityArgs = list(), nd = 3)
# S3 method for class 'blavPPMC'
pairs(x, discFUN, horInd = 1:DIM, verInd = 1:DIM,
printLegend = FALSE, ...)
Arguments
- object,x
An object of class
blavaan
.- thin
Optional
integer
indicating how much to thin each chain. Default is1L
, indicating not to thin the chains inobject
.- fit.measures
character
vector indicating the names of global discrepancy measures returned byfitMeasures
. Ignored unlessdiscFUN
isNULL
, but users may includefitMeasures
in thelist
of discrepancy functions indiscFUN
. For ordinal models, the"logl"
or"chisq"
computations are done via lavaan.- discFUN
function
, or alist
of functions, that can be called on an object of class lavaan. Each function must return an object whosemode
isnumeric
, but may be avector
,matrix
, or multidimensionalarray
. In thesummary
andplot
methods,discFUN
is acharacter
indicating which discrepancy function to summarize.- conditional
logical
indicating whether or not, during artificial data generation, we should condition on the estimated latent variables. Requires the model to be estimated withsave.lvs = TRUE
.- element
numeric
orcharacter
indicating the index (in eachdim
ension of thediscFUN
output, if multiple) to plot.- horInd,verInd
Similar to
element
, but anumeric
orcharacter
vector indicating the indices of amatrix
to plot in a scatterplot matrix. IfhorInd==verInd
, histograms will be plotted in the upper triangle.- dist
character
indicating whether to summarize the distribution ofdiscFUN
on either theobs
erved orsim
ulated data.- central.tendency
character
indicating which statistics should be used to characterize the location of the posterior (predictive) distribution. By default, all 3 statistics are returned for thesummary
method, but none for theplot
method. The posterior mean is labeledEAP
for expected a posteriori estimate, and the mode is labeledMAP
for modal a posteriori estimate.- hpd
logical
indicating whether to calculate the highest posterior density (HPD) credible interval fordiscFUN
.- prob
The "confidence" level of the credible interval(s).
- nd
The number of digits to print in the scatter
plot
.- to.data.frame
logical
indicating whether thesummary
of a symmetric 2-dimensionalmatrix
returned bydiscFUN
should have its unique elements stored in rows of adata.frame
that can be sorted for convenience of identifying large discrepancies. IfdiscFUN
returns an asymmetric 2-dimensionalmatrix
, the list of matrices returned by thesummary
can also be converted to adata.frame
.- diag
Passed to
lower.tri
ifto.data.frame=TRUE
.- sort.by
character
. Ifsummary
returns adata.frame
, it can be sorted by this column name usingorder
. Note that ifdiscFUN
returns an asymmetric 2-dimensionalmatrix
, eachdata.frame
in the returnedlist
will be sorted independently, so the rows are unlikely to be consistent across summary statistics.- decreasing
Passed to
order
if!is.null(sort.by)
.- ...
Additional
graphical parameters
to be passed toplot.default
.- printLegend
logical
. IfTRUE
(default), a legend will be printed with the histogram- legendArgs
list
of arguments passed to thelegend
function. The default argument is a list placing the legend at the top-left of the figure.- densityArgs
list
of arguments passed to thedensity
function, used to obtain densities for thehist
method.
Value
An S4 object of class blavPPMC
consisting of 5 list
slots:
@discFUN
The user-supplied
discFUN
, or thecall
tofitMeasures
that returnsfit.measures
.@dims
The dimensions of the object returned by each
discFUN
.@PPP
The posterior predictive p value for each
discFUN
element.@obsDist
The posterior distribution of realize values of
discFUN
applied to observed data.@simDist
The posterior predictive distribution of values of
discFUN
applied to data simulated from the posterior samples.
The summary()
method returns a numeric
vector if discFUN
returns a scalar, a data.frame
with one discrepancy function per row
if discFUN
returns a numeric
vector, and a list
with
one summary statistic per element if discFUN
returns a matrix
or multidimensional array
.
The plot
and pairs
methods invisibly return NULL
,
printing a plot (or scatterplot matrix) to the current device.
The hist
method invisibly returns a list
or arguments that can
be passed to the function for which the list
element is named. Users
can edit the arguments in the list to customize their histograms.
Author
Terrence D. Jorgensen (University of Amsterdam; TJorgensen314@gmail.com)
References
Levy, R. (2011). Bayesian data–model fit assessment for structural equation modeling. Structural Equation Modeling, 18(4), 663–685. doi:10.1080/10705511.2011.607723
Examples
if (FALSE) { # \dontrun{
data(HolzingerSwineford1939, package = "lavaan")
HS.model <- ' visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9 '
## fit single-group model
fit <- bcfa(HS.model, data = HolzingerSwineford1939,
n.chains = 2, burnin = 1000, sample = 500)
## fit multigroup model
fitg <- bcfa(HS.model, data = HolzingerSwineford1939,
n.chains = 2, burnin = 1000, sample = 500, group = "school")
## Use fit.measures as a shortcut for global fitMeasures only
## - Note that indices calculated from the "df" are only appropriate under
## noninformative priors, such that pD approximates the number of estimated
## parameters counted under ML estimation; incremental fit indices
## introduce further complications)
AFIs <- ppmc(fit, thin = 10, fit.measures = c("srmr","chisq","rmsea","cfi"))
summary(AFIs) # summarize the whole vector in a data.frame
hist(AFIs, element = "rmsea") # only plot one discrepancy function at a time
plot(AFIs, element = "srmr")
## define a list of custom discrepancy functions
## - (global) fit measures
## - (local) standardized residuals
discFUN <- list(global = function(fit) {
fitMeasures(fit, fit.measures = c("cfi","rmsea","srmr","chisq"))
},
std.cov.resid = function(fit) lavResiduals(fit, zstat = FALSE,
summary = FALSE)$cov,
std.mean.resid = function(fit) lavResiduals(fit, zstat = FALSE,
summary = FALSE)$mean)
out1g <- ppmc(fit, discFUN = discFUN)
## summarize first discrepancy by default (fit indices)
summary(out1g)
## some model-implied correlations look systematically over/underestimated
summary(out1g, discFUN = "std.cov.resid", central.tendency = "EAP")
hist(out1g, discFUN = "std.cov.resid", element = c(1, 7))
plot(out1g, discFUN = "std.cov.resid", element = c("x1","x7"))
## For ease of investigation, optionally export summary as a data.frame,
## sorted by size of average residual
summary(out1g, discFUN = "std.cov.resid", central.tendency = "EAP",
to.data.frame = TRUE, sort.by = "EAP")
## or sorted by size of PPP
summary(out1g, discFUN = "std.cov.resid", central.tendency = "EAP",
to.data.frame = TRUE, sort.by = "PPP_sim_LessThan_obs")
## define a list of custom discrepancy functions for multiple groups
## (return each group's numeric output using a different function)
disc2g <- list(global = function(fit) {
fitMeasures(fit, fit.measures = c("cfi","rmsea","mfi","srmr","chisq"))
},
cor.resid1 = function(fit) lavResiduals(fit, zstat = FALSE,
type = "cor.bollen",
summary = FALSE)[[1]]$cov,
cor.resid2 = function(fit) lavResiduals(fit, zstat = FALSE,
type = "cor.bollen",
summary = FALSE)[[2]]$cov)
out2g <- ppmc(fitg, discFUN = disc2g, thin = 2)
## some residuals look like a bigger problem in one group than another
pairs(out2g, discFUN = "cor.resid1", horInd = 1:3, verInd = 7:9) # group 1
pairs(out2g, discFUN = "cor.resid2", horInd = 1:3, verInd = 7:9) # group 2
## print all to file: must be a LARGE picture. First group 1 ...
png("cor.resid1.png", width = 1600, height = 1200)
pairs(out2g, discFUN = "cor.resid1")
dev.off()
## ... then group 2
png("cor.resid2.png", width = 1600, height = 1200)
pairs(out2g, discFUN = "cor.resid2")
dev.off()
} # }