In this hands-on, we are going to further validate our model based on the EFA findings. The same data set, “Attitude_Statistics v3.sav” will be used.
The evidence of internal structure will be provided by confirmatory factor analysis and composite reliability.
We are going to use psych (Revelle, 2023), lavaan version 0.6.15 (Rosseel, Jorgensen, & Rockwood, 2023), semTools version 0.5.6 (Jorgensen, Pornprasertmanit, Schoemann, & Rosseel, 2022) and semPlot version 1.1.6 (Epskamp, 2022) in our analysis. These packages can be downloaded and installed from CRAN at https://cran.r-project.org/. Make sure you have installed all of them before loading the packages.
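If any of them are not installed yet, a one-off installation from CRAN can be done, for example:
# install the required packages from CRAN (run once, only if not yet installed)
install.packages(c("foreign", "psych", "lavaan", "semTools", "semPlot"))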
library(foreign) # to read the SPSS .sav file
library(psych) # for descriptive statistics
library(lavaan) # for CFA
library(semTools) # for additional functions in SEM
library(semPlot) # for path diagram
We include only the good items from PA1 and PA2 (as selected in the EFA) in the data.cfa data frame.
data = read.spss("Attitude_Statistics v3.sav", F, T) # shortform of use.value.labels = FALSE, to.data.frame = TRUE
# Include selected items from PA1 & PA2 in "data.cfa"
data.cfa = data[c("Q4","Q5","Q6","Q7","Q8","Q9","Q10","Q11")]
str(data.cfa)
## 'data.frame': 150 obs. of 8 variables:
## $ Q4 : num 3 3 1 4 2 3 3 2 4 4 ...
## ..- attr(*, "value.labels")= Named num [1:5] 5 4 3 2 1
## .. ..- attr(*, "names")= chr [1:5] "STRONGLY AGREE" "AGREE" "NEUTRAL" "DISAGREE" ...
## $ Q5 : num 4 4 1 3 5 4 4 3 3 4 ...
## ..- attr(*, "value.labels")= Named num [1:5] 5 4 3 2 1
## .. ..- attr(*, "names")= chr [1:5] "STRONGLY AGREE" "AGREE" "NEUTRAL" "DISAGREE" ...
## $ Q6 : num 4 4 1 2 1 4 3 3 4 5 ...
## ..- attr(*, "value.labels")= Named num [1:5] 5 4 3 2 1
## .. ..- attr(*, "names")= chr [1:5] "STRONGLY AGREE" "AGREE" "NEUTRAL" "DISAGREE" ...
## $ Q7 : num 3 4 1 2 4 4 4 2 3 4 ...
## ..- attr(*, "value.labels")= Named num [1:5] 5 4 3 2 1
## .. ..- attr(*, "names")= chr [1:5] "STRONGLY AGREE" "AGREE" "NEUTRAL" "DISAGREE" ...
## $ Q8 : num 3 3 4 2 5 3 3 3 3 4 ...
## ..- attr(*, "value.labels")= Named num [1:5] 5 4 3 2 1
## .. ..- attr(*, "names")= chr [1:5] "STRONGLY AGREE" "AGREE" "NEUTRAL" "DISAGREE" ...
## $ Q9 : num 3 3 4 2 5 4 3 4 5 4 ...
## ..- attr(*, "value.labels")= Named num [1:5] 5 4 3 2 1
## .. ..- attr(*, "names")= chr [1:5] "STRONGLY AGREE" "AGREE" "NEUTRAL" "DISAGREE" ...
## $ Q10: num 3 3 5 2 3 4 3 4 4 4 ...
## ..- attr(*, "value.labels")= Named num [1:5] 5 4 3 2 1
## .. ..- attr(*, "names")= chr [1:5] "STRONGLY AGREE" "AGREE" "NEUTRAL" "DISAGREE" ...
## $ Q11: num 4 4 1 3 4 4 3 3 4 4 ...
## ..- attr(*, "value.labels")= Named num [1:5] 5 4 3 2 1
## .. ..- attr(*, "names")= chr [1:5] "STRONGLY AGREE" "AGREE" "NEUTRAL" "DISAGREE" ...
dim(data.cfa)
## [1] 150 8
names(data.cfa)
## [1] "Q4" "Q5" "Q6" "Q7" "Q8" "Q9" "Q10" "Q11"
head(data.cfa)
## Q4 Q5 Q6 Q7 Q8 Q9 Q10 Q11
## 1 3 4 4 3 3 3 3 4
## 2 3 4 4 4 3 3 3 4
## 3 1 1 1 1 4 4 5 1
## 4 4 3 2 2 2 2 2 3
## 5 2 5 1 4 5 5 3 4
## 6 3 4 4 4 3 4 4 4
Descriptive statistics
Check minimum/maximum values per item, and screen for any missing values,
describe(data.cfa)
## vars n mean sd median trimmed mad min max range skew kurtosis se
## Q4 1 150 2.81 1.17 3 2.78 1.48 1 5 4 0.19 -0.81 0.10
## Q5 2 150 3.31 1.01 3 3.32 1.48 1 5 4 -0.22 -0.48 0.08
## Q6 3 150 3.05 1.09 3 3.05 1.48 1 5 4 -0.04 -0.71 0.09
## Q7 4 150 2.92 1.19 3 2.92 1.48 1 5 4 -0.04 -1.06 0.10
## Q8 5 150 3.33 1.00 3 3.34 1.48 1 5 4 -0.08 -0.12 0.08
## Q9 6 150 3.44 1.05 3 3.48 1.48 1 5 4 -0.21 -0.32 0.09
## Q10 7 150 3.31 1.10 3 3.36 1.48 1 5 4 -0.22 -0.39 0.09
## Q11 8 150 3.35 0.94 3 3.37 1.48 1 5 4 -0.31 -0.33 0.08
Note that all n = 150, i.e. there are no missing values. The min and max values cover the whole range of the response options.
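As an additional check (a minimal sketch), the number of missing values per item can also be counted directly:
colSums(is.na(data.cfa)) # missing values per item; all should be 0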
Next, check the percentage of responses to each response option per item,
response.frequencies(data.cfa)
## 1 2 3 4 5 miss
## Q4 0.140 0.280 0.30 0.19 0.093 0
## Q5 0.040 0.167 0.35 0.33 0.113 0
## Q6 0.080 0.233 0.33 0.26 0.093 0
## Q7 0.133 0.267 0.23 0.29 0.080 0
## Q8 0.047 0.100 0.48 0.23 0.147 0
## Q9 0.047 0.093 0.42 0.25 0.187 0
## Q10 0.073 0.107 0.42 0.23 0.167 0
## Q11 0.027 0.153 0.35 0.39 0.087 0
All response options are used, and there are no missing values.
Multivariate normality
This is done to check the multivariate normality of the data. If the data are multivariate normal, we may use the maximum likelihood (ML) estimation method for the CFA. In lavaan, we have a number of alternative estimation methods (the full list is available at http://lavaan.ugent.be/tutorial/est.html or by typing ?lavOptions). Two common alternatives for non-normal data are MLM (ML with robust standard errors and a Satorra-Bentler scaled test statistic) and MLR (ML with robust standard errors and a Yuan-Bentler scaled test statistic).
mardia(data.cfa)
## Call: mardia(x = data.cfa)
##
## Mardia tests of multivariate skew and kurtosis
## Use describe(x) the to get univariate tests
## n.obs = 150 num.vars = 8
## b1p = 11.37 skew = 284.27 with probability <= 2.3e-15
## small sample skew = 291.24 with probability <= 2.9e-16
## b2p = 96.9 kurtosis = 8.18 with probability <= 2.2e-16
The data are not multivariate normal (kurtosis = 8.18 > 5, P < 0.05). We will therefore use MLR in our analysis.
Specify the measurement model
Specify the measurement model according to the lavaan syntax.
model = "
PA1 =~ Q4 + Q5 + Q6 + Q7 + Q11
PA2 =~ Q8 + Q9 + Q10
"
=~ indicates "measured by"; thus the items on the right-hand side represent the factor on the left.
By default, lavaan will correlate PA1 and PA2 (i.e. PA1 ~~ PA2), somewhat similar to an oblique rotation in EFA. ~~ denotes a covariance/correlation between two variables; we will use ~~ again when we add correlated errors later.
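For illustration of the ~~ operator (not required here, since lavaan adds it by default), the factor correlation could be written explicitly; this hypothetical specification is equivalent to the model above:
model.explicit = "
PA1 =~ Q4 + Q5 + Q6 + Q7 + Q11
PA2 =~ Q8 + Q9 + Q10
PA1 ~~ PA2 # factor covariance/correlation, estimated by default anyway
"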
Fit the model
Here, we fit the specified model. By default, the marker indicator variable approach¹ is used in lavaan to scale a factor². We use MLR as the estimation method.
cfa.model = cfa(model, data = data.cfa, estimator = "MLR")
# cfa.model = cfa(model, data = data.cfa, estimator = "MLR", std.lv = TRUE) # alternative: fix the factor variances to 1
summary(cfa.model, fit.measures = T, standardized = T)
## lavaan 0.6.15 ended normally after 21 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 17
##
## Number of observations 150
##
## Model Test User Model:
## Standard Scaled
## Test Statistic 37.063 27.373
## Degrees of freedom 19 19
## P-value (Chi-square) 0.008 0.096
## Scaling correction factor 1.354
## Yuan-Bentler correction (Mplus variant)
##
## Model Test Baseline Model:
##
## Test statistic 453.795 325.195
## Degrees of freedom 28 28
## P-value 0.000 0.000
## Scaling correction factor 1.395
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.958 0.972
## Tucker-Lewis Index (TLI) 0.937 0.958
##
## Robust Comparative Fit Index (CFI) 0.973
## Robust Tucker-Lewis Index (TLI) 0.960
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -1566.019 -1566.019
## Scaling correction factor 1.140
## for the MLR correction
## Loglikelihood unrestricted model (H1) -1547.487 -1547.487
## Scaling correction factor 1.253
## for the MLR correction
##
## Akaike (AIC) 3166.037 3166.037
## Bayesian (BIC) 3217.218 3217.218
## Sample-size adjusted Bayesian (SABIC) 3163.416 3163.416
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.080 0.054
## 90 Percent confidence interval - lower 0.040 0.000
## 90 Percent confidence interval - upper 0.118 0.091
## P-value H_0: RMSEA <= 0.050 0.098 0.397
## P-value H_0: RMSEA >= 0.080 0.527 0.133
##
## Robust RMSEA 0.063
## 90 Percent confidence interval - lower 0.000
## 90 Percent confidence interval - upper 0.112
## P-value H_0: Robust RMSEA <= 0.050 0.312
## P-value H_0: Robust RMSEA >= 0.080 0.320
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.079 0.079
##
## Parameter Estimates:
##
## Standard errors Sandwich
## Information bread Observed
## Observed information based on Hessian
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## PA1 =~
## Q4 1.000 0.952 0.814
## Q5 0.660 0.092 7.218 0.000 0.629 0.624
## Q6 0.810 0.090 9.010 0.000 0.771 0.708
## Q7 0.916 0.086 10.641 0.000 0.872 0.735
## Q11 0.533 0.093 5.719 0.000 0.507 0.544
## PA2 =~
## Q8 1.000 0.653 0.655
## Q9 1.347 0.156 8.654 0.000 0.880 0.844
## Q10 1.436 0.199 7.206 0.000 0.938 0.856
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## PA1 ~~
## PA2 0.077 0.075 1.035 0.301 0.124 0.124
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .Q4 0.460 0.089 5.182 0.000 0.460 0.337
## .Q5 0.620 0.086 7.178 0.000 0.620 0.611
## .Q6 0.590 0.101 5.836 0.000 0.590 0.498
## .Q7 0.647 0.125 5.181 0.000 0.647 0.460
## .Q11 0.611 0.077 7.934 0.000 0.611 0.704
## .Q8 0.567 0.101 5.628 0.000 0.567 0.570
## .Q9 0.312 0.094 3.325 0.001 0.312 0.287
## .Q10 0.321 0.106 3.046 0.002 0.321 0.267
## PA1 0.906 0.137 6.587 0.000 1.000 1.000
## PA2 0.427 0.106 4.009 0.000 1.000 1.000
Results
Read the results marked as Scaled; these represent the results of the MLR estimation.
To interpret the results, we must look at the fit indices, the factor loadings (Std.all column under the Latent Variables table) and the factor correlation (Std.all column under the Covariances table).
1. Fit indices
The following are selected fit indices and the recommended cut-off values (Brown, 2015; Schreiber, Nora, Stage, Barlow, & King, 2006),
Category | Fit index | Cut-off |
---|---|---|
Absolute fit | \(\chi^2\) | \(P>0.05\) |
| Standardized root mean square residual (SRMR) | \(\leq 0.08\) |
Parsimony correction | Root mean square error of approximation (RMSEA) | \(\leq 0.08\) and its 90% CI \(\leq 0.08\), CFit \(P>0.05\) (\(H_0\): RMSEA \(\leq 0.05\)) |
Comparative fit | Comparative fit index (CFI) | \(\geq 0.95\) |
| Tucker-Lewis index (TLI) | \(\geq 0.95\) |
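Once a model has been fitted, the relevant indices can also be extracted directly with lavaan's fitMeasures(); a minimal sketch (the measure names assume lavaan 0.6.x):
fitMeasures(cfa.model, c("chisq.scaled", "df.scaled", "pvalue.scaled",
                         "cfi.robust", "tli.robust", "rmsea.robust",
                         "rmsea.ci.upper.robust", "srmr"))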
2. Localized areas of misfit (Brown, 2015)
Residuals are the differences between the values in the sample and model-implied variance-covariance matrices.
Standardized residuals (SRs) express these discrepancies in a form analogous to z-scores; absolute SRs > 2.58 indicate localized areas of misfit between the matrices.
A modification index (MI) estimates how much the model \(\chi^2\) would improve if a particular constrained/fixed parameter were freely estimated (e.g. by correlating the errors of Q1 and Q2); the accompanying expected parameter change (EPC) gives the estimated value of that freed parameter.
Specifications with MIs > 3.84 should be investigated.
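The 3.84 cut-off is simply the \(\chi^2\) critical value at \(\alpha\) = 0.05 with 1 degree of freedom, which can be verified in R:
qchisq(0.95, df = 1) # critical chi-square value with 1 df, approximately 3.84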
3. Parameter estimates
Factor loadings (FLs; read the Std.all column under the Latent Variables table). The guideline for EFA is applicable also to CFA; for example, FLs \(\geq\) 0.5 are practically significant. In addition, the P-values of the FLs must be significant (at \(\alpha\) = 0.05).
Also look for out-of-range values. FLs should be in the range of 0 to 1 in absolute value; values > 1 are called Heywood cases or offending estimates (Brown, 2015).
Factor correlations (read the Std.all column under the Covariances table). Similar to EFA, a factor correlation must be < 0.85, which indicates that the factors are distinct; a correlation > 0.85 indicates a multicollinearity problem. Also look for out-of-range values: factor correlations should be in the range of 0 to 1 in absolute value.
In addition, when a model has Heywood cases, the solution is not acceptable. The variance-covariance matrix (of our data) could be non-positive definite, i.e. the matrix is not invertible for the analysis.
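If desired, the standardized loading and factor correlation matrices can also be pulled out programmatically; a minimal sketch using lavaan's lavInspect():
std.est = lavInspect(cfa.model, what = "std")
std.est$lambda # standardized factor loadings
std.est$psi # standardized factor variances/covariances, i.e. the factor correlations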
In our output:
Fit indices,
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 17
##
## Number of observations 150
## Model Test User Model:
## Standard Scaled
## Test Statistic 37.063 27.373
## Degrees of freedom 19 19
## P-value (Chi-square) 0.008 0.096
## Scaling correction factor 1.354
## Yuan-Bentler correction (Mplus variant)
## Model Test Baseline Model:
## Test statistic 453.795 325.195
## Degrees of freedom 28 28
## P-value 0.000 0.000
## Scaling correction factor 1.395
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.958 0.972
## Tucker-Lewis Index (TLI) 0.937 0.958
##
## Robust Comparative Fit Index (CFI) 0.973
## Robust Tucker-Lewis Index (TLI) 0.960
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -1566.019 -1566.019
## Scaling correction factor 1.140
## for the MLR correction
## Loglikelihood unrestricted model (H1) -1547.487 -1547.487
## Scaling correction factor 1.253
## for the MLR correction
##
## Akaike (AIC) 3166.037 3166.037
## Bayesian (BIC) 3217.218 3217.218
## Sample-size adjusted Bayesian (SABIC) 3163.416 3163.416
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.080 0.054
## 90 Percent confidence interval - lower 0.040 0.000
## 90 Percent confidence interval - upper 0.118 0.091
## P-value H_0: RMSEA <= 0.050 0.098 0.397
## P-value H_0: RMSEA >= 0.080 0.527 0.133
##
## Robust RMSEA 0.063
## 90 Percent confidence interval - lower 0.000
## 90 Percent confidence interval - upper 0.112
## P-value H_0: Robust RMSEA <= 0.050 0.312
## P-value H_0: Robust RMSEA >= 0.080 0.320
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.079 0.079
The model shows good fit on all indices, with the exception of the upper limit of the 90% CI of the robust RMSEA (0.112 > 0.08). However, the CFit P-value for the robust RMSEA is 0.312, which means we cannot reject \(H_0\): RMSEA \(\leq 0.05\).
FLs and factor correlation,
## Parameter Estimates:
## Standard errors Sandwich
## Information bread Observed
## Observed information based on Hessian
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## PA1 =~
## Q4 1.000 0.952 0.814
## Q5 0.660 0.092 7.218 0.000 0.629 0.624
## Q6 0.810 0.090 9.010 0.000 0.771 0.708
## Q7 0.916 0.086 10.641 0.000 0.872 0.735
## Q11 0.533 0.093 5.719 0.000 0.507 0.544
## PA2 =~
## Q8 1.000 0.653 0.655
## Q9 1.347 0.156 8.654 0.000 0.880 0.844
## Q10 1.436 0.199 7.206 0.000 0.938 0.856
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## PA1 ~~
## PA2 0.077 0.075 1.035 0.301 0.124 0.124
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .Q4 0.460 0.089 5.182 0.000 0.460 0.337
## .Q5 0.620 0.086 7.178 0.000 0.620 0.611
## .Q6 0.590 0.101 5.836 0.000 0.590 0.498
## .Q7 0.647 0.125 5.181 0.000 0.647 0.460
## .Q11 0.611 0.077 7.934 0.000 0.611 0.704
## .Q8 0.567 0.101 5.628 0.000 0.567 0.570
## .Q9 0.312 0.094 3.325 0.001 0.312 0.287
## .Q10 0.321 0.106 3.046 0.002 0.321 0.267
## PA1 0.906 0.137 6.587 0.000 1.000 1.000
## PA2 0.427 0.106 4.009 0.000 1.000 1.000
Remember to read the results down the Std.all column. All FLs > 0.5 and the factor correlation < 0.85. There is no problem with the item quality, and the factors are distinct.
In addition, to obtain the standardized results with SE,
standardizedSolution(cfa.model) # standardized, to view the SE of FL
## lhs op rhs est.std se z pvalue ci.lower ci.upper
## 1 PA1 =~ Q4 0.814 0.041 20.017 0.000 0.735 0.894
## 2 PA1 =~ Q5 0.624 0.068 9.117 0.000 0.490 0.758
## 3 PA1 =~ Q6 0.708 0.060 11.867 0.000 0.591 0.825
## 4 PA1 =~ Q7 0.735 0.059 12.523 0.000 0.620 0.850
## 5 PA1 =~ Q11 0.544 0.080 6.781 0.000 0.387 0.702
## 6 PA2 =~ Q8 0.655 0.067 9.712 0.000 0.523 0.788
## 7 PA2 =~ Q9 0.844 0.048 17.646 0.000 0.751 0.938
## 8 PA2 =~ Q10 0.856 0.048 17.670 0.000 0.761 0.951
## 9 Q4 ~~ Q4 0.337 0.066 5.078 0.000 0.207 0.467
## 10 Q5 ~~ Q5 0.611 0.085 7.156 0.000 0.444 0.778
## 11 Q6 ~~ Q6 0.498 0.085 5.894 0.000 0.333 0.664
## 12 Q7 ~~ Q7 0.460 0.086 5.324 0.000 0.290 0.629
## 13 Q11 ~~ Q11 0.704 0.087 8.049 0.000 0.532 0.875
## 14 Q8 ~~ Q8 0.570 0.088 6.446 0.000 0.397 0.744
## 15 Q9 ~~ Q9 0.287 0.081 3.550 0.000 0.128 0.445
## 16 Q10 ~~ Q10 0.267 0.083 3.226 0.001 0.105 0.430
## 17 PA1 ~~ PA1 1.000 0.000 NA NA 1.000 1.000
## 18 PA2 ~~ PA2 1.000 0.000 NA NA 1.000 1.000
## 19 PA1 ~~ PA2 0.124 0.113 1.099 0.272 -0.098 0.346
and, to obtain the unstandardized results with 95% CI,
parameterEstimates(cfa.model) # unstandardized, to view the 95% CI
## lhs op rhs est se z pvalue ci.lower ci.upper
## 1 PA1 =~ Q4 1.000 0.000 NA NA 1.000 1.000
## 2 PA1 =~ Q5 0.660 0.092 7.218 0.000 0.481 0.840
## 3 PA1 =~ Q6 0.810 0.090 9.010 0.000 0.634 0.986
## 4 PA1 =~ Q7 0.916 0.086 10.641 0.000 0.748 1.085
## 5 PA1 =~ Q11 0.533 0.093 5.719 0.000 0.350 0.716
## 6 PA2 =~ Q8 1.000 0.000 NA NA 1.000 1.000
## 7 PA2 =~ Q9 1.347 0.156 8.654 0.000 1.042 1.652
## 8 PA2 =~ Q10 1.436 0.199 7.206 0.000 1.046 1.827
## 9 Q4 ~~ Q4 0.460 0.089 5.182 0.000 0.286 0.633
## 10 Q5 ~~ Q5 0.620 0.086 7.178 0.000 0.451 0.789
## 11 Q6 ~~ Q6 0.590 0.101 5.836 0.000 0.392 0.788
## 12 Q7 ~~ Q7 0.647 0.125 5.181 0.000 0.402 0.891
## 13 Q11 ~~ Q11 0.611 0.077 7.934 0.000 0.460 0.762
## 14 Q8 ~~ Q8 0.567 0.101 5.628 0.000 0.369 0.764
## 15 Q9 ~~ Q9 0.312 0.094 3.325 0.001 0.128 0.495
## 16 Q10 ~~ Q10 0.321 0.106 3.046 0.002 0.115 0.528
## 17 PA1 ~~ PA1 0.906 0.137 6.587 0.000 0.636 1.175
## 18 PA2 ~~ PA2 0.427 0.106 4.009 0.000 0.218 0.635
## 19 PA1 ~~ PA2 0.077 0.075 1.035 0.301 -0.069 0.224
Localized areas of misfit,
mi = modificationIndices(cfa.model)
subset(mi, mi > 3.84)
## lhs op rhs mi epc sepc.lv sepc.all sepc.nox
## 20 PA1 =~ Q8 10.264 0.244 0.232 0.233 0.233
## 24 PA2 =~ Q5 8.359 0.332 0.217 0.215 0.215
## 37 Q5 ~~ Q11 6.301 0.144 0.144 0.234 0.234
## 55 Q9 ~~ Q10 10.264 2.325 2.325 7.346 7.346
sr = residuals(cfa.model, type="standardized"); sr
## $type
## [1] "standardized"
##
## $cov
## Q4 Q5 Q6 Q7 Q11 Q8 Q9 Q10
## Q4 0.000
## Q5 0.007 0.000
## Q6 -0.168 -0.369 0.000
## Q7 0.238 -1.097 0.710 0.000
## Q11 -0.079 1.924 -0.945 -0.628 0.000
## Q8 1.918 3.455 1.638 1.513 1.197 0.000
## Q9 -0.972 2.224 0.141 -1.508 1.092 -0.131 0.000
## Q10 -1.263 1.691 -1.250 -1.586 0.400 -0.047 0.063 0.000
There are four suggested specifications with MIs > 3.84. We may ignore PA1 =~ Q8 and PA2 =~ Q5 based on content, because it is not justifiable to let these two items load on the other factor. Q9 ~~ Q10 is justifiable based on the wording "is important" shared by both items, but Q5 ~~ Q11 is not justifiable.
Q5 also has an SR > 2.58 with Q8 (SR = 3.455), so we may focus on Q5. Avoid removing Q8, because PA2 has only three items and the FL of Q8 is larger than that of Q5.
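A specific standardized residual can also be pulled out programmatically (a small sketch; sr was created above):
sr$cov["Q8", "Q5"] # standardized residual between Q8 and Q5, about 3.455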
Whenever the model does not fit well, we must revise the model. To do so, we must look for the causes of the poor fit to the data. In CFA, the causes could be poorly performing items (low FLs), items loading on the wrong factor, or unmodelled correlated errors between items. These problems might not surface if a proper EFA is done in the first place and the model is theoretically sound.
Model-to-model comparison following revision is done based on the \(\chi^2\) difference test for nested models³ and on information criteria (AIC and BIC) for non-nested models.
Model revision
Revision 1: Based on the MI, add Q9 ~~ Q10?
Both items are from PA2, and the correlated error is reasonable given the wording of the questions.
model1 = "
PA1 =~ Q4 + Q5 + Q6 + Q7 + Q11
PA2 =~ Q8 + Q9 + Q10
Q9 ~~ Q10
"
cfa.model1 = cfa(model1, data = data.cfa, estimator = "MLR")
## Warning in lav_object_post_check(object): lavaan WARNING: some estimated ov variances are
## negative
summary(cfa.model1, fit.measures=T, standardized=T)
Take note of the warning message above! It points to a problem in our model: here we have a negative variance. Remember that a variance is the square of a standard deviation, thus it is impossible to have a negative variance!
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 18
##
## Number of observations 150
## Model Test User Model:
## Standard Scaled
## Test Statistic 26.487 19.771
## Degrees of freedom 18 18
## P-value (Chi-square) 0.089 0.346
## Scaling correction factor 1.340
## Yuan-Bentler correction (Mplus variant)
## Model Test Baseline Model:
## Test statistic 453.795 325.195
## Degrees of freedom 28 28
## P-value 0.000 0.000
## Scaling correction factor 1.395
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.980 0.994
## Tucker-Lewis Index (TLI) 0.969 0.991
##
## Robust Comparative Fit Index (CFI) 0.994
## Robust Tucker-Lewis Index (TLI) 0.991
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -1560.730 -1560.730
## Scaling correction factor 1.166
## for the MLR correction
## Loglikelihood unrestricted model (H1) -1547.487 -1547.487
## Scaling correction factor 1.253
## for the MLR correction
##
## Akaike (AIC) 3157.461 3157.461
## Bayesian (BIC) 3211.652 3211.652
## Sample-size adjusted Bayesian (SABIC) 3154.685 3154.685
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.056 0.026
## 90 Percent confidence interval - lower 0.000 0.000
## 90 Percent confidence interval - upper 0.099 0.073
## P-value H_0: RMSEA <= 0.050 0.376 0.754
## P-value H_0: RMSEA >= 0.080 0.200 0.024
##
## Robust RMSEA 0.030
## 90 Percent confidence interval - lower 0.000
## 90 Percent confidence interval - upper 0.091
## P-value H_0: Robust RMSEA <= 0.050 0.637
## P-value H_0: Robust RMSEA >= 0.080 0.105
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.058 0.058
The upper 90% CI of RMSEA is smaller, but
## Parameter Estimates:
## Standard errors Sandwich
## Information bread Observed
## Observed information based on Hessian
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## PA1 =~
## Q4 1.000 0.950 0.813
## Q5 0.663 0.089 7.470 0.000 0.630 0.626
## Q6 0.811 0.091 8.876 0.000 0.771 0.708
## Q7 0.923 0.089 10.409 0.000 0.877 0.739
## Q11 0.528 0.092 5.729 0.000 0.502 0.538
## PA2 =~
## Q8 1.000 1.517 1.522
## Q9 0.247 0.314 0.786 0.432 0.374 0.359
## Q10 0.267 0.332 0.803 0.422 0.404 0.369
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .Q9 ~~
## .Q10 0.678 0.198 3.419 0.001 0.678 0.683
## PA1 ~~
## PA2 0.267 0.096 2.787 0.005 0.185 0.185
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .Q4 0.462 0.087 5.329 0.000 0.462 0.339
## .Q5 0.618 0.084 7.331 0.000 0.618 0.609
## .Q6 0.590 0.104 5.682 0.000 0.590 0.498
## .Q7 0.639 0.125 5.101 0.000 0.639 0.454
## .Q11 0.617 0.077 8.008 0.000 0.617 0.710
## .Q8 -1.307 2.801 -0.467 0.641 -1.307 -1.316
## .Q9 0.947 0.198 4.782 0.000 0.947 0.871
## .Q10 1.038 0.229 4.531 0.000 1.038 0.864
## PA1 0.903 0.138 6.556 0.000 1.000 1.000
## PA2 2.300 2.779 0.828 0.408 1.000 1.000
## NA
we have a serious Heywood case here! The standardized FL of Q8 = 1.522 (> 1) and its residual variance is negative. Thus this solution is not acceptable.
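To locate the offending estimate directly (a minimal sketch), the negative residual variances can be filtered from the parameter table:
pe1 = parameterEstimates(cfa.model1)
subset(pe1, op == "~~" & lhs == rhs & est < 0) # residual variances below zero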
Revision 2: Remove Q5?
model2 = "
PA1 =~ Q4 + Q6 + Q7 + Q11
PA2 =~ Q8 + Q9 + Q10
"
cfa.model2 = cfa(model2, data = data.cfa, estimator = "MLR")
summary(cfa.model2, fit.measures=T, standardized=T)
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 15
##
## Number of observations 150
## Model Test User Model:
## Standard Scaled
## Test Statistic 20.451 14.467
## Degrees of freedom 13 13
## P-value (Chi-square) 0.085 0.342
## Scaling correction factor 1.414
## Yuan-Bentler correction (Mplus variant)
## Model Test Baseline Model:
## Test statistic 380.263 260.751
## Degrees of freedom 21 21
## P-value 0.000 0.000
## Scaling correction factor 1.458
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.979 0.994
## Tucker-Lewis Index (TLI) 0.966 0.990
##
## Robust Comparative Fit Index (CFI) 0.994
## Robust Tucker-Lewis Index (TLI) 0.990
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -1380.510 -1380.510
## Scaling correction factor 1.167
## for the MLR correction
## Loglikelihood unrestricted model (H1) -1370.284 -1370.284
## Scaling correction factor 1.281
## for the MLR correction
##
## Akaike (AIC) 2791.020 2791.020
## Bayesian (BIC) 2836.179 2836.179
## Sample-size adjusted Bayesian (SABIC) 2788.707 2788.707
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.062 0.027
## 90 Percent confidence interval - lower 0.000 0.000
## 90 Percent confidence interval - upper 0.111 0.079
## P-value H_0: RMSEA <= 0.050 0.314 0.704
## P-value H_0: RMSEA >= 0.080 0.306 0.048
##
## Robust RMSEA 0.033
## 90 Percent confidence interval - lower 0.000
## 90 Percent confidence interval - upper 0.104
## P-value H_0: Robust RMSEA <= 0.050 0.579
## P-value H_0: Robust RMSEA >= 0.080 0.173
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.063 0.063
The upper 90% CI of the robust RMSEA has decreased from 0.112 to 0.104.
## Parameter Estimates:
## Standard errors Sandwich
## Information bread Observed
## Observed information based on Hessian
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## PA1 =~
## Q4 1.000 0.941 0.806
## Q6 0.830 0.099 8.391 0.000 0.781 0.718
## Q7 0.960 0.100 9.597 0.000 0.904 0.762
## Q11 0.504 0.091 5.547 0.000 0.474 0.509
## PA2 =~
## Q8 1.000 0.651 0.653
## Q9 1.351 0.155 8.692 0.000 0.880 0.844
## Q10 1.444 0.202 7.143 0.000 0.940 0.858
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## PA1 ~~
## PA2 0.048 0.068 0.705 0.481 0.078 0.078
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .Q4 0.479 0.101 4.734 0.000 0.479 0.351
## .Q6 0.574 0.106 5.418 0.000 0.574 0.485
## .Q7 0.591 0.134 4.403 0.000 0.591 0.420
## .Q11 0.644 0.079 8.180 0.000 0.644 0.741
## .Q8 0.569 0.101 5.657 0.000 0.569 0.573
## .Q9 0.313 0.096 3.266 0.001 0.313 0.288
## .Q10 0.318 0.108 2.940 0.003 0.318 0.264
## PA1 0.886 0.146 6.052 0.000 1.000 1.000
## PA2 0.424 0.106 4.006 0.000 1.000 1.000
The FLs and factor correlation are acceptable, and there is no Heywood case.
mi2 = modificationIndices(cfa.model2)
subset(mi2, mi > 3.84)
## lhs op rhs mi epc sepc.lv sepc.all sepc.nox
## 18 PA1 =~ Q8 9.707 0.241 0.227 0.228 0.228
## 45 Q9 ~~ Q10 9.707 3.661 3.661 11.615 11.615
sr2 = residuals(cfa.model2, type="standardized"); sr2
## $type
## [1] "standardized"
##
## $cov
## Q4 Q6 Q7 Q11 Q8 Q9 Q10
## Q4 0.000
## Q6 -0.225 0.000
## Q7 -0.088 0.429 0.000
## Q11 0.628 -0.412 -0.440 0.000
## Q8 2.186 1.860 1.767 1.421 0.000
## Q9 -0.485 0.506 -1.207 1.419 -0.154 0.000
## Q10 -0.571 -0.829 -1.159 0.693 -0.076 0.074 0.000
There is no SR > 2.58.
So we may stop at model2; although the upper 90% CI of the robust RMSEA is still > 0.08, there are no more localized areas of misfit by SR.
Model-to-model comparison
Because model2 is not nested within model (they are based on different sets of observed variables), we compare them by AIC and BIC,
anova(cfa.model, cfa.model2)
## Warning in lavTestLRT(object = object, ..., model.names = NAMES): lavaan WARNING: some
## models are based on a different set of observed variables
##
## Scaled Chi-Squared Difference Test (method = "satorra.bentler.2001")
##
## lavaan NOTE:
## The "Chisq" column contains standard test statistics, not the
## robust test that should be reported per model. A robust difference
## test is a function of two standard (not robust) statistics.
##
## Df AIC BIC Chisq Chisq diff Df diff Pr(>Chisq)
## cfa.model2 13 2791 2836.2 20.451
## cfa.model 19 3166 3217.2 37.063 13.562 6 0.03493 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Clearly, both the AIC and BIC are reduced for model2 (without Q5) compared to model (with Q5). Please ignore the \(\chi^2\) difference result, because the two models are based on different sets of observed variables (see the warning above).
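Because the \(\chi^2\) difference is not meaningful here, the information criteria can also be compared directly; a minimal sketch using the AIC() and BIC() methods for lavaan objects:
c(AIC.model = AIC(cfa.model), AIC.model2 = AIC(cfa.model2))
c(BIC.model = BIC(cfa.model), BIC.model2 = BIC(cfa.model2))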
Omega \(\omega\) coefficient
Composite reliability, assessed by the omega (\(\omega\)) coefficient, is the reliability coefficient applicable to CFA; it takes correlated errors into account.
A construct reliability of \(\geq\) 0.7 is acceptable (Hair, Black, Babin, & Anderson, 2010).
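For reference (a general form of the coefficient, not necessarily the exact computation used by semTools), the composite reliability of a factor with unstandardized loadings \(\lambda_i\), factor variance \(\psi\), and error variances/covariances \(\theta_{ij}\) can be written as
\(\omega = \dfrac{(\sum_i \lambda_i)^2 \psi}{(\sum_i \lambda_i)^2 \psi + \sum_i \theta_{ii} + 2\sum_{i<j} \theta_{ij}}\)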
The output gives the omega coefficient for each factor,
rel.model2 = compRelSEM(cfa.model2)
print(rel.model2, digits = 3)
## PA1 PA2
## 0.809 0.836
Composite reliability omega: PA1 = 0.809, PA2 = 0.836. Both factors are reliable.
A CFA model can be nicely presented in the form of a path diagram.
semPaths(cfa.model2, 'path', 'std', style = 'lisrel',
edge.color = 'black')
In the report, you must include a number of important statements and results pertaining to the CFA,
Fit indices of the models.
Model | \(\chi^2\)(df) | \(P\) | SRMR | RMSEA | 90% CI | CFI | TLI | AIC | BIC |
---|---|---|---|---|---|---|---|---|---|
Model | 27.4(19) | 0.096 | 0.079 | 0.063 | 0.000, 0.112 | 0.973 | 0.960 | 3166 | 3217 |
Model 2 | 14.5(13) | 0.342 | 0.063 | 0.033 | 0.000, 0.104 | 0.994 | 0.990 | 2791 | 2836 |
Factor loadings and reliability of Model 2.
Factor | Item | Factor loading | Omega |
---|---|---|---|
Affinity | Q4 | 0.806 | 0.809 |
| Q6 | 0.718 | |
| Q7 | 0.762 | |
| Q11 | 0.509 | |
Importance | Q8 | 0.653 | 0.836 |
| Q9 | 0.844 | |
| Q10 | 0.858 | |
Factor correlation: Affinity \(\leftrightarrow\) Importance, \(r = 0.078\).
The path diagram of Model 2.
1. The regression weight of one item per factor (the marker indicator) is fixed to 1. Another approach in CFA is to fix the factor variance to 1 (Brown, 2015). This supposedly gives the same model fit (Muthen, 2012).↩︎
2. The latent variable (factor) is an unobserved variable, thus it has to be scaled by some method to define its metric/unit of measurement. This is done by fixing either an item regression weight or the factor variance to 1.↩︎
3. A nested model has the same number of items, but a different model specification, e.g. the number of factors.↩︎