Using lavaan for CFA with Likert-scale data: Does it automatically use polychoric correlation?

ExploringStars · March 30, 2025, 11:19pm

I’m working on a confirmatory factor analysis (CFA) for a questionnaire with 16 Likert-scale items. I’m using R and the lavaan package. My model has 4 factors, each with 4 items.

Here’s my current setup:

model_spec <- '
Factor1 =~ Item1 + Item2 + Item3 + Item4
Factor2 =~ Item5 + Item6 + Item7 + Item8
Factor3 =~ Item9 + Item10 + Item11 + Item12
Factor4 =~ Item13 + Item14 + Item15 + Item16
'

cfa_result <- cfa(model_spec, data=my_data, ordered=c(paste0('Item', 1:16)))

This works fine and gives me fit indices. But I’m not sure if it’s using polychoric correlation, which is recommended for ordinal data. In Mplus, WLSMV estimation automatically uses polychoric correlation.

Does lavaan do this too? If not, how can I make sure my CFA uses polychoric correlation? I saw something about lavCor in the package docs, but I’m not sure how to use it.

I’d really appreciate any help on this. Thanks!

Iris_92Paint · April 7, 2025, 9:22am

Hey ExploringStars! Great question about lavaan and polychoric correlations.

You’re definitely on the right track with your CFA setup. To answer your main question: yes. Once the items are declared through the ‘ordered’ argument, lavaan treats them as ordinal, switches to the WLSMV estimator by default, and fits the model to polychoric correlations. So the call you posted is already doing what you want.

If you’d rather make that explicit, you can set the ‘estimator’ argument yourself:

cfa_result <- cfa(model_spec, data=my_data, ordered=c(paste0('Item', 1:16)), estimator='WLSMV')

In lavaan, WLSMV means diagonally weighted least squares estimates with robust standard errors and a mean- and variance-adjusted test statistic, all computed from the polychoric correlations and thresholds. It’s pretty similar to what Mplus does.
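
If you want to check this rather than take it on faith, you can fit the model with and without the explicit estimator and compare what lavaan reports. The sketch below is only illustrative: my_data is simulated stand-in data (not your questionnaire), and the loadings, cut points, and sample size are arbitrary assumptions.

```r
library(lavaan)

# Hypothetical stand-in data: 4 correlated factors, 16 five-point items.
set.seed(123)
n <- 500
g <- rnorm(n)  # shared component that makes the factors correlate
eta <- sapply(1:4, function(k) sqrt(0.3) * g + sqrt(0.7) * rnorm(n))
items <- sapply(1:16, function(j) {
  y <- 0.7 * eta[, ceiling(j / 4)] + sqrt(0.51) * rnorm(n)
  # Cut the continuous response into 5 ordered categories (1..5)
  as.integer(cut(y, breaks = c(-Inf, -1.5, -0.5, 0.5, 1.5, Inf)))
})
my_data <- as.data.frame(items)
names(my_data) <- paste0("Item", 1:16)

model_spec <- '
Factor1 =~ Item1 + Item2 + Item3 + Item4
Factor2 =~ Item5 + Item6 + Item7 + Item8
Factor3 =~ Item9 + Item10 + Item11 + Item12
Factor4 =~ Item13 + Item14 + Item15 + Item16
'

item_names <- paste0("Item", 1:16)

# Same model twice: once relying on the default, once with WLSMV spelled out
fit_default <- cfa(model_spec, data = my_data, ordered = item_names)
fit_wlsmv <- cfa(model_spec, data = my_data, ordered = item_names,
                 estimator = "WLSMV")

# Which estimator, standard errors, and test statistic lavaan actually used
opts <- lavInspect(fit_default, "options")
print(opts[c("estimator", "se", "test")])

# The two fits should produce the same parameter estimates
print(all.equal(coef(fit_default), coef(fit_wlsmv)))

# Sample statistics the model was fitted to; for ordered items,
# $cov holds the correlation matrix among the items
print(round(lavInspect(fit_default, "sampstat")$cov[1:4, 1:4], 2))
```

lavaan reports the estimator as DWLS in the options because WLSMV is DWLS point estimates combined with the robust standard errors and adjusted test statistic.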

By the way, I’m curious - what kind of questionnaire are you analyzing? It sounds interesting with the 4 factors. Are you getting good fit indices so far?

If you want to dig deeper into the polychoric correlation matrix itself, you could always use lavCor separately:

polychor_matrix <- lavCor(my_data, ordered=c(paste0('Item', 1:16)))

But for your CFA, the WLSMV estimator should take care of it all for you. Let me know if you try it out and how it goes!

ExploringOcean · April 4, 2025, 4:15am

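hey there! i’ve used lavaan before. as long as the items are listed in ‘ordered’, lavaan fits the model on polychoric correlations, so you don’t have to request them separately. if you want the output to line up with Mplus as closely as possible, you can also set the estimator explicitly and add the ‘mimic’ argument like this:

cfa_result <- cfa(model_spec, data=my_data, ordered=c(paste0('Item', 1:16)), estimator='WLSMV', mimic='Mplus')

note that mimic only asks lavaan to follow Mplus conventions where the two programs differ. the polychoric correlations come from declaring the items as ordered. hope this helps!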

EnthusiasticPainter7 · April 4, 2025, 3:11am

Great question about lavaan and polychoric correlations for Likert-scale data. You’re on the right track with using the ‘ordered’ argument, which is crucial for ordinal data. Declaring the items as ordered is what makes lavaan compute polychoric correlations, and it also makes WLSMV the default estimator.

If you want that to be visible in your code, you can specify the estimator as WLSMV (weighted least squares, mean- and variance-adjusted) yourself. This estimator is designed for ordinal data and works from the polychoric correlations under the hood. Here’s how you can modify your code:

cfa_result <- cfa(model_spec, data=my_data, ordered=c(paste0('Item', 1:16)), estimator='WLSMV')

This approach should give you results comparable to Mplus. Because the items are declared as ordered, lavaan computes the polychoric correlations itself as part of the fit, so you don’t need to calculate them separately with lavCor.

If you want to examine the polychoric correlation matrix directly, you can use:

polychor_matrix <- lavCor(my_data, ordered=c(paste0('Item', 1:16)))
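
To see why the distinction matters, it can help to put the polychoric matrix next to ordinary Pearson correlations computed on the raw 1 to 5 codes. The following is a rough sketch on simulated stand-in data (my_data here is made up, and the loadings and cut points are assumptions chosen only for illustration):

```r
library(lavaan)

# Hypothetical stand-in data: 4 correlated factors, 16 five-point items.
set.seed(123)
n <- 500
g <- rnorm(n)
eta <- sapply(1:4, function(k) sqrt(0.3) * g + sqrt(0.7) * rnorm(n))
items <- sapply(1:16, function(j) {
  y <- 0.7 * eta[, ceiling(j / 4)] + sqrt(0.51) * rnorm(n)
  as.integer(cut(y, breaks = c(-Inf, -1.5, -0.5, 0.5, 1.5, Inf)))
})
my_data <- as.data.frame(items)
names(my_data) <- paste0("Item", 1:16)
item_names <- paste0("Item", 1:16)

# Polychoric correlations: items treated as ordered categories
poly <- unclass(lavCor(my_data, ordered = item_names))

# Pearson correlations: items treated as if they were continuous
pear <- cor(my_data)

# Compare the average absolute off-diagonal correlation under each approach
off <- upper.tri(poly)
print(c(polychoric = mean(abs(poly[off])), pearson = mean(abs(pear[off]))))

# Side-by-side look at the first factor's items
print(round(poly[1:4, 1:4], 2))
print(round(pear[1:4, 1:4], 2))
```

Pearson correlations on coarsely categorized items tend to be attenuated relative to the polychoric estimates, which is the usual argument for the ordinal approach.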

Hope this helps clarify things for your CFA analysis!