Trouble with multi-group CFA on Likert-scale data due to low-frequency responses

I’m stuck on a multi-group CFA for a psych tool using Likert data. I’ve set it up as ordinal with WLSMV estimation and theta parameterization. The initial CFA works, but I get a weird warning about the variance-covariance matrix.

The real headache starts when I try to split by groups. Here’s what happens:

# This fails
group_cfa <- cfa(my_model, data=survey_data, estimator='WLSMV', std.lv=TRUE, group="gender", parameterization="theta")

# Error: Empty categories for 'q35_ordinal' in group 1. Frequencies: [80 18 3 0 0]

Looks like no guys picked the top two choices for question 35. Oddly, I get a similar error with a different grouping, but that CFA still runs:

# This works despite the error
other_group_cfa <- cfa(my_model, data=survey_data, estimator='WLSMV', std.lv=TRUE, group="relationship_status", parameterization="theta")

# Same error, but it runs anyway

Why does this work for one grouping but not the other? How can I fix it? Any ideas?

hey dancingbutterfly, ugh CFAs can be such a pain! sounds like you're dealing with sparse data. for the gender split, maybe try collapsing the top categories? heads up though: with frequencies of [80 18 3 0 0], merging just 4 & 5 still leaves an empty cell for the guys, so you'd have to fold 3, 4 and 5 into one. you might lose some nuance but it could get your model running. or you could try a different estimator like MLR if you're ok with treating the items as continuous. good luck!
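If it helps, here's a rough base-R sketch of the collapsing idea. The data frame below is a made-up stand-in (only the male frequencies come from the error message; the female counts are invented), and it assumes q35_ordinal is stored as integer codes 1 to 5. If yours is an ordered factor, you'd probably want as.integer() on it first.

```r
# Toy stand-in for survey_data: male counts copy the error message,
# female counts are invented purely for illustration.
survey_data <- data.frame(
  gender = rep(c("male", "female"), each = 101),
  q35_ordinal = c(
    rep(1:3, times = c(80, 18, 3)),          # males: nobody chose 4 or 5
    rep(1:5, times = c(40, 25, 16, 12, 8))   # females: all five categories used
  )
)

# Cross-tabulate by group; explicit levels keep the empty cells visible.
freq_before <- table(survey_data$gender,
                     factor(survey_data$q35_ordinal, levels = 1:5))
print(freq_before)

# Categories 4 and 5 are BOTH empty for males, so merging only those two
# would still leave an empty cell. Fold 3, 4 and 5 into one top category.
survey_data$q35_collapsed <- ordered(pmin(survey_data$q35_ordinal, 3),
                                     levels = 1:3)

freq_after <- table(survey_data$gender, survey_data$q35_collapsed)
print(freq_after)
```

The collapsed column would then replace q35_ordinal in the model syntax. The recode is applied to everyone, not just one group, so both groups end up with the same number of thresholds.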

Hey there DancingButterfly! Wow, CFAs can be such a headache sometimes, right? 😅 I’ve run into similar issues before and it’s so frustrating!

Have you considered trying a different approach altogether? What about using multiple imputation to handle those sparse categories? It might help with the empty cell problem without losing data.

Or here’s a wild thought - what if you treated gender as a covariate instead of a grouping variable? That could potentially sidestep the whole issue while still letting you examine gender effects.
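One common form of the covariate idea is a MIMIC model, where the factor is regressed on a gender dummy. Below is a minimal sketch on simulated data, since my_model isn't shown in the thread: the items x1 to x4, the factor f, and the gender_male dummy are all made-up names. Keep in mind this only estimates a gender difference on the latent variable; it doesn't test measurement invariance the way a multi-group model would.

```r
library(lavaan)

# Simulated toy data: every name and value here is hypothetical.
set.seed(1)
n <- 300
gender_male <- rbinom(n, 1, 0.5)
eta <- 0.4 * gender_male + rnorm(n)   # latent score shifted for one group

# Turn a continuous response into a 3-category ordinal item (codes 1 to 3).
make_item <- function(loading) {
  y <- loading * eta + rnorm(n)
  cut(y, breaks = c(-Inf, -0.5, 0.8, Inf), labels = FALSE)
}

toy <- data.frame(
  x1 = make_item(0.8), x2 = make_item(0.7),
  x3 = make_item(0.6), x4 = make_item(0.7),
  gender_male = gender_male
)
items <- c("x1", "x2", "x3", "x4")

# MIMIC: one factor measured by the items, regressed on the gender dummy.
mimic_model <- '
  f =~ x1 + x2 + x3 + x4
  f ~ gender_male
'

mimic_fit <- sem(mimic_model, data = toy, ordered = items,
                 estimator = "WLSMV", parameterization = "theta")

# Pull out just the regression of the factor on gender.
pe <- parameterEstimates(mimic_fit)
print(pe[pe$op == "~", ])
```

Because gender enters as a predictor rather than a grouping variable, the thresholds are estimated on the pooled sample, so a category that is empty for one gender but used by the other shouldn't trigger the empty-category error.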

I’m really curious though - what made you choose those particular groupings? And how critical is it to your research questions to keep the full 5-point scale? Sometimes simplifying can open up new insights!

Keep us posted on what you try next! This kind of tricky analysis always makes for an interesting journey. 🧠✨

I’ve encountered similar issues with multi-group CFAs on Likert data before. The problem likely stems from the sparse responses in certain categories, particularly for the gender grouping. One approach that’s worked for me is to use Bayesian estimation instead of WLSMV. In my experience it copes better with sparse data, and it may let you avoid collapsing categories.

Try implementing your model using the blavaan package in R. It interfaces with JAGS or Stan for Bayesian estimation. It ships with default priors, but you should check that they are appropriate for your data, and it can handle ordinal indicators. Something like:

library(blavaan)
bayes_cfa <- bcfa(my_model, data=survey_data, group="gender", ordered=c("q35_ordinal", ...))

This method preserves the original scale and often yields more stable estimates. Just be prepared for longer computation times.