I’m running a multi-group CFA in R with the lavaan package. Several of my variables are ordered categorical with 11 categories each, so lavaan should estimate 10 thresholds for each of them.
Here’s the weird part: when I look at the results, the 10th threshold is smaller than the 9th one. It’s not increasing like it should. This is happening for several of my 11-category variables.
I’m confused. Why are these thresholds acting up? Is this normal or am I doing something wrong?
Here’s a simplified version of my R code:
library(lavaan)

model <- '
  # two correlated factors, three indicators each
  factor1 =~ item1 + item2 + item3
  factor2 =~ item4 + item5 + item6
  # fix factor variances to 1 for identification
  factor1 ~~ 1*factor1
  factor2 ~~ 1*factor2
  # residual covariances
  item2 ~~ item3
  item5 ~~ item6
'

# categorical_vars is a character vector with the names of the ordered items;
# the group argument is omitted in this simplified version
cfa_result <- cfa(model, data = my_data, ordered = categorical_vars, estimator = 'WLSMV')
summary(cfa_result, fit.measures = TRUE, standardized = TRUE)
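For what it’s worth, I’m reading the thresholds out of the fitted model roughly like this (threshold parameters show up with the '|' operator in parameterEstimates()):

# pull the threshold estimates so I can check the ordering directly
pe <- parameterEstimates(cfa_result)
subset(pe, op == '|')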
Any ideas what could be causing this threshold weirdness?
This is an intriguing issue. A 10th threshold that falls below the 9th is unusual and worth investigating. Have you examined the frequency distribution of responses for the problematic items? Very low response rates in the highest categories can cause estimation difficulties.
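For example, something along these lines would show the per-item category counts (assuming categorical_vars is the character vector of item names from your cfa() call):

# frequency table for each ordered item; watch for near-empty top categories
lapply(my_data[categorical_vars], table, useNA = 'ifany')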
One approach you might consider is collapsing some of the upper categories if they have low frequencies, which could stabilize the threshold estimates. You could also try a Bayesian estimation approach, which sometimes handles sparse categories better than frequentist methods.
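As a rough sketch, assuming the items are stored as numeric scores from 1 to 11, merging the two highest categories of one item might look like this (item1 is just an example; apply it to whichever items are sparse):

# merge categories 10 and 11 into a single top category, then declare it ordered again
my_data$item1_c <- ordered(pmin(my_data$item1, 10))

You would then refer to item1_c in the model and in the ordered argument when refitting.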
If the issue persists, it might be worth reaching out to the lavaan developers directly. They might have insights into potential software limitations or alternative specifications that could resolve this.
Hey there, I’ve seen this before! It’s usually a sign that your data has some funky distributions going on. It might be worth checking whether those last two categories are very rare, or whether there’s strong skewness; it can also happen with small samples. Have you tried playing around with different estimators? ULSMV can be more forgiving sometimes.
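Something like this, just swapping the estimator in your existing call:

# same model and data, only the estimator changes
cfa_ulsmv <- cfa(model, data = my_data, ordered = categorical_vars, estimator = 'ULSMV')
summary(cfa_ulsmv, fit.measures = TRUE)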
Hmm, that’s a puzzling situation you’ve got there! 
Have you considered the possibility of response bias in your data? When there are lots of categories, people sometimes avoid the extreme ends, and that could be distorting your thresholds.
Have you tried visualizing the response patterns for those tricky items? A quick histogram might shed some light on what’s going on. It could be that your respondents are clumping up in certain categories and leaving others nearly empty.
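For instance, a quick bar plot of one item (a bar plot suits ordered categories a bit better than a histogram; item1 is just an example name from your model):

# counts per category for a single item
barplot(table(my_data$item1), main = 'item1 response distribution', xlab = 'category', ylab = 'count')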
Oh, and here’s a wild thought - could there be any cultural or linguistic factors at play? Sometimes the way questions are phrased can lead to unexpected response patterns, especially if you’re dealing with a diverse sample.
What do you think? Have you noticed any patterns in which items are showing this weird threshold behavior? I’m super curious to hear more about your dataset and what you’ve observed so far!