CFA in R: Observed variables missing error when using lavaan package

I'm trying to do a Confirmatory Factor Analysis (CFA) in R with the lavaan package. But I'm hitting a snag. The error message says some observed variables are missing. Here's what I'm seeing:

Error: lavaan ERROR: missing observed variables in dataset: Q2 Q4 Q5 Q6 Q12 Q15 Q17 Q18 Q20 Q21 Q25 Q3 Q9 Q10 Q11 Q13 Q14 Q16 Q24 Q26 Q1 Q7 Q8 Q19 Q22 Q23

My code looks like this:

```R
library(lavaan)  # provides cfa()

data <- read.csv('my_data.csv', header=TRUE)

model <- '
  Factor1 =~ V1 + V2 + V3 + V4 + V5
  Factor2 =~ V6 + V7 + V8 + V9 + V10
  Factor3 =~ V11 + V12 + V13 + V14 + V15
'

result <- cfa(model, data = data, estimator = 'MLR', missing = 'fiml')
```

I can do other stuff with my data, so I know it loaded okay. What’s going on here? Any ideas would be super helpful. Thanks!

Hey there Charlotte91! :wave:

I’ve run into this exact problem before, and it can be super frustrating. Have you tried taking a peek at your data structure after you’ve loaded it? Sometimes CSV files can be tricky little beasts.

Maybe try running:

    str(data)
    head(data)

This should give you a good look at what’s actually in your dataset. My guess is that the variable names might not be what you’re expecting. CSV headers can sometimes get wonky during the import process.
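
To show how an import can change headers, here's a small sketch in base R. The inline CSV text and its column names are made up for illustration; they are not from your file. By default `read.csv()` passes headers through `make.names()`, so any header that isn't a syntactically valid R name gets altered, while `check.names = FALSE` leaves them alone.

```R
# Hypothetical CSV content, invented for this example
csv_text <- "Q 1,Q-2,1st\n1,2,3\n4,5,6"

# Default import: check.names = TRUE rewrites headers into valid R names
mangled <- read.csv(text = csv_text, header = TRUE)
names(mangled)  # "Q.1" "Q.2" "X1st"

# check.names = FALSE keeps the headers exactly as written in the file
raw <- read.csv(text = csv_text, header = TRUE, check.names = FALSE)
names(raw)      # "Q 1" "Q-2" "1st"
```

If the names you see in `names(data)` look like the first result, the model has to use those altered names (or you need to rename the columns) before lavaan can find them.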

Also, just curious - are you working with a pre-existing questionnaire or survey? Those ‘Q’ variables in the error message make me wonder if there’s some kind of standardized naming convention at play here.

If you’re still stuck after checking out your data structure, maybe we could brainstorm some other potential hiccups? Sometimes it helps to have a fresh pair of eyes on these things. Let me know how it goes!

Looks like you’ve hit a common snag with lavaan. The error message points to a mismatch between the variable names in your model specification and the column names in your dataset: the model you posted uses ‘V1’, ‘V2’, etc., but the error message lists ‘Q1’, ‘Q2’, and so on.

Double-check your CSV file. It’s likely your variables are named ‘Q1’, ‘Q2’, etc., instead of ‘V1’, ‘V2’. To fix this, you could either:

  1. Rename your variables in the data frame to match your model:
    # note: this renames every column in the data frame, in order
    data <- setNames(data, paste0('V', 1:ncol(data)))

  2. Or, update your model to use the correct variable names:
    model <- '
    Factor1 =~ Q1 + Q2 + Q3 + Q4 + Q5
    Factor2 =~ Q6 + Q7 + Q8 + Q9 + Q10
    Factor3 =~ Q11 + Q12 + Q13 + Q14 + Q15
    '

Also, ensure all these variables actually exist in your dataset. You might want to run names(data) to verify the column names. Hope this helps sort out your CFA issue!
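
A quick way to run that check in one step, as a rough sketch: write down the names your model uses and compare them with the data frame's columns using `setdiff()`. The toy data frame and the `model_vars` vector are placeholders I made up, so swap in your own data and the names from your model. lavaan's error lists the names used in the model that it cannot find in the data, so `missing_vars` should match the list in your error message.

```R
# Toy stand-in for the imported data (hypothetical columns)
data <- data.frame(id = 1:3, V1 = c(2, 4, 5), V2 = c(1, 3, 3), V3 = c(5, 5, 4))

# Names the model refers to (typed by hand for this sketch)
model_vars <- c("Q1", "Q2", "Q3")

# Names used in the model that are not columns in the data
missing_vars <- setdiff(model_vars, names(data))
missing_vars

# One possible fix: rename only the item columns, leaving 'id' untouched
names(data) <- sub("^V([0-9]+)$", "Q\\1", names(data))
setdiff(model_vars, names(data))  # empty once the names line up
```

Renaming by pattern like this is a bit safer than renaming every column by position, since ID or demographic columns keep their names.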

yo Charlotte91, been there done that! :sweat_smile: check ur variable names in the dataset. bet they’re Q1, Q2 etc. instead of V1, V2. quick fix: change ur model to match the actual names or rename the columns in ur data. run names(data) to see what u really got. good luck!