Incorporating CFA factor scores into my original dataset

Hey everyone! I’m working on a project where I did a Confirmatory Factor Analysis (CFA) using the lavaan package in R. I’ve got the factor scores, but I’m stuck trying to add them to my original dataset. Here’s what I’ve done so far:

library(lavaan)

job_model <- 'factor1 =~ var1 + var2 + var3
              factor2 =~ var4 + var5 + var6
              factor3 =~ var7 + var8 + var9
              factor4 =~ var10 + var11 + var12
              factor5 =~ var13 + var14 + var15'

results <- cfa(job_model, data = my_data)
# cfa() has no scores argument; the scoring method is chosen in lavPredict()
factor_scores <- lavPredict(results, method = "regression")

I’ve got the factor scores in factor_scores, but I can’t figure out how to add these as new columns to my_data. I tried using cbind(), but it gave me an error about the S4 class.

Any ideas on how to merge these factor scores back into my original dataset? I’d really appreciate some help! Thanks!

yo, i had similar probs. try this:

factor_scores_df <- as.data.frame(factor_scores)
my_data$factor1 <- factor_scores_df$factor1
my_data$factor2 <- factor_scores_df$factor2
# repeat for other factors

this adds each factor as a new column, as long as factor_scores has one row for every row of my_data. might be cleaner than cbind. lmk if it works!
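
If you don't want one line per factor, here's a minimal base R sketch that adds every score column in a single assignment. The toy my_data and factor_scores below are made-up stand-ins (a plain named matrix, shaped like what predict() gives back), not real lavaan output:

```r
# Toy stand-ins (hypothetical): 4 observations, scores for 2 factors
my_data <- data.frame(var1 = c(3, 4, 2, 5), var2 = c(1, 2, 2, 4))
factor_scores <- matrix(
  c(0.1, -0.3, 0.5, -0.2,   # factor1 (matrix() fills column by column)
    0.4,  0.0, -0.6, 0.2),  # factor2
  ncol = 2,
  dimnames = list(NULL, c("factor1", "factor2"))
)

# Guard against silent misalignment before assigning
stopifnot(nrow(factor_scores) == nrow(my_data))

# Add one column per factor, named after the factor
my_data[colnames(factor_scores)] <- as.data.frame(factor_scores)

names(my_data)
# "var1" "var2" "factor1" "factor2"
```

The stopifnot() line is the useful part: if the row counts differ, you find out right away instead of getting recycled or misaligned scores.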

Hey Melody_Cheerful! That’s a super interesting project you’re working on! I’ve actually been toying with CFA myself recently, so I totally get your frustration.

Have you tried converting your factor scores to a data frame first? Sometimes that can help smooth out the process. Maybe something like this could work:

factor_scores_df <- as.data.frame(factor_scores)
my_data_with_scores <- cbind(my_data, factor_scores_df)
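
In case it helps to try this on something reproducible first, here's a rough sketch using the HolzingerSwineford1939 example data that ships with lavaan. The three-factor model and the hs_* names are just for illustration and are not your job_model:

```r
library(lavaan)

# Example data bundled with lavaan (a stand-in for my_data)
hs_data <- HolzingerSwineford1939

hs_model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'

fit <- cfa(hs_model, data = hs_data)

# lavPredict() is what predict() calls for a fitted lavaan model
scores <- lavPredict(fit, method = "regression")

# The scores come back as a matrix, so convert before binding
hs_with_scores <- cbind(hs_data, as.data.frame(scores))

dim(scores)
head(hs_with_scores[, c("visual", "textual", "speed")])
```

This only lines up because the example data has no missing values on x1 to x9, so the scores have exactly one row per row of hs_data.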

I’m curious though - what kind of variables are you working with? It sounds like you’ve got quite a few factors in your model. Are you looking at job satisfaction or something similar?

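Oh, and another thought - have you considered using the merge() function instead of cbind()? It can be more flexible if your datasets have different numbers of rows, but it needs a shared key column (a row ID, say) in both tables to match on. Without a common column, merge() returns every combination of rows, which isn’t what you want here.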
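
Here's a small sketch of what I mean by merge(), using made-up toy tables where the scores are missing for one observation. The row_id column is my own assumption; your data may already have an ID you can use instead:

```r
# Toy stand-ins (hypothetical): 4 rows of data, scores for rows 1, 2 and 4 only
my_data <- data.frame(row_id = 1:4, var1 = c(3, 4, NA, 5))
scores_df <- data.frame(row_id = c(1L, 2L, 4L), factor1 = c(0.1, -0.3, 0.2))

# all.x = TRUE keeps every row of my_data; rows without a score get NA
merged <- merge(my_data, scores_df, by = "row_id", all.x = TRUE)

merged
```

Row 3 stays in the result with NA for factor1, so nothing shifts up by one the way it could with a plain cbind().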

Let me know if any of that helps! And if not, no worries - we can definitely brainstorm some other approaches. CFA can be tricky, but I’m sure we can figure it out together!

I’ve encountered a similar issue while working with lavaan and factor scores. One approach that worked for me was using the data.frame() function to convert the factor scores into a data frame, then using cbind() to combine it with the original dataset. Here’s an example:

factor_scores_df <- data.frame(factor_scores)
my_data_with_scores <- cbind(my_data, factor_scores_df)

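If that doesn’t work, check whether factor_scores has the same number of rows as your original dataset. lavaan keeps the rows in their original order, but by default it drops any observation with a missing value on a model variable (listwise deletion), so the scores can end up with fewer rows than my_data. In that case, key the scores by the rows lavaan actually used and join on that with dplyr:

library(dplyr)
# lavInspect(results, "case.idx") returns the rows of my_data that lavaan used
scores_keyed <- as.data.frame(factor_scores) %>%
  mutate(row_id = lavInspect(results, "case.idx"))
my_data_with_scores <- my_data %>%
  mutate(row_id = row_number()) %>%
  left_join(scores_keyed, by = "row_id")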

This approach ensures that the factor scores are matched correctly to each observation in your original dataset. Let me know if you need any clarification on these methods.
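
To see the dropped-row situation in action without dplyr, here is a hedged sketch on lavaan's bundled HolzingerSwineford1939 data. The two NA values are injected purely for illustration, and the hs_* names and model are assumptions rather than your setup:

```r
library(lavaan)

hs_data <- HolzingerSwineford1939
hs_data$x1[c(5, 20)] <- NA          # inject missing values for illustration

hs_model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'

fit <- cfa(hs_model, data = hs_data)      # default: incomplete rows are dropped
scores <- lavPredict(fit)
used_rows <- lavInspect(fit, "case.idx")  # rows of hs_data that lavaan kept

c(data_rows = nrow(hs_data), score_rows = nrow(scores))

# Pre-fill the new columns with NA, then write scores into the used rows only
score_cols <- colnames(scores)
hs_data[score_cols] <- NA_real_
hs_data[used_rows, score_cols] <- as.data.frame(scores)

hs_data[c(4, 5, 6), score_cols]           # row 5 should be all NA
```

Rows 5 and 20 keep NA scores, and every other row gets the scores that belong to it. If you would rather have scores for the incomplete rows too, fitting with missing = "fiml" is an option to look into, though that changes the estimation and not just the bookkeeping.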