Multilevel CFA: Calculating School-Level Factor Scores in a Three-Tier Dataset

I’m working with a dataset that has three levels: teacher, school, and country. I want to do a Confirmatory Factor Analysis (CFA) using teacher survey responses, but I need the factor scores at the school level. I also want to check for measurement invariance across countries.

I’m using the lavaan package in R because it can handle my complex survey design with the lavaan.survey extension. So far, I’ve done some analysis using country-ID as the group in the cfa function. This lets me do the measurement invariance analysis for countries, but the factor scores are at the teacher level.

Does anyone know how to get these factor scores at the school level instead? Here’s a basic example of what I’m doing:

library(lavaan)

model <- '
  teacher_support =~ Q1 + Q2 + Q3
  professional_growth =~ Q4 + Q5 + Q6 + Q7
'

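# Configural model: same factor structure in every country, parameters free
fit_base <- cfa(model, data = survey_data, group = 'country_code')
# Metric invariance: factor loadings constrained equal across countries
fit_metric <- cfa(model, data = survey_data, group = 'country_code', group.equal = 'loadings')
# Scalar invariance: loadings and item intercepts constrained equal
fit_scalar <- cfa(model, data = survey_data, group = 'country_code', group.equal = c('loadings', 'intercepts'))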

Any suggestions would be really helpful!

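Have you looked into multilevel SEM? lavaan itself (version 0.6 and later) can fit two-level models through the cluster argument, which would cover teachers within schools, though not all three tiers at once. For school-level scores, you could also try aggregating the teacher scores by school ID, but watch out for small sample sizes per school.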

Also, check the ICC to see if there’s enough between-school variance. Good luck with your analysis, sounds tricky!
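For what it's worth, a rough ICC check needs nothing beyond base R. The sketch below uses the one-way ANOVA estimator of ICC(1) on simulated data; the data frame `d` and the columns `school_id` and `q1` are made-up stand-ins, not names from the original post, and the formula assumes a balanced design (the same number of teachers in every school).

```r
# Simulated stand-in for the real data: 50 schools, 8 teachers each.
# All object and column names here are hypothetical.
set.seed(1)
n_schools  <- 50
n_teachers <- 8
school_id  <- rep(seq_len(n_schools), each = n_teachers)

# Item response = school effect (sd 0.5) + teacher-level noise (sd 1)
school_effect <- rnorm(n_schools, sd = 0.5)[school_id]
q1 <- school_effect + rnorm(n_schools * n_teachers, sd = 1)
d  <- data.frame(school_id = factor(school_id), q1 = q1)

# One-way ANOVA decomposition of the item by school
aov_tab <- anova(lm(q1 ~ school_id, data = d))
msb <- aov_tab["school_id", "Mean Sq"]  # between-school mean square
msw <- aov_tab["Residuals", "Mean Sq"]  # within-school mean square
k   <- n_teachers                       # teachers per school (balanced)

# ICC(1): estimated share of item variance attributable to schools
icc1 <- (msb - msw) / (msb + (k - 1) * msw)
icc1
```

With unequal school sizes, `k` would have to be replaced by an adjusted average group size, so treat this as a first look rather than a final estimate.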

Hey CreativeChef15! Your multilevel CFA problem sounds super interesting. I’m curious about a few things:

Have you considered using a multilevel SEM approach instead? It might be a good fit for your three-tier data structure.

For getting school-level factor scores, have you thought about aggregating the teacher-level scores? You could calculate the mean or median of the teacher scores for each school. It’s not perfect, but it could be a starting point.

I’m wondering about your sample size at each level. How many teachers per school and how many schools per country do you have? This could impact the reliability of school-level estimates.

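Also, have you looked into the MplusAutomation package? It runs Mplus models from R, so it needs a separate Mplus installation rather than working through lavaan, but Mplus does offer multilevel CFA options, including three-level models, that could help.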

Keep us posted on what you figure out! This kind of analysis is tricky but super valuable.

I’ve encountered a similar challenge in my research. One approach you might consider is using the ‘aggregate’ function in R to compute school-level means of the teacher-level factor scores. This method, while not perfect, can provide a reasonable approximation of school-level scores.

Here’s a potential workflow:

  1. Run your CFA model as you’ve done.
  2. Extract factor scores at the teacher level.
  3. Use ‘aggregate’ to compute mean scores for each school.
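Step 3 of this workflow can be sketched in base R as follows. The data frame `teacher_scores` is a hypothetical stand-in for teacher-level factor scores (for example from lavaan's `lavPredict()`) that have already been matched back to school IDs; the column names and values are assumptions for illustration only.

```r
# Hypothetical teacher-level factor scores with school IDs attached.
# In practice these would come from the fitted CFA, not from rnorm().
set.seed(2)
teacher_scores <- data.frame(
  school_id           = rep(c("S1", "S2", "S3"), times = c(4, 2, 5)),
  teacher_support     = rnorm(11),
  professional_growth = rnorm(11)
)

# School-level mean of each factor score
school_means <- aggregate(
  cbind(teacher_support, professional_growth) ~ school_id,
  data = teacher_scores,
  FUN  = mean
)

# Teachers per school, to flag means that rest on very few respondents
school_n <- aggregate(teacher_support ~ school_id,
                      data = teacher_scores, FUN = length)
names(school_n)[2] <- "n_teachers"

# One row per school: mean scores plus the number of teachers behind them
school_scores <- merge(school_means, school_n, by = "school_id")
school_scores
```

Keeping `n_teachers` next to the means makes it easy to drop or down-weight schools with only a handful of teachers.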

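However, this method doesn’t account for the nested structure of your data. For a more sophisticated approach, you might want to explore multilevel structural equation modeling (MSEM). Note that ‘lme4’ and ‘nlme’ fit mixed-effects regressions rather than latent-variable models, so they are mainly useful for estimating variance components and ICCs; for the CFA itself, ‘lavaan’ can fit two-level models directly through its ‘cluster’ argument and ‘level:’ blocks in the model syntax, which handles the teacher-within-school hierarchy more effectively.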
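As a minimal sketch of what a two-level CFA looks like in lavaan, the example below uses the `Demo.twolevel` dataset that ships with the package, because the original `survey_data` is not available here. Its indicators `y1` to `y3` and its `cluster` column stand in for the survey items and the school ID; the factor names `f_within` and `f_between` are invented for illustration. It covers two levels only, so the country tier is not modelled.

```r
library(lavaan)

# Two-level CFA: the same indicators load on a within-cluster factor
# (teacher level) and a between-cluster factor (school level).
model_2l <- '
  level: 1
    f_within  =~ y1 + y2 + y3
  level: 2
    f_between =~ y1 + y2 + y3
'

# cluster = names the grouping variable (schools in the original question)
fit_2l <- cfa(model_2l, data = Demo.twolevel, cluster = "cluster")

# ICC of each indicator, as estimated from the data
lavInspect(fit_2l, "icc")

# Factor scores for the between level: one row per cluster
school_fs <- lavPredict(fit_2l, level = 2)
head(school_fs)
```

Applied to the question's data, the model would use Q1 to Q7 at both levels and the school ID as `cluster`; whether that converges will depend on the number of schools and teachers per school.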

Remember to assess the intraclass correlation (ICC) to ensure there’s sufficient between-school variance to justify school-level aggregation. Good luck with your analysis!