I’m trying to do a Confirmatory Factor Analysis (CFA) by hand using Maximum Likelihood Estimation (MLE). The main equation I’m working with is the model-implied covariance matrix:
Sigma = LL^T + E
L is the loadings matrix and E is the error matrix. I want to use MLE to get estimates for L and E that minimize the difference between the implied and observed covariance matrices.
The function to minimize is:
F = log |Sigma| + tr(S*Sigma^-1) - log|S| - p
Sigma is the implied covariance matrix, S is the observed covariance matrix, and p is the number of indicator items.
I’ve set up my observed covariance matrix and created a discrepancy function in R. Now I’m stuck on how to get the ML estimates for L and E. I’ve tried using the mle function from the stats4 package, but it’s not working.
Here’s my current code for the discrepancy function:
discrepancy <- function(covar, L, E) {
  # Model-implied covariance: Sigma = L L^T + E
  sigma <- L %*% t(L) + E
  # ML fit function: log|Sigma| + tr(S Sigma^-1) - log|S| - p
  log(det(sigma)) +
    sum(diag(covar %*% solve(sigma))) -
    log(det(covar)) -
    nrow(covar)
}
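For reference, here’s how I’m calling it with some made-up numbers (the covariance matrix and starting values below are purely illustrative; I’ve repeated the function so the snippet runs on its own):

```r
# Repeated from above so this snippet is self-contained.
discrepancy <- function(covar, L, E) {
  sigma <- L %*% t(L) + E
  log(det(sigma)) +
    sum(diag(covar %*% solve(sigma))) -
    log(det(covar)) -
    nrow(covar)
}

# Made-up observed covariance matrix for three standardized indicators.
S <- matrix(c(1.0, 0.6, 0.5,
              0.6, 1.0, 0.4,
              0.5, 0.4, 1.0), nrow = 3)

startL <- matrix(rep(0.7, 3), ncol = 1)  # loadings as a column vector
startE <- diag(0.5, 3)                   # diagonal error variances

discrepancy(S, startL, startE)  # positive; zero only when Sigma equals S
```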
How can I use this to find the L (column vector) and E (diagonal matrix) estimates that minimize the discrepancy value? I’d like to use my starting values for L and E in the process.
Any help would be appreciated!
Hey there, fellow stats enthusiast! 
Wow, tackling CFA by hand using MLE? That’s some next-level dedication! I’m honestly impressed. I’ve always relied on software for this kind of analysis, but your approach is super interesting.
Have you considered using an optimization algorithm like Newton-Raphson or gradient descent? These might help you find the minimum of your discrepancy function. You could potentially use the optim() function in R for this.
Just thinking out loud here, but what if you tried something like:
optimResult <- optim(par = c(startingL, startingE),
                     fn = discrepancy,
                     method = "BFGS",
                     covar = observedCovarMatrix)
This is just a rough idea, of course. You’d need to adjust the function to work with a single parameter vector and probably add some constraints.
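To make that concrete, here’s one way the reshaping could look for a single-factor model (the wrapper name, the guard against non-positive-definite Sigma, and the example matrix are just my own illustration):

```r
# Objective taking a single parameter vector: the first p entries are
# the loadings, the last p the error variances (one factor assumed).
fitOneFactor <- function(par, covar) {
  p <- nrow(covar)
  L <- matrix(par[1:p], ncol = 1)
  E <- diag(par[(p + 1):(2 * p)])
  sigma <- L %*% t(L) + E
  if (det(sigma) <= 0) return(1e10)  # keep BFGS away from non-PD regions
  log(det(sigma)) +
    sum(diag(covar %*% solve(sigma))) -
    log(det(covar)) -
    p
}

# Illustrative observed covariance matrix (three indicators).
S <- matrix(c(1.0, 0.6, 0.5,
              0.6, 1.0, 0.4,
              0.5, 0.4, 1.0), nrow = 3)

res <- optim(par = c(rep(0.7, 3), rep(0.5, 3)),
             fn = fitOneFactor, method = "BFGS", covar = S)
res$par[1:3]  # estimated loadings
res$par[4:6]  # estimated error variances
```

With three indicators and one factor the model is just-identified, so the minimized discrepancy should be essentially zero.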
By the way, what made you decide to do this manually instead of using a package like lavaan? I’m genuinely curious about your motivation. Is it for learning purposes or do you have a specific research goal in mind?
Also, have you run into any issues with local minima? I imagine that could be tricky when optimizing this kind of function.
Keep us posted on how it goes! This is a really cool project.
wow, that’s some heavy math stuff! have you tried using the optim() function in R? it might help find the minimum for your discrepancy function. just a thought, but maybe something like:
optimResult <- optim(par = c(startL, startE), fn = discrepancy, method = "BFGS", covar = obsCovarMatrix)
let us know how it goes!
As someone who’s worked extensively with CFA, I commend your effort to perform it manually using MLE. It’s a challenging but rewarding endeavor. For optimizing your discrepancy function, consider using the nlminb() function in R. It’s particularly well-suited for constrained optimization problems like CFA.
Here’s a potential approach:
nlminb(start = c(initial_L, initial_E),
       objective = discrepancy,
       covar = observedCovarMatrix,
       lower = c(rep(0, length(initial_L)), rep(0.01, length(initial_E))),
       upper = c(rep(1, length(initial_L)), rep(Inf, length(initial_E))))
This setup allows you to set bounds on your parameters, which can be crucial for obtaining meaningful results. The lower bound of 0.01 on the E elements keeps the error variances strictly positive, preventing Heywood cases; the upper bound of 1 on the loadings assumes standardized indicators.
Remember to adjust your discrepancy function to work with a single parameter vector. Also, consider using multiple starting values to avoid local minima issues. Good luck with your analysis!
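To sketch what that single-parameter-vector version might look like (the objective name, the positive-definiteness guard, and the example matrix are my own, purely for illustration):

```r
# Single-vector objective for a one-factor model with p indicators:
# par packs the loadings first, then the error variances.
cfaObjective <- function(par, covar) {
  p <- nrow(covar)
  L <- matrix(par[1:p], ncol = 1)
  E <- diag(par[(p + 1):(2 * p)])
  sigma <- L %*% t(L) + E
  if (det(sigma) <= 0) return(1e10)  # steer clear of non-PD Sigma
  log(det(sigma)) + sum(diag(covar %*% solve(sigma))) -
    log(det(covar)) - p
}

# Illustrative observed covariance matrix (three indicators).
S <- matrix(c(1.0, 0.6, 0.5,
              0.6, 1.0, 0.4,
              0.5, 0.4, 1.0), nrow = 3)

fit <- nlminb(start = c(rep(0.7, 3), rep(0.5, 3)),
              objective = cfaObjective, covar = S,
              lower = c(rep(0, 3), rep(0.01, 3)),
              upper = c(rep(1, 3), rep(Inf, 3)))
fit$par        # loadings, then error variances
fit$objective  # discrepancy at the minimum
```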