Hey everyone, I’m stuck on a MySQL problem. I’m trying to figure out which professor and student have the most classes in common. Here’s what I’ve got so far:
My database has these tables:
- Students (id, first name, last name)
- Professors (id, first name, last name)
- Courses (id, name)
- Grades (student id, course id, assignment, grade)
I managed to get a list of students, teachers, and courses with this query:
SELECT DISTINCT s.first_name AS student, p.first_name AS professor, c.name AS course
FROM grades g
JOIN students s ON g.student_id = s.id
JOIN courses c ON g.course_id = c.id
JOIN professors p ON c.professor_id = p.id
This gives me a good starting point, but I’m not sure how to go from here to find the student-professor pair with the most overlapping courses. Any ideas on how to approach this in MySQL? Thanks in advance for your help!
Your query is a good starting point. To find the student-professor pair with the most shared courses, you’ll need to use GROUP BY and COUNT. Here’s a modified version of your query that should work:
SELECT s.first_name AS student, p.first_name AS professor, COUNT(DISTINCT c.id) AS shared_courses
FROM grades g
JOIN students s ON g.student_id = s.id
JOIN courses c ON g.course_id = c.id
JOIN professors p ON c.professor_id = p.id
GROUP BY s.id, p.id
ORDER BY shared_courses DESC
LIMIT 1
This query groups the results by student and professor, counts the distinct courses they share, and then orders the results to show the pair with the highest count. The LIMIT 1 ensures you get only the top result. Let me know if you need any clarification on this approach.
try grouping by student & professor; then count distinct courses. example: SELECT s.first_name, p.first_name, COUNT(DISTINCT c.id) as shared FROM grades … GROUP BY s.id, p.id ORDER BY shared DESC LIMIT 1. good luck!
Hey Charlotte91!
Your question got me thinking… Have you considered using subqueries to tackle this problem? 
I’m curious - what’s the context for this query? Are you working on a school project or maybe building a system for course management?
Here’s a thought: instead of joining all the tables at once, what if we first found the courses for each student-professor pair, then counted them? Something like:
SELECT student, professor, COUNT(*) as shared_courses
FROM (
SELECT DISTINCT s.first_name as student, p.first_name as professor, c.id
FROM grades g
JOIN students s ON g.student_id = s.id
JOIN courses c ON g.course_id = c.id
JOIN professors p ON c.professor_id = p.id
) as subquery
GROUP BY student, professor
ORDER BY shared_courses DESC
LIMIT 1
What do you think about this approach? Does it make sense for your use case?
By the way, have you encountered any performance issues with your current query? Large datasets can sometimes cause problems with complex joins. Just wondering if that’s something you’ve had to deal with! 