Conditionally Counting in MySQL
This week's post will be a relatively straightforward and revolves around MySQL and a situation that I encountered recently.
Let's say that I have a database called school_timetable
. Here is how it is structured (specific details of its data is not important for this):
So I have a table of students, a table of subjects and a table that links the two together. A standard "many to many" pattern.
If we wanted to get a list of all of the students and how many subjects that each one is assigned to, we would do something like this:
SELECT
student.first_name AS 'first name',
student.second_name AS 'second name',
COUNT(sub.pk) AS 'total subjects'
FROM
tbl_student student
LEFT JOIN tbl_student_subject stu_sub ON stu_sub.fk_student = student.pk
JOIN tbl_subject sub ON sub.pk = stu_sub.fk_subject
GROUP BY
student.pk
Fairly straightforward. We join the 3 tables (we actually don't need to join the subject table at this point, but I have anyway as I want it later), and then group the data by student. Then I perform a count on the subjects.
The results are something like:
| first name | second name | total subjects |
|------------|-------------|----------------|
| Alan | Turing | 5 |
| Dennis | Ritchie | 5 |
| Grace | Hopper | 3 |
| Ada | Lovelace | 5 |
| Edsger | Dijkstra | 4 |
The subject table has a column called "period" which specifies if the subject is in the "morning" or the "afternoon". Let's say that I want to have the above list, but with the total "morning" subjects and "afternoon" subjects in two separate columns.
You could do this using subselects I am sure, but there is a much cleaner solution. Add this to our SELECT
clause:
SUM(CASE WHEN sub.period = "morning" THEN 1 ELSE 0 END) AS 'total morning subjects'
What is going on here? sub
is the subject table that we included in the JOIN
above. SUM
totals up whatever is given to it and in this case, we are giving a condition: if the value of period
is "morning", return 1, otherwise return 0.
Hence, SUM
adds up all of the 1s and therefore all of the subjects that have the value "morning".
It is a little difficult to explain, I hope it sort of makes sense. What usually confuses me as a programmer is that there is no "loop" here to cycle through all of the values. Instead the GROUP BY
clause is sort of doing that for us by putting together all of the rows of the same student id.
Let's see how the final statement looks with this line and another line for the afternoon subjects:
SELECT
student.first_name AS 'first name',
student.second_name AS 'second name',
SUM(CASE WHEN sub.period = "morning" THEN 1 ELSE 0 END) AS 'total morning subjects',
SUM(CASE WHEN sub.period = "afternoon" THEN 1 ELSE 0 END) AS 'total morning subjects',
COUNT(sub.pk) AS 'total subjects'
FROM
tbl_student student
LEFT JOIN tbl_student_subject stu_sub ON stu_sub.fk_student = student.pk
JOIN tbl_subject sub ON sub.pk = stu_sub.fk_subject
GROUP BY
student.pk
And the results:
| first name | second name | total morning subjects | total morning subjects | total subjects |
|------------|-------------|------------------------|------------------------|----------------|
| Alan | Turing | 3 | 2 | 5 |
| Dennis | Ritchie | 3 | 2 | 5 |
| Grace | Hopper | 2 | 1 | 3 |
| Ada | Lovelace | 4 | 1 | 5 |
| Edsger | Dijkstra | 2 | 2 | 4 |
As you can see, the totals are there and add up correctly.
Hope this has been useful to someone, I know it will be to me in the future!