Conditionally Counting in MySQL

This week's post will be a relatively straightforward and revolves around MySQL and a situation that I encountered recently.

Let's say that I have a database called school_timetable. Here is how it is structured (specific details of its data is not important for this):

So I have a table of students, a table of subjects and a table that links the two together. A standard "many to many" pattern.

If we wanted to get a list of all of the students and how many subjects that each one is assigned to, we would do something like this:

SELECT
    student.first_name AS 'first name',
    student.second_name AS 'second name',
    COUNT(sub.pk) AS 'total subjects'
FROM
    tbl_student student
    LEFT JOIN tbl_student_subject stu_sub ON stu_sub.fk_student = student.pk
    JOIN tbl_subject sub ON sub.pk = stu_sub.fk_subject
GROUP BY
    student.pk

Fairly straightforward. We join the 3 tables (we actually don't need to join the subject table at this point, but I have anyway as I want it later), and then group the data by student. Then I perform a count on the subjects.

The results are something like:

| first name | second name | total subjects | 
|------------|-------------|----------------| 
| Alan       | Turing      | 5              | 
| Dennis     | Ritchie     | 5              | 
| Grace      | Hopper      | 3              | 
| Ada        | Lovelace    | 5              | 
| Edsger     | Dijkstra    | 4              | 

The subject table has a column called "period" which specifies if the subject is in the "morning" or the "afternoon". Let's say that I want to have the above list, but with the total "morning" subjects and "afternoon" subjects in two separate columns.

You could do this using subselects I am sure, but there is a much cleaner solution. Add this to our SELECT clause:

SUM(CASE WHEN sub.period = "morning" THEN 1 ELSE 0 END) AS 'total morning subjects'

What is going on here? sub is the subject table that we included in the JOIN above. SUM totals up whatever is given to it and in this case, we are giving a condition: if the value of period is "morning", return 1, otherwise return 0.

Hence, SUM adds up all of the 1s and therefore all of the subjects that have the value "morning".

It is a little difficult to explain, I hope it sort of makes sense. What usually confuses me as a programmer is that there is no "loop" here to cycle through all of the values. Instead the GROUP BY clause is sort of doing that for us by putting together all of the rows of the same student id.

Let's see how the final statement looks with this line and another line for the afternoon subjects:

SELECT
    student.first_name AS 'first name',
    student.second_name AS 'second name',
    SUM(CASE WHEN sub.period = "morning" THEN 1 ELSE 0 END) AS 'total morning subjects',
    SUM(CASE WHEN sub.period = "afternoon" THEN 1 ELSE 0 END) AS 'total morning subjects',
    COUNT(sub.pk) AS 'total subjects'
FROM
    tbl_student student
    LEFT JOIN tbl_student_subject stu_sub ON stu_sub.fk_student = student.pk
    JOIN tbl_subject sub ON sub.pk = stu_sub.fk_subject
GROUP BY
    student.pk

And the results:

| first name | second name | total morning subjects | total morning subjects | total subjects | 
|------------|-------------|------------------------|------------------------|----------------| 
| Alan       | Turing      | 3                      | 2                      | 5              | 
| Dennis     | Ritchie     | 3                      | 2                      | 5              | 
| Grace      | Hopper      | 2                      | 1                      | 3              | 
| Ada        | Lovelace    | 4                      | 1                      | 5              | 
| Edsger     | Dijkstra    | 2                      | 2                      | 4              | 

As you can see, the totals are there and add up correctly.

Hope this has been useful to someone, I know it will be to me in the future!


© 2012-2023