This article applies to Microsoft SQL server.

The concept of null can be confusing to programmers and non-coders alike. For databases, the normal understanding as defined by ANSI is null means anything or nothing. It’s literally an undefined value. As such, comparing something to null is impossible. In fact, null = null is neither true nor false. It’s null.

Yeah, that’s confusing.

It makes sense though, if you understand that null in ANSI database-land doesn’t preclude that a value exists so much as it’s impossible to determine what that value might be. There might be that off chance one null value equals another, so there’s no right or wrong answer –otherwise known as true or false– as to whether one null equals another. Much like the values on either side of that equal sign are unknown, so is the answer to whether or not they are equal; hence null –impossible to determine– is the result.

Coalesce is one function which allows us to redefine what null is. Coalesce(@variable,’4′) will return the value of @variable UNLESS it is null. If @variable is null, then ‘4’ is provided as a substitute for null. For numbers, it is very common to redefine null as zero using a function such as “coalesce” simply because if a value is not defined (null), appropriate business logic allows the assumption to be made any such undefined number is zero. In most cases, it’s wise to capture nulls and set them to some such predefined value.

Why is that? Because, nulls bubble up analogously to how trappable errors do. Consider this statement:

select (5 + null) / 7

5 + null = null. Divide that by 7 and you still get null. Similarly with Boolean logic, true and false = false but false and null = null, not false. Because null is like Ice 9. Whatever null touches turns to null, unless you protect yourself with a function like coalesce or the comparison operator “is null” or use another technique to stop the bubbling null.

Now to see this in action. First, let’s create a simple temporary table with one integer column and populate it with six records, five of which have numeric values and one of which is null. Note that we are explicitly turning ANSI nulls on in this exercise.

Now we’re going to use a simple case statement to determine whether each column has a value or is null.

What do you think the result is going to be?

Five values one null wanted result

Five values one null wanted result

You might be expecting that third row to be a “0” and the rest to be “1” like the image above and you would have been correct had ANSI nulls been turned off, because turning off ANSI nulls effectively treats null as a concrete value that can be compared instead of an unknown value which cannot.

So that being the case (no pun intended, seriously), why does the above case statement fail to catch that third record with a null and populate the result with a “0”? Instead, you’re left with this:

Six values erroneous result

Six values erroneous result

Consider that second line of T-SQL, “case testValue”. When testValue is null, like it is on that third record, well, it’s NULL. Case statements fall through to the else clause whenever no other “when” statement is satisfied OR if the initial case condition for the record is null. Because “case testValue” already results in null when testValue for the record is null, the third line “when null then 0” never gets a chance to be tested. The logic falls through to the else clause.

So how can we rewrite this statement so that we get the results we want, as seen in the first result set above?

Why does this work? What makes it different?

The key is where the null is being processed. In the first SQL statement, the null is first processed on line 2, “case testValue”, and when testValue is null, your result is null and all criteria (the “when” clauses) automatically return null without even processing. The “else” clause then runs by default.

The second case statement encapsulates the null result right within the when statement. Now the “is” operator can now come into play and actually check to see if testValue “is null”. Note I didn’t say compare testValue to null. “IS NULL” is a special two-word operator to check if a value has been set to null and that is its only job. “Is null” works where “=” fails because “is null” is not comparing one null value to another. Instead, its simply checking if a value is null at all and this can then result as either true or false.

So you then might consider modifying the first SQL statement to handle nulls right at the outset using the coalesce function. Consider this SQL statement:

This case statement would appear to solve the problem of nulls, but does it?

Let’s peruse the results:

Oops, wrong again!

Oops, wrong again!

If you recall, the first record was populated with a 0. When you treat nulls as a default value, it can become very difficult to tell if a value was originally null or was purposely set to that default value; in this case, “0”. Now the SQL is showing an erroneous result because five records had a value set of which one was set to “0” and one record had no value, being set to null, but the invalid result returns that there were only four values set.

It’s wise to encapsulate nulls in coalesce statements, but beware of what might happen when you do.

And finally time to clean up the mess.

I hope you enjoyed this worthy wordy treatise on nulls using case statements as examples. You should learn more about us by meeting our practice leads!

Like this post? Share it!