Working with NULL in Databases. Turn Your Frustration Into Delight

Working with NULL in Databases. Turn Your Frustration Into Delight

NULL values are one of the most frequent frenemies of almost all database administrators. For many, they’re simply a symbol that no value exists in a column – for others, they’re a symbol of problems within their database instances. Join us as we explore them and tell you how best to act in specific scenarios.

NULL values are some of the most popular values within database management systems: there’s no need to be a database expert to tell that NULL means nothing, and that’s exactly what NULL signifies – it’s a symbol to signify that no values exist in a column. That’s not to say that NULL cannot be problematic though.

The Basics of NULL

Many database administrators set NULL as the default value of their columns when creating a table by running a query like so:

Setting NULL as the default value of a column when creating a table by running a query.

Assigning the default value of columns in DbVisualizer.

Now when any data is inserted into the column named column_1, the default – “pre-built” if you will – value will be NULL. Everything can also be done the other way around too, though – columns can be specified to not accept NULL values and have a default empty value like so:

Error message showing a column cannot be NULL.

Columns cannot be NULL - a message by DbVisualizer.

But that’s the least of our problems... Keep reading – we’ll tell you why in a second.

Problems with NULL

To start with, the most frequent problem related to NULL values is related to the confusion of NULL values and empty strings (rows with no value.) Many newcomers to the SQL world think these two things are the same when in reality, they’re not. Take these two queries for an example:

The confusion of NULL values and empty strings is a frequent problem related to NULL values.

INSERT INTO Queries in DbVisualizer.

The first query will insert an empty string into the email column, while the second query will insert NULL. Now look at the output of this query:

Inserting an empty string into the email column.

The results of the query in DbVisualizer.

The query above probably looks a little different than the queries you’re used to, doesn’t it? The WHERE clause doesn’t specify an equal (=) sign, but instead, it specifies IS NULL. Let’s try to do everything the ordinary way:

Empty result set fetched.

An empty result set in DbVisualizer.

“Empty result set fetched.” Why? Because something = nothing (NULL) is never true.

To test for null, we need to use the IS [NOT] NULL clause rather than search for a match – if we need a function, we should probably turn to IFNULL() instead. In the previous example, we’ve successfully confused MySQL– are we searching for an email with a value of NULL? Or are we searching for an email without a value, but with a default value of NULL? The question is not clear – so is the answer.

NULL also has a couple of things unique to itself – when using some data types (we’re talking about integers and floating-point values), if the column is specified to increment automatically, the next value is inserted into it. Look at the following example:

NULL values inserted with column set to increment automatically.

Inserting and selecting data in DbVisualizer.

Technically, 3 NULL values were inserted, but as the column was set to increment automatically, the values were populated with numbers.

As far as indexes are concerned, columns having NULL values can easily be indexed and that isn’t the cause of much problems, however, bear in mind that NULL values also occupy space within MySQL. Everything works like this:

  • When MyISAM is being used, a field set to NOT NULL will take up less space on the disk.

  • When MyISAM is in use, a field set to NULL as the default value, will take up more space on the disk.

  • NULL does take up storage space, but in order to see any real impact of storing NULL values within a database, you would literally have to have billions of such values (at that point NULL values are the least of your problems.) In other words, NULL values don’t require much storage space in and of itself for many storage engines within MySQL, however, if the NDB storage engine (or NDBCluster) is in use, things change a little – it reserves 4 bytes per row for NULL values. However, the impact of this is probably negligible.

However, if we come to partitions, that’s where NULL values cause the most problems for many advanced DBAs. We get into them below.

Partitioning and NULL

One more problem with NULL values that is specific to MySQL and its associates is partitioning. MySQL as an RDBMS doesn’t exactly disallow or forbid the NULL value to be used, but it does treat the NULL value as being “less than any other value.” Here’s how everything works when our data is partitioned:

  • If you insert NULL values into a partition on a table partitioned by RANGE, the row is inserted into the lowest partition possible.

  • If LIST partitioning is in use, NULL values can only be inserted into the specific partition if NULL is specified in the list of possible values. Otherwise, we’ll get the error 1504: ERROR 1504 (HY000): Table has no partition for value NULL

    ERROR 1504 (HY000): Table has no partition for value NULL.

Table has no partition for value NULL.

  • If HASH or KEY partitioning is being used, NULL is treated the same as if the expression would return 0 (more information can be found over at the documentation.)

Summary

In this blog, we’ve walked you through the concept of NULL values in relational database management systems, with the examples being based on MySQL. While the concept of NULL values might seem problematic and perhaps even frightening to some, it doesn’t have to be that way – when properly managed, NULL values can become a really good friend of yours.

We hope that you’ve enjoyed reading this blog, come back to the DbVisualizer blog in the future to learn even more about databases and how they work, and until next time!

About the author

Lukas Vileikis is an ethical hacker and a frequent conference speaker. He runs one of the biggest & fastest data breach search engines in the world - BreachDirectory.com, frequently speaks at conferences and blogs in multiple places including his blog over at lukasvileikis.com.