on NULL and NOT IN

I’ve been trying to think of something “big” to write about for so long, I haven’t written anything.  So I’ll write about something “small” that I found out the other day.

It turns out, ‘NOT IN’ and ‘NULL’ can have an odd (to me) effect.

Say you are doing something like

SELECT a FROM table1
WHERE a NOT IN
(SELECT a FROM table2);

If there are any NULLs in the table2′s a column, you will never get any results from this query.
Here’s an example:

– first, here are the two tables I used:

mysql> select * from test1;
+——+
| a    |
+——+
|    1 |
|    2 |
|    3 |
|    4 |
|    5 |
|    6 |
+——+
6 rows in set (0.02 sec)

mysql> select * from test2;
+——+
| a    |
+——+
|    1 |
|    2 |
|    3 |
| NULL |
+——+
4 rows in set (0.00 sec)

mysql> select a from test1 where a not in (select a from test2);
Empty set (0.02 sec)

– personally, I expected to get 4,5,6.  For that, you can do the following:

mysql> select a from test1 where a not in (select a from test2 where a is NOT NULL);
+——+
| a    |
+——+
|    4 |
|    5 |
|    6 |
+——+
3 rows in set (0.00 sec)

Why is this?  It is actually to follow the SQL standards on NULL.  As mentioned at http://dev.mysql.com/doc/refman/5.1/en/comparison-operators.html ,

“To comply with the SQL standard, IN returns NULL not only if the expression on the left hand side is NULL, but also if no match is found in the list and one of the expressions in the list is NULL.”

Just as “1=NULL” is undefined rather than false, and “1<>NULL” is too, “1 in (null)” is also undefined (or null), and “1 NOT IN (null) is null”.

mysql> select 1 in (null);
+————-+
| 1 in (null) |
+————-+
|        NULL |
+————-+
1 row in set (0.00 sec)

mysql> select 1 not in (null);
+—————–+
| 1 not in (null) |
+—————–+
|            NULL |
+—————–+
1 row in set (0.00 sec)

To simplify the examples with tables above, we can do the following, and see similar results.

mysql> select 1 not in (2);
+————–+
| 1 not in (2) |
+————–+
| 1            |
+————–+
1 row in set (0.00 sec)

mysql> select 1 not in (2,null);
+——————-+
| 1 not in (2,null) |
+——————-+
| NULL              |
+——————-+
1 row in set (0.00 sec)

Personally, once I read the reasoning, it made sense to me, but initially I was surprised.  I thought I’d write about it, because it is common to do a “SELECT … WHERE NOT IN (SELECT …)”, as in the example above with table1 and table2, and the results might not be what you expect if you have any NULLs in your table.

About these ads

3 Responses to “on NULL and NOT IN”

  1. Bill Karwin Says:

    Interesting. In cases like this, I use an alternative query form that doesn’t use a subquery:

    SELECT a FROM test1
    LEFT OUTER JOIN test2 USING (a)
    WHERE test2.a IS NULL;

    This will return the rows from test1 when there is no matching row in test2, even if test2 contains a NULL.

  2. ben Says:

    Bill – good point! Your version is also generally more efficient!

    Thanks,

    Ben

  3. nur_ina Says:

    hi,
    i would like to check whether is there any way for me to read MySQL tables at Oracle???

    if withing oracle and oracle i can do that by synonym…but how if different type of databases…..or is there any other way to accomplish it???

    advance thank for reply…

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


Follow

Get every new post delivered to your Inbox.

%d bloggers like this: