I’ve been trying to think of something “big” to write about for so long, I haven’t written anything. So I’ll write about something “small” that I found out the other day.
It turns out, ‘NOT IN’ and ‘NULL’ can have an odd (to me) effect.
Say you are doing something like
SELECT a FROM table1
WHERE a NOT IN
(SELECT a FROM table2);
If there are any NULLs in the table2′s a column, you will never get any results from this query.
Here’s an example:
– first, here are the two tables I used:
mysql> select * from test1;
+——+
| a |
+——+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
+——+
6 rows in set (0.02 sec)
mysql> select * from test2;
+——+
| a |
+——+
| 1 |
| 2 |
| 3 |
| NULL |
+——+
4 rows in set (0.00 sec)
mysql> select a from test1 where a not in (select a from test2);
Empty set (0.02 sec)
– personally, I expected to get 4,5,6. For that, you can do the following:
mysql> select a from test1 where a not in (select a from test2 where a is NOT NULL);
+——+
| a |
+——+
| 4 |
| 5 |
| 6 |
+——+
3 rows in set (0.00 sec)
Why is this? It is actually to follow the SQL standards on NULL. As mentioned at http://dev.mysql.com/doc/refman/5.1/en/comparison-operators.html ,
“To comply with the SQL standard, IN returns NULL not only if the expression on the left hand side is NULL, but also if no match is found in the list and one of the expressions in the list is NULL.”
Just as “1=NULL” is undefined rather than false, and “1<>NULL” is too, “1 in (null)” is also undefined (or null), and “1 NOT IN (null) is null”.
mysql> select 1 in (null);
+————-+
| 1 in (null) |
+————-+
| NULL |
+————-+
1 row in set (0.00 sec)
mysql> select 1 not in (null);
+—————–+
| 1 not in (null) |
+—————–+
| NULL |
+—————–+
1 row in set (0.00 sec)
To simplify the examples with tables above, we can do the following, and see similar results.
mysql> select 1 not in (2);
+————–+
| 1 not in (2) |
+————–+
| 1 |
+————–+
1 row in set (0.00 sec)
mysql> select 1 not in (2,null);
+——————-+
| 1 not in (2,null) |
+——————-+
| NULL |
+——————-+
1 row in set (0.00 sec)
Personally, once I read the reasoning, it made sense to me, but initially I was surprised. I thought I’d write about it, because it is common to do a “SELECT … WHERE NOT IN (SELECT …)”, as in the example above with table1 and table2, and the results might not be what you expect if you have any NULLs in your table.
February 27, 2010 at 7:12 am |
Interesting. In cases like this, I use an alternative query form that doesn’t use a subquery:
SELECT a FROM test1
LEFT OUTER JOIN test2 USING (a)
WHERE test2.a IS NULL;
This will return the rows from test1 when there is no matching row in test2, even if test2 contains a NULL.
February 28, 2010 at 3:36 am |
Bill – good point! Your version is also generally more efficient!
Thanks,
Ben
June 23, 2010 at 3:52 am |
hi,
i would like to check whether is there any way for me to read MySQL tables at Oracle???
if withing oracle and oracle i can do that by synonym…but how if different type of databases…..or is there any other way to accomplish it???
advance thank for reply…