[TYPO3-core] FYI #14182: Multiple join conditions are not supported

Ries van Twisk typo3 at rvt.dds.nl
Fri Apr 23 00:26:29 CEST 2010


On Apr 22, 2010, at 4:51 PM, Jigal van Hemert wrote:

> Ries van Twisk wrote:
>>> Xavier Perseguers wrote:
>>>> $rec = $GLOBALS['TYPO3_DB']->exec_SELECTquery(
>>>>   '*',
>>>>   'T1 LEFT OUTER JOIN T2 ON T2.pid=T1.uid AND T2.size=4',
>>>>   'T1.cr_userid=1'
>>>> );
>> you cannot always move T2.size to the where clause specially not  
>> with a outer join,
>> this get's most tricky when size can be NULL.
>
> Can you give an example for this situation?
>
> If T2.size is NULL, T2.size=4 becomes NULL=4 which evaluates to "not  
> true" in the expression.
> Actually:
> Size   (Size=4) IS TRUE	  (Size=4) IS FALSE   (Size=4) IS UNKNOWN
> 4	1			0			0
> 3	0			1			0
> NULL	0			0			1
> For conditions NULL=4 is "not true", so that will work fine (in  
> MySQL at least).
>
> The only difference AFAIK can be that the query optimizer does not  
> move this condition to an early stage in the JOIN processing. The  
> condition "should" be in the ON clause to suggest to the query  
> optimizer that it can be used at that point to decide which rows to  
> consider for that join.
> If the condition is in the WHERE clause it might be processed later,  
> which leads to a larger possible result set before the WHERE  
> conditions are applied.
>
> So far I've seen that other DBMS-s might handle this and other SQL  
> features in different ways.
>



The query optimizer cannot really do much here, since it sees it's a  
LEFT JOIN it cannot move conditions around except trying to understand  
how much rows each request
will return. Since there are just a couple of rows in the tables the  
query optimizer will simply do table scans and not use a key at all.

Putting a condition in the ON clause doesn't have much to do with  
optimizing (not completely true), but telling your DMB what rows you  
are interested in.
See below for some examples of this.

It's true that some RDBM systems behave differently when handling NULL.
Best practice is to avoid NULL in a RDBM unless you really really need  
it.

The condition NULL=4 is NULL (not true or false)

This is done in MySQL

mysql> CREATE TABLE test1 (id int NOT NULL AUTO_INCREMENT, fa int,  
PRIMARY KEY (id));
Query OK, 0 rows affected (0.02 sec)

mysql> CREATE TABLE test2 (id int NOT NULL AUTO_INCREMENT, fa int,  
PRIMARY KEY (id));
Query OK, 0 rows affected (0.01 sec)

mysql> INSERT INTO test1 VALUES (DEFAULT, 1);
Query OK, 1 row affected (0.06 sec)
mysql> INSERT INTO test1 VALUES (DEFAULT, 2);
Query OK, 1 row affected (0.00 sec)
mysql> INSERT INTO test1 VALUES (DEFAULT, 3);
Query OK, 1 row affected (0.00 sec)
mysql> INSERT INTO test1 VALUES (DEFAULT, 4);
Query OK, 1 row affected (0.00 sec)
mysql> INSERT INTO test1 VALUES (DEFAULT, null);
Query OK, 1 row affected (0.00 sec)
mysql> INSERT INTO test1 VALUES (DEFAULT, null);
Query OK, 1 row affected (0.00 sec)


mysql> SELECT * FROM test1;
+----+------+
| id | fa   |
+----+------+
|  1 |    1 |
|  2 |    2 |
|  3 |    3 |
|  4 |    4 |
|  5 | NULL |
|  6 | NULL |
+----+------+
6 rows in set (0.00 sec)


mysql> INSERT INTO test2 VALUES (DEFAULT, 1);
Query OK, 1 row affected (0.00 sec)
mysql> INSERT INTO test2 VALUES (DEFAULT, 2);
Query OK, 1 row affected (0.00 sec)
mysql> INSERT INTO test2 VALUES (DEFAULT, 3);
Query OK, 1 row affected (0.00 sec)
mysql> INSERT INTO test2 VALUES (DEFAULT, 4);
Query OK, 1 row affected (0.00 sec)
mysql> INSERT INTO test2 VALUES (DEFAULT, 5);
Query OK, 1 row affected (0.00 sec)
mysql> INSERT INTO test2 VALUES (DEFAULT, 6);
Query OK, 1 row affected (0.00 sec)
mysql> INSERT INTO test2 VALUES (DEFAULT, 7);
Query OK, 1 row affected (0.00 sec)
mysql> INSERT INTO test2 VALUES (DEFAULT, null);
Query OK, 1 row affected (0.00 sec)
mysql> INSERT INTO test2 VALUES (DEFAULT, null);
Query OK, 1 row affected (0.00 sec)

mysql> SELECT * FROM test2;
+----+------+
| id | fa   |
+----+------+
|  1 |    1 |
|  2 |    2 |
|  3 |    3 |
|  4 |    4 |
|  5 |    5 |
|  6 |    6 |
|  7 |    7 |
|  8 | NULL |
|  9 | NULL |
+----+------+
9 rows in set (0.00 sec)


-- Some test
mysql> SELECT * FROM test1 a JOIN test2 b ON (a.fa=b.fa);
+----+------+----+------+
| id | fa   | id | fa   |
+----+------+----+------+
|  1 |    1 |  1 |    1 |
|  2 |    2 |  2 |    2 |
|  3 |    3 |  3 |    3 |
|  4 |    4 |  4 |    4 |
+----+------+----+------+
4 rows in set (0.00 sec)

-- OK, make sence, after all only row 4 is selected
mysql> SELECT * FROM test1 a JOIN test2 b ON (a.fa=b.fa AND b.fa=4);
+----+------+----+------+
| id | fa   | id | fa   |
+----+------+----+------+
|  4 |    4 |  4 |    4 |
+----+------+----+------+
1 row in set (0.00 sec)

-- OK, make sence, after all only row 4 is selected even though it's  
in the where clause
mysql> SELECT * FROM test1 a JOIN test2 b ON (a.fa=b.fa) WHERE a.fa=4;
+----+------+----+------+
| id | fa   | id | fa   |
+----+------+----+------+
|  4 |    4 |  4 |    4 |
+----+------+----+------+
1 row in set (0.00 sec)

-- Here it get's a bit more tricky where some people might get  
confused since the result set is zero,
-- SOme people expect to see here two rows both are NULL, after all  
NULL = NULL, right??? WRONG!!!

mysql> SELECT * FROM test1 a JOIN test2 b ON (a.fa=b.fa AND a.fa IS  
NULL);
Empty set (0.00 sec)

mysql> SELECT * FROM test1 a JOIN test2 b ON (a.fa=b.fa) WHERE a.fa IS  
NULL;
Empty set (0.00 sec)

-- SOme proof, NULL is not NULL,as in it doesn't result to TRUE in  
it's evaulation
mysql> SELECT NULL=NULL;
+-----------+
| NULL=NULL |
+-----------+
|      NULL |
+-----------+
1 row in set (0.00 sec)

-- For the sake of completeness
mysql> SELECT NULL IS NULL;
+--------------+
| NULL IS NULL |
+--------------+
|            1 |
+--------------+
1 row in set (0.00 sec)


-- It can get more confusing
-- LEFT JOIN Shows two rows now!?!?!?! What the ......
-- It's because we get ALL rows from test1 and filtered LATER to only  
show the NULL rows
mysql> SELECT * FROM test1 a LEFT JOIN test2 b ON (a.fa=b.fa) WHERE  
a.fa IS NULL;
+----+------+------+------+
| id | fa   | id   | fa   |
+----+------+------+------+
|  5 | NULL | NULL | NULL |
|  6 | NULL | NULL | NULL |
+----+------+------+------+
2 rows in set (0.00 sec)


-- Here we get ALL rows from test1 and all matching rows from test2.
-- Since test2 only returns the NULL rows we see all NULL here
-- There is a clear difference where a.fa get's evaulated!!!
mysql> SELECT * FROM test1 a LEFT JOIN test2 b ON (a.fa=b.fa AND a.fa  
IS NULL);
+----+------+------+------+
| id | fa   | id   | fa   |
+----+------+------+------+
|  1 |    1 | NULL | NULL |
|  2 |    2 | NULL | NULL |
|  3 |    3 | NULL | NULL |
|  4 |    4 | NULL | NULL |
|  5 | NULL | NULL | NULL |
|  6 | NULL | NULL | NULL |
+----+------+------+------+
6 rows in set (0.00 sec)


-- For teh sake of completeness
-- We get all rows from test1 and all matching rows from test2
mysql> SELECT * FROM test1 a LEFT JOIN test2 b ON (a.fa=b.fa);
+----+------+------+------+
| id | fa   | id   | fa   |
+----+------+------+------+
|  1 |    1 |    1 |    1 |
|  2 |    2 |    2 |    2 |
|  3 |    3 |    3 |    3 |
|  4 |    4 |    4 |    4 |
|  5 | NULL | NULL | NULL |
|  6 | NULL | NULL | NULL |
+----+------+------+------+
6 rows in set (0.01 sec)


-- The other way around from above
mysql> SELECT * FROM test2 a LEFT JOIN test1 b ON (a.fa=b.fa);
+----+------+------+------+
| id | fa   | id   | fa   |
+----+------+------+------+
|  1 |    1 |    1 |    1 |
|  2 |    2 |    2 |    2 |
|  3 |    3 |    3 |    3 |
|  4 |    4 |    4 |    4 |
|  5 |    5 | NULL | NULL |
|  6 |    6 | NULL | NULL |
|  7 |    7 | NULL | NULL |
|  8 | NULL | NULL | NULL |
|  9 | NULL | NULL | NULL |
+----+------+------+------+
9 rows in set (0.01 sec)



More information about the TYPO3-team-core mailing list