I came today upon a very strange issue. It seems like MySQL is unable to utilize a compound index when evaluating a plan for a query with a range condition. I’m looking for an explanation. I’ll appreciate any insight on this.
Take a look at the following table:
CREATE TABLE `t` ( `a` int(11) NOT NULL default '0', `b` int(11) NOT NULL default '0', `c` int(11) default NULL, PRIMARY KEY (`a`,`b`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1
Filled with this data:
mysql> SELECT * FROM t; +---+---+------+ | a | b | c | +---+---+------+ | 1 | 1 | NULL | | 1 | 2 | NULL | | 1 | 3 | NULL | | 1 | 4 | NULL | | 1 | 5 | NULL | | 2 | 1 | NULL | | 2 | 2 | NULL | | 2 | 3 | NULL | | 2 | 4 | NULL | | 2 | 5 | NULL | +---+---+------+ 10 rows in set (0.00 sec)
Now, it is known that I can query by tuples:
mysql> SELECT * FROM t WHERE (a,b) < (2,2); +---+---+------+ | a | b | c | +---+---+------+ | 1 | 1 | NULL | | 1 | 2 | NULL | | 1 | 3 | NULL | | 1 | 4 | NULL | | 1 | 5 | NULL | | 2 | 1 | NULL | +---+---+------+ 6 rows in set (0.00 sec)
MySQL understands tuple comparison (e.g. (a,b) < (2.2)) and returns correct results. Now here’s my issue: I would assume the PRIMARY KEY is used – since it’s on (a,b) – so that’s a simple (well, compound) range condition. Alas:
mysql> EXPLAIN SELECT * FROM t WHERE (a,b) < (2,2)G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: t type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 10 Extra: Using where 1 row in set (0.00 sec)
We get a full table scan! Now, MySQL has no problem when I do an equality search (e.g. (a,b) = (2,2)). In that case, the PRIMARY KEY is utilized.
Why, then, would it not use it in the range query? Why would it not consider it as a possible key (I could live with FORCE KEY if that would solve the problem). This problem undermines some efforts of mine for nice optimization tricks.
[UPDATE: the above is a simplified version of a very large table I was using (~50M rows, ~30GB), and on which same results were achieved]
Any ideas or suggestions?
Also, I’m a little unsure what mean with row comparators.
is
select * from t where (a,b) < (2,2) equivalent to:
select * from t where a<2 and b<2
OR
select * from t where a<=2 and b<2
in other words:
is (2,1) < (2,2) ?
I’m pretty sure that row comparison operators just can’t use indexes in MySQL.
If you check out:
http://dev.mysql.com/doc/refman/5.0/en/subquery-restrictions.html
You see this gem:
* Row constructors are not well optimized. The following two expressions are equivalent, but only the second can be optimized:
(col1, col2, …) = (val1, val2, …)
col1 = val1 AND col2 = val2 AND …
Possible work around is to make sure you have a covering index when doing ROW() comparisons.
mysql> desc SELECT * FROM t WHERE (a,b) > (2,2);
+—-+————-+——-+——+—————+——+———+——+——+————-+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+—-+————-+——-+——+—————+——+———+——+——+————-+
| 1 | SIMPLE | t | ALL | NULL | NULL | NULL | NULL | 37 | Using where |
+—-+————-+——-+——+—————+——+———+——+——+————-+
1 row in set (0.00 sec)
mysql> alter table t add index (a,b,c);
Query OK, 37 rows affected (0.08 sec)
Records: 37 Duplicates: 0 Warnings: 0
mysql> desc SELECT * FROM t WHERE (a,b) > (2,2);
+—-+————-+——-+——-+—————+——+———+——+——+————————–+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+—-+————-+——-+——-+—————+——+———+——+——+————————–+
| 1 | SIMPLE | t | index | NULL | a | 13 | NULL | 37 | Using where; Using index |
+—-+————-+——-+——-+—————+——+———+——+——+————————–+
1 row in set (0.00 sec)
I *think* we have a bug report for that already, can’t find it in the bugs database though …
Row comparators simply aren’t very well-optimized in MySQL yet. This is a documented limitation, although if memory serves it’s actually mentioned as a subquery limitation.
Some cases are handled properly (equality checks should behave in 5.1). Others you simply have to expand (ie: WHERE (a<2) OR (a=2 AND b<2)).