I came today upon a very strange issue. It seems like MySQL is unable to utilize a compound index when evaluating a plan for a query with a range condition. I’m looking for an explanation. I’ll appreciate any insight on this.
Take a look at the following table:
CREATE TABLE `t` ( `a` int(11) NOT NULL default '0', `b` int(11) NOT NULL default '0', `c` int(11) default NULL, PRIMARY KEY (`a`,`b`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1
Filled with this data:
mysql> SELECT * FROM t; +---+---+------+ | a | b | c | +---+---+------+ | 1 | 1 | NULL | | 1 | 2 | NULL | | 1 | 3 | NULL | | 1 | 4 | NULL | | 1 | 5 | NULL | | 2 | 1 | NULL | | 2 | 2 | NULL | | 2 | 3 | NULL | | 2 | 4 | NULL | | 2 | 5 | NULL | +---+---+------+ 10 rows in set (0.00 sec)
Now, it is known that I can query by tuples:
mysql> SELECT * FROM t WHERE (a,b) < (2,2); +---+---+------+ | a | b | c | +---+---+------+ | 1 | 1 | NULL | | 1 | 2 | NULL | | 1 | 3 | NULL | | 1 | 4 | NULL | | 1 | 5 | NULL | | 2 | 1 | NULL | +---+---+------+ 6 rows in set (0.00 sec)
MySQL understands tuple comparison (e.g. (a,b) < (2.2)) and returns correct results. Now here’s my issue: I would assume the PRIMARY KEY is used – since it’s on (a,b) – so that’s a simple (well, compound) range condition. Alas:
mysql> EXPLAIN SELECT * FROM t WHERE (a,b) < (2,2)G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: t type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 10 Extra: Using where 1 row in set (0.00 sec)
We get a full table scan! Now, MySQL has no problem when I do an equality search (e.g. (a,b) = (2,2)). In that case, the PRIMARY KEY is utilized.
Why, then, would it not use it in the range query? Why would it not consider it as a possible key (I could live with FORCE KEY if that would solve the problem). This problem undermines some efforts of mine for nice optimization tricks.
[UPDATE: the above is a simplified version of a very large table I was using (~50M rows, ~30GB), and on which same results were achieved]
Any ideas or suggestions?
The query isn’t a subquery, however for some reason that is where the docs mentions the restriction about row operators. I suspect it was because the row operators were added at the same time as subqueries.
Really, the restriction should be put somewhere else in the documentation, such as the index section or even where row operators are mentioned in the docs.
@Shantanu,
Thanks,
1. Haven’t tried this with 5.1
2. I do
3. Unlimited (killed query after realizing it was running for a few minutes)
@Harrison,
I agree. Thanks for the reference!
Shlomi, as others said — this is just an optimizer weakness at the moment. Peter posted on this a while ago, too: http://www.mysqlperformanceblog.com/2008/04/04/multi-column-in-clause-unexpected-mysql-issue/