Beware of implicit casting

Ever so often a query provides a “bad” execution plan. Adding a missing index can many times solve the problem. However, not everything can be solved with an index. I wish to highlight the point of having an implicit cast, which negates the use of an index on MySQL.

I see this happening a lot on customers’ databases, and this begs for a short introduction.

MySQL doesn’t support index functions

Let’s assume the following table:

CREATE TABLE `person` (
 `id` bigint(20) NOT NULL AUTO_INCREMENT,
 `first_name` varchar(32) CHARACTER SET utf8 DEFAULT NULL,
 `last_name` varchar(32) CHARACTER SET utf8 DEFAULT NULL,
 `driver_license_registration` bigint(20) DEFAULT NULL,
 PRIMARY KEY (`id`),
 KEY `last_name` (`last_name`),
 KEY `driver_license_registration` (`driver_license_registration`)
)

And suppose we’re looking for persons whose last name begin with “Smith”. The following query will NOT utilize an index:

SELECT * FROM person WHERE LEFT(last_name, 5) = 'Smith';

Why not? Because we use a function (LEFT) over the last_name column, and this makes MySQL unable to deduce the outcome of the expression. Will it be ordered in the same way as the original column? (in this case, the answer is yes, LEFT is a monotonic function) Is it possible to produce the reverse function? (In this case, no). MySQL cannot and does not handle these questions and therefore avoids using the index on last_name altogether:

EXPLAIN SELECT * FROM person WHERE LEFT(last_name, 5) = 'Smith';
+----+-------------+--------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table  | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+--------+------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | person | ALL  | NULL          | NULL | NULL    | NULL | 1000 | Using where |
+----+-------------+--------+------+---------------+------+---------+------+------+-------------+

To solve the problem above, a simple change to the query will do the trick:

EXPLAIN SELECT * FROM person WHERE last_name LIKE 'Smith%';
+----+-------------+--------+-------+---------------+-----------+---------+------+------+-------------+
| id | select_type | table  | type  | possible_keys | key       | key_len | ref  | rows | Extra       |
+----+-------------+--------+-------+---------------+-----------+---------+------+------+-------------+
|  1 | SIMPLE      | person | range | last_name     | last_name | 99      | NULL |    5 | Using where |
+----+-------------+--------+-------+---------------+-----------+---------+------+------+-------------+

Not all functions are explicit

This leads us to the subject of this post. What if we were to look for persons whose driver license registration number begins with ‘123’? Trying to learn from the above example, we write:

SELECT * FROM person WHERE driver_license_registration LIKE '123%';

But the above query does NOT utilize an index! See:

EXPLAIN SELECT * FROM person WHERE driver_license_registration LIKE '123%';
+----+-------------+--------+------+-----------------------------+------+---------+------+------+-------------+
| id | select_type | table  | type | possible_keys               | key  | key_len | ref  | rows | Extra       |
+----+-------------+--------+------+-----------------------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | person | ALL  | driver_license_registration | NULL | NULL    | NULL | 1000 | Using where |
+----+-------------+--------+------+-----------------------------+------+---------+------+------+-------------+

Why not?

Because the driver_license_registration column is an integer. The query asks for a string comparison (LIKE ‘123%’). This kind of comparison cannot be performed on integers. The column value will have to transform to text in order to work out the query. Effectively, it’s as if we wrote:

SELECT * FROM person WHERE CAST(driver_license_registration AS CHAR) LIKE '123%';

There’s no immediate solution to this last query. Perhaps the column should be textual after all, perhaps maintain a “ghost” column. If the number of digits in a number if fixed an known, then we can convert the above to a range query.

Casting is an implicit function

And, alas, it’s a well hidden function. There’s nothing to suggest a problem here. This is SQL, not Java. It’s not a strongly typed language. I see the above example quite a lot. Another common casting mistake is comparing Timestamp values to strings.

So, until MySQL supports index functions, watch out for these nuances.

6 thoughts on “Beware of implicit casting

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.