{"id":1886,"date":"2010-02-02T21:55:22","date_gmt":"2010-02-02T19:55:22","guid":{"rendered":"http:\/\/code.openark.org\/blog\/?p=1886"},"modified":"2010-02-02T21:55:22","modified_gmt":"2010-02-02T19:55:22","slug":"beware-of-implicit-casting","status":"publish","type":"post","link":"https:\/\/code.openark.org\/blog\/mysql\/beware-of-implicit-casting","title":{"rendered":"Beware of implicit casting"},"content":{"rendered":"<p>Ever so often a query provides a &#8220;bad&#8221; execution plan. Adding a missing index can many times solve the problem. However, not everything can be solved with an index. I wish to highlight the point of having an <em>implicit cast<\/em>, which negates the use of an index on MySQL.<\/p>\n<p>I see this happening a lot on customers&#8217; databases, and this begs for a short introduction.<\/p>\n<h4>MySQL doesn&#8217;t support index functions<\/h4>\n<p>Let&#8217;s assume the following table:<!--more--><\/p>\n<blockquote>\n<pre>CREATE TABLE `person` (\r\n `id` bigint(20) NOT NULL AUTO_INCREMENT,\r\n `first_name` varchar(32) CHARACTER SET utf8 DEFAULT NULL,\r\n `last_name` varchar(32) CHARACTER SET utf8 DEFAULT NULL,\r\n `driver_license_registration` bigint(20) DEFAULT NULL,\r\n PRIMARY KEY (`id`),\r\n KEY `last_name` (`last_name`),\r\n KEY `driver_license_registration` (`driver_license_registration`)\r\n)\r\n<\/pre>\n<\/blockquote>\n<p>And suppose we&#8217;re looking for persons whose last name begin with &#8220;Smith&#8221;. The following query will <strong>NOT<\/strong> utilize an index:<\/p>\n<blockquote>\n<pre>SELECT * FROM person WHERE <strong>LEFT(last_name, 5) = 'Smith'<\/strong>;\r\n<\/pre>\n<\/blockquote>\n<p>Why not? Because we use a function (<strong>LEFT<\/strong>) over the <strong>last_name<\/strong> column, and this makes MySQL unable to deduce the outcome of the expression. Will it be ordered in the same way as the original column? (in this case, the answer is <em>yes<\/em>, <strong>LEFT<\/strong> is a monotonic function) Is it possible to produce the reverse function? (In this case, <em>no<\/em>). MySQL cannot and does not handle these questions and therefore avoids using the index on <strong>last_name<\/strong> altogether:<\/p>\n<blockquote>\n<pre>EXPLAIN SELECT * FROM person WHERE LEFT(last_name, 5) = 'Smith';\r\n+----+-------------+--------+------+---------------+------+---------+------+------+-------------+\r\n| id | select_type | table\u00a0 | type | possible_keys | key\u00a0 | key_len | ref\u00a0 | rows | Extra\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |\r\n+----+-------------+--------+------+---------------+------+---------+------+------+-------------+\r\n|\u00a0 1 | SIMPLE\u00a0\u00a0\u00a0\u00a0\u00a0 | person | <strong>ALL<\/strong>\u00a0 | NULL\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 | NULL | NULL\u00a0\u00a0\u00a0 | NULL | 1000 | Using where |\r\n+----+-------------+--------+------+---------------+------+---------+------+------+-------------+\r\n<\/pre>\n<\/blockquote>\n<p>To solve the problem above, a simple change to the query will do the trick:<\/p>\n<blockquote>\n<pre>EXPLAIN SELECT * FROM person WHERE last_name <strong>LIKE 'Smith%'<\/strong>;\r\n+----+-------------+--------+-------+---------------+-----------+---------+------+------+-------------+\r\n| id | select_type | table\u00a0 | type\u00a0 | possible_keys | key\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 | key_len | ref\u00a0 | rows | Extra\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |\r\n+----+-------------+--------+-------+---------------+-----------+---------+------+------+-------------+\r\n|\u00a0 1 | SIMPLE\u00a0\u00a0\u00a0\u00a0\u00a0 | person | <strong>range<\/strong> | last_name\u00a0\u00a0\u00a0\u00a0 | last_name | 99\u00a0\u00a0\u00a0\u00a0\u00a0 | NULL |\u00a0\u00a0\u00a0 5 | Using where |\r\n+----+-------------+--------+-------+---------------+-----------+---------+------+------+-------------+\r\n<\/pre>\n<\/blockquote>\n<h4>Not all functions are explicit<\/h4>\n<p>This leads us to the subject of this post. What if we were to look for persons whose driver license registration number begins with <strong>&#8216;123&#8217;<\/strong>? Trying to learn from the above example, we write:<\/p>\n<blockquote>\n<pre>SELECT * FROM person WHERE driver_license_registration LIKE '123%';<\/pre>\n<\/blockquote>\n<p>But the above query does <strong>NOT<\/strong> utilize an index! See:<\/p>\n<blockquote>\n<pre>EXPLAIN SELECT * FROM person WHERE driver_license_registration LIKE '123%';\r\n+----+-------------+--------+------+-----------------------------+------+---------+------+------+-------------+\r\n| id | select_type | table\u00a0 | type | possible_keys\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 | key\u00a0 | key_len | ref\u00a0 | rows | Extra\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |\r\n+----+-------------+--------+------+-----------------------------+------+---------+------+------+-------------+\r\n|\u00a0 1 | SIMPLE\u00a0\u00a0\u00a0\u00a0\u00a0 | person | ALL\u00a0 | driver_license_registration | NULL | NULL\u00a0\u00a0\u00a0 | NULL | 1000 | Using where |\r\n+----+-------------+--------+------+-----------------------------+------+---------+------+------+-------------+\r\n<\/pre>\n<\/blockquote>\n<p>Why not?<\/p>\n<p>Because the <strong>driver_license_registration<\/strong> column is an integer. The query asks for a string comparison (<strong>LIKE &#8216;123%&#8217;<\/strong>). This kind of comparison cannot be performed on integers. The column value will have to transform to text in order to work out the query. Effectively, it&#8217;s as if we wrote:<\/p>\n<blockquote>\n<pre>SELECT * FROM person WHERE <strong>CAST(driver_license_registration AS CHAR)<\/strong> LIKE '123%';\r\n<\/pre>\n<\/blockquote>\n<p>There&#8217;s no immediate solution to this last query. Perhaps the column should be textual after all, perhaps maintain a &#8220;ghost&#8221; column. If the number of digits in a number if fixed an known, then we can convert the above to a range query.<\/p>\n<h4>Casting is an implicit function<\/h4>\n<p>And, alas, it&#8217;s a well hidden function. There&#8217;s nothing to suggest a problem here. This is SQL, not Java. It&#8217;s not a strongly typed language. I see the above example quite a lot. Another common casting mistake is comparing Timestamp values to strings.<\/p>\n<p>So, until MySQL supports index functions, watch out for these nuances.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Ever so often a query provides a &#8220;bad&#8221; execution plan. Adding a missing index can many times solve the problem. However, not everything can be solved with an index. I wish to highlight the point of having an implicit cast, which negates the use of an index on MySQL. I see this happening a lot [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"enabled":false},"version":2}},"categories":[5],"tags":[26,21],"class_list":["post-1886","post","type-post","status-publish","format-standard","hentry","category-mysql","tag-indexing","tag-sql"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p2bZZp-uq","_links":{"self":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts\/1886","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/comments?post=1886"}],"version-history":[{"count":14,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts\/1886\/revisions"}],"predecessor-version":[{"id":1901,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts\/1886\/revisions\/1901"}],"wp:attachment":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/media?parent=1886"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/categories?post=1886"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/tags?post=1886"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}