{"id":2368,"date":"2010-05-11T06:57:02","date_gmt":"2010-05-11T04:57:02","guid":{"rendered":"http:\/\/code.openark.org\/blog\/?p=2368"},"modified":"2010-05-11T06:57:02","modified_gmt":"2010-05-11T04:57:02","slug":"explain-missing-db-info","status":"publish","type":"post","link":"https:\/\/code.openark.org\/blog\/mysql\/explain-missing-db-info","title":{"rendered":"EXPLAIN: missing db info"},"content":{"rendered":"<p>I&#8217;m further developing a general log hook, which can stream queries from the general log.<\/p>\n<p>A particular direction I&#8217;m taking is to filter queries by their type of actions. For example, the tool (<a href=\"http:\/\/code.google.com\/p\/openarkkit\/source\/browse\/trunk\/openarkkit\/src\/oak\/oak-hook-general-log.py\">oak-hook-general-log<\/a>) can be instructed to only stream out those queries which involve creation of a temporary table; or those which cause for a filesort, or full index scan, etc.<\/p>\n<p>This is done by evaluating of query execution plans on the fly. I suspect the <a href=\"http:\/\/www.mysql.com\/why-mysql\/white-papers\/mysql_wp_queryanalyzer.php\">MySQL query analyzer<\/a> roughly does the same (as a small part of what it does).<\/p>\n<p>It&#8217;s almost nothing one cannot do with sed\/awk. However, I bumped into a couple of problems:<\/p>\n<ol>\n<li>The general log (and the <strong>mysql.general_log table<\/strong>, in\u00a0 particular) does not indicate the particular database one is using for the query. Since slow log does indicate this data, I <a href=\"http:\/\/bugs.mysql.com\/bug.php?id=52554\">filed a bug<\/a> on this. I currently solve this by crossing connection id with the process list, where the current database is listed. It&#8217;s shaky, but mostly works.<\/li>\n<li>Just realized: there&#8217;s no DB info in the <strong>EXPLAIN<\/strong> output! It&#8217;s weird, since I&#8217;ve been EXPLAINing queries for years now. But I&#8217;ve always had the advantage of <em>knowing<\/em> the schema used: either because I was manually executing the query on a known schema, or <a href=\"http:\/\/www.maatkit.org\/doc\/mk-query-digest.html\">mk-query-digest<\/a> was kind enough to let me know.<\/li>\n<\/ol>\n<p><!--more-->For example, look at the following imaginary query, involving both the <strong>world<\/strong> and <strong>sakila<\/strong> databases:<\/p>\n<blockquote>\n<pre>mysql&gt; use test;\r\nDatabase changed\r\nmysql&gt; EXPLAIN SELECT * FROM world.Country JOIN sakila.city WHERE Country.Capital = city.city_id;\r\n+----+-------------+---------+--------+---------------+---------+---------+-----------------------+------+-------------+\r\n| id | select_type | table\u00a0\u00a0 | type\u00a0\u00a0 | possible_keys | key\u00a0\u00a0\u00a0\u00a0 | key_len | ref\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 | rows | Extra\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |\r\n+----+-------------+---------+--------+---------------+---------+---------+-----------------------+------+-------------+\r\n|\u00a0 1 | SIMPLE\u00a0\u00a0\u00a0\u00a0\u00a0 | Country | ALL\u00a0\u00a0\u00a0 | NULL\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 | NULL\u00a0\u00a0\u00a0 | NULL\u00a0\u00a0\u00a0 | NULL\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |\u00a0 239 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |\r\n|\u00a0 1 | SIMPLE\u00a0\u00a0\u00a0\u00a0\u00a0 | city\u00a0\u00a0\u00a0 | eq_ref | PRIMARY\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 | PRIMARY | 2\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 | world.Country.Capital |\u00a0\u00a0\u00a0 1 | Using where |\r\n+----+-------------+---------+--------+---------------+---------+---------+-----------------------+------+-------------+\r\n2 rows in set (0.00 sec)\r\n<\/pre>\n<\/blockquote>\n<p>The query is imaginary, since the tables don&#8217;t actually have anything in common. But look at the <strong>EXPLAIN<\/strong> result: can you tell where <strong>city<\/strong> came from? <strong>Country<\/strong> can somehow be parsed from the <strong>ref<\/strong> column, but no help on <strong>city<\/strong>.<\/p>\n<p>Moreover, table aliases show on the <strong>EXPLAIN<\/strong> plan (which is good), but with no reference to the original table.<\/p>\n<p>So, is it back to parsing of the SQL query? I&#8217;m <span style=\"text-decoration: line-through;\">lazy<\/span> reluctant to do that. It&#8217;s error prone, and one needs to implement, or use, a good parser, which also accepts MySQL dialect. Haven&#8217;t looked into this yet.<\/p>\n<p>I&#8217;m currently at a standstill with regard to automated query execution plan evaluation where database cannot be determined.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I&#8217;m further developing a general log hook, which can stream queries from the general log. A particular direction I&#8217;m taking is to filter queries by their type of actions. For example, the tool (oak-hook-general-log) can be instructed to only stream out those queries which involve creation of a temporary table; or those which cause for [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"enabled":false},"version":2}},"categories":[5],"tags":[25,15,56,34,50],"class_list":["post-2368","post","type-post","status-publish","format-standard","hentry","category-mysql","tag-analysis","tag-execution-plan","tag-logs","tag-openark-kit","tag-scripts"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p2bZZp-Cc","_links":{"self":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts\/2368","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/comments?post=2368"}],"version-history":[{"count":15,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts\/2368\/revisions"}],"predecessor-version":[{"id":2429,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts\/2368\/revisions\/2429"}],"wp:attachment":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/media?parent=2368"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/categories?post=2368"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/tags?post=2368"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}