{"id":4947,"date":"2012-06-18T11:16:29","date_gmt":"2012-06-18T09:16:29","guid":{"rendered":"http:\/\/code.openark.org\/blog\/?p=4947"},"modified":"2012-06-18T11:16:29","modified_gmt":"2012-06-18T09:16:29","slug":"on-stored-routines-and-dynamic-statements","status":"publish","type":"post","link":"https:\/\/code.openark.org\/blog\/mysql\/on-stored-routines-and-dynamic-statements","title":{"rendered":"On stored routines and dynamic statements"},"content":{"rendered":"<p>I very much enjoyed reading <a href=\"http:\/\/blog.mclaughlinsoftware.com\/2012\/06\/16\/overloading-procedures\/\" rel=\"bookmark\">Overloading Procedures<\/a> by Michael McLaughlin: good stuff!<\/p>\n<p>I&#8217;m dealing with similar issues in <a href=\"http:\/\/common-schema.googlecode.com\/svn\/trunk\/common_schema\/doc\/html\/query_script.html\">common_schema\/QueryScript<\/a>, where I implement a whole new scripting language within MySQL, interpreted by stored routines. I am now finalizing the next version of <em>common_schema\/QueryScript<\/em>, with a major addition to the scripting language to put yet even more power at the hands of the programmer\/DBA using simple, clean syntax.<\/p>\n<p>Still hush hush, the development of that feature touched at the very same issues described in Michael&#8217;s post. Present in current release, these issues are intensified by the use and complexity of the new development. Here are a few insights of mine:<\/p>\n<h4>Internal array implementation<\/h4>\n<p>Like Michael, I started by implementing arrays through tables. That is, create a (temporary, in my case) table, wrap it up with a lot of stored routine code, and simulate an array. This array is not yet provided to the user, but is used internally for QueryScript&#8217;s own code.<\/p>\n<p>Well, disappointment here: during load tests on intense structures, such as a <a href=\"http:\/\/common-schema.googlecode.com\/svn\/trunk\/common_schema\/doc\/html\/query_script_foreach.html\">foreach<\/a> loop, where each iteration of the loop requires the creation of an array, I found that the current solution does not hold well on busy servers.<\/p>\n<p>Seemingly, there&#8217;s nothing wrong with the creation of a new table every once in a while &#8212; and in particular a temporary table. However, I quickly found out that a busy server thrashes the table cache with such intense rate of creation\/dropping of tables. The competition over the table cache mutex becomes intolerable and hogs not only the script&#8217;s execution but the entire server&#8217;s.<\/p>\n<p>There&#8217;s also the issue of the type of array values &#8212; no going around using textual columns, of course, but &#8212; how long? A <strong>VARCHAR(32767)<\/strong> should be enough for any reasonable implementation, but &#8212; how much memory would that consume? Both <strong>MEMORY<\/strong> and standard temporary tables (<em>Percona Server<\/em> has that partially <a href=\"http:\/\/www.mysqlperformanceblog.com\/2011\/09\/06\/dynamic-row-format-for-memory-tables\/\">resolved<\/a>) use a fixed row format, which means a 32K text is actually allocated in memory even when your value is &#8216;x&#8217;.<!--more--><\/p>\n<p>The next release completely rewrites the internal array implementation. How does it work now?<\/p>\n<p>By avoiding using tables in the first place. To implement arrays now, I chose to use a well formatted text, such that can be easily parsed and manipulated.<\/p>\n<p>I chose XML format, for which MySQL provides with <a href=\"http:\/\/dev.mysql.com\/doc\/refman\/5.1\/en\/xml-functions.html\">a couple functions<\/a>. Far from covering real XML manipulation requirements, but with a little effort and hacking, can provide with a reasonable base for work.<\/p>\n<p>This also means the entire array resides in a single string, and this can be implemented by a User Defined Variable. So &#8212; no tables!<\/p>\n<p>Moreover, there is now no need for stored procedures, since no dynamic SQL is sued. It&#8217;s all functions, which makes for an easier access. And to wrap it up &#8212; no need to decide about value limits. What fits in the string is what you get.<\/p>\n<p>This solution is not perfect, either, of course. There&#8217;s more CPU used to parse\/manipulate the XML (not a bug problem on my work since arrays are typically small), allocation\/reallocation of memory (any manipulation assigns a new value to the User Defined Variable), and handling of <strong>NULL<\/strong>s is more complex.<\/p>\n<p>Bottom line: overdoing <strong>CREATE TABLE\/DROP TABLE<\/strong>, even for <strong>TEMPORARY<\/strong>, can be a pain. Avoid it.<\/p>\n<h4>Dyamic SQL<\/h4>\n<p>You can&#8217;t invoke dynamic SQL from within dynamic SQL. There is a wide variety of additional commands that cannot be invoked by dynamic SQL.<\/p>\n<p>This calls for some attention. If you want to be able to dynamically invoke server commands issued by the user, you either limit the user&#8217;s choice of commands, or you find another way.<\/p>\n<p><em>QueryScript<\/em> is a classic <em>&#8220;interpreted by an interpreter&#8221;<\/em> code. I wish there was a better way, but I use stored routines to interpret <em>QueryScript<\/em> text. So the <em>QueryScript<\/em> interpreter takes care not to over do it. While just reading the script&#8217;s text and sending it for dynamic invocation is certainly easier to code, it makes for said limitations. Thanks to <a href=\"http:\/\/rpbouman.blogspot.co.il\/\">Roland<\/a>&#8216;s contributions, <em>common_schema<\/em> parses user&#8217;s text, analyses and find intended queries, and &#8212; sometimes &#8212; simply runs it without dynamic invocation. Think nested if-else statement.<\/p>\n<p>Still, most user&#8217;s code is indeed invoked dynamically.<\/p>\n<h4>Parsing by comma<\/h4>\n<p>If a user provides a comma delimited string, it is very dangerous to parse it via comma. The text <strong>&#8216;one,two,three,four&#8217;<\/strong> seems fine, but what about <strong>&#8216;one,two,&#8221;hi, dad&#8221;&#8216;<\/strong>?<\/p>\n<p>So <em>common_schema<\/em> does a lot of such parsing. It takes care to properly analyze quoted text. But more than that: since the parsing of such text is costly, <em>common_schema<\/em> translates such text to a well-formed delimited text, such that there is a known, unique delimiter, not to be found within quotes, such that be splitting using said delimiter we get the definite tokenizing of the text.<\/p>\n<p>To illustrate how it works: it attempts an unlikely single-character delimiter. Consider <strong>&#8216;\\0&#8217;<\/strong>. Is <strong>&#8216;\\0&#8217;<\/strong> found in the original text? If not, we can once analyze the proper tokenization, then replace appropriate commas with <strong>&#8216;\\0&#8217;<\/strong>. Wait, is <strong>&#8216;\\0&#8217;<\/strong> found anywhere in the text? Then we try the even more unlikely delimiter <strong>&#8216;\\0\\b&#8217;<\/strong>, then the yet unlikelier <strong>&#8216;\\0\\b\\r&#8217;<\/strong>, etc.<\/p>\n<h4>QueryScript<\/h4>\n<p>QueryScript is a way to write stored SQL code &#8220;the way it ought to be&#8221;: simple, clean, useful, dynamic, flexible, wicked cool. A lot is going on, and I have enough ideas to last a couple years worth of development. No intention to create a heavyweight, full blown language. On the contrary: the language is intended to be simple, lightweight, and provide with easy to access interface to complex SQL operations. <a href=\"http:\/\/common-schema.googlecode.com\/svn\/trunk\/common_schema\/doc\/html\/query_script.html\">Start the journey here<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I very much enjoyed reading Overloading Procedures by Michael McLaughlin: good stuff! I&#8217;m dealing with similar issues in common_schema\/QueryScript, where I implement a whole new scripting language within MySQL, interpreted by stored routines. I am now finalizing the next version of common_schema\/QueryScript, with a major addition to the scripting language to put yet even more [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"enabled":false},"version":2}},"categories":[5],"tags":[67,82,76,59],"class_list":["post-4947","post","type-post","status-publish","format-standard","hentry","category-mysql","tag-common_schema","tag-prepared-statements","tag-queryscript","tag-stored-routines"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p2bZZp-1hN","_links":{"self":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts\/4947","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/comments?post=4947"}],"version-history":[{"count":23,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts\/4947\/revisions"}],"predecessor-version":[{"id":4970,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts\/4947\/revisions\/4970"}],"wp:attachment":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/media?parent=4947"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/categories?post=4947"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/tags?post=4947"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}