{"id":1908,"date":"2010-02-09T09:47:21","date_gmt":"2010-02-09T07:47:21","guid":{"rendered":"http:\/\/code.openark.org\/blog\/?p=1908"},"modified":"2010-02-11T08:52:52","modified_gmt":"2010-02-11T06:52:52","slug":"monotonic-functions-sql-and-mysql","status":"publish","type":"post","link":"https:\/\/code.openark.org\/blog\/mysql\/monotonic-functions-sql-and-mysql","title":{"rendered":"Monotonic functions, SQL and MySQL"},"content":{"rendered":"<blockquote><p>In mathematics, a <strong>monotonic function<\/strong> (or <strong>monotone function<\/strong>) is a function which preserves the given order. [<a href=\"http:\/\/en.wikipedia.org\/wiki\/Monotonic_function\">Wikipedia<\/a>]<\/p><\/blockquote>\n<p>To be more precise, a function <em>f<\/em> is monotonic increasing, if for every <em>x \u2264 y<\/em> it holds that <em>f(x)\u00a0\u2264 f(y)<\/em>. <em>f<\/em> is said to be strictly monotonic increasing is for every <em>x &lt; y<\/em> it holds that <em>f(x) &lt; f(y)<\/em>.<\/p>\n<p>So, if we follow values in some order, we say that <em>f<\/em> is monotonic increasing if <em>f<\/em>&#8216;s value never decreases (it either increases or stays the same), and we say that <em>f<\/em> is strictly increasing if <em>f<\/em>&#8216;s value is always changes &#8220;upwards&#8221;.<\/p>\n<p>Monotonic functions play an important role in SQL. To discuss monotonic functions in SQL we must first determine what the <em>order<\/em> is, and then, what the <em>function<\/em> is.<\/p>\n<p>Well, they both change according to our point of view. Let&#8217;s look at some examples. Take a look at the following table:<!--more--><\/p>\n<blockquote>\n<pre>CREATE TABLE `log` (\r\n `<strong>id<\/strong>` int(11) NOT NULL <strong>AUTO_INCREMENT<\/strong>,\r\n `<strong>ts<\/strong>` <strong>timestamp<\/strong> NOT NULL <strong>DEFAULT CURRENT_TIMESTAMP<\/strong>,\r\n `error_level` tinyint(4) DEFAULT NULL,\r\n `subject` varchar(32) DEFAULT NULL,\r\n `description` varchar(255) DEFAULT NULL,\r\n PRIMARY KEY (`id`)\r\n)<\/pre>\n<\/blockquote>\n<p>In the above <strong>log<\/strong> table, log entries are added with <strong>id<\/strong> and <strong>ts<\/strong> getting automatically evaluated. Assuming no dirty hacks occur, we can expect that <strong>ts<\/strong> in <em>monotonic<\/em> by order of <strong>id<\/strong>. That is, as <strong>id<\/strong> increases, so does <strong>ts<\/strong>. Is is possible that we get the same <strong>ts<\/strong> for a few rows (it is not unique), but once it increases, it never decreases again.<\/p>\n<h4>Why is this interesting?<\/h4>\n<p>Because it simplifies common problems.<\/p>\n<p>For example, it simplifies a search for a given <strong>ts<\/strong> value, when no index exists on the <strong>ts<\/strong> column. If we were to look for a log entry from <strong>&#8216;2009-02-07 11:58:00&#8217;<\/strong> by simple <strong>SELECT<\/strong>, we would have to use a full table scan. But, by knowing that <strong>ts<\/strong> is monotonic, we can also use <a href=\"http:\/\/en.wikipedia.org\/wiki\/Binary_search_algorithm\">binary search<\/a> on <strong>id<\/strong>.<\/p>\n<p>As another example, it simplifies the task of purging all rows up to last midnight. Instead of issuing <strong>&#8220;DELETE FROM log WHERE ts &lt; DATE(NOW())&#8221;<\/strong>, thus using, again, full table scan plus locking all rows (depending on storage engine), we can use other methods:<\/p>\n<ul>\n<li>We can detect the <strong>id<\/strong> for the first row which holds the condition using binary search, then <strong>&#8220;DELETE FROM log WHERE id &lt; ###&#8221;<\/strong><\/li>\n<li>Or we can slowly work our way in ascending <strong>id<\/strong> order, issuing something like <strong>&#8220;DELETE FROM log WHERE ts &lt; DATE(NOW()) ORDER BY id ASC LIMIT 1000&#8221;<\/strong>, and stop once the <strong>ROW_COUNT()<\/strong> is less than <strong>1000<\/strong>. We know we need not advance any further, without needing to compute anything. We thus block less, while retaining correctness of our operation.<\/li>\n<\/ul>\n<h4>Monotonic functions in MySQL<\/h4>\n<p>When we iterate InnoDB tables (as in full table scan), we know that rows are iterated in ascending <strong>PRIMARY KEY<\/strong> order <a name=\"note1m\" href=\"#note1\">[\u00b9]<\/a>. So the <strong>PRIMARY KEY<\/strong> dictates the order by which monotonic functions are evaluated.<\/p>\n<p>With MyISAM, rows are iterated according to internal storage order. It has nothing to do with <strong>PRIMARY KEY<\/strong>s (though depending on <strong>concurrent_insert<\/strong> this can be somewhat controlled). It also has nothing to do with chronological order. Newer rows may capture space held by older rows.<\/p>\n<p>But MyISAM allows for <strong>ALTER TABLE &#8230; ORDER BY &#8230;<\/strong> syntax, which allows us to do a one-time sort of the table. Assuming no writes shortly thereafter, a full table scan will iterate the rows according to specified order.<\/p>\n<h4>Monotonic functions and indexes<\/h4>\n<p>A column which is indexed dictates a monotonic function by index order.<\/p>\n<p>Wait. Isn&#8217;t that obvious? Of course: if we index a column, then the index sorts by that column, and the column is ascending by the index order which is,&#8230; itself.<\/p>\n<p>I call that trivial, but it does interest us: because, while mathematically there may be nothing significant here, we do care about this order when we have index scans. So, if we can force an index scan on our query, then we can anticipate the order by which rows are processed; we now have some order by which to evaluate monotonic functions.<\/p>\n<p>OK, maybe I made it sound more complicated than it really is. Monotonic functions work well when the <em>order<\/em> by which they are monotonic is some indexed column(s). The <strong>AUTO_INCREMENT PRIMARY KEY<\/strong> we saw in the <strong>log<\/strong> example above, is perhaps the most trivial case.<\/p>\n<p>While MySQL does not support function indexes, if the function we consider is monotonic, we still benefit from adding an index on the raw column.<\/p>\n<h4>Other examples of monotonic functions<\/h4>\n<p>So, where else can we find them? Timestamp columns are probably the most common (this post holds true until time travel to the past is introduced).<\/p>\n<p>But also summaries: like a reporting table which lists down some ever-ascending value (the number of books ever sold in our store; trip mileage; hop counter; etc.).<\/p>\n<p>I&#8217;ve seen many cases (though difficult to illustrate in this scope) when foreign key values are in ascending order. A very brief example is a 1-1 relation between two denormalized tables, where the tables ids do not necessarily have to match, but is always ascending).<\/p>\n<p>And Baron&#8217;s <a href=\"http:\/\/www.xaprb.com\/blog\/2010\/01\/22\/my-wishlist-for-sql-the-until-clause\/\">wishlist for SQL<\/a> can also benefit from monotonic functions.<\/p>\n<h4>Conclusion<\/h4>\n<p>When a monotonic function is present, it brings an added value to our schema and query design. It allows for less indexing; quicker operations. Look for these. I&#8217;ve only discussed increasing functions. Indeed, MySQL&#8217;s indexes are always increasing (they cannot be defined in decreasing order), but query simplifications work just as well for monotonic decreasing functions.<\/p>\n<p><a name=\"note1\" href=\"#note1m\">[\u00b9]<\/a> I&#8217;ve actually seen a different behavior on temporary InnoDB tables and on compressed InnoDB Plugin tables; I&#8217;ll write on this on another occasion.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In mathematics, a monotonic function (or monotone function) is a function which preserves the given order. [Wikipedia] To be more precise, a function f is monotonic increasing, if for every x \u2264 y it holds that f(x)\u00a0\u2264 f(y). f is said to be strictly monotonic increasing is for every x &lt; y it holds that [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"enabled":false},"version":2}},"categories":[5],"tags":[26,54,21],"class_list":["post-1908","post","type-post","status-publish","format-standard","hentry","category-mysql","tag-indexing","tag-math","tag-sql"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p2bZZp-uM","_links":{"self":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts\/1908","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/comments?post=1908"}],"version-history":[{"count":30,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts\/1908\/revisions"}],"predecessor-version":[{"id":1959,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts\/1908\/revisions\/1959"}],"wp:attachment":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/media?parent=1908"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/categories?post=1908"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/tags?post=1908"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}