{"id":5347,"date":"2012-10-09T06:44:12","date_gmt":"2012-10-09T04:44:12","guid":{"rendered":"http:\/\/code.openark.org\/blog\/?p=5347"},"modified":"2012-10-09T06:44:12","modified_gmt":"2012-10-09T04:44:12","slug":"your-magical-range-partitioning-maintenance-query","status":"publish","type":"post","link":"https:\/\/code.openark.org\/blog\/mysql\/your-magical-range-partitioning-maintenance-query","title":{"rendered":"Your magical RANGE partitioning maintenance query"},"content":{"rendered":"<p>If you use <strong>RANGE<\/strong> (or <strong>RANGE COLUMNS<\/strong>) partitioning, and in particular when partitioning by date\/time, then your are subject to the following questions: how and when do you create the &#8220;next&#8221; partition? How and when do you drop your older partitions?<\/p>\n<p>Many people use in-house scripting to do that, and Giuseppe Maxia wrote <a href=\"http:\/\/datacharmer.blogspot.co.il\/2008\/12\/partition-helper-improving-usability.html\">Partition Helper<\/a>. But I would like to take you one step forward, and provide with a <em>query<\/em> (based on views) which automagically understands which new partition you want to create, and provides you with the statement to do so. It looks somewhat like this (a demo follows later on):<\/p>\n<blockquote>\n<pre>mysql&gt; SELECT * FROM <strong>sql_range_partitions<\/strong> \\G\r\n*************************** 1. row ***************************\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 table_schema: test\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 table_name: city\r\n<strong>sql_drop_first_partition<\/strong>: alter table `test`.`city` drop partition `p3`\r\n\u00a0 <strong>sql_add_next_partition<\/strong>: alter table `test`.`city` add partition (partition `p_20160101000000` values less than (736329) \/* 2016-01-01 00:00:00 *\/ )\r\n*************************** 2. row ***************************\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 table_schema: test\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 table_name: quarterly_report_status\r\n<strong>sql_drop_first_partition<\/strong>: alter table `test`.`quarterly_report_status` drop partition `p3`\r\n\u00a0 <strong>sql_add_next_partition<\/strong>: alter table `test`.`quarterly_report_status` reorganize partition `p_maxvalue` into (partition `p_20110401000000` values less than (1301608800) \/* 2011-04-01 00:00:00 *\/ , partition p_maxvalue values less than MAXVALUE)<\/pre>\n<\/blockquote>\n<h4>A closer look at why this is magic<\/h4>\n<p>This query just gave you the <strong>DROP PARTITION<\/strong> and <strong>ADD PARTITION<\/strong> for all tables in your databases that use a <strong>RANGE<\/strong> partitioning scheme. But, consider:<!--more--><\/p>\n<ul>\n<li>The query <em>automatically<\/em> deduces the <strong>LESS THAN<\/strong> value of the new partition. It looks for a constant interval, time-based or integer-based, and keeps this interval onward.<\/li>\n<li>It understands that <strong>5.1<\/strong> does not allow you to partition by <strong>DATETIME<\/strong>, only via integers. It understands your integer may sometimes stand for <strong>TO_DAYS()<\/strong>, and sometimes for <strong>UNIX_TIMESTAMP()<\/strong> of your datetime. It auto-detects that.<\/li>\n<li>The query recognizes a <strong>MAXVALUE<\/strong> partition, and if such partition exists, it provides with a <strong>REORGANIZE PARTITION<\/strong> statement rather than <strong>ADD PARTITION<\/strong> statement.<\/li>\n<li>It suggests names for your partitions which give you a clue on what the partition contains. <strong>p_20160101000000<\/strong> (can you splot the date\/time?) tells me a lot more than some arbitrary <strong>p17<\/strong>.<\/li>\n<li>It recognizes the common case of using a <strong>LESS THAN (0)<\/strong> as first partition, to take care of <strong>NULL<\/strong>s. It skips this partition: the query does not offer to drop it, not does it consider it while examining the interval.<\/li>\n<\/ul>\n<p>So I never have to tell the query <em>&#8220;I want a 3 month interval between partitions, and these are implemented using TO_DAYS()&#8221;<\/em>. I let it understand it <em>on its own<\/em>.<\/p>\n<h4>It is just a view<\/h4>\n<p>Which means you can <strong>SELECT sql_drop_first_partition<\/strong>, or you can <strong>SELECT sql_add_next_partition<\/strong>, or you can only <strong>SELECT &#8230; WHERE table_schema=&#8217;your_schema&#8217;<\/strong>. Or you can select them all.<\/p>\n<h4>You can eval() it<\/h4>\n<p>This view will be released with <a href=\"http:\/\/code.google.com\/p\/common-schema\/\">common_schema<\/a>&#8216;s next version. <em>common_schema<\/em> has a lot of these views which generate SQL statements. And it provides with the means to evaluate them: the <a href=\"http:\/\/common-schema.googlecode.com\/svn\/trunk\/common_schema\/doc\/html\/eval.html\">eval()<\/a> routine. So you don&#8217;t need to export the text <strong>INTO OUTFILE<\/strong> and execute it via <strong>SOURCE<\/strong>. You can simply:<\/p>\n<blockquote>\n<pre>call common_schema.<strong>eval<\/strong>(<span style=\"color: #000080;\">\"SELECT\u00a0<strong>sql_add_next_partition<\/strong> FROM sql_range_partitions\u00a0WHERE table_name='quarterly_report_status'\"<\/span>);<\/pre>\n<\/blockquote>\n<p>And it is done.<\/p>\n<h4>Get it<\/h4>\n<p>The<strong>\u00a0sql_range_partitions<\/strong> view will be included in <em>common_schema<\/em> <strong>1.2<\/strong>, schedules to be released soon. Meanwhile, you can import this file: [download id=&#8221;1&#8243; format=&#8221;1&#8243;] onto your existing <a href=\"http:\/\/code.google.com\/p\/common-schema\">common_schema<\/a> <strong>1.1<\/strong> install (what? You don&#8217;t already have <em>common_schema<\/em> installed? You should know it&#8217;s packed with lots of stuff like this one!)<\/p>\n<p>If, by the time you read this, <em>common_schema<\/em> <strong>1.2<\/strong> is already out, you don&#8217;t need to add anything.<\/p>\n<h4>In action<\/h4>\n<p>Consider the following table on a MySQL <strong>5.1<\/strong> server:<\/p>\n<blockquote>\n<pre><strong>CREATE TABLE<\/strong> test.quarterly_report_status (\r\n\u00a0\u00a0\u00a0 report_id INT NOT NULL,\r\n\u00a0\u00a0\u00a0 report_status VARCHAR(20) NOT NULL,\r\n\u00a0\u00a0\u00a0 report_updated TIMESTAMP NOT NULL \r\n)\r\n<strong>PARTITION BY RANGE<\/strong> (<strong>UNIX_TIMESTAMP(report_updated)<\/strong>) (\r\n\u00a0\u00a0\u00a0 PARTITION p0 VALUES LESS THAN (<strong>UNIX_TIMESTAMP('2008-01-01 00:00:00')<\/strong>),\r\n\u00a0\u00a0\u00a0 PARTITION p1 VALUES LESS THAN (<strong>UNIX_TIMESTAMP('2008-04-01 00:00:00')<\/strong>),\r\n\u00a0\u00a0\u00a0 PARTITION p2 VALUES LESS THAN (<strong>UNIX_TIMESTAMP('2008-07-01 00:00:00')<\/strong>),\r\n\u00a0\u00a0\u00a0 PARTITION p3 VALUES LESS THAN (<strong>UNIX_TIMESTAMP('2008-10-01 00:00:00')<\/strong>),\r\n\u00a0\u00a0\u00a0 PARTITION p4 VALUES LESS THAN (<strong>UNIX_TIMESTAMP('2009-01-01 00:00:00')<\/strong>),\r\n\u00a0\u00a0\u00a0 PARTITION p5 VALUES LESS THAN (<strong>UNIX_TIMESTAMP('2009-04-01 00:00:00')<\/strong>),\r\n\u00a0\u00a0\u00a0 PARTITION p6 VALUES LESS THAN (<strong>MAXVALUE<\/strong>)\r\n);<\/pre>\n<\/blockquote>\n<p>Unfortunately MySQL does not remember the definition expressions, so:<\/p>\n<blockquote>\n<pre>mysql&gt; <strong>SHOW CREATE TABLE<\/strong> test.quarterly_report_status \\G\r\n\r\nCreate Table: <strong>CREATE TABLE<\/strong> `quarterly_report_status` (\r\n\u00a0 `report_id` int(11) NOT NULL,\r\n\u00a0 `report_status` varchar(20) NOT NULL,\r\n\u00a0 `report_updated` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP\r\n) ENGINE=MyISAM DEFAULT CHARSET=latin1\r\n\/*!50100 <strong>PARTITION BY RANGE<\/strong> (<strong>UNIX_TIMESTAMP(report_updated)<\/strong>)\r\n(PARTITION p0 VALUES LESS THAN (<strong>1199138400<\/strong>) ENGINE = MyISAM,\r\n\u00a0PARTITION p1 VALUES LESS THAN (<strong>1206997200<\/strong>) ENGINE = MyISAM,\r\n\u00a0PARTITION p2 VALUES LESS THAN (<strong>1214859600<\/strong>) ENGINE = MyISAM,\r\n\u00a0PARTITION p3 VALUES LESS THAN (<strong>1222808400<\/strong>) ENGINE = MyISAM,\r\n\u00a0PARTITION p4 VALUES LESS THAN (<strong>1230760800<\/strong>) ENGINE = MyISAM,\r\n\u00a0PARTITION p5 VALUES LESS THAN (<strong>1238533200<\/strong>) ENGINE = MyISAM,\r\n\u00a0PARTITION p6 VALUES LESS THAN MAXVALUE ENGINE = MyISAM) *\/<\/pre>\n<\/blockquote>\n<p>Yikes! What does<strong> <\/strong><strong>1238533200<\/strong> stand for?<\/p>\n<p>No worries, let&#8217;s <strong>eval()<\/strong>:<\/p>\n<blockquote>\n<pre>mysql&gt; call common_schema.<strong>eval<\/strong>(<span style=\"color: #000080;\">\"SELECT <strong>sql_add_next_partition<\/strong> FROM sql_range_partitions <strong>WHERE table_name='quarterly_report_status'<\/strong>\"<\/span>);\r\n\r\nmysql&gt; <strong>SHOW CREATE TABLE<\/strong> test.quarterly_report_status \\G\r\n\r\nCREATE TABLE `quarterly_report_status` (\r\n\u00a0 `report_id` int(11) NOT NULL,\r\n\u00a0 `report_status` varchar(20) NOT NULL,\r\n\u00a0 `report_updated` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP\r\n) ENGINE=MyISAM DEFAULT CHARSET=latin1\r\n\/*!50100 PARTITION BY RANGE (UNIX_TIMESTAMP(report_updated))\r\n(PARTITION p0 VALUES LESS THAN (1199138400) ENGINE = MyISAM,\r\n\u00a0PARTITION p1 VALUES LESS THAN (1206997200) ENGINE = MyISAM,\r\n\u00a0PARTITION p2 VALUES LESS THAN (1214859600) ENGINE = MyISAM,\r\n\u00a0PARTITION p3 VALUES LESS THAN (1222808400) ENGINE = MyISAM,\r\n\u00a0PARTITION p4 VALUES LESS THAN (1230760800) ENGINE = MyISAM,\r\n\u00a0PARTITION p5 VALUES LESS THAN (1238533200) ENGINE = MyISAM,\r\n\u00a0PARTITION <strong>p_20090701000000<\/strong> VALUES LESS THAN (<strong>1246395600<\/strong>) ENGINE = MyISAM,\r\n\u00a0PARTITION <strong>p_maxvalue<\/strong> VALUES LESS THAN MAXVALUE ENGINE = MyISAM) *\/<\/pre>\n<\/blockquote>\n<p>Our query auto-detected the meaning of those numbers like <strong><\/strong><strong>1238533200<\/strong>, and has found the next partition to be created:<strong> p_20090701000000<\/strong>. That&#8217;s <strong>&#8216;2009-07-01 00:00:00&#8217;<\/strong>, and now we know what the partition stands for. A new <strong>MAXVALUE<\/strong> partition called <strong>p_maxvalue<\/strong> is created.<\/p>\n<p>Just for the fun of it, let&#8217;s issue the same a few more times and see what comes out:<\/p>\n<blockquote>\n<pre>mysql&gt; call common_schema.<strong>eval<\/strong>(\"SELECT sql_add_next_partition FROM sql_range_partitions WHERE table_name='quarterly_report_status'\");\r\n\r\nmysql&gt; call common_schema.<strong>eval<\/strong>(\"SELECT sql_add_next_partition FROM sql_range_partitions WHERE table_name='quarterly_report_status'\");\r\n\r\nmysql&gt; call common_schema.<strong>eval<\/strong>(\"SELECT sql_add_next_partition FROM sql_range_partitions WHERE table_name='quarterly_report_status'\");\r\n\r\nmysql&gt; call common_schema.<strong>eval<\/strong>(\"SELECT sql_add_next_partition FROM sql_range_partitions WHERE table_name='quarterly_report_status'\");\r\n\r\nmysql&gt; <strong>SHOW CREATE TABLE<\/strong> test.quarterly_report_status \\G\r\n\r\nCreate Table: CREATE TABLE `quarterly_report_status` (\r\n\u00a0 `report_id` int(11) NOT NULL,\r\n\u00a0 `report_status` varchar(20) NOT NULL,\r\n\u00a0 `report_updated` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP\r\n) ENGINE=MyISAM DEFAULT CHARSET=latin1\r\n\/*!50100 PARTITION BY RANGE (UNIX_TIMESTAMP(report_updated))\r\n(PARTITION p0 VALUES LESS THAN (1199138400) ENGINE = MyISAM,\r\n\u00a0PARTITION p1 VALUES LESS THAN (1206997200) ENGINE = MyISAM,\r\n\u00a0PARTITION p2 VALUES LESS THAN (1214859600) ENGINE = MyISAM,\r\n\u00a0PARTITION p3 VALUES LESS THAN (1222808400) ENGINE = MyISAM,\r\n\u00a0PARTITION p4 VALUES LESS THAN (1230760800) ENGINE = MyISAM,\r\n\u00a0PARTITION p5 VALUES LESS THAN (1238533200) ENGINE = MyISAM,\r\n\u00a0PARTITION p_20090701000000 VALUES LESS THAN (1246395600) ENGINE = MyISAM,\r\n\u00a0PARTITION <strong>p_20091001000000<\/strong> VALUES LESS THAN (<strong>1254348000<\/strong>) ENGINE = MyISAM,\r\n\u00a0PARTITION <strong>p_20100101000000<\/strong> VALUES LESS THAN (<strong>1262296800<\/strong>) ENGINE = MyISAM,\r\n\u00a0PARTITION <strong>p_20100401000000<\/strong> VALUES LESS THAN (<strong>1270069200<\/strong>) ENGINE = MyISAM,\r\n\u00a0PARTITION <strong>p_20100701000000<\/strong> VALUES LESS THAN (<strong>1277931600<\/strong>) ENGINE = MyISAM,\r\n\u00a0PARTITION <strong>p_maxvalue<\/strong> VALUES LESS THAN MAXVALUE ENGINE = MyISAM) *\/<\/pre>\n<\/blockquote>\n<p>Notice the number interval is not constant: we have different number of days in different quarters. We have Feb <strong>29th<\/strong> every <strong>4<\/strong> years. Yet we get the right <strong>LESS THAN<\/strong> value.<\/p>\n<p>Now isn&#8217;t this cool?<\/p>\n","protected":false},"excerpt":{"rendered":"<p>If you use RANGE (or RANGE COLUMNS) partitioning, and in particular when partitioning by date\/time, then your are subject to the following questions: how and when do you create the &#8220;next&#8221; partition? How and when do you drop your older partitions? Many people use in-house scripting to do that, and Giuseppe Maxia wrote Partition Helper. [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"enabled":false},"version":2}},"categories":[5],"tags":[67,88],"class_list":["post-5347","post","type-post","status-publish","format-standard","hentry","category-mysql","tag-common_schema","tag-partitioning"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p2bZZp-1of","_links":{"self":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts\/5347","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/comments?post=5347"}],"version-history":[{"count":53,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts\/5347\/revisions"}],"predecessor-version":[{"id":5633,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts\/5347\/revisions\/5633"}],"wp:attachment":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/media?parent=5347"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/categories?post=5347"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/tags?post=5347"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}