{"id":2144,"date":"2010-03-10T20:28:29","date_gmt":"2010-03-10T18:28:29","guid":{"rendered":"http:\/\/code.openark.org\/blog\/?p=2144"},"modified":"2010-03-10T20:28:29","modified_gmt":"2010-03-10T18:28:29","slug":"mk-schema-change-check-out-ideas-from-oak-online-alter-table","status":"publish","type":"post","link":"https:\/\/code.openark.org\/blog\/mysql\/mk-schema-change-check-out-ideas-from-oak-online-alter-table","title":{"rendered":"mk-schema-change? Check out ideas from oak-online-alter-table"},"content":{"rendered":"<p>In response to Mark Callaghan&#8217;s post <a href=\"http:\/\/www.facebook.com\/note.php?note_id=356997370932\">mk-schema-change<\/a>.<\/p>\n<p>I apologize for not commenting on the post itself, I do not hold a Facebook account. Anyway this is a long write, so it may as well deserve a post of its own.<\/p>\n<p>Some of the work Mark is describing already exists under <a href=\"http:\/\/code.openark.org\/forge\/openark-kit\">openark kit<\/a>&#8216;s <a href=\"http:\/\/code.openark.org\/forge\/openark-kit\/oak-online-alter-table\">oak-online-alter-table<\/a>. Allow me to explain what I have gained there, and how the issue can be further pursued. There is relevance to Mark&#8217;s suggestion.<\/p>\n<p><em>oak-online-alter-table<\/em> uses a combination of locks, chunks and triggers to achieve an almost non-blocking <strong>ALTER TABLE<\/strong> effect. I had a very short opportunity to speak with Mark on last year&#8217;s conference, in between bites. Mark stated that anything involving triggers was irrelevant in his case.<\/p>\n<p>The triggers are a pain, but I believe a few other insights from <em>oak-online-alter-table<\/em> can be of interest.<!--more--><\/p>\n<h4>The first attempt<\/h4>\n<p>My first attempt with the script assumed:<\/p>\n<ul>\n<li>Table has an <strong>AUTO_INCREMENT PRIMARY KEY<\/strong> column<\/li>\n<li>New rows always gain ascending <strong>PRIMARY KEY<\/strong> values<\/li>\n<li><strong>PRIMARY KEY<\/strong> never changes for an existing row<\/li>\n<li><strong>PRIMARY KEY<\/strong> values are never reused<\/li>\n<li>Rows may be deleted at will<\/li>\n<li>No triggers exist on the table<\/li>\n<li>No <strong>FOREIGN KEY<\/strong>s exist on the table.<\/li>\n<\/ul>\n<p>So the idea was: when one wants to do an <strong>ALTER TABLE<\/strong>:<\/p>\n<ol>\n<li>Create a <em>ghost<\/em> table with the new structure.<\/li>\n<li>Read the minimum and maximum PK values.<\/li>\n<li>Create <strong>AFTER INSERT<\/strong>, <strong>AFTER UPDATE<\/strong>, <strong>AFTER DELETE<\/strong> triggers on the original table. These triggers will propagate the changes onto the <em>ghost<\/em> table.<\/li>\n<li>Working out slowly, and in small chunks, copy rows within recorded min-max values range into the <em>ghost<\/em> table. The interesting part is where the script makes sure there&#8217;s no contradiction between these actions and those of the triggers, (whichever came first!). This is largely solved using <strong>INSERT IGNORE<\/strong> and <strong>REPLACE INTO<\/strong> in the proper context.<\/li>\n<li>Working out slowly and in chunks again, we <em>remove<\/em> rows from the <em>ghost<\/em> table, which are no longer existent in the original table.<\/li>\n<li>Once all chunking is complete, <strong>RENAME<\/strong> original table to *_old, and <em>ghost<\/em> table in place of the original table.<\/li>\n<\/ol>\n<p>Steps <strong>4<\/strong> &amp; <strong>5<\/strong> are similar in concept to transactional recovery through <em>redo logs<\/em> and <em>undo logs<\/em>.<\/p>\n<h4>The next attempt<\/h4>\n<p>Next phase removed the <strong>AUTO_INCREMENT<\/strong> requirement, as well as the &#8220;no reuse of PK&#8221;. In fact, the only remaining constraints were:<\/p>\n<ul>\n<li>There is some <strong>UNIQUE KEY<\/strong> on the table which is unaffected by the <strong>ALTER<\/strong> operation<\/li>\n<li>No triggers exist on the table<\/li>\n<li>No <strong>FOREIGN KEY<\/strong>s exist on the table.<\/li>\n<\/ul>\n<p>The steps are in general very similar to those listed previously, only now a more elaborate chunking method is used with possible non-integer, possible multi-column chunking algorithm. Also, the triggers take care of changes in <strong>UNIQUE KEY<\/strong> values themselves.<\/p>\n<h4>mk-schema-change?<\/h4>\n<p>Have a look at the <a href=\"http:\/\/code.google.com\/p\/openarkkit\/w\/list\">wiki pages<\/a> for OnlineAlterTable*. There is some discussion on concurrency issues; on transactional behavior, which explains why <em>oak-online-alter-table<\/em> performs correctly. Some of these are very relvant, I believe, to Mark&#8217;s suggestion. In particular, making the chunks copy; retaining transactional integrity, etc.<\/p>\n<p>To remove any doubt, <em>oak-online-alter-table<\/em> is<em> <\/em> <strong>not production ready<\/strong> or anywhere near. Use at your own risk. I&#8217;ve seen it work, and I&#8217;ve seen it crash. I got little feedback and thus little chance to fix things. I also didn&#8217;t touch the code for quite a few months now, so I&#8217;m a little rusty myself.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In response to Mark Callaghan&#8217;s post mk-schema-change. I apologize for not commenting on the post itself, I do not hold a Facebook account. Anyway this is a long write, so it may as well deserve a post of its own. Some of the work Mark is describing already exists under openark kit&#8216;s oak-online-alter-table. Allow me [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"enabled":false},"version":2}},"categories":[5],"tags":[34,17,50],"class_list":["post-2144","post","type-post","status-publish","format-standard","hentry","category-mysql","tag-openark-kit","tag-schema","tag-scripts"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p2bZZp-yA","_links":{"self":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts\/2144","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/comments?post=2144"}],"version-history":[{"count":11,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts\/2144\/revisions"}],"predecessor-version":[{"id":2155,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts\/2144\/revisions\/2155"}],"wp:attachment":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/media?parent=2144"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/categories?post=2144"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/tags?post=2144"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}