{"id":8127,"date":"2021-03-17T12:36:19","date_gmt":"2021-03-17T10:36:19","guid":{"rendered":"http:\/\/code.openark.org\/blog\/?p=8127"},"modified":"2021-03-17T17:21:25","modified_gmt":"2021-03-17T15:21:25","slug":"the-problem-with-mysql-foreign-key-constraints-in-online-schema-changes","status":"publish","type":"post","link":"https:\/\/code.openark.org\/blog\/mysql\/the-problem-with-mysql-foreign-key-constraints-in-online-schema-changes","title":{"rendered":"The problem with MySQL foreign key constraints in Online Schema Changes"},"content":{"rendered":"<p>This post explains the inherent problem of running online schema changes in MySQL, on tables participating in a foreign key relationship. We&#8217;ll lay some ground rules and facts, sketch a simplified schema, and dive into an online schema change operation.<\/p>\n<p>Our discussion applies to <a href=\"https:\/\/www.percona.com\/doc\/percona-toolkit\/3.0\/pt-online-schema-change.html\">pt-online-schema-change<\/a>, <a href=\"https:\/\/github.com\/github\/gh-ost\">gh-ost<\/a>, and <a href=\"https:\/\/vitess.io\/docs\/user-guides\/schema-changes\/ddl-strategies\/#onlinevreplication\">Vitess<\/a> based migrations, or any other online schema change tool that works with a shadow\/ghost table like the Facebook tools.<\/p>\n<h2>Why Online Schema Change?<\/h2>\n<p>Online schema change tools come as workarounds to an old problem: schema migrations in MySQL were blocking, uninterruptible, aggressive in resources, replication unfriendly. Running a straight <code>ALTER TABLE<\/code> in production means locking your table, generating high load on the primary, causing massive replication lag on replicas once the migration moves down the replication stream.<\/p>\n<h2>Isn&#8217;t there some Online DDL?<\/h2>\n<p>Yes. InnoDB supports <a href=\"https:\/\/dev.mysql.com\/doc\/refman\/8.0\/en\/innodb-online-ddl-operations.html\">Online DDL<\/a>, where for many <code>ALTER<\/code> types, your table remains unblocked throughout the migration. That&#8217;s an important improvement, but unfortunately not enough. Some migration types do not permit concurrent DDL (notably changing column data type, e.g. from <code>INT<\/code> to <code>BIGINT<\/code>). Migration is still aggressive and generates high load on your server. Replicas still run the migration sequentially. If your migration takes 5 hours to run concurrently on the primary, expect a 5 hour replication lag on your replica, i.e. complete loss of your fresh read capacity.<\/p>\n<h2>Isn&#8217;t there some Instant DDL?<\/h2>\n<p>Yes. But unfortunately extremely limited. Mostly just for adding a new column. See <a href=\"https:\/\/dev.mysql.com\/doc\/refman\/8.0\/en\/alter-table.html\">here<\/a> or again <a href=\"https:\/\/dev.mysql.com\/doc\/refman\/8.0\/en\/innodb-online-ddl-operations.html\">here<\/a>. Instant DDLs showed great promise when <a href=\"https:\/\/mysqlserverteam.com\/mysql-8-0-innodb-now-supports-instant-add-column\/\">introduced<\/a> (contributed to MySQL by Tencent Games DBA Team) three years ago, and the hope was that MySQL would support many more types of <code>ALTER TABLE<\/code> in <code>INSTANT<\/code> DDL. At this time this has not happened yet, and we do with what we have.<\/p>\n<h2>Not everyone is Google or Facebook scale, right?<\/h2>\n<p>True. But you don&#8217;t need to to be Google, or Facebook, or GitHub etc. scale to feel the pain of schema changes. Any non trivially sized table takes time to <code>ALTER<\/code>, which results with lock\/downtime. If your tables are limited to hundreds or mere thousands of small rows, you can get away with it. When your table grows, and a mere dozens of MB of data is enough, <code>ALTER<\/code> becomes non-trivial at best case, and outright a cause of outage in a\u00a0<em>common<\/em> scenario, in my experience.<\/p>\n<h2>Let&#8217;s discuss foreign key constraints<\/h2>\n<p>In the relational model tables have relationships. A column in one table indicates a column in another table, so that a row in one table has a relationship one or more rows in another table. That&#8217;s the &#8220;foreign key&#8221;. A foreign key <em>constraint<\/em> is the enforcement of that relationship. A foreign key constraint is a database construct which watches over rows in different tables and ensures the relationship does not break. For example, it may prevent me from deleting a row that is in a relationship, to prevent the related row(s) from becoming orphaned.<!--more--><\/p>\n<h2>Is this a biased post? We hear you don&#8217;t like foreign keys<\/h2>\n<p>No, this is a technical discussion (we&#8217;re getting there, I promise). But, for context:<\/p>\n<p>I&#8217;ve been working on and around schema migration for many years now, and my current work on Vitess introduces some <em>outrageous<\/em> new super powers for schema migrations, which I can&#8217;t wait to present (and if you can&#8217;t wait, either, feel free to browse the public PRs, it&#8217;s free and open source).<\/p>\n<p>Every once in a while, <a href=\"https:\/\/github.com\/github\/gh-ost\/issues\/331\">this<\/a> pops up, on twitter, on Hacker News, on internal discussions. And the question gets asked: why can&#8217;t we support foreign keys?<\/p>\n<p>And so this post explains why, technically, there&#8217;s an inherent problem in supporting foreign keys in Online Schema Changes. This is not about opinions for or against foreign keys.<\/p>\n<h2>Wait! pt-online-schema-change <em>does<\/em>\u00a0support foreign keys! There&#8217;s command line flags for that!<\/h2>\n<p>Yes, no. Not quite, and I&#8217;ll elaborate as we dive into the details. And, to clarify, <code>pt-online-schema-change<\/code> attempts to make the best of the situation. Back when developing <code>gh-ost<\/code>, we saw that as a non-feasible solution. <code>pt-online-schema-change<\/code> does a good job at explaining the restrictions and limitations of its foreign key support, and we will cover these and beyond, here.<\/p>\n<p>OK, let&#8217;s dive in.<\/p>\n<h2>We begin, let&#8217;s first present a model<\/h2>\n<p>Consider the following extremely simplified model. Don&#8217;t judge me on the oversimplification, we just want to address the foreign keys issue here.<\/p>\n<pre class=\"brush: sql; title: ; notranslate\" title=\"\">\nCREATE TABLE country (\nid INT NOT NULL,\nname VARCHAR(255) NOT NULL,\nPRIMARY KEY (id)\n);\n\nCREATE TABLE person (\nid INT NOT NULL,\ncountry_id INT NOT NULL,\nname VARCHAR(255) NOT NULL,\nPRIMARY KEY(id),\nKEY country_idx (country_id),\nCONSTRAINT person_country_fk FOREIGN KEY (country_id) REFERENCES country(id) ON DELETE NO ACTION\n);\n\nCREATE TABLE company (\nid INT NOT NULL,\ncountry_id INT NOT NULL,\nname VARCHAR(255) NOT NULL,\nPRIMARY KEY(id),\nKEY country_idx (country_id),\nCONSTRAINT company_country_fk FOREIGN KEY (country_id) REFERENCES country(id) ON DELETE NO ACTION\n);\n<\/pre>\n<h2>Some analysis, rules and facts<\/h2>\n<ul>\n<li>In the above we have 3 tables participating in two foreign key relationship.\n<ul>\n<li>We will add a 4th one later<\/li>\n<li><code>country<\/code> is a <em>parent<\/em> table in both relationship<\/li>\n<li><code>person<\/code> is a <em>child<\/em>\u00a0table in relationship with <code>country<\/code><\/li>\n<li><code>company<\/code> is a <em>child<\/em>\u00a0table in relationship with <code>country<\/code><\/li>\n<\/ul>\n<\/li>\n<li>Let&#8217;s assume\/agree that <code>country<\/code> is a small table (maybe a couple hundred rows), and that both <code>person<\/code> and <code>company<\/code> are large tables (just, large enough to be a problem)<\/li>\n<li>MySQL doesn&#8217;t support foreign keys, per se. At this time, foreign keys are implemented by the storage engine, which is InnoDB in our case. This matters. I just dug this <a href=\"http:\/\/code.openark.org\/blog\/mysql\/triggers-use-case-compilation-part-i\">post from 2009<\/a>, quote:<br \/>\n> MySQL\u2019s plan is to add foreign keys for all storage engines. The plan is on print for quite a few years now.<br \/>\nThis didn&#8217;t happen, external foreign keys do not exist.<\/li>\n<li>Why does this matter? Because a foreign key in InnoDB is coupled with a <em>table<\/em>. There&#8217;s a space where the foreign key exists, and that space is a <em>table<\/em>. It matters because <a href=\"https:\/\/dev.mysql.com\/doc\/refman\/5.7\/en\/create-table-foreign-keys.html\">adding or dropping a foreign key<\/a>\u00a0is done by an <code>ALTER TABLE<\/code> statement. This is where the turtles begin to pile up.<\/li>\n<li>Foreign keys don&#8217;t associate to tables by <em>name<\/em> but by identity. If you <code>RENAME<\/code> a parent table, for example, than children&#8217;s foreign keys follow the table under its new name. This is where our pillar of turtles becomes higher.<\/li>\n<li>I chose <code>NO ACTION<\/code> (aka <code>RESTRICT<\/code>), but it doesn&#8217;t really matter to our discussion.<\/li>\n<li>MySQL allows you to disable foreign key checks for your session via <code>SET FOREIGN_KEY_CHECKS=0<\/code><\/li>\n<li>You can disable foreign key checks globally via <code>SET GLOBAL FOREIGN_KEY_CHECKS=0<\/code>, but this does not affect existing sessions, only ones created after your statement.<\/li>\n<li>All Online Schema Change tools: <code>gh-ost<\/code>, <code>fb-osc<\/code>, <code>pt-online-schema-change<\/code> ,<code>LHM<\/code>, and Vitess&#8217;s <code>VReplication<\/code>, work by creating a &#8220;shadow&#8221; table, which I like to call the\u00a0<em>ghost<\/em> table.\n<ul>\n<li>They create that table in the likeness of the original table.<\/li>\n<li>They modify the ghost table, and slowly populate it with data from the original table.<\/li>\n<li>At the end of the operation, in slightly different techniques, they <code>RENAME<\/code> the original table away, e.g. to <code>_mytable_old<\/code>, and <code>RENAME<\/code> the ghost table in its place, at which time it assumes production traffic.<\/li>\n<li>This is where the pile of turtles begins to shake.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p><a href=\"http:\/\/code.openark.org\/blog\/wp-content\/uploads\/2021\/03\/osc-flow.png.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-8146\" src=\"http:\/\/code.openark.org\/blog\/wp-content\/uploads\/2021\/03\/osc-flow.png.png\" alt=\"\" width=\"960\" height=\"540\" srcset=\"https:\/\/code.openark.org\/blog\/wp-content\/uploads\/2021\/03\/osc-flow.png.png 960w, https:\/\/code.openark.org\/blog\/wp-content\/uploads\/2021\/03\/osc-flow.png-300x169.png 300w, https:\/\/code.openark.org\/blog\/wp-content\/uploads\/2021\/03\/osc-flow.png-768x432.png 768w\" sizes=\"auto, (max-width: 960px) 100vw, 960px\" \/><\/a><\/p>\n<h2>Changing a child table<\/h2>\n<p>Say we want to <code>ALTER TABLE person MODIFY name VARCHAR(1024) NOT NULL CHARSET utf8mb4<\/code>. Or add a column. Or an index. Whichever. Let&#8217;s see what happens.<\/p>\n<p><code>person<\/code> has a foreign key. We therefore create the ghost table with similar foreign key, a child table that references the parent <code>country<\/code> table. Funnily, even though InnoDB&#8217;s foreign keys live inside a table scope, their names are globally unique. So we create the ghost table as follows:<\/p>\n<pre class=\"brush: sql; title: ; notranslate\" title=\"\">\nCREATE TABLE _person_ghost (\nid INT NOT NULL,\ncountry_id INT NOT NULL,\nname VARCHAR(255) NOT NULL,\nPRIMARY KEY(id),\nKEY country_idx (country_id),\nCONSTRAINT person_country_fk2 FOREIGN KEY (country_id) REFERENCES country(id) ON DELETE NO ACTION\n);\n\n<\/pre>\n<ul>\n<li>Notice the name of the constraint changes to `person_country_fk2`.<\/li>\n<li>Because `_person_ghost` is a child-only table, there&#8217;s no problem with it being empty.<\/li>\n<li><code>pt-online-schema-change<\/code> is based on synchronous, same-transaction, data copy via triggers. At any point in time, if we populate <code>_person_ghost<\/code> with a row, that row also exists in the original <code>person<\/code> table during that same transaction. This means the data we insert to <code>_person_ghost<\/code> is foreign key safe.<\/li>\n<li><code>gh-ost<\/code>, <code>fb-osc<\/code>, <code>Vitess<\/code> use an asynchronous approach where they tail either the binary logs or a changelog table. It is possible that as we <code>INSERT<\/code> data to <code>_person_ghost<\/code>, that data no longer exists in <code>person<\/code>. It is possible that there&#8217;s no matching entry in <code>country<\/code>! We can overcome that by disabling foreign key checks on our session\/connection that populates the ghost table. We run <code>SET FOREIGN_KEY_CHECKS=0<\/code> as make the server (and our users!) a promise, that even while populating the table there may be inconsistencies, we&#8217;ll figure it all out at time of cut-over.<\/li>\n<li>Finally, population is complete. We place whatever locks we need to, ensure everything is in sync, and swap <code>_person_ghost<\/code> in place of <code>person<\/code>.<\/li>\n<\/ul>\n<h3>ERROR!<\/h3>\n<p>What have ended up with? Take a look:<\/p>\n<p><a href=\"http:\/\/code.openark.org\/blog\/wp-content\/uploads\/2021\/03\/fk-tables.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-8147\" src=\"http:\/\/code.openark.org\/blog\/wp-content\/uploads\/2021\/03\/fk-tables.png\" alt=\"\" width=\"323\" height=\"291\" srcset=\"https:\/\/code.openark.org\/blog\/wp-content\/uploads\/2021\/03\/fk-tables.png 323w, https:\/\/code.openark.org\/blog\/wp-content\/uploads\/2021\/03\/fk-tables-300x270.png 300w\" sizes=\"auto, (max-width: 323px) 100vw, 323px\" \/><\/a><\/p>\n<p>The table <code>person_OLD<\/code> still exists, and maintains a foreign key constraint on <code>country<\/code>. Now, suppose we want to delete <code>country<\/code> number <code>99<\/code>. We delete or update all rows in <code>person<\/code> which point to country <code>99<\/code>. Good. We proceed to <code>DELETE FROM country WHERE id=99<\/code>. We can&#8217;t. That&#8217;s because <code>person_OLD<\/code> still has rows where <code>country_id=99<\/code>.<\/p>\n<h3>Well, why don&#8217;t you just drop that old constraint?<\/h3>\n<p>To drop the foreign key constraint from <code>person_old<\/code> is to <code>ALTER TABLE person_old DROP FOREIGN KEY person_country_fk<\/code>. What&#8217;s that? An <code>ALTER TABLE<\/code>? Wasn&#8217;t that the thing we wanted to avoid in the first place? There was a reason we ran an online schema change! So that&#8217;s an absolute no go.<\/p>\n<h3>Well, why don&#8217;t you just drop the old table?<\/h3>\n<p><code>pt-online-schema-change<\/code> offers <code>--alter-foreign-keys-method drop_swap<\/code>: to get rid of the foreign key we can drop the old table. The logic it offers is:<\/p>\n<ol>\n<li>Before we cut-over<\/li>\n<li>Disable foreign key checks<\/li>\n<li><code>DROP<\/code> the original table (e.g. <code>person<\/code>)<\/li>\n<li><code>RENAME<\/code> the ghost table in its place<\/li>\n<\/ol>\n<h3>Problem: DROP<\/h3>\n<p>Alas, more turtles. Dropping a MySQL table is production is a cause for outage. Here&#8217;s a <a href=\"https:\/\/github.com\/github\/gh-ost\/issues\/307\">lengthy discussion<\/a> form the <code>gh-ost<\/code> repo. Digging my notes shows <a href=\"http:\/\/code.openark.org\/blog\/mysql\/tip-faster-than-truncate\">this post<\/a> from 2010. This is an ancient problem where dropping a table places locks on buffer pool and on adaptive hash index, and there&#8217;s been multiple attempts to work around it. See Vitess&#8217;s <a href=\"https:\/\/vitess.io\/docs\/reference\/features\/table-lifecycle\/\">table lifecycle<\/a> for more.<\/p>\n<p>Just a couple months ago, MySQL <code>8.0.23<\/code> <a href=\"https:\/\/dev.mysql.com\/doc\/relnotes\/mysql\/8.0\/en\/news-8-0-23.html\">release notes<\/a> indicate that this bug is finally solved. I can&#8217;t wait to try it out. Most of the world is not on <code>8.0.23<\/code>\u00a0<em>yet<\/em> and until it is, <code>DROP<\/code> is a problem.<\/p>\n<p>In my personal experience, if you can&#8217;t afford to run a straight <code>ALTER<\/code> on a table, it&#8217;s likely you can&#8217;t afford to <code>DROP<\/code> it.<\/p>\n<h3>Problem: outage<\/h3>\n<p>As <code>pt-online-schema-change<\/code> documentation correctly point out, we cause a brief time of outage after we <code>DROP<\/code> the <code>person<\/code> table, and before we <code>RENAME TABLE _person_ghost TO person<\/code>. This is unfortunate, but, assuming <code>DROP<\/code> is instantaneous, is indeed brief.<\/p>\n<h3>Child-side: summary<\/h3>\n<p>Assuming MySQL <code>8.0.23<\/code> with instantaneous <code>DROP<\/code>, altering a table with child-side-only constraint is feasible. Without instantaneous <code>DROP<\/code>, the migration can be as blocking as a straight <code>ALTER<\/code>.<\/p>\n<p>I regret to inform that from here things only get worse.<\/p>\n<h2>Changing a parent table<\/h2>\n<p>What happens if we <em>naively<\/em> try to <code>ALTER TABLE country ADD COLUMN currency VARCHAR(16) NOT NULL<\/code>?<\/p>\n<p>We create a ghost table, we populate the ghost table, we cut-over, and&#8230; End up with:<\/p>\n<p><a href=\"http:\/\/code.openark.org\/blog\/wp-content\/uploads\/2021\/03\/parent-side-swap-1.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-8153\" src=\"http:\/\/code.openark.org\/blog\/wp-content\/uploads\/2021\/03\/parent-side-swap-1.png\" alt=\"\" width=\"960\" height=\"540\" srcset=\"https:\/\/code.openark.org\/blog\/wp-content\/uploads\/2021\/03\/parent-side-swap-1.png 960w, https:\/\/code.openark.org\/blog\/wp-content\/uploads\/2021\/03\/parent-side-swap-1-300x169.png 300w, https:\/\/code.openark.org\/blog\/wp-content\/uploads\/2021\/03\/parent-side-swap-1-768x432.png 768w\" sizes=\"auto, (max-width: 960px) 100vw, 960px\" \/><\/a><\/p>\n<p>Our naive approach fails miserably. As we <code>RENAME TABLE country to country_OLD<\/code>, the children&#8217;s foreign keys, on <code>person<\/code> and <code>company<\/code>, followed the table entity into <code>country_OLD<\/code>. We are now in a situation where there is\u00a0<em>no active constraint<\/em> on <code>country<\/code>, and we&#8217;re stuck with a legacy table that affects our production.<\/p>\n<h3>Just drop the old table?<\/h3>\n<p>Other than the <code>DROP<\/code> issue discussed above, this doesn&#8217;t solve the main problem, which is that we are left with no constraint on <code>country<\/code>.<\/p>\n<h3>ALTER on parent implies ALTER on children<\/h3>\n<p>The shocking result of our naive experiment, is that if we want to <code>ALTER TABLE country<\/code>, we must &#8211; concurrently somehow &#8211; also <code>ALTER TABLE person<\/code> <em>and<\/em> &#8211; concurrently somehow &#8211; <code>ALTER TABLE company<\/code>. On the children tables we need to <code>DROP<\/code> the old foreign key, and create a new foreign key that points into <code>country_ghost<\/code>.<\/p>\n<p>That&#8217;s a lot to unpack.<\/p>\n<h3>How does pt-online-schema-change solve this?<\/h3>\n<p><code>pt-online-schema-change<\/code> offers <code>--alter-foreign-keys-method rebuild_constraints<\/code>. In this method, just before we cut-over and <code>RENAME<\/code> the tables, we iterate all children, and , one by one, run a straight <code>ALTER TABLE<\/code> on each of the children to <code>DROP<\/code> the old constraint and to <code>ADD<\/code> the new constraint, pointing to <code>country_ghost<\/code> (imminently? to be renamed to <code>country<\/code>).<\/p>\n<p>This must happen when the ghost table is in full sync with the original table, or else there can be violations. For <code>pt-online-schema-change<\/code>, which uses synchronous in-transaction trigger propagation, this works. For <code>gh-ost<\/code>, <code>Vitess<\/code> etc., which use the asynchronous approach, this can only take place while we place a write lock on the original table.<\/p>\n<p>As <code>pt-online-schema-change<\/code> documentation correctly indicates, this makes sense only when the children are all very small tables.<\/p>\n<p>This gets worse. Let&#8217;s break this down even more.<\/p>\n<h3>Straight ALTER on children, best case scenario?<\/h3>\n<p>Best case is achieved when indeed all children tables are very small. Still, we need to place a lock, and either sequentially or concurrently <code>ALTER<\/code> multiple such small tables.<\/p>\n<p>In my experience, on databases that aren&#8217;t trivially small, the opposite is more common: children tables are\u00a0<em>much<\/em> larger than parent tables, and running a straight <code>ALTER<\/code> on children is just not feasible.<\/p>\n<h3>Straight ALTER on children, failures?<\/h3>\n<p>Even the best case scenario poses the complexity of recovering\/rolling back from error. For example, in a normal online schema change, we set timeouts for DDLs. Like the final <code>RENAME<\/code>. If something doesn&#8217;t work out, we timeout the DDL, take a step back, and try cutting-over again later on. But our situation is much more complex now. While we keep a write lock, we must run multiple DDLs on the children, repointing their foreign keys from the original <code>country<\/code> table to <code>country_ghost<\/code>. What if\u00a0<em>one<\/em> of those DDLs fail? We are left in a limbo state. Some of the DDLs may have succeeded. We&#8217;d need to either revert them, introducing\u00a0<em>even more<\/em> DDLs (remember, we&#8217;re still holding locks), or retry that failing DDL. Those are a lot of DDLs to synchronize at the same time, even when they&#8217;re at all feasible.<\/p>\n<h3>If children tables are large?<\/h3>\n<p>In our scenario, <code>person<\/code> and <code>company<\/code> are large tables. A straight <code>ALTER<\/code> table is just not feasible. We began this discussion assuming there&#8217;s a problem with <code>ALTER<\/code> in the first place.<\/p>\n<p>Also, for asynchronous online schema changes the situation is much more complex since we need to place more locks.<\/p>\n<h3>So, let&#8217;s ALTER the children with Online Schema Change?<\/h3>\n<p>There&#8217;s an alluring thought. We bite, and illustrate what it would take to run an online schema change on each of the large children, concurrently to, and coordinated with, an online schema change on the parent.<\/p>\n<h3>When can we start OSC on children?<\/h3>\n<p>We want the children to point their FK to <code>country_ghost<\/code>. So we must kick the migration on each child\u00a0<em>after<\/em> the parent&#8217;s migration creates the ghost table, and certainly before cut-over.<\/p>\n<p>Initially, the parent&#8217;s ghost table is empty, or barely populated. Isn&#8217;t that a problem? Pointing to a parent table which is not even populated? Fortunately for us, we again remember we can disable foreign key checks as our OSC tool populates the child table. Sure, everything is broken at first, but we\u00a0<em>promise<\/em> the server and the user that we will figure it all out at cut-over time.<\/p>\n<p>So far, looks like we have a plan. We need to catch that notification that <code>country_ghost<\/code> table is created, and we kick an online migration on <code>person<\/code> and on <code>company<\/code>.<\/p>\n<h3>When do we cut-over each migration?<\/h3>\n<p>We absolutely can&#8217;t cut-over <code>country<\/code> before <code>person<\/code> and <code>company<\/code> are complete. That&#8217;s why we embarked on altering the children in the first place. We must have the children&#8217;s foreign keys point to <code>country_ghost<\/code> before cutting it over.<\/p>\n<p>But now, we need to also consider: when is it\u00a0<em>safe<\/em> to cut-over <code>person<\/code> and <code>company<\/code>? It is only\u00a0<em>safe<\/em> to cut-over when referential integrity is guaranteed. We remember that throughout the parent&#8217;s migration there&#8217;s no such guarantee. surely not while the table gets populated. And for asynchronous-based migrations, even after that, because the ghost table always &#8220;lags&#8221; a bit behind the original table.<\/p>\n<p>The only way to provide referential integrity guarantee for asynchronous based migrations is when we place a write lock on the parent table (<code>country<\/code>). We bite. We lock the table for writes, and sync up <code>country_ghost<\/code> until we&#8217;re satisfied both are in complete sync. Now&#8217;s logically a\u00a0<em>safe<\/em> time to cut-over the children.<\/p>\n<p>But notice: this is a single, unique time, where we must cut-over\u00a0<em>all children<\/em>, or none. This gets worse.<\/p>\n<h3>Best case scenario for cutting-over<\/h3>\n<p>In the best scenario, we place a lock on <code>country<\/code>, sync up <code>country_ghost<\/code>, hold the lock, then iterate all children, and cut-over each. All children operations are successful. We cut-over the parent.<\/p>\n<p>But this best case scenario depends on getting the best case scenario on each of the children, to its own. Remember, an <code>ALTER<\/code> on a child table means we have to <code>DROP<\/code> the child&#8217;s\u00a0<em>old<\/em> table. Recall the impact it has in production. Now multiply by <code>n<\/code> children. The <code>ALTER<\/code> on <code>country<\/code>, and while holding a write lock,\u00a0 will need to sustain survive <code>DROP<\/code> on both <code>person_OLD<\/code> and <code>company_OLD<\/code>. This ie best case.<\/p>\n<h3>Less than best case scenario is a disaster<\/h3>\n<p>We don&#8217;t have the room for problems. Suppose <code>person<\/code> cuts over, and we <code>DROP<\/code>\u00a0<code>person_OLD<\/code>. But then <code>company<\/code> fails to cut-over. There&#8217;s DDL timeout.<\/p>\n<p>We can&#8217;t roll back.\u00a0<code>person<\/code> is now committed to <code>company_ghost<\/code>. We can try cutting over <code>company<\/code> again, and again, and again. But we may not fail. During these recurring attempts we must keep the lock on <code>country<\/code>. And try again <code>company<\/code>. Did it succeed? Phew. We can cut-over <code>country<\/code> and finally remove the lock.<\/p>\n<p>But\u00a0<em>what if<\/em> something really fails? <em>Pro tip<\/em>: it most certainly happens.<\/p>\n<p>If <code>person<\/code> made it, and <code>company<\/code> does not &#8211; if <code>company<\/code>&#8216;s migration breaks, fails, panics, gets killed, goes into seemingly infinite deadlocks, is unable to cut-over &#8212; whichever &#8212; we&#8217;re left in inconsistent and impossible scenario. <code>person<\/code> is committed to <code>company_ghost<\/code>, but <code>company<\/code> is still committed to <code>country<\/code>. We have to keep that lock on <code>country<\/code> and run a new migration on <code>company<\/code>! and again, and again. Meanwhile, <code>country<\/code> is locked. Oh yes, meanwhile <code>person<\/code> is also locked. You can&#8217;t write to <code>person<\/code> because you can&#8217;t verify that related rows exist in <code>country<\/code>, because <code>country<\/code> has a <code>WRITE<\/code> lock.<\/p>\n<p>I can&#8217;t stress this enough: the lock must not be released until all children tables are migrated. So, for our next turtle, what happens on a failover? We get referential integrity corruption, because locks don&#8217;t work across servers.<\/p>\n<h3>Disk space<\/h3>\n<p>Remember that an OSC works by creating a ghost table and populating it until it is in sync with the original table. This effectively means requiring extra disk space at roughly the same volume as the original table.<\/p>\n<p>In a perfect world, we&#8217;d have all the disk space we ever needed. In my experience we&#8217;re far from living in a perfect world. I&#8217;ve had migrations where we weren&#8217;t sure we had the disk space for a single table change.<\/p>\n<p>If we are to <code>ALTER<\/code> a parent, and as by product <code>ALTER<\/code> all of its children,\u00a0<em>at the same time<\/em>, we&#8217;d need enough free disk space for all volumes of affected tables, <em>combined<\/em>.<\/p>\n<p>In fact, running out of disk space is one of the common reasons for failing an online schema change operation. Consider how low the tolerance is for parent-side schema migration errors. Consider that running out of disk space isn&#8217;t something that just gets solved by retrying the cut-over again, and again, &#8230; the disk space is not there.<\/p>\n<h3>Run time<\/h3>\n<p>Three migrations running concurrently will not run faster than three migrations running sequentially &#8212; that&#8217;s my experience backed with production experiments. In my experience they actually end up taking <em>longer<\/em> because they&#8217;re all fighting for same resources, and context switch matters, as back-off intervals pile up. Maybe there&#8217;s some scenario where they could run <em>slightly<\/em> faster?<\/p>\n<p>Altering our 200 row <code>country<\/code>\u00a0 table ends up taking hours and hours due to the large <code>person<\/code> and <code>country<\/code> tables. The time for a migration is roughly the sum of times for all dependent migrations!<\/p>\n<p>Hmmm. Maybe on <code>country<\/code> we should just run a straight <code>ALTER<\/code>. I think so, that wins! But it only wins our particular scenario, as we see next.<\/p>\n<h3>Parent-side: summary<\/h3>\n<p>The operational complexity of Online Schema Changes for parent-side foreign keys is IMO not feasible. We need to assume all child-side operations are feasible, first (I&#8217;m looking at you, <code>DROP TABLE<\/code>), and we have almost zero tolerance to things going wrong. Coordinating multiple migrations is complex, and a failover at the wrong time may cause corruption<\/p>\n<h2>Changing a deep nested relationship<\/h2>\n<p>Truly, everything discussed thus far was a simplified situation. We introduce more turtles to our story. Let&#8217;s add this table:<\/p>\n<pre class=\"brush: sql; title: ; notranslate\" title=\"\">\nCREATE TABLE person_company (\nid INT NOT NULL AUTO_INCREMENT,\nperson_id INT NOT NULL,\ncompany_id INT NOT NULL,\nstart_at TIMESTAMP NOT NULL,\nend_at TIMESTAMP NULL,\nPRIMARY KEY(id),\nKEY person_idx (person_id),\nKEY company_idx (company_id),\nCONSTRAINT person_company_person_fk FOREIGN KEY (person_id) REFERENCES person(id) ON DELETE NO ACTION,\nCONSTRAINT person_company_company_fk FOREIGN KEY (company_id) REFERENCES company(id) ON DELETE NO ACTION\n);\n<\/pre>\n<p><code>person_company<\/code> is a child of <code>person<\/code> and of <code>company<\/code>. It&#8217;s actually enough that it&#8217;s a child of\u00a0<em>one<\/em> of them. What&#8217;s important is that now <code>person<\/code> is both a child table\u00a0<em>and<\/em> a parent table. So is <code>company<\/code>. This is a pretty common scenario in schema designs.<\/p>\n<h3>How do you <code>ALTER<\/code> a table that is both a parent and a child?<\/h3>\n<p>We introduce no new logic here, we &#8220;just&#8221; have to combine the logic for both. Given <code>person_company<\/code> exists, if we wanted to <code>ALTER TABLE person<\/code> we&#8217;d need to:<\/p>\n<ul>\n<li>Alter <code>person<\/code> as a child table (implies <code>DROP<\/code> issue and outage)<\/li>\n<li>Alter <code>person<\/code> as a parent (implies altering <code>person_company<\/code> and synchronizing the cut-over)<\/li>\n<\/ul>\n<p>So how do we alter <code>country<\/code> now?<\/p>\n<p>To <code>ALTER TABLE country<\/code>, we&#8217;d need to:<\/p>\n<ul>\n<li>Begin <code>country<\/code> OSC, wait till <code>country_ghost<\/code> is created<\/li>\n<li>Then, begin <code>person<\/code> OSC, wait till <code>person_ghost<\/code> is created, and<\/li>\n<li>begin <code>company<\/code> OSC, wait till <code>company_ghost<\/code> is created<\/li>\n<li>Then, begin OSC on <code>person_company<\/code><\/li>\n<li>Run until\u00a0<em>all of the migrations<\/em> seem to be ready to cut-over<\/li>\n<li>Place lock on <code>country<\/code>. while this lock is in place:\n<ul>\n<li>Sync up <code>person<\/code> migration. Place lock on <code>person<\/code>, and<\/li>\n<li>Sync up <code>company<\/code> migration. Place lock on <code>company<\/code>.<\/li>\n<li>While both locks are in place:\n<ul>\n<li>Sync up <code>person_company<\/code>.<\/li>\n<li><code>DROP person_company_OLD<\/code>!<\/li>\n<li>Cut-over <code>person_company<\/code>!<\/li>\n<\/ul>\n<\/li>\n<li><code>DROP company_OLD!<\/code><\/li>\n<li>Cut-over <code>company<\/code>!<code><\/code><code><\/code><\/li>\n<li><code>DROP person_OLD<\/code>!<\/li>\n<li>Cut-over <code>person<\/code>!<\/li>\n<\/ul>\n<\/li>\n<li>Cut-over <code>country<\/code>!<\/li>\n<\/ul>\n<p>And we have near zero tolerance to any failure in the above, and we can&#8217;t afford a failover during that time&#8230;<\/p>\n<h2>Overall summary<\/h2>\n<p>It would all be better if we could just run <code>ALTER TABLE<\/code> in MySQL and have it truly online, throttling, and on replicas, too. This doesn&#8217;t exists and our alternative is mostly Online Schema change tools, where, IMO, handing foreign key constraints on large tables is not feasible.<\/p>\n<p>There&#8217;s an alternative to Online Schema change, which is to <code>ALTER<\/code> on replicas. That comes with its own set of problems, and for this blog post I just ran out of fumes. For another time!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This post explains the inherent problem of running online schema changes in MySQL, on tables participating in a foreign key relationship. We&#8217;ll lay some ground rules and facts, sketch a simplified schema, and dive into an online schema change operation. Our discussion applies to pt-online-schema-change, gh-ost, and Vitess based migrations, or any other online schema [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"enabled":false},"version":2}},"categories":[5],"tags":[125,121,96,17,137],"class_list":["post-8127","post","type-post","status-publish","format-standard","hentry","category-mysql","tag-gh-ost","tag-operations","tag-percona-toolkit","tag-schema","tag-vitess"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p2bZZp-275","_links":{"self":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts\/8127","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/comments?post=8127"}],"version-history":[{"count":32,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts\/8127\/revisions"}],"predecessor-version":[{"id":8165,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts\/8127\/revisions\/8165"}],"wp:attachment":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/media?parent=8127"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/categories?post=8127"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/tags?post=8127"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}