Things that don’t work well with MySQL’s FOREIGN KEY implementation

Foreign keys are an important database construct, that let you keep an aspect of data integrity within your relational model. Data integrity has many faces, most of which are application-specific. Foreign keys let you maintain an association integrity between rows in two tables, one being the parent, the other being the child. I’ve personally mostly avoided using foreign keys in MySQL for many years in my professional work, for several reasons. I’d like to now highlight things I find to be wrong/broken with the MySQL implementation.

But, first, as in the past this caused some confusion: when I say I’m not using foreign keys, that does not mean I don’t JOIN tables. I still JOIN tables. I still have some id column in one table and some parent_id in another. I still use the benefit of the relational model. In a sense, I do use foreign keys. What I don’t normally is the foreign key CONSTRAINT, i.e. the declaration of a CONSTRAINT some_fk FOREIGN KEY ... in a table’s definition.

So here are things I consider to be broken, either specific to the MySQL implementation, or in the general concept. Some are outright deal breakers for environments I’ve worked with. Others are things to work around. In no particular order:

No binary log entries for cascaded writes

I think there are many people unaware of this. In a way, MySQL doesn’t really support foreign keys. The InnoDB engine does. This is old history, from before InnoDB was even officially a MySQL technology, and was developed independently as a 3rd party product. There was a time when MySQL sought alternative engines. There was a time when there was a plan to implement foreign keys in MySQL, above the storage engine level. But as history goes, MySQL and InnoDB both became one with Oracle acquiring both, and I’m only guessing implementing foreign keys in MySQL became lower priority, to be eventually abandoned.

Alas, the fact foreign keys are implemented in the storage engine level has dire consequences. The engine does not have direct access to the binary log. If you create a foreign key constraint with ON DELETE|UPDATE of SET NULL or CASCADE, you should be aware that cascaded operations are never written to the binary log. Consider these two tables:

CREATE TABLE `parent_table` (
  `id` int NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

CREATE TABLE `child_table` (
  `id` int NOT NULL,
  `parent_id` int DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `parent_id_idx` (`parent_id`),
  CONSTRAINT `child_parent_fk` FOREIGN KEY (`parent_id`) REFERENCES `parent_table` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB;

insert into parent_table values (1);
insert into child_table values (1, 1);
insert into child_table values (2, 1);

If you were to DELETE FROM parent_table WHERE id=1, then the two rows in child_table are also deleted, due to the CASCADE rule. However, only the parent_table deleted row is written in the binary log. The two child_table rows are deleted internally by the InnoDB engine. The assumption is that when a replica applies the DELETE on parent_table the replica’s own InnoDB engine will likewise delete the two relevant child_table rows.

Fair assumption. But we lose information along the way. As Change Data Captures are becoming more and more common, and as we stream changes from MySQL to other data stores, the DELETEs on child_table are never reflected and cannot be captured.

Online DDL, aka online schema changes

I’ve written about this at length in the past. But even that write up is incomplete!

MySQL is pushing towards INSTANT DDL, which is a wonderful thing. With 8.0.29, even more schema change operations are supported by ALGORITHM=INSTANT. But, there’s still quite a lot of operations unsupported yet, and until such time that INSTANT DDL supports all (or at least all common) schema changes, Online Schema Change tools like gh-ost, pt-online-schema-change, and Vitess (disclaimer: I’m a Vitess maintainer and actively developing Vitess's Online DDL), are essential when it comes to production changes.

Both Vitess and gh-ost tail the binary logs to capture changes to the table. In light of the previous section, it is impossible to run such an Online Schema Change operation on a foreign key child table that has either SET NULL or CASCADE rule. The changes to the table are never reflected in the binary log. pt-online-schema-change is also unable to detect those changes as there’s nothing to invoke the triggers.

Then, please do go ahead and read The problem with MySQL foreign key constraints in Online Schema Changes, as it goes deep into what it otherwise means to deal with FK constraints in Online DDL, as it cannot fit in this post.

Locked data types

In the above table definitions, id and parent_id are int. As data grows, I might realize the choice of data type was wrong. I really should have used bigint unsigned.

Alas, it is impossible to change the data type in either parent_table or child_table:

> alter table parent_table modify column id bigint unsigned;
ERROR 3780 (HY000): Referencing column 'parent_id' and referenced column 'id' in foreign key constraint 'child_parent_fk' are incompatible.

> alter table child_table modify column parent_id bigint unsigned;
ERROR 3780 (HY000): Referencing column 'parent_id' and referenced column 'id' in foreign key constraint 'child_parent_fk' are incompatible.

It’s impossible to do that with straight-DDL (never mind INSTANT), and it’s impossible to do that with Online DDL. InnoDB (not MySQL) flatly refuses to accept any change in the related columns’s data type. Well, it’s not really about changing them as it is about having an incompatibility. But then, we can’t change either. The column type changes are only made possible if we modify the child table to remove the foreign key constraint, then alter both parent and child to modify the respective columns types, then re-add the foreign key constraint. There are four different ALTER TABLE statements. Neither removing nor adding a foreign key constraint is supported in INSTANT algorithm, so you can expect a long time in which the foreign key relationship simply does not exist!

CREATE TABLE … LIKE

One of those quirks that come with InnoDB owning the foreign key definition, is that CREATE TABLE ... LIKE does not generate foreign keys. I think this is mostly an oversight. A SHOW CREATE TABLE statement does produce foreign key output, so I’m not sure why CREATE TABLE ... LIKE doesn’t. Continuing our above child_table example:

> create table child_table_copy like child_table;
Query OK, 0 rows affected (0.06 sec)

> show create table child_table_copy \G
*************************** 1. row ***************************
       Table: child_table_copy
Create Table: CREATE TABLE `child_table_copy` (
  `id` int NOT NULL,
  `parent_id` int DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `parent_id_idx` (`parent_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

Unique constraint names

I know this is ANSI SQL, and so I won’t fault MySQL for this. I do think this is one of those scenarios where deviating from ANSI SQL would be beneficial. A foreign key constraint has a name (if you don’t provide one, one is auto-generated for you). And, that name, according to ANSI SQL, has to be unique across your schema. It means the following table conflicts with our original child_table:

CREATE TABLE `another_child_table` (
  `id` int NOT NULL,
  `parent_id` int DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `parent_id_idx` (`parent_id`),
  CONSTRAINT `child_parent_fk` FOREIGN KEY (`parent_id`) REFERENCES `parent_table` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB;

You can’t have two foreign key constraints both named child_parent_fk.

I never understood that limitation. See, it’s just fine the tables both have a key named parent_id_idx. No conflict about that. Why do foreign keys have to have unique names?

Maybe, in ANSI SQL, foreign keys can be independent constructs, living outside the table scope. Meh, even so this could be technically solved using some sort of namespace. But, in MySQL this isn’t the case in the first place. Foreign keys are part of the table definition.

This is again just painful for Online DDL, or for any automation that tries to duplicate tables on the fly.

Lack of declarative-only definitions

This is more of a “I wish this existed” rather than “this is wrong”. One of the greatest benefits of foreign keys is the graph. Given a schema with foreign keys, you can formally analyze the relationships between tables. You can draw the dependency graph. It’s really educating.

What I wish for is to have a declarative-only foreign key definition. One that does not actually enforce anything. Merely indicates an association. Something like so:

CREATE TABLE `child_table` (
  `id` int NOT NULL,
  `parent_id` int DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `parent_id_idx` (`parent_id`),
  DECLARATIVE FOREIGN KEY (`parent_id`) REFERENCES `parent_table` (`id`)
)

The declarative foreign key could still enforce the existence of the parent table and referenced column, definition-wise, but do nothing at all to enforce relationship of data.

Anyway, just a wish.

SET FOREIGN_KEY_CHECKS

I love that we have set foreign_key_checks. But it’s a bit inconsistent. Basically, set foreign_key_checks=0 lets you override foreign key constraints. You can do any of the following:

  • INSERT data to a child table even if the parent table does not have matching values.
  • With NO ACTION/RESTRICT rule, DELETE data from a parent table even if children tables have matching rows.
  • With SET NULL/CASCADE rule, DELETE data from a parent table without even attempting to cascade the change to children tables.
  • CREATE TABLE child_table that references parent_table even if parent_table does not exist.
  • DROP TABLE parent_table even if child_table exists and is populated.

But, why oh why, will set foreign_key_checks=0 not let me:

  • alter table parent_table modify column id bigint unsigned;(column type relationship are still enforced)
  • Swap a new parent table using a RENAME TABLE statement (wishful feature, would really help Online DDL)

Limited to server scope

This one becomes obvious as your data grows. If you use foreign keys and you rely on their behavior (e.g. your app relies on a DELETE to fail if there’s dependent rows in children tables), and your data set grows such that a single server does not have the write capacity, you’re in trouble.

You may attempt to do functional sharding. You will hopefully find two subsets of your schema’s tables, that are not connected in the foreign key graph. If so, you win the day. But if it’s all connected, then you have to break some relationships. You’d have to audit your app. It previously assume the database would take care of data integrity, and now, for some relationships, it wouldn’t.

Or you may want to have horizontal sharding. If you mean to keep foreign key constraints, that means you need to find a way to co-locate data across the entire dependency graph. Unless this was pre-designed, you will probably find this to be impossible without a major refactor.

Vitess is looking into FOREIGN KEY implementation. It will attempt to address some of the above limitations. See https://github.com/vitessio/vitess/issues/11975 and https://github.com/vitessio/vitess/issues/12967 for some preliminary write ups and tracking.

3 thoughts on “Things that don’t work well with MySQL’s FOREIGN KEY implementation

  1. Hi Shlomi, a question / comment on below.

    > The engine does not have direct access to the binary log.

    I am not sure this is completely true. In RBR, a UPDATEs and DELETEs put the full row “before” the operation in the binlogs. I do not exactly know how this is implemented (the Server querying these, or InnoDB generating binlogs) but it looks like the binlogs gets somehow “enriched” with data not available at the moment an UPDATE or DELETE is run by the Server. So in this respect, it is probably possible to “enrich” the binlogs with the FK impacted.

    And this reminds me of this bug I opened 2 years ago that you might be interested in: Foreign Key ON DELETE CASCADE breaks with RBR and multiple-table DELETE / https://bugs.mysql.com/bug.php?id=102586.

    Cheers, Jean-François

  2. Thanks JF, you may well be right on this!

    I did happen to look at the InnoDB code and I looked for references to the binary log, and came up with nothing. But in all honesty, I had hard time comprehending the code & structure, and may well have missed it. If the binlogs can be enriched then that’s fantastic.

    Thank for the link to that bug.

  3. In SQL, all constraint names are schema-qualified names. That means they don’t belong to tables, they belong to the schema. SQL constraints include table constraints, domain constraints, and assertions. The latter two types of constraints are clearly not table-specific. So the names of constraints must be unique within a schema, like other schema-qualified objects such as tables, routines, sequences, triggers, etc.

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.