5.6.8-rc is out, and so I’m following up on InnoDB’s online DDL new feature: the ability to SELECT, INSERT, DELETE, UPDATE a table even while an ALTER TABLE is executing on same table.
The brief summary
Not as advertised; many things can’t be done.
The longer review
I’m using 5.6.8-rc 64bit binary distribution for Linux, installed via mysqlsandbox. My hardware is irrelevant, but the fact I’m testing on my laptop assists me in that ALTER TABLE operations take a while, so that I’m able to easily type commands in two terminals and have the time to watch them being executed. Query cache is disabled.
I’m using the sakila sample database, and in particular I’m working with the rental table. Here’s the table definition:
CREATE TABLE `rental` ( `rental_id` int(11) NOT NULL AUTO_INCREMENT, `rental_date` datetime NOT NULL, `inventory_id` mediumint(8) unsigned NOT NULL, `customer_id` smallint(5) unsigned NOT NULL, `return_date` datetime DEFAULT NULL, `staff_id` tinyint(3) unsigned NOT NULL, `last_update` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, PRIMARY KEY (`rental_id`), UNIQUE KEY `rental_date` (`rental_date`,`inventory_id`,`customer_id`), KEY `idx_fk_inventory_id` (`inventory_id`), KEY `idx_fk_customer_id` (`customer_id`), KEY `idx_fk_staff_id` (`staff_id`), CONSTRAINT `fk_rental_staff` FOREIGN KEY (`staff_id`) REFERENCES `staff` (`staff_id`) ON UPDATE CASCADE, CONSTRAINT `fk_rental_inventory` FOREIGN KEY (`inventory_id`) REFERENCES `inventory` (`inventory_id`) ON UPDATE CASCADE, CONSTRAINT `fk_rental_customer` FOREIGN KEY (`customer_id`) REFERENCES `customer` (`customer_id`) ON UPDATE CASCADE ) ENGINE=InnoDB AUTO_INCREMENT=16050 DEFAULT CHARSET=utf8
Highlights for the table: AUTO_INCREMENT PRIMARY KEY, some columns indexed, some not, and Foreign Keys in place. Pretty much a standard table. It contains 16,044 rows. Row format is COMPACT.
What I want to know is: which DDL commands allow for which online DML commands?
So, on terminal #1 I will issue queries like:
node1 5.6.8-rc-log sakila> alter table sakila.rental ROW_FORMAT=COMPACT /* or whatever */; Query OK, 0 rows affected (10.57 sec) Records: 0 Duplicates: 0 Warnings: 0
And during the above operation, I will execute the following on terminal #2:
- select max(rental_id) from sakila.rental; this queries the AUTO_INCREMENT value, which is of course a PRIMARY KEY
- select min(rental_date) from sakila.rental; there is an index on rental_date, and normal execution plan is to optimize table away and just use the index
- select min(return_date) from sakila.rental; there is no index on return_date, and full table scan is required
- update rental set return_date = return_date + interval 1 second where rental_id=3; the UPDATE uses the PRIMARY KEY
- update rental set return_date = return_date + interval 1 second where return_date = NOW(); won’t actually affect anything, but requires full scan.
So here are the results:
+-------------------------------------------------------------+-------+---------------------------+---------------+---------------------+-------------------------+--------------+---------------------+ | ALTER statement | Time | General comments | select max PK | select min by index | select min by full scan | update by PK | update by full scan | +-------------------------------------------------------------+-------+---------------------------+---------------+---------------------+-------------------------+--------------+---------------------+ | ROW_FORMAT=COMPACT | 10.92 | | Instant | Instant | Instant | blocked | blocked | | AUTO_INCREMENT=16051 | 0.06 | Instant, no table rebuild | N/A | N/A | N/A | N/A | N/A | | ADD INDEX(last_update) | 2.37 | | blocked | blocked | blocked | blocked | blocked | | ADD INDEX(last_update), ALGORITHM=INPLACE | 1.83 | | blocked | blocked | blocked | blocked | blocked | | ADD INDEX(last_update), ALGORITHM=INPLACE, LOCK=NONE | 0.00 | ERROR 1235 (42000): ... | N/A | N/A | N/A | N/A | N/A | | ADD COLUMN c CHAR(1) NOT NULL | 11.20 | | Instant | Instant | Instant | blocked | blocked | | ADD COLUMN c CHAR(1) NOT NULL, ALGORITHM=INPLACE, LOCK=NONE | 0.00 | ERROR 1235 (42000): . | N/A | N/A | N/A | N/A | N/A | +-------------------------------------------------------------+-------+---------------------------+---------------+---------------------+-------------------------+--------------+---------------------+
Rather surprising, I would say.
- None of my tests resolved with online write (UPDATE). At best I could get online read (SEELCT).
- AUTO_INCREMENT is instantaneous. High time for that! It’s just some number in the .frm file, never understood the need for table rebuild.
- Apparently ADD COLUMN is more online than ADD INDEX, and I’ve tested this again and again and again to make sure I was doing it right. This is quite weird, even according to the docs.
- In none of the above tests (and others, non listed), have I been able to specify LOCK=NONE. It’s always ERROR 1235 (42000): This version of MySQL doesn’t yet support ‘alter table sakila.rental <whatever>, algorithm=inplace, lock=none’.
So what’s so online about this? Online reads are nice, but most everyone cannot accept blocking writes (for same reason no one would use mysqlhotcopy, also so wrongly named). This leaves us again with oak-online-alter-table and pt-online-schema-change.
The butler did it
Apologies to the butler, the FOREIGN KEYs did it. Let’s try the same again without foreign keys:
node1 5.6.8-rc-log sakila> create table rental2 like rental; node1 5.6.8-rc-log sakila> insert into rental2 select * from rental; node1 5.6.8-rc-log sakila> rename table rental to rental_old, rental2 to rental; Query OK, 0 rows affected (0.31 sec)
Here are the results:
+-------------------------------------------------------------+-------+---------------------------+----------------+---------------------+-------------------------+----------------+---------------------+ | ALTER statement | Time | General comments | select max PK | select min by index | select min by full scan | update by PK | update by full scan | +-------------------------------------------------------------+-------+---------------------------+----------------+---------------------+-------------------------+----------------+---------------------+ | ROW_FORMAT=COMPACT | 11.03 | | Instant | Instant | Instant | Instant | Instant | | AUTO_INCREMENT=16051 | 0.05 | Instant, no table rebuild | N/A | N/A | N/A | N/A | N/A | | ADD INDEX(last_update) | 2.04 | | * Inconsistent | * Inconsistent | * Inconsistent | blocked | blocked | | ADD INDEX(last_update), ALGORITHM=INPLACE, LOCK=NONE | 3.14 | | * Inconsistent | * Inconsistent | * Inconsistent | blocked | blocked | | ADD COLUMN c CHAR(1) NOT NULL | ** | | * Inconsistent | * Inconsistent | * Inconsistent | * Inconsistent | * Inconsistent | | ADD COLUMN c CHAR(1) NOT NULL, ALGORITHM=INPLACE, LOCK=NONE | ** | | * Inconsistent | * Inconsistent | * Inconsistent | * Inconsistent | * Inconsistent | +-------------------------------------------------------------+-------+---------------------------+----------------+---------------------+-------------------------+----------------+---------------------+
What’s going on here?
- ALGORITHM=INPLACE, LOCK=NONE is accepted! Bad, bad foreign keys!
- * ADD INDEX usually allows for concurrent reads, but after repeated tests SELECTs start to block. Then they don’t work concurrently anymore until table is recreated. But even that not always, so I’m not sure what the inconsistency is.
- * ADD COLUMN is still more concurrent than ADD INDEX, and actually allows for concurrent writes! Though, inconsistently. Sometimes it does not allow for concurrent writes.
- ** ADD COLUMN runtime highly affected by concurrent queries. It wents as high as 45 seconds on my laptop. Now, to make things clear, I’m not running an automated benchmark here: I’m copying+pasting the statements from my editor to the mysql CLI. So, maybe 10 or 15SELECT and UPDATE queries executes. How does that justify 35 seconds delay in table rebuild?
Some conclusions:
- The documentation does not specify anything about FOREIGN KEYs crashing the party. It should.
- The documentation specifically mentions the ADD/DROP INDEX statements to be online. ADD INDEX is less online than ADD COLUMN.
- Everything is still shaky. Sometimes things work, sometimes they don’t.
- Runtimes are unproportionally affected by concurrent queries.
- For the meantime, I keep to my online alter table scripts. Been using them for 3.5 years now.
It’s still an active work in progress at the moment but there are some improvements to the documentation for this that may help. See http://dev.mysql.com/doc/refman/5.6/en/innodb-online-ddl.html .
The inconsistency you saw is probably explained in this paragraph: “An online DDL statement for an InnoDB table always waits for currently executing transactions that are accessing the table to commit or roll back, because it requires exclusive access to the table for a brief period while the DDL statement is being prepared. Likewise, it requires exclusive access to the table for a brief time before finishing. Thus, an online DDL statement waits for any transactions that are started while the DDL is in progress, and query or modify the table, to commit or roll back before the DDL completes.”
So an open SELECT can prevent the commit (final) phase from getting its lock until the SELECT has finished, delaying the ALTER in a time when it’s locking exclusively. We’re looking into improving this since it’s not really desirable to hold an exclusive lock while waiting for a SELECT to finish, no guarantee that this will happen, nor when.
Best to check the documentation from time to time as it improves and use bug reports for feedback and requests for clarifications of things, that feedback is at least as welcome as blog posts. This applies to all new (and existing) feature descriptions. Lots of ongoing work on their documentation.
Please do keep up the feedback, it’s much appreciated!
Views are my own, for an official view, seek a PR person.
James Day, MySQL Senior Principal Support Engineer, Oracle