A new utility in the openark kit allows for online ALTER TABLE operations. That is, it modifies a table's structure without locking down the entire table for the duration of the operation. The oak-online-alter-table utility works under the following restrictions:
- The table has at least one single-column UNIQUE KEY [*]
- The altered table shares a single-column UNIQUE KEY with the original table [*]
- No ‘AFTER’ triggers are defined on the table (the utility creates its own triggers for the duration of the operation)
- The table has no FOREIGN KEYs [*][#]
- Table name is no longer than 57 characters
[*]: Restriction is scheduled to be removed or partly removed.
[#]: ‘Child-side’ foreign keys may actually work, but have not been tested.
What follows is a mini FAQ which introduces the utility.
So what exactly does this utility provide?
- First and foremost, the ability to perform a non-blocking ALTER TABLE. This has long been an issue with MySQL, and complex, application-aware Master-Master solutions are currently required to perform an ALTER TABLE with minimal downtime. The utility offers a no-downtime solution, though there is a performance penalty for the duration of its runtime, and some requirements to meet.
- It also supports a ‘null’ ALTER. That is, an ALTER TABLE which does not change anything. This effectively means rebuilding the table. For InnoDB tables with innodb_file_per_table, for example, this can be a means of regaining disk space after removing many rows from the table. Also, while it does not strictly act like OPTIMIZE TABLE, running this utility should produce a better-organized table on disk (this is as yet unverified).
- Another thing this utility supports is the building of a ghost table: a duplicate of a given table, which keeps mirroring the original table via triggers. [May be removed in future versions]
How does it work?
Well, there are some wiki pages [1] [2] [3] which explain the details, but in general it goes as follows:
- The utility creates an empty ‘ghost’ table, a copy of the original
- The ghost table is ALTERed (or left unchanged, if no ALTER specified)
- Triggers are created on the original table, which propagate changes to the ghost table.
- In two passes, data is copied from the original table to the ghost table; the second pass removes from the ghost table any rows that no longer exist in the original.
- The ghost table is renamed to take the place of the original table; the original table is dropped.
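The two-pass flow above can be sketched on in-memory dictionaries. This is purely an illustration of the idea, not the utility's actual code: the real implementation issues chunked INSERT/DELETE statements over MySQL, and the trigger functions below stand in for real database triggers.

```python
# Sketch of the ghost-table algorithm, using dicts keyed by the
# shared single-column UNIQUE KEY. Illustration only.

original = {i: {"id": i, "val": i * 10} for i in range(1, 11)}
ghost = {}

# "Triggers": every write to the original is mirrored to the ghost.
def trig_insert(row):
    ghost[row["id"]] = dict(row)

def trig_delete(key):
    ghost.pop(key, None)

# Pass 1: copy data in small chunks (here, chunks of 3 keys).
keys = sorted(original)
chunk = 3
for start in range(0, len(keys), chunk):
    for k in keys[start:start + chunk]:
        if k in original:            # a row may vanish mid-copy
            ghost[k] = dict(original[k])

# Meanwhile, concurrent activity flows through the triggers:
original[99] = {"id": 99, "val": 990}
trig_insert(original[99])
del original[5]
trig_delete(5)

# Pass 2: remove ghost rows whose key no longer exists in the original.
for k in list(ghost):
    if k not in original:
        del ghost[k]

assert sorted(ghost) == sorted(original)
```

Chunking is what keeps the operation non-blocking: only a small key range is being read (and, in the real utility, locked) at any moment, so normal traffic proceeds between chunks.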
Here’s an invocation sample, in which we modify the type of the AUTO_INCREMENT column, add a column, add a key and drop a column – all at the same time, all online:
oak-online-alter-table -u root --ask-pass --socket=/tmp/mysql.sock --database=world --table=City --alter="MODIFY Id BIGINT UNSIGNED AUTO_INCREMENT, ADD COLUMN Mayor VARCHAR(64) CHARSET utf8, DROP COLUMN District, ADD KEY(Population)"
What kind of ALTERs are supported?
- ADD COLUMN (new column must have a default value)
- DROP COLUMN (as long as there’s still a shared single-column UNIQUE KEY between the old table and the altered one)
- MODIFY COLUMN (change type, including UNIQUE KEY columns)
- ADD KEY (normal, unique, fulltext etc.)
- DROP KEY (as long as there’s still a shared single-column UNIQUE KEY between the old table and the altered one)
- Change ENGINE: works, but great care should be taken when dealing with non-transactional engines
- Adding FOREIGN KEY constraints is possible
- More… Not all ALTERs have been tested, e.g. CONVERT, partition handling…
- none: it is possible to just do an online rebuild of the table, with no modification to its structure
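Several of the items above hinge on the shared single-column UNIQUE KEY restriction. A hypothetical sketch of the check involved (the function name and signature are my own, not the utility's API): the original and altered tables must share at least one single-column UNIQUE KEY, which is used to chunk and correlate rows.

```python
# Hypothetical sketch of the shared-unique-key validation.
# Each argument is a list of key tuples, one tuple per UNIQUE KEY.

def shared_unique_key(original_uniques, ghost_uniques):
    """Return a column that is a single-column UNIQUE KEY in both
    tables, or None if the ALTER would violate the restriction."""
    orig_single = {cols[0] for cols in original_uniques if len(cols) == 1}
    ghost_single = {cols[0] for cols in ghost_uniques if len(cols) == 1}
    common = orig_single & ghost_single
    return min(common) if common else None

# DROP KEY is fine as long as another shared single-column key remains:
print(shared_unique_key([("Id",), ("Code",)], [("Id",)]))  # "Id" survives
print(shared_unique_key([("a", "b")], [("a", "b")]))       # None: composite only
```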
The following are not allowed while the utility is running:
- An ALTER TABLE on the original table (well, obviously)
- A TRUNCATE on the original table
- LOAD DATA INFILE into the original table
- OPTIMIZE on the original table
In short, while this utility runs, just act normal…
Which engines are supported?
MyISAM & InnoDB are both supported. Any engine which uses table-level locks is implicitly supported (Memory, Archive). There is currently no support for Maria (transactional), PBXT or Falcon, and no plans to add it in the near future.
It is possible to ALTER the ENGINE of a table, but this is strongly discouraged (due to differences in transactional behavior, or lack of transactional support altogether).
With InnoDB, and when long transactions run during the runtime of this utility, one of the steps (which requires a LOCK TABLES…WRITE) may take some time to complete: the utility repeatedly attempts to acquire the locks until it succeeds. However, these attempts do not block other queries, and once the locks are acquired, they are held only briefly.
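The retry-until-locked step can be illustrated with a generic pattern (a sketch of the idea, not the utility's code): instead of one long blocking lock wait, issue short lock attempts in a loop, so that while a long transaction holds the lock, each failed attempt returns quickly and nothing piles up behind a long wait.

```python
import threading
import time

def acquire_with_retries(lock, attempt_timeout=0.05, max_attempts=100):
    """Retry short lock attempts instead of one long blocking wait.
    Analogous to retrying LOCK TABLES ... WRITE with a small timeout."""
    for _ in range(max_attempts):
        if lock.acquire(timeout=attempt_timeout):
            return True
    return False

table_lock = threading.Lock()

# Simulate a "long transaction" holding the lock briefly.
def long_transaction():
    with table_lock:
        time.sleep(0.2)

t = threading.Thread(target=long_transaction)
t.start()
time.sleep(0.01)                 # let the transaction grab the lock first
got_it = acquire_with_retries(table_lock)
assert got_it                    # acquired once the transaction released it
table_lock.release()
t.join()
```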
What impact does this utility have?
For the duration of its runtime, you can expect a performance penalty. There are several reasons for this:
- Triggers are added to your table. This, by itself, is a performance penalty.
- Any change to your table is propagated to the ghost table. So every INSERT effectively becomes two.
- More locks are held. While this utility is non-blocking, it locks small chunks in turn, each for a very short period. But still, locks are held.
- While the ghost table is being built, it consumes disk space of up to the same size as your original table. Depending on your ENGINE and storage setup, some of this disk space may be released later on.
I was drunk when I decided to run it, but now that I’m sober, I find out it’s still running and I want to undo everything!
OK. First, don’t drink and DBA. Next, stop the utility and run it again with --cleanup. All should be back to normal.
How mature is this utility? I think there’s a bug!
It is very experimental. It seems to be working well as far as I know, but community feedback is appreciated. Please report bugs here. Thank you!
@flytox,
I’m actually running this on live servers (my own…) and find that the load average does not increase significantly. What’s most affected is slave lag. Using the correct --sleep-time allows slaves to keep up better.
Great work, I’m interested in testing it
Mohammad,
please do, I’d appreciate any feedback
Regards
Thanks for the tool, I’m using it now… I’ve been doing similar things myself manually: first create the schema, then do a first pass, then copy the remaining rows, then rename/drop. Great script.