A new release of openark kit is out. Some interesting additions.changes are:
- oak-online-alter-table: now works with multicolumn UNIQUE KEYs
- New: oak-chunk-update: break an long running or non indexed UPDATE/DELETE to small chunks
- oak-purge-master-logs, oak-show-replication-status: support for non standard slave ports
openark kit is a set of utilities for MySQL, helping in easing out everyday’s work. Let’s look more closely at the changes.
oak-online-alter-table
The utility allows for non-blocking ALTER TABLE operations, under certain limitations. One limitation which has been removed in the current release was the single-column UNIQUE KEY limitation. As of now, a requirement for running oak-online-alter-table is that the altered table has some UNIQUE KEY. It could be numerical, textual, single column, multi-column (compound), anything.
Other requirements:
- The table must not have any FOREIGN KEY association (neither parent nor child). This will be the next requirement to change. Support for child-side FOREIGN KEY will be added. At current, I do not see a solution for parent-side.
- The altered table must share a UNIQUE KEY with the original one (that is, at least one UNIQUE KEY must stay in place after altering the table).
- The table must not have any ‘AFTER’ triggers defined. At current, I do not see a solution for that, the reason being that the utility needs to create those triggers as port of its mechanism, and that MySQL does not allow for atomic alteration of triggers. Also see this post.
- Some operations, such as TRUNCATE, OPTIMIZE etc. are not permitted on the table while the utility works.
There was need (and there still is) for many MySQL issues workarounds. I’ll update as work progresses.
oak-chunk-update
Using similar techniques to oak-online-alter-table, and requiring a UNIQUE KEY on a table, this utility allows for breaking of a long running, or a non-indexed query, into small chunks, thereby allowing for relatively non blocking UPDATE/DELETE operations.
For example, a DELETE FROM log WHERE log_ts < ‘2009-01-01 00:00:00’ might prove to be a long running query. If the log_ts column is not indexed, the table is virtually locked, whether this was MyISAM or InnoDB. Assuming an AUTO_INCREMENT column called ‘id’ exists (any UNIQUE KEY will do, so this is just an example), we can provide the following query as argument to oak-chunk-update:
DELETE FROM log WHERE log_ts < '2009-01-01 00:00:00' AND OAK_CHUNK(log.id)
The utility will parse the magic token OAK_CHUNK(…), and will work the query over distinct ranges, e.g.:
DELETE FROM log WHERE log_ts < '2009-01-01 00:00:00' AND log.id > 11000 AND log.id <= 12000 DELETE FROM log WHERE log_ts < '2009-01-01 00:00:00' AND log.id > 12000 AND log.id <= 13000 etc.
While the above can also be easily solved using a simple stored procedure, the magic becomes more interesting when you have a compound UNIQUE KEY, which may have non integral columns.
The utility also supports operations on multiple tables. Thus, one can use DELETE FROM t1 USING t1,t2…, for example, as long as an OAK_CHUNK(…) is proivded on one of the tables. It is interesting to see how the choice of table affects the chunking process.
oak-purge-master-logs, oak-show-replication-status
The two now support non default ports for slaves. This is achieved by utilizing the report_port parameter on the slaves, reading the value on SHOW SLAVE HOSTS (see this post). Both utilities support skipping SHOW SLAVE HOSTS and just picking up slaves by looking at SHOW PROCESSLIST.
Bugs
I’ve had issues with Google’s Issues, which made me leave a couple of unanswered issues for a couple of months. Brrrr. (BTW both were only related to prerequisites). I’ll resolve my issues issues. I’m looking forward for input, even though my spare time is little these days.
Acknowledgments
Writing of the toolkit is mostly done at my spare time, though being utilized in my professional work. Recently, a customer in urgent need has agreed for paying out for required development time on these scripts, while recognizing the open source model and license under which it is being developed and released, and removing all claims for copyright. I very much appreciate the customer’s approach and wish to acknowledge his contribution.
very good job!
i saw the source code, i’m not sure that ” if table_engine == “innodb” engine_flags = “LOCK IN SHARE MODE”” is useful,maybe,wo can play no locks,as the triger makes the gost table has most currend version, am i right?
looking forward your replay
Hi Lyz,
Take a look at the wiki page: OakOnlineAlterTableConcurrency to see why LOCK IN SHARED MODE is required.
Regards,
Shlomi
hello I am trying to run this update using Openark kit but I get an erorr:
update inquiry_tradelines it set it.tracking_id = (select st.tracking_id from supplier_tradelines st where st.id=it.supplier_tradeline_id) where it.tracking_id is null;
[root@tpareldb opt]# oak-chunk-update -u root –ask-pass –socket=/var/lib/mysql/mysql.sock –database=clarity_production –execute=”update inquiry_tradelines it set it.tracking_id = (
> select st.tracking_id
> from supplier_tradelines st
> where st.id=it.supplier_tradeline_id
> ) where it.tracking_id is null AND OAK_CHUNK(clarity_production.inquiry_tradelines)” –chunk-size=5 –sleep=20000 –verbose
Password:
— ERROR: Failed chunk: (1054, “Unknown column ‘inquiry_tradelines.id’ in ‘where clause'”)