New and Noteworthy in openark kit

A new release of openark kit is out, with some interesting additions and changes.

openark kit is a set of utilities for MySQL that ease everyday work. Let’s look more closely at the changes.

oak-online-alter-table

The utility allows for non-blocking ALTER TABLE operations, under certain limitations. One limitation removed in the current release is the single-column UNIQUE KEY requirement. As of now, oak-online-alter-table only requires that the altered table has some UNIQUE KEY. It can be numerical, textual, single-column, multi-column (compound), anything.

Other requirements:

  • The table must not have any FOREIGN KEY association (neither parent nor child). This will be the next requirement to change: support for child-side FOREIGN KEYs will be added. At present, I do not see a solution for the parent side.
  • The altered table must share a UNIQUE KEY with the original one (that is, at least one UNIQUE KEY must stay in place after altering the table).
  • The table must not have any ‘AFTER’ triggers defined. At present, I do not see a solution for that: the utility needs to create those triggers as part of its mechanism, and MySQL does not allow for atomic alteration of triggers. Also see this post.
  • Some operations, such as TRUNCATE, OPTIMIZE etc. are not permitted on the table while the utility works.
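The UNIQUE KEY requirements above boil down to comparing key column lists. Here is a minimal sketch in Python, assuming each table's unique keys have already been read into a dictionary. The function name and input shape are hypothetical, and this is not necessarily how oak-online-alter-table performs the check:

```python
def shared_unique_keys(original_keys, altered_keys):
    """Return the unique keys, as column tuples, that both tables define.

    Each argument maps a key name to the ordered list of its columns,
    e.g. {"PRIMARY": ["id"], "name_uidx": ["last_name", "first_name"]}.
    """
    altered_column_lists = {tuple(columns) for columns in altered_keys.values()}
    return sorted(
        tuple(columns)
        for columns in original_keys.values()
        if tuple(columns) in altered_column_lists
    )


original = {"PRIMARY": ["id"], "name_uidx": ["last_name", "first_name"]}
altered = {"PRIMARY": ["id"], "email_uidx": ["email"]}
# An empty result would mean the two tables share no UNIQUE KEY.
print(shared_unique_keys(original, altered))
```

Keys are compared by their ordered column lists rather than by name, since a key may be renamed by the ALTER while still covering the same columns.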

Many workarounds for MySQL issues were needed (and still are). I’ll update as work progresses.

oak-chunk-update

Using techniques similar to those of oak-online-alter-table, and likewise requiring a UNIQUE KEY on the table, this utility breaks a long-running or non-indexed query into small chunks, allowing for relatively non-blocking UPDATE/DELETE operations.

For example, DELETE FROM log WHERE log_ts < '2009-01-01 00:00:00' might prove to be a long-running query. If the log_ts column is not indexed, the table is virtually locked for the duration, whether it is MyISAM or InnoDB. Assuming an AUTO_INCREMENT column called 'id' exists (any UNIQUE KEY will do, so this is just an example), we can provide the following query as an argument to oak-chunk-update:

DELETE FROM log WHERE log_ts < '2009-01-01 00:00:00' AND OAK_CHUNK(log.id)

The utility parses the magic token OAK_CHUNK(…) and runs the query over distinct ranges, e.g.:

DELETE FROM log WHERE log_ts < '2009-01-01 00:00:00' AND log.id > 11000 AND log.id <= 12000
DELETE FROM log WHERE log_ts < '2009-01-01 00:00:00' AND log.id > 12000 AND log.id <= 13000
etc.
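The expansion above can be sketched in a few lines of Python. This is only an illustration, assuming a single integer column and fixed-size ranges; the regular expression, the function name and its parameters are hypothetical, and the utility's own implementation may differ:

```python
import re

# Matches the magic token and captures the column named inside it.
CHUNK_TOKEN = re.compile(r"OAK_CHUNK\(\s*([\w.]+)\s*\)")


def chunk_queries(query, start, end, chunk_size):
    """Yield one query per (low, high] range of the chunking column."""
    match = CHUNK_TOKEN.search(query)
    if match is None:
        raise ValueError("query has no OAK_CHUNK(...) token")
    column = match.group(1)
    low = start
    while low < end:
        high = min(low + chunk_size, end)
        condition = "%s > %d AND %s <= %d" % (column, low, column, high)
        # Substitute the range condition for the token, leaving the rest intact.
        yield query[:match.start()] + condition + query[match.end():]
        low = high


query = "DELETE FROM log WHERE log_ts < '2009-01-01 00:00:00' AND OAK_CHUNK(log.id)"
for chunk in chunk_queries(query, 11000, 13000, 1000):
    print(chunk)
```

Each generated statement touches at most chunk_size key values, so locks are held only briefly and other queries can run between chunks.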

While the above can also be easily solved using a simple stored procedure, the magic becomes more interesting when you have a compound UNIQUE KEY, which may include non-integer columns.

The utility also supports operations on multiple tables. Thus, one can use DELETE FROM t1 USING t1,t2…, for example, as long as an OAK_CHUNK(…) is provided on one of the tables. It is interesting to see how the choice of table affects the chunking process.
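To see why compound keys are harder, note that "the next chunk starts after the last row seen" can no longer be written as a single comparison. One common way to express it is a lexicographic condition over the key columns. The sketch below builds such a condition; it is an assumption about how this can be done, not a description of what oak-chunk-update does internally, and the function name is hypothetical:

```python
def after_clause(columns, values):
    """Build a condition matching rows that sort strictly after `values`.

    For columns (a, b) the result is: (a > %s) OR (a = %s AND b > %s),
    together with the parameter list to bind to the placeholders.
    """
    terms, params = [], []
    for i, column in enumerate(columns):
        # All earlier key columns are equal, and this one is greater.
        equalities = ["%s = %%s" % c for c in columns[:i]]
        terms.append("(" + " AND ".join(equalities + ["%s > %%s" % column]) + ")")
        params.extend(values[:i + 1])
    return " OR ".join(terms), params


clause, params = after_clause(["last_name", "first_name"], ["Smith", "John"])
print(clause)
print(params)
```

Because the comparison relies only on column ordering, it works the same for textual and numeric columns.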

oak-purge-master-logs, oak-show-replication-status

Both now support non-default ports for slaves. This is achieved by using the report_port parameter on the slaves and reading its value from SHOW SLAVE HOSTS (see this post). Both utilities also support skipping SHOW SLAVE HOSTS and simply discovering slaves through SHOW PROCESSLIST.
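As a rough illustration of the SHOW SLAVE HOSTS approach, the sketch below turns result rows into (host, port) pairs. It assumes the rows were fetched as dictionaries keyed by the Host and Port columns of SHOW SLAVE HOSTS; the function name and the default_port parameter are hypothetical and not taken from the utilities' source:

```python
def slave_endpoints(rows, default_port=3306):
    """Return (host, port) pairs for the slaves listed in SHOW SLAVE HOSTS rows."""
    endpoints = []
    for row in rows:
        host = row.get("Host")
        if not host:
            # No host name reported for this row, so it cannot be connected to.
            continue
        # Port carries the slave's report_port; fall back if it is missing.
        port = int(row.get("Port") or default_port)
        endpoints.append((host, port))
    return endpoints


rows = [
    {"Server_id": 2, "Host": "db2", "Port": 3307, "Master_id": 1},
    {"Server_id": 3, "Host": "", "Port": 3306, "Master_id": 1},
]
print(slave_endpoints(rows))
```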

Bugs

I’ve had issues with Google’s Issues, which made me leave a couple of issues unanswered for a couple of months. Brrrr. (BTW both were only related to prerequisites.) I’ll resolve my issues issues. I’m looking forward to input, even though I have little spare time these days.

Acknowledgments

The toolkit is mostly written in my spare time, though I use it in my professional work. Recently, a customer in urgent need agreed to pay for the required development time on these scripts, while recognizing the open source model and license under which they are developed and released, and waiving all copyright claims. I very much appreciate the customer’s approach and wish to acknowledge his contribution.

4 thoughts on “New and Noteworthy in openark kit”

  1. very good job!
    I looked at the source code. I’m not sure that if table_engine == "innodb": engine_flags = "LOCK IN SHARE MODE" is useful. Maybe we can do without locks, since the trigger ensures the ghost table has the most current version. Am I right?
    Looking forward to your reply.

  2. Hello, I am trying to run this update using openark kit but I get an error:

    update inquiry_tradelines it set it.tracking_id = (select st.tracking_id from supplier_tradelines st where st.id=it.supplier_tradeline_id) where it.tracking_id is null;

  3. [root@tpareldb opt]# oak-chunk-update -u root --ask-pass --socket=/var/lib/mysql/mysql.sock --database=clarity_production --execute="update inquiry_tradelines it set it.tracking_id = (
    > select st.tracking_id
    > from supplier_tradelines st
    > where st.id=it.supplier_tradeline_id
    > ) where it.tracking_id is null AND OAK_CHUNK(clarity_production.inquiry_tradelines)" --chunk-size=5 --sleep=20000 --verbose
    Password:

    -- ERROR: Failed chunk: (1054, "Unknown column 'inquiry_tradelines.id' in 'where clause'")
