openark kit – code.openark.org

common_schema & openark-kit in the media: #DBHangOps, OurSQL

MySQLcommon_schema community INFORMATION_SCHEMA openark kit Speaking Stored routinesJune 26, 2013

#DBHangOps

I had the pleasure of joining into @DBHangOps today, and speak about common_schema and openark-kit. What was meant to be a 15 minute session turned to be 50 — sorry, people, I don’t talk as much at home, but when it comes to my pet projects…

I also realized I was missing on a great event: DBHangOps is a hangout where you can chat and discuss MySQL & related technologies with friends and colleagues, with whom you typically only meet at conferences. I will certainly want to attend future events.

Thanks to John Cesario and Geoffrey Anderson who invited me to talk, and to the friends and familiar faces who attended; I was happy to talk about my work, and very interested in hearing about how it’s being put to use. We also had time to discuss ps_helper with no other than Mark Leith!

The video is available on Twitter/YouTube.

OurSQL

openark-kit has also been featured on the OurSQL podcast by Sheeri & Gerry, who did great coverage of some tools. I will disclose that more is to come; I’m happy this is in capable hands and look further to hear the next episode!

opeark-kit revision 196 released

MySQLDevelopment Open Source openark kit QueryScriptMay 7, 2013

This is a long due maintenance release of openark-kit. This release includes bugfixes and some enhancements, mainly to oak-online-alter-table.

oak-online-alter-table Changes / bug fixes include:

Support for keyword-named columns
Use of FORCE INDEX due to lack of MySQL’s ability for figure out the chunking key at all times
–sleep-ratio option added; allows for sleep time proportional to execution time (as opposed to constant sleep time with –sleep)
Support for chunk-retry (in case of deadlock) via –max-lock-retries)
Fixed order of cleanup
Fixed bug with verbose messages with non-integer chunking key
Fixed bug with single-row tables (people, no need for this tool for single row tables :))
Friendly verbose messages to remind you what’s being executed

oak-chunk-update changes includes:

Verbosing query comment if exists (friendly printing of what’s being executed)

Continue reading » “opeark-kit revision 196 released”

Speaking at Percona Live 2013: common_schema, lightning talks

MySQLcommon_schema openark kit PerconaLiveApril 11, 2013April 11, 2013

In two weeks time I will be giving these talks at Percona Live:

common_schema: DBA’s framework for MySQL: an introduction to common_schema, my evolving server side solutions project. This will be a revised version of the talk I gave at Percona Live London; I have felt some weaknesses during that talk, which I’ve thrown out, letting room for cool stuff. I will discuss common_schema‘s various views, interesting and useful routines, the power of QueryScript, and a brief intro to the newcomer rdebug, debugger and debugging API for MySQL. If you’re not familiar with common_schema, it’s a good time to pick up on what I (being most biased) consider to be your smart assistant to MySQL maintenance and administration!
The query which is the peak of my career: this is a 6 minute lightning talk. You’re bound to attend if you’re at the community reception (which you are), so I don’t need to do promotional. You already payed the ticket and the doors will be locked. No escapees.

As far as I’m concerned the conference can be closed down the moment I provide these two talks, and we can all go to the beach.

Wait, no, I will also be at the DotOrg Pavillion at the Exhibit Hall, where I present common_schema and openark-kit. Come by to hear more about these!

common_schema over traditional scripts

MySQLcommon_schema openark kit python QueryScript scripts Stored routinesDecember 12, 2012

If you are familiar with both openark kit and common_schema, you’ll notice I’ve incorporated some functionality already working in openark kit into common_schema, essentially rewriting what used to be a Python script into SQL/QueryScript.

What was my reasoning for rewriting good code? I wish to explain that, and provide with a couple examples.

I’m generally interested in pushing as much functionality into the MySQL server. When using an external script, one:

Needs the right dependencies (OS, Perl/Python version, Perl/Python modules).
Needs to provide with connection params,
Needs to get acquainted with a lot of command line options,
Is limited by whatever command line options are provided.
Has to invoke that script (duh!) to get the work done.

This last bullet is not so trivial: it means you can’t work some operation with your favorite GUI client, because it has no notion of your Perl script; does not run on the same machine where your Python code resides; simply can’t run those scripts for you.

With server-side code, functionality is accessible via any client. You run your operation via a query (e.g. CALL some_procedure). That can be done from your GUI client, your command line client, your event scheduler, your cronjob, all equally. You only need access to your MySQL server, which is trivial.

Of course, server side scripting is limited. Some stuff simply can’t be written solely on server side. If you want to consult your replicating slave; gracefully take action on user’s Ctrl+C, send data over the web, you’ll have to do it with an external tool. There are actually a lot of surprising limitations to things one would assume are possible on server side. You may already know how frustrated I am by the fact one can hardly get info from SHOW commands.

But, when it works, it shines

Let’s review a couple examples. The first one is nearly trivial. The second less so. Continue reading » “common_schema over traditional scripts”

State of InnDB Online DDL in MySQL 5.6.8-RC

MySQLInnoDB New Features openark kit ReviewNovember 20, 2012November 20, 2012

5.6.8-rc is out, and so I’m following up on InnoDB’s online DDL new feature: the ability to SELECT, INSERT, DELETE, UPDATE a table even while an ALTER TABLE is executing on same table.

The brief summary

Not as advertised; many things can’t be done.

The longer review

I’m using 5.6.8-rc 64bit binary distribution for Linux, installed via mysqlsandbox. My hardware is irrelevant, but the fact I’m testing on my laptop assists me in that ALTER TABLE operations take a while, so that I’m able to easily type commands in two terminals and have the time to watch them being executed. Query cache is disabled. Continue reading » “State of InnDB Online DDL in MySQL 5.6.8-RC”

Experimenting with 5.6 InnoDB Online DDL (bugs included)

MySQLInnoDB New Features openark kit Opinions scriptsOctober 18, 2012October 18, 2012

MySQL 5.6 offers the groundbreaking online DDL operations for InnoDB. Most common use cases will enjoy this feature, and the need for online alter table scripts will decrease. This is a killer feature!

I’ve put this new feature to the usability test. How did it go? Not too well, I’m afraid.

[Updates to this text inline], also see this followup.

sakila & DDL

sakila is still a very useful database. I say “still” because it is not very large, and computing power is getting stronger; yet on my laptop some operations can still take many seconds to complete, which is just fine for my tests.

Sakila tables are mostly InnoDB, and rental being the largest, I do:

node1 (sakila) > alter table sakila.rental engine=InnoDB;
Query OK, 16044 rows affected (6.94 sec)
Records: 16044  Duplicates: 0  Warnings: 0

So what can be executed during these 6.94 seconds? In a second terminal, I try the following: Continue reading » “Experimenting with 5.6 InnoDB Online DDL (bugs included)”

How common_schema split()s tables – internals

MySQLcommon_schema Indexing INFORMATION_SCHEMA openark kit QueryScript scriptsSeptember 6, 2012September 6, 2012

This post exposes some of the internals, and the SQL behind QueryScript’s split. common_schema/QueryScript 1.1 introduces the split statement, which auto-breaks a “large” query (one which operates on large tables as a whole or without keys) into smaller queries, and executes them in sequence.

This makes for easier transactions, less locks held, potentially (depending on the user) more idle time released back to the database. split has similar concepts to oak-chunk-update and pt-archiver, but works differently, and implemented entirely in SQL on server side.

Take the following statement as example:

split (UPDATE sakila.inventory SET last_update = last_update + INTERVAL 6 HOUR)
  pass;

It yields with (roughly) the following statements:

UPDATE sakila.inventory SET last_update = last_update + INTERVAL 6 HOUR WHERE ((((`inventory`.`inventory_id` > '1')) OR ((`inventory`.`inventory_id` = '1'))) AND (((`inventory`.`inventory_id` < '1000')) OR ((`inventory`.`inventory_id` = '1000'))));
UPDATE sakila.inventory SET last_update = last_update + INTERVAL 6 HOUR WHERE ((((`inventory`.`inventory_id` > '1000'))) AND (((`inventory`.`inventory_id` < '2000')) OR ((`inventory`.`inventory_id` = '2000'))));
UPDATE sakila.inventory SET last_update = last_update + INTERVAL 6 HOUR WHERE ((((`inventory`.`inventory_id` > '2000'))) AND (((`inventory`.`inventory_id` < '3000')) OR ((`inventory`.`inventory_id` = '3000'))));
UPDATE sakila.inventory SET last_update = last_update + INTERVAL 6 HOUR WHERE ((((`inventory`.`inventory_id` > '3000'))) AND (((`inventory`.`inventory_id` < '4000')) OR ((`inventory`.`inventory_id` = '4000'))));
UPDATE sakila.inventory SET last_update = last_update + INTERVAL 6 HOUR WHERE ((((`inventory`.`inventory_id` > '4000'))) AND (((`inventory`.`inventory_id` < '4581')) OR ((`inventory`.`inventory_id` = '4581'))));

(I say “roughly” because internally there are user defined variables at play, but for convenience, I verbose the actual values as constants.)

How does that work?

common_schema works on server side. There is no Perl script or anything. It must therefore use server-side operations to:

Identify table to be split
Analyze the table in the first place, deciding how to split it
Analyze the query, deciding on how to rewrite it
Split the table (logically) into unique and distinct chunks
Work out the query on each such chunk

Following is an internal look at how common_schema does all the above. Continue reading » “How common_schema split()s tables – internals”

DELETE, don’t INSERT

MySQLcommon_schema openark kit Performance QueryScriptJune 27, 2012June 27, 2012

Have just read INSERT, Don’t DELETE by Aaron Brown, and have some lengthy response, which is why I write this post instead of commenting on said post.

I wish to offer my counter thought and suggest that DELETEs are probably the better choice.

Aaron suggests that, when one wishes to purge rows from some table, a trick can be used: instead of DELETEing unwanted rows, one can INSERT “good” rows into a new table, then switch over with RENAME (but please read referenced post for complete details).

I respectfully disagree on several points discussed.

Lockdown

The fact one needs to block writes during the time of creation of new table is problematic: you need to essentially turn off parts of your application. The posts suggests one could use a slave – but this solution is far from being trivial as well. To switch over, you yet again need to turn off access to DB, even if for a short while. Continue reading » “DELETE, don’t INSERT”

Webinar review: Zero-Downtime Schema Changes In MySQL

MySQLopenark kitMay 3, 2012

Yesterday I attended the Zero-Downtime Schema Changes In MySQL webinar by Baron Schwartz, Percona (do you say “attended” for something you listened to from your home office?)

I was keen to learn about possible enhancements and improvements of pt-online-schema-change over oak-online-alter-table. Here are my impressions:

The base logic of pt-online-schema-change is essentially the same as of oak-online-alter-table. You create a ghost/shadow table, create complex triggers, copy in chunks, freeze and swap. Both work on any type of PRIMARY KEY (oak-online-alter-table can work with any UNIQUE KEY, I’m not sure about pt-online-schema-change on this), be it an INTEGER, other type, or a multi column one.

However, pt-online-schema-change also adds the following:

It supports FOREIGN KEYs (to some extent). This is something I’ve wanted to do with oak-online-alter-table but never got around to it. Foreign keys are very tricky, as Baron noted. With child-side keys, things are reasonably manageable. With parent-side this becomes a nightmare, sometimes unsolvable (when I say “unsolvable”, I mean that under the constraint of having the operation run in a non-blocking, transparent way).
Chunk size is auto-calculated by the script. This is a cool addition. Instead of letting the user throwing out numbers like 1,000 rows per chunk, in the hope that this is neither too small nor too large, the tool monitors the time it takes a chunk to complete, then adjusts the size of next chunk accordingly. Hopefully this leads to a more optimized run, where locks are only held for very short periods, yet enough rows are being processed at a time.
The tool looks into replicating slaves to verify they’re up to the job. If the slave lags too far, the tool slows down the work. This is an excellent feature, and again, one that I always wanted to have. Great work!

So the three bullets above are what I understand to be the major advantages of Percona’s tool over oak-online-alter-table.

Q & A

The presentation itself was very good, and Baron answered some questions. There was one question he did not answer during the webinar, nor here, and I though I may pop in and answer it. Although I can’t speak for the coders of pt-online-schema-change, I safely assume that since the logic follows that of oak-online-alter-table, the same answer applies in the case of Percona’s toolkit. Continue reading » “Webinar review: Zero-Downtime Schema Changes In MySQL”

Documentation in SQL: CALL for help()

Development MySQLcommon_schema documentation mycheckpoint Open Source openark kit Stored routinesJanuary 11, 2012

Documentation is an important part of any project. On the projects I maintain I put a lot of effort on documentation, and, frankly, the majority of time spent on my projects is on documentation.

The matter of keeping the documentation faithful is a topic of interest. I’d like to outline a few documentation bundling possibilities, and the present the coming new documentation method for common_schema. I’ll talk about any bundling that is NOT man pages.

High level: web docs

This is the initial method of documentation I used for openark kit and mycheckpoint. It’s still valid for mycheckpoint. Documentation is web-based. You need Internet access to read it. It’s in HTML format.

Well, not exactly HTML format: I wrote it in WordPress. Yes, it’s HTML, but there’s a lot of noise around (theme, menus, etc.) which is not strictly part of the documentation.

While this is perhaps the easiest way to go, here’s a few drawbacks: Continue reading » “Documentation in SQL: CALL for help()”