Introducing gh-ost: triggerless online schema migrations

I’m thoroughly happy to introduce gh-ost: triggerless, controllable, auditable, testable, trusted online schema change tool released today by GitHub.

gh-ost now powers our production schema migrations. We hit some serious limitations using pt-online-schema-change on our large volume, high traffic tables, to the effect of driving our database to a near grinding halt or even to the extent of causing outages. With gh-ost, we are now able to migrate our busiest tables at any time, peak hours and heavy workloads included, without causing impact to our service.

gh-ost supports testing in production. It goes a long way to build trust, both in integrity and in control. Are your databases just too busy and you cannot run existing online-schema-change tools? Have you suffered outages due to migrations? Are you tired of babysitting migrations that run up to 3:00am? Tired of being the only one tailing logs? Please, take a look at gh-ost. I believe it changes online migration paradigm.

For a more thorough overview, please read the announcement on the GitHub Engineering Blog, and proceed to the documentation.

gh-ost is open sourced under the MIT license.

Orchestrator progress

This comes mostly to reassure, having moved into GitHub: orchestrator development continues.

I will have the privilege of working on this open source solution in GitHub. There are a few directions we can take orchestrator to, and we will be looking into the possibilities. We will continue to strengthen the crash recovery process, and in fact I’ve got a couple ideas on drastically shortening Pseudo-GTID recovery time as well as other debts. We will look into yet other directions, which we will share. My new and distinguished team will co-work on/with orchestrator and will no doubt provide useful and actionable input.

Orchestrator continues to be open for pull requests, with a temporal latency in response time (it’s the Holidays, mostly).

Some Go(lang) limitations (namely the import path, I’ll blog more about it) will most probably imply some changes to the code, which will be well communicated to existing collaborators.

Most of all, we will keep orchestrator a generic solution, while keeping focus on what we think is most important – and there’s some interesting vision here. Time will reveal as we make progress.

 

Speaking at Percona Live: common_schema, MySQL DevOps

In less than a month I’ll be giving these two talks at Percona Live:

If you are still unfamiliar with common_schema, this will make for a good introduction. I’ll give you multiple reasons why you would want to use it, and how it would come to immediate use at your company. I do mean immediate, as in previous common_schema presentations I happened to get feedback emails from attendees within the same or next day letting me know how common_schema solved an insistent problem of theirs or how it exposed an unknown status.

I’ll review some useful views & routines, and discuss the ease and power of QueryScript. common_schema is a Swiss-knife of solutions, and all from within your MySQL server.

I am using common_schema in production on a regular basis, and it happened to be hero of the day in multiple occasions. I’ll present a couple such cases.

common_schema 2.2: DBA's framework for MySQL (April 2014) from Shlomi Noach

This is a technical talk touching at some cultural issues.

At Outbrain, where I work, we have two blessings: a large group of engineers and a large dataset. We at the infrastructure team, together with the ops team, are responsible for the availability of the data. What we really like is technology which lets the owners of a problem be able to recognize it and take care of it. We want ops guys to do ops, and engineers to do engineering. And we want them to be able to talk to each other and understand each other.

What tools can you use to increase visibility? To allow sharing of data between the teams? I’ll share some tools and techniques that allow us to automate deployments, detect a malfunctioning/abusing service, deploy schema changes across dozens of hosts, control data retention, monitor connections, and more.

We like open source. The tools discussed are mostly open source, or open sourced by Outbrain.

I’ll explain why these tools matter, and how they serve the purpose of removing friction between teams, allowing for quick analysis of problems and overall visibility on all things that happen.

MySQL DevOps at Outbrain from Shlomi Noach

Do come by!

Why delegating code to MySQL Stored Routines is poor engineering practice

I happen to use stored routines with MySQL. In fact, my open source project common_schema heavily utilizes them. DBA-wise, I think they provide with a lot of power (alas, the ANSI:SQL 2003 syntax feels more like COBOL than a sane programming language, which is why I use QueryScript instead).

However I wish to discuss the use of stored routines as integral part of your application code, which I discourage.

The common discussion on whether to use or not use stored routines typically revolves around data transfer (with stored routines you transfer less data since it’s being processed on server side), security (with stored routines you can obfuscate/hide internal datasets, and provide with limited and expected API) and performance (with MySQL this is not what you would expect, as routines are interpreted & their queries re-evaluated, as opposed to other RDBMS you may be used to).

But I wish to discuss the use of stored routines from an engineering standpoint. The first couple of points I raise are cultural/behavioural.

2nd grade citizens

Your stored routines are not likely to integrate well with your IDE. While your Java/Scala/PHP/Ruby/whatnot code comfortably lies within your home directory, the stored routines live in their own space: a database container. They’re not as visible to you as your standard code. Your IDE is unaware of their existence and is unlikely to have the necessary plugin/state of mind to be able to view these.

This leads to difficulty in maintaining the code. People typically resort to using some SQL-oriented GUI tool such as MySQL Workbench, SequelPro or other, commercial tools. But these tools, while make it easy to edit your routine code, do not integrate (well?) with your source control. I can’t say I’ve used all GUI tools; but how many of them will have Git/SVN/Mercurial connectors? How many of them will keep local history changes once you edit a routine? I’m happy to get introduced to such a tool.

Even with such integration, you’re split between two IDEs. And if you’re the command line enthusiast, well, you can’t just svn ci -m “fixed my stored procedure bug”. Your code is simply not in your trunk directory.

It can be done. You could maintain the entire routine code from within your source tree, and hats off to all those who do it. Most will not. See later on about deployments for more on this. Continue reading » “Why delegating code to MySQL Stored Routines is poor engineering practice”

On MySQL plugin configuration

MySQL offers plugin API, with which you can add different types of plugins to the server. The API is roughly the same for all plugin types: you implement an init() function, a deinit(); you declare status variables and global variables associated with your plugin, and of course you implement the particular implementation of plugin call.

I wish to discuss the creation and use of global variables for plugins.

Consider the following declaration of a global variable in audit_login:

static MYSQL_SYSVAR_BOOL(enabled, plugin_enabled, PLUGIN_VAR_NOCMDARG,
"enable/disable the plugin's operation, namely writing to file", NULL, NULL, 1);

static struct st_mysql_sys_var * audit_login_sysvars[] = {
    MYSQL_SYSVAR(enabled),
    NULL
};

The above creates a new global variables called “simple_login_audit_enabled”: it is composed of the plugin name (known to be “simple_login_audit” in our example) and declared name (“enabled”). It is a boolean, defaults to 1, and is associated with the internal plugin_enabled variable.

Once this variable is declared, you can expect to be able to: Continue reading » “On MySQL plugin configuration”

Introducing audit_login: simple MySQL login logfile based auditing

audit_login is a simple MySQL login auditing plugin, logging any login or login attempt to log file in JSON format.

It seems that audit plugins are all the rage lately… We’ve developed out simple plugin a month ago as part of our database securing efforts; by auditing any login or login attempt we could either intercept or later investigate suspicious logins.

However we quickly realized there is much more to be gathered by this info.

In very short, you install this plugin onto your MySQL server, and your server starts writing into a text file called audit_login.log entries such as follows:

{"ts":"2013-09-11 09:11:47","type":"successful_login","myhost":"gromit03","thread":"74153868","user":"web_user","priv_user":"web_user","host":"web-87.localdomain","ip":"10.0.0.87"}
{"ts":"2013-09-11 09:11:55","type":"failed_login","myhost":"gromit03","thread":"74153869","user":"backup_user","priv_user":"","host":"web-32","ip":"10.0.0.32"}
{"ts":"2013-09-11 09:11:57","type":"failed_login","myhost":"gromit03","thread":"74153870","user":"backup_user","priv_user":"","host":"web-32","ip":"10.0.0.32"}
{"ts":"2013-09-11 09:12:48","type":"successful_login","myhost":"gromit03","thread":"74153871","user":"root","priv_user":"root","host":"localhost","ip":"10.0.0.111"}
{"ts":"2013-09-11 09:13:26","type":"successful_login","myhost":"gromit03","thread":"74153872","user":"web_user","priv_user":"web_user","host":"web-11.localdomain","ip":"10.0.0.11"}
{"ts":"2013-09-11 09:13:44","type":"successful_login","myhost":"gromit03","thread":"74153873","user":"web_user","priv_user":"web_user","host":"web-40.localdomain","ip":"10.0.0.40"}
{"ts":"2013-09-11 09:13:51","type":"successful_login","myhost":"gromit03","thread":"74153874","user":"web_user","priv_user":"web_user","host":"web-03.localdomain","ip":"10.0.0.03"}
{"ts":"2013-09-11 09:14:09","type":"successful_login","myhost":"gromit03","thread":"74153875","user":"web_user","priv_user":"web_user","host":"web-40.localdomain","ip":"10.0.0.40"}
{"ts":"2013-09-11 10:55:25","type":"successful_login","myhost":"gromit03","thread":"74153876","user":"web_user","priv_user":"web_user","host":"web-87.localdomain","ip":"10.0.0.87"}
{"ts":"2013-09-11 10:55:59","type":"successful_login","myhost":"gromit03","thread":"74153877","user":"web_user","priv_user":"web_user","host":"web-12.localdomain","ip":"10.0.0.12"}
{"ts":"2013-09-11 10:55:59","type":"failed_login","myhost":"gromit03","thread":"74153878","user":"(null)","priv_user":"(null)","host":"(null)","ip":"10.0.0.1"}

In the above your MySQL server is on gromit03, and is accepting connections from other hosts; some successful, some not. What kind of information can you gather from the above?

  • You can tell how many connections are being created on your server
  • Where they came from
  • Where ‘root’ connections come from
  • Port scans (see last row) can be identified by no credentials. These don’t have to be port scans per se; any telnet localhost 3006 followed by Ctrl+D will show the same. Typically these would be either load balancer or monitoring tools checks to see that the 3306 port is active.
  • You can tell which accounts connect, and how many times
  • And you can infer which accounts are stale and can be dropped — if an account does not connect within a week’s time, it’s probably stale (pick your own timeframe)

The above is quite interesting on one host; but we have dozens. We’ve installed this plugin on all our MySQL servers, and we use logstash to aggregate them. We aggregate to two destinations: Continue reading » “Introducing audit_login: simple MySQL login logfile based auditing”

common_schema 2.2: better QueryScript isolation & cleanup; TokuDB; table_rotate, split params

common_schema 2.2 is released. This is shortly after the 2.1 release; it was only meant as bug fixes release but some interesting things came up, leading to new functionality.

Highlights of the 2.2 release:

  • Better QueryScript isolation & cleanup: isolation improved across replication topology, cleanup done even on error
  • Added TokuDB related views
  • split with “index” hint (Ike, this is for you)
  • table_rotate(): a logrotate-like mechanism for tables
  • better throw()

Drill down:

Better QueryScript isolation & cleanup

common_schema 2.1 introduced persistent tables for QueryScript. This also introduced the problem of isolating concurrent scripts, all reading from and writing to shared tables. In 2.1 isolation was based on session id. However although unique per machine, collisions were possible across replication topology: a script could be issued on master, another on slave (I have such use cases) and both use same (local) session id.

With 2.2 isolation is based on server_id & session id combination; this is unique across a replication topology.

Until 2.1, QueryScript used temporary tables. This meant any error would just break the script, and the tables were left (isolated as they were, and auto-destroyed in time). With persistent tables a script throwing an error meant legacy code piling up. With common_schema 2.2 and on MySQL >= 5.5 all exceptions are caught, cleanup is made, leaving exceptions to be RESIGNALled.

TokuDB views

A couple TokuDB related views help out in converting to TokuDB and in figuring out tables status on disk: Continue reading » “common_schema 2.2: better QueryScript isolation & cleanup; TokuDB; table_rotate, split params”

common_schema roadmap thoughts

I’m happy with common_schema; it is in fact a tool I use myself on an almost daily basis. I’m also happy to see that it gains traction; which is why I’m exposing a little bit of my thoughts on general future development. I’d love to get feedback.

Supported versions

At this moment, common_schema supports MySQL >= 5.1, all variants. This includes 5.5, 5.6, MySQL, Percona Server & MariaDB.

5.1 is today past end of line, and I’m really missing the SIGNAL/RESIGNAL syntax that I would like to use; I can do in the meanwhile with version-specific code such as /*!50500 … */. Nevertheless, I’m wondering whether I will eventually have to:

  • Support different branches of common_schema (one that supports 5.1, one that supports >= 5.5)
  • Stop support for 5.1

Of course community-wise, the former is preferred; but I have limited resources, so I would like to make a quick poll here:

[poll id=”3″]

I’ll use the poll’s results as a vague idea of what people use and want. Or please use comments below to sound your voice!

rdebug

This was a crazy jump at providing a stored routine debugger and debugging API. From some talk I made I don’t see this getting traction. For the time being, I don’t see that I will concentrate my efforts on this. Actually it is almost complete. You can step-into, step-out, step-over, set breakpoints, read variables, modify variables — it’s pretty cool. Continue reading » “common_schema roadmap thoughts”

common_schema 2.1 released: advanced & improved split(), persistent script tables, more schema analysis, and (ahem) charts!

common_schema 2.1 is released! common_schema is your free & open source companion schema within your MySQL server, providing with a function library, scripting capabilities, powerful routines and ready-to-apply information and recommendations.

New and noteworthy in version 2.1:

  • Better QueryScript’s split() functionality
  • Persistent tables for QueryScript: no long held temporary tables
  • Index creation analysis, further range partition analysis
  • grant_access(): allow everyone to use common_schema
  • Ascii charts, google charts
  • debugged_routines: show routines with debug code

Other minor enhancements and bugfixes not listed.

Here’s a breakdown of the above:

split() enhancements

split is one of those parts of common_schema that (should) appeal to every DBA. Break a huge transaction automagically into smaller chunks, and don’t worry about how it’s done. If you like, throttle execution, or print progress, or…

split enhancements include:

  • A much better auto-detection-and-selection of the chunking index. split now consults all columns covered by the index, and uses realistic heuristics to decide which UNIQUE KEY on your table is best for the chunking process. A couple bugs are solved on the way; split is much smarter now.
  • Better support for multi-column chunking keys. You may now utilize the start/stop parameters even on multi column keys, passing a comma delimited of values for the split operation to start/end with, respectively. Also fixed issue for nonexistent start/stop values, which are now valid: split will just keep to the given range.
  • split no longer requires a temporary table open through the duration of its operation. See next section. Continue reading » “common_schema 2.1 released: advanced & improved split(), persistent script tables, more schema analysis, and (ahem) charts!”

mycheckpoint revision 231 released

A new release for mycheckpoint: lightweight, SQL oriented MySQL monitoring solution.

If you’re unfamiliar with mycheckpoint, well, the one minute sales pitch is: it’s a free and open source monitoring tool for MySQL, which is extremely easy to install and execute, and which includes custom queries, alerts (via emails), and out of the box HTTP server and charting.

This is mostly a maintenance release, with some long-time requested features, and of course solved bugs. Here are a few highlights:

  • Supports MariaDB and MySQL 5.6 (issues with new variables, space padded variables, text-valued variables)
  • Supports alerts via function invocation on monitored host (so not only checking alerts via aggregated data like ‘Seconds_behind_master’ but also by SELECT my_sanity_check_function() on monitored instance). See alerts.
  • Supports single-running-instance via “–single” command line argument
  • Supports strict sql_mode, including ONLY_FULL_GROUP_BY, overcoming bug #69310.
  • Supports sending of pending email HTML report
  • Better re-deployment process
  • Better recognizing of SIGNED/UNSIGNED values
  • Some other improvements in charting, etc.

mycheckpoint is released under the BSD license.

Downloads are available from the project’s page.