Development – code.openark.org
Blog by Shlomi Noach

Introducing gh-ost: triggerless online schema migrations
https://shlomi-noach.github.io/blog/mysql/introducing-gh-ost-triggerless-online-schema-migrations
Mon, 01 Aug 2016

I’m thoroughly happy to introduce gh-ost: a triggerless, controllable, auditable, testable, trusted online schema change tool, released today by GitHub.

gh-ost now powers our production schema migrations. We hit some serious limitations using pt-online-schema-change on our large-volume, high-traffic tables, to the effect of driving our database to a near grinding halt, or even causing outages. With gh-ost, we are now able to migrate our busiest tables at any time, peak hours and heavy workloads included, without impacting our service.

gh-ost supports testing in production. This goes a long way toward building trust, both in integrity and in control. Are your databases just too busy to run existing online-schema-change tools? Have you suffered outages due to migrations? Are you tired of babysitting migrations that run up to 3:00am? Tired of being the only one tailing logs? Please take a look at gh-ost. I believe it changes the online migration paradigm.

For a more thorough overview, please read the announcement on the GitHub Engineering Blog, and proceed to the documentation.

gh-ost is open sourced under the MIT license.

Orchestrator progress
https://shlomi-noach.github.io/blog/mysql/orchestrator-progress
Wed, 23 Dec 2015

This comes mostly to reassure, now that I have moved to GitHub: orchestrator development continues.

I will have the privilege of working on this open source solution at GitHub. There are a few directions we can take orchestrator in, and we will be looking into the possibilities. We will continue to strengthen the crash recovery process; in fact, I’ve got a couple of ideas on drastically shortening Pseudo-GTID recovery time, as well as on other technical debts. We will look into yet other directions, which we will share. My new and distinguished team will work on and with orchestrator, and will no doubt provide useful and actionable input.

Orchestrator continues to be open for pull requests, albeit with some temporary latency in response time (it’s the holidays, mostly).

Some Go(lang) limitations (namely the import path; I’ll blog more about this) will most probably imply some changes to the code, which will be well communicated to existing collaborators.

Most of all, we will keep orchestrator a generic solution, while keeping focus on what we think is most important – and there’s some interesting vision here. Time will tell as we make progress.

 

Speaking at Percona Live: common_schema, MySQL DevOps
https://shlomi-noach.github.io/blog/mysql/speaking-at-perconalive-common_schema-mysql-devops
Mon, 10 Mar 2014

In less than a month I’ll be giving these two talks at Percona Live:

If you are still unfamiliar with common_schema, this will make for a good introduction. I’ll give you multiple reasons why you would want to use it, and show how it can come to immediate use at your company. I do mean immediate: at previous common_schema presentations I happened to get feedback emails from attendees within the same or next day, letting me know how common_schema solved a persistent problem of theirs or how it exposed a previously unknown status.

I’ll review some useful views & routines, and discuss the ease and power of QueryScript. common_schema is a Swiss Army knife of solutions, all from within your MySQL server.

I am using common_schema in production on a regular basis, and it has happened to be the hero of the day on multiple occasions. I’ll present a couple of such cases.

This is a technical talk touching on some cultural issues.

At Outbrain, where I work, we have two blessings: a large group of engineers and a large dataset. We in the infrastructure team, together with the ops team, are responsible for the availability of the data. What we really like is technology that lets the owners of a problem recognize it and take care of it. We want ops guys to do ops, and engineers to do engineering. And we want them to be able to talk to each other and understand each other.

What tools can you use to increase visibility? To allow sharing of data between the teams? I’ll share some tools and techniques that allow us to automate deployments, detect a malfunctioning/abusing service, deploy schema changes across dozens of hosts, control data retention, monitor connections, and more.

We like open source. The tools discussed are mostly open source, or open sourced by Outbrain.

I’ll explain why these tools matter, and how they serve the purpose of removing friction between teams, allowing for quick analysis of problems and overall visibility into everything that happens.

Do come by!

Why delegating code to MySQL Stored Routines is poor engineering practice
https://shlomi-noach.github.io/blog/mysql/why-delegating-code-to-mysql-stored-routines-is-poor-engineering-practice
Thu, 06 Feb 2014

I happen to use stored routines with MySQL. In fact, my open source project common_schema heavily utilizes them. DBA-wise, I think they provide a lot of power (alas, the SQL:2003 syntax feels more like COBOL than a sane programming language, which is why I use QueryScript instead).

However, I wish to discuss the use of stored routines as an integral part of your application code, a practice which I discourage.

The common discussion on whether or not to use stored routines typically revolves around data transfer (with stored routines you transfer less data, since it is processed server-side), security (with stored routines you can obfuscate/hide internal datasets, and provide a limited, well-defined API) and performance (with MySQL this is not what you would expect: routines are interpreted and their queries re-evaluated, as opposed to other RDBMSs you may be used to).

But I wish to discuss the use of stored routines from an engineering standpoint. The first couple of points I raise are cultural/behavioural.

Second-class citizens

Your stored routines are not likely to integrate well with your IDE. While your Java/Scala/PHP/Ruby/whatnot code comfortably lies within your home directory, the stored routines live in their own space: a database container. They’re not as visible to you as your standard code is. Your IDE is unaware of their existence and is unlikely to have the necessary plugin, or state of mind, to view them.

This leads to difficulty in maintaining the code. People typically resort to using some SQL-oriented GUI tool such as MySQL Workbench, Sequel Pro or other, commercial, tools. But these tools, while they make it easy to edit your routine code, do not integrate (well?) with your source control. I can’t say I’ve used all GUI tools, but how many of them have Git/SVN/Mercurial connectors? How many of them keep a local history of changes once you edit a routine? I’m happy to get introduced to such a tool.

Even with such integration, you’re split between two IDEs. And if you’re the command line enthusiast, well, you can’t just svn ci -m “fixed my stored procedure bug”. Your code is simply not in your trunk directory.

It can be done. You could maintain the entire routine code from within your source tree, and hats off to all those who do it. Most will not. See later on about deployments for more on this.

Testing

While engineers are keen on writing unit tests for every class and method they create, they are less keen on doing the same for stored routines. This is an observation, having seen many installations. And I can tell you why: your stored routine testing will not integrate well with your JUnit/PHPUnit/…

There are testing frameworks for databases, and indeed I hacked my own mini unit-testing code with common_schema. But it’s a different testing framework. You might also have realized by now that testing databases is somewhat different. It can be done, and hats off again to those who implement it as common practice. Many don’t. Databases are often more heavyweight to test. Not all operations done by routines are easily rolled back, which leads to having to rebuild the entire dataset before tests. This in itself leads to longer test periods and a need for multiple test databases, so as to allow for concurrent builds.

How many companies practice both version control and unit testing over their routine code? I believe not many (and I am happy to hear about those who do). To be more direct: of all the companies I have ever consulted for, I have never seen one that does both.

Debugging

MySQL stored routines do not have built-in debugging capabilities. To debug your routines, you will have to use one of two methods:

  • Simulate your routine code (i.e. mimic its execution on top of some interpreter). There are tools that do that. For me this is a complete NO GO and utterly untrustworthy. You can mimic what you think is how the routine should behave, but never the full behaviour. While developing common_schema I came upon plenty of weird behaviour, some of it bugs, that you just can’t build into your emulation.
  • Inject debugging code into your routine code. I do that with RDebug. You can do breakpoints, step-into, step-out, most of the interesting stuff. Other tools do that as well. It is not the right way to go: you’re essentially modifying your code, placing more locks, communicating, and losing some functionality. It is a necessary-evil solution for a necessary-evil programming method… How good can that be?

The right way to go would be to have debugging API built into the MySQL server.

But wait: that would still be next to worthless, since our discussion is about programming with stored routines, that is, letting your application call upon stored routines in your database. Until the day when I can use my IntelliJ debugger to step from a Java method which calls upon a stored procedure into the stored procedure itself, debugging your application code is completely detached from your stored routine debugging.

Refactoring & deploying

Say you wanted to add a column to your table: you would go ahead and add it, and perhaps populate it. You would then modify your application code to support this new column, and deploy. Say you wanted to drop a table column. You would first deploy changes to your application code that ignore said column, and once the code is in place you would go and actually make the DROP.

How do you do the same with a stored routine? Suppose your routine accepts two parameters, and you wish to add a third?

There is no support for optional parameters. Your routine either accepts two parameters or three. Your application code has to provide the exact number of parameters. You would have to deploy both your SQL changes and your application changes at the same time; this is by definition impossible, unless you are OK with a stop-the-world approach, which is unlikely in production.

Code constraints

One solution to the above is to create a new routine, somehow “overloading” the old one. But you can’t overload a stored routine; you’ll have to create a routine by a new name. This will allow you to slowly and smoothly migrate between the two.

Ahem, smoothly? How easy is it to find all invocations of a certain routine in your code? The call will typically lie in some string, or within some XML config file. There is no safe “find references to this procedure” IDE mechanism. There is no constraint in your IDE that will tell you “there is no such procedure” if you misspell the name.

Trash bin

Suppose you overcame the above. You now have two routines. You need to remember to DROP the old one, right? Will you?

When presenting common_schema, a common question I ask the audience is as follows:

Suppose I accessed your database and listed the entire set of stored functions and procedures. How many of them are you not even sure are in use anymore? How many of them do you think you could DROP, but are too afraid to, and keep around just in case?

I wouldn’t commonly ask that question had it not always provoked nodding and smiling in the audience. People forget to drop their routines, then forget about them, and are never sure whether they are used (your IDE doesn’t easily tell you that, remember? Sure, you can grep around; that’s not what most engineers would do). And those routines pile up to become trash.
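The grep-around approach can at least be automated. Below is a rough Python sketch (the helper and routine names are hypothetical, not part of common_schema): given a list of routine names pulled from information_schema.ROUTINES and a source tree, it reports the routines never mentioned anywhere in the code. These are candidates for review, not for blind DROPping.

```python
import os
import re

def find_unreferenced_routines(routine_names, source_root):
    """Return the subset of routine_names never mentioned in any file
    under source_root: candidates for review before a DROP."""
    referenced = set()
    for dirpath, _, filenames in os.walk(source_root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                with open(path, encoding="utf-8", errors="ignore") as f:
                    text = f.read()
            except OSError:
                continue
            for name in routine_names:
                # word-boundary match catches CALLs embedded in strings
                # and XML config files alike
                if re.search(r"\b%s\b" % re.escape(name), text):
                    referenced.add(name)
    return sorted(set(routine_names) - referenced)
```

It is only as good as its text match, of course: routines invoked via dynamically built SQL will still slip through.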

Data or code?

Last but not least: a stored routine is a piece of code, right? Well, as far as the database is concerned, it’s really a piece of data. It’s located within a schema. It’s stored. It is an integral part of your dataset: when you back up your data, you’re most likely to back up the code as well. When you restore, you’re likely to restore both. There are obvious advantages to that, DB-wise. Or should I say, DBA-wise. Engineering-wise? Does a database-restore operation count as code deployment? We can argue over beer.

Final notes

Having said all that: yes, I’m using an occasional stored routine. I see these occasions as a necessary evil, and sometimes it’s just the correct solution.

I’m happy to hear what methods have been developed out there to overcome the above, so please share; and please feel free to contradict the above.

On MySQL plugin configuration
https://shlomi-noach.github.io/blog/mysql/on-mysql-plugin-configuration
Tue, 01 Oct 2013

MySQL offers a plugin API, with which you can add different types of plugins to the server. The API is roughly the same for all plugin types: you implement an init() function and a deinit() function; you declare status variables and global variables associated with your plugin; and of course you implement the particular plugin functionality itself.

I wish to discuss the creation and use of global variables for plugins.

Consider the following declaration of a global variable in audit_login:

static MYSQL_SYSVAR_BOOL(enabled, plugin_enabled, PLUGIN_VAR_NOCMDARG,
"enable/disable the plugin's operation, namely writing to file", NULL, NULL, 1);

static struct st_mysql_sys_var * audit_login_sysvars[] = {
    MYSQL_SYSVAR(enabled),
    NULL
};

The above creates a new global variable called “simple_login_audit_enabled”: it is composed of the plugin name (known to be “simple_login_audit” in our example) and the declared name (“enabled”). It is a boolean, defaults to 1, and is associated with the internal plugin_enabled variable.

Once this variable is declared, you can expect to be able to:

mysql> show global variables like '%audit%';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| simple_login_audit_enabled | ON    |
+----------------------------+-------+

mysql> set global simple_login_audit_enabled := 0;
Query OK, 0 rows affected (0.00 sec)

and you can expect to use the following in your my.cnf file:

[mysqld]
...
simple_login_audit_enabled=1

Assuming your server agrees to start, that is.

Here’s the catch: the simple_login_audit_enabled variable is only recognized as long as the plugin is installed. As you may know, plugins can be loaded upon startup time using an explicit my.cnf entry such as:

plugin_load=audit_login.so

But you may also, at any given time, INSTALL or UNINSTALL the plugin dynamically.

install plugin SIMPLE_LOGIN_AUDIT soname 'audit_login.so';
uninstall plugin SIMPLE_LOGIN_AUDIT;

In fact there are good reasons to do so: you may be upgrading your plugin. You can’t just throw in the new binary (that’s a guaranteed crash on next server shutdown). You need to first UNINSTALL it; you then put the new binary in place, and re-INSTALL. This works well, and the price is some downtime for your plugin.

But what happens if you restart the server while your plugin is uninstalled? Yep: the global variable is unrecognised, and your MySQL server refuses to start:

130919  8:11:30 [ERROR] /usr/bin/mysqld: unknown variable 'simple_login_audit_enabled=1'
130919  8:11:30 [ERROR] Aborting
130919  8:11:30  InnoDB: Starting shutdown...
130919  8:11:31  InnoDB: Shutdown completed; log sequence number 40185651
130919  8:11:31 [Note] /usr/bin/mysqld: Shutdown complete

I did happen upon this case a couple of times; it is frustrating.

What are the alternatives?

So adding variables to my.cnf may prevent MySQL from starting. In my dictionary, this spells “NO GO”.

With audit_login I chose to (additionally) support an external config file, audit_login.cnf, expected to be found in the @@datadir. It is similar in essence to the master.info file expected by replication. The plugin reads this file (if it exists) upon init(), which makes it execute upon server startup or upon INSTALL PLUGIN. I can’t argue that this is the best solution, but it is a solution that does not interfere with anyone. The file is ignored by all and does not disturb the public peace. The plugin does not require it to exist.
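The plugin itself is written in C, and the post does not spell out the config file’s exact format, so the key=value syntax below is an assumption. Still, the init()-time logic is easy to sketch in Python: read the file if it exists, and carry on quietly if it does not.

```python
import os

def read_plugin_config(datadir, filename="audit_login.cnf"):
    """Parse simple key=value lines from an optional config file in the
    datadir. A missing file is fine (the plugin does not require it),
    and comments / malformed lines are skipped rather than fatal."""
    config = {}
    path = os.path.join(datadir, filename)
    if not os.path.exists(path):
        return config
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            config[key.strip()] = value.strip()
    return config
```

The important property is the contrast with my.cnf: an absent or unknown entry here can never prevent the server from starting.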

I was hoping to be able to directly read from my.cnf, but am unsure if there is a definitive way to do so from within the plugin. I did not go deep into this.

What would be best?

Hopefully I’m not missing anything. But it would be nice to have plugin-dedicated variables in my.cnf which are ignored by the server. These could take the form of:

[mysql_plugin]
simple_login_audit_enabled=0

or similar. It would be the server’s responsibility to pass these declarations to the plugins, but ignore them itself (or just issue a warning).

Introducing audit_login: simple MySQL login logfile based auditing
https://shlomi-noach.github.io/blog/mysql/introducing-audit_login-simple-mysql-login-logfile-based-auditing
Tue, 17 Sep 2013

audit_login is a simple MySQL login auditing plugin, logging any login or login attempt to a log file in JSON format.

It seems that audit plugins are all the rage lately… We developed our simple plugin a month ago as part of our database-securing efforts; by auditing any login or login attempt we can either intercept, or later investigate, suspicious logins.

However, we quickly realized there is much more to be gathered from this info.

In very short: you install this plugin onto your MySQL server, and your server starts writing into a text file called audit_login.log entries such as the following:

{"ts":"2013-09-11 09:11:47","type":"successful_login","myhost":"gromit03","thread":"74153868","user":"web_user","priv_user":"web_user","host":"web-87.localdomain","ip":"10.0.0.87"}
{"ts":"2013-09-11 09:11:55","type":"failed_login","myhost":"gromit03","thread":"74153869","user":"backup_user","priv_user":"","host":"web-32","ip":"10.0.0.32"}
{"ts":"2013-09-11 09:11:57","type":"failed_login","myhost":"gromit03","thread":"74153870","user":"backup_user","priv_user":"","host":"web-32","ip":"10.0.0.32"}
{"ts":"2013-09-11 09:12:48","type":"successful_login","myhost":"gromit03","thread":"74153871","user":"root","priv_user":"root","host":"localhost","ip":"10.0.0.111"}
{"ts":"2013-09-11 09:13:26","type":"successful_login","myhost":"gromit03","thread":"74153872","user":"web_user","priv_user":"web_user","host":"web-11.localdomain","ip":"10.0.0.11"}
{"ts":"2013-09-11 09:13:44","type":"successful_login","myhost":"gromit03","thread":"74153873","user":"web_user","priv_user":"web_user","host":"web-40.localdomain","ip":"10.0.0.40"}
{"ts":"2013-09-11 09:13:51","type":"successful_login","myhost":"gromit03","thread":"74153874","user":"web_user","priv_user":"web_user","host":"web-03.localdomain","ip":"10.0.0.03"}
{"ts":"2013-09-11 09:14:09","type":"successful_login","myhost":"gromit03","thread":"74153875","user":"web_user","priv_user":"web_user","host":"web-40.localdomain","ip":"10.0.0.40"}
{"ts":"2013-09-11 10:55:25","type":"successful_login","myhost":"gromit03","thread":"74153876","user":"web_user","priv_user":"web_user","host":"web-87.localdomain","ip":"10.0.0.87"}
{"ts":"2013-09-11 10:55:59","type":"successful_login","myhost":"gromit03","thread":"74153877","user":"web_user","priv_user":"web_user","host":"web-12.localdomain","ip":"10.0.0.12"}
{"ts":"2013-09-11 10:55:59","type":"failed_login","myhost":"gromit03","thread":"74153878","user":"(null)","priv_user":"(null)","host":"(null)","ip":"10.0.0.1"}

In the above your MySQL server is on gromit03, and is accepting connections from other hosts; some successful, some not. What kind of information can you gather from the above?

  • You can tell how many connections are being created on your server
  • Where they came from
  • Where ‘root’ connections come from
  • Port scans (see last row) can be identified by missing credentials. These don’t have to be port scans per se; any telnet localhost 3306 followed by Ctrl+D will show the same. Typically these would be load balancer or monitoring tool checks verifying that port 3306 is active.
  • You can tell which accounts connect, and how many times
  • And you can infer which accounts are stale and can be dropped — if an account does not connect within a week’s time, it’s probably stale (pick your own timeframe)
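Since each log entry is a single JSON document per line, most of the questions above reduce to a few lines of scripting. Here is a sketch (a hypothetical helper, not part of the plugin) that aggregates per-user connection counts, failed attempts, and credential-less connections of the port-scan flavour:

```python
import json
from collections import Counter

def summarize_logins(log_lines):
    """Aggregate audit_login JSON entries: connection counts per user,
    number of failed attempts, and credential-less connections."""
    per_user = Counter()
    failed = 0
    no_credentials = 0
    for line in log_lines:
        entry = json.loads(line)
        per_user[entry["user"]] += 1
        if entry["type"] == "failed_login":
            failed += 1
        if entry["user"] == "(null)":
            # no credentials presented at all: port scan, telnet probe,
            # or a load balancer / monitoring port check
            no_credentials += 1
    return per_user, failed, no_credentials
```

Fed a week of logs, per_user also answers the stale-account question: any account absent from the counter is a DROP candidate.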

The above is quite interesting on one host; but we have dozens. We’ve installed this plugin on all our MySQL servers, and we use logstash to aggregate them. We aggregate to two destinations:

  • All logs are being sent over to kibana, where they are easily searchable via lucene queries
  • They are also all aggregated into one CSV formatted logfile, rotating daily; this format makes it easier for me to grep and uniq -c and generally produce very valuable data by scripting.

By aggregating the logs from multiple hosts, we are able to gather such insight as:

  • How well our load balancers are distributing connections
  • How our various applications servers connect on our multiple data centres to our MySQL hosts
  • Catching misbehaving apps with hardcoded server IPs
  • Catching naughty developers logging into our masters directly (we have more mechanisms now to prevent and audit such cases)
  • Identifying differences in behaviour of monitoring tools across our data centres
  • etc.

Free to use

The audit_login plugin has been made free to use, open source, licensed under GNU GPL v2. It is publicly available in our GitHub repository, where more documentation on building and installation can be found.

As MySQL plugins go, you must compile the plugin against your particular MySQL version. For your convenience, pre-compiled binaries are available in the repository (at this time for 5.5.32 and 5.5.21, Linux 64-bit; more are likely to be added).

common_schema 2.2: better QueryScript isolation & cleanup; TokuDB; table_rotate, split params
https://shlomi-noach.github.io/blog/mysql/common_schema-2-2-better-queryscript-isolation-tokudb-table_rotate-split-params
Tue, 13 Aug 2013

common_schema 2.2 is released. This comes shortly after the 2.1 release; it was only meant as a bug-fix release, but some interesting things came up, leading to new functionality.

Highlights of the 2.2 release:

  • Better QueryScript isolation & cleanup: isolation improved across replication topology, cleanup done even on error
  • Added TokuDB related views
  • split with “index” hint (Ike, this is for you)
  • table_rotate(): a logrotate-like mechanism for tables
  • better throw()

Drill down:

Better QueryScript isolation & cleanup

common_schema 2.1 introduced persistent tables for QueryScript. This also introduced the problem of isolating concurrent scripts, all reading from and writing to shared tables. In 2.1, isolation was based on session id. However, although unique per machine, session ids could collide across a replication topology: a script could be issued on the master, another on a slave (I have such use cases), and both could use the same (local) session id.

With 2.2, isolation is based on the combination of server_id and session id; this is unique across a replication topology.

Until 2.1, QueryScript used temporary tables. This meant any error would just break the script, and the tables would be left behind (isolated as they were, and auto-destroyed in time). With persistent tables, a script throwing an error meant leftover data piling up. With common_schema 2.2, on MySQL >= 5.5, all exceptions are caught and cleanup is performed, leaving the exceptions to be RESIGNALled.
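The collision scenario, and why qualifying the session id with the server_id fixes it, can be shown in a few lines. This is a Python illustration with made-up ids; the actual implementation lives in QueryScript/SQL:

```python
# Two concurrent scripts: one on the master, one on a slave.
# Both happen to get the same local session id (hypothetical values).
master = {"server_id": 101, "session_id": 12}
slave = {"server_id": 102, "session_id": 12}

# 2.1-style isolation key: session id only -- the two scripts collide.
assert master["session_id"] == slave["session_id"]

# 2.2-style isolation key: (server_id, session_id). server_id must be
# unique within a replication topology, so the combination cannot collide.
def isolation_key(script):
    return (script["server_id"], script["session_id"])

assert isolation_key(master) != isolation_key(slave)
```

The same composite-key trick applies whenever per-machine identifiers share a table across a replication topology.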

TokuDB views

A couple TokuDB related views help out in converting to TokuDB and in figuring out tables status on disk:

  • sql_alter_table_tokudb helps you generate the complex ALTER statement to the TokuDB engine if you happened to use COMPRESSED InnoDB tables with KEY_BLOCK_SIZE specified. The view generates a complex DROP KEYs & ADD KEYs statement; this is due to bug …
  • tokudb_file_map simplifies the INFORMATION_SCHEMA.Tokudb_file_map table: the original table is not normalized and is difficult to interpret and follow when your table has many indexes or is partitioned (I will write more on this shortly). With common_schema’s tokudb_file_map you get, per table, the list of files representing that table, along with a couple of shell commands to tell you the thing you want to know most: “what is the size of my TokuDB table on disk?”

split

QueryScript’s split device now supports the “index” parameter (or hint), which instructs the split() operation to use an explicitly named index. If used, the index must exist and must be UNIQUE.

table_rotate()

Rotate your tables a-la logrotate with table_rotate(): generate a new, identical, empty table; version your table, pushing older versions along the line; optionally drop older versions. You get the picture. There’s a nice use case behind this, involving cleaning up a test database.
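A minimal sketch of the rotation idea: generating the DROP/RENAME/CREATE statements for a table and its versioned copies. The naming scheme (t__1, t__2, …) and the exact semantics are illustrative, not necessarily common_schema’s actual convention.

```python
def rotate_statements(table, keep_versions=3):
    """Logrotate-like rotation for a table: drop the oldest version,
    shift every version one slot along, then recreate an empty table
    with the original structure."""
    statements = ["DROP TABLE IF EXISTS %s__%d" % (table, keep_versions)]
    renames = []
    for v in range(keep_versions, 1, -1):
        renames.append("%s__%d TO %s__%d" % (table, v - 1, table, v))
    renames.append("%s TO %s__1" % (table, table))
    # RENAME TABLE accepts multiple renames and performs them atomically
    statements.append("RENAME TABLE " + ", ".join(renames))
    # recreate an empty table with the structure of the newest version
    statements.append("CREATE TABLE %s LIKE %s__1" % (table, table))
    return statements
```

The atomic multi-table RENAME is the detail that makes this style of rotation reasonably safe for concurrent readers.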

throw()

On MySQL >= 5.5 throw() uses SIGNAL. No more weird “table `Unknown column ‘$t’ in ‘field list’` does not exist” messages. Just plain old:

ERROR 1054 (42S22): Unknown column '$t' in 'field list'

Get it

common_schema is free and open source. It is licensed under GPL v2. This is where you can find and download the latest common_schema release.

Your input is welcome! Please submit your bugs, or otherwise share your experience with common_schema.

common_schema roadmap thoughts
https://shlomi-noach.github.io/blog/mysql/common_schema-roadmap-thoughts
Mon, 22 Jul 2013

I’m happy with common_schema; it is in fact a tool I use myself on an almost daily basis. I’m also happy to see that it gains traction, which is why I’m exposing a little bit of my thoughts on general future development. I’d love to get feedback.

Supported versions

At this moment, common_schema supports MySQL >= 5.1, all variants. This includes 5.5, 5.6, MySQL, Percona Server & MariaDB.

5.1 is today past its end of life, and I really miss the SIGNAL/RESIGNAL syntax that I would like to use; I can make do in the meantime with version-specific code such as /*!50500 … */. Nevertheless, I’m wondering whether I will eventually have to:

  • Support different branches of common_schema (one that supports 5.1, one that supports >= 5.5)
  • Stop support for 5.1

Of course community-wise, the former is preferred; but I have limited resources, so I would like to make a quick poll here:


I’ll use the poll’s results as a vague idea of what people use and want. Or please use comments below to sound your voice!

rdebug

This was a crazy jump at providing a stored routine debugger and debugging API. From the talks I’ve given, I don’t see this getting traction. For the time being, I will not concentrate my efforts on this. Actually, it is almost complete. You can step-into, step-out, step-over, set breakpoints, read variables, modify variables — it’s pretty cool.

But someone will eventually have to write a GUI front-end for this (eclipse/IntelliJ/whatever); I know not many will use a command line approach for a debugger. I also know I’m not going to write the GUI front-end. So the API is there, let’s see how it rolls.

QueryScript

I will keep on improving QueryScript, and in particular split, error handling, and otherwise simplification of common tasks. I have no doubt QueryScript goes the right way: I just see how easy it is to solve complex problems with a QueryScript one-liner. Other bullets on my TODO for QueryScript:

  • Script tracking (a semi-debugging mechanism, which allows one to recognize status of script)
  • Message passing to scripts (again, a semi-debugger approach)
  • Error recovery; ability to resume script from point of failure or point of suspension. I have plenty use cases for that.

performance_schema

I will most probably include parts of Mark Leith’s ps_helper, which is released under a permissive license, and otherwise draw ideas from his work. I’m happy to learn parts of ps_helper were influenced by common_schema itself.

Hosting

common_schema will most probably move out of Google Code; by Jan 2014 there will no longer be a “Downloads” section there, and I really, really want there to be a “Downloads” section.

I could go LaunchPad, GitHub, BitBucket (they don’t have “Downloads” sections, either, do they?), other; any advice?

World domination

Yep. This is still common_schema‘s goal. More seriously, I would want to see it installed on every single MySQL server instance. Then I would control your fate. bwahahaha!

Even more seriously, if you are a happy user, please do pass the word. I can only blog so much and present so much; there are no financing resources for this project, and I need all the help I can get in promoting common_schema. Thank you!

common_schema 2.1 released: advanced & improved split(), persistent script tables, more schema analysis, and (ahem) charts!
https://shlomi-noach.github.io/blog/mysql/common_schema-2-1-released-advanced-improved-split-persistent-script-tables-more-schema-analysis-and-ahem-charts
Wed, 17 Jul 2013

common_schema 2.1 is released! common_schema is your free & open source companion schema within your MySQL server, providing a function library, scripting capabilities, powerful routines, and ready-to-apply information and recommendations.

New and noteworthy in version 2.1:

  • Improved QueryScript split() functionality
  • Persistent tables for QueryScript: no long-held temporary tables
  • Index creation analysis, further range partition analysis
  • grant_access(): allow everyone to use common_schema
  • Ascii charts, google charts
  • debugged_routines: show routines with debug code

Other minor enhancements and bugfixes are not listed here.

Here’s a breakdown of the above:

split() enhancements

split is one of those parts of common_schema that (should) appeal to every DBA. Break a huge transaction automagically into smaller chunks, and don’t worry about how it’s done. If you like, throttle execution, or print progress, or…

split enhancements include:

  • A much better auto-detection-and-selection of the chunking index. split now consults all columns covered by the index, and uses realistic heuristics to decide which UNIQUE KEY on your table is best for the chunking process. A couple of bugs were solved along the way; split is much smarter now.
  • Better support for multi-column chunking keys. You may now use the start/stop parameters even on multi-column keys, passing a comma-delimited list of values for the split operation to start/end with, respectively. Also fixed an issue with nonexistent start/stop values, which are now valid: split will simply keep to the given range.
  • split no longer requires a temporary table open through the duration of its operation. See next section.
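To give a feel for it, here is a minimal sketch of a chunked, throttled DELETE via QueryScript. The target table is borrowed from the sakila sample database for illustration; for passing explicit start/stop values and other options, consult the split documentation.

```sql
-- Minimal sketch: break one big DELETE into automatically chunked pieces.
-- Table and date are illustrative only.
call common_schema.run("
  split (delete from sakila.rental where return_date < '2005-06-01')
  {
    throttle 2;   -- between chunks, sleep for twice each chunk's execution time
  }
");
```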

Persistent script tables

QueryScript used to use several temporary tables for its operation. Thus, a script could hold open two or three temporary tables for the entire execution duration. For long split operations, for example, this could mean hours and days.

Temporary tables are nice and quick to respond (well, MyISAM tables are, until MySQL 5.7 is out), but they pose an inherent problem: a stopped slave must not shut down or restart while replication has an open temporary table. Why? Because the slave forgets about its temporary tables. When it resumes operation, it will not recognize DML issued against those tables it has forgotten. That’s why oak-prepare-shutdown is so good for slaves.

When temporary tables are short-lived, this is typically not an issue. But if you are not allowed to restart your slave throughout a 24-hour operation, that’s a limitation.
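To check whether a slave currently holds such tables before restarting it, you can use a stock MySQL status variable (this is standard MySQL, not part of common_schema):

```sql
-- Nonzero means the replication SQL thread has open temporary tables;
-- restarting the slave now would lose them.
SHOW GLOBAL STATUS LIKE 'Slave_open_temp_tables';
```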

As of 2.1, QueryScript does not require long-held temporary tables. In fact, typical scripts do not create any temporary tables. A split operation creates and immediately drops a series of temporary tables; these are dropped even before the actual split operation begins. All tables operated on are persistent InnoDB tables.

Result: safer script replication. There’s another nice side effect I may take advantage of in a later release: the ability to monitor and control the flow of concurrent scripts.

Schema analysis

Two noteworthy additions to schema analysis views:

  • sql_alter_table now includes the sql_drop_keys & sql_add_keys columns. For each table, you get the SQL statements to create and drop the existing indexes. I developed this when I hit this problem with TokuDB.
  • sql_range_partitions now includes the count_past_partitions & count_future_partitions columns; when your table is partitioned by some type of time range, these tell you how many partitions lie in the past, and how many are yet to be written to in the future. This comes in handy when you want to rotate or otherwise set a retention policy for your range partitions.
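By way of illustration, queries against these views might look like the following. I’m assuming the usual table_schema/table_name filtering columns, and the table names are made up; verify against the views’ documentation.

```sql
-- Regenerate DROP KEY / ADD KEY statements for a single table:
SELECT sql_drop_keys, sql_add_keys
FROM common_schema.sql_alter_table
WHERE table_schema = 'sakila' AND table_name = 'rental';

-- Inspect retention state for a range-partitioned table (hypothetical table):
SELECT count_past_partitions, count_future_partitions
FROM common_schema.sql_range_partitions
WHERE table_schema = 'mydb' AND table_name = 'events';
```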

grant_access()

The grant_access() routine GRANTs all accounts on your server SELECT & EXECUTE privileges on common_schema. This is a quick complement to the installation process (though you have to invoke it yourself; it’s up to you).
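Invoking it is a one-liner:

```sql
call common_schema.grant_access();
```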

Ascii/google line charts

Laugh all you want! And find out how cool it is to get (poor man’s) instant charting like this:

mysql> call line_chart('select ts, com_insert_psec, com_update_psec from mycheckpoint.sv_hour limit 100', 'insert per second, update per second') ;
+---------+------------------------------------------------------------------------------------------------------+
| y_scale | chart                                                                                                |
+---------+------------------------------------------------------------------------------------------------------+
| 162     | -#-------------------------------------------------------------------------------------------------- |
| 152     | ---------------------------------------------------------------------------------------------------- |
| 143     | ---------------------------------------------------------------------------------------------------- |
| 134     | ---------------------------------------------------------------------------------------------------- |
| 124     | ---------------------------------------------------------------------------------------------------- |
| 115     | ------------------------------------------------------------#--------------------------------------- |
| 106     | ---------------------------------------------------------------------------------------------------- |
| 96      | -*-------------------------------------------------------------------------------------------------- |
| 87      | ---------------------------------#-------------------------#---------------------------------------- |
| 77      | ---------------------------------------------------------------------------------#------------------ |
| 68      | ---------------------------------------------------------------------------#------------------------ |
| 59      | #-------------------------------#------------------------------------------------------------------- |
| 49      | ---##------#-#-##-#-#--#--###----------------##---------------------------------#-----#---###------- |
| 40      | --#------#--#-#--#-#-##-##----##--###########--######--------#############-*#-##--####-###---####### |
| 31      | *-**--#-#-*-**-**------**--**#-****-**-*****-*******-#---#-*------------**---#--*------------------- |
| 21      | ----*#*#*--*--*--******--**--**----*--*-----*-------**#*#**-************--#-****-******************* |
| 12      | -----*-*-*--------------------------------------------*-*-----------------*------------------------- |
|         | v::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::v |
|         | 2010-10-06 20:00:00                                                              2010-10-10 23:00:00 |
|         |     # insert per second                                                                              |
|         |     * update per second                                                                              |
+---------+------------------------------------------------------------------------------------------------------+

You can get the same in Google Image Charts format. Yes, the API is deprecated (and has been for a year), but it still works:

mysql> call google_line_chart('select ts, com_insert_psec, com_update_psec from mycheckpoint.sv_hour limit 100', 'insert per second, update per second') \G

google_chart_url: http://chart.apis.google.com/chart?cht=lc&chs=800x350&chtt=SQL+chart+by+common_schema&chxt=x,y&chxr=1,11.9,161.7&chd=s:S9NOOGKFGKHQMONPONONNKNONNOOQINMRgLLNMMNNNNNNOONMNNNMHEFFJFfsLLMMMLLMNMMNNDVNIMKPaKLLMOMNNNONNNMMMMM,IiGGFCDBCBGFGGGGGFGGFEFGGGGGHDGGJGGGGGFGGGGGGGGGGGGGFCBCCCEHGGGFFFFFGGGFGGAKFDFFIFFFFFFFFFFFFFFFFFFF&chxs=0,505050,10,0,lt&chxl=0:|2010-10-06%2020:00:00||||||||||||||||||||||||2010-10-07%2020:00:00|||||||||||||||||||||||||2010-10-08%2021:00:00|||||||||||||||||||||||||2010-10-09%2022:00:00|||||||||||||||||||||||||2010-10-10%2023:00:00&chg=1.010101010,25,1,2,0,0&chco=ff8c00,4682b4&chdl=insert%20per%20second|update%20per%20second&chdlp=b

The above translates into the following image:

Throw your own query in. Make the 1st column your ordering column, and the 2nd [, 3rd…] your value columns. Provide your own legend. Watch it instantly. And laugh all you want.

Read more about common_schema’s charting routines.

debugged_routines

The new debugged_routines view shows you which routines are currently “compiled with debug mode”.
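A usage sketch (the column names here are my assumption; check the view’s documentation):

```sql
-- List routines currently compiled with debug code:
SELECT routine_schema, routine_name
FROM common_schema.debugged_routines;
```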

I will write more on the state of rdebug in a future post.

Try it, get it

  • common_schema 2.1 comes with over 500 tests, and the number is growing fast.
  • It supports MySQL 5.1, 5.5, 5.6, Percona Server and MariaDB.
  • It has superb documentation (may I say so?) with plenty of examples & drill-downs into edge cases.

You are free to download and use it.

Your feedback is welcome! Indeed, many of this version’s improvements originated with community feedback.

]]>
https://shlomi-noach.github.io/blog/mysql/common_schema-2-1-released-advanced-improved-split-persistent-script-tables-more-schema-analysis-and-ahem-charts/feed 4 6398
mycheckpoint revision 231 released https://shlomi-noach.github.io/blog/mysql/mycheckpoint-revision-released https://shlomi-noach.github.io/blog/mysql/mycheckpoint-revision-released#comments Thu, 23 May 2013 12:21:52 +0000 https://shlomi-noach.github.io/blog/?p=6353 A new release for mycheckpoint: lightweight, SQL oriented MySQL monitoring solution.

If you’re unfamiliar with mycheckpoint, well, the one-minute sales pitch is: it’s a free and open source monitoring tool for MySQL, which is extremely easy to install and execute, and which includes custom queries, alerts (via email), and an out-of-the-box HTTP server and charting.

This is mostly a maintenance release, with some long-requested features, and of course bug fixes. Here are a few highlights:

  • Supports MariaDB and MySQL 5.6 (fixes issues with new variables, space-padded variables, text-valued variables)
  • Supports alerts via function invocation on the monitored host (so alerts are checked not only via aggregated data like ‘Seconds_behind_master’, but also via SELECT my_sanity_check_function() on the monitored instance). See alerts.
  • Supports a single running instance via the “--single” command line argument
  • Supports strict sql_mode, including ONLY_FULL_GROUP_BY, overcoming bug #69310.
  • Supports sending of pending email HTML report
  • Better re-deployment process
  • Better recognition of SIGNED/UNSIGNED values
  • Some other improvements in charting, etc.
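To sketch the function-based alerts: the function body below is entirely hypothetical (table, column, and threshold are made up); the only thing taken from mycheckpoint is the idea that it periodically evaluates SELECT my_sanity_check_function() on the monitored instance.

```sql
delimiter //

-- Hypothetical sanity check: returns 1 (alert) when recent errors exceed a threshold.
create function my_sanity_check_function()
returns tinyint
reads sql data
begin
  return (
    select count(*) > 10000
    from myapp.error_log
    where ts > now() - interval 1 hour
  );
end //

delimiter ;
```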

mycheckpoint is released under the BSD license.

Downloads are available from the project’s page.

]]>
https://shlomi-noach.github.io/blog/mysql/mycheckpoint-revision-released/feed 2 6353