Baffling 5.7 global/status variables issues, unclean migration path

MySQL 5.7 introduces a change in the way we query for global variables and status variables: the INFORMATION_SCHEMA.(GLOBAL|SESSION)_(VARIABLES|STATUS) tables are now deprecated and empty. Instead, we are to use the respective performance_schema.(global|session)_(variables|status) tables.

But the change goes farther than that; there is also a security change. Oracle created a pitfall of 2 changes at the same time:

  1. Variables/status moved to a different table
  2. Privileges required on said table

As an example, my non-root user gets:

mysql> show session variables like 'tx_isolation';
ERROR 1142 (42000): SELECT command denied to user 'normal_user'@'my_host' for table 'session_variables'

Who gets affected by this? Nearly everyone and everything.

  • Your Nagios will not be able to read status variables
  • Your ORM will not be able to determine session variables
  • Your replication user will fail connecting (see this post by Giuseppe)
  • And most everyone else.

The problem with the above is that it involves two unrelated changes to your setup, which are not entirely simple to coordinate:

  1. Change your app code to choose the correct schema (information_schema vs. performance_schema)
  2. GRANT the permissions on your database
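
To make the first change concrete, here is a minimal sketch of how a monitoring script might pick the schema by server version. The function names are made up for this illustration, and it assumes Oracle MySQL version strings such as "5.6.27" or "5.7.9-log" (MariaDB numbers its versions differently and is not handled). It also ignores any server-side compatibility settings.

```shell
# Hypothetical helper: echo the schema that holds the variables/status
# tables for a given MySQL version string (e.g. "5.6.27", "5.7.9-log").
variables_schema_for_version() {
    local version="$1" major minor
    major="${version%%.*}"      # text before the first dot
    minor="${version#*.}"       # drop "major."
    minor="${minor%%.*}"        # keep text up to the next dot
    if [ "$major" -gt 5 ] || { [ "$major" -eq 5 ] && [ "$minor" -ge 7 ]; }; then
        echo "performance_schema"
    else
        echo "information_schema"
    fi
}

# Hypothetical helper: build the global status query for that version.
global_status_query() {
    local schema
    schema="$(variables_schema_for_version "$1")"
    echo "SELECT VARIABLE_NAME, VARIABLE_VALUE FROM ${schema}.global_status"
}
```

Calling global_status_query "5.7.9-log" yields a query against performance_schema.global_status, which the connecting user can only run once it has been granted the matching privileges.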

Perhaps at this point you still do not consider this to be a problem. You may be thinking: well, let’s first prepare by creating the GRANTs, and once that is in place, we can, at our leisure, modify the code.

Not so fast. Can you really create those GRANTs that simply? Continue reading » “Baffling 5.7 global/status variables issues, unclean migration path”

Monitoring DML/slow queries with graphite

pt-query-digest, Anemometer or “Anemomaster” do a great job of analysing your queries and giving you visibility into what’s going on with your MySQL servers. However, the place where the query digests are written is just some MySQL tables on some server. Do you have monitoring/alerts on that table? How will you verify a specific query does not exceed some runtime/execution count threshold, and get notified when it does?

At Outbrain we use Graphite to collect almost all of our data. We like it for its simplicity and for the fact it has a “push” strategy as opposed to “pull” strategy: every service/server/collectd writes (pushes) its own data to Graphite, as opposed to having some centralized monitoring service trying to pull data from thousands of servers & services. We also have a great Graphite dashboard (developed at our company by Erez Mazor) called graphitus, which is a very sophisticated and easily configurable visualization solution (see documentation).

Our love/hate relationship with Nagios boils down to having a single Nagios plugin: one that reads data from Graphite. We use Nagios to generate our alerts, and dream of the day we will substitute it with something else (there’s not too much love in this love/hate relationship).

Graphite is a numeric timeseries data monitoring solution. How do you throw MySQL query analysis into Graphite, then?

The answer lies within the flexible structure of a Graphite metric entry, which is a freely composed path. Graphite does not require you to pre-define paths, and you can use anything that makes sense to you. Thus, you can use a slow query’s text, for example, as part of the Graphite entry path. This is not entirely simple, as Graphite limits the characters allowed in a path. So this is what we do: Continue reading » “Monitoring DML/slow queries with graphite”
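
For readers who stop at the excerpt: the sketch below is not the scheme from the full post, only one possible way to squeeze query text into a legal path component. The function names and the mysql.dml prefix are made up. The line format "path value timestamp" is Graphite's plaintext protocol.

```shell
# Hypothetical helper: turn arbitrary query text into a single path
# component made of lowercase letters, digits and underscores only.
graphite_sanitize() {
    printf '%s' "$1" \
        | tr '[:upper:]' '[:lower:]' \
        | tr -c 'a-z0-9_' '_' \
        | tr -s '_' \
        | sed -e 's/^_//' -e 's/_$//'
}

# Hypothetical helper: print one metric line in the plaintext format
# "path value timestamp".
graphite_metric_line() {
    local prefix="$1" query="$2" value="$3" timestamp="$4"
    printf '%s.%s %s %s\n' "$prefix" "$(graphite_sanitize "$query")" "$value" "$timestamp"
}
```

A call such as graphite_metric_line mysql.dml "UPDATE t SET a=?" 42 "$(date +%s)" prints a line that can be piped to the Graphite listener. Note that distinct queries can collapse to the same sanitized name, so a real setup might append a query checksum.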

“Anemomaster”: DML visibility. Your must-do for tomorrow

Here’s our take on master DML query monitoring at Outbrain (presented April 2014). It took a half-day to code, implement, automate and deploy, and within the first hour of work we managed to catch multiple ill-doing services and scripts. You might want to try this out for yourself.

What’s this about?

What queries do you monitor on your MySQL servers? Many don’t monitor queries at all, and only look up slow queries on occasion, using pt-query-digest. Some monitor slow queries, where Anemometer (relying on pt-query-digest) is a very good tool. To the extreme, some monitor TCP traffic on all MySQL servers — good for you! In between, there’s a particular type of queries that are of special interest: DML (INSERT/UPDATE/DELETE) queries issued against the master.

They are of particular interest because they are only issued once against the master, yet propagate through replication topology to execute on all slaves. These queries have a direct impact on your slave lag and on your overall replication capacity. I suggest you should be familiar with your DMLs just as you are with your slow queries.

In particular, we had multiple occasions in the past where all or most slaves started lagging. Frantically we would go to our metrics; yes! We would see a spike in com_insert. Someone (some service) was obviously generating more INSERTs than usual, at a high rate that the slaves could not keep up with. But, which INSERT was that? Blindly, we would look at the binary logs. Well, erm, what are we looking for, exactly?

Two such occasions convinced us that there should be a solution, but it took some time till it hit us. We were already using Anemometer for monitoring our slow logs. We can do the same for monitoring our binary logs. Thus was born “Anemomaster”.

Quick recap on how Anemometer works: you issue pt-query-digest on your slow logs on all MySQL hosts (we actually first ship the slow logs to a central place where we analyse them; same thing). This is done periodically, and slow logs are then rotated. You throw the output of pt-query-digest to a central database (this is built in with pt-query-digest; it doesn’t necessarily produce human readable reports). Anemometer would read this central database and visualize the slow queries.

Analysing DMLs

But then, pt-query-digest doesn’t only parse slow logs. It can parse binary logs. Instead of asking for total query time, we ask for query count, and on we go to establish the same mechanism, using the same pt-query-digest and the same Anemometer to store and visualize the DMLs issued on our masters.

When analysing DMLs we’re interested in parsing binary logs, and it makes no sense to do the same on all slaves. All slaves just have the same copy of the binlog entries that the master produces. It only takes one server to get an accurate picture of the DMLs on your replication topology.
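
As a rough sketch of the idea (not the exact invocation from the full post): decode one binary log with mysqlbinlog, then rank the statements by execution count with pt-query-digest. The function name and the limit of 20 are arbitrary choices for this illustration, and it assumes statement-based binary logs, since the digest works on query text.

```shell
# Hypothetical wrapper: digest the DMLs found in a single binary log file,
# ordered by how many times each query fingerprint was executed.
digest_master_dml() {
    local binlog_file="$1" decoded status
    decoded="$(mktemp)" || return 1
    # mysqlbinlog turns the binary log into the text form pt-query-digest parses
    mysqlbinlog "$binlog_file" > "$decoded" || { rm -f "$decoded"; return 1; }
    pt-query-digest --type binlog --order-by Query_time:cnt --limit 20 "$decoded"
    status=$?
    rm -f "$decoded"
    return $status
}
```

Run it against the most recently rotated binary log on one server in the topology. Storing the results for Anemometer instead of printing a report is a matter of adding the review/history options described in the pt-query-digest documentation.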

Continue reading » ““Anemomaster”: DML visibility. Your must-do for tomorrow”

mycheckpoint, discontinued

Time to admit to myself: mycheckpoint has to be discontinued.

I started mycheckpoint back in 2009, as a free & open source lightweight monitoring tool for MySQL. Over some years it evolved and became an actual (lightweight) monitoring solution, used by many. It has a unique and original design, which, alas, is also its bane.

mycheckpoint uses the relational model & SQL to store and query monitored metrics. This leads to quite a sophisticated service, which can make practically anything visible to the user. The raw data is just numbers, but with some SQL-Fu one can generate charts out of it (interactive ones as well), human readable reports and full blown email messages. It is still the only common solution I’m aware of that keeps track of variable changes and provides a clear “what changed, when, from value & to_value”. I caught many deployment bugs by just observing this. It’s a single file that provides a full blown HTTP service, alerting, mail notifications, multi-database monitoring, custom monitoring queries, query execution time monitoring, OS metrics, …

While developing mycheckpoint I learned a lot on MySQL status & configuration, complex SQL queries, Python, linux, packaging and more. I got a lot of feedback from users, as I still do (thank you!). Didn’t always manage to fix all bugs or answer all questions.

The design of mycheckpoint does not meet today’s reality. Heck, today there are more counters & variables than possible table columns. The schema-per-monitored-instance design makes for simplicity, but does not fare well with dozens or hundreds of servers to monitor. There is no cross-instance aggregation or visualization of data. The per-10 minute aggregation is too rough. There isn’t a test suite.

Some of the above issues can be fixed, and if you like, the source code is still freely available. I’ll even migrate the entire SVN to GitHub at some stage. But I believe the current state might only be good for small scale deployments; not something you would consider to scale up with.

For me, there’s nothing more motivating in code development than knowing the code will go public. Making the code look its best, be as easily deployable as it possibly can, and come with good documentation takes a lot of effort, but it is very satisfying. Open Source FTW!!!1


Seconds_behind_master vs. Absolute slave lag

I am unable to bring myself to trust the Seconds_behind_master value on SHOW SLAVE STATUS. Even with MySQL 5.5’s CHANGE MASTER TO … MASTER_HEARTBEAT_PERIOD (good thing, applied when no traffic goes from master to slave) it’s easy and common to find fluctuations in Seconds_behind_master value.

And, when sampled by your favourite monitoring tool, this often leads to many false negatives.

At Outbrain we use HAProxy as proxy to our slaves, on multiple clusters. More about that in a future post. What’s important here is that our decision whether a slave enters or leaves a certain pool (i.e. gets UP or DOWN status in HAProxy) is based on replication lag. Taking slaves out when they are actually replicating well is bad, since this reduces the amount of serving instances. Putting slaves in the pool when they are actually lagging too much is bad as they contain invalid, irrelevant data.

To top it all, even when correct, the Seconds_behind_master value is practically irrelevant on 2nd level slaves. In a Master -> Slave1 -> Slave2 setup, what does it mean that Slave2 has Seconds_behind_master = 0? Nothing much to the application: Slave1 might be lagging an hour behind the master, or may not be replicating at all. Slave2 might have an hour’s data missing even though it says its own replication is fine.

None of the above is news, and yet many fall into this pitfall. The solution is quite old as well; it is also very simple: do your own heartbeat mechanism, at your favourite time resolution, and measure slave lag by a timestamp you yourself updated on the master.
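
A minimal sketch of such a heartbeat, under stated assumptions: the meta.heartbeat table (columns id, ts) and all function names are hypothetical, and the clocks on the master and on the monitoring host are assumed to be in sync. A job on the master runs the update statement at the chosen resolution. On any slave, at any replication depth, the lag is the current time minus the replicated timestamp.

```shell
# Statement to run periodically on the master (hypothetical table).
heartbeat_update_sql() {
    echo "REPLACE INTO meta.heartbeat (id, ts) VALUES (1, UNIX_TIMESTAMP())"
}

# Statement to run on the slave being checked.
heartbeat_read_sql() {
    echo "SELECT ts FROM meta.heartbeat WHERE id = 1"
}

# Lag in seconds between "now" and the heartbeat value read on the slave.
# Small negative values (clock jitter) are clamped to zero.
absolute_slave_lag() {
    local now="$1" heartbeat_ts="$2" lag
    lag=$((now - heartbeat_ts))
    if [ "$lag" -lt 0 ]; then
        lag=0
    fi
    echo "$lag"
}
```

A check script would feed the output of heartbeat_read_sql to the mysql client on the slave and then call absolute_slave_lag "$(date +%s)" with the value it got back.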

Maatkit/percona-toolkit did this a long time ago with mk-heartbeat/pt-heartbeat. We’re doing it in a very similar manner. The benefit is obvious. Consider the following two graphs; the first shows Seconds_behind_master, the second shows our own Absolute_slave_lag measurement. Continue reading » “Seconds_behind_master vs. Absolute slave lag”

Bash script: report largest InnoDB files

The following script will report the largest InnoDB tables under the data directory: schema, table & length in bytes. The tables could be non-partitioned, in which case this is simply the size of the corresponding .ibd file, or they can be partitioned, in which case the reported size is the sum of all partition files. It is assumed tables reside in their own tablespace files, i.e. created with innodb_file_per_table=1.

(
    mysql_datadir=$(grep datadir /etc/my.cnf | cut -d "=" -f 2)
    cd $mysql_datadir || exit 1
    for frm_file in $(find . -name "*.frm")
    do
        # strip the .frm extension: ./schema/table
        tbl_base=${frm_file%.frm}
        tbl_file=${tbl_base}.ibd
        table_schema=$(echo $frm_file | cut -d "/" -f 2)
        table_name=$(echo $frm_file | cut -d "/" -f 3 | cut -d "." -f 1)
        if [ -f $tbl_file ]
        then
            # unpartitioned table
            file_size=$(du -cb $tbl_file 2> /dev/null | tail -n 1 | cut -f 1)
        else
            # attempt partitioned innodb table: sum of all table#P#*.ibd files
            file_size=$(du -cb ${tbl_base}#P#*.ibd 2> /dev/null | tail -n 1 | cut -f 1)
        fi
        # Replace the below with whatever action you want to take,
        # for example, push the values into graphite.
        echo $file_size $table_schema $table_name
    done
) | sort -k 1 -nr | head -n 20

We use this to push table statistics to our graphite service; we keep an eye on table growth (we actually do not limit to top 20 but just monitor them all). File size does not report the real table data size (this can be smaller due to tablespace fragmentation). It does give the correct information if you’re concerned about disk space. For table data we also monitor SHOW TABLE STATUS / INFORMATION_SCHEMA.TABLES, themselves being inaccurate. Gotta go by something.

Trick: recovering from “no space left on device” issues with MySQL

Just read Ronald Bradford’s post on an unnecessary 3am (emergency) call. I sympathize! Running out of disk space makes for some weird MySQL behaviour, and in fact whenever I encounter weird behaviour I verify disk space.

But here’s a trick I’ve been using for years to avoid such cases and to be able to recover quickly. It helped me on such events as running out of disk space during ALTER TABLEs or avoiding purging of binary logs when slave is known to be under maintenance.

Ronald suggested it: just put a dummy file in your @@datadir! I like putting a 1GB dummy file: I typically copy+paste a 1GB binary log file and call it “placeholder.tmp”. Then I forget all about it. My disk space should not run out; if it does, it’s a cause for emergency. I have monitoring, but sometimes I’m hoping to make an operation on 97%-99% utilization.

If I do run out of disk space: well, MySQL won’t let me connect; won’t complete an important statement; won’t sync transactions to disk. Bad situation. Not a problem in our case: we can magically reclaim 1GB worth of space in the @@datadir by deleting the placeholder file, buying us enough time (maybe just minutes) to gracefully complete some necessary operations: connect, KILL, shutdown, abort etc.
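
The trick is small enough to sketch in a few lines. Instead of copying a binary log, this variant fills the file with zeros using dd, which is a substitution made for the illustration. The function names are made up and the data directory path must be supplied by the caller.

```shell
# Hypothetical helper: reserve space in the data directory.
# $1: data directory, $2: size in MB (defaults to 1024, i.e. 1GB)
create_placeholder() {
    local datadir="$1" size_mb="${2:-1024}"
    dd if=/dev/zero of="$datadir/placeholder.tmp" bs=1048576 count="$size_mb" 2> /dev/null
}

# Hypothetical helper: give the reserved space back in an emergency.
release_placeholder() {
    local datadir="$1"
    rm -f "$datadir/placeholder.tmp"
}
```

Create the placeholder once, while there is still plenty of room. When the disk fills up, release it, finish the urgent operations, free space properly, and create the placeholder again.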

mycheckpoint revision 231 released

A new release for mycheckpoint: lightweight, SQL oriented MySQL monitoring solution.

If you’re unfamiliar with mycheckpoint, well, the one minute sales pitch is: it’s a free and open source monitoring tool for MySQL, which is extremely easy to install and execute, and which includes custom queries, alerts (via emails), and out of the box HTTP server and charting.

This is mostly a maintenance release, with some long-time requested features, and of course solved bugs. Here are a few highlights:

  • Supports MariaDB and MySQL 5.6 (issues with new variables, space padded variables, text-valued variables)
  • Supports alerts via function invocation on monitored host (so not only checking alerts via aggregated data like ‘Seconds_behind_master’ but also by SELECT my_sanity_check_function() on monitored instance). See alerts.
  • Supports single-running-instance via “--single” command line argument
  • Supports strict sql_mode, including ONLY_FULL_GROUP_BY, overcoming bug #69310.
  • Supports sending of pending email HTML report
  • Better re-deployment process
  • Better recognizing of SIGNED/UNSIGNED values
  • Some other improvements in charting, etc.

mycheckpoint is released under the BSD license.

Downloads are available from the project’s page.

MySQL monitoring: storing, not caching

I’ve followed with interest Baron’s Why don’t our new Nagios plugins use caching? and Sheeri’s Caching for Monitoring: Timing is Everything. I wish to present my take on this, from mycheckpoint’s point of view.

So mycheckpoint works in a completely different way. On one hand, it doesn’t bother with caching. On the other hand, it doesn’t bother with re-reads of data.

There are no staleness issues, the data is as consistent as it can get (you can never get a completely atomic read of everything in MySQL), and you can issue as many calculations as you want at the price of one take of monitoring. As in Sheeri’s example, you can run Threads_connected/max_connections*100, mix status variables, system variables, meta-variables (e.g. Seconds_behind_master), user-created variables (e.g. number of purchases in your online shop) etc.

mycheckpoint’s concept is to store data. And store it in relational format. That is, INSERT it to a table.

A sample-run generates a row, which lists all status, server, OS, user, meta variables. It’s a huge row, with hundreds of columns. Columns like threads_connected, max_connections, innodb_buffer_pool_size, seconds_behind_master, etc.

mycheckpoint hardly cares about these columns. It identifies them dynamically. Have you just upgraded to MySQL 5.5? Oh, there’s a new bunch of server and status variables? No problem, mycheckpoint will notice it doesn’t have the matching columns and will add them via ALTER TABLE. There you go, now we have a place to store them.
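
The column discovery can be pictured with a small sketch. This is not mycheckpoint’s actual code: the function name is made up, the two input files (one name per line) stand in for the table’s current columns and the freshly sampled variable names, and BIGINT UNSIGNED is a placeholder type, since real variables may be signed or textual.

```shell
# Hypothetical helper: print an ALTER TABLE statement adding a column for
# every sampled variable that the table does not have yet.
# $1: table name, $2: file of existing columns, $3: file of sampled names
alter_for_new_variables() {
    local table="$1" existing_file="$2" sampled_file="$3"
    # grep -vxFf keeps only sampled names with no exact match among the columns
    grep -vxFf "$existing_file" "$sampled_file" \
        | awk -v table="$table" '
            {
                sep = (NR > 1) ? ", " : ""
                clauses = clauses sep "ADD COLUMN " $1 " BIGINT UNSIGNED"
            }
            END { if (NR > 0) print "ALTER TABLE " table " " clauses }'
}
```

When nothing is missing the function prints nothing, so the caller can skip the ALTER altogether.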

Running a formula like Threads_connected/max_connections*100 is as easy as issuing the following query:

SELECT Threads_connected/max_connections*100 FROM status_variables WHERE id = ...

Hmmm. This means I can run this formula on the most recent row I’ve just added. But wait, this also means I can run this formula on any row I’ve ever gathered. Continue reading » “MySQL monitoring: storing, not caching”

Announcing common_schema: common views & routines for MySQL

Today I have released common_schema, a utility schema for MySQL which includes many views and functions, and is aimed to be installed on any MySQL server.

What does it do?

There are views answering for all sorts of useful information: stuff related to schema analysis, data dimensions, monitoring, processes & transactions, security, internals… There are basic functions answering for common needs.

Some of the views/routines simply formalize those queries we tend to write over and over again. Others take the place of external tools, answering complex questions via SQL and metadata. Still others help out with SQL generation.

Here are a few highlights:

There’s more. Take a look at the common_schema documentation for full listing. And it’s evolving: I’ve got quite a few ideas already for future components.

Some of these views rely on heavyweight INFORMATION_SCHEMA tables. You should be aware of the impact and risks.

What do I need to install?

There’s no script or executable file. It’s just a schema. The distribution is an SQL file which generates common_schema. Much like a dump file.

Continue reading » “Announcing common_schema: common views & routines for MySQL”