"Anemomaster": DML visibility. Your must-do for tomorrow

April 18, 2014

Here's our take of master DML query monitoring at Outbrain (presented April 2014). It took a half-day to code, implement, automate and deploy, and within the first hour of work we managed to catch multiple ill-doing services and scripts. You might want to try this out for yourself.

What's this about?

What queries do you monitor on your MySQL servers? Many don't monitor queries at all, and only look up slow queries on occasion, using pt-query-digest. Some monitor slow queries, where Anemometer (relying on pt-query-digest) is a very good tool. To the extreme, some monitor TCP traffic on all MySQL servers -- good for you! In between, there's a particular type of queries that are of special interest: DML (INSERT/UPDATE/DELETE) queries issued against the master.

They are of particular interest because they are only issued once against the master, yet propagate through replication topology to execute on all slaves. These queries have a direct impact on your slave lag and on your overall replication capacity. I suggest you should be familiar with your DMLs just as you are with your slow queries.

In particular, we had multiple occasions in the past where all or most slaves started lagging. Frantically we would go to our metrics; yes! We would see a spike in com_insert. Someone (some service) was obviously generating more INSERTs than usual, at a high rate that the slaves could not keep up with. But, which INSERT was that? Blindly, we would look at the binary logs. Well, erm, what are we looking for, exactly?

Two such occasions convinced us that there should be a solution, but it took some time till it hit us. We were already using Anemometer for monitoring our slow logs. We can do the same for monitoring our binary logs. Thus was born "Anemomaster".

Quick recap on how Anemometer works: you issue pt-query-digest on your slow logs on all MySQL hosts (we actually first ship the slow logs to a central place where we analyse them; same thing). This is done periodically, and slow logs are then rotated. You throw the output of pt-query-digest to a central database (this is built in with pt-query-digest; it doesn't necessarily produce human readable reports). Anemometer would read this central database and visualize the slow queries.

Analysing DMLs

But then, pt-query-digest doesn't only parse slow logs. It can parse binary logs. Instead of asking for total query time, we ask for query count, and on we go to establish the same mechanism, using same pt-query-digest and same Anemometer to store and visualize the DMLs issued on our masters.

When analysing DMLs we're interested in parsing binary logs -- and it makes no sense to do the same on all slaves. All slaves just have same copy of binlog entries as the master produces. It only takes one server to get an accurate picture of the DMLs on your replication topology.

One server could be the master, and this can indeed be done: just FLUSH MASTER LOGS, parse the binary logs with pt-query-digest, and you're done. But like others, we tend to look at our masters as tender babies. We care for them, and do not wish to overload them unnecessarily. We chose to get the binlog entries from our slaves, instead. We also chose to get the entries from the relay logs, since these are unaffected by slave performance and as long as network is good, we can expect the relay logs to be very up to date. At any given time we have two slaves that take this role (this is automated and verified). On a 10 minute period we would flush the relay logs on these servers, and analyse whatever relay logs we have not analysed as yet.

The script below is a slightly modified version of our own, and should work for the standard installation. Modify to fit your own data (in particular, it assumes relay logs are named mysqld-relay-bin; datadir is specified in /etc/my.cnf, and please don't ask me how to do this on Windows):


# Digest latest relay logs file, write results to "anemomaster"
# This script can run from any machine; it only needs to execute on a single machine per mysql cluster, but for
# analysis availability it should execute on at least two hosts per cluster.

DATADIR=`grep datadir /etc/my.cnf|awk -F= '{print $2}'`

chown mysql:mysql $DIGESTED_RELAY_LOGS_FILE


echo "deleting old relay logs from ${TMPDIR}"
rm ${TMPDIR}/mysqld-relay-bin.[0-9]*

echo "Getting current relay log files"
existing_relay_log_files=$(ls -tr ${DATADIR}/mysqld-relay-bin.[0-9]* | head -n -1)
for existing_relay_log_file in $existing_relay_log_files
    cp -u $existing_relay_log_file $TMPDIR
echo "flushing relay logs"
/usr/bin/mysql -umyself -psecret -e 'flush relay logs\G;' 2>/dev/null
# sleep because the log file takes some time to disappear
sleep 1

echo "Getting current relay log files"
existing_relay_log_files=$(ls -tr ${DATADIR}/mysqld-relay-bin.[0-9]* | head -n -1)
for existing_relay_log_file in $existing_relay_log_files
    cp -u $existing_relay_log_file $TMPDIR

for relay_log_file in mysqld-relay-bin.[0-9]*
    # Filter this relay log file, since it's already been digested
    grep $relay_log_file $DIGESTED_RELAY_LOGS_FILE && rm $relay_log_file

for relay_log_file in mysqld-relay-bin.[0-9]*
    echo "digesting $relay_log_file"
    mysqlbinlog $relay_log_file | /usr/bin/pt-query-digest \
      --type binlog --order-by Query_time:cnt --group-by fingerprint --limit 100 \
      --review  P=3306,u=anemomaster,p=secret,h=anemomaster_host,D=anemomaster,t=global_query_review \
      --history P=3306,u=anemomaster,p=secret,h=anemomaster_host,D=anemomaster,t=global_query_review_history \
      --filter=" \$event->{Bytes} = length(\$event->{arg}) and \$event->{hostname}=\"$(hostname)\" " \
    echo "$relay_log_file" >> $DIGESTED_RELAY_LOGS_FILE
    rm $relay_log_file
# make sure the file does not bloat. 20 entries is more than enough.
echo "done"

As for Anemometer, we patched it to support multiple environments ("clusters") -- but irrelevant to the DML change. If you just want to make it visualize DMLs, here's the major configuration changes to config.inc.php (marked with bold):

$conf['history_defaults'] = array(
	'output'		=> 'table',
	'fact-group'	=> 'date',
	'fact-order'	=> 'date DESC',
	'fact-limit' => '90',
	'dimension-ts_min_start' => date("Y-m-d H:i:s", strtotime( '-90 day')),
	'dimension-ts_min_end'	=> date("Y-m-d H:i:s"),
	'table_fields' => array('date', 'query_time_avg','ts_cnt','Query_time_sum')

$conf['report_defaults'] = array(
	'fact-group'	=> 'checksum',
	'fact-order'	=> 'ts_cnt DESC',
	'fact-limit' => '20',
	'dimension-ts_min_start' => date("Y-m-d H:i:s", strtotime( '-1 day')),
	'dimension-ts_min_end'	=> date("Y-m-d H:i:s"),
	'table_fields' => array('checksum','snippet', 'query_time_avg','ts_cnt','Query_time_sum'),
	'dimension-pivot-hostname_max' => null,
	'dimension-pivot-clustername_max' => null

$conf['graph_defaults'] = array(
	'fact-group'	=> 'minute_ts',
	'fact-order'	=> 'minute_ts',
	'fact-limit' => '',
	'dimension-ts_min_start' => date("Y-m-d H:i:s", strtotime( '-7 day')),
	'dimension-ts_min_end'	=> date("Y-m-d H:i:s"),
	'table_fields' => array('minute_ts'),
	// hack ... fix is to make query builder select the group and order fields,
	// then table fields only has to contain the plot_field
	'plot_field' => 'ts_cnt',

With a 10 minute rotation & digestion, we are able to analyze near real-time what's been done on our masters. If we see a spike in com_insert/com_update/com_delete, or just see slave lags, we turn to Anemomaster and within a couple minutes know exactly what service is guilty of abusing our database. We are also working to protect our database against abuse, but that's for another discussion.

  • Pingback: Monitoring DML/slow queries with graphite | code.openark.org()

  • Hi Shlomi, thx for that.

    I tried myself to include relay logs informations in anemometer but I encountered 3 problems :
    - MIXED or RAW replication could be a problem
    - With MySQL 5.5 (percona server), LOAD DATA INFILE could be a problem with pt-query-digest and type=binlog (need to update the script)
    - With MySQL 5.6 (percona server), pt-query-digest and type=logbin didn't work as expected

    Could you explain details on your environnement and explain if you fell on these issues?

  • Hi Cédric,

    1. Correct. This currently works only on STATEMENT based replication. I'm writing a script which transforms a ROW binlog into STATEMENT-like binlog (via mysqlbinlog --verbose). We are using STATEMENT most of the time, ourselves, and so I'm not in pressure to do so...

    2. Possible; haven't noticed. Will take a further look

    3. I'm exhausted on 5.6 evaluation, and am taking a rest. I don't know.

  • Great, I'm not alone!
    Thanks for sharing.

  • Pingback: Reading RBR binary logs with pt-query-digest | code.openark.org()

Powered by Wordpress and MySQL. Theme by openark.org