Reading RBR binary logs with pt-query-digest

For auditing purposes we’re looking to parse the binary logs of all our servers (masters), as with “Anemomaster”. With Row Based Replication this is problematic, since pt-query-digest does not support parsing RBR binary logs (true as of 2.2.12, the latest at this time).

I’ve written a simple script that translates RBR logs to SBR-like logs, with a little bit of cheating. My only interest is for pt-query-digest to capture and count the queries, nothing else. By doing some minimal text manipulation on the binary log, I’m able to feed it to pt-query-digest, which seems to be happy.

The script does not, of course, parse the binary log directly; rather, it requires the binary log to first be extracted via:

mysqlbinlog --verbose --base64-output=DECODE-ROWS your-mysql-binlog-filename.000001

The above adds the interpretation of the RBR entries in the form of (unconventional) statements, commented out, and strips away the cryptic RBR text. All that is left is to do a little manipulation on entry headers and uncomment the interpreted queries.
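To get a rough sense of the kind of manipulation involved (a hypothetical sketch, not the actual script, which also handles entry headers): mysqlbinlog prefixes each interpreted pseudo-statement with “### ”, so stripping that prefix uncomments them:

mysqlbinlog --verbose --base64-output=DECODE-ROWS your-mysql-binlog-filename.000001 \
  | sed -e 's/^### //' \
  > sbr-like-binlog-output.sql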

The script can be found in my gist repositories.

Monitoring DML/slow queries with Graphite

pt-query-digest, Anemometer or “Anemomaster” do a great job of analysing your queries and giving you visibility into what’s going on with your MySQL servers. However, the place where the query digests are written is just a set of MySQL tables on some server. Do you have monitoring/alerts on those tables? How will you verify that a specific query does not exceed some runtime/execution count threshold, and get notified when it does?

At Outbrain we use Graphite to collect almost all of our data. We like it for its simplicity and for the fact that it has a “push” strategy as opposed to a “pull” strategy: every service/server/collectd instance writes (pushes) its own data to Graphite, as opposed to having some centralized monitoring service try to pull data from thousands of servers & services. We also have a great Graphite dashboard (developed at our company by Erez Mazor) called graphitus, which is a very sophisticated and easily configurable visualization solution (see documentation).

Our love/hate relationship with Nagios boils down to having a single Nagios plugin: one that reads data from Graphite. We use Nagios to generate our alerts, and dream of the day we will substitute it with something else (there’s not too much love in this love/hate relationship).

Graphite is a numeric timeseries data monitoring solution. How do you throw MySQL query analysis into Graphite, then?

The answer lies within the flexible structure of a Graphite metric entry, which is a freely composed path, such as collectd.hosts.us-east.myhost01.mysql.gauge-Threads_running.value. Graphite does not require you to pre-define paths, and you can use anything that makes sense to you. Thus, you can use a slow query’s text, for example, as part of the Graphite entry path. This is not entirely simple, as the Graphite path limits the allowed characters. So this is what we do:
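In spirit, the transformation looks something like this (a hypothetical sketch with made-up host and metric names, not our production code): reduce the query text to characters legal in a Graphite path, then push the metric over Graphite’s plaintext protocol:

query_text="SELECT * FROM rental WHERE return_date < ?"
# squeeze anything that is not alphanumeric into a single underscore
metric_name=$(echo "$query_text" | tr -cs '[:alnum:]' '_')
# Graphite plaintext protocol: "<path> <value> <timestamp>", default port 2003
echo "mysql.myhost01.queries.${metric_name}.count 1 $(date +%s)" \
  > /dev/tcp/graphite.mycompany.com/2003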

Bash script: report largest InnoDB files

The following script reports the largest InnoDB tables under the data directory: schema, table & length in bytes. A table may be non-partitioned, in which case this is simply the size of the corresponding .ibd file, or partitioned, in which case the reported size is the sum of all its partition files. It is assumed that tables reside in their own tablespace files, i.e. were created with innodb_file_per_table=1.

(
    # locate the MySQL data directory, as configured in /etc/my.cnf
    mysql_datadir=$(grep datadir /etc/my.cnf | cut -d "=" -f 2)
    cd "$mysql_datadir"
    # every table has a .frm file; derive schema & table names from its path
    for frm_file in $(find . -name "*.frm")
    do
        tbl_file=${frm_file//.frm/.ibd}
        table_schema=$(echo $frm_file | cut -d "/" -f 2)
        table_name=$(echo $frm_file | cut -d "/" -f 3 | cut -d "." -f 1)
        if [ -f $tbl_file ]
        then
            # non-partitioned table: a single .ibd file
            file_size=$(du -cb $tbl_file 2> /dev/null | tail -n 1)
        else
            # attempt partitioned InnoDB table: sum up all partition files
            tbl_file_partitioned=${frm_file//.frm/#*.ibd}
            file_size=$(du -cb $tbl_file_partitioned 2> /dev/null | tail -n 1)
        fi
        # the last line of "du -c" reads "<bytes> total"; strip the word "total"
        file_size=${file_size//total/}
        # Replace the below with whatever action you want to take,
        # for example, push the values into graphite.
        echo $file_size $table_schema $table_name
    done
) | sort -k 1 -nr | head -n 20

We use this to push table statistics to our Graphite service, where we keep an eye on table growth (we actually do not limit ourselves to the top 20, but monitor them all). File size does not reflect the real table data size (the data can be smaller, due to tablespace fragmentation), but it does give the correct information if you’re concerned about disk space. For table data we also monitor SHOW TABLE STATUS / INFORMATION_SCHEMA.TABLES, which are themselves inaccurate. Gotta go by something.
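For reference, the INFORMATION_SCHEMA flavor of that monitoring can be as simple as the following (approximate figures, as noted above):

SELECT table_schema, table_name, data_length + index_length AS total_bytes
  FROM INFORMATION_SCHEMA.TABLES
  WHERE engine = 'InnoDB'
  ORDER BY total_bytes DESC
  LIMIT 20;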

common_schema 1.3: security goodies, parameterized split(), json-to-xml, query checksum

common_schema 1.3 is released and is available for download. New and noteworthy in this version:

  • Parameterized split(): take further control over huge transactions by breaking them down into smaller chunks, now manually tunable if needed
  • duplicate_grantee(): copy+paste existing accounts along with their full set of privileges
  • similar_grants: find which accounts share the exact same set of privileges (i.e. have the same role)
  • json_to_xml(): translate any valid JSON object into its equivalent XML form
  • extract_json_value(): use XPath notation to extract info from JSON data, just as you would from XML
  • query_checksum(): given a query, calculate a checksum on the result set
  • random_hash(): get a 40-hexadecimal-digit random hash, using a reasonably large changing input

Let’s take a closer look at the above:

Parameterized split()

split takes your bulk query and automagically breaks it down into smaller pieces. So instead of one huge UPDATE or DELETE or INSERT..SELECT transaction, you get many smaller transactions, each with a smaller impact on I/O, locks and CPU.

As of 1.3, split() is more exposed: you get some control over its execution, and you also get a lot of very interesting info during operation.

Here’s an example of split() control:

set @script := "
  split({start:7015, step:2000} : UPDATE sakila.rental SET return_date = return_date + INTERVAL 1 DAY) 
    throttle 1;
";
call common_schema.run(@script);

In the above we choose a split size of 2,000 rows at a time; but we also choose to start at 7015, skipping all rows prior to that value. Just what is that value? It depends on the splitting key (see the next example for just that); but in this table we can safely assume it is the rental_id PRIMARY KEY of the table.

You don’t have to use these control parameters, but they can save you some time and effort.
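The JSON functions deserve a quick taste as well. A hypothetical example (made-up values; XPath notation, as noted in the list above):

SELECT common_schema.extract_json_value('{"name": "Wallace", "pet": "Gromit"}', '/pet');
-- expected: 'Gromit'
SELECT common_schema.json_to_xml('{"name": "Wallace", "pet": "Gromit"}');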

common_schema over traditional scripts

If you are familiar with both openark kit and common_schema, you’ll notice I’ve incorporated into common_schema some functionality already existing in openark kit, essentially rewriting what used to be a Python script into SQL/QueryScript.

What was my reasoning for rewriting good code? I wish to explain that, and provide a couple of examples.

I’m generally interested in pushing as much functionality as possible into the MySQL server. When using an external script, one:

  • Needs the right dependencies (OS, Perl/Python version, Perl/Python modules);
  • Needs to provide connection parameters;
  • Needs to get acquainted with a lot of command line options;
  • Is limited by whatever command line options are provided;
  • Has to invoke that script (duh!) to get the work done.

This last bullet is not so trivial: it means you can’t run some operation from your favorite GUI client, because it has no notion of your Perl script; it does not run on the same machine where your Python code resides; it simply can’t run those scripts for you.

With server-side code, functionality is accessible via any client. You run your operation via a query (e.g. CALL some_procedure). That can be done from your GUI client, your command line client, your event scheduler, your cronjob, all equally. You only need access to your MySQL server, which is trivial.

Of course, server-side scripting is limited. Some stuff simply can’t be written solely on the server side. If you want to consult your replicating slave, gracefully take action on a user’s Ctrl-C, or send data over the web, you’ll have to do it with an external tool. There are actually a lot of surprising limitations to things one would assume are possible on the server side. You may already know how frustrated I am by the fact that one can hardly get info from SHOW commands.

But, when it works, it shines

Let’s review a couple of examples. The first one is nearly trivial; the second, less so.

Killing InnoDB idle transactions

The issue of terminating long-time idle, open InnoDB transactions has recently been discussed by many. I wish to add my share, by proposing a quick and clean solution via common_schema.

common_schema 1.2 provides the innodb_transactions view, which relies on INNODB_TRX – one of the InnoDB Plugin views in INFORMATION_SCHEMA – as well as on PROCESSLIST, and so is able to determine with certainty that a transaction has been idle for a long time.
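To illustrate the general idea (a sketch of the logic, not the view’s actual definition): a transaction’s connection is idle when PROCESSLIST shows it sleeping, and the TIME column tells for how long:

SELECT trx.trx_id, pl.id AS connection_id, pl.time AS idle_seconds
  FROM INFORMATION_SCHEMA.INNODB_TRX AS trx
  JOIN INFORMATION_SCHEMA.PROCESSLIST AS pl ON (trx.trx_mysql_thread_id = pl.id)
  WHERE pl.command = 'Sleep';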

innodb_transactions offers us a sql_kill_query column, which produces a ‘KILL QUERY 12345’ type of value. So we can:

SELECT sql_kill_query FROM innodb_transactions WHERE trx_idle_seconds >= 10; 
+-------------------+
| sql_kill_query    |
+-------------------+
| KILL QUERY 292509 |
| KILL QUERY 292475 |
+-------------------+ 

common_schema‘s useful eval() routine allows us to actually invoke those KILL statements, all in a one-liner:

call eval('SELECT sql_kill_query FROM innodb_transactions WHERE trx_idle_seconds >= 10');
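Should you wish to automate the above, here is a hypothetical sketch using the event scheduler (event name made up; assumes the event scheduler is enabled and common_schema is installed):

CREATE EVENT my_schema.kill_idle_innodb_transactions
  ON SCHEDULE EVERY 1 MINUTE
  DO
    CALL common_schema.eval('SELECT sql_kill_query FROM common_schema.innodb_transactions WHERE trx_idle_seconds >= 10');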


Purging old rows with QueryScript: three use cases

Problem: you need to purge old rows from a table. This may be your weekly/monthly cleanup task. The table is large, the number of rows to be deleted is large, and doing so in one big DELETE is too heavy.

You can use oak-chunk-update or pt-archiver to accomplish the task. You can also use server-side scripting with QueryScript, which offers a very simple syntax, with no external scripting, dependencies or command line options.

I wish to present three cases of row deletion, with three different solutions. In all cases we assume some TIMESTAMP column exists in the table, by which we choose the rows to purge. In all cases we assume we wish to purge rows older than 1 month.

We assume the naive query is this:

DELETE FROM my_schema.my_table WHERE row_timestamp < CURDATE() - INTERVAL 1 MONTH

Case 1: TIMESTAMP column is indexed

I almost always index a timestamp column, if only to be able to quickly purge data (but usually also to slice data by date). In this case, where the column is indexed, it’s very easy to figure out which rows are older than 1 month.

We break the naive query into smaller parts, and execute these in sequence:
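One possible shape for this is a QueryScript sketch like the following (not necessarily the exact solution of the full post; the 1,000-row chunk size is arbitrary, and the while loop repeats for as long as the DELETE affects rows):

set @script := "
  while (DELETE FROM my_schema.my_table
         WHERE row_timestamp < CURDATE() - INTERVAL 1 MONTH
         ORDER BY row_timestamp LIMIT 1000)
  {
    throttle 1;
  }
";
call common_schema.run(@script);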

Experimenting with 5.6 InnoDB Online DDL (bugs included)

MySQL 5.6 offers the groundbreaking online DDL operations for InnoDB. Most common use cases will enjoy this feature, and the need for online alter table scripts will decrease. This is a killer feature!

I’ve put this new feature to the usability test. How did it go? Not too well, I’m afraid.

[Updates to this text appear inline]; also see this followup.

sakila & DDL

sakila is still a very useful database. I say “still” because it is not very large, while computing power keeps getting stronger; yet on my laptop some operations can still take many seconds to complete, which is just fine for my tests.

Sakila tables are mostly InnoDB, and with rental being the largest, I do:

node1 (sakila) > alter table sakila.rental engine=InnoDB;
Query OK, 16044 rows affected (6.94 sec)
Records: 16044  Duplicates: 0  Warnings: 0

So what can be executed during these 6.94 seconds? In a second terminal, I try the following:
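For example, one might issue simple DML against the very table being rebuilt (hypothetical values; whether it blocks is exactly what the experiment checks):

node2 (sakila) > INSERT INTO sakila.rental (rental_date, inventory_id, customer_id, staff_id)
                 VALUES (NOW(), 1, 1, 1);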

How common_schema split()s tables – internals

This post exposes some of the internals, and the SQL, behind QueryScript’s split. common_schema/QueryScript 1.1 introduces the split statement, which auto-breaks a “large” query (one which operates on large tables as a whole, or without keys) into smaller queries, and executes them in sequence.

This makes for easier transactions, fewer locks held, and potentially (depending on the user) more idle time released back to the database. split shares concepts with oak-chunk-update and pt-archiver, but works differently, and is implemented entirely in SQL, on the server side.

Take the following statement as example:

split (UPDATE sakila.inventory SET last_update = last_update + INTERVAL 6 HOUR)
  pass;

It yields (roughly) the following statements:

UPDATE sakila.inventory SET last_update = last_update + INTERVAL 6 HOUR WHERE ((((`inventory`.`inventory_id` > '1')) OR ((`inventory`.`inventory_id` = '1'))) AND (((`inventory`.`inventory_id` < '1000')) OR ((`inventory`.`inventory_id` = '1000'))));
UPDATE sakila.inventory SET last_update = last_update + INTERVAL 6 HOUR WHERE ((((`inventory`.`inventory_id` > '1000'))) AND (((`inventory`.`inventory_id` < '2000')) OR ((`inventory`.`inventory_id` = '2000'))));
UPDATE sakila.inventory SET last_update = last_update + INTERVAL 6 HOUR WHERE ((((`inventory`.`inventory_id` > '2000'))) AND (((`inventory`.`inventory_id` < '3000')) OR ((`inventory`.`inventory_id` = '3000'))));
UPDATE sakila.inventory SET last_update = last_update + INTERVAL 6 HOUR WHERE ((((`inventory`.`inventory_id` > '3000'))) AND (((`inventory`.`inventory_id` < '4000')) OR ((`inventory`.`inventory_id` = '4000'))));
UPDATE sakila.inventory SET last_update = last_update + INTERVAL 6 HOUR WHERE ((((`inventory`.`inventory_id` > '4000'))) AND (((`inventory`.`inventory_id` < '4581')) OR ((`inventory`.`inventory_id` = '4581'))));

(I say “roughly” because internally there are user defined variables at play, but for convenience, I spell out the actual values as constants.)

How does that work?

common_schema works on the server side. There is no Perl script or anything. It must therefore use server-side operations to:

  • Identify the table to be split
  • Analyze the table in the first place, deciding how to split it
  • Analyze the query, deciding how to rewrite it
  • Split the table (logically) into unique and distinct chunks
  • Work out the query on each such chunk

Following is an internal look at how common_schema does all the above.
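As a taste of the mechanics, a chunk’s boundary can be located with a plain LIMIT query on the splitting key; a sketch of the kind of query involved (not common_schema’s literal code):

-- find the upper bound of the first 1,000-row chunk on the splitting key
SELECT inventory_id
  FROM sakila.inventory
  ORDER BY inventory_id
  LIMIT 999, 1;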

Table split(…) for the masses

(pun intended)

common_schema‘s new split statement (see release announcement) auto-splits complex queries over large tables into smaller ones: instead of issuing one huge query, split breaks one’s query into smaller queries, each working on a different set of rows (a chunk).

Thus, it is possible to avoid holding locks for long periods, allowing for smaller transactions. It also provides breathing space for the RDBMS, at times boosting operation speed, and at times prolonging the operation’s runtime at will.

In this post I show how split exposes itself to the user, should the user so wish.

split can manage queries of the following forms:

  • DELETE FROM table_name [WHERE]…
  • DELETE FROM table_name USING <multi table syntax> [WHERE]…
  • UPDATE table_name SET … [WHERE]…
  • UPDATE <multiple tables> SET … [WHERE]…
  • INSERT INTO some_table SELECT … FROM <single or multiple tables> [WHERE]…
  • REPLACE INTO some_table SELECT … FROM <single or multiple tables> [WHERE]…
  • SELECT … FROM <multiple tables> [WHERE]…

The last one is non-obvious at first sight.
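As a hypothetical illustration of that last form: a plain SELECT over a join can be split just the same, each chunk returning its share of the rows in sequence:

split (SELECT r.customer_id, i.film_id
       FROM sakila.rental AS r JOIN sakila.inventory AS i USING (inventory_id))
  pass;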

Basically, it’s automatic

You just say:

split (UPDATE sakila.inventory SET last_update = last_update + INTERVAL 6 HOUR)
  throttle 2;

And split identifies sakila.inventory as the table which needs to be split, and injects appropriate conditions so as to work on a subset of the rows, in multiple steps.

By the way, here’s how to execute QueryScript code like the above.
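It is the same pattern seen earlier on this page: wrap the script in a user variable and hand it to common_schema’s run():

set @script := "
  split (UPDATE sakila.inventory SET last_update = last_update + INTERVAL 6 HOUR)
    throttle 2;
";
call common_schema.run(@script);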