Reading RBR binary logs with pt-query-digest
Mon, 26 Jan 2015 15:50:46 +0000
https://shlomi-noach.github.io/blog/mysql/reading-rbr-binary-logs-with-pt-query-digest

For the purpose of auditing everything that goes on in our servers, we’re looking to parse the binary logs of all servers (masters), as with “Anemomaster”. With Row Based Replication this is problematic, since pt-query-digest does not support parsing RBR binary logs (true for 2.2.12, the latest at this time).

I’ve written a simple script that translates RBR logs to SBR-like logs, with a little bit of cheating. My interest is only that pt-query-digest be able to capture and count the queries, nothing else. By doing some minimal text manipulation on the binary log, I’m now able to feed it to pt-query-digest, which seems to be happy.

The script does not, of course, parse the binary log directly; rather, it requires the binary log to first be extracted via:

mysqlbinlog --verbose --base64-output=DECODE-ROWS your-mysql-binlog-filename.000001

The above adds the interpretation of the RBR entries in the form of (unconventional) commented statements, and strips out the cryptic RBR text. All that is left is to do a little manipulation on entry headers and to uncomment the interpreted queries.
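For reference, this is roughly what a decoded row event looks like in the mysqlbinlog --verbose output (a hypothetical INSERT; column values appear as @1, @2, … since row events do not record column names):

### INSERT INTO `test`.`t`
### SET
###   @1=3
###   @2='some text'

The script below uncomments these "### " lines and rewrites the surrounding event headers so that pt-query-digest accepts them as statements.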

The script can be found in my gist repositories. Current version is as follows:

#!/usr/bin/python
#
# Convert a Row-Based-Replication binary log to Statement-Based-Replication format, cheating a little.
# This script exists since Percona Toolkit's pt-query-digest cannot digest RBR format. The script
# generates enough for it to work with.
# Expecting standard input
# Expected input is the output of "mysqlbinlog --verbose --base64-output=DECODE-ROWS <binlog_file_name>"
# For example:
# $ mysqlbinlog --verbose --base64-output=DECODE-ROWS mysql-bin.000006 | python binlog-rbr-to-sbr.py | pt-query-digest --type=binlog --order-by Query_time:cnt --group-by fingerprint
#

import fileinput

def convert_rbr_to_pseudo_sbr():
    inside_rbr_statement = False
    for line in fileinput.input():
        line = line.strip()
        if line.startswith("#") and "end_log_pos" in line:
            for rbr_token in ["Update_rows:", "Write_rows:", "Delete_rows:", "Rows_query:", "Table_map:",]:
                if rbr_token in line:
                    line = "%s%s" % (line.split(rbr_token)[0], "Query\tthread_id=1\texec_time=0\terror_code=0")
        if line.startswith("### "):
            inside_rbr_statement = True
            # The "### " commented rows are the pseudo-statement interpreted by mysqlbinlog's "--verbose",
            # and which we will feed into pt-query-digest
            line = line[4:]
        else:
            if inside_rbr_statement:
                print("/*!*/;")
            inside_rbr_statement = False
        print(line) 

convert_rbr_to_pseudo_sbr()

Bash script: report largest InnoDB files
Thu, 19 Dec 2013 08:58:17 +0000
https://shlomi-noach.github.io/blog/mysql/bash-script-report-largest-innodb-files

The following script will report the largest InnoDB tables under the data directory: schema, table & length in bytes. The tables could be non-partitioned, in which case this is simply the size of the corresponding .ibd file, or they can be partitioned, in which case the reported size is the sum of all partition files. It is assumed that tables reside in their own tablespace files, i.e. were created with innodb_file_per_table=1.

(
    mysql_datadir=$(grep datadir /etc/my.cnf | cut -d "=" -f 2)
    cd $mysql_datadir
    for frm_file in $(find . -name "*.frm")
    do
        tbl_file=${frm_file//.frm/.ibd}
        table_schema=$(echo $frm_file | cut -d "/" -f 2)
        table_name=$(echo $frm_file | cut -d "/" -f 3 | cut -d "." -f 1)
        if [ -f $tbl_file ]
        then
            # unpartitioned table
            file_size=$(du -cb $tbl_file 2> /dev/null | tail -n 1) 
        else
            # attempt partitioned innodb table
            tbl_file_partitioned=${frm_file//.frm/#*.ibd}
            file_size=$(du -cb $tbl_file_partitioned 2> /dev/null | tail -n 1)
        fi
        file_size=${file_size//total/}
        # Replace the below with whatever action you want to take,
        # for example, push the values into graphite.
        echo $file_size $table_schema $table_name
    done
) | sort -k 1 -nr | head -n 20

We use this to push table statistics to our graphite service; we keep an eye on table growth (we actually do not limit ourselves to the top 20, but monitor them all). File size does not report the real table data size (which can be smaller due to tablespace fragmentation). It does give the correct information if you’re concerned about disk space. For table data we also monitor SHOW TABLE STATUS / INFORMATION_SCHEMA.TABLES, which are themselves inaccurate. Gotta go by something.

common_schema 1.1 released: split(), try-catch, killall(), profiling
Tue, 04 Sep 2012 06:15:25 +0000
https://shlomi-noach.github.io/blog/mysql/common_schema-1-1-released-split-try-catch-killall-profiling

I’m very happy to announce the release of common_schema, version 1.1 (revision 300).

This version boasts compelling new features: innovative QueryScript syntax, libraries, and views which add to your skills as a DBA, making some maintenance and management tasks a breeze.

  • QueryScript, split statement: automagically break long queries into smaller chunks, avoid long locks and reduce query/transaction overhead
  • QueryScript, try-catch statement: just try { something; } catch { act_on_error; }.
  • killall(): quickly kill connections based on grantee/user/host information.
  • profiling/profiling_last: utility views to assist in query profiling diagnostics
  • One size fits all: a single installer which auto-recognizes available server features and enables the respective common_schema features accordingly.
  • QueryScript performance boost
  • much much more…

Not familiar with common_schema? It allows you to do stuff on the server side, by selecting from views, calling upon useful routines or writing easy-to-manage scripts.

I’m suggesting that common_schema should be a really-should-have tool to accompany your MySQL install. Did I say “tool”? It’s merely a schema. But it makes for a great framework:

In High Performance MySQL, 3rd edition, Baron Schwartz describes common_schema:

The common_schema is to MySQL as jQuery is to JavaScript

Reviewing highlights for version 1.1:

QueryScript

QueryScript is a scripting language. It sees some major improvements here. I’ve made some speed boosts by avoiding the use of temporary tables, using string parsing instead.

Without doubt the two most handy statements added to QueryScript are:

  • split: automagically break a long query into smaller, distinct chunks, and execute those iteratively. Just write your query; split will parse it, analyze it, rewrite it, break your table into parts, iterate your table and apply query for each chunk of rows. You can reduce lock time, avoid huge transactions and give your server room to breathe on operations such as massive updates of rows, transferring of rows between tables, massive purging of rows etc. Consider: the following query will execute 1,000 rows at a time, and the script will throttle execution so as to sleep in between chunks. And you need know nothing about how it works internally (though it’s quite interesting):
create table world.City_dup like world.City;
split (insert into world.City_dup select * from world.City)
{
  throttle 2;
}
  • try-catch: if, like me, you are frustrated with the stored routines way of handling errors, QueryScript now offers you a familiar (yet enhanced) form of try something catch do_something_on_error. It is limited in that you can’t have a catch for particular error codes – MySQL does not provide such info on the server side. Nevertheless, consider:
while (true)
{
  try
  {
    -- Attempt query which is expected to abort on deadlock:
    UPDATE some_table SET some_column = 1 WHERE some_condition;
    -- Got here? This means query is successful! We can leave now.
    break;
  }
  catch
  {
    -- Apparently there was a deadlock. Rest, then loop again until it succeeds
    sleep 1;
  }
}

QueryScript also adds:

  • eval: evaluate SQL statements on the fly. I’ve got some very cool use cases already in production.
  • sleep: just… sleep.
  • pass: similar to Python’s pass statement, this statement does nothing and makes for a placeholder.

QueryScript variables now support the following (a short combined example follows this list):

  • Declare & assign syntax: var $sum := 0
  • New expansion syntax: DELETE FROM t LIMIT :${number_of_rows}, or CREATE TABLE customer_:${shard_number}_details
  • Support for expanded variables in expressions, throttle, sleep, throw statements.
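A minimal sketch combining the above (assuming the script is fed to the run() routine; the test.t table is hypothetical, as in the expansion example):

call run("
  var $number_of_rows := 500;
  -- The :${} expansion embeds the variable's value into the statement text:
  DELETE FROM test.t LIMIT :${number_of_rows};
");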

Routines

Plenty of new routines. Most notable:

  • killall(): much like the Unix killall command, this routine kills connections based on names, rather than process IDs. Names can be the grantee name, or just the user part, or just the host part, which allows for quickly killing all connections coming from a specific user or host:
CALL killall('host3.analytics.mycompany.com');
CALL killall('reporting_user');
  • table_exists(): test for (isn’t it clear?) table existence. This uses INFORMATION_SCHEMA optimizations: it’s a lightweight query.
SELECT table_exists('sakila', 'rental') AS does_it_exist;

We also have text manipulation routines: encode_xml() and decode_xml(), strip_urls(), prettify_message().
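For example (a hedged sketch; the input strings are arbitrary, and the exact output is left to the documentation):

SELECT encode_xml('3 < 4 & 5 > 2');
SELECT strip_urls('the docs live at https://example.com/docs');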

Views

New in this version are utility views to assist in query profiling diagnostics (the profiling/profiling_last views listed in the highlights above). For example:

mysql> SET PROFILING := 1;

mysql> SELECT COUNT(*) FROM sakila.nicer_but_slower_film_list INTO @dummy;

mysql> SELECT * FROM last_query_profiling;
+----------+----------------------+-------------+--------------------+-------------------------+--------------------+------------+
| QUERY_ID | STATE                | state_calls | state_sum_duration | state_duration_per_call | state_duration_pct | state_seqs |
+----------+----------------------+-------------+--------------------+-------------------------+--------------------+------------+
|       41 | checking permissions |           5 |           0.000320 |            0.0000640000 |               0.33 | 5,6,7,8,9  |
|       41 | cleaning up          |           1 |           0.000007 |            0.0000070000 |               0.01 | 31         |
|       41 | closing tables       |           1 |           0.000016 |            0.0000160000 |               0.02 | 29         |
|       41 | Copying to tmp table |           1 |           0.042363 |            0.0423630000 |              44.34 | 15         |
|       41 | Creating tmp table   |           1 |           0.000123 |            0.0001230000 |               0.13 | 13         |
|       41 | end                  |           1 |           0.000004 |            0.0000040000 |               0.00 | 23         |
|       41 | executing            |           2 |           0.000014 |            0.0000070000 |               0.01 | 14,22      |
|       41 | freeing items        |           2 |           0.000216 |            0.0001080000 |               0.23 | 25,27      |
|       41 | init                 |           1 |           0.000012 |            0.0000120000 |               0.01 | 20         |
|       41 | logging slow query   |           1 |           0.000004 |            0.0000040000 |               0.00 | 30         |
|       41 | Opening tables       |           1 |           0.028909 |            0.0289090000 |              30.26 | 2          |
|       41 | optimizing           |           2 |           0.000026 |            0.0000130000 |               0.03 | 10,21      |
|       41 | preparing            |           1 |           0.000018 |            0.0000180000 |               0.02 | 12         |
|       41 | query end            |           1 |           0.000004 |            0.0000040000 |               0.00 | 24         |
|       41 | removing tmp table   |           3 |           0.000130 |            0.0000433333 |               0.14 | 18,26,28   |
|       41 | Sending data         |           2 |           0.016823 |            0.0084115000 |              17.61 | 17,19      |
|       41 | Sorting result       |           1 |           0.006302 |            0.0063020000 |               6.60 | 16         |
|       41 | starting             |           1 |           0.000163 |            0.0001630000 |               0.17 | 1          |
|       41 | statistics           |           1 |           0.000048 |            0.0000480000 |               0.05 | 11         |
|       41 | System lock          |           1 |           0.000017 |            0.0000170000 |               0.02 | 3          |
|       41 | Table lock           |           1 |           0.000018 |            0.0000180000 |               0.02 | 4          |
+----------+----------------------+-------------+--------------------+-------------------------+--------------------+------------+

common_schema meta info is in the status view, which can be used, for example, in bug reports. It indicates the version, revision, time and status of the installation process.
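For example (a sketch; this assumes common_schema is the current schema, as in the examples above):

mysql> SELECT * FROM status \G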

Installer

common_schema comes with an invisible installer. It’s just a SQL file, imported via the SOURCE command or your favorite import method. But, once base components are installed, it activates itself to spawn a smart-mode install phase, where it checks for existing MySQL server features and adds the respective common_schema features accordingly. So, if the InnoDB plugin is present, you get the InnoDB plugin views in common_schema. If this is a Percona Server, you also get the related views. This makes for a single distribution file, as opposed to 3 different distributions in previous versions.

Documentation

There are no compromises here. Documenting common_schema takes more time than writing & testing it. But everything is well documented. You can read the documentation online, or download a bundle, or call for help() from within common_schema: the documentation is internal, too.

root@mysql-5.1.51> CALL help('try');
+--------------------------------------------------------------------------------+
| help                                                                           |
+--------------------------------------------------------------------------------+
| QueryScript Flow Control: try-catch statement                                  |
|                                                                                |
| SYNOPSIS                                                                       |
|                                                                                |
|                                                                                |
|                                                                                |
|        try                                                                     |
|          statement;                                                            |
|        catch                                                                   |
|          statement;                                                            |
|                                                                                |
|                                                                                |
|                                                                                |
| DESCRIPTION                                                                    |
|                                                                                |
| try-catch is an error handling flow control structure. Flow is determined      |
| based on the appearance or non-appearance of execution errors.                 |
| The try statement (or block of statements) is executed. If no error occurs, it |
| completes, and the catch statement is never executed.                          |
| If an error is detected within execution of the try statement, the try         |
| statement is aborted at the point of error (i.e. all statements following the  |
| point of error are discarded), and the catch statement (or block of            |
| statements) is executed.                                                       |
...
+--------------------------------------------------------------------------------+

Tests

With over 350 tests and counting, common_schema and QueryScript are well tested. There are still tests to write, coverage is not complete, and I’m working on it.

Bugfixes

Changed view definitions affected by MySQL bug #65388.

Download

Download common_schema. You will find it is rich and smart.

common_schema, rev. 178: foreach(), repeat_exec(), Roland Bouman, query analysis
Thu, 01 Dec 2011 09:33:01 +0000
https://shlomi-noach.github.io/blog/mysql/common_schema-rev-178-foreach-repeat_exec-roland-bouman-query-analysis

common_schema, revision 178 is now released, with major additions. This revision turns common_schema into a framework, rather than a set of views and functions.

common_schema provides query scripting, analysis & informational views, and a function library, allowing for easier administration and diagnostics for MySQL. It introduces SQL-based tools which simplify otherwise complex shell and client scripts, allowing the DBA to be independent of operating system, installed packages and dependencies.

There’s no Perl or Python, and no dependencies to install. It’s just a schema.

Some highlights for the new revision:

  • foreach(), aka $(): loop through a collection, execute callback commands per element.
  • repeat_exec(): a repeat-until device: execute queries until some condition holds.
  • exec_file(): execute files a-la SOURCE, but on server side
  • Query analysis: analyze query text, view or routine definitions to detect dependency objects.
  • Improvements to views and routines, new routines introduced.

Let’s take a closer look:

Roland Bouman

I’m very happy to have Roland Bouman working on this project. He introduced some sophisticated code without which some functionality could not have taken place. I’m sure I don’t need to introduce his great capabilities; I’ll just note that it is very good working with him!

foreach()

Introducing a looping device which can iterate a collection and execute callback commands.

What’s a collection? A range of numbers; a set of constants; the result set of a SELECT query; tables in your database and more.

What is a callback? A query or set of queries to invoke on the specific elements in the collection. For example:

call foreach('table in sakila', 'ALTER TABLE ${schema}.${table} ENGINE=InnoDB ROW_FORMAT=COMPRESSED');

I’ll publish dedicated posts on foreach(), aka $(), following this post. Official documentation is here.

repeat_exec()

Repeat executing queries in a given interval, until some condition holds.

What kind of condition? You can loop forever, or until a given time has elapsed, or until a given number of iterations has passed.

You can iterate until no rows are affected by your commands (your callbacks), or until some dynamic condition holds (a query evaluates to true).

For example: purge rows from a table until no more rows are affected, at an interval of 3 seconds:

call repeat_exec(3, 'DELETE FROM test.event WHERE ts < CURDATE() ORDER BY id LIMIT 1000', 0);

Official documentation is here.

exec_file()

If you need to execute commands from a file, you usually invoke SOURCE:

mysql> SOURCE '/tmp/somefile.sql';

Or you invoke mysql client and redirect its input to read from file:

bash$ mysql some_db < /tmp/somefile.sql

SOURCE is a MySQL client command. The file must reside on your client. Running the mysql client is great, but you need to work it out from outside the server.

exec_file() will let you import a file on the server side, from within the server:

call exec_file('/tmp/some_file.sql');

You will need to have the file readable; it is limited to 64K at this moment; it may not use DELIMITER, and it may not include dynamic SQL. These are the limitations.

Official documentation is here.

exec() / exec_single()

All of the above rely on the exec() / exec_single() routines, which dynamically execute a set of queries. On one hand, it’s no big deal: they only have to use prepared statements in order to invoke the queries. But then, they know how to parse multiple queries (find the “;” delimiter correctly), plus they allow for configuration: if you set @common_schema_dryrun, queries are not actually executed, just printed out. If you set @common_schema_verbose, queries are printed in addition to being executed. Since all execution routines rely on these, we get a standardized pattern.
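For example, a hedged sketch of dry-run mode (the table name is hypothetical); both statements should be parsed and printed, but not executed:

mysql> SET @common_schema_dryrun := 1;
mysql> call exec('CREATE TABLE test.t_sketch (id INT); DROP TABLE test.t_sketch');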

Official documentation is here.

Query analysis

Query parsing routines allow for detection of dependencies within queries. While not a full-blown SQL parser, these allow one to tell which tables or routines a view depends on; or a routine depends on; or an event; or just any query.

These routines can analyze the text of not only a SELECT query, but also UPDATE, DELETE, CREATE, etc. They can read the code of a stored routine, including queries and control flow constructs; thus, they are also able to analyze events and triggers.

At this stage forward-dependency resolution is supported. This can eventually lead to dependency graphs or to reverse-dependency resolution (i.e. “which views, routines, triggers or events depend on table t?”)

Example:

mysql> call get_view_dependencies('sakila', 'actor_info');
+-------------+---------------+-------------+--------+
| schema_name | object_name   | object_type | action |
+-------------+---------------+-------------+--------+
| sakila      | actor         | table       | select |
| sakila      | category      | table       | select |
| sakila      | film          | table       | select |
| sakila      | film_actor    | table       | select |
| sakila      | film_category | table       | select |
+-------------+---------------+-------------+--------+

The query analysis routines are in BETA stage.

Official documentation is here.

Test suite

common_schema is now tested. Not all code is yet under test; all new code is, and some of the older code. Work is in progress to add more and more tests.

Further changes:

  • candidate_keys no longer gives a higher score to PRIMARY KEYs. It ranks all unique keys according to its own heuristic; it also provides the is_primary and is_nullable columns.
  • Added the candidate_keys_recommended view, recommending the best candidate key per table (while noting whether it qualifies as a PRIMARY KEY in terms of NULLable columns); see the sketch after this list.
  • Added many text parsing and text manipulation routines, such as better trim, tokenizing, etc. Improved existing code significantly.
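A minimal sketch of the recommended-keys view (the table_schema column name is assumed; check the documentation for the exact output format):

mysql> SELECT * FROM candidate_keys_recommended WHERE table_schema = 'sakila';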

Get it

common_schema is available for download. It is released under the BSD license, and is free.

I’ve put very hard work into common_schema’s documentation. It is very thorough and provides clear examples. The documentation is also available for download.

If you encounter problems, please report on the issues page.

common_schema is meant to be downloaded & installed on any MySQL server. It provides general and essential functionality. Spread the word!

MySQL Global status difference using single query
Fri, 12 Aug 2011 16:31:12 +0000
https://shlomi-noach.github.io/blog/mysql/mysql-global-status-difference-using-single-query

Have just read MySQL Global status difference using MySQL procedures / functions, by Anders Karlsson. Have commented, but realized I did not provide a direct answer. In the comment, I suggested checking out a solution based on views, found in common_schema. But the solution in common_schema is split into two views, due to the fact that views cannot handle derived table subqueries.

Well, here’s a single query to do that: it checks GLOBAL_STATUS twice, 10 seconds apart in the following sample. It uses SLEEP() to actually wait between the two reads. Yes, you can do that with a query.

The following query shows all GLOBAL_STATUS values that have changed during the sample period.

[UPDATE] query updated to work with MySQL 5.6 optimizer

SELECT STRAIGHT_JOIN
   LOWER(gs0.VARIABLE_NAME) AS variable_name,
   gs0.VARIABLE_VALUE AS variable_value_0,
   gs1.VARIABLE_VALUE AS variable_value_1,
   (gs1.VARIABLE_VALUE - gs0.VARIABLE_VALUE) AS variable_value_diff,
   (gs1.VARIABLE_VALUE - gs0.VARIABLE_VALUE) / 10 AS variable_value_psec,
   (gs1.VARIABLE_VALUE - gs0.VARIABLE_VALUE) * 60 / 10 AS
variable_value_pminute
FROM
   (
     SELECT
       VARIABLE_NAME,
       VARIABLE_VALUE
     FROM
       INFORMATION_SCHEMA.GLOBAL_STATUS
     UNION ALL
     SELECT
       '',
       SLEEP(10)
     FROM DUAL
   ) AS gs0
   JOIN (
     SELECT 
       VARIABLE_NAME,
       VARIABLE_VALUE
     FROM 
       INFORMATION_SCHEMA.GLOBAL_STATUS
   ) gs1 USING (VARIABLE_NAME)
WHERE
   gs1.VARIABLE_VALUE != gs0.VARIABLE_VALUE
;
+-----------------------------------+------------------+------------------+---------------------+---------------------+------------------------+
| variable_name                     | variable_value_0 | variable_value_1 | variable_value_diff | variable_value_psec | variable_value_pminute |
+-----------------------------------+------------------+------------------+---------------------+---------------------+------------------------+
| aborted_clients                   | 2210669          | 2210686          |                  17 |                 1.7 |                    102 |
| bytes_received                    | 53259933210      | 53260211104      |              277894 |             27789.4 |                1667364 |
| bytes_sent                        | 351130988015     | 351132884956     |             1896941 |            189694.1 |               11381646 |
| com_change_db                     | 3760546          | 3760584          |                  38 |                 3.8 |                    228 |
| com_delete                        | 6774784          | 6774801          |                  17 |                 1.7 |                    102 |
| com_insert                        | 52743750         | 52744012         |                 262 |                26.2 |                   1572 |
| com_insert_select                 | 13362650         | 13362740         |                  90 |                   9 |                    540 |
| com_select                        | 51722818         | 51723107         |                 289 |                28.9 |                   1734 |
| com_set_option                    | 108564134        | 108564754        |                 620 |                  62 |                   3720 |
| com_show_collations               | 3760530          | 3760568          |                  38 |                 3.8 |                    228 |
| com_show_processlist              | 366078           | 366082           |                   4 |                 0.4 |                     24 |
| com_show_status                   | 366047           | 366051           |                   4 |                 0.4 |                     24 |
| com_show_variables                | 3760535          | 3760573          |                  38 |                 3.8 |                    228 |
| com_update                        | 6271283          | 6271324          |                  41 |                 4.1 |                    246 |
| connections                       | 3781382          | 3781420          |                  38 |                 3.8 |                    228 |
| created_tmp_disk_tables           | 983223           | 983224           |                   1 |                 0.1 |                      6 |
| created_tmp_tables                | 9134044          | 9134126          |                  82 |                 8.2 |                    492 |
| handler_commit                    | 125798040        | 125798688        |                 648 |                64.8 |                   3888 |
| handler_delete                    | 6849562          | 6849578          |                  16 |                 1.6 |                     96 |
| handler_read_first                | 5332451          | 5332498          |                  47 |                 4.7 |                    282 |
| handler_read_key                  | 373910509        | 373912469        |                1960 |                 196 |                  11760 |
| handler_read_next                 | 850122025        | 850170403        |               48378 |              4837.8 |                 290268 |
| handler_read_rnd                  | 255104660        | 255105932        |                1272 |               127.2 |                   7632 |
| handler_read_rnd_next             | 992505444        | 992549948        |               44504 |              4450.4 |                 267024 |
| handler_update                    | 27930283         | 27930465         |                 182 |                18.2 |                   1092 |
| handler_write                     | 2051582925       | 2051602416       |               19491 |              1949.1 |                 116946 |
| innodb_buffer_pool_pages_data     | 77232            | 77243            |                  11 |                 1.1 |                     66 |
| innodb_buffer_pool_pages_dirty    | 626              | 645              |                  19 |                 1.9 |                    114 |
| innodb_buffer_pool_pages_flushed  | 38429812         | 38430032         |                 220 |                  22 |                   1320 |
| innodb_buffer_pool_pages_misc     | 4294922063       | 4294922052       |                 -11 |                -1.1 |                    -66 |
| innodb_buffer_pool_read_requests  | 1933796064       | 1933871603       |               75539 |              7553.9 |                 453234 |
| innodb_buffer_pool_reads          | 11360212         | 11360214         |                   2 |                 0.2 |                     12 |
| innodb_buffer_pool_write_requests | 1074109722       | 1074115394       |                5672 |               567.2 |                  34032 |
| innodb_data_fsyncs                | 5583880          | 5583905          |                  25 |                 2.5 |                    150 |
| innodb_data_read                  | 3339489280       | 3339501568       |               12288 |              1228.8 |                  73728 |
| innodb_data_reads                 | 11796492         | 11796494         |                   2 |                 0.2 |                     12 |
| innodb_data_writes                | 105587582        | 105588145        |                 563 |                56.3 |                   3378 |
| innodb_data_written               | 3721600000       | 3727315968       |             5715968 |            571596.8 |               34295808 |
| innodb_dblwr_pages_written        | 38429812         | 38430032         |                 220 |                  22 |                   1320 |
| innodb_dblwr_writes               | 596503           | 596506           |                   3 |                 0.3 |                     18 |
| innodb_log_write_requests         | 380978894        | 380981368        |                2474 |               247.4 |                  14844 |
| innodb_log_writes                 | 74407604         | 74407990         |                 386 |                38.6 |                   2316 |
| innodb_os_log_fsyncs              | 2310799          | 2310807          |                   8 |                 0.8 |                     48 |
| innodb_os_log_written             | 2905292800       | 2906502656       |             1209856 |            120985.6 |                7259136 |
| innodb_pages_created              | 1341584          | 1341593          |                   9 |                 0.9 |                     54 |
| innodb_pages_read                 | 13117652         | 13117654         |                   2 |                 0.2 |                     12 |
| innodb_pages_written              | 38429812         | 38430032         |                 220 |                  22 |                   1320 |
| innodb_rows_deleted               | 6849552          | 6849568          |                  16 |                 1.6 |                     96 |
| innodb_rows_inserted              | 43787980         | 43788207         |                 227 |                22.7 |                   1362 |
| innodb_rows_read                  | 4289845136       | 4289919560       |               74424 |              7442.4 |                 446544 |
| innodb_rows_updated               | 24119627         | 24119809         |                 182 |                18.2 |                   1092 |
| key_read_requests                 | 41262330         | 41262338         |                   8 |                 0.8 |                     48 |
| open_files                        | 7                | 5                |                  -2 |                -0.2 |                    -12 |
| opened_files                      | 4212920          | 4212924          |                   4 |                 0.4 |                     24 |
| questions                         | 253158874        | 253160331        |                1457 |               145.7 |                   8742 |
| select_full_join                  | 546              | 547              |                   1 |                 0.1 |                      6 |
| select_range                      | 721945           | 721947           |                   2 |                 0.2 |                     12 |
| select_scan                       | 12828865         | 12828989         |                 124 |                12.4 |                    744 |
| sort_range                        | 170971           | 170973           |                   2 |                 0.2 |                     12 |
| sort_rows                         | 255175383        | 255176655        |                1272 |               127.2 |                   7632 |
| sort_scan                         | 534078           | 534080           |                   2 |                 0.2 |                     12 |
| table_locks_immediate             | 142673687        | 142674454        |                 767 |                76.7 |                   4602 |
| threads_cached                    | 7                | 8                |                   1 |                 0.1 |                      6 |
| threads_connected                 | 5                | 10               |                   5 |                 0.5 |                     30 |
| threads_created                   | 840486           | 840509           |                  23 |                 2.3 |                    138 |
+-----------------------------------+------------------+------------------+---------------------+---------------------+------------------------+

Some values don’t make sense to take a difference on (e.g. threads_connected), since they represent a momentary status and are not ever-incrementing like the others (e.g. threads_created).

Happy SQLing!

Announcing common_schema: common views & routines for MySQL
Wed, 13 Jul 2011 04:25:24 +0000
https://shlomi-noach.github.io/blog/mysql/announcing-common_schema-common-views-routines-for-mysql

Today I have released common_schema, a utility schema for MySQL which includes many views and functions, and is meant to be installed on any MySQL server.

What does it do?

There are views providing all sorts of useful information: stuff related to schema analysis, data dimensions, monitoring, processes & transactions, security, internals… There are basic functions answering common needs.

Some of the views/routines simply formalize those queries we tend to write over and over again. Others take the place of external tools, answering complex questions via SQL and metadata. Still others help out with SQL generation.

Here are a few highlights:

There’s more. Take a look at the common_schema documentation for the full listing. And it’s evolving: I’ve got quite a few ideas already for future components.

Some of these views rely on heavyweight INFORMATION_SCHEMA tables. You should be aware of the impact and risks.

What do I need to install?

There’s no script or executable file. It’s just a schema. The distribution is an SQL file which generates common_schema, much like a dump file.

What are the system requirements?

It’s just between you and your MySQL. There are currently three distribution files, dedicated for different versions of MySQL (and allowing for increased functionality):

  • common_schema_mysql_51: fits all MySQL >= 5.1 distributions
  • common_schema_innodb_plugin: fits MySQL >= 5.1, with InnoDB plugin + INFORMATION_SCHEMA tables enabled
  • common_schema_percona_server: fits Percona Server >= 5.1

Refer to the documentation for more details.

What are the terms of use?

common_schema is released under the BSD license.

Where can I download it?

On the common_schema project page. Enjoy it!

Checking for AUTO_INCREMENT capacity with single query
Tue, 05 Apr 2011 05:36:56 +0000
https://shlomi-noach.github.io/blog/mysql/checking-for-auto_increment-capacity-with-single-query

Darn! This means oak-show-limits becomes redundant. Am I not supposed to speak about it at my upcoming presentation? Bad timing!

You have AUTO_INCREMENT columns. How far are you pushing the limits? Are you going to run out of AUTO_INCREMENT values soon? Perhaps you wonder whether you should ALTER from INT to BIGINT?

The answer is all there in INFORMATION_SCHEMA. The TABLES table shows the current AUTO_INCREMENT value per table, and the COLUMNS table tells us all about a column’s data type.

It takes some ugly code to deduce the maximum value per column type, what with signed/unsigned and data type, but other than that it’s very simple. Here is the query:

SELECT
  TABLE_SCHEMA,
  TABLE_NAME,
  COLUMN_NAME,
  DATA_TYPE,
  COLUMN_TYPE,
  IF(
    LOCATE('unsigned', COLUMN_TYPE) > 0,
    1,
    0
  ) AS IS_UNSIGNED,
  (
    CASE DATA_TYPE
      WHEN 'tinyint' THEN 255
      WHEN 'smallint' THEN 65535
      WHEN 'mediumint' THEN 16777215
      WHEN 'int' THEN 4294967295
      WHEN 'bigint' THEN 18446744073709551615
    END >> IF(LOCATE('unsigned', COLUMN_TYPE) > 0, 0, 1)
  ) AS MAX_VALUE,
  AUTO_INCREMENT,
  AUTO_INCREMENT / (
    CASE DATA_TYPE
      WHEN 'tinyint' THEN 255
      WHEN 'smallint' THEN 65535
      WHEN 'mediumint' THEN 16777215
      WHEN 'int' THEN 4294967295
      WHEN 'bigint' THEN 18446744073709551615
    END >> IF(LOCATE('unsigned', COLUMN_TYPE) > 0, 0, 1)
  ) AS AUTO_INCREMENT_RATIO
FROM
  INFORMATION_SCHEMA.COLUMNS
  INNER JOIN INFORMATION_SCHEMA.TABLES USING (TABLE_SCHEMA, TABLE_NAME)
WHERE
  TABLE_SCHEMA NOT IN ('mysql', 'INFORMATION_SCHEMA', 'performance_schema')
  AND EXTRA='auto_increment'
;

There’s one row in the result set for each AUTO_INCREMENT column. Since at most one AUTO_INCREMENT column can exist for any given table, each row also identifies a unique table. The resulting columns are mostly self-explanatory, but here are some details on some of them:

  • IS_UNSIGNED: 1 when the column is UNSIGNED, 0 otherwise.
  • MAX_VALUE: maximum value that can be contained within column.
  • AUTO_INCREMENT: current AUTO_INCREMENT value for table.
  • AUTO_INCREMENT_RATIO: value in the range [0..1], where 1 means “100% full”.

A sample output:

+--------------+------------+--------------+-----------+-----------------------+-------------+------------+----------------+----------------------+
| TABLE_SCHEMA | TABLE_NAME | COLUMN_NAME  | DATA_TYPE | COLUMN_TYPE           | IS_UNSIGNED | MAX_VALUE  | AUTO_INCREMENT | AUTO_INCREMENT_RATIO |
+--------------+------------+--------------+-----------+-----------------------+-------------+------------+----------------+----------------------+
| sakila       | actor      | actor_id     | smallint  | smallint(5) unsigned  |           1 |      65535 |            201 |               0.0031 |
| sakila       | address    | address_id   | smallint  | smallint(5) unsigned  |           1 |      65535 |            606 |               0.0092 |
| sakila       | category   | category_id  | tinyint   | tinyint(3) unsigned   |           1 |        255 |             17 |               0.0667 |
| sakila       | city       | city_id      | smallint  | smallint(5) unsigned  |           1 |      65535 |            601 |               0.0092 |
| sakila       | country    | country_id   | smallint  | smallint(5) unsigned  |           1 |      65535 |            110 |               0.0017 |
| sakila       | customer   | customer_id  | smallint  | smallint(5) unsigned  |           1 |      65535 |            600 |               0.0092 |
| sakila       | film       | film_id      | smallint  | smallint(5) unsigned  |           1 |      65535 |           1001 |               0.0153 |
| sakila       | inventory  | inventory_id | mediumint | mediumint(8) unsigned |           1 |   16777215 |           4582 |               0.0003 |
| sakila       | language   | language_id  | tinyint   | tinyint(3) unsigned   |           1 |        255 |              7 |               0.0275 |
| sakila       | payment    | payment_id   | smallint  | smallint(5) unsigned  |           1 |      65535 |          16050 |               0.2449 |
| sakila       | rental     | rental_id    | int       | int(11)               |           0 | 2147483647 |          16050 |               0.0000 |
| sakila       | staff      | staff_id     | tinyint   | tinyint(3) unsigned   |           1 |        255 |              3 |               0.0118 |
| sakila       | store      | store_id     | tinyint   | tinyint(3) unsigned   |           1 |        255 |              3 |               0.0118 |
+--------------+------------+--------------+-----------+-----------------------+-------------+------------+----------------+----------------------+

Bonus: free advice on increasing your AUTO_INCREMENT capacity

Make it UNSIGNED. No, really. Check your definitions now.
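For example, a minimal sketch (the table and column names are hypothetical; in real life, mind any foreign keys referencing the column, which would have to be changed as well):

ALTER TABLE my_schema.my_table MODIFY my_id INT UNSIGNED NOT NULL AUTO_INCREMENT;

For a signed INT this doubles the capacity, from 2,147,483,647 to 4,294,967,295.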

oak-hook-general-log: your poor man’s Query Analyzer
Wed, 15 Dec 2010 17:46:06 +0000
https://shlomi-noach.github.io/blog/mysql/oak-hook-general-log-your-poor-mans-query-analyzer

The latest release of openark kit introduces oak-hook-general-log, a handy tool which allows for some analysis of executing queries.

Initially I just intended for the tool to be able to dump the general log to standard output, from any machine capable of connecting to MySQL. Quickly enough, I realized the power it brings.

With this tool, one can dump to standard output all queries using temporary tables; or using a specific index; or doing a full index scan; or just follow up on connections; or… For example, the following execution will only log queries which make for a filesort:

oak-hook-general-log --user=root --host=localhost --password=123456 --filter-explain-filesort

The problem with using the standard logs

So you have the general log, which you don’t often enable, since it tends to grow huge within moments. You then have the slow log. Slow log is great, and is among the top tools for MySQL diagnosis.

The slow log allows for log-queries-not-using-indexes, which is yet another nice feature. Not only should you log any query running for over X seconds, but also log any query which does not use an index.

Wait. This logs all queries on single-row tables (no single-row table will use an index), as well as on very small tables (a common 20-row lookup table will most often be scanned). These are OK scans. This makes for some noise in the slow log.

And how about queries which do use an index, but do so poorly? They use an index, but retrieve some 12,500,000 rows, using temporary table & filesort?

What oak-hook-general-log does for you

This tool streams out the general log, and filters out queries based on their role or on their execution plan.

To work at all, it must enable the general log. Moreover, it directs the general log to a log table. Mind that this makes for a performance impact, which is why the tool auto-terminates and restores the original log settings (the default is 1 minute, configurable). It’s really not a tool you should keep running for days. But during the few moments it runs, it will:

  • Routinely rotate the mysql.general_log table so that it doesn’t fill up
  • Examine entries found in the general log
  • Cross reference entries to the PROCESSLIST so as to deduce database context (bug #52554)
  • If required and appropriate, evaluate a query’s execution plan
  • Decide whether to dump each entry based on filtering rules

Filtering rules

Filtering rules are passed as command line options. At present, only one filtering rule applies (if more than one is specified, only one is used, so there is no point in passing more than one). Some of the rules are:

  • filter-connection: only log connect/quit entries
  • filter-explain-fullscan: only log full table scans
  • filter-explain-temporary: only log queries which create implicit temporary tables
  • filter-explain-rows-exceed: only log queries where more than X number of rows are being accessed on some table (estimated)
  • filter-explain-total-rows-exceed: only log queries where more than X number of rows are accessed on all tables combined (estimated, with possibly incorrect numbers on some queries)
  • filter-explain-key: only log queries using a specific index. This feature somewhat overlaps with Maatkit’s mk-index-usage (read announcement).
  • filter-explain-contains: a general purpose grep on the execution plan. Log queries where the execution plan contains some text.

There are other filters, and I will possibly add more in due time.

Here are a couple of cases I used oak-hook-general-log for:

Use case: temporary tables

I have a server with this alarming chart (courtesy mycheckpoint) of temporary tables:


What could possibly create 30 temporary tables per second on average?

The slow log produced nothing helpful, even with log-queries-not-using-indexes enabled. There were a lot of queries not using indexes there, but nothing at these numbers. With:

oak-hook-general-log --filter-explain-temporary

enabled for 1 minute, nothing came out. Weird. Enabled for 5 minutes, I got one entry. It turned out a scheduled script, acting once per 5 minutes, was making a single complicated query involving many nested views, which accounted for some hundreds of temporary tables being created. All of them were very small, and query time was very fast. There is no temporary tables problem with this server; case closed.

Use case: connections

A server had issues with some exceptions being thrown on the client side. There was a large number of new connections created per second although the client was using a connection pool. Suspecting the pool didn’t work well, I issued:

oak-hook-general-log --filter-connect

The pool was working well, all right. No entries for that client were recorded in 1 minute of testing. However, it turned out some old script was flooding the MySQL server with requests, every second. The log showed root@somehost, and sure enough, the script was disabled. The exceptions were due to another reason; it was good to eliminate a suspect.

Some of the tool’s use cases are relatively easy to solve with tail, grep & awk; others are not. I am using it more and more often, and find it makes for significant shortcuts in tracking down queries.

Get it

Download the tool as part of openark kit: access the openark kit project page.

Or get the source code directly.

Feedback is most welcome.

openark-kit (rev. 170): new tools, new functionality
Wed, 15 Dec 2010 06:31:24 +0000
https://shlomi-noach.github.io/blog/mysql/openark-kit-rev-170-new-tools-new-functionality

I’m pleased to announce a new release of the openark kit. There’s a lot of new functionality inside; following is a brief overview.

The openark kit is a set of utilities for MySQL. They solve everyday maintenance tasks which may be complicated or time consuming to do by hand.

It’s been a while since the last announced release. Most of my attention was on mycheckpoint, building new features, writing documentation etc. However my own use of the openark kit has only increased in the past few months, and new useful solutions to common problems have been developed.

I’ve used and improved many tools over this time, but doing the final cut, along with proper documentation, took some time. Anyway, here are the highlights:

New tool: oak-hook-general-log

oak-hook-general-log hooks up to a MySQL server and dumps the general log based on filtering rules, applying to query role or execution plan. It is possible to only dump connect/disconnect entries, queries which make a full table scan, or use temporary tables, or scan more than X number of rows, or…

I’ll write more on this tool shortly.

New tool: oak-prepare-shutdown

This tool makes for an orderly and faster shutdown by safely stopping replication, and flushing InnoDB pages to disk prior to shutting down (keeping the server available for connections even while attempting to flush dirty pages to disk). A typical use case would be:

oak-prepare-shutdown --user=root --ask-pass --socket=/tmp/mysql.sock && /etc/init.d/mysql stop

New tool: oak-repeat-query

oak-repeat-query repeats executing a given query until some condition holds. The condition can be:

  • Number of given iterations has been reached
  • Given time has elapsed
  • No rows have been affected by query

The tool comes in handy for cleanup jobs, warming up caches, etc.

New tool: oak-get-slave-lag

This simple tool just returns the number of seconds a slave lags behind its master. But it also returns with an appropriate exit code, based on a given threshold: 0 when lag is good, 1 (error exit code) when lag is too great or the slave fails to replicate.

This tool has been used by 3rd party applications, such as a load balancer, to determine whether a slave should be accessed.

Updated tool: oak-chunk-update

This extremely useful utility breaks down very long queries into smaller chunks. These could be queries which are expected to affect a huge number of rows, or queries which cannot utilize an index.

Updates to the tool include limiting the range of rows the tool scans, by specifying start and stop positions (either by providing constant values or by a SELECT query). Also added is auto-termination when no rows are found to be affected. Lastly, it is possible to override the INFORMATION_SCHEMA lookup by explicitly specifying the chunking key.

This tool works great for your daily/weekly/monthly batch jobs; in creating DWH tables; populating new columns; purging old entries; clearing data based on non-indexed values; generating summary tables; and more.

Frozen tool: oak-apply-ri

I haven’t been using this tool for a while. The main work done by this tool can be done with oak-chunk-update. There are some additional safety checks oak-apply-ri provides; I’m considering whether they justify the tool’s existence.

Frozen tool: oak-online-alter-table

With the appearance of Facebook’s Online Schema Change (OSC) tool, which derives from oak-online-alter-table, I’m not sure I will continue developing the tool. I intend to wait for general feedback on OSC before making a decision.

Documentation

Documentation is now part of openark kit‘s SVN repository.

Download

The openark kit project is currently hosted by Google Code. Downloads are available at the Google Code openark kit project page.

Downloads are available in the following packaging formats:

  • .deb package, to be installed on debian, ubuntu and otherwise debian based distributions.
  • .rpm package, architecture free (noarch), for RPM supporting Linux distributions such as RedHat, Fedora, CentOS etc.
  • .tar.gz using python’s distutils installer.
  • source, directly retrieved from SVN or from above python package.
  • Some distribution-specific RPM packages, courtesy of Lenz Grimmer.

Feedback

Your feedback is welcome! I may not always respond promptly; and I confess that some bugs were left open for longer than I would have liked. I hope to deliver good quality code, and bug reporting is one major factor you can control.

EXPLAIN: missing db info
Tue, 11 May 2010 04:57:02 +0000
https://shlomi-noach.github.io/blog/mysql/explain-missing-db-info

I’m further developing a general log hook, which can stream queries from the general log.

A particular direction I’m taking is to filter queries by their type of action. For example, the tool (oak-hook-general-log) can be instructed to only stream out those queries which involve creation of a temporary table; or those which cause a filesort, or a full index scan, etc.

This is done by evaluating query execution plans on the fly. I suspect the MySQL query analyzer roughly does the same (as a small part of what it does).

There’s almost nothing here one cannot do with sed/awk. However, I bumped into a couple of problems:

  1. The general log (and the mysql.general_log table, in particular) does not indicate the particular database one is using for the query. Since the slow log does indicate this data, I filed a bug on this. I currently solve this by crossing the connection id with the process list, where the current database is listed (a sketch follows this list). It’s shaky, but mostly works.
  2. Just realized: there’s no DB info in the EXPLAIN output! It’s weird, since I’ve been EXPLAINing queries for years now. But I’ve always had the advantage of knowing the schema used: either because I was manually executing the query on a known schema, or mk-query-digest was kind enough to let me know.
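As for the first problem, a hedged sketch of that cross-reference (this assumes the general log is written to the mysql.general_log table, and it only works while the connection is still open, hence "shaky"):

SELECT gl.event_time, gl.thread_id, pl.DB AS current_db, gl.argument
FROM mysql.general_log AS gl
JOIN INFORMATION_SCHEMA.PROCESSLIST AS pl ON pl.ID = gl.thread_id
WHERE gl.command_type = 'Query';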

As for the second problem, look at the following imaginary query, involving both the world and sakila databases:

mysql> use test;
Database changed
mysql> EXPLAIN SELECT * FROM world.Country JOIN sakila.city WHERE Country.Capital = city.city_id;
+----+-------------+---------+--------+---------------+---------+---------+-----------------------+------+-------------+
| id | select_type | table   | type   | possible_keys | key     | key_len | ref                   | rows | Extra       |
+----+-------------+---------+--------+---------------+---------+---------+-----------------------+------+-------------+
|  1 | SIMPLE      | Country | ALL    | NULL          | NULL    | NULL    | NULL                  |  239 |             |
|  1 | SIMPLE      | city    | eq_ref | PRIMARY       | PRIMARY | 2       | world.Country.Capital |    1 | Using where |
+----+-------------+---------+--------+---------------+---------+---------+-----------------------+------+-------------+
2 rows in set (0.00 sec)

The query is imaginary, since the tables don’t actually have anything in common. But look at the EXPLAIN result: can you tell where city came from? Country can somehow be parsed from the ref column, but no help on city.

Moreover, table aliases show up in the EXPLAIN plan (which is good), but with no reference to the original table.

So, is it back to parsing of the SQL query? I’m reluctant to do that. It’s error prone, and one needs to implement, or use, a good parser, which also accepts the MySQL dialect. I haven’t looked into this yet.

I’m currently at a standstill with regard to automated query execution plan evaluation where the database cannot be determined.
