Your magical RANGE partitioning maintenance query

If you use RANGE (or RANGE COLUMNS) partitioning, and in particular when partitioning by date/time, then your are subject to the following questions: how and when do you create the “next” partition? How and when do you drop your older partitions?

Many people use in-house scripting to do that, and Giuseppe Maxia wrote Partition Helper. But I would like to take you one step forward, and provide with a query (based on views) which automagically understands which new partition you want to create, and provides you with the statement to do so. It looks somewhat like this (a demo follows later on):

mysql> SELECT * FROM sql_range_partitions \G
*************************** 1. row ***************************
            table_schema: test
              table_name: city
sql_drop_first_partition: alter table `test`.`city` drop partition `p3`
  sql_add_next_partition: alter table `test`.`city` add partition (partition `p_20160101000000` values less than (736329) /* 2016-01-01 00:00:00 */ )
*************************** 2. row ***************************
            table_schema: test
              table_name: quarterly_report_status
sql_drop_first_partition: alter table `test`.`quarterly_report_status` drop partition `p3`
  sql_add_next_partition: alter table `test`.`quarterly_report_status` reorganize partition `p_maxvalue` into (partition `p_20110401000000` values less than (1301608800) /* 2011-04-01 00:00:00 */ , partition p_maxvalue values less than MAXVALUE)

A closer look at why this is magic

This query just gave you the DROP PARTITION and ADD PARTITION for all tables in your databases that use a RANGE partitioning scheme. But, consider: Continue reading » “Your magical RANGE partitioning maintenance query”

MySQL 5.6 new features: the user’s perspective

This is a yet-another compilation of the new MySQL 5.6 feature set. It is not a complete drill down. This list reflects what I believe to be the interesting new features user and usability -wise.

For example, I won’t be listing InnoDB’s split of kernel mutex. I’m assuming it can have a great impact on overall performance due to reducing lock contention; but usability-wise, this is very internal.

The complication is an aggregate of the many announcements and other complications published earlier on. See a reference at the end of this post.

Do note I am not using 5.6 as yet; it is in RC, not GA. I am mostly excited just to write down this list.

InnoDB

  • Online ALTER TABLE: if there is one major new feature in 5.6 you would want to upgrade for, this would be it. Add columns, drop columns, rename columns, add indexes, drop indexes – now online, while your SELECT, INSERT, UPDATE and DELETE statements are running.
  • Transportable tablespace files: copy+paste your_table.ibd files with FLUSH TABLE FOR EXPORT and ALTER TABLE … IMPORT TABLESPACE.
  • FULLTEXT: for many, the one thing holding them back from leaving MyISAM behind. Now available in InnoDB with same syntax as with MyISAM.
  • Memcached API: access InnoDB data via memcahced protocol, and skip the SQL interface.
  • User defined table location: place your tables in your pre-defined location. Place other tables elsewhere. This is something I’ve been asked about for ages.

Continue reading » “MySQL 5.6 new features: the user’s perspective”

Three wishes for a new year

Another new year by Jewish calendar. What do I wish for the following year?

  1. World peace
  2. Good health to all
  3. Get a decent, long waited for, implementation of Window Functions (aka Analytic Functions) for MySQL.

I mean, I like GROUP_CONCAT, and the many hacks it provides: [1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12]. But it makes for a poor substitution to Window Functions, and only solves a subset of issues.

My wishes in previous two years [2010], [2011] have not come true. I’m still willing to settle for two out of three.

How common_schema installs itself

Up till common_schema version 1.1, the user would need to choose from distinct distribution files: an install compatible with MySQL 5.1, one compatible with InnoDB Plugin enabled servers, and one compatible with Percona Server. The difference between the three is the availability of certain INFORMATION_SCHEMA tables.

With 1.1, this is no longer the case: common_schema auto-detects the server and available feature set, and installs accordingly.

Wait, isn’t common_schema just a SQL file?

Yes. It’s not like there’s an installer like InstallShield or anything. Nevertheless, common_schema offers a smart way of conditional handling, which is uses in itself. It’s called QueryScript.

common_schema is installed by importing the SQL file (via SOURCE command; the mysql client; your favorite GUI). This creates your usual tables, views and routines. But some of these routines make for an interpreter for QueryScript. Somewhere along the installation process (remember – it’s just a SQL import), common_schema switches over to executing scripts to manage the installation. In particular, there are a few views which depend on optional tables, such as InnoDB Plugin’s tables for INFORMATION_SCHEMA. Continue reading » “How common_schema installs itself”

How common_schema split()s tables – internals

This post exposes some of the internals, and the SQL behind QueryScript’s split. common_schema/QueryScript 1.1 introduces the split statement, which auto-breaks a “large” query (one which operates on large tables as a whole or without keys) into smaller queries, and executes them in sequence.

This makes for easier transactions, less locks held, potentially (depending on the user) more idle time released back to the database. split has similar concepts to oak-chunk-update and pt-archiver, but works differently, and implemented entirely in SQL on server side.

Take the following statement as example:

split (UPDATE sakila.inventory SET last_update = last_update + INTERVAL 6 HOUR)
  pass;

It yields with (roughly) the following statements:

UPDATE sakila.inventory SET last_update = last_update + INTERVAL 6 HOUR WHERE ((((`inventory`.`inventory_id` > '1')) OR ((`inventory`.`inventory_id` = '1'))) AND (((`inventory`.`inventory_id` < '1000')) OR ((`inventory`.`inventory_id` = '1000'))));
UPDATE sakila.inventory SET last_update = last_update + INTERVAL 6 HOUR WHERE ((((`inventory`.`inventory_id` > '1000'))) AND (((`inventory`.`inventory_id` < '2000')) OR ((`inventory`.`inventory_id` = '2000'))));
UPDATE sakila.inventory SET last_update = last_update + INTERVAL 6 HOUR WHERE ((((`inventory`.`inventory_id` > '2000'))) AND (((`inventory`.`inventory_id` < '3000')) OR ((`inventory`.`inventory_id` = '3000'))));
UPDATE sakila.inventory SET last_update = last_update + INTERVAL 6 HOUR WHERE ((((`inventory`.`inventory_id` > '3000'))) AND (((`inventory`.`inventory_id` < '4000')) OR ((`inventory`.`inventory_id` = '4000'))));
UPDATE sakila.inventory SET last_update = last_update + INTERVAL 6 HOUR WHERE ((((`inventory`.`inventory_id` > '4000'))) AND (((`inventory`.`inventory_id` < '4581')) OR ((`inventory`.`inventory_id` = '4581'))));

(I say “roughly” because internally there are user defined variables at play, but for convenience, I verbose the actual values as constants.)

How does that work?

common_schema works on server side. There is no Perl script or anything. It must therefore use server-side operations to:

  • Identify table to be split
  • Analyze the table in the first place, deciding how to split it
  • Analyze the query, deciding on how to rewrite it
  • Split the table (logically) into unique and distinct chunks
  • Work out the query on each such chunk

Following is an internal look at how common_schema does all the above. Continue reading » “How common_schema split()s tables – internals”

Table split(…) for the masses

(pun intended)

common_schema‘s new split statement (see release announcement) auto-splits complex queries over large tables into smaller ones: instead of issuing one huge query, split breaks one’s query into smaller queries, each working on a different set of rows (a chunk).

Thus, it is possible to avoid holding locks for long times, allowing for smaller transactions. It also makes for breathing space for the RDBMS, at times boosting operation speed, and at times prolonging operation speed at will.

In this post I show how split exposes itself to the user, should the user wish so.

split can manage queries of the following forms:

  • DELETE FROM table_name [WHERE]…
  • DELETE FROM table_name USING <multi table syntax> [WHERE]…
  • UPDATE table_name SET … [WHERE]…
  • UPDATE <multiple tables> SET … [WHERE]…
  • INSERT INTO some_table SELECT … FROM <single or multiple tables> [WHERE]…
  • REPLACE INTO some_table SELECT … FROM <single or multiple tables> [WHERE]…
  • SELECT … FROM <multiple tables> [WHERE]…

The latter being a non-obvious one at first sight.

Basically, it’ automatic

You just say:

split (UPDATE sakila.inventory SET last_update = last_update + INTERVAL 6 HOUR)
  throttle 2;

And split identifies sakila.inventory as the table which needs to be split, and injects appropriate conditions so as to work on a subset of the rows, in multiple steps.

By the way, here’s how to execute a QueryScript code like the above. Continue reading » “Table split(…) for the masses”

common_schema 1.1 released: split(), try-catch, killall(), profiling

I’m very happy to announce the release of common_schema, version 1.1 (revision 300).

This version boasts with compelling new features: innovative QueryScript syntax, libraries, views which add to your skills as a DBA, making some maintenance and management tasks a breeze.

  • QueryScript, split statement: automagically break long queries into smaller chunks, avoid long locks and reduce query/transaction overhead
  • QueryScript, try-catch statement: just try { something; } catch { act_on_error; }.
  • killall(): quickly kill connections based on grantee/user/host information.
  • profiling/profiling_last: utility views to assist in query profiling diagnostics
  • 1 size fits all: a single installer which auto-recognizes available server features and enables respective common_schema features accordingly.
  • QueryScript performance boost
  • much much more…

Not familiar with common_schema? It allows you to do stuff on server side, by selecting from views, calling upon useful routines or writing easy-to-manage scripts.

I’m suggesting that common_schema should be a really-should-have tool to accompany your MySQL install. Did I say “tool”? It’s merely a schema. But it makes for a great framework:

In High Performance MySQL, 3rd edition, Baron Schwartz describes common_schema:

The common_schema is to MySQL as jQuery is to javaScript

Reviewing highlights for version 1.1:

QueryScript

QueryScript is a scripting language. It sees some major improvements here. I’ve made some speed boosts by avoiding using temporary tables, and by using string parsing instead.

Without doubt the two most handy statements added to QueryScript are: Continue reading » “common_schema 1.1 released: split(), try-catch, killall(), profiling”

SQL: selecting top N records per group, another solution

A while back I presented SQL: selecting top N records per group, a “give me the top 5 countries in each continent” type of query, and which used an external numbers table and a lot of tedious casting.

Here’s another solution I came up with (*). Still using GROUP_CONCAT (how else?), but no external table and no casting. The query outputs the largest 5 countries (by surface area) per continent.

SELECT
  Continent,
  Name,
  SurfaceArea,
  Population
FROM
  world.Country,
  (
    SELECT 
      GROUP_CONCAT(top_codes_per_group) AS top_codes
    FROM
      (
        SELECT 
          SUBSTRING_INDEX(GROUP_CONCAT(Code ORDER BY SurfaceArea DESC), ',', 5) AS top_codes_per_group
        FROM
          world.Country
        GROUP BY
          Continent
      ) s_top_codes_per_group
  ) s_top_codes
WHERE
  FIND_IN_SET(Code, top_codes)
ORDER BY
  Continent,
  SurfaceArea DESC
;

+---------------+----------------------------------------------+-------------+------------+
| Continent     | Name                                         | SurfaceArea | Population |
+---------------+----------------------------------------------+-------------+------------+
| Asia          | China                                        |  9572900.00 | 1277558000 |
| Asia          | India                                        |  3287263.00 | 1013662000 |
| Asia          | Kazakstan                                    |  2724900.00 |   16223000 |
| Asia          | Saudi Arabia                                 |  2149690.00 |   21607000 |
| Asia          | Indonesia                                    |  1904569.00 |  212107000 |
| Europe        | Russian Federation                           | 17075400.00 |  146934000 |
| Europe        | Ukraine                                      |   603700.00 |   50456000 |
| Europe        | France                                       |   551500.00 |   59225700 |
| Europe        | Spain                                        |   505992.00 |   39441700 |
| Europe        | Sweden                                       |   449964.00 |    8861400 |
| North America | Canada                                       |  9970610.00 |   31147000 |
| North America | United States                                |  9363520.00 |  278357000 |
| North America | Greenland                                    |  2166090.00 |      56000 |
| North America | Mexico                                       |  1958201.00 |   98881000 |
| North America | Nicaragua                                    |   130000.00 |    5074000 |
| Africa        | Sudan                                        |  2505813.00 |   29490000 |
| Africa        | Algeria                                      |  2381741.00 |   31471000 |
| Africa        | Congo, The Democratic Republic of the        |  2344858.00 |   51654000 |
| Africa        | Libyan Arab Jamahiriya                       |  1759540.00 |    5605000 |
| Africa        | Chad                                         |  1284000.00 |    7651000 |
| Oceania       | Australia                                    |  7741220.00 |   18886000 |
| Oceania       | Papua New Guinea                             |   462840.00 |    4807000 |
| Oceania       | New Zealand                                  |   270534.00 |    3862000 |
| Oceania       | Solomon Islands                              |    28896.00 |     444000 |
| Oceania       | New Caledonia                                |    18575.00 |     214000 |
| Antarctica    | Antarctica                                   | 13120000.00 |          0 |
| Antarctica    | French Southern territories                  |     7780.00 |          0 |
| Antarctica    | South Georgia and the South Sandwich Islands |     3903.00 |          0 |
| Antarctica    | Heard Island and McDonald Islands            |      359.00 |          0 |
| Antarctica    | Bouvet Island                                |       59.00 |          0 |
| South America | Brazil                                       |  8547403.00 |  170115000 |
| South America | Argentina                                    |  2780400.00 |   37032000 |
| South America | Peru                                         |  1285216.00 |   25662000 |
| South America | Colombia                                     |  1138914.00 |   42321000 |
| South America | Bolivia                                      |  1098581.00 |    8329000 |
+---------------+----------------------------------------------+-------------+------------+

In bold are the conditions by which we nominate our selected rows (condition is SurfaceArea DESC, number of rows is 5, so 5 largest countries).

Continue reading » “SQL: selecting top N records per group, another solution”

Ways to export MySQL result set to file on client side

Problem: you wish to write table data to file, but you wish to do so on client side.

SELECT … INTO OUTFILE writes the file on server. What are your options on client?

1. mysql client

If you have direct access from your client machine to your DB server machine, and can connect via mysql client, you get a very customizable file write:

bash$ mysql -h db_host --execute "SELECT * FROM my_table" > /tmp/output.txt

The above writes fancy tables, so you probably want to:

bash$ mysql -h db_host --execute "SELECT * FROM my_table" --bat > /tmp/output.txt

Also try:

bash$ mysql -h db_host --execute "SELECT * FROM my_table" --silent --raw > /tmp/output.txt

To turn off headers.

More options include the –xml or –html formats. Continue reading » “Ways to export MySQL result set to file on client side”

Things that can’t (and some that can) be done from within a MySQL stored routine

I’m doing a lot of stored routine programming lately, working on common_schema. I’m in particular touching at the extremes of abilities. Some things just can’t be done from within a stored routine. Here’s a list of can’t be done:

  • Cursor for SHOW statements: can’t be done — this is explicitly blocked from operating (it once used to work).
  • Get detailed error information on exceptions: apparently 5.6 has support for this. 5.1 and 5.5 do not.
  • Change binlog_format: this is obvious, if you think about it. binlog_format dictates how the routine itself is replicated in the first place.
  • Set sql_log_bin. Again, this makes sense.
  • Work out different results depending on current machine. For example, you can’t have a routine that returns with ‘master’ on the master and with ‘slave’ on the slave. That is, not under any condition. Consider: if Row Based Replication is used, you don’t actually have a routine executing on the slave. I’m happy to be proven wrong on this.
  • Know what database was in use by calling code. The routine executes within the context of the database where it is defined. But you can’t tell what database was in use just a couple milliseconds before.
  • Likewise, know what sql_mode was in use by calling code. Stored routines have their own sql_mode – the one they were created with. No way to check up on the calling stack.
  • And you can’t USE another database (database as in schema). USE is a client command.
  • Reconnect after failure (kind of obvious, isn’t it?)
  • Connect to other servers (not so obvious to SQL Server DBAs). You can’t issue queries on other servers. Bummer.
  • Shutdown the server
  • Fork (you’re in a connection, you can’t issue a new connection from your own connection)

Continue reading » “Things that can’t (and some that can) be done from within a MySQL stored routine”