gh-ost 1.0.17: Hooks, Sub-second lag control, Amazon RDS and more

gh-ost version 1.0.17 is now released, with various additions and fixes. Here are some notes of interest:

Hooks

gh-ost now supports hooks. These are your own executables that gh-ost will invoke at particular points of interest (validation pass, about to cut-over, success, failure, status, etc.)

gh-ost will set various environment variables for your executables to pick up, passing along such information as migrated/ghost table name, elapsed time, processed rows, migrated host etc.

Sub-second lag control

At GitHub we’re very strict about replication lag. We keep it well under 1 second at most times. gh-ost can now identify sub-second lag on replicas (well, you need to supply with the right query). Our current production migrations are set by default with --max-lag-millis=500 or less, and our most intensive migrations keep replication lag well below 1sec or even below 500ms

No SUPER

The SUPER privilege is required to set global binlog_format='ROW' and for STOP SLAVE; START SLAVE;

If you know your replica has RBR, you can pass --assume-rbr and skips those steps.

RDS

Hooks + No Super = RDS, as seems to be the case. For --test-on-replica you will need to supply your own gh-ost-on-stop-replication hook, to stop your RDS replica at cut-over phase. See this tracking issue Continue reading » “gh-ost 1.0.17: Hooks, Sub-second lag control, Amazon RDS and more”

Baffling 5.7 global/status variables issues, unclean migration path

MySQL 5.7 introduces a change in the way we query for global variables and status variables: the INFORMATION_SCHEMA.(GLOBAL|SESSION)_(VARIABLES|STATUS) tables are now deprecated and empty. Instead, we are to use the respective performance_schema.(global|session)_(variables|status) tables.

But the change goes farther than that; there is also a security change. Oracle created a pitfall of 2 changes at the same time:

  1. Variables/status moved to a different table
  2. Privileges required on said table

As an example, my non-root user gets:

mysql> show session variables like 'tx_isolation';
ERROR 1142 (42000): SELECT command denied to user 'normal_user'@'my_host' for table 'session_variables'

Who gets affected by this? Nearly everyone and everything.

  • Your Nagios will not be able to read status variables
  • Your ORM will not be able to determine session variables
  • Your replication user will fail connecting (see this post by Giuseppe)
  • And most everyone else.

The problem with the above is that involves two unrelated changes to your setup, which are not entirely simple to coordinate:

  1. Change your app code to choose the correct schema (information_schema vs. performance_schema)
  2. GRANT the permissions on your database

Perhaps at this point you still do not consider this to be a problem. You may be thinking: well, let’s first prepare by creating the GRANTs, and once that is in place, we can, at our leisure, modify the code.

Not so fast. Can you really that simply create those GRANTs? Continue reading » “Baffling 5.7 global/status variables issues, unclean migration path”

common_schema: 1.3: security goodies, parameterized split(), json-to-xml, query checksum

common_schema 1.3 is released and is available for download. New and noteworthy in this version:

  • Parameterized split(): take further control over huge transactions by breaking them down into smaller chunks, now manually tunable if needed
  • duplicate_grantee(): copy+paste existing accounts along with their full set of privileges
  • similar_grants: find which accounts share the exact same set of privileges (i.e. have the same role)
  • json_to_xml(): translate any valid JSON object into its equivalent XML form
  • extract_json_value(): use XPath notation to extract info from JSON data, just as you would from XML
  • query_checksum(): given a query, calculate a checksum on the result set
  • random_hash(): get a 40 hexadecimal digits random hash, using a reasonably large changing input

Let’s take a closer look at the above:

Parameterized split()

split takes your bulk query and automagically breaks it down into smaller pieces. So instead of one huge UPDATE or DELETE or INSERT..SELECT transaction, you get many smaller transactions, each with smaller impact on I/O, locks, CPU.

As of 1.3, split() gets more exposed: you can have some control on its execution, and you also get a lot of very interesting info during operation.

Here’s an example of split() control:

set @script := "
  split({start:7015, step:2000} : UPDATE sakila.rental SET return_date = return_date + INTERVAL 1 DAY) 
    throttle 1;
";
call common_schema.run(@script);

In the above we choose a split size of 2,000 rows at a time; but we also choose to only start with 7015, skipping all rows prior to that value. Just what is that value? It depends on the splitting key (and see next example for just that); but in this table we can safely assume this is the rental_id PRIMARY KEY of the table.

You don’t have to use these control parameters. But they can save you some time and effort. Continue reading » “common_schema: 1.3: security goodies, parameterized split(), json-to-xml, query checksum”

State of InnDB Online DDL in MySQL 5.6.9-RC (good news included)

5.6.9-RC is out, and I was curious to see how the online DDL has improved since my 5.6.8 review. I also owe James Day this review, since he came up with results inconsistent with my own.

We both agreed the dataset I was using was too small, but I got similar results even on larger scale. Then some time passed, and 5.6.9 was announced.

So for the 5.6.9 test I took one of my real tables on production. It is not extremely large: it’s a ~ 300MB .ibd file, in the following format:

mysql> show create table tbl \G

CREATE TABLE `tbl` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `a` varchar(255) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `w` smallint(11) NOT NULL DEFAULT '0',
  `d` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `icount` smallint(5) unsigned NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`) KEY_BLOCK_SIZE=8,
  UNIQUE KEY `u_idx` (`a`,`w`,`d`) KEY_BLOCK_SIZE=8,
  KEY `d` (`d`) KEY_BLOCK_SIZE=8
) ENGINE=InnoDB AUTO_INCREMENT=16960441 DEFAULT CHARSET=latin1 ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=16

Got some 2.5M rows in the table; desktop machine, 64 bit Linux, mysqlsandbox.

I have crossed several DDL statements with several DML statements. The DDL statements in this test are (ALTER TABLE…): Continue reading » “State of InnDB Online DDL in MySQL 5.6.9-RC (good news included)”

State of InnDB Online DDL in MySQL 5.6.8-RC

5.6.8-rc is out, and so I’m following up on InnoDB’s online DDL new feature: the ability to SELECT, INSERT, DELETE, UPDATE a table even while an ALTER TABLE is executing on same table.

The brief summary

Not as advertised; many things can’t be done.

The longer review

I’m using 5.6.8-rc 64bit binary distribution for Linux, installed via mysqlsandbox. My hardware is irrelevant, but the fact I’m testing on my laptop assists me in that ALTER TABLE operations take a while, so that I’m able to easily type commands in two terminals and have the time to watch them being executed. Query cache is disabled. Continue reading » “State of InnDB Online DDL in MySQL 5.6.8-RC”

MySQL 5.6 RC: further thoughts and questions

Here are a few questions I came up with while experimenting with MySQL 5.6.7 & 5.6.8. They are the impressions of a first-time encounter with 5.6, which is a single opportunity for a person to point out the things that strike as odd.

Bugs-wise, just submitted another crashing bug for 5.6.8. I’m just one man, so I extrapolate to realize there is still much work to be done.

The below list does not necessarily make for a bug list; mostly things that puzzle me. I hope it can stir some additional thinking.

  1. Transportable tablespace: what’s the difference between FLUSH TABLES my_table WITH READ LOCK and FLUSH TABLES my_table FOR EXPORT? Both create the .cfg file, and both seem to operate just as well. One document says READ LOCK, another says FOR EXPORT.
  2. What’s the ALGORITHM=? flag in online ALTER TABLE? Apparently one can write to altered table even on ALGORITHM=COPY. There’s not enough documentation to explain.
  3. How come there’s not a single example of online InnoDB DDL in official docs?
  4. Why the inconsistency of putting ALGORITHM=…, LOCK=… in between commas, as opposed to other flags/commands not between commas? For example: ALTER TABLE my_table ADD COLUMN i INT, ALGORITHM=COPY, LOCK=SHARED, ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=4
  5. Why would anyone care about FULLTEXT search word proximity by bytes? Typically, one would want proximity by words. I can find the excuse for proximity by characters. By bytes? A user is not interested in the low level representation of the text!
  6. Could we get a distinct tablespace for the mysql internal InnoDB tables? (I understand there’s a separate tablespace for UNDO logs)
  7. Why the need to configure gtid_mode=ON as well as disable-gtid-unsafe-statements so as to enable GTID replication? If only the first is set, an error is produced upon CHANGE MASTER TO MASTER_AUTO_POSITION=1
  8. And when said error is produced, why does it not mention disable-gtid-unsafe-statements, and instead read out a cryptic message? Also note this post by Giuseppe Maxia.

Further experiments with MySQL 5.6.7-RC: submit your bugs

Here’s the background: I’m testing many features of MySQL 5.6.7-RC due to two reasons:

  • I’m verifying my common_schema installs and works properly on 5.6
  • I promised I would present a 45 minute “what’s new in MySQL 5.6” seminar in the upcoming OracleWeek (Israel)

In the case of common_schema, I have managed to find one weird bug (a behavior regression from 5.5) and one server-crashing bug, by merely running the project’s tests, known to pass on 5.1 and 5.5 (and not utilizing any 5.6 features).

In the case of my presentation, I’m getting acquainted with new syntax and functionality, and am getting unexpected results. I’ve hit replication issues and locking issues in online DDL.

If I am able to find these bugs within a few hours of experimenting & testing, then it is safe to assume we can extrapolate to many more bugs. This is not surprising; 5.6 boasts some 120 new features (I didn’t actually count them). Of course they would introduce new bugs.

The moral is this: if you’re interested in 5.6 as I am – and there’s a lot to wait for – please consider experimenting with it, and report as many bugs to bugs.mysql.com as possible. Reporting bugs is the user’s authoritative way of improving an open source product and pushing towards a stable release. Provide your feedback today!

InnoDB DDL: kudos to quick responders on bugs.mysql.com

Continuing my experiments with 5.6 InnoDB online DDL, a bug which I’ve opened, and another which I commented on were quickly answered and explained by the Oracle/MySQL team.

On both accounts I’m happy to acknowledge the issue is resolved; in both cases I failed to produce a real bug scenario. Good lesson. Kudos for quick and informative responses!

What’s left of my experiment, then? Still a lot to check.

I am mainly still confused with which operations exactly can use LOCK=NONE (allowing for updated to table while ALTERing). So far I am only able to produce ALTERs with LOCK=SHARED, meaning table is readable, but cannot be updated.

I will want to test speeds. I’ve so far been content with slow response times for queries over altered tables. How well will that endure under heavy load?

Experimenting with 5.6 InnoDB Online DDL (bugs included)

MySQL 5.6 offers the groundbreaking online DDL operations for InnoDB. Most common use cases will enjoy this feature, and the need for online alter table scripts will decrease. This is a killer feature!

I’ve put this new feature to the usability test. How did it go? Not too well, I’m afraid.

[Updates to this text inline], also see this followup.

sakila & DDL

sakila is still a very useful database. I say “still” because it is not very large, and computing power is getting stronger; yet on my laptop some operations can still take many seconds to complete, which is just fine for my tests.

Sakila tables are mostly InnoDB, and rental being the largest, I do:

node1 (sakila) > alter table sakila.rental engine=InnoDB;
Query OK, 16044 rows affected (6.94 sec)
Records: 16044  Duplicates: 0  Warnings: 0

So what can be executed during these 6.94 seconds? In a second terminal, I try the following: Continue reading » “Experimenting with 5.6 InnoDB Online DDL (bugs included)”

Thoughts on MySQL 5.6 new replication features

After playing a little bit with MySQL 5.6 (RC), and following closely on Giuseppe’s MySQL 5.6 replication gotchas (and bugs), I was having some thoughts.

These are shared for a few reasons:

  • Maybe I didn’t understand it well, and someone could correct me
  • Or I understood it well, and my input could be of service to the developers
  • Or it could be of service to the users

InnoDB tables in mysql schema

The introduction of InnoDB tables in mysql makes for crash-safe replication information: the exact replication position (master log file+pos, relay log file+pos etc.) is updated on InnoDB tables; with innodb_flush_logs_at_trx_commit=1 this means replication status is durable and consistent with server data. This is great news!

However, the introduction of InnoDB tables to the mysql schema also breaks some common usage on installation and setup of MySQL servers. You can’t just drop your ib_data1 file upon dump+restore, since it also contains internal data. Giuseppe outlines the workaround for that.

I was thinking: would it be possible to have a completely different tablespace for MySQL’s internal InnoDB tables? That could be a single tablespace file (who cares about file-per-table on a few internal tables). And I’m throwing an idea without being intimate with the internals: you know how it is possible to span the shared tablespace across multiple files, as in: Continue reading » “Thoughts on MySQL 5.6 new replication features”