Three wishes for a new year

(Almost) another new year by Jewish calendar. What do I wish for the following year?

  1. World peace
  2. Good health to all
  3. Relaxed GTID constraints

I’m still not using GTID, and still see operational issues with working with GTID. As a latest example, our new schema migration solution, gh-ost, allows us to test migrations in production, on replicas. The GTID catch? gh-ost has to write something to the binary log. Thus, it “corrupts” the replica with a bogus GTID entry that will never be met in another server, thus making said replica unsafe to promote. We can work around this, but…

I understand the idea and need for the Executed GTID Set. It will certainly come in handy with multi-writer InnoDB Cluster. However for most use cases GTID poses a burden. The reason is that our topologies are imperfect, and we as humans are imperfect, and operations are most certainly imperfect. We may wish to operate on a replica: test something, by intention or mistake. We may wish to use a subchain as the seed for a new cluster split. We may wish to be able to write to downstream replicas. We may use a 3rd party tool that issues a flush tables with read lock without disabling sql_log_bin. Things just happen.

For that, I would like to suggest GTID control levels, such as:

  1. Strict: same as Oracle’s existing implementation. Executed sets, purged sets, whatnot.
  2. Last executed: a mode where the only thing that counts is the last executed GTID value. If I repoint replica, all it needs to check is “hey this is my last executed GTID entry, give me the coordinates of yours. And, no, I don’t care about comparing executed and purged sets, I will trust you and keep running from that point on”
  3. Declarative: GTIDs are generated, are visible in each and every binary log entry, but are completely ignored.

I realize Oracle MySQL GTID is out for some over 3 years now, but I’m sorry – I still have reservations and see use cases where I fear it will not serve me right.

How about my previous years wishes? World peace and good health never came through, however:

  • My 2015 wish for “decent, operations friendly built in online table refactoring” was unmet, however gh-ost is a thing now and exceeds my expectations. No, really. Please come see Tom & myself present gh-ost and how it changed our migration paradigm.
  • My 2012 wish for “decent, long waited for, implementation of Window Functions (aka Analytic Functions) for MySQL” was met by MariaDB’s window functions.
    Not strictly Window Functions, but Oracle MySQL 8.0 will support CTE (hierarchial/recursive), worth a mention.

See you in Amsterdam!

gh-ost 1.0.17: Hooks, Sub-second lag control, Amazon RDS and more

gh-ost version 1.0.17 is now released, with various additions and fixes. Here are some notes of interest:

Hooks

gh-ost now supports hooks. These are your own executables that gh-ost will invoke at particular points of interest (validation pass, about to cut-over, success, failure, status, etc.)

gh-ost will set various environment variables for your executables to pick up, passing along such information as migrated/ghost table name, elapsed time, processed rows, migrated host etc.

Sub-second lag control

At GitHub we’re very strict about replication lag. We keep it well under 1 second at most times. gh-ost can now identify sub-second lag on replicas (well, you need to supply with the right query). Our current production migrations are set by default with --max-lag-millis=500 or less, and our most intensive migrations keep replication lag well below 1sec or even below 500ms

No SUPER

The SUPER privilege is required to set global binlog_format='ROW' and for STOP SLAVE; START SLAVE;

If you know your replica has RBR, you can pass --assume-rbr and skips those steps.

RDS

Hooks + No Super = RDS, as seems to be the case. For --test-on-replica you will need to supply your own gh-ost-on-stop-replication hook, to stop your RDS replica at cut-over phase. See this tracking issue Continue reading » “gh-ost 1.0.17: Hooks, Sub-second lag control, Amazon RDS and more”