Pseudo GTID, ASCENDING

Pseudo GTID is a technique where we inject Globally Unique entries into MySQL, gaining GTID abilities without using GTID. It is supported by orchestrator and described in more detail here, here and here.

Quick recap: we can join two slaves to replicate from one another even if they never were in parent-child relationship, based on our uniquely identifiable entries which can be found in the slaves’ binary logs or relay logs. Having Pseudo-GTID injected and controlled by us allows us to optimize failovers into quick operations, especially where a large number of server is involved.

Ascending Pseudo-GTID further speeds up this process for delayed/lagging slaves.

Recap, visualized

(but do look at the presentation):

pseudo-gtid-quick

  1. Find last pseudo GTID in slave’s binary log (or last applied one in relay log)
  2. Search for exact match on new master’s binary logs
  3. Fast forward both through successive identical statements until end of slave’s applied entries is reached
  4. Point slave into cursor position on master

What happens if the slave we wish to reconnect is lagging? Or perhaps it is a delayed replica, set to run 24 hours behind its master?

The naive approach would expand bullet #2 into:

  • Search for exact match on master’s last binary logs
  • Unfound? Move on to previous (older) binary log on master
  • Repeat

The last Pseudo-GTID executed by the slave was issued by the master over 24 hours ago. Suppose the master generates one binary log per hour. This means we would need to full-scan 24 binary logs of the master where the entry will not be found; to only be matched in the 25th binary log (it’s an off-by-one problem, don’t hold the exact number against me).

Ascending Pseudo GTID

Since we control the generation of Pseudo-GTID, and since we control the search for Pseudo-GTID, we are free to choose the form of Pseudo-GTID entries. We recently switched into using Ascending Pseudo-GTID entries, and this works like a charm. Consider these Pseudo-GTID entries: Continue reading » “Pseudo GTID, ASCENDING”

What makes a MySQL server failure/recovery case?

Or: How do you reach the conclusion your MySQL master/intermediate-master is dead and must be recovered?

This is an attempt at making a holistic diagnosis of our replication topologies. The aim is to cover obvious and not-so-obvious crash scenarios, and to be able to act accordingly and heal the topology.

At Booking.com we are dealing with very large amounts of MySQL servers. We have many topologies, and many servers in each topology. See past numbers to get a feel for it. At these numbers failures happen frequently. Typically we would see normal slaves failing, but occasionally — and far more frequently than we would like to be paged for — an intermediate master or a master would crash. But our current (and ever in transition) setup also include SANs, DNS records, VIPs, any of which can fail and bring down our topologies.

Tackling issues of monitoring, disaster analysis and recovery processes, I feel safe to claim the following statements:

  • The fact your monitoring tool cannot access your database does not mean your database has failed.
  • The fact your monitoring tool can access your database does not mean your database is available.
  • The fact your database master is unwell does not mean you should fail over.
  • The fact your database master is alive and well does not mean you should not fail over.

Bummer. Let’s review a simplified topology with a few failure scenarios. Some of these scenarios you will find familiar. Some others may be caused by setups you’re not using. I would love to say I’ve seen it all but the more I see the more I know how strange things can become. Continue reading » “What makes a MySQL server failure/recovery case?”

Orchestrator visual cheatsheet

Orchestrator is growing. Supporting automatic detection of topologies, simple refactoring of topology trees, complex refactoring via Pseudo-GTID, failure detection and automated discovery, it is becoming larger and larger by the day.

One of the problems with growign projects is hwo to properly document them. Orchestrator enjoys a comprehensive manual, but as these get more and more detailed, it is becoming difficult to get oriented and pointed in the right direction. I’ve done my best to advise the simple use cases throughout the manual.

One thing that is difficult to put into words is topologies. Explaining “failover of an intermediate master S1 that has S2,…,Sn slaves onto a sibling of S1 provided that…” is too verbose. So here’s a quick visual cheatsheet for (current) topology refactoring commands. Refactoring commands are a mere subset of overall orchestrator commands, but they’re great to play with and perfect for visualization.

The “move” and related commands use normal replication commands (STOP SLAVE; CHANGE MASTER TO; START SLAVE UNTIL;”…).

The “match” and related commands utilize Pseudo-GTID and use more elaborate MySQL commands (SHOW BINLOG EVENTS, SHOW RELAYLOG EVENTS).

So without further ado, here’s what each command does (and do run “orchestrator” from the command line to get a man-like explanation of everything, or just go to the manual). Continue reading » “Orchestrator visual cheatsheet”

Speaking at Percona Live: Pseudo GTID and Easy Replication Topology Management

In two weeks time I will be presenting Pseudo GTID and Easy Replication Topology Management at Percona Live. From the time I submitted the proposal a LOT has been developed, experimented, deployed and used with both Pseudo GTID and with orchestrator. In my talk I will:

  • Suggest that you skip the “to GTID or not to GTID” question and go for the lightweight Pseudo GTID
  • Show how Pseudo GTID is used in production to recover from various replication failures and server crashes
  • Do an outrageous demonstration
  • Tell you about 50,000 successful experiments and tests done in production
  • Show off orchestrator and its support for Pseudo GTID, including automated crash analysis and recovery mechanism.

I will further show how the orchestrator tooling makes for a less restrictive, more performant, less locking, non-intrusive, trusted and lightweight replication topology management solution. Continue reading » “Speaking at Percona Live: Pseudo GTID and Easy Replication Topology Management”

Speaking at FOSDEM: Pseudo GTID and easy replication management

This coming Sunday I’ll be presenting Pseudo GTID and easy replication management at FOSDEM, Brussels.

There’s been a lot of development on Pseudo GTID these last few weeks. In this talk I’ll show you how you can use Pseudo GTID instead of “normal” GTID to easily repoint your slaves, recover from intermediate master failure, promote slaves to masters as well as emply crash safe replication without crash safe replication.

Moreover, I will show how you can achieve all the above with less constraints than GTID, and for bulk operations — with less overhead and in shorter time. You will also see that Pseudo GTID is a non intrusive solution which does not require you to change anything in your topologies.

Moral: I’ll try and convince you to drop your plans for using GTID in favor of Pseudo GTID.

We will be employing Pseudo GTID as the basis for high availability and failover at Booking.com on many topologies, and as a safety mechanism in other topologies where we will employ Binlog servers.

Pseudo gtid & easy replication topology management from Shlomi Noach

Reading RBR binary logs with pt-query-digest

For purposes of auditing anything that goes on our servers we’re looking to parse the binary logs of all servers (masters), as with “Anemomaster“. With Row Based Replication this is problematic since pt-query-digest does not support parsing RBR binary logs (true for 2.2.12, latest at this time).

I’ve written a simple script that translates RBR logs to SBR-like logs, with a little bit of cheating. My interest is that pt-query-digest is able to capture and count the queries, nothing else. By doing some minimal text manipulation on the binary log I’m able to now feed it to pt-query-digest which seems to be happy.

The script of course does not parse the binary log directly; furthermore, it requires the binary log to be extracted via:

mysqlbinlog --verbose --base64-output=DECODE-ROWS your-mysql-binlog-filemame.000001

The above adds the interpretation of the RBR entires in the form of (unconventional) statements, commented, and strips out the cryptic RBR text. All that is left is to do a little manipulation on entry headers and uncomment the interpreted queries.

The script can be found in my gist repositories. Current version is as follows: Continue reading » “Reading RBR binary logs with pt-query-digest”

Orchestrator 1.2.9 GA released

Orchestrator 1.2.9 GA has been released. Noteworthy:

  • Added “ReadOnly” (true/false) configuration param. You can have orchestrator completely read-only
  • Added “AuthenticationMethod”: “multi”: works like BasicAuth (your normal HTTP user+password) only it also accepts the special user called “readonly”, which, surprise, can only view and not modify
  • Centralized/serialized most backend database writes (with hundreds/thousands monitored servers it was possible or probable that high concurrency led to too-many-connections openned on the backend database).
  • Fixed evil evil bug that would skip some checks if binary logs were not enabled
  • Better hostname resolve (now also asking MySQL server to resolve hostname; resolving is cached)
  • Pseudo-GTID (read here, here, here) support now considered stable (apart from being tested it has already been put to practice multiple times in production at Outbrain, in different planned and unplanned crash scenarios)

I continue developing orchestrator as free and open source at my new employer, Booking.com.

 

 

Semi-automatic slave/master promotion via Pseudo GTID

Orchestrator release 1.2.7-beta now supports semi-automatic slave promotion to master upon master death, via Pseudo GTID.

orchestrator-make-master-highlighted

When the master is dead, orchestrator automatically picks the most up-to-date slaves and marks them as “Master candidates”. It allows a /api/make-master call on such a slave (S), in which case it uses Pseudo GTID to enslave its siblings, and set S as read-only = 0. All we need to do is click the “Make master” button. Continue reading » “Semi-automatic slave/master promotion via Pseudo GTID”

Refactoring replication topologies with Pseudo GTID: a visual tour

Orchestrator 1.2.1-beta supports Pseudo GTID (read announcement): a means to refactor the replication topology and connect slaves even without direct relationship; even across failed servers. This post illustrates two such scenarios and shows the visual way of mathcing/re-synching slaves.

Of course, orchestrator is not just a GUI tool; anything done with drag-and-drop is also done via web API (in fact, the drag-and-drop invoke the web API) as well as via command line. I’m mentioning this as this is the grounds for failover automation planned for the future.

Scenario 1: the master unexpectedly dies

The master crashes and cannot be contacted. All slaves are stopped as effect, but each in a different position. Some managed to salvage relay logs just before the master dies, some didn’t. In our scenario, all three slaves are at least caught up with the relay log (that is, whatever they managed to pull through the network, they already managed to execute). So they’re otherwise sitting idle waiting for something to happen. Well, something’s about to happen.

orchestrator-pseudo-gtid-dead-master

Note the green “Safe mode” button to the right. This means operation is through calculation of binary log files & positions with relation to one’s master. But the master is now dead, so let’s switch to adventurous mode; in this mode we can drag and drop slaves onto instances normally forbidden. At this stage the web interface allows us to drop a slave onto its sibling or any of its ancestors (including its very own parent, which is a means of reconnecting a slave with its parent). Anyhow:

orchestrator-pseudo-gtid-dead-master-pseudo-gtid-mode

We notice that orchestrator is already kind enough to say which slave is best candidate to be the new master (127.0.0.1:22990): this is the slave (or one of the slaves) with most up-to-date data. So we choose to take another server and make it a slave of 127.0.0.1:22990: Continue reading » “Refactoring replication topologies with Pseudo GTID: a visual tour”

Orchestrator 1.2.1 BETA: Pseudo GTID support, reconnect slaves even after master failure

orchestrator 1.2.1 BETA is released. This version supports Pseudo GTID, and provides one with powerful refactoring of one’s replication topologies, even across failed instances.

orchestrator-pseudo-gtid-dead-relay-master-begin-drag

Depicted: moving a slave up the topology even though its local master is inaccessible

Enabling Pseudo-GTID

You will need to:

  1. Inject a periodic unique entry onto your binary logs
  2. Configure orchestrator to recognize said entry.

Pseudo GTID injection example

We will use the event scheduler (must be enabled) to inject an entry every 10 seconds, recognized both in statement-based and row-based replication.

create database if not exists meta;

drop event if exists meta.create_pseudo_gtid_view_event;

delimiter ;;
create event if not exists
  meta.create_pseudo_gtid_view_event
  on schedule every 10 second starts current_timestamp
  on completion preserve
  enable
  do
    begin
      set @pseudo_gtid := uuid();
      set @_create_statement := concat('create or replace view meta.pseudo_gtid_view as select \'', @pseudo_gtid, '\' as pseudo_gtid_unique_val from dual');
      PREPARE st FROM @_create_statement;
      EXECUTE st;
      DEALLOCATE PREPARE st;
    end
;;

delimiter ;

set global event_scheduler := 1;

Make sure to enable event_scheduler in your my.cnf config file.

An entry in the binary logs would look like this: Continue reading » “Orchestrator 1.2.1 BETA: Pseudo GTID support, reconnect slaves even after master failure”