Pseudo GTID – code.openark.org — Blog by Shlomi Noach

State of automated recovery via Pseudo-GTID & Orchestrator @ Booking.com
https://shlomi-noach.github.io/blog/mysql/state-of-automated-recovery-via-pseudo-gtid-orchestrator-booking-com
Fri, 20 Nov 2015 09:41:13 +0000

This post sums up some of my work on MySQL resilience and high availability at Booking.com by presenting the current state of automated master and intermediate master recoveries via Pseudo-GTID & Orchestrator.

Booking.com uses many different MySQL topologies, of varying vendors, configurations and workloads: Oracle MySQL, MariaDB, statement based replication, row based replication, hybrid, OLTP, OLAP, GTID (few), no GTID (most), Binlog Servers, filters, hybrid of all the above.

Topology sizes vary from a single server to many-many-many. Our typical topology has a master in one datacenter, a bunch of slaves in the same DC, and a slave in another DC acting as an intermediate master to a further bunch of slaves in that DC. Something like this, give or take:

booking-topology-sample

However, as we build our third data center (with MySQL deployments mostly completed), the graph grows more complex.

Two high availability questions are:

  • What happens when an intermediate master dies? What of all its slaves?
  • What happens when the master dies? What of the entire topology?

This is not a technical drill-down into the solution, but rather an overview of the state. For more, please refer to recent presentations in September and April.

At this time we have:

  • Pseudo-GTID deployed on all chains
  • Pseudo-GTID based automated failover for intermediate masters on all chains
  • Pseudo-GTID based automated failover for masters on roughly 30% of the chains.
    • The remaining 70% of chains are set for manual failover using Pseudo-GTID.

Pseudo-GTID is in particular used for:

  • Salvaging slaves of a dead intermediate master
  • Correctly grouping and connecting slaves of a dead master
  • Routine refactoring of topologies. This includes:
    • Manual repointing of slaves for various operations (e.g. offloading slaves from a busy box)
    • Automated refactoring (for example, used by our automated upgrading script, which consults with orchestrator, upgrades, shuffles slaves around, updates the intermediate master, shuffles back…)
  • (In the works), failing over binlog reader apps that audit our binary logs.

Furthermore, Booking.com is also working on Binlog Servers:

  • These take production traffic and offload masters and intermediate masters
  • Often co-serve slaves using round-robin VIP, such that failure of one Binlog Server makes for simple slave replication self-recovery.
  • Are interleaved alongside standard replication
    • At this time we have no “pure” Binlog Server topology in production; we always have normal intermediate masters and slaves
  • This hybrid state makes for greater complexity:
    • Binlog Servers are not designed to participate in a game of changing masters/intermediate master, unless successors come from their own sub-topology, which is not the case today.
      • For example, a Binlog Server that replicates directly from the master, cannot be repointed to just any new master.
      • But can still hold valuable binary log entries that other slaves may not.
    • Are not actual MySQL servers, and therefore of course cannot be promoted to masters

Orchestrator & Pseudo-GTID make this hybrid topology resilient nonetheless:

  • Orchestrator understands the limitations on the hybrid topology and can salvage slaves of 1st tier Binlog Servers via Pseudo-GTID
  • In the case where the Binlog Servers were the most up to date slaves of a failed master, orchestrator knows to first move potential candidates under the Binlog Server and then extract them out again.
  • At this time Binlog Servers are still unstable. Pseudo-GTID allows us to comfortably test them on a large setup with reduced fear of losing slaves.

Otherwise orchestrator already understands pure Binlog Server topologies and can do master promotion. When pure Binlog Server topologies reach production, orchestrator will be there to watch over them.

Summary

To date, Pseudo-GTID has high scores in automated failovers of our topologies; orchestrator’s holistic approach makes for reliable diagnostics; together they reduce our dependency on specific servers & hardware, physical location, and the latency implied by SAN devices.

Orchestrator & Pseudo-GTID for binlog reader failover
https://shlomi-noach.github.io/blog/mysql/orchestrator-pseudo-gtid-for-binlog-reader-failover
Thu, 19 Nov 2015 08:52:16 +0000

One of our internal apps at Booking.com audits changes to our tables on various clusters. We used to use tungsten replicator, but have since migrated onto our own solution.

We have a binlog reader (uses open-replicator) running on a slave. It expects Row Based Replication, hence our slave runs with log-slave-updates, binlog-format=’ROW’, to translate from the master’s Statement Based Replication. The binlog reader reads what it needs to read, audits what it needs to audit, and we’re happy.

However what happens if that slave dies?

In such case we need to be able to point our binlog reader to another slave, and it needs to be able to pick up auditing from the same point.

This sounds an awful lot like slave repointing in case of master/intermediate master failure, and indeed the solutions are similar. However, our binlog reader is not a real MySQL server and does not understand replication. It does not really replicate; it just parses binary logs.

We’re also not using GTID. But we are using Pseudo-GTID. As it turns out, the failover solution is already built in by orchestrator, and this is how it goes:

Normal execution

Our binlog app reads entries from the binary log. Some are of interest for auditing purposes, some are not. An occasional Pseudo-GTID entry is found, and is stored to ZooKeeper, tagged as the “last seen and processed Pseudo-GTID”.

Upon slave failure

We recognize the death of a slave; we have other slaves in the pool; we pick another. Now we need to find the coordinates from which to carry on.

We read our “last seen and processed Pseudo-GTID”. Say it reads:

drop view if exists `meta`.`_pseudo_gtid_hint__asc:56373F17:00000000012B1C8B:50EC77A1`

We now issue:

$ orchestrator -c find-binlog-entry -i new.slave.fqdn.com --pattern='drop view if exists `meta`.`_pseudo_gtid_hint__asc:56373F17:00000000012B1C8B:50EC77A1`'

The output of this command is the binlog coordinates of that same entry as found in the new slave’s binlogs:

binlog.000148:43664433

Pseudo-GTID entries are only injected once every few seconds (5 in our case). Either:

  • We are OK to reprocess up to 5 seconds worth of data (and indeed we are, our mechanism is such that this merely overwrites our previous audit, no corruption happens)
  • Or our binlog reader also keeps track of the number of events since the last processed Pseudo-GTID entry, skipping the same amount of events after failing over.
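The second option can be sketched in a few lines of Python. This is an illustrative sketch, not Booking.com's actual reader code: all class and function names here are hypothetical, and binlog streams are modeled as plain lists of statements.

```python
# Checkpoint the last seen Pseudo-GTID plus the number of events processed
# since it; on failover, locate that entry on the new slave and skip the
# already-processed events that followed it.
PSEUDO_GTID_MARKER = "_pseudo_gtid_hint"

class BinlogAuditor:
    def __init__(self):
        self.last_pseudo_gtid = None   # last Pseudo-GTID statement seen
        self.events_since = 0          # events processed after it

    def process(self, event):
        if PSEUDO_GTID_MARKER in event:
            # New checkpoint: this is what we'd persist (e.g. to ZooKeeper).
            self.last_pseudo_gtid = event
            self.events_since = 0
        else:
            self.events_since += 1
            # ... audit the event here ...

    def checkpoint(self):
        return (self.last_pseudo_gtid, self.events_since)

def resume_position(events, checkpoint):
    """On the new slave: find the checkpointed Pseudo-GTID in its binlog
    stream, then skip the events we already processed on the dead slave."""
    pseudo_gtid, events_since = checkpoint
    idx = events.index(pseudo_gtid)   # exact textual match must exist
    return idx + 1 + events_since     # index of the first unprocessed event
```

The key property used here is the one described above: the Pseudo-GTID statement text is globally unique, so an exact string match on the new slave's stream identifies the same logical position.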

Planned failover

In case we plan to repoint our binlog reader to another slave, we can further use orchestrator’s power in making an exact correlation between the binlog positions of two slaves. This has always been within its power, but was only recently exposed as its own command. We can, at any stage:

$ sudo orchestrator -c correlate-binlog-pos -i current.instance.fqdn.com --binlog=binlog.002011:72656109 -d some.other.instance.fqdn.com

The output is the binlog coordinates in some.other.instance.fqdn.com that exactly correlate with binlog.002011:72656109 in current.instance.fqdn.com.

The case of failure of the binlog reader itself is also handled, but is not the subject of this blog post.

Orchestrator visual cheatsheet, TL;DR the “smart” way
https://shlomi-noach.github.io/blog/mysql/orchestrator-visual-cheatsheet-tldr-the-smart-way
Wed, 02 Sep 2015 07:14:05 +0000

Orchestrator is really growing. And the number of users (DBAs, sysadmins) using it is growing. Which gives me a lot of immediate feedback in the form of: “Look, there are just too many options to move slaves around! Which ones should we use?”

TL;DR look at the two visualized commands below

They are enough

The “smart” commands to end all commands

So all relocation commands are important, and give you fine-grained, pin-pointed control of the method of topology refactoring. However, most of the time you just want to move those servers around. Which is why there’s a new “smart” mode which supports these two commands, which you should be happy using:

  • relocate: move a single slave to another position
  • relocate-slaves: move all/some slaves of some server to another position.

What makes these commands smart? You can move slaves around from anywhere to anywhere. And orchestrator figures out the best execution path. If possible, it uses GTID. Not possible? Is Pseudo-GTID available? Great, use Pseudo-GTID. Oh, are there Binlog Servers involved? Really simple, use them. None of the above? Orchestrator will use “standard” binlog file:pos math (with limitations). Orchestrator will even figure out if multiple steps are necessary and will combine any of the above.
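As a rough sketch of that decision cascade — the real orchestrator logic is Go code weighing more factors, and the capability flags below are hypothetical names chosen for illustration:

```python
# Hedged sketch of the "smart" method selection described above.
# Each server is modeled as a dict of (hypothetical) capability flags.
def pick_relocation_method(slave, target):
    if slave.get("gtid") and target.get("gtid"):
        return "GTID"                  # best case: native GTID on both ends
    if slave.get("pseudo_gtid"):
        return "Pseudo-GTID"           # our injected unique entries
    if slave.get("binlog_server") or target.get("binlog_server"):
        return "binlog servers"        # simple, math-less repointing
    return "binlog file:pos math"      # standard fallback, with limitations
```

The point of the cascade is exactly what the paragraph above says: you ask for a relocation, and the tool — not you — picks the cheapest method the topology supports.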

So you don’t have to remember all the possible ways and options. The visual cheatsheet now boils down to these two:

orchestrator-cheatsheet-visualized-relocate

orchestrator-cheatsheet-visualized-relocate-slaves

Let’s take a slightly deeper look

relocate

Moves a single slave X from any point to replicate from some other server Z

  • As usual, orchestrator first confirms that X can replicate from Z (Z has log-slave-updates or is a binlog server; binlog format compatible, etc.)
  • With GTID/Pseudo-GTID, move from any point to any point
  • With binlog servers, move around the binlog server environment (at this point you are not using binlog servers, so ignore)
  • With normal replication, requires an “atomic” operation: either move the slave one level up, or make it replicate from a sibling.
  • You can relocate under the same master, effectively repointing the slave back to its existing position. This serves to re-resolve the master hostname, to reset relay logs, or to verify the slave is aligned with its master.
  • Or combination of the above

relocate-slaves

Moves multiple slaves of server X to replicate from some other server W

  • By default moves all slaves of X, where possible
  • Each slave is verified to be able to replicate from W. Those that can’t are left behind.
  • Can filter slave hostnames using a regular expression via --pattern=some?[reg]ex
  • Can relocate under same master, effectively repointing all slaves (see above explanation)
  • Can relocate below one of X’s own slaves. If Y is itself a slave of X and you execute:
    orchestrator -c relocate-slaves -i X -d Y
    then Y is excluded from the list of relocated slaves. This effectively means “make Y the local master of its current siblings”. Very cool stuff.
  • When binlog servers involved, simple math-less repointing takes place
  • When GTID involved, let MySQL/MariaDB (both supported) do the math on a per-server basis
  • When Pseudo-GTID involved, greatly optimize by dividing into equivalence classes and only doing the math on a representative of each class.
  • Or combination of the above
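The equivalence-class optimization in the list above can be sketched like so (illustrative Python; `expensive_match` stands in for the Pseudo-GTID binlog search, and all names are hypothetical):

```python
# Slaves stopped at identical binlog coordinates form one equivalence class;
# the expensive Pseudo-GTID match is done once per class, on a representative,
# and the result is applied to every member.
from collections import defaultdict

def group_by_coordinates(slaves):
    """slaves: mapping of slave name -> (binlog_file, position) it stopped at."""
    classes = defaultdict(list)
    for name, coords in slaves.items():
        classes[coords].append(name)
    return classes

def relocate_all(slaves, expensive_match):
    result = {}
    for coords, members in group_by_coordinates(slaves).items():
        new_master_coords = expensive_match(coords)  # once per class
        for name in members:
            result[name] = new_master_coords
    return result
```

With dozens of slaves stopped at a handful of distinct positions, this turns dozens of binlog searches into a handful.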

What about the other commands?

The above covers such commands as move-up, move-below, repoint, repoint-slaves, match-below, multi-match-slaves, regroup-slaves and more. It does not cover enslave-master and make-co-master which are a bit different.

My guess is you can handle 98% of your operations with relocate and relocate-slaves. Otherwise just run orchestrator with no arguments or options to get a full-blown breakdown of available commands.

GUI drag-n-drop

relocate-slaves is achieved by dragging the slaves of an instance on top of a new master, as follows:

orchestrator-relocate-slaves-before
orchestrator-relocate-slaves-hover
orchestrator-relocate-slaves-drag
orchestrator-relocate-slaves-after

Find latest orchestrator release at https://github.com/outbrain/orchestrator/releases

 

Orchestrator 1.4.340: GTID, binlog servers, Smart Mode, failovers and lots of goodies
https://shlomi-noach.github.io/blog/mysql/orchestrator-1-4-340-gtid-binlog-servers-smart-mode-failovers-and-lots-of-goodies
Tue, 01 Sep 2015 10:10:15 +0000

Orchestrator 1.4.340 is released. Not quite competing with the MySQL latest changelog, and as I haven’t blogged about the orchestrator featureset in a while, this is a quick listing of orchestrator features available since my last publication:

  • Supports GTID (Oracle & MariaDB)
    • GTID still not being used in automated recovery — in progress.
    • enable-gtid, disable-gtid, skip-query for GTID commands
  • Supports binlog servers (MaxScale)
    • Discovery & operations on binlog servers
    • Understanding slave repositioning in a binlog-server architecture
  • Smart mode: relocate & relocate-below commands (or Web/GUI drag-n-drop) let orchestrator figure out the best way of slave repositioning. Orchestrator picks from GTID, Pseudo GTID, binlog servers, binlog file:pos math (and more) options, or combinations of the above. Fine grained commands still there, but mostly you won’t need them.
  • Crash recoveries (did you know orchestrator does that?):
    • For intermediate master recovery: improved logic in picking the best recovery plan (prefer in-DC, prefer promoting local slave, supporting binlog server topologies, …)
    • For master recovery: even better slave promotion; supports candidate slaves (prefer promoting such slaves); supports binlog server shared topologies
    • Better auditing and logging of recovery cases
    • Better analysis of crash scenarios, also in the event of lost VIPs, hanging connections; emergent checks in crash suspected scenarios
    • recover-lite: do all topology-only recovery steps, without invoking external processes
  • Better browser support: used to only work on Firefox and Chrome (and the latter has had issues), the Web UI should now work well on all browsers, at the cost of reduced d3 animation. More work still in progress.
  • Faster, more parallel, less blocking operations on all counts; removed a lot of serialized code; fewer locks.
  • Web enhancements
    • More verbose drag-n-drop (operation hint; color hints)
    • Drag-n-drop for slaves-of-a-server
    • Replication/crash analysis dashboard
  • Pools: orchestrator can be told about instance-to-pool association (submit-pool-instances command)
    • And can then present pool status (web)
    • Or pool hints within topologies (web)
    • Or queried for all pools (cluster-pool-instances command)
  • Other:
    • Supports MySQL 5.7 (tested with 5.7.8)
    • Configurable graphite path for metrics
    • --noop flag; does all the work except for actually changing master on slaves. Shows intentions.
    • Web (or cli which-cluster-osc-slaves command) provide list of control slaves to use in pt-osc operation
    • hostname-unresolve: force orchestrator to unresolve a fqdn into VIP/CNAME/… when issuing a CHANGE MASTER TO
  • 3rd party contributions (hey, thanks!) include:
    • More & better SSL support
    • Vagrant templates
  • For developers:
    • Orchestrator now go-gettable. Just go get github.com/outbrain/orchestrator
    • Improved build script; supports more architectures

Also consider these manuals:

Orchestrator is free and open source (Apache 2.0 License).

I’ll be speaking about orchestrator at Percona Live Amsterdam.

Pseudo GTID, ASCENDING
https://shlomi-noach.github.io/blog/mysql/pseudo-gtid-ascending
Wed, 29 Jul 2015 10:59:50 +0000

Pseudo GTID is a technique where we inject Globally Unique entries into MySQL, gaining GTID abilities without using GTID. It is supported by orchestrator and described in more detail here, here and here.

Quick recap: we can join two slaves to replicate from one another even if they never were in a parent-child relationship, based on our uniquely identifiable entries, which can be found in the slaves’ binary logs or relay logs. Having Pseudo-GTID injected and controlled by us allows us to optimize failovers into quick operations, especially where a large number of servers is involved.

Ascending Pseudo-GTID further speeds up this process for delayed/lagging slaves.

Recap, visualized

(but do look at the presentation):

pseudo-gtid-quick

  1. Find last pseudo GTID in slave’s binary log (or last applied one in relay log)
  2. Search for exact match on new master’s binary logs
  3. Fast forward both through successive identical statements until end of slave’s applied entries is reached
  4. Point slave into cursor position on master
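The four steps above can be sketched in Python over binlogs modeled as plain lists of statements — a deliberate simplification (the real implementation scans actual MySQL binary and relay logs), with hypothetical function names:

```python
# Steps 1-4: find the slave's last Pseudo-GTID, match it exactly on the new
# master, fast-forward both cursors through identical statements, and return
# the master coordinate the slave should be pointed at.
PSEUDO_GTID_MARKER = "_pseudo_gtid_hint"

def last_pseudo_gtid(binlog):
    """Step 1: last Pseudo-GTID entry in the slave's log, and its index."""
    for i in range(len(binlog) - 1, -1, -1):
        if PSEUDO_GTID_MARKER in binlog[i]:
            return i, binlog[i]
    raise ValueError("no Pseudo-GTID entry found")

def match_coordinates(slave_binlog, master_binlog):
    s_idx, entry = last_pseudo_gtid(slave_binlog)
    m_idx = master_binlog.index(entry)     # step 2: exact match on the master
    # Step 3: fast-forward both through identical statements until the end of
    # the slave's applied entries is reached.
    while s_idx + 1 < len(slave_binlog):
        s_idx += 1
        m_idx += 1
        assert master_binlog[m_idx] == slave_binlog[s_idx], "streams diverged"
    # Step 4: the slave resumes from the next position on the master.
    return m_idx + 1
```

The returned index is the "cursor position" of step 4: everything before it is already applied on the slave, everything from it onward is new.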

What happens if the slave we wish to reconnect is lagging? Or perhaps it is a delayed replica, set to run 24 hours behind its master?

The naive approach would expand bullet #2 into:

  • Search for exact match on master’s last binary logs
  • Unfound? Move on to previous (older) binary log on master
  • Repeat

The last Pseudo-GTID executed by the slave was issued by the master over 24 hours ago. Suppose the master generates one binary log per hour. This means we would need to full-scan 24 binary logs of the master in which the entry will not be found, only to match it in the 25th binary log (it’s an off-by-one problem, don’t hold the exact number against me).

Ascending Pseudo GTID

Since we control the generation of Pseudo-GTID, and since we control the search for Pseudo-GTID, we are free to choose the form of Pseudo-GTID entries. We recently switched to using ascending Pseudo-GTID entries, and this works like a charm. Consider these Pseudo-GTID entries:

drop view if exists `meta`.`_pseudo_gtid_hint__asc:55B364E3:0000000000056EE2:6DD57B85`
drop view if exists `meta`.`_pseudo_gtid_hint__asc:55B364E8:0000000000056EEC:ACF03802`
drop view if exists `meta`.`_pseudo_gtid_hint__asc:55B364ED:0000000000056EF8:06279C24`
drop view if exists `meta`.`_pseudo_gtid_hint__asc:55B364F2:0000000000056F02:19D785E4`

The above entries are ascending in lexical order. They are generated using a UTC timestamp, along with other watchdog/random values. For a moment let’s trust that our generation is indeed always ascending. How does that help us?

Suppose the last entry found in the slave is

drop view if exists `meta`.`_pseudo_gtid_hint__asc:55B364E3:0000000000056EE2:6DD57B85`

And this is what we’re to search on the master’s binary logs. Starting with the optimistic hope that the entry is in the master’s last binary log, we start reading. By nature of binary logs we have to scan them sequentially from start to end. As we read the binary log entries, we soon meet the first Pseudo-GTID injection, and it reads:

drop view if exists `meta`.`_pseudo_gtid_hint__asc:55B730E6:0000000000058F02:19D785E4`

 

At this stage we know we can completely skip scanning the rest of the binary log. Our entry will not be there: this entry is larger than the one we’re looking for, and entries only get larger as we advance through the binary log. It is therefore safe to ignore the rest of this file and move on to the next-older binary log on the master, to repeat our search there.

Binary logs that cannot contain the entry are only briefly examined: orchestrator will probably read no more than the first 1,000 entries or so (can’t give you a number, it’s your workload) before giving up on the binary log.
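A sketch of this early-exit logic, again over binlogs modeled as lists of statements (illustrative only; function names and the probe limit are hypothetical):

```python
# Walk the master's binlogs newest to oldest. Within each file, bail out as
# soon as the first Pseudo-GTID entry compares lexically greater than the
# target: the target cannot appear later in that file.
PSEUDO_GTID_MARKER = "_pseudo_gtid_hint__asc:"

def binlog_may_contain(binlog_entries, target, probe_limit=1000):
    for event in binlog_entries[:probe_limit]:
        if event == target:
            return True
        if PSEUDO_GTID_MARKER in event and event > target:
            return False    # entries only grow from here on; give up early
    return True             # inconclusive: keep scanning / full scan

def find_binlog_with_entry(binlogs_newest_first, target):
    for name, entries in binlogs_newest_first:
        if binlog_may_contain(entries, target) and target in entries:
            return name
    return None
```

The lexical comparison works because the leading component of each entry is a zero-padded hex UTC timestamp, so newer entries always compare greater as strings.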

On every topology chain we have 2 delayed replica slaves, to help us out in case we make the grave mistake of DELETEing the wrong data. These slaves would take, on some chains, 5-6 minutes to reconnect to a new master using Pseudo-GTID, since that required scanning many, many GBs of binary logs. This is no longer the case; we’ve reduced scan time for such servers to about 25s at worst, and much quicker on average. There can still be dozens of binary logs to open, but all but one are given up on very quickly. I should stress that those 25s are nonblocking for other slaves which are more up to date than the delayed replicas.

Can there be a mistake?

Notice that the above algorithm does not require each and every entry to be ascending; it just compares the first entry in each binlog to determine whether our target entry is there or not. This means if we’ve messed up our Ascending order and injected some out-of-order entries, we can still get away with it — as long as those entries are not the first ones in the binary log, nor are they the last entries executed by the slave.

But why be so negative? We’re using the UTC timestamp as the major sorting order, and inject Pseudo-GTID every 5 seconds; even with a leap second we’re comfortable.

On my TODO is to also include a “Plan B” full-scan search: if the Ascending algorithm fails, we can still opt for the full scan option. So there would be no risk at all.

Example

We inject Pseudo-GTID via event-scheduler. These are the good parts of the event definition:

create event if not exists
  create_pseudo_gtid_event
  on schedule every 5 second starts current_timestamp
  on completion preserve
  enable
  do
    begin
      set @connection_id := connection_id();
      set @now := now();
      set @rand := floor(rand()*(1 << 32));
      set @pseudo_gtid_hint := concat_ws(':', lpad(hex(unix_timestamp(@now)), 8, '0'), lpad(hex(@connection_id), 16, '0'), lpad(hex(@rand), 8, '0'));

      set @_create_statement := concat('drop ', 'view if exists `meta`.`_pseudo_gtid_', 'hint__asc:', @pseudo_gtid_hint, '`');
      PREPARE st FROM @_create_statement;
      EXECUTE st;
      DEALLOCATE PREPARE st;
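To see why these entries sort lexically by time, here is a hypothetical Python equivalent of the hint construction above — not orchestrator code, just a mirror of the SQL: the zero-padded hex UTC timestamp is the leading component, so later timestamps compare greater as strings regardless of the connection-id and random components.

```python
# Mirrors the SQL above: lpad(hex(unix_timestamp),8) : lpad(hex(conn_id),16)
# : lpad(hex(rand),8). Python's format(..., "08X") matches MySQL's uppercase
# zero-padded hex for values within the padded width.
import random
import time

def make_pseudo_gtid_hint(ts=None, connection_id=12345):
    ts = int(ts if ts is not None else time.time())
    rand = random.getrandbits(32)
    return ":".join([
        format(ts, "08X"),               # leading, time-ordered component
        format(connection_id, "016X"),
        format(rand, "08X"),
    ])

def make_statement(hint):
    return f"drop view if exists `meta`.`_pseudo_gtid_hint__asc:{hint}`"
```

Two statements built from increasing timestamps therefore compare in injection order as plain strings, which is the whole basis of the skip-this-binlog optimization.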

We accompany this by the following orchestrator configuration:

 "PseudoGTIDPattern": "drop view if exists .*?`_pseudo_gtid_hint__",
 "PseudoGTIDMonotonicHint": "asc:",

“PseudoGTIDMonotonicHint” denotes a string; if that string (“asc:”) is found in the slave’s Pseudo-GTID entry, then the entry is assumed to have been injected as part of ascending entries, and the optimization kicks in.

The Manual has more on this.

Speaking at Percona Live: Pseudo GTID and Easy Replication Topology Management
https://shlomi-noach.github.io/blog/mysql/speaking-at-percona-live-pseudo-gtid-and-easy-replication-topology-management
Tue, 31 Mar 2015 15:42:21 +0000

In two weeks’ time I will be presenting Pseudo GTID and Easy Replication Topology Management at Percona Live. Since the time I submitted the proposal a LOT has been developed, experimented with, deployed and used with both Pseudo GTID and with orchestrator. In my talk I will:

  • Suggest that you skip the “to GTID or not to GTID” question and go for the lightweight Pseudo GTID
  • Show how Pseudo GTID is used in production to recover from various replication failures and server crashes
  • Do an outrageous demonstration
  • Tell you about 50,000 successful experiments and tests done in production
  • Show off orchestrator and its support for Pseudo GTID, including automated crash analysis and recovery mechanism.

I will further show how the orchestrator tooling makes for a less restrictive, more performant, less locking, non-intrusive, trusted and lightweight replication topology management solution.

orchestrator-topology-simple
An anonymized topology

Please come by my talk!

Slides:

Speaking at FOSDEM: Pseudo GTID and easy replication management
https://shlomi-noach.github.io/blog/mysql/speaking-at-fosdem-pseudo-gtid-and-easy-replication-management
Thu, 29 Jan 2015 19:06:10 +0000

This coming Sunday I’ll be presenting Pseudo GTID and easy replication management at FOSDEM, Brussels.

There’s been a lot of development on Pseudo GTID these last few weeks. In this talk I’ll show you how you can use Pseudo GTID instead of “normal” GTID to easily repoint your slaves, recover from intermediate master failure, promote slaves to masters, as well as employ crash safe replication without crash safe replication.

Moreover, I will show how you can achieve all the above with fewer constraints than GTID, and for bulk operations — with less overhead and in shorter time. You will also see that Pseudo GTID is a non-intrusive solution which does not require you to change anything in your topologies.

Moral: I’ll try and convince you to drop your plans for using GTID in favor of Pseudo GTID.

We will be employing Pseudo GTID as the basis for high availability and failover at Booking.com on many topologies, and as a safety mechanism in other topologies where we will employ Binlog servers.

Semi-automatic slave/master promotion via Pseudo GTID
https://shlomi-noach.github.io/blog/mysql/semi-automatic-slavemaster-promotion-via-pseudo-gtid
Mon, 10 Nov 2014 05:56:58 +0000

Orchestrator release 1.2.7-beta now supports semi-automatic slave promotion to master upon master death, via Pseudo GTID.

orchestrator-make-master-highlighted

When the master is dead, orchestrator automatically picks the most up-to-date slaves and marks them as “master candidates”. It allows a /api/make-master call on such a slave (S), in which case it uses Pseudo GTID to enslave S’s siblings, and sets S to read_only = 0. All we need to do is click the “Make master” button.

orchestrator-make-master-confirm

Yes, “OK”. A moment later:

orchestrator-make-master-refactored

See how the two slaves 22988, 22989 are happily replicating from 22990. Turning them into 22990’s slaves was possible even though their shared master was dead and they were stopped at different replication positions.

22990 is now writeable, and it’s on you to disconnect it from its old master and to direct your application to this newly promoted instance.

Local master promotion

Likewise, a semi-automated solution for the corruption of a local master is in place. Consider the following, where 22989 is dead/inaccessible. 22988 and 22990 are stuck in replication:

orchestrator-make-local-master

Orchestrator detects this situation and picks the most up-to-date slave, marking it as a candidate for promotion. Click “Make local master” to get:

orchestrator-make-local-master-confirm

Sure, OK:

orchestrator-make-local-master-refactored

And now 22990 takes over instead of 22989, enslaving 22988, both running happily ever after.

Automation

The above buttons are just convenience methods. You don’t strictly need visualization/GUI for that. Everything is supported by the API, and can be used by a fully automated monitoring system. More to come in orchestrator.

 

Refactoring replication topologies with Pseudo GTID: a visual tour
https://shlomi-noach.github.io/blog/mysql/refactoring-replication-topologies-with-pseudo-gtid-a-visual-tour
Mon, 27 Oct 2014 09:55:39 +0000

Orchestrator 1.2.1-beta supports Pseudo GTID (read the announcement): a means to refactor the replication topology and connect slaves even without a direct relationship, even across failed servers. This post illustrates two such scenarios and shows the visual way of matching/re-synching slaves.

Of course, orchestrator is not just a GUI tool; anything done with drag-and-drop is also done via the web API (in fact, drag-and-drop invokes the web API) as well as via the command line. I’m mentioning this as it is the grounds for failover automation planned for the future.

Scenario 1: the master unexpectedly dies

The master crashes and cannot be contacted. All slaves are stopped as a result, but each at a different position. Some managed to salvage relay logs just before the master died, some didn’t. In our scenario, all three slaves are at least caught up with their relay logs (that is, whatever they managed to pull through the network, they have already managed to execute). So they’re otherwise sitting idle waiting for something to happen. Well, something’s about to happen.

orchestrator-pseudo-gtid-dead-master

Note the green “Safe mode” button to the right. This means operation is through calculation of binary log files & positions in relation to one’s master. But the master is now dead, so let’s switch to adventurous mode; in this mode we can drag and drop slaves onto instances that would normally be forbidden. At this stage the web interface allows us to drop a slave onto its sibling or any of its ancestors (including its very own parent, which is a means of reconnecting a slave with its parent). Anyhow:

orchestrator-pseudo-gtid-dead-master-pseudo-gtid-mode

We notice that orchestrator is already kind enough to say which slave is the best candidate to be the new master (127.0.0.1:22990): this is the slave (or one of the slaves) with the most up-to-date data. So we choose to take another server and make it a slave of 127.0.0.1:22990:

orchestrator-pseudo-gtid-dead-master-begin-drag

And drop:

orchestrator-pseudo-gtid-dead-master-drop

There we have it: although their shared master is inaccessible, and the two slaves’ binary log file names & positions mean nothing to each other, we are able to correctly match the two and make one the child of the other:

orchestrator-pseudo-gtid-dead-master-refactored-1

Likewise, we do the same with 127.0.0.1:22988:

orchestrator-pseudo-gtid-dead-master-begin-drag-2

And end up with our (almost) final topology:

orchestrator-pseudo-gtid-dead-master-refactored-2

Notice how the two slaves 22988, 22989 are happily replicating from 22990. As far as they’re concerned, there is no problem in the topology any more. Now it’s your decision: do you decommission the old master? You will need to RESET SLAVE on 22990 (can do via orchestrator), turn off 22990‘s read_only (can do via orchestrator) and change DNS entries (or what have you).

Scenario 2: a local master (“relay-master”) unexpectedly dies

In this scenario we have a deep nested topology, and a local master died. What of its slaves?

orchestrator-pseudo-gtid-dead-relay-master

We choose one of the children and drag it over onto the master, which is up and healthy:

orchestrator-pseudo-gtid-dead-relay-master-begin-drag

As you can see, we are allowed (green instances mark allowed drop targets) to drop 22989 onto its sibling or onto its grandparent, the latter bypassing the broken connection: there is no direct connection between the two!

orchestrator-pseudo-gtid-dead-relay-master-drop

And we get a new topology:

orchestrator-pseudo-gtid-dead-relay-master-refactored-1

22989 is now lagging, but on the right path! Let’s do the same for 22988:

orchestrator-pseudo-gtid-dead-relay-master-begin-drag-2

And finally:

orchestrator-pseudo-gtid-dead-relay-master-refactored-2

Great success! 22989 already caught up, 22988 on the way, victory is ours!

The real fun, of course, is to execute with --debug and review the DEBUG messages as orchestrator seeks, finds, matches and follows up on Pseudo GTID entries in the binary logs. We each have our pleasures.

Orchestrator 1.2.1 BETA: Pseudo GTID support, reconnect slaves even after master failure https://shlomi-noach.github.io/blog/mysql/orchestrator-1-2-1-beta-pseudo-gtid-support-reconnect-slaves-even-after-master-failure https://shlomi-noach.github.io/blog/mysql/orchestrator-1-2-1-beta-pseudo-gtid-support-reconnect-slaves-even-after-master-failure#comments Mon, 27 Oct 2014 09:51:51 +0000 https://shlomi-noach.github.io/blog/?p=7114 orchestrator 1.2.1 BETA is released. This version supports Pseudo GTID, and provides one with powerful refactoring of one’s replication topologies, even across failed instances.

orchestrator-pseudo-gtid-dead-relay-master-begin-drag

Depicted: moving a slave up the topology even though its local master is inaccessible

Enabling Pseudo-GTID

You will need to:

  1. Inject a periodic unique entry onto your binary logs
  2. Configure orchestrator to recognize said entry.

Pseudo GTID injection example

We will use the event scheduler (which must be enabled) to inject an entry every 10 seconds, recognized in both statement-based and row-based replication.

-- schema to hold the Pseudo-GTID marker view
create database if not exists meta;

drop event if exists meta.create_pseudo_gtid_view_event;

delimiter ;;
create event if not exists
  meta.create_pseudo_gtid_view_event
  on schedule every 10 second starts current_timestamp
  on completion preserve
  enable
  do
    begin
      -- uuid() makes each injected statement globally unique
      set @pseudo_gtid := uuid();
      set @_create_statement := concat('create or replace view meta.pseudo_gtid_view as select \'', @pseudo_gtid, '\' as pseudo_gtid_unique_val from dual');
      PREPARE st FROM @_create_statement;
      EXECUTE st;
      DEALLOCATE PREPARE st;
    end
;;

delimiter ;

set global event_scheduler := 1;

Make sure to enable event_scheduler in your my.cnf config file.
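
For example, so the scheduler also survives a server restart:

```ini
[mysqld]
event_scheduler = ON
```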

An entry in the binary logs would look like this:

mysql [localhost] {msandbox} (meta) > show binlog events in 'mysql-bin.000002' LIMIT 2,1;
+------------------+-----+------------+-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Log_name         | Pos | Event_type | Server_id | End_log_pos | Info                                                                                                                                                                                                               |
+------------------+-----+------------+-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| mysql-bin.000002 | 388 | Query      |         1 |         669 | use `meta`; CREATE OR REPLACE ALGORITHM=UNDEFINED DEFINER=`msandbox`@`localhost` SQL SECURITY DEFINER VIEW `pseudo_gtid_view` AS select '2f6ad653-5db3-11e4-b91d-3c970ea31ea8' as pseudo_gtid_unique_val from dual |
+------------------+-----+------------+-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

The above entry will be unique, and orchestrator will be able to find it in the binary log if configured with:

{
    ...
    "PseudoGTIDPattern": "CREATE OR REPLACE .*? VIEW `pseudo_gtid_view` AS select"
}

The value of “PseudoGTIDPattern” is a regular expression which must match the Pseudo GTID entries in the binary log, and nothing but those entries.
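
You can sanity-check the pattern against a real entry offline. A quick check in Python (the sample string is the Info column from the SHOW BINLOG EVENTS output above; the pattern syntax here is compatible with Go's RE2-style regexp, which orchestrator uses):

```python
import re

# The pattern from the orchestrator configuration above.
pattern = re.compile(r"CREATE OR REPLACE .*? VIEW `pseudo_gtid_view` AS select")

# Info column of the binlog event shown above.
entry = ("use `meta`; CREATE OR REPLACE ALGORITHM=UNDEFINED "
         "DEFINER=`msandbox`@`localhost` SQL SECURITY DEFINER "
         "VIEW `pseudo_gtid_view` AS select "
         "'2f6ad653-5db3-11e4-b91d-3c970ea31ea8' as pseudo_gtid_unique_val from dual")

# The Pseudo-GTID entry must match...
assert pattern.search(entry) is not None

# ...and ordinary statements must not, or matching would be ambiguous.
assert pattern.search("CREATE TABLE t (i int)") is None
```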

Pre-release

This is BETA quality, though I have high confidence in its safety: the process of matching binary log entries makes for a self-validating mechanism. The process aborts on any mismatch or uncertainty.

Still, there may be use cases I haven't encountered yet. Your input is appreciated!
