Speaking at FOSDEM: Pseudo GTID and easy replication management

This coming Sunday I’ll be presenting Pseudo GTID and easy replication management at FOSDEM, Brussels.

There’s been a lot of development on Pseudo GTID these last few weeks. In this talk I’ll show you how you can use Pseudo GTID instead of “normal” GTID to easily repoint your slaves, recover from intermediate master failure, promote slaves to masters as well as emply crash safe replication without crash safe replication.

Moreover, I will show how you can achieve all the above with less constraints than GTID, and for bulk operations — with less overhead and in shorter time. You will also see that Pseudo GTID is a non intrusive solution which does not require you to change anything in your topologies.

Moral: I’ll try and convince you to drop your plans for using GTID in favor of Pseudo GTID.

We will be employing Pseudo GTID as the basis for high availability and failover at Booking.com on many topologies, and as a safety mechanism in other topologies where we will employ Binlog servers.

Pseudo gtid & easy replication topology management from Shlomi Noach

Reading RBR binary logs with pt-query-digest

For purposes of auditing anything that goes on our servers we’re looking to parse the binary logs of all servers (masters), as with “Anemomaster“. With Row Based Replication this is problematic since pt-query-digest does not support parsing RBR binary logs (true for 2.2.12, latest at this time).

I’ve written a simple script that translates RBR logs to SBR-like logs, with a little bit of cheating. My interest is that pt-query-digest is able to capture and count the queries, nothing else. By doing some minimal text manipulation on the binary log I’m able to now feed it to pt-query-digest which seems to be happy.

The script of course does not parse the binary log directly; furthermore, it requires the binary log to be extracted via:

mysqlbinlog --verbose --base64-output=DECODE-ROWS your-mysql-binlog-filemame.000001

The above adds the interpretation of the RBR entires in the form of (unconventional) statements, commented, and strips out the cryptic RBR text. All that is left is to do a little manipulation on entry headers and uncomment the interpreted queries.

The script can be found in my gist repositories. Current version is as follows: Continue reading » “Reading RBR binary logs with pt-query-digest”

Orchestrator 1.2.9 GA released

Orchestrator 1.2.9 GA has been released. Noteworthy:

  • Added “ReadOnly” (true/false) configuration param. You can have orchestrator completely read-only
  • Added “AuthenticationMethod”: “multi”: works like BasicAuth (your normal HTTP user+password) only it also accepts the special user called “readonly”, which, surprise, can only view and not modify
  • Centralized/serialized most backend database writes (with hundreds/thousands monitored servers it was possible or probable that high concurrency led to too-many-connections openned on the backend database).
  • Fixed evil evil bug that would skip some checks if binary logs were not enabled
  • Better hostname resolve (now also asking MySQL server to resolve hostname; resolving is cached)
  • Pseudo-GTID (read here, here, here) support now considered stable (apart from being tested it has already been put to practice multiple times in production at Outbrain, in different planned and unplanned crash scenarios)

I continue developing orchestrator as free and open source at my new employer, Booking.com.

 

 

Joining Booking.com

I’m excited to be joining Booking.com at the Netherlands in a couple weeks 🙂

I’m looking forward to be working with a great team and friendly people! I hope to contribute from my experience and of course be challenged by difficult problems.

Booking.com is a supporter of open source in multiple aspects, and I am looking forward to continue working with open source solutions as well as releasing open source code.

I am leaving my work at Outbrain feeling grateful for the opportunity of working at this wonderful company! I am awed and humbled by the amazing teams I’ve worked with, whose level of knowledge and insights I can only aspire to match. Thank you in particular to the Infrastructure team, of which I was proud to be part of.

Outbrain allowed me and others (and in fact encouraged and supported) to develop as much open source as we saw fit. This is not a minor thing: when you orient your code towards open source, you need to make generalizations which are not always providing direct benefit to the company, and which consume precious time. I hold Outbrain in the highest respect for their support for open source.

 

 

 

Semi-automatic slave/master promotion via Pseudo GTID

Orchestrator release 1.2.7-beta now supports semi-automatic slave promotion to master upon master death, via Pseudo GTID.

orchestrator-make-master-highlighted

When the master is dead, orchestrator automatically picks the most up-to-date slaves and marks them as “Master candidates”. It allows a /api/make-master call on such a slave (S), in which case it uses Pseudo GTID to enslave its siblings, and set S as read-only = 0. All we need to do is click the “Make master” button. Continue reading » “Semi-automatic slave/master promotion via Pseudo GTID”

Percona Live 2015: Call for Papers is open

And not for long!

The Call for Papers for Percona Live MySQL Conference and Expo, to be held at Santa Clara in April 2015, is open. The dead line for submissions is Nov. 16th; that’s just around the corner.

As with previous years, we will hold a 4 day conference, the first being a tutorials day and three days for sessions, BoF and lightning talks, as well as community events. The committee is expecting to review at about 250-300 submissions, out of which it will pick at about 100 talks to schedule or reserve.

We will be using these tracks:

  • High Availability
  • DevOps
  • Programming
  • Performance Optimization
  • Replication and Backup
  • MySQL in the Cloud
  • MySQL and NoSQL
  • MySQL Case Studies
  • Security
  • What’s New in MySQL

This year we will roughly pre-define the desired number of sessions we wish to have per track. This is not set in stone and everything is fluid. Yet, this will give us better guidelines at choosing and pursuing content for this conference.

Submitting a proposal

We encourage all members of the community to submit their tutorial/session/BoF proposals as soon as possible. Please register/login at the conference home page.

The guidelines for submitting a proposal are generally unchanged; please review past recommendations: [1], [2], [3], [4]. To add to all these:

  • Do note that we are likely to only review a proposal just once. Please submit only after you have finalized your draft.
  • Make a reasonable length of proposal. We believe 250 – 300 words are quite enough for a good proposal. Please don’t write an essay, and remember that you proposal is what gets printed on the schedule, and what is read by the conference attendees when choosing the next talk to go to.
  • Write a descent Bio.

Continue reading » “Percona Live 2015: Call for Papers is open”

Refactoring replication topologies with Pseudo GTID: a visual tour

Orchestrator 1.2.1-beta supports Pseudo GTID (read announcement): a means to refactor the replication topology and connect slaves even without direct relationship; even across failed servers. This post illustrates two such scenarios and shows the visual way of mathcing/re-synching slaves.

Of course, orchestrator is not just a GUI tool; anything done with drag-and-drop is also done via web API (in fact, the drag-and-drop invoke the web API) as well as via command line. I’m mentioning this as this is the grounds for failover automation planned for the future.

Scenario 1: the master unexpectedly dies

The master crashes and cannot be contacted. All slaves are stopped as effect, but each in a different position. Some managed to salvage relay logs just before the master dies, some didn’t. In our scenario, all three slaves are at least caught up with the relay log (that is, whatever they managed to pull through the network, they already managed to execute). So they’re otherwise sitting idle waiting for something to happen. Well, something’s about to happen.

orchestrator-pseudo-gtid-dead-master

Note the green “Safe mode” button to the right. This means operation is through calculation of binary log files & positions with relation to one’s master. But the master is now dead, so let’s switch to adventurous mode; in this mode we can drag and drop slaves onto instances normally forbidden. At this stage the web interface allows us to drop a slave onto its sibling or any of its ancestors (including its very own parent, which is a means of reconnecting a slave with its parent). Anyhow:

orchestrator-pseudo-gtid-dead-master-pseudo-gtid-mode

We notice that orchestrator is already kind enough to say which slave is best candidate to be the new master (127.0.0.1:22990): this is the slave (or one of the slaves) with most up-to-date data. So we choose to take another server and make it a slave of 127.0.0.1:22990: Continue reading » “Refactoring replication topologies with Pseudo GTID: a visual tour”

Orchestrator 1.2.1 BETA: Pseudo GTID support, reconnect slaves even after master failure

orchestrator 1.2.1 BETA is released. This version supports Pseudo GTID, and provides one with powerful refactoring of one’s replication topologies, even across failed instances.

orchestrator-pseudo-gtid-dead-relay-master-begin-drag

Depicted: moving a slave up the topology even though its local master is inaccessible

Enabling Pseudo-GTID

You will need to:

  1. Inject a periodic unique entry onto your binary logs
  2. Configure orchestrator to recognize said entry.

Pseudo GTID injection example

We will use the event scheduler (must be enabled) to inject an entry every 10 seconds, recognized both in statement-based and row-based replication.

create database if not exists meta;

drop event if exists meta.create_pseudo_gtid_view_event;

delimiter ;;
create event if not exists
  meta.create_pseudo_gtid_view_event
  on schedule every 10 second starts current_timestamp
  on completion preserve
  enable
  do
    begin
      set @pseudo_gtid := uuid();
      set @_create_statement := concat('create or replace view meta.pseudo_gtid_view as select \'', @pseudo_gtid, '\' as pseudo_gtid_unique_val from dual');
      PREPARE st FROM @_create_statement;
      EXECUTE st;
      DEALLOCATE PREPARE st;
    end
;;

delimiter ;

set global event_scheduler := 1;

Make sure to enable event_scheduler in your my.cnf config file.

An entry in the binary logs would look like this: Continue reading » “Orchestrator 1.2.1 BETA: Pseudo GTID support, reconnect slaves even after master failure”

Refactoring replication topology with Pseudo GTID

This post describes in detail the method of using Pseudo GTID to achieve unplanned replication topology changes, i.e. connecting two arbitrary slaves, or recovering from a master failure even as all its slaves are hanging in different positions.

Please read Pseudo GTID and Pseudo GTID, RBR as introduction.

Consider the following case: the master dies unexpectedly, and its three slaves are all hanging, not necessarily at same binary log file/position (network broke down while some slaves managed to salvage more entries into their relay logs than others)

orchestrator-failed-master

(Did you notice the “Candidate for master” message? To be discussed shortly)

GTID

With GTID each transaction (and entry in the binary log) is associated with a unique mark — the Global Transaction ID. Just pick the slave with the most advanced GTID to be the next master, and just CHANGE MASTER TO MASTER_HOST=’chosen_slave’ on the other slaves, and everything magically works. A slave knows which GTID it has already processed, and can look that entry on its master’s binary logs, resuming replication on the one that follows.

How does that work? The master’s binary logs are searched for that GTID entry. I’m not sure how brute-force this is, since I’m aware of a subtlety which requires brute-force scan of all binary logs; I don’t actually know if it’s always like that.

Pseudo GTID

We can mimick that above, but our solution can’t be as fine grained. With the injection of Pseudo GTID we mark the binary log for unique entries. But instead of having a unique identifier for every entry, we have a unique identifier for every second, 10 seconds, or what have you, with otherwise normal, non-unique entries in between our Pseudo GTID entries.

Recognizing which slave is more up to date

Given two slaves, which is more up to date?

  • If both replicate(d) from same master, a SHOW SLAVE STATUS comparison answers (safe method: wait till SQL thread catches up with broken IO thread, compare relay_master_log_file, exec_master_log_pos on both machines). This is the method by which the above “Candidate for master” message is produced.
  • If one is/was descendent of the other, then obviously it is less advanced or equals its ancestor.
  • Otherwise we’re unsure – still solvable via bi-directional trial & error, as explained later on.

For now, let’s assume we know which slave is more up to date (has received and executed more relay logs). Let’s call it S1, whereas the less up-to-date will be S2. This will make our discussion simpler. Continue reading » “Refactoring replication topology with Pseudo GTID”

Pseudo GTID, Row Based Replication

This post continues Pseudo GTID, in a series of posts describing an alternative to using MySQL GTIDs.

The solution offered in the last post does not work too well for row based replication. The binary log entries for the INSERT statement look like this:

# at 1020
# at 1074
#141020 12:36:21 server id 1  end_log_pos 1074  Table_map: `test`.`pseudo_gtid` mapped to number 33
#141020 12:36:21 server id 1  end_log_pos 1196  Update_rows: table id 33 flags: STMT_END_F

BINLOG '
lddEVBMBAAAANgAAADIEAAAAACEAAAAAAAEABHRlc3QAC3BzZXVkb19ndGlkAAMDBw8CQAAE
lddEVBgBAAAAegAAAKwEAAAAACEAAAAAAAEAA///+AEAAACL10RUJDg2ZmRhMDk1LTU4M2MtMTFl
NC05NzYyLTNjOTcwZWEzMWVhOPgBAAAAlddEVCQ4Y2YzOWMyYy01ODNjLTExZTQtOTc2Mi0zYzk3
MGVhMzFlYTg=
'/*!*/;

Where’s our unique value? Encoded within something that cannot be trusted to be unique. Issuing mysqlbinlog –verbose helps out:

BEGIN
/*!*/;
# at 183
# at 237
#141020 12:35:51 server id 1  end_log_pos 237   Table_map: `test`.`pseudo_gtid` mapped to number 33
#141020 12:35:51 server id 1  end_log_pos 359   Update_rows: table id 33 flags: STMT_END_F

BINLOG '
d9dEVBMBAAAANgAAAO0AAAAAACEAAAAAAAEABHRlc3QAC3BzZXVkb19ndGlkAAMDBw8CQAAE
d9dEVBgBAAAAegAAAGcBAAAAACEAAAAAAAEAA///+AEAAABt10RUJDc1MWJkYzEwLTU4M2MtMTFl
NC05NzYyLTNjOTcwZWEzMWVhOPgBAAAAd9dEVCQ3YjExZDQzYy01ODNjLTExZTQtOTc2Mi0zYzk3
MGVhMzFlYTg=
'/*!*/;
### UPDATE `test`.`pseudo_gtid`
### WHERE
###   @1=1
###   @2=1413797741
###   @3='751bdc10-583c-11e4-9762-3c970ea31ea8'
### SET
###   @1=1
###   @2=1413797751
###   @3='7b11d43c-583c-11e4-9762-3c970ea31ea8'

and that’s something we can work with. However, I like to do stuff from within MySQL, and rely as little as possible on external tools. How do the binary log entries look via SHOW BINLOG EVENTS? Not good.

master [localhost] {msandbox} (test) > show binlog events in 'mysql-bin.000058' limit 20;
+------------------+------+-------------+-----------+-------------+---------------------------------------+
| Log_name         | Pos  | Event_type  | Server_id | End_log_pos | Info                                  |
+------------------+------+-------------+-----------+-------------+---------------------------------------+
| mysql-bin.000058 |    4 | Format_desc |         1 |         107 | Server ver: 5.5.32-log, Binlog ver: 4 |
| mysql-bin.000058 |  107 | Query       |         1 |         183 | BEGIN                                 |
| mysql-bin.000058 |  183 | Table_map   |         1 |         237 | table_id: 33 (test.pseudo_gtid)       |
| mysql-bin.000058 |  237 | Update_rows |         1 |         359 | table_id: 33 flags: STMT_END_F        |
| mysql-bin.000058 |  359 | Xid         |         1 |         386 | COMMIT /* xid=5460 */                 |
| mysql-bin.000058 |  386 | Query       |         1 |         462 | BEGIN                                 |
| mysql-bin.000058 |  462 | Table_map   |         1 |         516 | table_id: 33 (test.pseudo_gtid)       |
| mysql-bin.000058 |  516 | Update_rows |         1 |         638 | table_id: 33 flags: STMT_END_F        |
| mysql-bin.000058 |  638 | Xid         |         1 |         665 | COMMIT /* xid=5471 */                 |
| mysql-bin.000058 |  665 | Query       |         1 |         741 | BEGIN                                 |
| mysql-bin.000058 |  741 | Table_map   |         1 |         795 | table_id: 33 (test.pseudo_gtid)       |
| mysql-bin.000058 |  795 | Update_rows |         1 |         917 | table_id: 33 flags: STMT_END_F        |
| mysql-bin.000058 |  917 | Xid         |         1 |         944 | COMMIT /* xid=5474 */                 |
| mysql-bin.000058 |  944 | Query       |         1 |        1020 | BEGIN                                 |
| mysql-bin.000058 | 1020 | Table_map   |         1 |        1074 | table_id: 33 (test.pseudo_gtid)       |
| mysql-bin.000058 | 1074 | Update_rows |         1 |        1196 | table_id: 33 flags: STMT_END_F        |
| mysql-bin.000058 | 1196 | Xid         |         1 |        1223 | COMMIT /* xid=5476 */                 |
| mysql-bin.000058 | 1223 | Query       |         1 |        1299 | BEGIN                                 |
| mysql-bin.000058 | 1299 | Table_map   |         1 |        1353 | table_id: 33 (test.pseudo_gtid)       |
| mysql-bin.000058 | 1353 | Update_rows |         1 |        1475 | table_id: 33 flags: STMT_END_F        |
+------------------+------+-------------+-----------+-------------+---------------------------------------+

The representation of row-format entries in the SHOW BINLOG EVENTS output is really poor. Why, there’s nothing to tell me at all about what’s been done, except that this is some operation on test.pseudo_gtid. Obviously I cannot find anything unique over here.

Not all is lost. How about DDL statements? Those are still written in SBR format (there’s no rows to log upon creating a table). A solution could be somehow manipulating a unique value in a DDL statement. There could be various such solutions, and I chose to use a CREATE VIEW statement, dynamically composed of a UUID(): Continue reading » “Pseudo GTID, Row Based Replication”