Using dbdeployer in CI tests

February 20, 2018

I was very pleased when Giuseppe Maxia (aka datacharmer) unveiled dbdeployer in his talk at pre-FOSDEM MySQL day. The announcement came just at the right time. I wish to briefly describe how we use dbdeployer (work in progress).

The case for gh-ost

A user opened an issue on gh-ost while running MySQL 5.5. gh-ost is tested on 5.7, where the problem does not reproduce. A discussion with Gillian Gunson raised the concern of not testing on all versions. Can we run gh-ost tests for all MySQL/Percona/MariaDB versions? Should we? How easy would it be?

gh-ost tests

gh-ost has three different test types:

  • Unit tests: these are plain golang logic tests which are very easy and quick to run.
  • Integration tests: the topic of this post, see following. Today these do not run as part of automated CI testing.
  • System tests: putting our production tables to the test, continuously migrating our production data on dedicated replicas, verifying checksums are identical and data is intact, read more.

Unit tests are already running as part of automated CI (every PR is subjected to those tests). Systems tests are clearly tied to our production servers. What's the deal with the integration tests? Continue Reading »
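A version-matrix run with dbdeployer could look roughly like the following sketch. Note this is an assumption of how such a CI step might be wired up, not our actual pipeline; the exact dbdeployer subcommand syntax may differ between dbdeployer versions, and the test entry point path is illustrative:

```
# Deploy a sandbox per MySQL version, run gh-ost integration tests
# against it, then tear it down. Versions listed are illustrative.
set -e
for version in 5.5.53 5.6.39 5.7.21 ; do
  dbdeployer deploy single $version
  ./localtests/test.sh                  # integration test entry point (path assumed)
  dbdeployer delete msb_${version//./_}
done
```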

orchestrator 3.0.6: faster crash detection & recoveries, auto Pseudo-GTID, semi-sync and more

January 29, 2018

orchestrator 3.0.6 is released and includes some exciting improvements and features. It quickly follows up on 3.0.5 released recently, and this post gives a breakdown of some notable changes:

Faster failure detection

Recall that orchestrator uses a holistic approach for failure detection: it reads state not only from the failed server (e.g. master) but also from its replicas. orchestrator now detects failure faster than before:

  • A detection cycle has been eliminated, leading to quicker resolution of a failure. On our setup, where we poll servers every 5sec, failure detection time dropped from 7-10sec to 3-5sec, without sacrificing reliability: the reduction in time does not lead to increased false positives.
    Side note: you may see increased not-quite-failure analysis such as "I can't see the master" (UnreachableMaster).
  • Better handling of network scenarios where packets are dropped. Instead of hanging till TCP timeout, orchestrator now observes server discovery asynchronously. We have specialized failover tests that simulate dropped packets. The change reduces detection time by some 5sec.

Faster master recoveries

Promoting a new master is a complex task which attempts to promote the best replica out of the pool of replicas. It's not always the most up-to-date replica. The choice varies depending on replica configuration, version, and state.

With recent changes, orchestrator is able to recognize, early on, whether the replica it would like to promote as master is the ideal candidate. When that is the case, orchestrator immediately promotes it (i.e. runs hooks, sets read_only=0 etc.), and runs the rest of the failover logic, i.e. the rewiring of replicas under the newly promoted master, asynchronously.

This allows the promoted server to take writes sooner, even while its replicas are not yet connected. It also means external hooks are executed sooner.

Between faster detection and faster recoveries, we're looking at some 10sec reduction in overall recovery time: from the moment of crash to the moment a new master accepts writes. We stand now at < 20sec in almost all cases, and < 15sec in optimal cases. Those times are measured on our failover tests.

We are working on reducing failover time unrelated to orchestrator and hope to update soon.

Automated Pseudo-GTID

As a reminder, Pseudo-GTID is an alternative to GTID, without the kind of commitment you make with GTID. It provides the same "point your replica under any other server" behavior that GTID allows. Continue Reading »

Implementing non re-entrant functions in Golang

January 4, 2018

A non re-entrant function is a function that can only be executing once at any point in time, regardless of how many times it is invoked and by how many goroutines.

This post illustrates blocking non re-entrant functions and yielding non re-entrant functions implementations in golang.

A use case

A service is polling for some conditions, monitoring some statuses once per second. We want each status to be checked independently of others without blocking. An implementation might look like:

func main() {
    tick := time.Tick(time.Second)
    go func() {
        for range tick {
            go CheckSomeStatus()
            go CheckAnotherStatus()
        }
    }()
    // ... rest of the service ...
}

We choose to run each status check in its own goroutine so that CheckAnotherStatus() doesn't wait upon CheckSomeStatus() to complete.

Each of these checks typically takes a very short amount of time, far less than a second. What happens, though, if CheckAnotherStatus() itself takes more than one second to run? Perhaps there's an unexpected network or disk latency affecting the execution time of the check.

Does it make sense for the function to be executed twice at the same time? If not, we want it to be non re-entrant. Continue Reading »

orchestrator 3.0.3: auto provisioning raft nodes, native Consul support and more

November 16, 2017

orchestrator 3.0.3 is released! There's been a lot going on since 3.0.2:

orchestrator/raft: auto-provisioning nodes via lightweight snapshots

In an orchestrator/raft setup, we have n hosts forming a raft cluster. In a 3-node setup, for example, one node can go down, and still the remaining two will form a consensus, keeping the service operational. What happens when the failed node returns?

With 3.0.3 the failed node can go down for as long as it wants. Once it comes back, it attempts to join the raft cluster. A node keeps its own snapshots and its raft log outside the relational backend DB. If it has recent-enough data, it just needs to catch up with the raft replication log, which it acquires from one of the active nodes.

If its data is very stale, it will request a snapshot from an active node, which it will import, and will just resume from that point.

If its data is gone, that's not a problem. It gets a snapshot from an active node, imports it, and keeps running from that point.

If it's a newly provisioned box, that's not a problem. It gets a snapshot from an active node, ... etc.

  • SQLite backed setups can just bootstrap new nodes. No need to dump+load or import any data.
    • Side effect: you may actually use :memory:, where SQLite does not persist any data to disk. Remember that the raft snapshots and replication log will cover you. The cheat is that the raft replication log itself is managed and persisted by an independent SQLite database.
  • MySQL backed setups will still need to make sure orchestrator has the privileges to deploy itself.

More info in the docs.

This plays very nicely into the hands of Kubernetes, which is on orchestrator's roadmap.

Key Value, native Consul support (Zk TODO)

orchestrator now supports Key-Value stores built-in, and Consul in particular.

At this time the purpose of orchestrator KV is to support master discovery. orchestrator will write the identity of the master of each cluster to KV store. The user will use that information to apply changes to their infrastructure.

For example, the user will rely on Consul KV entries, written by orchestrator, to generate proxy config files via consul-template, such that traffic is directed via the proxy onto the correct master.
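For illustration (the KV key layout and cluster alias here are assumptions; the actual path depends on your orchestrator configuration), a consul-template fragment for an HAProxy config might read:

```
# haproxy.cfg.ctmpl (sketch): route writes to the current master,
# as published by orchestrator into Consul KV
listen mysql-writer
    bind *:3306
    server master {{ key "mysql/master/mycluster" }} check
```

consul-template re-renders the file and reloads the proxy whenever the key changes, i.e. whenever orchestrator performs a failover.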

orchestrator supports:

  • Manually writing identity of cluster's master to KV store
    • e.g. `orchestrator-client -c submit-masters-to-kv-stores -alias mycluster`
  • Automatically updating the master's identity upon failover

Key-value pairs are in the form of `<cluster-alias>` → `<master>`. For example:

  • Key is `main_cluster`
  • Value is the identity (hostname & port) of the cluster's master

Web UI improvements

Using the web UI, you can now: Continue Reading »

gh-ost 1.0.42 released: JSON support, optimizations

September 14, 2017

gh-ost 1.0.42 is released and available for download.


MySQL 5.7's JSON data type is now supported.

There is a soft limitation: your JSON column may not be part of your PRIMARY KEY. MySQL doesn't currently support that anyhow.
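By way of illustration (host, credentials and schema names are placeholders), a migration adding a JSON column could be invoked as:

```
gh-ost \
  --host=replica.example.com --user=gh-ost --ask-pass \
  --database=mydb --table=mytable \
  --alter="ADD COLUMN attrs JSON" \
  --execute
```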


Two noteworthy changes are:

  • Client side prepared statements reduce network traffic and round trips to the server.
  • Range query iteration avoids creating temporary tables and filesorting.

We have not run benchmarks at this time to measure the performance gains.


More tests validating 5.7 compatibility (at this time GitHub runs MySQL 5.7 in production).


Many other changes included.

We are grateful for all community feedback in form of open Issues, Pull Requests and questions!

gh-ost is authored by GitHub. It is free and open source and is available under the MIT license.


In two weeks time, Jonah Berquist will present gh-ost: Triggerless, Painless, Trusted Online Schema Migrations at Percona Live, Dublin.

Tom Krouper and I will present MySQL Infrastructure Testing Automation at GitHub, where, among other things, we describe how we test gh-ost in production.

Speaking at Percona Live Dublin: keynote, orchestrator tutorial, MySQL testing automation

September 13, 2017

I'm looking forward to a busy Percona Live Dublin conference, delivering three talks. Chronologically, these are:

  • Practical orchestrator tutorial
    Attend this 3 hour tutorial for a thorough overview on orchestrator: what, why, how to configure, best advice, deployments, failovers, security, high availability, common operations, ...
    We will of course discuss the new orchestrator/raft setup and share our experience running it in production.
    The tutorial will allow for general questions from the audience and open discussions.
  • Why Open Sourcing Our Database Tooling was the Smart Decision
    What it says. A 10 minute journey advocating for open sourcing infrastructure.
  • MySQL Infrastructure Testing Automation at GitHub
    Co-presenting with Tom Krouper, we share how & why we run infrastructure tests in and near production that give us trust in many of our ongoing, ever-changing operations. Essentially this is "why you should feel OK trusting us with your data".

See you there!

orchestrator 3.0.2 GA released: raft consensus, SQLite

September 12, 2017

orchestrator 3.0.2 GA is released and available for download (see also packagecloud repository).

3.0.2 is the first stable release in the 3.0* series, introducing (recap from 3.0 pre-release announcement):


Raft is a consensus protocol, supporting leader election and consensus across a distributed system.  In an orchestrator/raft setup orchestrator nodes talk to each other via raft protocol, form consensus and elect a leader. Each orchestrator node has its own dedicated backend database. The backend databases do not speak to each other; only the orchestrator nodes speak to each other.

No MySQL replication setup needed; the backend DBs act as standalone servers. In fact, the backend server doesn't have to be MySQL, and SQLite is supported. orchestrator now ships with SQLite embedded, no external dependency needed.

For details, please refer to the documentation:


Suggested and requested by many was the removal of orchestrator's own dependency on a MySQL backend. orchestrator now supports a SQLite backend.

SQLite is a transactional, relational, embedded database, and as of 3.0 it is embedded within orchestrator, no external dependency required.
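Opting into the SQLite backend is a small configuration change; a sketch (the data file path is illustrative):

```
{
  "BackendDB": "sqlite",
  "SQLite3DataFile": "/var/lib/orchestrator/orchestrator.sqlite3"
}
```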


orchestrator-client is a client shell script which mimics the command line interface, while running curl | jq requests against the HTTP API. It stands to simplify your deployments: interacting with the orchestrator service via orchestrator-client is easier and only requires you to place a shell script (this is as opposed to installing the orchestrator binary + configuration file).

orchestrator-client is the way to interact with your orchestrator/raft cluster. orchestrator-client now has its own RPM/deb release package.

You may still use the web interface and web API; and a special --ignore-raft-setup flag keeps power in your hands (use at your own risk).

State of orchestrator/raft

orchestrator/raft is a big change: Continue Reading »

Remembering Jaakko Pesonen

September 9, 2017

I was sorrowed to hear that Jaakko Pesonen has passed away after battling cancer.

I first met Jaakko a few years back, during a Percona Live conference, and as community goes, our paths crossed again a few times. He spoke at and attended conferences where we'd have casual chats.

We were both expats in the Netherlands for a period. As I moved in from Israel, he was already working at Spil Games, having relocated from Finland, his home country. We shared expat experiences and longing for our homes. One day he pinged me that he was planning a trip to Israel - and the next few days were all about planning the best culinary experience of his travel (he approved of the results).

He was happy for the opportunity to work for Percona, as this allowed him to move back home to Finland.

Jaakko had the biggest, widest, most consuming smile, and this smile will sure be the most vivid memory of him that I'll keep.

I do not have personal pictures of Jaakko. This picture was taken by Julian Cash at Percona Live. A rare non-smiling appearance.




Speaking at August Penguin, MySQL Track, GitHub sponsored

September 3, 2017

This Thursday I'll be presenting at August Penguin, conveniently taking place September 7th-8th in Ramat Gan, Israel.

I will be speaking as part of the MySQL track, 2nd half of Thursday. The (Hebrew) schedule is here.

My talk is titled Reliable failovers, safe schema migrations: open source solutions to MySQL problems. I will describe some of the open source MySQL infrastructure work we run at GitHub; how it addresses reliability, availability and usability. I'll describe some of our internal workflows and our use of chat and chatops.

I'm proud to announce GitHub sponsors the event. We won't have a booth, but please do grab me in the hallways or over lunch to chat!

And, yes, octocat stickers will be made available 🙂


orchestrator/raft: Pre-Release 3.0

August 3, 2017

orchestrator 3.0 Pre-Release is now available. Most notable are Raft consensus, SQLite backend support, orchestrator-client no-binary-required client script.


You may now set up high availability for orchestrator via raft consensus, without the need to set up high availability for orchestrator's backend MySQL servers (such as Galera/InnoDB Cluster). In fact, you can run an orchestrator/raft setup using an embedded SQLite backend DB. Read on.

orchestrator still supports the existing shared backend DB paradigm; nothing dramatic changes if you upgrade to 3.0 and do not configure raft.
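Enabling raft is likewise a configuration matter. A sketch for one node of a 3-node setup (hostnames are illustrative; see the orchestrator/raft documentation for the full set of variables):

```
{
  "RaftEnabled": true,
  "RaftDataDir": "/var/lib/orchestrator",
  "RaftBind": "node1.example.com",
  "DefaultRaftPort": 10008,
  "RaftNodes": ["node1.example.com", "node2.example.com", "node3.example.com"]
}
```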


Raft is a consensus protocol, supporting leader election and consensus across a distributed system.  In an orchestrator/raft setup orchestrator nodes talk to each other via raft protocol, form consensus and elect a leader. Each orchestrator node has its own dedicated backend database. The backend databases do not speak to each other; only the orchestrator nodes speak to each other.

No MySQL replication setup needed; the backend DBs act as standalone servers. In fact, the backend server doesn't have to be MySQL, and SQLite is supported. orchestrator now ships with SQLite embedded, no external dependency needed. Continue Reading »
