I read with interest MySQL High Availability tools – Comparing MHA, MRM and ClusterControl by SeveralNines. I thought there was a missing piece in the comparison: orchestrator, and that as result the comparion was missing scope and context.
I’d like to add my thoughts on topics addressed in the post. I’m by no means an expert on MHA, MRM or ClusterControl, and will mostly focus on how orchestrator
tackles high availability issues raised in the post.
What this is
This is to add insights on the complexity of failovers. Over the duration of three years, I always think I’ve seen it all, and then get hit by yet a new crazy scenario. Doing the right thing automatically is difficult.
In this post, I’m not trying to convince you to use orchestrator
(though I’d be happy if you did). To be very clear, I’m not claiming it is better than any other tool. As always, each tool has pros and cons.
This post does not claim other tools are not good. Nor that orchestrator
has all the answers. At the end of the day, pick the solution that works best for you. I’m happy to use a solution that reliably solves 99% of the cases as opposed to an unreliable solution that claims to solve 99.99% of the cases.
Quick background
orchestrator
is actively maintained by GitHub. It manages automated failovers at GitHub. It manages automated failovers at Booking.com, one of the largest MySQL setups on this planet. It manages automated failovers as part of Vitess. These are some names I’m free to disclose, and browsing the issues shows a few more users running failovers in production. Otherwise, it is used for topology management and visualization in a large number of companies such as Square, Etsy, Sendgrid, Godaddy and more.
Let’s now follow one-by-one the observations on the SeveralNines post. Continue reading » ““MySQL High Availability tools” followup, the missing piece: orchestrator”