Being a stateful, centralized datastore, MySQL can serve in negotiating leadership: a mechanism to elect a single service out of multiple services; moreover, a mechanism to promote a new leader should the existing leader cease to function.
What of Zookeeper?
Zookeeper makes for an excellent leader election mechanism. This is one of the most recognized uses for Zookeeper. It has HA via multiple nodes & quorum, ephemeral nodes, all you need. To achieve similar benefits with MySQL you’d need to use Galera or NDB Cluster; so why not use Zk? The use case at hand is
orchestrator, a multi-node, mostly stateless service that happens to use MySQL as backend datastore. Ir relies on MySQL to exist in backend. It already expects it to be there. If the MySQL server is down, so is the service, effectively. In such case it doesn’t hurt adding another dependency on MySQL; this does not reduce HA. You need to take care of MySQL HA anyhow so there’s no additional cost. In fact, going to Zookeeper makes the additional cost as you introduce a new component to the system that can be avoided.
Terms of the solution
Our proposed solution offers:
- Single leader election out of multiple nodes
- Leader actively reaffirms its leadership periodically
- Timeout based re-election: decision to re-elect new leader based on the fact current leader has not reaffirmed its leadership over X seconds
- A way to forcibly assume leadership for a specific node
- A way to forcibly call for re-elections by demoting existing leader
- A node/service can easily tell whether it’s the leader or not
- Anyone can tell who the leader is
SQL solution
The solution is composed of a single table and a set of queries which implement the above offers. We assume a service can uniquely identify itself; this is easy to achieve:
- If services are running from different hosts (as should be the case, this is service HA), use hostname for ID
- But what if the service restarts? Are you good with calling this “the same service” or is this now a new service running on the same host?
- In such case use combination of hostname & OS process ID
- Or generate a random token upon startup
- Or use startup timestamp Whichever solution you pick, make sure it is human readable, such that it is easy to tell
which service is the leader. This helps operations. We note this as service_id
Table
The following table will have a single row; the
service_id in that row is the active leader.
CREATE TABLE service_election (
anchor tinyint(3) unsigned NOT NULL,
service_id varchar(128) NOT NULL,
last_seen_active timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (anchor)
) ENGINE=InnoDB