Leader election using MySQL

October 14, 2015

Being a stateful, centralized datastore, MySQL can serve in negotiating leadership: a mechanism to elect a single service out of multiple services; moreover, a mechanism to promote a new leader should the existing leader cease to function.

What of Zookeeper?

Zookeeper makes for an excellent leader election mechanism; this is one of its most recognized uses. It has HA via multiple nodes and quorum, ephemeral nodes, all you need. To achieve similar benefits with MySQL you'd need to use Galera or NDB Cluster; so why not use Zookeeper?

The use case at hand is orchestrator, a multi-node, mostly stateless service that happens to use MySQL as its backend datastore. It relies on MySQL to exist in the backend; it already expects it to be there. If the MySQL server is down, so is the service, effectively. In such a case it doesn't hurt to add another dependency on MySQL; this does not reduce HA. You need to take care of MySQL HA anyhow, so there's no additional cost. In fact, going to Zookeeper is what adds cost, as you introduce a new component to the system that could be avoided.

Terms of the solution

Our proposed solution offers:

  • Single leader election out of multiple nodes
  • Leader actively reaffirms its leadership periodically
  • Timeout-based re-election: the decision to elect a new leader is based on the fact that the current leader has not reaffirmed its leadership for X seconds
  • A way to forcibly assume leadership for a specific node
  • A way to forcibly call for re-elections by demoting existing leader
  • A node/service can easily tell whether it's the leader or not
  • Anyone can tell who the leader is

SQL solution

The solution is composed of a single table and a set of queries which implement the above offers. We assume a service can uniquely identify itself; this is easy to achieve:

  • If services are running from different hosts (as should be the case, this is service HA), use hostname for ID
    • But what if the service restarts? Are you good with calling this "the same service" or is this now a new service running on the same host?
  • In such a case use a combination of hostname and OS process ID
    • Or generate a random token upon startup
    • Or use startup timestamp

Whichever solution you pick, make sure it is human readable, such that it is easy to tell which service is the leader. This helps operations. We refer to this as service_id.
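
As an illustration of the options above, here is a minimal Python sketch that combines hostname, OS process ID and a random startup token into one readable identifier. The function name make_service_id, the ":" separator and the token length are assumptions made for this example, not part of the original solution.

```python
import os
import secrets
import socket


def make_service_id(hostname=None, pid=None, token=None):
    """Build a human-readable id of the form 'hostname:pid:token'."""
    hostname = hostname or socket.gethostname()
    pid = pid if pid is not None else os.getpid()
    # 3 random bytes -> 6 hex characters, regenerated on every startup.
    token = token or secrets.token_hex(3)
    service_id = f"{hostname}:{pid}:{token}"
    # The service_id column is varchar(128); fail early rather than truncate.
    if len(service_id) > 128:
        raise ValueError("service_id exceeds 128 characters")
    return service_id


# Explicit arguments make the result predictable; with no arguments
# the local hostname, current PID and a fresh token are used.
example_id = make_service_id("db-host-1", 4242, "9f3a1c")
```

With this scheme a restarted service gets a new identity (different PID and token), while an operator can still read the host directly off the leader's service_id.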

Table

The following table will have a single row; the service_id in that row is the active leader.

CREATE TABLE service_election (
  anchor tinyint(3) unsigned NOT NULL,
  service_id varchar(128) NOT NULL,
  last_seen_active timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (anchor)
) ENGINE=InnoDB

Attempt election

All services issue the following, periodically. Say every service issues this once per second:

insert ignore into service_election (
	anchor, service_id, last_seen_active
) values (
	1, 'my_service_number_7', now()
) on duplicate key update
	service_id = if(last_seen_active < now() - interval 20 second, values(service_id), service_id),
	last_seen_active = if(service_id = values(service_id), values(last_seen_active), last_seen_active)
;

Replace 'my_service_number_7' with specific service_id per service.

The above is the most complex query in our solution; breakdown:

  • Assume timeout is set for 20 seconds
  • The first ever service (my_service_number_7) succeeds in registering (inserting its own service_id into the table)
  • The next time this service issues the query, it finds that it is already the owner, hence updates last_seen_active.
    • And while it is alive and keeps on polling, it remains the leader.
  • Immediately following, another service (my_service_number_12) does not update anything: since last_seen_active < now() - interval 20 second is false, service_id is unchanged, and so is last_seen_active.
  • Assuming said service (my_service_number_7) has died and hasn't injected anything in the last 20 seconds, a different service will inject its own service_id as well as update last_seen_active. It will become the leader.
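
The breakdown above can be traced with a small in-memory model of the single row. This is only a sketch of the query's logic in Python, not a replacement for it: the function attempt_election and the tuple representation of the row are made up for illustration. It assumes, as the query itself does, that MySQL evaluates the two assignments in order, so that the second one sees the service_id value just produced by the first.

```python
from datetime import datetime, timedelta

TIMEOUT = timedelta(seconds=20)


def attempt_election(row, service_id, now):
    """Apply the election statement to `row` (None while the table is empty).

    Returns the resulting row as a (service_id, last_seen_active) tuple.
    """
    if row is None:
        # No row with anchor=1 yet: the plain insert succeeds.
        return (service_id, now)
    leader, last_seen = row
    # First assignment: take over only if the leader has timed out.
    if last_seen < now - TIMEOUT:
        leader = service_id
    # Second assignment: refresh the timestamp only for the (possibly new) leader.
    if leader == service_id:
        last_seen = now
    return (leader, last_seen)


t0 = datetime(2015, 10, 14, 12, 0, 0)
row = attempt_election(None, "my_service_number_7", t0)
# A competing service polls one second later: nothing changes.
row = attempt_election(row, "my_service_number_12", t0 + timedelta(seconds=1))
# The leader reaffirms at t0+5s, then dies; 21 seconds later another service takes over.
row = attempt_election(row, "my_service_number_7", t0 + timedelta(seconds=5))
row = attempt_election(row, "my_service_number_12", t0 + timedelta(seconds=26))
```

Note that the comparison is strict: a leader last seen exactly 20 seconds ago is not yet replaced.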

Forcibly assume leadership

Make a certain service the leader:

replace into service_election (
        anchor, service_id, last_seen_active
    ) values (
        1, 'my_service_number_12', now()
    )

The next time my_service_number_12 attempts to register its own leadership, it finds out it's already the leader and updates last_seen_active.

Force re-elections

The next service to attempt election will succeed after this:

delete from service_election;

Am I the leader?

A service can query as follows:

select count(*) as is_leader from service_election where anchor=1 and service_id='my_service_number_7';

Who is the leader?

This is of course very simple:

select max(service_id) as leader from service_election where anchor=1;

What of GET_LOCK()?

GET_LOCK() allows one to acquire a helper lock of an arbitrary name; this lock is held for as long as the connection that acquired it lives. No other connection can acquire a lock of the same name.

SELECT GET_LOCK("my_service_leadership", 0)

Returns 1 on success, 0 on failure to obtain lock.

This supposedly makes a much simpler leader election mechanism. However I don't like it for the following reasons:

  • You must allocate a dedicated connection that lives for the duration of your service
    • This doesn't play too well with ORMs or typical connection pools
  • There is no visibility (< MySQL 5.7) into who is holding the lock. You only know that it's being held.
  • If your service hangs (but does not die), the lock is still being held. There is no "keepalive" requirement on the lock's side.

Comments

  • Hi, I'm wondering how you would implement fencing with this technique?
    To be more precise, I don't want any server that is not the leader to alter any data. Maybe expressed (not working) as:

    IF (SELECT count(*) as is_leader from service_election WHERE anchor=1 AND service_id='my_service_number_7') = 1 THEN
    do update like insert or update or delete
    END IF

    In this way, we can prevent a malfunctioning master from polluting data. But I don't know if we can express it in SQL.

  • @xudifsd I'm not sure I understand the question.

    > I don't want any server who is not leader to alter any data

    Which data is that? Any data?

    I wouldn't incorporate SQL leader logic into every single query in your app. Instead, your app can have a flag, frequently updated, saying "am I the leader".
    If you're OK with the app ceasing to write for a second as the leader changes, you can further require that a new leader only begins writing at least one second after becoming leader, during which time the previous leader completes its own writes.

  • tbeerbower

    Nice, Thanks!

    I think it's worth mentioning that you can answer 'Am I the leader?' by looking at the number of affected rows on the insert/update. Only the leader will change the row values so you can update the row and check leadership in one statement. In JDBC something like ...

    Properties connectionProperties = new Properties();
    connectionProperties.put("user", username);
    connectionProperties.put("password", password);
    connectionProperties.put("useAffectedRows", "true");

    Connection connection = DriverManager.getConnection(url, connectionProperties);

    PreparedStatement statement = connection.prepareStatement(INSERT_ELECTION_STATEMENT);
    statement.setString(1, id);
    statement.setInt(2, timeout);

    boolean leader = statement.executeUpdate() > 0;

    Note that you need to set the useAffectedRows (MySQL-specific) property, since the JDBC-compliant default is to return the number of matched rows rather than the number of rows actually changed.

    ---

    Also, another benefit of your solution over GET_LOCK is that GET_LOCK doesn't work with MySQL Cluster since the lock is not visible outside of a single MySQL process.

 