Making UUID() and RAND() replication safe

October 20, 2014

MySQL's UUID() and RAND() functions both provide with (pseudo) indeterministic result. UUID()'s result is moreover bound to the host on which it executes. For this reason, both are unsafe to replicate with STATEMENT binlog format. As an example, consider:

master> create table test.uuid_test (id int, u varchar(64));

master> insert into test.uuid_test values (1, UUID());
Query OK, 1 row affected, 1 warning (0.03 sec)

master> select * from test.uuid_test;
+------+--------------------------------------+
| id   | u                                    |
+------+--------------------------------------+
|    1 | 7e3596d8-56ac-11e4-b284-3c970ea31ea8 |
+------+--------------------------------------+

The warning we got on the insert directly relates to the following inconsistency on a slave:

slave1> select * from test.uuid_test;
+------+--------------------------------------+
| id   | u                                    |
+------+--------------------------------------+
|    1 | 7e379d63-56ac-11e4-8477-3c970ea31ea8 |
+------+--------------------------------------+

The data on the slave is clearly inconsistent with the master's. The slave, replicating via STATEMENT binlog format, re-executes the INSERT command and gets a different UUID value.

External

One solution to the above is to generate the UUID value from your application. By the time MySQL gets the INSERT statement, the UUID value is a constant string, as far as MySQL is concerned.

Internal

However there's a way to do it from within MySQL, by decoupling the UUID() function from the INSERT statement. It takes a session variable. Consider:

master> set @safe_uuid := UUID();
Query OK, 0 rows affected (0.00 sec)

master> insert into test.uuid_test values (2, @safe_uuid);
Query OK, 1 row affected (0.02 sec)

master> select * from test.uuid_test;
+------+--------------------------------------+
| id   | u                                    |
+------+--------------------------------------+
|    1 | 7e3596d8-56ac-11e4-b284-3c970ea31ea8 |
|    2 | 29c51fb9-56ad-11e4-b284-3c970ea31ea8 |
+------+--------------------------------------+

And on a slave:

slave1> select * from test.uuid_test;
+------+--------------------------------------+
| id   | u                                    |
+------+--------------------------------------+
|    1 | 7e379d63-56ac-11e4-8477-3c970ea31ea8 |
|    2 | 29c51fb9-56ad-11e4-b284-3c970ea31ea8 |
+------+--------------------------------------+

The reason why this succeeds is that MySQL stores session variable values that are being used by DML queries in the binary log. It just so happened that @safe_uuid was assigned the UUID() value, but it could just as well have been assigned a constant or other computation. MySQL stored the resulting value into the binary log, where it is forces upon the slave to use. Check out this binary log snippet:

# at 14251
#141018 12:57:35 server id 1  end_log_pos 14319         Query   thread_id=2     exec_time=0     error_code=0
SET TIMESTAMP=1413626255/*!*/;
SET @@session.sql_auto_is_null=0/*!*/;
BEGIN
/*!*/;
# at 14319
#141018 12:57:35 server id 1  end_log_pos 14397         User_var
SET @`safe_uuid`:=_utf8 0x32396335316662392D353661642D313165342D623238342D336339373065613331656138 COLLATE `utf8_general_ci`/*!*/;
# at 14397
#141018 12:57:35 server id 1  end_log_pos 14509         Query   thread_id=2     exec_time=0     error_code=0
SET TIMESTAMP=1413626255/*!*/;
insert into test.uuid_test values (2, @safe_uuid)
/*!*/;
# at 14509
#141018 12:57:35 server id 1  end_log_pos 14536         Xid = 145
COMMIT/*!*/;

The same can be applied for RAND(). Funny thing about RAND() is that it is already taken care of by the binary log via SET @@RAND_SEED1, SET @@RAND_SEED2 statements (i.e. it works), though the documentation clearly states it is unsafe.

With Row Based Replication (RBR) the problem never arises in the first place since the binlog contains the values of the new/updated rows.

  • What UUID version is used in MySQL function ?

    There are several versions for it, each one uses something a bit different.

    http://en.wikipedia.org/wiki/Universally_unique_identifier#Variants_and_versions

  • @ik_5, please follow the UUID() link in text to find out what version is used.

  • It reminds me of the similar method which can be used for password() so that it doesn't put the cleartext password in the binlog.

    Why would you do this and not set the binlog_format to MIXED (or ROW) for the server or session?

  • Daniel, if the slaves are replicating via STATEMENT based replication, and have log_slave_updates, then replication on those slaves will fail.

  • Pingback: Pseudo GTID | code.openark.org()

 
Powered by Wordpress and MySQL. Theme by openark.org