Notes on row based replication

MySQL’s Row Based Replication (RBR) succeeds (though not replaces) Statement Based Replication (SBR), as of version 5.1.

Anyone who is familiar with replication data drift — the unexplained growing data difference between master & slave — might wish to look into row based replication. On multiple servers I’m handling the change to RBR has eliminated (to the best of my findings) replication data drift.

This is easily explained, as RBR writes particular row IDs into the binary log. You no longer need to hope the statement

DELETE FROM my_table ORDER BY my_column LIMIT 100

acts deterministically on all servers (is my_column UNIQUE?). With row based replication the particular rows deleted on the master are then deleted on the slave.

Here are three notes on RBR:

  • Binary log size:

With statement based replication the binary log merely logs the statement – typically a short text. But RBR logs the row IDs and changes made to data.

After we finally got used to cheap huge disks, and forgot about the need for cleaning up the binary logs, RBR introduces bloated logs. On some servers I’am again confounded by the fact that 3 days worth of logs will hog my entire disk space.

  • Forgiveness

RBR is not as forgiving as SBR. With SBR, you could DELETE some rows from the master. If they’re already missing on the slave, no problem here. With RBR, such an incident makes for replication failure (RBR: “I’m supposed to DELETE rows 3, 4, 5 but can’t find them! I can’t to my job like this! Heeelp!”)

This is not a bad thing: you get an early alert that something went wrong.

  • Less slave effort

On the up side, the slave does not need to do much thinking. Given a DELETE command, the slave does not need to look up those rows WHERE some_condition IS TRUE. Instead, the master does all the hard work, the slave just gets the IDs of rows to delete.

I find that this boosts up replication speed in some scenarios, and in particular the scenario of data cleanup: those nightly/weekly purging of old, unused data. If you look hard and all you find are 5 rows to delete, all the slave needs to do is to delete those indicated 5 rows. With single-threaded replication this makes a real difference.

13 thoughts on “Notes on row based replication

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.