Seconds_behind_master vs. Absolute slave lag

I am unable to bring myself to trust the Seconds_behind_master value on SHOW SLAVE STATUS. Even with MySQL 5.5’s CHANGE MASTER TO … MASTER_HEARTBEAT_PERIOD (a good thing, applied when no traffic goes from master to slave) it’s easy and common to find fluctuations in the Seconds_behind_master value.

And, when sampled by your favourite monitoring tool, this often leads to many false negatives.

At Outbrain we use HAProxy as a proxy to our slaves, on multiple clusters. More about that in a future post. What’s important here is that our decision whether a slave enters or leaves a certain pool (i.e. gets UP or DOWN status in HAProxy) is based on replication lag. Taking slaves out when they are actually replicating well is bad, since this reduces the number of serving instances. Putting slaves in the pool when they are actually lagging too much is bad, as they serve stale data.
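To make the pool decision concrete, here is a minimal sketch of the kind of rule an HTTP health check could apply. This is not our actual check: the function name, the threshold and the choice of status codes are assumptions for illustration. It relies only on the fact that HAProxy's HTTP checks treat a 2xx/3xx response as UP and anything else as DOWN.

```python
from typing import Optional

# Hypothetical threshold: the maximum lag (in seconds) we tolerate
# before taking a slave out of the pool.
MAX_LAG_SECONDS = 10.0


def health_status(lag_seconds: Optional[float],
                  max_lag: float = MAX_LAG_SECONDS) -> int:
    """Map a measured replication lag to an HTTP status code.

    200 keeps the slave in the pool; 503 takes it out.
    A lag of None means the lag could not be measured at all
    (e.g. replication is not running), which we treat as DOWN.
    """
    if lag_seconds is None:
        return 503
    return 200 if lag_seconds <= max_lag else 503
```

The rule is only as good as the lag value fed into it: a fluctuating or misleading measurement flips slaves in and out of the pool for no good reason.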

To top it all, even when correct, the Seconds_behind_master value is practically irrelevant on 2nd level slaves. In a Master -> Slave1 -> Slave2 setup, what does it mean that Slave2 has Seconds_behind_master = 0? Nothing much to the application: Slave1 might be lagging an hour behind the master, or may not be replicating at all. Slave2 might have an hour’s data missing even though it says its own replication is fine.

None of the above is news, and yet many fall into this pitfall. The solution is quite old as well; it is also very simple: run your own heartbeat mechanism, at your favourite time resolution, and measure slave lag by a timestamp you yourself updated on the master.
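A minimal sketch of such a heartbeat follows. To keep it self-contained and runnable, sqlite3 stands in for MySQL, and the table and function names (heartbeat, beat, absolute_lag) are made up for this example; against MySQL you would issue the equivalent statements through your driver of choice. The idea is that a job on the master keeps overwriting a single row with the current time, the row travels down the replication chain, and each slave compares the value it sees with its own clock.

```python
import sqlite3
import time
from typing import Optional


def init(conn: sqlite3.Connection) -> None:
    # Hypothetical single-row table; created once on the master and replicated.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS heartbeat ("
        "id INTEGER PRIMARY KEY, ts REAL NOT NULL)"
    )


def beat(master_conn: sqlite3.Connection, now: Optional[float] = None) -> None:
    """Run periodically on the master: overwrite the single heartbeat row."""
    now = time.time() if now is None else now
    master_conn.execute(
        "INSERT OR REPLACE INTO heartbeat (id, ts) VALUES (1, ?)", (now,)
    )
    master_conn.commit()


def absolute_lag(slave_conn: sqlite3.Connection,
                 now: Optional[float] = None) -> Optional[float]:
    """Run on any slave: seconds since the last heartbeat that reached it.

    Returns None if no heartbeat row has arrived yet.
    """
    now = time.time() if now is None else now
    row = slave_conn.execute("SELECT ts FROM heartbeat WHERE id = 1").fetchone()
    if row is None:
        return None
    return now - row[0]
```

Because the timestamp originates on the master, the result means the same thing on a second-level slave as on a first-level one. It does assume the hosts' clocks are kept in sync, and its resolution is bounded by how often beat() runs.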

Maatkit/percona-toolkit did this a long time ago with mk-heartbeat/pt-heartbeat. We’re doing it in a very similar manner. The benefit is obvious. Consider the following two graphs; the first shows Seconds_behind_master, the second shows our own Absolute_slave_lag measurement.

Percona Live 2014 schedule released; BoF and Lightning Talks Call for Papers continues

The complete tutorial & session schedule for Percona Live MySQL Conference & Expo 2014 is released. This schedule offers both a sense of achievement as well as a sense of regret; for I believe the schedule is very good, and yet some good proposals had to be left out.

This is an inevitable result of a conference that is popular and receives far more proposals than can fit within the time frames. This conference offers 96 session slots and 16 3-hour tutorial slots. We got well over 300 proposals — I’m not even sure how to count them — and they just can’t all fit in. My sincere apologies to all those left out. A proposal of mine was just rejected yesterday from another conference; I can sympathize and empathize with all turned down.

As part of our interest in having a diversity of talks and speakers, we have promoted talks by less frequent speakers and newly presenting companies. We are happy to grow the community!

Although titled “Percona Live”, this conference’s program is managed by a diverse and independent committee. We had good discussions and some very good thinking and advice were offered. I’m happy to acknowledge and thank the committee members:

  • Cédric Peintre, Dailymotion
  • Giuseppe Maxia, Continuent
  • Ivan Zoratti, SkySQL
  • Jay Janssen, Percona
  • Jeremy Cole, Google
  • Laine Campbell, PalominoDB (now Blackbird, congrats!)
  • Liz van Dijk, Percona
  • Roland Bouman, Pentaho
  • Tim Callaghan, Tokutek
  • Todd Farmer, Oracle
  • myself, Outbrain

Looking at the schedule, I’m as always eager to attend many more sessions than I can; until I get more replicas of myself, it’s again down to choosing between multiple prominent talks at each time slot.

Thank you to all those who submitted a proposal! (It’s cool, just saying)

Birds of a Feather, Lightning Talks

Call for papers continues! You are encouraged to submit your proposals until end of January. These proposals are reviewed by the committee, and eventually chosen and scheduled by Giuseppe Maxia. See also: