This post continues my impressions from some of the talks I’ve been to. I’ll dedicate this post to a single session.
Wednesday
- BoF: Replication, Lars Thalmann, Mat Keep (Oracle)
Lars Thallman presented the MySQL 5.5 replication features, as well as the expected 5.6 features. Among other features, one could notice parallel replication, binlog checksums, sub-second resolution and more. There was an open discussion about these features, asking for comments; looking for new ideas and suggestion from the audience.
I can’t possibly cover it all. I’ll note two discussion I participated in, and which have interested me. This also serves for noting down to myself my ideas and thoughts.
Parallel replication
As the trend goes, parallel replication will be implemented by having a thread per schema. My own contribution to the discussion is that with such paradigm, the slave may be at any state inconsistent with the master. That is, at any point in time it can be in such state that has never existed in any point in time on the master. This is because it is possible that one or more of the replication threads executes much faster than one or more of the others.
We have discussed the meaning of STOP SLAVE. I have suggested that STOP SLAVE works in such way that it stops in a consistent slave. Lars suggested there could be normal STOP SLAVE and STOP SLAVE CONSISTENT.
It has been noted that we may not care about the slave being inconsistent, as the prerequisite for parallel replication is that the distinct schemata are independent of one another. After some thought, I disagree. Today parallel replication may be implemented on different schemata; tomorrow – based on finer parameters. I think it’s best to prepare for the inevitable future. Besides, STOP SLAVE INCONSISTENT is such an easy feature to implement!
Slave analysis
I also had a long time feature request. While writing management tools, I realized it is difficult to learn the topology of a replication environment by simple interrogation. If you connect to a master, you can either issue SHOW PROCESSLIST to detect the slaves, or SHOW SLAVE HOSTS.
The former does not tell you the port on which the slaves are listening. The latter requires that you configure report_host and report_port properly.
What I would like to see is a SHOW SLAVES STATUS command (or similar INFORMATION_SCHEMA table) on the master. Such that it lists connected slaves, one per row, with the following details:
- Slave_host (this is easy, already exists in SHOW PROCESSLIST)
- Slave_port (slave should send this kind of information to master upon first connect)
- Slave_master_log_file (last log file requested by slave)
- Slave_master_log_pos (last log pos requested by slave)
- Slave_seconds_behind_master (slave should report this upon each request)
- Slave_server_id (slave should send this upon connect)
By reading Slave_host, Slave_port & Slave_server_id, it makes it easy for us to build replication topology. It’s a single operation to detect a master’s slaves. It takes recursive steps to build entire topology.
By reading the Slave_master_log_file, Slave_master_log_pos, we can easily determine that binary logs up to that point have already been written to the slave’s relay log. This makes it easier to decide how to purge master logs. Slave_seconds_behind_master completes the replication picture.
Shlomi,
You can do some topology discovery today, although it requires a few hacks:
http://datacharmer.blogspot.com/2008/08/introducing-mysql-community-driven.html
Giuseppe,
Thanks! I was aware of the I_S tables implementation, but not of the MySQL Community Replication Monitor.
Hi Shlomi,
As of 5.5 you are no longer required to properly configure the report_host option:
“Displays a list of replication slaves currently registered with the master. (Before MySQL 5.5.3, only slaves started with the –report-host=host_name option are visible in this list.)”
http://dev.mysql.com/doc/refman/5.5/en/show-slave-hosts.html
And.. I feel your pain with replication topology – to do this properly is quite a chore (we jump through a few hoops in MEM to do this as well).
We were lucky enough to have the replication team’s ear however, and 5.6 answers some prayers – each instance has a UUID automatically created for it on startup now, and that is then peppered through SHOW VARIABLES (server_uuid), SHOW SLAVE STATUS (Master_UUID) and SHOW SLAVE HOSTS (Slave_UUID):
http://dev.mysql.com/doc/refman/5.6/en/show-slave-status.html
http://dev.mysql.com/doc/refman/5.6/en/show-slave-hosts.html
You can just walk the uuid chain, assuming you are getting data for all of the servers in the topology. SHOW SLAVE HOSTS can tell you the ones you don’t know about consistently now as well.
Cheers,
Mark
@Mark,
Great! Thank for letting me know.
The SHOW SLAVE HOSTS in 5.6 + 5.6 are huge improvements to the 5.1 status.