I find myself converting more and more customers' databases to the InnoDB plugin. In one case, it was a last resort: disk space was running out, and the plugin's compression freed up 75% of the space; in another, a slow disk made for IO bottlenecks, and the plugin's improvements & compression alleviated the problem; in yet another, I used the above to fight replication lag on a stubborn slave.
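For readers who have not tried it, here is a minimal sketch of what enabling the plugin's compression on existing tables involves. It only builds the SQL text; the helper name `compression_statements` and the table names are hypothetical, and it assumes the InnoDB plugin is already installed and active. The right `KEY_BLOCK_SIZE`, and how much space you actually get back, depend on your data.

```python
def compression_statements(tables, key_block_size=8):
    """Build the SQL for rebuilding tables as compressed InnoDB tables.

    Hypothetical helper for illustration: it returns statements as strings
    and does not connect to any server.
    """
    if key_block_size not in (1, 2, 4, 8, 16):
        raise ValueError("KEY_BLOCK_SIZE must be 1, 2, 4, 8 or 16 (KB)")

    statements = [
        # Compressed tables require per-table tablespaces and the
        # Barracuda file format (both settable at runtime in the plugin).
        "SET GLOBAL innodb_file_per_table = 1;",
        "SET GLOBAL innodb_file_format = 'Barracuda';",
    ]
    for table in tables:
        # Each ALTER rebuilds the whole table, so expect it to take time
        # and extra disk space while it runs.
        statements.append(
            "ALTER TABLE `%s` ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=%d;"
            % (table, key_block_size)
        )
    return statements


if __name__ == "__main__":
    for sql in compression_statements(["orders", "order_items"]):
        print(sql)
```

Trying a couple of block sizes on a copy of the largest table first is probably the safest way to pick one.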
In all those cases, I needed to justify the move to "new technology". The questions "Is it GA? Is it stable?" are being asked a lot. Well, just a few days ago the MySQL 5.1 distribution started shipping with InnoDB plugin 1.0.4. That gives some weight to the stability question when facing a doubtful customer.
But I realized that wasn't the point.
Before the InnoDB plugin was first announced, little was going on with InnoDB. There were concerns about the slow/nonexistent progress on this important storage engine, essentially the heart of MySQL. Then the plugin was announced, and everyone was happy.
The point being, since then I only saw (or was exposed to, at least) progress on the plugin. The way I understand it, the plugin is the main (and only?) focus of development. And this is the significant thing to consider: if you're keeping to "old InnoDB", fine - but it won't get you much farther; you're unlikely to see great performance improvements (will 5.4 make a change? An ongoing improvement to InnoDB?). It may eventually become stale.
Converting to InnoDB plugin means you're working with the technology at focus. It's being tested, benchmarked, forked, improved, talked about, explained. I find this to be a major motive.
So, long live InnoDB Plugin! (At least till next year, that is, when we may all find ourselves migrating to PBXT)
Despite these improvements, there will always be replication lag to fight with standard MySQL replication. A better solution is to use a reliable replication product such as dbShards, which offers synchronous replication and guarantees that transactions are not lost if the master database server fails.
Andy,
Could you please explain why synchronous replication would help when the slave finds it hard to keep up? Won't synchronous replication hold back write speed on the master instead?
Synchronous replication means that the commit does not complete until the transaction is replicated to the slave server (not necessarily replicated to the database on the slave server, just to memory or log file). This does reduce write speed to the master - typically we see a 10% reduction in throughput with dbShards - but it means that if the master fails then no transactions are lost and the client applications can simply failover to the slave.
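To make the distinction concrete, here is a toy model of the commit path described above. It is only a sketch of the general idea, not how dbShards is implemented; the `Master` and `Slave` classes and all their methods are made up for illustration. It shows that a synchronous commit protects against losing transactions, while applying them on the slave remains a separate step.

```python
class Slave:
    def __init__(self):
        self.log = []      # transactions received into memory or a log file
        self.applied = []  # transactions actually applied to the slave database

    def receive(self, txn):
        self.log.append(txn)

    def apply_pending(self):
        # Applying is a separate step and may run well behind receiving.
        self.applied.extend(self.log[len(self.applied):])


class Master:
    def __init__(self, slave, synchronous):
        self.slave = slave
        self.synchronous = synchronous
        self.committed = []
        self.unsent = []

    def commit(self, txn):
        if self.synchronous:
            # The commit does not complete until the slave has the
            # transaction in its log (this is what costs write throughput).
            self.slave.receive(txn)
        else:
            # Asynchronous: the commit returns at once, shipping happens later.
            self.unsent.append(txn)
        self.committed.append(txn)

    def ship(self):
        # Background shipping for the asynchronous case.
        for txn in self.unsent:
            self.slave.receive(txn)
        self.unsent = []


def lost_on_master_failure(master):
    """Transactions committed on the master that the slave never received."""
    return [t for t in master.committed if t not in master.slave.log]


if __name__ == "__main__":
    for mode in (True, False):
        master = Master(Slave(), synchronous=mode)
        for txn in ("t1", "t2", "t3"):
            master.commit(txn)
        print("synchronous" if mode else "asynchronous",
              "lost:", lost_on_master_failure(master),
              "applied on slave:", master.slave.applied)
```

In both modes the slave has applied nothing until `apply_pending` runs, which is why the question about lag on the slave is a separate one from the question about lost transactions.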
Shlomi,
Nice post. Sync replication guarantees that there is no lag, as all (or most, if using a quorum) servers must commit the transaction at the same time. The lag is eliminated by rate limiting at commit time.
I have not read much about dbShards but most of the libraries that support sync replication above the db server (in middleware or client libraries) impose limits on concurrency. Does dbShards allow: all transactions to run concurrently, transactions on different tables to run concurrently, group commit?
dbShards does allow concurrent transactions but guarantees that transactions are replicated in the order they are committed against the master database. We don't specifically do anything to support group commit, so I'm not sure that we do support that.
How does dbShards know the commit order without modifying InnoDB or serializing them?
That was the hard part in building the product. It's actually part of our patent-pending technology, so unfortunately I can't go into the details of how we implemented it, but we do not serialize the commits (we tried that approach early on and performance was terrible).
@Mark - thanks
@Andy - to fight replication lag you would need more than slowing down the master. Does your solution make for higher concurrency on the slave? You say you do not serialize the commits. Does it follow that you are able to pass them on synchronously to the slave with original concurrency?
Andy,
OK, and I will add dbShards to the list of topics I won't discuss. I don't want to interfere with your business, but I also don't want to waste my time doing marketing for it.
@Mark - What I can say is that we co-ordinate with the database transaction to ensure that we record the order that the commits happen and then make sure we apply the transactions in the same order on the slave. @Shlomi - Yes, we do have higher concurrency on the slave when using dbShards. Transactions are effectively written to the master and slave machines in parallel with the same concurrency (although it should be noted that we do not write to the slave DB at this time, just to slave memory or log). We know that the dbShards log reliable…
"I only saw (or was exposed to, at least) progress on the plugin. The way I understand it, the plugin is the main (and only?) focus of development. And this is the significant thing to consider: if you’re keeping to “old InnoDB”, fine – but it won’t get you much farther; you’re unlikely to see great performance improvements"
Hi, the builtin InnoDB in MySQL 5.1 and the whole MySQL 5.1 is GA and frozen for new features and big/risky changes. This is common sense - you don't risk the stability of the stable branch with new features. Of course…
Hi Vasil, Thanks. I'm not referring to the fact that InnoDB in 5.1 & 5.0 is now frozen to new features. I'm not referring to bug fixes, either. I'm wondering if any further development is going to take place on builtin InnoDB. My impression is that all further development, performance improvements etc. goes to InnoDB plugin, and I suspect that the Plugin will become the main InnoDB distribution from now on. For example: will the builtin InnoDB code get the group-commit fix? Is it planned to get this fix in future versions? My impression is that it will not.…
Hi,
You can't expect new features or significant performance improvements in a GA branch. This is what GA is by definition. Hot features and stability are adverse.
Vasil,
When the server is on a 3 to 4 year release cycle and the current release does not scale on commodity HW, then you should make these changes to the GA release or users will go elsewhere (PostgreSQL, XtraDB). The InnoDB plugin has made 5.1 much more attractive as an upgrade target.
Mark,
I agree with you 🙂
Synchronous replication isn't the only way to go... sure, it's bearable on small clusters, but when you have tens of slaves, it gets painful.
Here at GenieDB, we've done some tricks with asynchronous replication combined with synchronously updating a chosen-per-record-with-a-hash-of-the-PK 'consistency buffer' server that stores the record in-memory for long enough for the asynch replication to happen, thereby getting around the problem with a small latency increase (the consistency buffer is highly optimised for low latency; we use memcache!) that doesn't grow as the number of replicas does.
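A rough sketch of the idea as described above, using plain dictionaries in place of real servers. This is a guess at the general shape, not GenieDB's actual code: the class name, the CRC32 hash, the read path (buffer first, then replica) and evicting a record once replication has delivered it (rather than after a time window) are all simplifying assumptions.

```python
import zlib


class ConsistencyBufferCluster:
    def __init__(self, n_buffers, n_replicas):
        self.buffers = [{} for _ in range(n_buffers)]    # in-memory buffer servers
        self.replicas = [{} for _ in range(n_replicas)]  # asynchronously updated copies
        self.pending = []                                # writes not yet replicated

    def _buffer_for(self, pk):
        # The buffer server is chosen per record by hashing the primary key.
        return self.buffers[zlib.crc32(str(pk).encode()) % len(self.buffers)]

    def write(self, pk, record):
        # One synchronous write to a single buffer server, so the cost
        # does not grow with the number of replicas.
        self._buffer_for(pk)[pk] = record
        self.pending.append((pk, record))

    def replicate(self):
        # Stand-in for the asynchronous replication catching up.
        for pk, record in self.pending:
            for replica in self.replicas:
                replica[pk] = record
            buf = self._buffer_for(pk)
            if buf.get(pk) == record:
                del buf[pk]  # replicas now have it; the buffer copy can go
        self.pending = []

    def read(self, pk, replica_index):
        # Assumed read path: a fresh write is served from the buffer,
        # anything older from whichever replica the client talks to.
        buf = self._buffer_for(pk)
        if pk in buf:
            return buf[pk]
        return self.replicas[replica_index].get(pk)


if __name__ == "__main__":
    cluster = ConsistencyBufferCluster(n_buffers=2, n_replicas=3)
    cluster.write(42, "v1")
    print(cluster.read(42, replica_index=0))  # served from the buffer
    cluster.replicate()
    print(cluster.read(42, replica_index=2))  # served from a replica
```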