I find myself converting more and more customers’ databases to the InnoDB plugin. In one case it was a last resort: disk space was running out, and the plugin’s compression freed 75% of it; in another, a slow disk caused I/O bottlenecks, and the plugin’s performance improvements and compression alleviated the problem; in yet another, I used both of the above to fight replication lag on a stubborn slave.
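For reference, a minimal sketch in Python of what such a conversion looks like (the table name, database, credentials and KEY_BLOCK_SIZE are all assumptions here; any MySQL client library would do):

```python
# Minimal sketch: rebuild a table in the InnoDB plugin's compressed row
# format. Table name, database and credentials are hypothetical.
import mysql.connector  # any MySQL client library would do

conn = mysql.connector.connect(host="localhost", user="root",
                               password="secret", database="mydb")
cur = conn.cursor()

# The plugin requires both of these before compressed tables can be built.
cur.execute("SET GLOBAL innodb_file_format = 'Barracuda'")
cur.execute("SET GLOBAL innodb_file_per_table = 1")

# KEY_BLOCK_SIZE=8 compresses the default 16KB pages down to 8KB.
cur.execute("ALTER TABLE my_table ENGINE=InnoDB "
            "ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8")

cur.close()
conn.close()
```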
In all those cases, I needed to justify the move to “new technology”. The questions “Is it GA? Is it stable?” come up a lot. Well, just a few days ago the MySQL 5.1 distribution started shipping with InnoDB plugin 1.0.4. That lends some weight to the stability argument when facing a doubtful customer.
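Incidentally, a quick way to check which InnoDB a server actually runs: the plugin exposes an innodb_version variable, while the built-in InnoDB does not. A throwaway sketch (connection details assumed):

```python
# Quick check: the InnoDB plugin exposes innodb_version; the built-in
# InnoDB does not. Connection details are hypothetical.
import mysql.connector

conn = mysql.connector.connect(host="localhost", user="root", password="secret")
cur = conn.cursor()
cur.execute("SHOW VARIABLES LIKE 'innodb_version'")
print(cur.fetchone())  # e.g. ('innodb_version', '1.0.4') when the plugin is loaded
conn.close()
```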
But I realized that wasn’t the point.
Before the InnoDB plugin was first announced, little was going on with InnoDB. There were concerns about the slow (or nonexistent) progress on this important storage engine, essentially the heart of MySQL. Then the plugin was announced, and everyone was happy.
The point is that, since then, the only progress I’ve seen (or been exposed to, at least) has been on the plugin. The way I understand it, the plugin is the main (and only?) focus of development. And this is the significant thing to consider: if you’re keeping to the “old InnoDB”, fine – but it won’t get you much farther; you’re unlikely to see great performance improvements (will 5.4 make a difference? Will the built-in InnoDB see ongoing improvement?). It may eventually become stale.
Converting to the InnoDB plugin means you’re working with the technology in focus. It’s being tested, benchmarked, forked, improved, talked about, explained. I find this to be a major motivation.
So, long live InnoDB Plugin! (At least till next year, that is, when we may all find ourselves migrating to PBXT)
How does dbShards know the commit order without modifying InnoDB or serializing the commits?
That was the hard part in building the product. It’s actually part of our patent-pending technology, so unfortunately I can’t go into the details of how we implemented it, but we do not serialize the commits (we tried that approach early on and performance was terrible).
@Mark – thanks
@Andy – to fight replication lag you would need more than slowing down the master. Does your solution make for higher concurrency on the slave? You say you do not serialize the commits. Does it follow that you are able to pass them on synchronously to the slave with the original concurrency?
Andy,
OK, and I will add dbShards to the list of topics I won’t discuss. I don’t want to interfere with your business, but I also don’t want to waste my time doing marketing for it.
@Mark – What I can say is that we coordinate with the database transaction to record the order in which the commits happen, and then make sure we apply the transactions in the same order on the slave.
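To make the general idea concrete (this is a generic illustration only, not dbShards’ patent-pending mechanism, which is not public): take a sequence number as part of the commit coordination, ship it with the transaction, and have the slave applier replay strictly in sequence order. A minimal Python sketch:

```python
# Generic illustration only; dbShards' actual mechanism is patent-pending
# and not public. The idea: record a sequence number at commit time, ship
# it with the transaction, and replay on the slave in sequence order.
import threading
import queue

_seq_lock = threading.Lock()
_next_seq = 0
shipped_log = queue.Queue()  # stands in for the log shipped to the slave

def record_commit(txn_statements):
    """Called as part of each commit; only this tiny section is serialized,
    the commits themselves still run concurrently."""
    global _next_seq
    with _seq_lock:
        _next_seq += 1
        seq = _next_seq
        shipped_log.put((seq, txn_statements))
    return seq

def apply_in_order():
    """Slave-side applier: replay transactions strictly in commit order,
    buffering any that arrive ahead of their turn."""
    expected, pending = 1, {}
    while True:
        seq, stmts = shipped_log.get()
        pending[seq] = stmts
        while expected in pending:
            for stmt in pending.pop(expected):
                pass  # execute stmt against the slave database here
            expected += 1
```

Note that in this sketch only the sequence-number assignment is a brief critical section; the commits themselves still proceed concurrently, consistent with the statement above that commits are not serialized.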
@Shlomi – Yes, we do have higher concurrency on the slave when using dbShards. Transactions are effectively written to the master and slave machines in parallel with the same concurrency (although it should be noted that we do not write to the slave DB at this time, just to slave memory or log). Writing to the dbShards reliable-replication log is much faster than writing to the master DB, which is how we keep the effect on performance so minimal. The main evil of replication lag is what happens if the master fails before transactions have been communicated to the slave. With our approach, we do not lose any transactions if this happens.
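Mechanically, the described scheme could look something like the following (a hedged sketch of the idea as stated above; the master and slave_log interfaces are made up, and none of this is dbShards’ actual code):

```python
# Hedged sketch of the idea described above, not dbShards' code. The
# `master` and `slave_log` objects are made-up interfaces.
from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=32)  # one task per in-flight commit

def commit_with_replication(master, slave_log, txn):
    # Ship to the slave's memory/log in parallel with the master commit.
    ship = pool.submit(slave_log.append, txn)
    master.commit(txn)   # the real DB write happens only on the master
    ship.result()        # wait for the slave's ack before reporting success
    # Both machines now hold the transaction: a master crash loses nothing.
```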