Speaking at “August Penguin 2011”

I will be speaking at August Penguin 2011 (אוגוסט פינגווין), on August 12th in Ramat-Gan, Israel.

August Penguin is the annual meeting of Hamakor society: an Israeli society for Free Software and Open-Source Code (read more here).

I’ll be holding a non-technical talk about MySQL, titled “MySQL and the Open Source Sphere”. In this talk I will be presenting my impressions of the nature of open source development of MySQL and surroundings: the core server, the various forks, patches, 3rd party tools, companies involved, etc. So this is a general “get to know who’s who & what’s what in the MySQL world”.

This is a 30 minutes talk. I will surely not cover every open source development in this field, so my apologies in advance to those I leave out. The truth is, there’s so much going on lately I can hardly keep up with reading the announcements. Truly, this is a wonderful time for open source development with MySQL.

Talk will be held in Hebrew.

See you there!

Announcing common_schema: common views & routines for MySQL

Today I have released common_schema, a utility schema for MySQL which includes many views and functions, and is aimed to be installed on any MySQL server.

What does it do?

There are views answering for all sorts of useful information: stuff related to schema analysis, data dimensions, monitoring, processes & transactions, security, internals… There are basic functions answering for common needs.

Some of the views/routines simply formalize those queries we tend to write over and over again. Others take the place of external tools, answering complex questions via SQL and metadata. Still others help out with SQL generation.

Here are a few highlights:

There’s more. Take a look at the common_schema documentation for full listing. And it’s evolving: I’ve got quite a few ideas already for future components.

Some of these views rely on heavyweight INFORMATION_SCHEMA tables. You should be aware of the impact and risks.

What do I need to install?

There’s no script or executable file. It’s just a schema. The distribution in an SQL file which generates common_schema. Much like a dump file.

Continue reading » “Announcing common_schema: common views & routines for MySQL”

Thoughts and ideas for Online Schema Change

Here’s a few thoughts on current status and further possibilities for Facebook’s Online Schema Change (OSC) tool. I’ve had these thoughts for months now, pondering over improving oak-online-alter-table but haven’t got around to implement them nor even write them down. Better late than never.

The tool has some limitations. Some cannot be lifted, some could. Quoting from the announcement and looking at the code, I add a few comments. I conclude with a general opinion on the tool’s abilities.

“The original table must have PK. Otherwise an error is returned.”

This restriction could be lifted: it’s enough that the table has a UNIQUE KEY. My original oak-online-alter-table handled that particular case. As far as I see from their code, the Facebook code would work just as well with any unique key.

However, this restriction is of no real interest. As we’re mostly interested in InnoDB tables, and since any InnoDB table should have a PRIMARY KEY, we shouldn’t care too much.

“No foreign keys should exist. Otherwise an error is returned.”

Tricky stuff. With oak-online-alter-table, changes to the original table were immediately reflected in the ghost table. With InnoDB tables, that meant same transaction. And although I never got to update the text and code, there shouldn’t be a reason for not using child-side foreign keys (the child-side is the table on which the FK constraint is defined).

The Facebook patch works differently: it captures changes and writes them to a delta table,  to be later (asynchronously) analyzed and make for a replay of actions on the ghost table. Continue reading » “Thoughts and ideas for Online Schema Change”

Table refactoring & application version upgrades, Part II

Continuing Table refactoring & application version upgrades, Part I, we now discuss code & database upgrades which require DROP operations. As before, we break apart the upgrade process into sequential steps, each involving either the application or the database, but not both.

As I’ll show, DROP operations are significantly simpler than creation operations. Interestingly, it’s the same as in life.

DROP COLUMN

A column turns to be redundant, unused. Before it is dropped from the database, we must ensure no one is using it anymore. The steps are:

  1. App: V1 -> V2. Remove all references to column; make sure no queries use said column.
  2. DB: V1 -> V2 (possibly failover from M1 to M2), change is DROP COLUMN.

DROP INDEX

A possibly simpler case here. Why would you drop an index? Is it because you found out you never use it anymore? Then all you have to do is just drop it.

Or perhaps you don’t need the functionality the index supports anymore? Then first drop the functionality:

  1. (optional) App: V1 -> V2. Discard using functionality which relies on index.
  2. DB: V1 -> V2 (possibly failover from M1 to M2), change is DROP INDEX. Check out InnoDB Plugin here. Continue reading » “Table refactoring & application version upgrades, Part II”

Table refactoring & application version upgrades, Part I

A developer’s major concern is: How do I do application & database upgrades with minimal downtime? How do I synchronize between a DB’s version upgrade and an application’s version upgrade?

I will break down the discussion into types of database refactoring operations, and I will limit to single table refactoring. The discussion will try to understand the need for refactoring and will dictate the steps towards a successful upgrade.

Reader prerequisites

I will assume MySQL to be the underlying database. To take a major component out of the equation: we may need to deal with very large tables, for which an ALTER command may take long hours. I will assume familiarity with Master-Master (Active-Passive) replication, with possible use of MMM for MySQL. When I describe “Failover from M1 to M2“, I mean “Make the ALTER changes on M2 (passive), then switch your application from M1 to M2 (change of IPs, VIP, etc.), promoting M2 to active position, then apply same changes on M1 (now passive) or completely rebuild it”.

Phew, a one sentence description of M-M usage…

I also assume the reader’s understanding that a table’s schema can be different on master & slave, which is the basis for the “use replication for refactoring” trick. But it cannot be too different, or, to be precise, the two schemata must both support the ongoing queries for the table.

A full discussion of the above is beyond the scope of this post.

Types of refactoring needs

As I limit this discussion to single table refactoring,we can look at major refactoring operations and their impact on application & upgrades. We will discuss ADD/DROP COLUMN, ADD/DROP INDEX, ADD/DROP UNIQUE INDEX, ADD/DROP FOREIGN KEY, ADD/DROP TABLE.

We will assume the database and application are both in Version #1 (V1), and need to be upgraded to V2 or greater. Continue reading » “Table refactoring & application version upgrades, Part I”