Table refactoring & application version upgrades, Part I

A developer’s major concern is: How do I do application & database upgrades with minimal downtime? How do I synchronize between a DB’s version upgrade and an application’s version upgrade?

I will break down the discussion into types of database refactoring operations, and I will limit to single table refactoring. The discussion will try to understand the need for refactoring and will dictate the steps towards a successful upgrade.

Reader prerequisites

I will assume MySQL to be the underlying database. To take a major component out of the equation: we may need to deal with very large tables, for which an ALTER command may take long hours. I will assume familiarity with Master-Master (Active-Passive) replication, with possible use of MMM for MySQL. When I describe “Failover from M1 to M2“, I mean “Make the ALTER changes on M2 (passive), then switch your application from M1 to M2 (change of IPs, VIP, etc.), promoting M2 to active position, then apply same changes on M1 (now passive) or completely rebuild it”.

Phew, a one sentence description of M-M usage…

I also assume the reader’s understanding that a table’s schema can be different on master & slave, which is the basis for the “use replication for refactoring” trick. But it cannot be too different, or, to be precise, the two schemata must both support the ongoing queries for the table.

A full discussion of the above is beyond the scope of this post.

Types of refactoring needs

As I limit this discussion to single table refactoring,we can look at major refactoring operations and their impact on application & upgrades. We will discuss ADD/DROP COLUMN, ADD/DROP INDEX, ADD/DROP UNIQUE INDEX, ADD/DROP FOREIGN KEY, ADD/DROP TABLE.

We will assume the database and application are both in Version #1 (V1), and need to be upgraded to V2 or greater. Continue reading » “Table refactoring & application version upgrades, Part I”

SQL: good comments conventions

I happened upon a customer who left me in awe and admiration. The reason: excellent comments for their SQL code.

I list four major places where SQL comments are helpful. I’ll use the sakila database. It is originally scarcely commented; I’ll present it now enhanced with comments, to illustrate.

Table definitions

The CREATE TABLE statement allows for a comment, intended to describe the nature of the table:

CREATE TABLE `film_text` (
 `film_id` smallint(6) NOT NULL,
 `title` varchar(255) NOT NULL,
 `description` text,
 PRIMARY KEY (`film_id`),
 FULLTEXT KEY `idx_title_description` (`title`,`description`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COMMENT='Reflection of `film`, used for FULLTEXT search.'

It’s too bad the comment’s max length is 60 characters, though. However, it’s a very powerful field.

Column definitions

One may comment particular columns: Continue reading » “SQL: good comments conventions”

Database schema under version control

How many organization use version control for development? Probably almost every single one.

How many store the database schema under version control? Alas, not as many.

Coupling one’s application with table schema is essential. Organization who actively support multiple versions of the product understand that well. Smaller organizations not always have this realization.

How is it done?

Ideally

Ideally one would have:

  • The schema, generated by hand
  • Essential data (INSERT INTO statements for those lookup tables without which you cannot have an application)
  • Sample data: sufficient real-life data on which to act. This would include customers data, logs, etc.

If you can work this way, then creating a staging environment consists of re-creating the entire schema from out schema code, and filling in the essential & sample data.

The thing with this method is that one does not (usually?) apply it on one’s live system. Say we were to add a column. On our live servers we would issue an ALTER TABLE t ADD COLUMN.

But this means we’re using different methods on our staging server and on our production server.

Incremental

Another kind of solution would be to hold:

  • The static schema, as before
  • Essential data, as before
  • Sample data, as before
  • A migration script, which is the concatenation of all ALTERs, CREATEs etc. as of the static schema.

Once in a while one can do a “reset”, and update the static schema with the existing design.

As you go along

This solution simply means “we apply the changes on staging; test + version them; then apply on production”.

A side effect of this solution is that the database generates the schema, but the schema does not generate the database as in previous cases. This makes for an uglier solution, where you first apply the changes to the database, and then, based on what the database report, enter data into the version control.

How to do that? Easiest would be to use mysqldump –routines –no-data. Some further parsing should be done to strip out the AUTO_INCREMENT values, which tend to change, as well as the surrounding variables settings (strip out the character set settings etc.).

Summary

However you do it, make sure you have some kind of version control on your schema. It pays off just as with doing version control for your code. You get to compare, understand the incremental changes, understand the change in design, etc.