SQL: forcing single row tables integrity

Single row tables are used in various cases. Such tables can be used for “preferences” or “settings”; for managing counters (e.g. summary tables), for general-purpose administration tasks (e.g. heartbeat table) etc.

The problem with single row tables is that, well, they must have s single row. And the question is: how can you force them to have just one row?

The half-baked solution

The common solution is to create a PRIMARY KEY and always use the same value for that key. In addition, using REPLACE or INSERT INTO ON DUPLICATE KEY UPDATE helps out in updating the row. For example:

CREATE TABLE heartbeat (
 id int NOT NULL PRIMARY KEY,
 ts datetime NOT NULL
 );

The above table definition is taken from mk-heartbeat. It should be noted that mk-heartbeat in itself does not require that the table has a single row, so it is not the target of this post. I’m taking the above table definition as a very simple example.

So, we assume we want this table to have a single row, for whatever reasons we have. We would usually do:

REPLACE INTO heartbeat (id, ts) VALUES (1, NOW());

or

INSERT INTO heartbeat (id, ts) VALUES (1, NOW()) ON DUPLICATE KEY UPDATE ts = NOW();

Why is the above a “half baked solution”? Because it is up to the application to make sure it reuses the same PRIMARY KEY value. There is nothing in the database to prevent the following: Continue reading » “SQL: forcing single row tables integrity”

Personal observation: more migrations from MyISAM to InnoDB

I’m evidencing an increase in the planning, confidence & execution for MyISAM to InnoDB migration.

How much can a single consultant observe? I agree Oracle should not go to PR based on my experience. But I find that:

  • More companies are now familiar with InnoDB than there used to.
  • More companies are interested in migration to InnoDB than there used to.
  • More companies feel such migration to be safe.
  • More companies start up with an InnoDB based solution than with a MyISAM based solution.

This is the way I see it. No doubt, the Oracle/Sun deal made its impact. The fact that InnoDB is no longer a 3rd party; the fact Oracle invests in InnoDB and no other engine (Falcon is down, no real development on MyISAM); the fact InnoDB is to be the default engine: all these put companies at ease with migration.

Continue reading » “Personal observation: more migrations from MyISAM to InnoDB”

Verifying GROUP_CONCAT limit without using variables

I have a case where I must know if group_concat_max_len is at its default value (1024), which means there are some operation I cannot work out. I’ve ranted on this here.

Normally, I would simply:

SELECT @@group_concat_max_len

However, I am using views, where session variables are not allowed. Using a stored function can do the trick, but I wanted to avoid stored routines. So here’s a very simple test case: is the current group_concat_max_len long enough or not? I’ll present the long version and the short version.

The long version

SELECT
  CHAR_LENGTH(
    GROUP_CONCAT(
      COLLATION_NAME SEPARATOR ''
    )
  )
FROM
  INFORMATION_SCHEMA.COLLATIONS;

If the result is 1024, we are in a bad shape. I happen to know that the total length of collation names is above 1800, and so it is trimmed down. Another variance of the above query would be: Continue reading » “Verifying GROUP_CONCAT limit without using variables”

Those oversized, undersized variables defaults

Some mysqld parameters are far from having reasonable defaults. Most notable are the engine-specific values, and in particular the InnoDB parameters.

Some of these variables have different defaults as of MySQL 5.4. innodb_buffer_pool_size, for example, is 128M on 5.4. innodb_log_file_size, however, has changed back and forth, as far as I understand, and is down to 5M again. These settings are still the same on 5.5.

I wish to present some not-so-obvious parameters which, in my opinion, have poor defaults, for reasons I will explain.

  • group_concat_max_len: This parameter limits the maximum text length of a GROUP_CONCAT concatenation result. It defaults to 1024. I think this is a very low value. I have been using GROUP_CONCAT more and more, recently, to solve otherwise difficult problems. And in most cases, 1024 was just too low, resulting in silent (Argh!) truncating of the result, thus returning incorrect results. It is interesting to learn that the maximum value for this parameter is limited by max_packet_size. I would suggest, then, that this parameter should be altogether removed, and have the max_packet_size limitation as the only limitation. Otherwise, I’d like it to have a very large default value, in the order of a few MB.
  • wait_timeout: Here’s a parameter whose default value is over permissive. wait_timeout enjoys an 8 hour default. I usually go for 5-10 minutes. I don’t see a point in letting idle connections waste resources for 8 hours. Applications which hold up such connections should be aware that they’re doing something wrong, in the form of a forced disconnection. Connection pools work beautifully with low settings, and can themselves do keepalives, if they choose to.
  • sql_mode: I’ve discussed this in length before. My opinion unchanged.
  • open_files_limit: What with the fact connections, threads, table descriptors, table file descriptors (depending on how you use InnoDB), temporary file tables — all are files on unix-like systems, and considering this is an inexpensive payment, I think open_files_limit should default to a few thousands. Why risk the crash of “too many open files”?

Continue reading » “Those oversized, undersized variables defaults”

mycheckpoint (rev. 132): custom monitoring, custom charts, process list dump

Revision 132 of mycheckpoint has been released. New and updated in this revision:

  • Custom monitoring: monitoring & charting for user defined queries
  • HTML reports for custom monitoring
  • Process list dump upon alert notifications

Custom monitoring & charts

Custom monitoring allows the user to supply with a query, the results of which will be monitored.

That is, mycheckpoint monitors the status variables, replication status, OS metrics. But it cannot by itself monitor one’s application. Which is why a user may supply with such query as:

SELECT COUNT(*) FROM shopping_cart WHERE is_pending=1

Such a query will tell an online store how many customers are in the midst of shopping. There is no argument that this number is worth monitoring for. Given the above query, mycheckpoint will execute it per sample, and store the query’s result along with all sampled data, to be then aggregated by complex views to answer for:

  • What was the value per given sample?
  • What is the value difference for each sample?
  • What is the change per second, i.e. the rate?

mycheckpoint goes one step forward, and explicity records another metric:

  • How much time did it take to take that sample?

Continue reading » “mycheckpoint (rev. 132): custom monitoring, custom charts, process list dump”

Choosing MySQL boolean data types

How do you implement True/False columns?

There are many ways to do it, each with its own pros and cons.

ENUM

Create you column as ENUM(‘F’, ‘T’), or ENUM(‘N’,’Y’) or ENUM(‘0’, ‘1’).

This is the method used in the mysql tables (e.g. mysql.user privileges table). It’s very simple and intuitive. It truly restricts the values to just two options, which serves well. It’s compact (just one byte).

A couple disadvantages to this method:

  1. Enums are represented by numerical values (which is good) and start with 1 instead of 0. This means ‘F’ is 1, and ‘T’ is 2, and they both translate to True when directly used in a booleanic expression (e.g. IF(val, ‘True’, ‘False’) always yields ‘True’)
  2. There’s no real convention. Is it ‘Y’/’N’? ‘T’/’F’? ‘P’/’N’? ‘1’/’0′?

CHAR(1)

Simple again. Proposed values are, as before, ‘F’, ‘T’ etc. This time there’s no way to limit the range of values. You cannot (in MySQL, unless using triggers) prevent an ‘X’.

Watch out for the charset! If it’s utf8 you pay with 3 bytes instead of just 1. And, again, ‘T’, ‘F’, ‘Y’, ‘N’ values all evaluate as True. It is possible to use the zero-valued character, but it defeats the purpose of using CHAR. Continue reading » “Choosing MySQL boolean data types”