CHAR and spaces

I know about it, I knew about it all along, but… it’s so easy to fall for it; there’s just so much absurdity!

A CHAR type has a known number of characters. For example, the column:

CountryCode CHAR(3) CHARSET ascii NOT NULL

– is known to have exactly three characters. These could be ‘USA’, ‘FRA’, etc.

What happens with spaces? A SQL CHAR type ignores any trailing spaces; thus, the code ‘GB ‘ (the characters ‘G’, ‘B’, and the space ‘ ‘) is interpreted as ‘GB’. Trailing spaces are not regarded as part of the text. Want to see some absurdity?

CREATE TABLE `c_test` (
  `c` char(1) DEFAULT NULL
);
INSERT INTO c_test VALUES ('a');
INSERT INTO c_test VALUES ('b');
INSERT INTO c_test VALUES (' ');

SELECT c, LENGTH(c), c = 'a', c = 'b', c = ' ', LENGTH(' ') FROM c_test;
+------+-----------+---------+---------+---------+-------------+
| c    | LENGTH(c) | c = 'a' | c = 'b' | c = ' ' | LENGTH(' ') |
+------+-----------+---------+---------+---------+-------------+
| a    |         1 |       1 |       0 |       0 |           1 |
| b    |         1 |       0 |       1 |       0 |           1 |
|      |         0 |       0 |       0 |       1 |           1 |
+------+-----------+---------+---------+---------+-------------+

Note that LENGTH(c) is zero on third row, but nevertheless it equals the space character (c = ‘ ‘). Although, the length of the space character is… 1 (and this is because the constant ‘ ‘ is interpreted as VARCHAR).

To emphasize the absurdity, now take a closer look:

SELECT c, LENGTH(c), c = ' ', LENGTH(' '), c = '', LENGTH('') FROM c_test;
+------+-----------+---------+-------------+--------+------------+
| c    | LENGTH(c) | c = ' ' | LENGTH(' ') | c = '' | LENGTH('') |
+------+-----------+---------+-------------+--------+------------+
| a    |         1 |       0 |           1 |      0 |          0 |
| b    |         1 |       0 |           1 |      0 |          0 |
|      |         0 |       1 |           1 |      1 |          0 |
+------+-----------+---------+-------------+--------+------------+

I add a comparison with the empty string , and it turns out the space character equals both a space character and the empty string, though their lengths do not agree.

Now, I usually don’t bother with CHAR types, and prefer the VARCHAR or TEXT types, but every once in a while I do use it. Most recently, in a stored routine code in common_schema, I iterate & parse some text. I iterated the text char by char, and declared a local CHAR(1) variable to assign iterated value to. It sure bit me when I got to a space character which insisted it was the empty text (indicating end of iteration).

Of course the immediate and correct solution was to change the local variable’s type to VARCHAR(1) instead. But the time it consumed to find the problem… Be warned and be aware!

12 thoughts on “CHAR and spaces

  1. What is absurd here is that “SELECT c = ”;” returns “1”, right? (in other words: empty string == SPACE-character). This I consider a plain bug. What say?

    “SELECT LENGTH(c)” will simply return the column width with this SQL_mode, I think, as “When CHAR values are stored, they are right-padded with spaces to the specified length.” (quote from http://dev.mysql.com/doc/refman/5.5/en/char.html)

  2. “I add a comparison with the empty string ”, and it turns out the space character equals both a space character and the empty string, though their lengths do not agree.”

    I think this might be blank-padding semantics, where the SQL standard requires padding so that when compared the values will have the same length.

    Test with, let’say PostgreSQL, it should also have this behavior.

  3. @Rick,

    Well, only for truly fixed length fields which do not end with spaces, that is.
    I had a perfectly true fixed length field which was spaces, those spaces being ignored…

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.