CHAR and spaces

I know about it, I knew about it all along, but… it’s so easy to fall for it; there’s just so much absurdity!

A CHAR type has a known number of characters. For example, the column:

CountryCode CHAR(3) CHARSET ascii NOT NULL

– is known to have exactly three characters. These could be ‘USA’, ‘FRA’, etc.

What happens with spaces? A SQL CHAR type ignores any trailing spaces; thus, the code ‘GB ‘ (the characters ‘G’, ‘B’, and the space ‘ ‘) is interpreted as ‘GB’. Trailing spaces are not regarded as part of the text. Want to see some absurdity?

CREATE TABLE `c_test` (
  `c` char(1) DEFAULT NULL
);
INSERT INTO c_test VALUES ('a');
INSERT INTO c_test VALUES ('b');
INSERT INTO c_test VALUES (' ');

SELECT c, LENGTH(c), c = 'a', c = 'b', c = ' ', LENGTH(' ') FROM c_test;
+------+-----------+---------+---------+---------+-------------+
| c    | LENGTH(c) | c = 'a' | c = 'b' | c = ' ' | LENGTH(' ') |
+------+-----------+---------+---------+---------+-------------+
| a    |         1 |       1 |       0 |       0 |           1 |
| b    |         1 |       0 |       1 |       0 |           1 |
|      |         0 |       0 |       0 |       1 |           1 |
+------+-----------+---------+---------+---------+-------------+

Note that LENGTH(c) is zero on third row, but nevertheless it equals the space character (c = ‘ ‘). Although, the length of the space character is… 1 (and this is because the constant ‘ ‘ is interpreted as VARCHAR).

To emphasize the absurdity, now take a closer look:

SELECT c, LENGTH(c), c = ' ', LENGTH(' '), c = '', LENGTH('') FROM c_test;
+------+-----------+---------+-------------+--------+------------+
| c    | LENGTH(c) | c = ' ' | LENGTH(' ') | c = '' | LENGTH('') |
+------+-----------+---------+-------------+--------+------------+
| a    |         1 |       0 |           1 |      0 |          0 |
| b    |         1 |       0 |           1 |      0 |          0 |
|      |         0 |       1 |           1 |      1 |          0 |
+------+-----------+---------+-------------+--------+------------+

I add a comparison with the empty string , and it turns out the space character equals both a space character and the empty string, though their lengths do not agree.

Now, I usually don’t bother with CHAR types, and prefer the VARCHAR or TEXT types, but every once in a while I do use it. Most recently, in a stored routine code in common_schema, I iterate & parse some text. I iterated the text char by char, and declared a local CHAR(1) variable to assign iterated value to. It sure bit me when I got to a space character which insisted it was the empty text (indicating end of iteration).

Of course the immediate and correct solution was to change the local variable’s type to VARCHAR(1) instead. But the time it consumed to find the problem… Be warned and be aware!

12 thoughts on “CHAR and spaces

  1. Roland,
    You were aware enough to use it in _wrap_select_list_columns:
    “…
    declare v_whitespace varchar(1) default ”;
    …”

  2. @Peter,
    Even with sql_mode=’PAD_CHAR_TO_FULL_LENGTH’, I get:

    SELECT c, LENGTH(c), c = ' ', LENGTH(' '), c = '', LENGTH('') FROM c_test;
    +------+-----------+---------+-------------+--------+------------+
    | c    | LENGTH(c) | c = ' ' | LENGTH(' ') | c = '' | LENGTH('') |
    +------+-----------+---------+-------------+--------+------------+
    | a    |         1 |       0 |           1 |      0 |          0 |
    | b    |         1 |       0 |           1 |      0 |          0 |
    |      |         1 |       1 |           1 |      1 |          0 |
    +------+-----------+---------+-------------+--------+------------+
    

    Which is different than before, but still absurd.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.