CHAR and spaces

I know about it, I knew about it all along, but… it’s so easy to fall for it; there’s just so much absurdity!

A CHAR type has a known number of characters. For example, the column:

CountryCode CHAR(3) CHARSET ascii NOT NULL

– is known to have exactly three characters. These could be ‘USA’, ‘FRA’, etc.

What happens with spaces? A SQL CHAR type ignores any trailing spaces; thus, the code ‘GB ‘ (the characters ‘G’, ‘B’, and the space ‘ ‘) is interpreted as ‘GB’. Trailing spaces are not regarded as part of the text. Want to see some absurdity?

CREATE TABLE `c_test` (
  `c` char(1) DEFAULT NULL
);
INSERT INTO c_test VALUES ('a');
INSERT INTO c_test VALUES ('b');
INSERT INTO c_test VALUES (' ');

SELECT c, LENGTH(c), c = 'a', c = 'b', c = ' ', LENGTH(' ') FROM c_test;
+------+-----------+---------+---------+---------+-------------+
| c    | LENGTH(c) | c = 'a' | c = 'b' | c = ' ' | LENGTH(' ') |
+------+-----------+---------+---------+---------+-------------+
| a    |         1 |       1 |       0 |       0 |           1 |
| b    |         1 |       0 |       1 |       0 |           1 |
|      |         0 |       0 |       0 |       1 |           1 |
+------+-----------+---------+---------+---------+-------------+

Note that LENGTH(c) is zero on third row, but nevertheless it equals the space character (c = ‘ ‘). Although, the length of the space character is… 1 (and this is because the constant ‘ ‘ is interpreted as VARCHAR).

To emphasize the absurdity, now take a closer look:

SELECT c, LENGTH(c), c = ' ', LENGTH(' '), c = '', LENGTH('') FROM c_test;
+------+-----------+---------+-------------+--------+------------+
| c    | LENGTH(c) | c = ' ' | LENGTH(' ') | c = '' | LENGTH('') |
+------+-----------+---------+-------------+--------+------------+
| a    |         1 |       0 |           1 |      0 |          0 |
| b    |         1 |       0 |           1 |      0 |          0 |
|      |         0 |       1 |           1 |      1 |          0 |
+------+-----------+---------+-------------+--------+------------+

I add a comparison with the empty string , and it turns out the space character equals both a space character and the empty string, though their lengths do not agree.

Now, I usually don’t bother with CHAR types, and prefer the VARCHAR or TEXT types, but every once in a while I do use it. Most recently, in a stored routine code in common_schema, I iterate & parse some text. I iterated the text char by char, and declared a local CHAR(1) variable to assign iterated value to. It sure bit me when I got to a space character which insisted it was the empty text (indicating end of iteration).

Of course the immediate and correct solution was to change the local variable’s type to VARCHAR(1) instead. But the time it consumed to find the problem… Be warned and be aware!

12 thoughts on “CHAR and spaces

  1. The behaviour for PostgreSQL 8.4 seems identical.

    chartest=# CREATE TABLE c_test (c char(1) DEFAULT NULL);
    CREATE TABLE
    chartest=# INSERT INTO c_test VALUES (‘a’);
    INSERT 0 1
    chartest=# INSERT INTO c_test VALUES (‘b’);
    INSERT 0 1
    chartest=# INSERT INTO c_test VALUES (‘ ‘);
    INSERT 0 1
    chartest=# SELECT c, LENGTH(c), c = ‘a’, c = ‘b’, c = ‘ ‘, LENGTH(‘ ‘) FROM c_test;
    c | length | ?column? | ?column? | ?column? | length
    —+——–+———-+———-+———-+——–
    a | 1 | t | f | f | 1
    b | 1 | f | t | f | 1
    | 0 | f | f | t | 1
    (3 rows)

    chartest=# SELECT c, LENGTH(c), c = ‘ ‘, LENGTH(‘ ‘), c = ”, LENGTH(”) FROM c_test;
    c | length | ?column? | length | ?column? | length
    —+——–+———-+——–+———-+——–
    a | 1 | f | 1 | f | 0
    b | 1 | f | 1 | f | 0
    | 0 | t | 1 | t | 0
    (3 rows)

  2. And now with pre tags:

    chartest=# CREATE TABLE c_test (c char(1) DEFAULT NULL);
    CREATE TABLE
    chartest=# INSERT INTO c_test VALUES (‘a’);
    INSERT 0 1
    chartest=# INSERT INTO c_test VALUES (‘b’);
    INSERT 0 1
    chartest=# INSERT INTO c_test VALUES (‘ ‘);
    INSERT 0 1
    chartest=# SELECT c, LENGTH(c), c = ‘a’, c = ‘b’, c = ‘ ‘, LENGTH(‘ ‘) FROM c_test;
    c | length | ?column? | ?column? | ?column? | length
    —+——–+———-+———-+———-+——–
    a | 1 | t | f | f | 1
    b | 1 | f | t | f | 1
    | 0 | f | f | t | 1
    (3 rows)

    chartest=# SELECT c, LENGTH(c), c = ‘ ‘, LENGTH(‘ ‘), c = ”, LENGTH(”) FROM c_test;
    c | length | ?column? | length | ?column? | length
    —+——–+———-+——–+———-+——–
    a | 1 | f | 1 | f | 0
    b | 1 | f | 1 | f | 0
    | 0 | t | 1 | t | 0
    (3 rows)

    chartest=# select version();
    version
    —————————————————————————————————————–
    PostgreSQL 8.4.9 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.4.5 20110214 (Red Hat 4.4.5-6), 64-bit
    (1 row)

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.