Choosing MySQL boolean data types

June 3, 2010

How do you implement True/False columns?

There are many ways to do it, each with its own pros and cons.


Create you column as ENUM('F', 'T'), or ENUM('N','Y') or ENUM('0', '1').

This is the method used in the mysql tables (e.g. mysql.user privileges table). It's very simple and intuitive. It truly restricts the values to just two options, which serves well. It's compact (just one byte).

A couple disadvantages to this method:

  1. Enums are represented by numerical values (which is good) and start with 1 instead of 0. This means 'F' is 1, and 'T' is 2, and they both translate to True when directly used in a booleanic expression (e.g. IF(val, 'True', 'False') always yields 'True')
  2. There's no real convention. Is it 'Y'/'N'? 'T'/'F'? 'P'/'N'? '1'/'0'?


Simple again. Proposed values are, as before, 'F', 'T' etc. This time there's no way to limit the range of values. You cannot (in MySQL, unless using triggers) prevent an 'X'.

Watch out for the charset! If it's utf8 you pay with 3 bytes instead of just 1. And, again, 'T', 'F', 'Y', 'N' values all evaluate as True. It is possible to use the zero-valued character, but it defeats the purpose of using CHAR.


Many are unaware that it's even valid to make this definition. What does it mean? Take a look at the following table:

 `bval` char(0) DEFAULT NULL
mysql> INSERT INTO t1 VALUES ('');
mysql> INSERT INTO t1 VALUES ('');

mysql> SELECT * FROM t1;
| bval |
|      |
|      |
| NULL |

NULLable columns cause for an additional storage per row. There's one bit per NULLable column which notes down whether the column's value is NULL or not. If you only have one NULLable column, you must pay for this bit with 1 byte. If you have two NULLable columns, you still only pay with 1 byte.


mysql> SELECT bval IS NOT NULL FROM t1;
| bval IS NOT NULL |
|                1 |
|                1 |
|                0 |

So this plays somewhat nicely into booleanic expressions.

However, this method is unintuitive and confusing. I personally don't use it.


With integer values, we can get down to 0 and 1. With TINYINT, we only pay with 1 byte of storage. As with CHAR(1), we cannot prevent anyone from INSERTing other values. But that doesn't really matter, if we're willing to accept that 0 evaluates as False, and all other values as True. In this case, boolean expressions work very well with your column values.


These are just synonyms to TINYINT. I like to define my boolean values as such. Alas, when issuing a SHOW CREATE TABLE the definition is just a normal TINYINT. Still, it is clearer to look at if you're storing your table schema under your version control.

  • It may be interesting to note that if you need multiple T/F columns in a table you can pack them all in the BIT data type. This would only take approximately (M+7)/8 bytes where M is the number of bits (although you could probably do this if you bitmask a TINYINT too).

  • @Andrew,
    Cool. This is DBA-ing C-style. But makes for complicated code and impossible indexing. When considering such approach, I think the CHAR(0) provides with better (less worse?) design.

  • @Shlomi
    Ah yes, good points, I didn't consider indexing 🙂

    The other advantage of BIT data type is you can have multiple BIT(1) columns and if you are using NDB it will pack them into a 4 byte block automatically. I don't think other storage engines do this unfortunately.

  • @Andrew,
    Didn't know cluster does that (don't know much about cluster, for that matter...). That's cool.

  • Hmm... So what is a good use case where you even want to index a boolean column? Ok, so as part of a multi-column index it makes sense.

    Depending on the app, packing several T/F values into one BIT field may even be an advantage, if there is a case where you use *all* of the boolean values to select rows.

  • @Henrik,

    Yes, in case of multiple columns and uneven distribution. Actually, I find this kind of indexing pretty popular. e.g. key(is_unhandled, customer_id).
    When distribution is really uneven, even a single-column index on the boolean type can be useful (although my experience shows that you can usually extend to add other columns and benefit more queries using the index).

  • You can use (and we mostly do) type ENUM('','Y') so '' translates to False.

  • @Pavel: great!

  • It's also relevant to mention that creating a [VAR]CHAR(5) column to store 'True' or 'False' is not efficient due to extra bytes. I've seen this example before, and changing to ENUM('True','False') both addresses no code changes, and reduces the size to 1 byte.

  • @Ronald,
    Good point.

  • @Andrew,
    My bad: with InnoDB, you cannot create an index on CHAR(0) columns anyhow. So my comment #2 becomes irrelevant.

  • Janis

    So it is about to guessing when mysql will have simple boolean data type? It is rather funny to see how all of us make workarounds of design fault or bug in mysql.

  • Masoud

    Indexing a column that has just 2 different values, like true/false, is useless, and instead of improving read performance, slows down writes... Your data must be unique (like Primary Keys), or at least have less duplicates (like users first/last names) to use advantages of indexing.

  • @Masoud, please see comment #6

Powered by Wordpress and MySQL. Theme by