TIMESTAMP vs. DATETIME, which should I be using?

They both look the same, have roughly the same use. Nevertheless, they differ in many respects. I’ll note the difference, and note a few pitfalls and peculiarities.

Range

TIMESTAMP starts with the epoch, ‘1970-01-01 00:00:01’ UTC and ends with ‘2038-01-19 03:14:07’ UTC. This is all very nice today, and may actually hold up till our retirement, but the younger readers may yet have to deal with the Bug2K+38™, certain to arrive.

DATETIME starts with ‘1000-01-01 00:00:00’ and lasts up to ‘9999-12-31 23:59:59’. More on that later.

In respect of range, your current event logs may well use a TIMESTAMP value, while your grandfather’s and granddaughter’s birth dates may require DATETIME.

In general I would suggest that anything that relates to now, can be a TIMESTAMP. A new entry is added? It is added now. It can be represented by a TIMESTAMP. Anything has an expiry time of a few minutes, perhaps a few days or a month? You’ll be safe using it up till late 2037. Anything else had better use a DATETIME. In particular, dates relating to issues such as birth, insurance, the market etc. fall into this category.

History, however, does not even hold up with DATETIME. Rome fell long before MIN(DATETIME). You will have to manage your own. Not even YEAR will help you out.

Storage

TIMESTAMP makes for 4 bytes. DATETIME makes for 8 bytes. Now that we have this behind us, let’s see why.

Internal representation

A TIMESTAMP is merely the number of elapsed seconds since the epoch. It is a number. Not only is this a number, it is an incrementing number, and without gaps. Ever ascending. It actually equals the UNIX_TIMESTAMP() function.

A DATETIME is more of an oddity. From the manual:

  • DATETIME: Eight bytes:
    • A four-byte integer packed as YYYY×10000 + MM×100 + DD
    • A four-byte integer packed as HH×10000 + MM×100 + SS

Huh?

There’s nothing sequential about a DATETIME. The value which follows 20110307095159 is 20110307095200. It’s like the string representation without any delimiters. See the following (ignore the fraction, it’s not really there):

root@mysql-5.1.51> SELECT NOW()+0;
+-----------------------+
| NOW()+0               |
+-----------------------+
| 20110307095238.000000 |
+-----------------------+

The implication of this internal representation is that conversions must be made back and forth. When you want to add 5 seconds to a DATETIME, MySQL cannot simply add 5 to some number, but must make for a more complicated calculation.

Peculiarities

Now here’s a question: why does DATETIME start with ‘1000-01-01 00:00:00’? Can you provide with a convincing argument? The only argument I can find is hardly convincing.

Did you ever need to type in a password with your phone? Say, for your Credit Card company, or your Bank? Did you get recorded instructions saying “You must type six digits. You must not repeat the same digits three successive times“? That’s somewhat nice, and probably a good advice. Did you ever get told “The first digit must not be 0“? That’s just stupid. This means we’re giving up on 100,000 valid passwords, thereby reducing our search space by 10%!

I suspect in both cases the reasoning is the same: if you start with a 0, and we represent it as a number, it won’t make up for the same number of digits we intended it to.

In the case of your Credit Card company, there is no excuse. What’s the problem with padding with zeros till you make those six digits?

In the case of MySQL’s DATETIME, I can see a weak reasoning. Read this page on the docs to find out that a DATETIME can be represented in many forms, and can be automatically deduced from a string in many formats. Both strings ‘110307095100’ and ‘20110307095100’ make for the same DATETIME value. It is based on the number of characters in your text, that the parsing decides how to act.

The fact the first year is 1000 makes the number of digits predictable.

My thoughts?

I don’t see why I should care about automatically converting texts of different formats to a DATETIME. As a programmer, I’m perfectly content with strict typing. I don’t mind passing around only texts of the form ‘YYYY-MM-DD HH:MM:SS’ (or, better yet, passing date objects and letting my connector do the translation). Moreover, I prefer it that way! It makes me feel safer, that I haven’t passed a wrong text by mistake, to be silently accepted.

For this reason I don’t like the idea of losing the ability to use DATETIME on 1,000 lost years. As far as I can tell, this is a MySQL specific issue; there is no ANSI SQL for DATETIME.

[UPDATE:  Sheeri notes the biggest difference: DATETIME does not support time zones. A good discussion is also available on OurSQL]

 

9 thoughts on “TIMESTAMP vs. DATETIME, which should I be using?

  1. @Shlomi,
    Usually I’m not taking the unix time stamp from my code, but using thd DB function that fetch its (the DB servers) time.

  2. There are other major differences between TIMESTAMP and DATETIME,

    1. TIMESTAMP columns can be told to be updated on INSERTs and UPDATEs, without providing a value for these columns, that is particularly useful in logging.
    2. TIMESTAMP columns are converted back and forth between UTC and the session timezones, when the value is inserted/updated in the TIMESTAMP column its converted to UTC, and when the value is retrieved its converted to your session timezone

  3. So “which-should-i-be-using” that part is not addressed or did I miss it?

    Moreover, I find the difference to be more vague since, as of MySQL 5.6.5, both TIMESTAMP and DATETIME columns can be automatically initialized and updated to the current date and time.

    Also, normally when people talk about timezone being changed, are they referring to timezone of server itself (why would that happen?) or its replica located in different timezone or are they talking about end-user sitting in some other timezone and viewing the data (how should that matter as MySQL is not considering that while storing/retrieval)?

    I did a little research and it seems that best practice would be to set all your severs in UTC timezone and use TIMESTAMP datatype for dates. Of course, year 2038 limit of TIMESTAMP is a concern but hopefully we will fix it well in time 🙂

    What are your thoughts?

  4. > I did a little research and it seems that best practice would be to set all your severs in UTC timezone and use TIMESTAMP datatype for dates.

    That’s a good rule to work with.

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.