TIMESTAMP vs. DATETIME, which should I be using?

May 22, 2011

They both look the same, have roughly the same use. Nevertheless, they differ in many respects. I'll note the difference, and note a few pitfalls and peculiarities.

Range

TIMESTAMP starts with the epoch, '1970-01-01 00:00:01' UTC and ends with '2038-01-19 03:14:07' UTC. This is all very nice today, and may actually hold up till our retirement, but the younger readers may yet have to deal with the Bug2K+38™, certain to arrive.

DATETIME starts with '1000-01-01 00:00:00' and lasts up to '9999-12-31 23:59:59'. More on that later.

In respect of range, your current event logs may well use a TIMESTAMP value, while your grandfather's and granddaughter's birth dates may require DATETIME.

In general I would suggest that anything that relates to now, can be a TIMESTAMP. A new entry is added? It is added now. It can be represented by a TIMESTAMP. Anything has an expiry time of a few minutes, perhaps a few days or a month? You'll be safe using it up till late 2037. Anything else had better use a DATETIME. In particular, dates relating to issues such as birth, insurance, the market etc. fall into this category.

History, however, does not even hold up with DATETIME. Rome fell long before MIN(DATETIME). You will have to manage your own. Not even YEAR will help you out.

Storage

TIMESTAMP makes for 4 bytes. DATETIME makes for 8 bytes. Now that we have this behind us, let's see why.

Internal representation

A TIMESTAMP is merely the number of elapsed seconds since the epoch. It is a number. Not only is this a number, it is an incrementing number, and without gaps. Ever ascending. It actually equals the UNIX_TIMESTAMP() function.

A DATETIME is more of an oddity. From the manual:

  • DATETIME: Eight bytes:
    • A four-byte integer packed as YYYY×10000 + MM×100 + DD
    • A four-byte integer packed as HH×10000 + MM×100 + SS

Huh?

There's nothing sequential about a DATETIME. The value which follows 20110307095159 is 20110307095200. It's like the string representation without any delimiters. See the following (ignore the fraction, it's not really there):

root@mysql-5.1.51> SELECT NOW()+0;
+-----------------------+
| NOW()+0               |
+-----------------------+
| 20110307095238.000000 |
+-----------------------+

The implication of this internal representation is that conversions must be made back and forth. When you want to add 5 seconds to a DATETIME, MySQL cannot simply add 5 to some number, but must make for a more complicated calculation.

Peculiarities

Now here's a question: why does DATETIME start with '1000-01-01 00:00:00'? Can you provide with a convincing argument? The only argument I can find is hardly convincing.

Did you ever need to type in a password with your phone? Say, for your Credit Card company, or your Bank? Did you get recorded instructions saying "You must type six digits. You must not repeat the same digits three successive times"? That's somewhat nice, and probably a good advice. Did you ever get told "The first digit must not be 0"? That's just stupid. This means we're giving up on 100,000 valid passwords, thereby reducing our search space by 10%!

I suspect in both cases the reasoning is the same: if you start with a 0, and we represent it as a number, it won't make up for the same number of digits we intended it to.

In the case of your Credit Card company, there is no excuse. What's the problem with padding with zeros till you make those six digits?

In the case of MySQL's DATETIME, I can see a weak reasoning. Read this page on the docs to find out that a DATETIME can be represented in many forms, and can be automatically deduced from a string in many formats. Both strings '110307095100' and '20110307095100' make for the same DATETIME value. It is based on the number of characters in your text, that the parsing decides how to act.

The fact the first year is 1000 makes the number of digits predictable.

My thoughts?

I don't see why I should care about automatically converting texts of different formats to a DATETIME. As a programmer, I'm perfectly content with strict typing. I don't mind passing around only texts of the form 'YYYY-MM-DD HH:MM:SS' (or, better yet, passing date objects and letting my connector do the translation). Moreover, I prefer it that way! It makes me feel safer, that I haven't passed a wrong text by mistake, to be silently accepted.

For this reason I don't like the idea of losing the ability to use DATETIME on 1,000 lost years. As far as I can tell, this is a MySQL specific issue; there is no ANSI SQL for DATETIME.

[UPDATE:  Sheeri notes the biggest difference: DATETIME does not support time zones. A good discussion is also available on OurSQL]

 

tags:
posted in MySQL by shlomi

« | »

Follow comments via the RSS Feed | Leave a comment | Trackback URL

7 Comments to "TIMESTAMP vs. DATETIME, which should I be using?"

  1. Chris wrote:

    I've always found it odd that a DATETIME is actually two bytes larger than storing DATE and TIME columns separately. TIMESTAMP's history of magic auto-update-on-modification behavior has led me to avoid them in order to prevent confusion among developers less familiar with MySQL's magic (at least you can disable it now, but that wasn't always the case).

    Anyway, I generally just store things as INTs. FROM_UNIXTIME() exists for those few times I want a date string out of the database instead of something easy (like an INT) to feed into my language's date/time class.

  2. Baron Schwartz wrote:

    Remember the bad old days, when TIMESTAMP and DATETIME were formatted differently when displayed to the user?

    Date and time types are one of the places where MySQL shows a bit of a legacy. I know the Drizzle developers have a lot to say about this, too :-)

  3. Shantanu Oak wrote:

    While dumping data, --tz-utc needs to be used ...

    http://dev.mysql.com/doc/refman/5.1/en/mysqldump.html#option_mysqldump_tz-utc
    _____

    @Chris
    >> MySQL's magic auto-update-on-modification behavior has led me to avoid timestamp

    agreed. very true.

    >> store things as INTs. FROM_UNIXTIME()
    I have seen people doing this, but I prefer good old datetime!

  4. Ziv wrote:

    I always prefer using unix time stamp rather than using the DB date, datetime or timestamp types.

    Using unix time stamp I gain portability to other system, and - by using very simple arithmetic in the SQL I have all the functions available for dates and time...

  5. shlomi wrote:

    @Ziv,
    But then you lose timestamp information; or are you always using UTC?
    Another thing: using the DB's timestamps ensures consistency. For example, imagine two machines not in sync with global time, which have time offset between themselves.
    Also, as a more common scenario. machines that do not always correctly update with the latest Summer Saving time info (especially true for Israel, where the date is not constant and must be updated each year).

  6. Ziv wrote:

    @Shlomi,
    Usually I'm not taking the unix time stamp from my code, but using thd DB function that fetch its (the DB servers) time.

  7. Ovais Tariq wrote:

    There are other major differences between TIMESTAMP and DATETIME,

    1. TIMESTAMP columns can be told to be updated on INSERTs and UPDATEs, without providing a value for these columns, that is particularly useful in logging.
    2. TIMESTAMP columns are converted back and forth between UTC and the session timezones, when the value is inserted/updated in the TIMESTAMP column its converted to UTC, and when the value is retrieved its converted to your session timezone

Leave Your Comment

 

 
Powered by Wordpress and MySQL. Theme by openark.org