TIMESTAMP vs. DATETIME, which should I be using?

They both look the same, have roughly the same use. Nevertheless, they differ in many respects. I’ll note the difference, and note a few pitfalls and peculiarities.

Range

TIMESTAMP starts with the epoch, ‘1970-01-01 00:00:01’ UTC and ends with ‘2038-01-19 03:14:07’ UTC. This is all very nice today, and may actually hold up till our retirement, but the younger readers may yet have to deal with the Bug2K+38™, certain to arrive.

DATETIME starts with ‘1000-01-01 00:00:00’ and lasts up to ‘9999-12-31 23:59:59’. More on that later.

In respect of range, your current event logs may well use a TIMESTAMP value, while your grandfather’s and granddaughter’s birth dates may require DATETIME.

In general I would suggest that anything that relates to now, can be a TIMESTAMP. A new entry is added? It is added now. It can be represented by a TIMESTAMP. Anything has an expiry time of a few minutes, perhaps a few days or a month? You’ll be safe using it up till late 2037. Anything else had better use a DATETIME. In particular, dates relating to issues such as birth, insurance, the market etc. fall into this category.

History, however, does not even hold up with DATETIME. Rome fell long before MIN(DATETIME). You will have to manage your own. Not even YEAR will help you out.

Storage

TIMESTAMP makes for 4 bytes. DATETIME makes for 8 bytes. Now that we have this behind us, let’s see why.

Internal representation

A TIMESTAMP is merely the number of elapsed seconds since the epoch. It is a number. Not only is this a number, it is an incrementing number, and without gaps. Ever ascending. It actually equals the UNIX_TIMESTAMP() function.

A DATETIME is more of an oddity. From the manual:

DATETIME: Eight bytes:

A four-byte integer packed as YYYY×10000 + MM×100 + DD

A four-byte integer packed as HH×10000 + MM×100 + SS

Huh?

There’s nothing sequential about a DATETIME. The value which follows 20110307095159 is 20110307095200. It’s like the string representation without any delimiters. See the following (ignore the fraction, it’s not really there):

root@mysql-5.1.51> SELECT NOW()+0;
+-----------------------+
| NOW()+0               |
+-----------------------+
| 20110307095238.000000 |
+-----------------------+

The implication of this internal representation is that conversions must be made back and forth. When you want to add 5 seconds to a DATETIME, MySQL cannot simply add 5 to some number, but must make for a more complicated calculation.

Peculiarities

Now here’s a question: why does DATETIME start with ‘1000-01-01 00:00:00’? Can you provide with a convincing argument? The only argument I can find is hardly convincing.

Did you ever need to type in a password with your phone? Say, for your Credit Card company, or your Bank? Did you get recorded instructions saying “You must type six digits. You must not repeat the same digits three successive times“? That’s somewhat nice, and probably a good advice. Did you ever get told “The first digit must not be 0“? That’s just stupid. This means we’re giving up on 100,000 valid passwords, thereby reducing our search space by 10%!

I suspect in both cases the reasoning is the same: if you start with a 0, and we represent it as a number, it won’t make up for the same number of digits we intended it to.

In the case of your Credit Card company, there is no excuse. What’s the problem with padding with zeros till you make those six digits?

In the case of MySQL’s DATETIME, I can see a weak reasoning. Read this page on the docs to find out that a DATETIME can be represented in many forms, and can be automatically deduced from a string in many formats. Both strings ‘110307095100’ and ‘20110307095100’ make for the same DATETIME value. It is based on the number of characters in your text, that the parsing decides how to act.

The fact the first year is 1000 makes the number of digits predictable.

My thoughts?

I don’t see why I should care about automatically converting texts of different formats to a DATETIME. As a programmer, I’m perfectly content with strict typing. I don’t mind passing around only texts of the form ‘YYYY-MM-DD HH:MM:SS’ (or, better yet, passing date objects and letting my connector do the translation). Moreover, I prefer it that way! It makes me feel safer, that I haven’t passed a wrong text by mistake, to be silently accepted.

For this reason I don’t like the idea of losing the ability to use DATETIME on 1,000 lost years. As far as I can tell, this is a MySQL specific issue; there is no ANSI SQL for DATETIME.

[UPDATE: Sheeri notes the biggest difference: DATETIME does not support time zones. A good discussion is also available on OurSQL]

9 thoughts on “TIMESTAMP vs. DATETIME, which should I be using?”

Chris says:

May 22, 2011 at 7:29 am

I’ve always found it odd that a DATETIME is actually two bytes larger than storing DATE and TIME columns separately. TIMESTAMP’s history of magic auto-update-on-modification behavior has led me to avoid them in order to prevent confusion among developers less familiar with MySQL’s magic (at least you can disable it now, but that wasn’t always the case).

Anyway, I generally just store things as INTs. FROM_UNIXTIME() exists for those few times I want a date string out of the database instead of something easy (like an INT) to feed into my language’s date/time class.
Baron Schwartz says:

May 22, 2011 at 1:33 pm

Remember the bad old days, when TIMESTAMP and DATETIME were formatted differently when displayed to the user?

Date and time types are one of the places where MySQL shows a bit of a legacy. I know the Drizzle developers have a lot to say about this, too 🙂
Shantanu Oak says:

May 24, 2011 at 9:42 am

While dumping data, –tz-utc needs to be used …

http://dev.mysql.com/doc/refman/5.1/en/mysqldump.html#option_mysqldump_tz-utc
_____

@Chris
>> MySQL’s magic auto-update-on-modification behavior has led me to avoid timestamp

agreed. very true.

>> store things as INTs. FROM_UNIXTIME()
I have seen people doing this, but I prefer good old datetime!
Ziv says:

May 26, 2011 at 4:11 pm

I always prefer using unix time stamp rather than using the DB date, datetime or timestamp types.

Using unix time stamp I gain portability to other system, and – by using very simple arithmetic in the SQL I have all the functions available for dates and time…
shlomi says:

May 26, 2011 at 4:42 pm

@Ziv,
But then you lose timestamp information; or are you always using UTC?
Another thing: using the DB’s timestamps ensures consistency. For example, imagine two machines not in sync with global time, which have time offset between themselves.
Also, as a more common scenario. machines that do not always correctly update with the latest Summer Saving time info (especially true for Israel, where the date is not constant and must be updated each year).