An InnoDB table must have a primary key (one is created if you don’t do it yourself). You may have a natural key at hand. Stop! Allow me to suggest an AUTO_INCREMENT may be better.
Why should one add an AUTO_INCREMENT PRIMARY KEY on a table on which there’s a natural key? Isn’t an AUTO_INCREMENT a pseudo key, meaning, it doesn’t have any explicit relation to the row data, other than it is a number and unique?
Yes, indeed so. Nevertheless, consider:
- Natural keys are many times multi-columned.
- Multi column PRIMARY KEYs make for larger keys, and make for bloated secondary keys as well. You may be wasting space for storing the additional AUTO_INCREMENT column, but you may gain space back on secondary keys.
- Multi column PRIMARY KEYs make for more locks. See also this post.
- InnoDB INSERTs work considerably faster when worked in ascending PRIMARY KEY order. Can you ensure your natural key is in such order?
- Even though an AUTO_INCREMENT makes for an INSERT bottleneck (values must be given serially), it is in particular helpful to InnoDB by ensuring PRIMARY KEY values are in ascending order.
- AUTO_INCEMENT makes for chronological resolution. You know what came first, and what came next.
- In many datasets, more recent entries are often being accessed more, and are therefore “hotter”. By using AUTO_INCREMENT, you’re ensuring that recent entries are grouped together within the B+ Tree. This means less random I/O when looking for recent data.
- A numerical key is in particular helpful in splitting your table (and tasks on your table) into smaller chunks. I write tools which can work out with any PRIMARY KEY combination, but it’s easier to work with numbers.
@Lachlan,
Indeed so.
@Justin,
Indeed so. I guess for any bullet I put there could be a counter-bullet.
The sentence “In many datasets, more recent entries are often being accessed more” can also read “and in many others – they aren’t”.
If the natural key is the one being mainly accessed, AND it is small enough (an integer, a couple integers etc.), then I agree.
If, however, it is a very large key, then I must disagree (see my reference to an older post.
@flytwokites:
joins have nothing in particular to do with PRIMARY KEYs.
You don;t even have to have *any* index, although indexes are the only MySQL optimization for joins (the alternative is full-join, usually unacceptable).
Call me bookish. But from MySQL Administrator’s Bible By Sheeri Cabral, Page 221
First, a natural key should be looked for, and if a natural key cannot be determined, then a surrogate key may be appropriate.
@Shantanu,
You are not foolish, and neither is Sheeri nor Mark. There are reasons for and against using AUTO_INCREMENT (see comments by Lachlan & Justin).
Best is to think about the reasons, not just take someone’s advice as granted. In our case, please read two paragraphs down the same page (221) and see that the authors provide with a case FOR using a surrogate key over a natural key. Does that contradict what was written before? No. Because there are cases FOR and cases AGAINST.
There is reason to their advice of preferring a natural key; first answer to yourself the question: “what is their reason?”
Regards