Reasons to use AUTO_INCREMENT columns on InnoDB

March 22, 2011

An InnoDB table must have a primary key (one is created if you don't do it yourself). You may have a natural key at hand. Stop! Allow me to suggest an AUTO_INCREMENT may be better.

Why should one add an AUTO_INCREMENT PRIMARY KEY on a table on which there's a natural key? Isn't an AUTO_INCREMENT a pseudo key, meaning, it doesn't have any explicit relation to the row data, other than it is a number and unique?

Yes, indeed so. Nevertheless, consider:

  • Natural keys are many times multi-columned.
  • Multi column PRIMARY KEYs make for larger keys, and make for bloated secondary keys as well. You may be wasting space for storing the additional AUTO_INCREMENT column, but you may gain space back on secondary keys.
  • Multi column PRIMARY KEYs make for more locks. See also this post.
  • InnoDB INSERTs work considerably faster when worked in ascending PRIMARY KEY order. Can you ensure your natural key is in such order?
  • Even though an AUTO_INCREMENT makes for an INSERT bottleneck (values must be given serially), it is in particular helpful to InnoDB by ensuring PRIMARY KEY values are in ascending order.
  • AUTO_INCEMENT makes for chronological resolution. You know what came first, and what came next.
  • In many datasets, more recent entries are often being accessed more, and are therefore "hotter". By using AUTO_INCREMENT, you're ensuring that recent entries are grouped together within the B+ Tree. This means less random I/O when looking for recent data.
  • A numerical key is in particular helpful in splitting your table (and tasks on your table) into smaller chunks. I write tools which can work out with any PRIMARY KEY combination, but it's easier to work with numbers.
  • Great Post!

    In 5.0 there were some issues with auto_inc performance, so this might only be a good practice if you're using 5.1 or 5.5

    Autinc is also tunable:

  • Lachlan Mulcahy

    Also keep in mind that one of the things you note could be a reason FOR using Natural PRIMARY KEYs. That is to say, if the fields a query requests are only in the PRIMARY KEY and one SECONDARY KEY, then it is possible to serve the query entirely from the SECONDARY KEY.

  • Justin Swanhart

    This is all true, but if you have to create a secondary key on the natural key order to access the table AND you mainly access the table by the primary key, then it might make sense to organize the table around the natural key.

    You will end up significantly increasing random IO once a table becomes large if you use a surrogate key but don't access the data via it. Of course, the adaptive hash index could help with this, but it isn't guaranteed if your access pattern can't utilize it (perhaps range scans)..

  • Justin Swanhart

    I mean to say "mainly access the table by the 'natural key'" then it might make sense to use it.

  • flytwokites

    An auto increment PK is needed for m2m join tables?

  • @Lachlan,
    Indeed so.

    Indeed so. I guess for any bullet I put there could be a counter-bullet.
    The sentence "In many datasets, more recent entries are often being accessed more" can also read "and in many others - they aren't".
    If the natural key is the one being mainly accessed, AND it is small enough (an integer, a couple integers etc.), then I agree.
    If, however, it is a very large key, then I must disagree (see my reference to an older post.

  • @flytwokites:
    joins have nothing in particular to do with PRIMARY KEYs.
    You don;t even have to have *any* index, although indexes are the only MySQL optimization for joins (the alternative is full-join, usually unacceptable).

  • Call me bookish. But from MySQL Administrator's Bible By Sheeri Cabral, Page 221
    First, a natural key should be looked for, and if a natural key cannot be determined, then a surrogate key may be appropriate.

  • @Shantanu,
    You are not foolish, and neither is Sheeri nor Mark. There are reasons for and against using AUTO_INCREMENT (see comments by Lachlan & Justin).
    Best is to think about the reasons, not just take someone's advice as granted. In our case, please read two paragraphs down the same page (221) and see that the authors provide with a case FOR using a surrogate key over a natural key. Does that contradict what was written before? No. Because there are cases FOR and cases AGAINST.
    There is reason to their advice of preferring a natural key; first answer to yourself the question: "what is their reason?"


Powered by Wordpress and MySQL. Theme by