June 27, 2012

I have just read INSERT, Don't DELETE by Aaron Brown, and have a lengthy response, which is why I am writing this post instead of commenting on his.

I wish to offer a counter-view and suggest that DELETEs are probably the better choice.

Aaron suggests that, when one wishes to purge rows from some table, a trick can be used: instead of DELETEing unwanted rows, one can INSERT the "good" rows into a new table, then switch over with RENAME (but please read the referenced post for complete details).

I respectfully disagree on several points discussed.


Blocking writes

The fact that one needs to block writes while the new table is being created is problematic: you essentially need to turn off parts of your application. The post suggests one could use a slave, but that solution is far from trivial either. To switch over, you yet again need to turn off access to the database, even if only for a short while.

A switchover to a slave is quite a big deal, in my opinion, for the mere purpose of deleting rows.

DELETEs are easy

DELETEs are so much easier. The first thing to note is that you don't actually have to delete all the rows *at once*.

You just need to drop some rows, right? Why waste a huge transaction that takes minutes, when you can drop the rows in chunks, a batch at a time?
For that, you can use pt-archiver from Percona Toolkit, oak-chunk-update from openark-kit, or write a simple QueryScript snippet with common_schema:

while (DELETE FROM title WHERE title <= 'g' LIMIT 1000)
  throttle 1;

So, drop 1,000 rows or so at a time, then sleep for a while, and repeat. The total runtime is longer, but who cares? The impact can be reduced until it is unnoticeable.
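For illustration, the same chunked, throttled purge can be sketched in plain Python. The example below runs against SQLite so it is self-contained; the table name and contents are made up. On MySQL the plain DELETE ... LIMIT form shown above suffices, whereas SQLite needs a rowid subquery to get the same effect:

```python
import sqlite3
import time

def purge_in_chunks(conn, chunk_size=1000, throttle_seconds=0.0):
    """Delete unwanted rows in small chunks, sleeping between chunks
    so the impact on concurrent work stays negligible."""
    total = 0
    while True:
        # SQLite lacks DELETE ... LIMIT by default, so constrain via rowid;
        # on MySQL the plain "DELETE FROM title ... LIMIT 1000" form works.
        cur = conn.execute(
            "DELETE FROM title WHERE rowid IN "
            "(SELECT rowid FROM title WHERE title <= 'g' LIMIT ?)",
            (chunk_size,),
        )
        conn.commit()
        if cur.rowcount == 0:
            break  # nothing left to purge
        total += cur.rowcount
        time.sleep(throttle_seconds)  # the "throttle 1" equivalent
    return total

# Demo on an in-memory database with made-up data: 500 rows per letter a..z.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE title (title TEXT)")
conn.executemany("INSERT INTO title VALUES (?)",
                 [(chr(c),) for c in range(ord('a'), ord('z') + 1)] * 500)
deleted = purge_in_chunks(conn, chunk_size=1000)
print(deleted)  # 3500 (letters a..g, 500 rows each), deleted 1000 at a time
```

Each chunk is its own short transaction, so locks are held only briefly; the sleep between chunks is what keeps the purge invisible to concurrent traffic.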

Space reclaim

You can use online table operations to rebuild your table and reclaim the disk space. See either oak-online-alter-table or pt-online-schema-change. Again, both work in small chunks, so there are no long stalls.

But more to the point: in my experience, purging is usually repetitive. You purge, the data fills up again, you purge again, and so on.

Which is why it doesn't make much sense to rebuild the table and reclaim the disk space: it just grows again to roughly the same size.
For a one-time operation (e.g. after cleanup has been neglected for a long time) -- yes, absolutely, do a rebuild and reclaim. For repetitive cleanup, I don't bother.


Aaron does note at the end of his post that DELETE operations can be done online, while the INSERT trick requires downtime, and this is a fair assessment.

But just to make a point: the DELETE timings are simply not interesting. Since we are not required to delete the rows within a given time frame (there is no "press the red button" moment), we can spread the deletes over time and make the impact negligible. So not only is everything done online, it also goes unnoticed by the user. And this, I believe, is the major thing to consider.

  • I largely agree with your statements, but we do have a corner case where the lockdown isn't an issue: the application only writes to the tables at a set interval, so we were able to utilize Aaron's suggested solution a couple of times. In that particular case we needed to remove 95% of all records and deleting them in a loop would have taken us days instead of minutes.

    At the same time, I do want to add that Aaron's suggestion also has a problem on highly concurrent MySQL systems, where the create/drop will certainly cause stalls in the LRU.

  • Hi Shlomi,

    Thanks for responding to my post. I agree with all the points that you have made above. I didn't intend to imply that the INSERT technique was universally better, only that it was faster in some cases, which was an important factor to me. It certainly has a number of limitations, particularly if you have foreign key relationships or cannot stop writes to the table. My specific use was to get rid of about 50MM rows in over a dozen tables with only soft foreign key relationships (Rails application) and also have an easy rollback in the event that something broke. By doing this, you end up with a complete copy of your table which you can drop at your leisure or move back into place if something goes wrong.

    @Art - I brought up the stall issue in my post along with problems caused by ext filesystems. I can't see this technique being very effective on a live server unless you can guarantee no writes to the table.

  • @Art + Aaron,
    Nor did I mean to suggest Aaron's solution was universally inferior.
    There are always cases where the standard solution to a problem would perform worse than an unconventional ad-hoc one.
    For that matter, I think Aaron's trick is cool!

  • Rick James

    My discussion on the problem:

    For time-based purging, PARTITION by "range" is an excellent solution to large DELETEs.

  • @Rick I agree; if you know ahead of time that you are going to be purging large amounts of data regularly, using partitioned tables is an ideal solution.
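Rick's suggestion can be illustrated with a self-contained sketch. SQLite has no native partitioning, so one made-up table per period stands in for the partitions below; in MySQL you would declare a single table with PARTITION BY RANGE over a time-based column and purge with ALTER TABLE ... DROP PARTITION, a near-instant metadata operation rather than a row-by-row DELETE:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One "partition" per month; in MySQL this would be one table created
# with PARTITION BY RANGE over a time-based column.
for month in ("2012_04", "2012_05", "2012_06"):
    conn.execute(f"CREATE TABLE events_{month} (id INTEGER, payload TEXT)")
    conn.executemany(f"INSERT INTO events_{month} VALUES (?, ?)",
                     [(i, "x") for i in range(1000)])

# Purging a whole period drops the table outright instead of deleting
# rows one by one (the analog of ALTER TABLE ... DROP PARTITION).
conn.execute("DROP TABLE events_2012_04")

remaining = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(remaining)  # ['events_2012_05', 'events_2012_06']
```

The point of the sketch: purging a period touches no data pages at all, which is why range partitioning beats any DELETE-based scheme when the purge boundary is time-based and known in advance.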

  • Due to the fact that large tables are usually continually used by a live system, I found that lock-delays of a few seconds or more (minutes, hours) are unacceptable. Therefore I took a different approach to this problem.

    I've created a script that:
    - creates a new table MY_TABLE_NEW
    - creates a merge-table MY_TABLE_MERGE combining MY_TABLE_NEW and MY_TABLE whereby inserts are done in MY_TABLE_NEW
    - renames MY_TABLE into MY_TABLE_OLD and at the same time MY_TABLE_MERGE into MY_TABLE

    This effectively creates a merge table with all the data, still accessible through the name MY_TABLE where all new records are moved into MY_TABLE_NEW - which is optimized of course as it is empty.

    Then the script slowly starts moving records within MY_TABLE. It takes a record (or fifty), deletes them and re-inserts them. Since inserts are directed to MY_TABLE_NEW, that effectively moves them from MY_TABLE_OLD to MY_TABLE_NEW. That slowly builds an optimized MY_TABLE_NEW, while the system continues to work.

    The chances of the system needing a specific record while it's being deleted and inserted are very, very small with the tables I've used. Of course that depends highly on the way your system processes data. But for me, it worked flawlessly.

    I've used this script dozens of times on production systems. It copes with interruptions, can be run in parallel on several tables at the same time and has virtually no downsides.

    Of course you cannot merge merge-tables, so when you already have merge-tables, this solution may not be for you.

    If you'd like to take a look at the script, let me know in a comment below.

    Eelko de Vos
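The batch-move step Eelko describes can be sketched as follows. This is a hypothetical stand-in using SQLite with made-up table names; the MyISAM MERGE-table plumbing is omitted, and rows are simply moved from MY_TABLE_OLD to MY_TABLE_NEW in small transactions:

```python
import sqlite3

def move_batch(conn, batch_size=50):
    """Move up to batch_size rows from MY_TABLE_OLD to MY_TABLE_NEW
    in one small transaction, keeping each lock window short."""
    rows = conn.execute(
        "SELECT id, payload FROM MY_TABLE_OLD LIMIT ?", (batch_size,)
    ).fetchall()
    if not rows:
        return 0
    conn.executemany("DELETE FROM MY_TABLE_OLD WHERE id = ?",
                     [(r[0],) for r in rows])
    conn.executemany("INSERT INTO MY_TABLE_NEW VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)

# Demo with made-up data: 120 rows migrate in batches of 50.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MY_TABLE_OLD (id INTEGER PRIMARY KEY, payload TEXT)")
conn.execute("CREATE TABLE MY_TABLE_NEW (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO MY_TABLE_OLD VALUES (?, ?)",
                 [(i, "x") for i in range(120)])

moved = 0
while (n := move_batch(conn, batch_size=50)):
    moved += n
print(moved)  # 120 -- all rows end up in MY_TABLE_NEW
```

Because each batch is its own transaction, an interrupted run simply resumes where it left off, which matches the "coping with interruptions" property described above.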
