Upgrading to Barracuda & getting rid of huge ibdata1 file

February 15, 2011

Some of this is old stuff, but more people are now converting to InnoDB plugin, so as to enjoy table compression, performance boosts. Same holds for people converting to Percona's XtraDB. InnoDB plugin requires innodb_file_per_table. No more shared tablespace file.

So your ibdata1 file is some 150GB, and it won't reduce. Really, it won't reduce. You set innodb_file_per_table=1, do ALTER TABLE t ENGINE=InnoDB (optionally ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8), and you get all your tables in file-per-table .ibd files.

But the original ibdata1 file is still there. It has to be there, don't delete it! It contains more than your old data.

InnoDB tablespace files never reduce in size, it's an old-time annoyance. The only way to go round it, if you need the space, is to completely drop them and start afresh. That's one of the things so nice about file-per-table: an ALTER TABLE actually creates a new tablespace file and drops the original one.

The procedure

The procedure is somewhat painful:

  • Dump everything logically (either use mysqldump, mk-parallel-dump, or do it your own way)
  • Erase your data (literally, delete everything under your datadir)
  • Generate a new empty database
  • Load your dumped data.

Using replication

Replication makes this less painful. Set up a slave, have it follow up on the master.

  • Stop your slave.
  • Make sure to backup the replication position (e.g. write SHOW SLAVE STATUS on a safe location, or copy master.info file).
  • Work out the dump-erase-generate-load steps on the slave.
  • Reattach the slave to the master using saved data.

For this to succeed you must keep enough binary logs on the master for the entire dump-load period, which could be lengthy.

Upgrading to barracuda

If you wish to upgrade your InnoDB tables to Barracuda format, my advice is this:

  1. Follow the steps above to generate a file-per-table working slave
  2. Stop the slave
  3. Configure skip_slave_start
  4. Restart MySQL
  5. One by one do the ALTER TABLE into Barracuda format (ROW_FORMAT=COMPACT or ROW_FORMAT=COMPRESSED)

Note that if you're about to do table compression, the ALTER statements become considerably slower the better the compression is.

If your dataset is very large, and you can't keep so many binary logs, you may wish to break step 5 above into:

  • ALTER a large table
  • Restart MySQL
  • Start slave, wait for it to catch up
  • Restart MySQL again

and do the same for all large tables.

Why all these restarts?

I've been upgrading to Barracuda for a long time now. I have clearly noticed that ALTER into a COMPRESSED format works considerably slower after the slave has done some "real work". This in particular relates to the last "renaming table" stage. There was a bug with earlier InnoDB plugin versions which made this stage hang. It was solved. But it still takes some time for this last, weird stage, where the new replacement table is complete, and it's actually been renamed in place of the old table, and the old table renamed into something like "#sql-12345.ibd", and all that needs to be done is have it dropped, and... Well, it takes time.

My observation is it works faster on a freshly started server. Which is why I take the bother to restart MySQL before each large table conversion.

  • I used xtrabackup and replication to turn innodb_file_per_table on, avoiding MySQL shutdown: enable innodb_file_per_table with zero downtime.

  • @Dmitri,
    yes, that's what I meant with regard to replication. It is a very common practice.

    Do note that, strictly speaking, it isn't a "zero downtime". There is some (small) period of time where you stop writing to the master, verify that slave has caught up, and move everything to the slave.

    It actually can take a few seconds; with MMM I do that a lot, and I usually evidence a roughly 10 seconds of complete lockdown.

  • The end phase of the table rename is faster on a freshly started server because the buffer pool is empty. See http://www.mysqlperformanceblog.com/2011/02/03/performance-problem-with-innodb-and-drop-table/ for what is happening.

  • @Baron, thanks!

  • Daniël van Eeden

    Removing all files in the datadir is not really what's needed.

    All InnoDB files (ibdata files, iblogfiles, *.ibd) need to be removed. The non-InnoDB files like *.MYI *.MYD and various logfiles can stay in the datadir if they're there.

    Just deleting all files in the datadir is not always sufficient as the iblogfiles are somtimes moved to a different filesystem.

  • @Daniël,
    I usually work with InnoDB-only databases.

    I'm also assuming you won't be dumping just selected tables. It can be done, but complicates stuff. Sure, you can hack your way around, keep all MyISAM files, delete only InnoDB files & logs. What other various logfiles should remain in the datadir? They all become meaningless.

    With regard to "Just deleting all files in the datadir is not always sufficient as the iblogfiles are somtimes moved to a different filesystem": I recommend that you verify that all of:
    - ibdata1, ..
    - ib_logfile[012...]
    - *.ibd files
    are on the same filesystem. Again, unless you actually know what you're doing and have such I/O bottlenecks that you have to move InnoDB files around.
    With InnoDB files spread over multiple filesystems, you lose the ability to take FS snapshots.


  • Daniël van Eeden


    Even InnoDB-only setups do have MyISAM tables for the internal mysql.* tables.

    Just doing a dump/restore for InnoDB tables could be nice if there are a few big MyISAM tables.

    One of the files you might want to safe is the master.info file if it's a slave. And it might be nice to safe errorlogs and slow query logs to compare with the situation after the restore.

    With NetApp you could use one volume with multiple Qtrees to create atomic snapshots and still use multiple mountpoints.

  • @Daniël,
    I'm wondering about your NetApp setup. Do you have your different filesystems mapped to different disks on your NetApp?

  • Gabi Davis

    Hi shlomi,
    Good post.
    However, when handling large and very busy databases the restore from dump is not an option. The time to dump and load as well as the storage required for keeping binary log files till this entire operation is over is a kill over.
    Are there no other ways to do this? we're carrying on our back a file of 350G and the only option we currently have is to keep carrying it even though we plan to have no tables left in it in the near future.
    Also, are there no plans in the mysql world to finally allow altering tables without a rewrite of the full table? that's a huge downside of mysql as I see it.

  • @Gabi,

    When working with a slave, you can get the benefit of using mk-parallel-dump + restore. I've had these boost up export/import time by magnitude of 6x.

    Assuming your dataset size is indeed 350GB, I would guess this should take a day or two at the most (pure guess, based on experience with commodity hardware?). Why don't you try offline and see how much time it takes for export/import, without actually taking down a slave? I mean, export from a (stopped) slave, then import on a local office machine; measure the time for export + import. Then you can see if you have enough room to store the binary logs.

    Since I'm assuming your dataset always grows, I should add: the sooner the better.

    With regard to altering without rewrite of full table: InnoDB plugin's barracuda format allows for fast index creation.
    Or you could use Facebook's OSC, or openark kit's oak-online-alter-table

  • Gabi Davis

    Hi shlomi,
    Thanks for the timely reply 🙂
    Our entire dataset is almost double that, the innodb file is 350G on its own... and although two days of traffic is huge there are definitely ways to overcome the disk issue but in general, it's sad to realize that this is the only way to go. We were hoping to digg deeper and find something else we could do that wouldn't require a dump and reload of the entire data.
    If we have to go this way we'll definitely take the opportunity to make some structural changes we had been planning for long and did not run due to the size of the tables and the nature of 'alter table', and probably also upgrade to 5.5.
    As for the fast index creation - that is definitely great, but there's also the problem of not being able to add/drop columns (and I have to read more about the plugin to see if there's anything about that).
    Thanks again,

  • Gabi,
    I thinks this task is not hard to achieve. It's a one time job, I've done plenty of these before. We can discuss more privately if you like; it's a bit too lengthy to explain everything here.

  • Thanks for the post. You reiterated my earlier post about the same topic.


  • Pingback: Getting rid of huge ibdata file, no dump required | code.openark.org()

  • Avi Vainshtein

    Many thanks for the post, it's very useful.

    Is it necessary to set both the innodb_file_format and innodb_file_format_max to Barracuda in the my.cnf ?

    best regards,

  • I've never set the "innodb_file_format_max" so apparently not necessary. From the docs:

    "... If the server creates or opens a table with a “higher” file format, it sets the value of innodb_file_format_max to that format. "

Powered by Wordpress and MySQL. Theme by openark.org