An argument for using mysqldump

November 9, 2010

I fully agree with Morgan's An argument for not using mysqldump. This post does not come to contradict it, but rather to shed a positive light on mysqldump.

I usually prefer an LVM snapshot based backup, or using XtraBackup. And with databases of dozens of GB and above, I consider mysqldump a poor alternative: poor in runtime, and poor in the overhead it imposes while the backup is taken.

However, once in a while I am reminded that mysqldump just works.

As a recent example, I had a server which was killed after an ALTER TABLE statement hung forever (the table had already been ALTERed, but the old copy was never dropped). The old table's data still hung around the file system, but was not recognized by InnoDB. Trying DISCARD TABLESPACE did not do the job, and eventually the file was dropped.

So far, reasonable. InnoDB would complain about some table it never recognized in the first place, but all would work. That is, until backups were concerned. With innobackup or XtraBackup the restore would fail on some internal problem. LVM would work, but would only copy the problem along: innobackup would never again be usable on this database.

It turned out that 120GB of compressed InnoDB data (roughly 250GB uncompressed) would dump with --single-transaction in a matter of 4 hours, and would restore in a matter of some 20 hours. A whole lot more than the 3 hours total an LVM backup of that database would take. But the data loaded well; no missing tablespaces.
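
For reference, the dump and restore were along the following lines (a minimal sketch; host, credential and file names are illustrative, not the actual ones used):

    # Consistent InnoDB dump, piped through gzip; names are illustrative.
    mysqldump --single-transaction --routines --triggers \
      --host=db1 --user=backup --password my_schema \
      | gzip > /backups/my_schema.sql.gz

    # Restore by replaying the dump through the mysql client.
    gunzip < /backups/my_schema.sql.gz \
      | mysql --host=db2 --user=admin --password my_schema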

I've had similar incidents in the past. Not to mention the issue of compacting the shared tablespace file.

There's something about being able to say "I'm not sure how long this is going to take; maybe a day or two. But in the end, we will have problems P1, P2 & P3 resolved".

I like the clean state you get from a mysqldump restore.

posted in MySQL by shlomi


9 Comments to "An argument for using mysqldump"

  1. Daniël van Eeden wrote:

    There is yet another reason why I like mysqldump: you can easily grep for data which should or shouldn't be there, which is great for security.
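
    For example (a hypothetical check; the pattern and file name are illustrative):

        # Make sure no plaintext 16-digit card numbers made it into the dump.
        gunzip < /backups/my_schema.sql.gz | grep -E '[0-9]{16}' | head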

    And restoring on a completely different platform and/or MySQL version also calls for mysqldump.

    With mysqldump it should also be possible to load the data into another RDBMS, but I never tried that.

  2. Jason wrote:

    You can also *pull* a backup from a remote host using mysqldump.
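
    For instance (a sketch; the host and schema names are illustrative):

        # Pull a consistent dump from a remote server over the wire.
        mysqldump --host=remote-db.example.com --user=backup --password \
          --single-transaction my_schema > my_schema.sql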

  3. Ronald Bradford wrote:

    mysqldump has other benefits.

    You can and should always dump schema and data separately. This gives you a version of your schema.
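
    Something along these lines (the schema name is illustrative; --no-data and --no-create-info are the relevant flags):

        # Schema only (no rows), then data only (no CREATE statements).
        mysqldump --no-data --routines --triggers my_schema > my_schema.schema.sql
        mysqldump --no-create-info my_schema > my_schema.data.sql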

    mysqldump allows for incremental data recovery, e.g. one table. Even if it is all in one dump file, you can extract it, since it's text.
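
    A rough extraction sketch (table and file names are illustrative; this prints through the header of the next table, so trim the tail by hand):

        # Cut one table's section out of a full dump, relying on mysqldump's
        # "-- Table structure for table" section headers.
        sed -n '/^-- Table structure for table `my_table`/,/^-- Table structure for table `/p' \
          full_dump.sql > my_table.sql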

    You can use mysqldump to modify data during a restore. Ideal for test environments where CLI tools are more effective than DB commands.
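
    For instance (a sketch; the domain rewrite is a made-up example):

        # Rewrite data on the fly while restoring, e.g. anonymize email domains.
        sed 's/@production.example.com/@test.example.com/g' full_dump.sql | mysql test_schema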

    mysqldump is the only way to cross OS boundaries.

  4. Morgan Tocker wrote:

    Hi Shlomi,

    I like the clean state as well. I depend on it for moving between major versions - as I do not trust that just using the old data files in a new release will work without issues.

    I do not like being pinned down on "what is too large for mysqldump" (because it's the table structure that really matters). I tell people that if they need a number, it's 10G for day-to-day backups. After that you may want to switch.

    "I’ve had similar incidents in the past. Not to mention the issue of compressing shared tablespace file."

    A lot of this issue probably comes from actually *running mysqldump*. Using --single-transaction requires that old versions of rows be kept around while the backup runs, so it can be very dangerous on large systems with a lot of modifications. It also churns all content through the buffer pool, which prior to InnoDB plugin releases can be particularly problematic due to bug #45015. If you stop using mysqldump, you may stop having to do this tablespace compaction :)
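
    (For the record, what --single-transaction boils down to is roughly the following single long-lived snapshot; the exact statements vary by mysqldump version:)

        -- Illustrative only: the gist of what mysqldump --single-transaction
        -- opens at the start of the run.
        SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
        START TRANSACTION WITH CONSISTENT SNAPSHOT;
        -- All tables are then read within this single snapshot, so InnoDB
        -- must retain old row versions until the dump completes.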

    Compare this to xtrabackup, which uses the end of the backup as a synchronization point - and does not load the data through your InnoDB buffer pool. The way it works is actually very intelligent:

    http://www.percona.com/docs/wiki/percona-xtrabackup:xtrabackup:internals#tuning_the_os_buffers

    ---

    Answering some other points:

    You do not need to use mysqldump to restore from Windows->Linux or 64-bit to 32-bit as other comments have noted. The only incompatibilities you will have come from case sensitivity, but this can be solved with lower_case_table_names http://dev.mysql.com/doc/refman/5.0/en/identifier-case-sensitivity.html.
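
    (An illustrative my.cnf fragment for the destination server; which value to pick depends on both platforms:)

        # my.cnf: store identifiers in lowercase, compare case-insensitively.
        [mysqld]
        lower_case_table_names = 1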

    You do not need mysqldump to restore a table at a time either. xtrabackup can prepare tables for export (from a MySQL/Percona server), and import them into any Percona server. (Note: a regular MySQL server can import back into the same server one table at a time, but not into any server.)
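
    (A rough sketch of that flow, assuming innodb_file_per_table, an already-taken backup, and a Percona Server destination; table and path names are illustrative:)

        # Prepare the backup and generate per-table export files.
        innobackupex --apply-log --export /path/to/backup
        # On the destination, with a matching empty table already created:
        mysql -e "ALTER TABLE my_schema.my_table DISCARD TABLESPACE"
        cp /path/to/backup/my_schema/my_table.ibd /var/lib/mysql/my_schema/
        cp /path/to/backup/my_schema/my_table.exp /var/lib/mysql/my_schema/
        mysql -e "ALTER TABLE my_schema.my_table IMPORT TABLESPACE"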

    You do not need mysqldump to pull from another server - xtrabackup can do a streaming backup. Use ssh user@server [cmd] to start the backup and receive it locally.
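
    For example (user, host and path names are illustrative):

        # Start the backup on the remote machine, receive the stream locally.
        ssh user@server "innobackupex --stream=tar /tmp" > /backups/backup.tar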

    You do need mysqldump if you want to restore individual rows in an ad-hoc fashion. This is very common, which is why many people take more than one type of backup. But what I am saying in my original post is that the primary backup to be used in failure situations should not be mysqldump; that is, at least for anyone with more than just a little data.

  5. Daniël van Eeden wrote:

    mysqldump is needed when migrating from an LSB to an MSB architecture (i.e. across byte orders).
    mysqldump works for all table types. XtraBackup doesn't work for NDB or custom storage engines.

    XtraBackup is the fastest tool for 80% of backups/restores; for the other 20% there is mysqlbackup.

  6. Morgan Tocker wrote:

    @Daniël - nope. Maybe with MySQL Cluster, but with InnoDB and MyISAM the endianness doesn't matter. There's even a MySQL certification question to that effect. You just need:

    - two's-complement signed integers
    - IEEE floating-point math

    This is true for most hardware except for perhaps embedded.

    NDB has its own backup, which I would use in favor of mysqldump.

  7. shlomi wrote:

    Morgan,

    "...A lot of this issue probably comes from actually *running mysqldump*..."
    Not sure what you meant, but I was referring to three distinct incidents I had, where innobackup/xtrabackup failed on backup. In two cases, it was (probably) because of tables unlisted in the data dictionary.
    Either the backup would fail (consistently) halfway through, or the restore would fail halfway through.
    In another case I would get (consistently) duplicate key errors on a replication slave set up with innobackup (granted, this was almost 2 years ago).

    Just so no one gets me wrong, what I'm saying is:
    - XtraBackup is a great tool and I'm using it a LOT. Customers are always surprised to find out how easy it is to take a hot backup.
    - It is susceptible to bugs, whether within the backup code itself, within InnoDB code, or due to inconsistencies within the file system.
    - LVM based backups, while not as lightweight as XtraBackup in my experience, are not susceptible to such bugs.
    - mysqldump just works, albeit with intolerable runtimes on large data sets.

  8. Log Buffer #207, A Carnival of the Vanities for DBAs | The Pythian Blog wrote:

    [...] Noach shed a positive light on mysqldump in response to Morgan’s blog post. Even though he agrees with some of Morgan’s [...]

  9. Sherzod Odinaev wrote:

    I am a couple of years late to this conversation, but the technique I used to speed up mysqldump was to back up tables in groups, and run the groups simultaneously. If I had 1K tables, I would split them into groups of 50 tables (where each group has at least one of the largest tables), and process them in parallel, piping to gzip. A couple hundred GB in 30 minutes.
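
    In shell terms, the grouping idea is roughly this (a sketch; table and file names are illustrative):

        # Each group of tables gets its own mysqldump process; groups run in
        # parallel, each piped through gzip.
        dump_group() {
          local name="$1"; shift
          mysqldump my_schema "$@" | gzip > "$name.sql.gz"
        }
        dump_group group1 big_table_a small_01 small_02 &
        dump_group group2 big_table_b small_03 small_04 &
        wait    # block until all groups have finished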
