TokuDB configuration variables of interest

October 23, 2013

During our experiments I came upon a few TokuDB variables of interest; if you are using TokuDB you might want to look into these:

  • tokudb_analyze_time

This is an upper bound on the number of seconds an ANALYZE TABLE will spend on each index of each partition of a TokuDB table.

That is, if tokudb_analyze_time = 5 and your table has 4 indexes (including PRIMARY) and 7 partitions, then the total runtime is bounded by 5*4*7 = 140 seconds.

Default in 7.1.0: 5 seconds
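To pick a different bound, a my.cnf fragment might look like this (the value 2 below is just an illustration, not a recommendation):

```ini
# Hypothetical example: cap ANALYZE TABLE at 2 seconds per index per partition.
# With 4 indexes and 7 partitions, total runtime is bounded by 2*4*7 = 56 seconds.
tokudb_analyze_time = 2
```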

  • tokudb_cache_size

Similar to innodb_buffer_pool_size, this variable sets the amount of memory TokuDB allocates for caching pages. As with InnoDB, the table is clustered within the index, so the cache serves pages of both indexes and data.

Default: 50% of total memory
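If you do override the default, it goes in my.cnf like any other server variable (the 16G figure below is an arbitrary illustration, not a recommendation):

```ini
# Hypothetical example: allocate 16GB to TokuDB's cache instead of the
# 50%-of-RAM default. Note this cache holds uncompressed pages.
tokudb_cache_size = 16G
```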

  • tokudb_directio

Boolean, values are 0/1. Setting tokudb_directio = 1 is like specifying innodb_flush_method = O_DIRECT: it tells the OS not to cache pages requested by TokuDB. Default: 0.

Now here's the interesting part: we are used to telling InnoDB to take as much memory as we can provide (because we want it to cache as much as possible) and to avoid OS caching (because that would mean a page appears both in the buffer pool and in OS memory, which is a waste). So the following setup is common:

innodb_buffer_pool_size = [as much as you can allocate while leaving room for connection memory]G
innodb_flush_method = O_DIRECT

And my first instinct was to do the same for TokuDB. But after speaking with Gerry Narvaja of Tokutek, I realized it is not that simple. The reason TokuDB's default memory allocation is 50% and not, say, 90%, is that the OS cache holds the data in compressed form, while the TokuDB cache holds it in uncompressed form. This means that by limiting the TokuDB cache you leave more room for the OS cache, which stores compressed data, so more data (hopefully with little duplication between the two caches) fits in memory.
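To illustrate the trade-off with made-up numbers, assume 64GB of RAM and a roughly 3x compression ratio (both figures are assumptions for the sake of the arithmetic, not measurements):

```ini
# Illustration only; the sizes and the 3x ratio are assumptions, not advice.
#
# Option A, InnoDB-style, give TokuDB most of the memory:
#   tokudb_cache_size = 58G  -> ~58GB of uncompressed data in memory,
#                               little room left for the OS cache
#
# Option B, keep the 50% default:
#   tokudb_cache_size = 32G  -> 32GB uncompressed in TokuDB's cache, while the
#                               remaining ~32GB of OS cache holds compressed
#                               pages covering ~96GB of data (minus whatever is
#                               duplicated between the two caches)
tokudb_cache_size = 32G
tokudb_directio = 0
```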

I did try both options and did not see an obvious difference, but did not test this thoroughly. My current setup is:

# No setup; just keep to the defaults for both
  • tokudb_commit_sync

  • tokudb_fsync_log_period

These two variables are similar in essence to innodb_flush_log_at_trx_commit, but allow for finer tuning. With innodb_flush_log_at_trx_commit you choose between syncing the transaction log to disk upon each commit or once per second. With tokudb_commit_sync = 1 (the default) you get a transaction log sync to disk per commit. When tokudb_commit_sync = 0, tokudb_fsync_log_period dictates the interval between flushes, in milliseconds; so a value of tokudb_fsync_log_period = 1000 means once per second.

Since our original InnoDB installation used innodb_flush_log_at_trx_commit = 2, our TokuDB setup is:

tokudb_commit_sync = 0
tokudb_fsync_log_period = 1000
  • tokudb_load_save_space

Turned on (value 1) by default as of TokuDB 7.1.0, this parameter determines whether temporary files created during bulk load operations (e.g. ALTER TABLE) are compressed or uncompressed. Do yourself a big favour (why? read here) and keep it on. Our setup is:

tokudb_load_save_space = 1

TokuDB's general recommendation is: don't change the variables; the engine should work well right out of the box. I like that approach (by MySQL 5.5 I had already lost count of the InnoDB variables that can have a noticeable impact; with 5.6 I'm all but lost). The complete list of configuration variables can be found in TokuDB's Users Guide.

  • Hi Shlomi, nice article, thanks. I'm not sure I agree with you on the tokudb_load_save_space variable. I do agree that when you migrate a big table you need to be careful, as the temp files might occupy about 150% of the size of your original table. But if you turn it off, you will disable a very useful feature: bulk load. I love it, specifically when I need to alter a medium-size table or load a csv file.

  • Just one more thing. We had an issue very similar to the one you described in "Converting an OLAP database to TokuDB". Our dataset was 1.7T (uncompressed InnoDB) on a 2T disk, with the largest table at 600G. I started altering from the small tables; that bought me some space, and by the time I reached the biggest table I had enough of it. BTW, our data set shrank to 300G after the migration to TokuDB.

  • Hi Shlomi, this is a great article and very well put.

    Regarding the tokudb_cache_size and tokudb_directio the rule of thumb is that if your compressed data fits in RAM, then you'll have the advantage of the OS buffers when using buffered IO (tokudb_directio = 0). However, when your compressed data doesn't fit in memory, you will have the additional overhead of the compression / decompression while reading and writing the data to and from disk. In this case you should use tokudb_directio = 1 and manually set tokudb_cache_size following the same criteria usually recommended for innodb_buffer_pool_size. In most other cases, you won't notice much difference.

    I hope this helps to understand these 2 variables a little better.

    PS: My last name is spelled with a 'j' instead of a 'g' 😉

  • @Igor,
    Turning on tokudb_load_save_space will not disable the bulk load (well, it didn't for me), so it's a win-win variable AFAIK.

  • @Gerry,
    Thanks. Apologies on the typo, fixed.

  • @Shlomi once again thanks for a great article. I'm going to send it to my manager, so he appreciates the work we did on the migration process.
    One more issue I had while trying to load a big file was this error: "ERROR 1197 (HY000): Multi-statement transaction required more than 'max_binlog_cache_size' bytes of storage; increase this mysqld variable and try again". I fixed it by disabling binary logging for my session: SET sql_log_bin = 0; Did you have this issue, and how did you deal with it? Many thanks.

  • @Igor, well what was your max_binlog_cache_size? What was the size of your transaction?

  • @Shlomi the max_binlog_cache_size was at its default value of 4294963200. I changed it to 62914560000:
    SET GLOBAL max_binlog_cache_size = 60000 * 1024 * 1024;
    The transaction size... how do I measure it? I did a load of a 31G csv file - it failed.
    Then I did another import, of a 5G csv file - it failed as well, with max_binlog_cache_size set to 60G.
    binlog_format is set to 'MIXED'.

  • You tried to LOAD DATA INFILE a 31GB csv file? I guess the mystery is now solved: see, it loads as a single transaction.
    Try splitting the file (e.g. using the unix split command) and loading it in smaller chunks.

    Here's an old yet relevant read:

  • But I'm not too focused on this. If it's TokuDB you're importing into, it's best to ask on the tokudb-user Google group to get an answer from Tokutek.

  • @Shlomi I did it both ways: LOAD DATA and INSERT INTO ... SELECT *; both failed. I didn't want to ALTER in place, as that table was still in use by our customers. I thought you had the same problem altering a big table with your binary log enabled, so why didn't you have this issue like I did? What is your secret? 🙂 I will ask this on the TokuDB Google list, sure.

  • TokuDB default parameters worked surprisingly well for me. I got an error "Too many open files", so I increased the file limit.



  • Igor Shevtsov

    Hi Shlomi, what a nice article! I keep coming back to it.
    Percona included hot TokuDB backup in Percona Server 5.6.
    It works great, but I found one issue I want to share here. If tokudb_commit_sync = 0 and flushing to storage is controlled by tokudb_fsync_log_period, you might have issues on a busy server when connecting to the master. I take a tokudb_backup snapshot on the slave with the replication position recorded, so when the snapshot is restored and TokuDB has recovered, replication fails with errors about missing rows on UPDATE/DELETE. I don't have these issues when tokudb_commit_sync = 1.
