On compiling TokuDB from source

Sharing my experience of compiling TokuDB + MariaDB 5.5. Why? Because I must have this patch to Sphinx 2.0.4.

Note: I was using what seems to be the “old” method of compiling; quoting Leif Walsh:

… We are looking at deprecating that method of building (MariaDB source plus binary fractal tree handlerton).  It only really needed to be that complex when we were closed source.

I also tried the “new” method of compiling, which I couldn’t work out.

Here’s how it goes: TokuDB is newly released as open source. As such, it got a lot of attention, many downloads and I hope it will succeed.

However as stable as the product may be, it’s new to open source, which means anyone compiling it from source is an early adopter (at least for the compilation process).

Installation process

This is an unorthodox, frankly weird, process. See section 6 in the Tokutek docs. In order to compile the project you must download:

  • The source code tar.gz
  • And the binary (?!) tar.gz
  • And the binary checksum
  • And the Tokutek patches
  • And the patches checksum

You extract the source tarball. But instead of the standard “./configure && make && sudo make install”, you need to copy a shell script called tokudb.build.bash one directory level up and run it from there.

tokudb.build.bash lists gcc47 and g++47 on lines 3 and 4. Modify “gcc47” to “gcc” and “g++47” to “g++”. I’m assuming you don’t have a binary called gcc47. Why would you?
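Put together, the dance looks roughly like this. A sketch only: the tarball name is a made-up example (use your actual download), and the script’s location within the source tree may differ:

```shell
# Extract the source tarball (exact filename depends on your download)
tar xzf mariadb-5.5.30-tokudb.tar.gz    # hypothetical name
cd mariadb-5.5.30-tokudb

# Copy the build script one directory level up, as the docs require,
# and run it from there
cp tokudb.build.bash ..
cd ..

# Replace the hardcoded gcc47/g++47 with plain gcc/g++
sed -i 's/gcc47/gcc/; s/g++47/g++/' tokudb.build.bash

bash tokudb.build.bash
```

The sed line is the scripted equivalent of the hand edit above; in basic regular expressions the `+` is literal, so `g++47` needs no escaping.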

Dependencies

You will need CMake >= 2.8

This means Ubuntu LTS 10.04 users are unable to compile out of the box; they will need to manually install a later version of CMake.

Also needed are zlib1g-dev and rpmbuild.
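On Debian/Ubuntu you can shorten the recompile round trips by installing everything up front. A sketch, with package names as I understand them (rpmbuild ships in the `rpm` package there), plus a quick version check so a too-old CMake doesn’t waste a long compile:

```shell
# Packages the build needs but never checks for up front
sudo apt-get install build-essential cmake zlib1g-dev libssl-dev rpm

# CMake >= 2.8 is required; Ubuntu 10.04's repo version is older, so verify
required=2.8
have=$(cmake --version | head -1 | awk '{print $3}')
lowest=$(printf '%s\n%s\n' "$required" "$have" | sort -V | head -1)
if [ "$lowest" = "$required" ]; then
    echo "CMake $have is new enough"
else
    echo "CMake $have is too old; install >= $required manually"
fi
```

The `sort -V` trick compares dotted version strings numerically, so `2.8.12` correctly beats `2.8`.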

While compiling

I ran out of disk space. What? I was using a 10GB partition I keep for my compilations. Looking at the disk usage I see that:

  • The source tarball is extracted (I did it)
  • The binary tarball is also extracted (someone has to explain this to me)
  • And inside the source directory we have:
bash$ du -sm *
...
1484    build.RelWithDebInfo.rpms
5540    build.RelWithDebInfo

That’s about 7GB (and counting) of build… stuff?

UPDATE: I just ran out of disk space again. Is this an incremental thing? Like, every time my compilation fails and I recompile, some files are not cleaned up? If so, put them in /tmp! OK, moving everything to a 300GB partition and starting all over.

More while compiling

I got errors on missing libraries: libssl and rpmbuild, for example. This is what the “configure” script is for, to test for dependencies. It’s really a bummer to have to recompile 4-5 times (and it’s a long compilation), only to find out there’s yet another missing package.

After compiling

What is the result of the compilation? Not a “make install”-ready binary. The result is a MySQL binary package. We need to extract it and put it in /usr/local/somewhere etc.
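So the last mile mimics installing a generic MySQL binary distribution by hand. A sketch under assumptions: the tarball name is invented (use whatever your build actually produced), and `/usr/local/mysql` is just the common convention from the MySQL manual:

```shell
# The build output is a binary-distribution tarball, not an installed tree;
# unpack it where you want it to live
sudo tar xzf mariadb-5.5.30-tokudb-linux-x86_64.tar.gz -C /usr/local   # hypothetical name

# Symlink so the path survives upgrades, per the usual MySQL convention
sudo ln -s /usr/local/mariadb-5.5.30-tokudb-linux-x86_64 /usr/local/mysql

# Then the standard binary-install steps from the MySQL manual
cd /usr/local/mysql
sudo scripts/mysql_install_db --user=mysql
```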

Conclusions

The compilation process is unexpected and non-standard. The output is unexpected.

The correct way of doing this is a “./configure && make && sudo make install”. I don’t understand the need for a binary package while compiling from source. Isn’t this the chicken and the egg?

A source distribution is no different from a binary distribution in this respect: you must have a testing environment to verify that the source distribution actually works. This test environment is typically a bare, newly-installed RedHat or Ubuntu etc. The machines at Tokutek already have the needed packages installed. Not so on my compilation machine. I suggest adding the apt-get and yum installs for dependencies to the source distribution testing. This is the only reliable way for you guys at Tokutek to know that clients will actually be able to install from source.

14 thoughts on “On compiling TokuDB from source”

  1. Hrm… So my comment on mysqlperformanceblog was about me trying the normal “./configure && make” from the Percona build tarball.
    I haven’t tackled the error you’re describing with the bash script. However, I see you are using the “new method” of building the source, whereas I was using the “old method”; haven’t tried the new one as yet.

  2. @shlomi I have read the article “Benchmarking Percona Server TokuDB vs InnoDB” on mysqlperformanceblog and decided not to waste my time playing with the TokuDB installation, deciding instead to remain on Percona Server 🙂

  3. This is two years after the actual article date and TokuDB still has the same compilation issues. I thought of running benchmarks on different index types and picked TokuDB too. Unfortunately, the compilation and installation need too much effort, and I gave up after spending a few hours on it.
