MySQL Conference 2009 Community Awards

OK. That was a surprise!

In retrospect, there have been some hints along the way. But I don’t get hints. I’m the kind of man who, when watching a complicated movie, needs his girlfriend to explain to him what is going on.

I was utterly astonished and honored to find my name on the screen, and to be one of three people called to accept the MySQL Community Award for 2009.

Let me tell you: it is heavy! And it doesn’t fit in my bag, either, so I hang around carrying this big heavy box in both hands…

I guess this calls for a short Oscar speech, a written one.

Continue reading » “MySQL Conference 2009 Community Awards”

MySQL Conference 2009 daily summary: Monday

[See http://forge.mysql.com/wiki/MySQLConf2009MondayNotes]

Monday: day of tutorials. Plenty of interesting tutorials on the Conference itself, plus a session with Mark Callaghan – it was hard to choose. I settled for two tutorials, which turned out to be three.

Practical MySQL Plugin Development: As a C/C++/Java developer, I am very interested in the plugin API. I have used UDFs before; they turned out to be extremely helpful and saved me a lot of headaches. With the new plugin API I was expecting to learn how to properly write INFORMATION_SCHEMA tables, functions and engines.

Wasn’t it possible to learn all this on the web? Sure, but this presentation was delivered by Roland Bouman and Sergei Golubchik, and I was anxious to hear from their experience. Well, that’s what the conference is all about, isn’t it?

The session was very good. Roland & Sergei covered the basics of the Plugin API, the general ideas, then went on to present the specific implementations: daemon plugins, INFORMATION_SCHEMA, FULLTEXT. The session was accompanied by convincing and enlightening examples. For example, a QUERY_CACHE_TABLE: an INFORMATION_SCHEMA table which lists which queries are currently in the query cache, along with the number of used blocks etc.

Continue reading » “MySQL Conference 2009 daily summary: Monday”

A note on Baron’s command line tip for comparing result sets

A while ago Baron Schwartz published a MySQL command-line tip: compare result sets.

A “SELECT * FROM world”, for example, can be checksummed and compared with another checksum made on a replica, or on another table which is supposed to contain the exact same data.

I just wanted to note that if you’re dealing with a MyISAM table, a simple “SELECT * FROM” will not necessarily be too useful, since MyISAM stores rows in no particular order: two different settings of concurrent_insert, or perhaps an OPTIMIZEd table, can make for different ordering, hence different checksums.

Use of “ORDER BY …” is required if you want to have a consistent checksum. With MyISAM, you don’t usually want to count on natural row ordering, in any case.
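To illustrate why the ordering matters, here is a minimal Python sketch. It is not Baron’s command line: the `checksum` helper and the sample rows are hypothetical, and it merely mimics piping tab-separated query output through an MD5 checksum.

```python
import hashlib

def checksum(rows):
    """MD5 over the rows in the order given, like piping query output to md5sum."""
    digest = hashlib.md5()
    for row in rows:
        line = "\t".join(str(col) for col in row) + "\n"
        digest.update(line.encode("utf-8"))
    return digest.hexdigest()

# Hypothetical sample: identical data, but returned in a different physical order.
original_rows = [(1, "Kabul"), (2, "Qandahar"), (3, "Herat")]
replica_rows = [(3, "Herat"), (1, "Kabul"), (2, "Qandahar")]

# Without ORDER BY: the checksums differ although the data is the same.
unordered_match = checksum(original_rows) == checksum(replica_rows)

# With ORDER BY (simulated here by sorting): the checksums agree.
ordered_match = checksum(sorted(original_rows)) == checksum(sorted(replica_rows))

print(unordered_match, ordered_match)
```

The checksum is computed over a byte stream, so any change in row order changes the result even when the set of rows is identical.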

“Why?” of the week

As progress on oak-online-alter-table goes on, I’m encountering more and more limitations, for which I must find workarounds. Here are two:

CREATE TABLE … LIKE …

It works well, but it doesn’t copy any foreign key constraints. So, if the original table is this:

CREATE TABLE `dept_emp` (
  `emp_no` int(11) NOT NULL,
  `dept_no` char(4) NOT NULL,
  `from_date` date NOT NULL,
  `to_date` date NOT NULL,
  PRIMARY KEY  (`emp_no`,`dept_no`),
  KEY `emp_no` (`emp_no`),
  KEY `dept_no` (`dept_no`),
  CONSTRAINT `dept_emp_ibfk_1` FOREIGN KEY (`emp_no`) REFERENCES `employees` (`emp_no`) ON DELETE CASCADE,
  CONSTRAINT `dept_emp_ibfk_2` FOREIGN KEY (`dept_no`) REFERENCES `departments` (`dept_no`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=latin1

Then CREATE TABLE dept_emp_shadow LIKE dept_emp results in:

Continue reading » ““Why?” of the week”

The depth of an index: primer

InnoDB and MyISAM use B+ and B trees for indexes (InnoDB also has an internal hash index).

In both these structures, the depth of the index is an important factor. When looking for an indexed row, a search is made on the index, from root to leaves.

Assuming the index is not in memory, the depth of the index represents the minimal cost (in I/O operations) for an index-based lookup. Of course, most of the time we expect large portions of the indexes to be cached in memory. Even so, the depth of the index is an important factor. The deeper the index is, the worse it performs: there are simply more lookups on index nodes.

What affects the depth of an index?

There are quite a few structural issues, but it boils down to two important factors:

  1. The number of rows in the table: obviously, more rows lead to a larger index, and larger indexes grow in depth.
  2. The size of the indexed column(s). An index on an INT column can be expected to be shallower than an index on a CHAR(32) column (on a very small number of rows they may have the same depth, so we’ll assume a large number of rows).
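
The two factors above can be put into a back-of-the-envelope estimate. The following Python sketch is a rough model only: it assumes a 16KB index page, a hypothetical 6 bytes of pointer overhead per entry, and completely full pages, and it ignores page headers and the real on-disk formats of either engine.

```python
def estimated_depth(row_count, key_bytes, page_bytes=16 * 1024, pointer_bytes=6):
    """Estimate the number of levels in a B-tree-like index.

    Assumptions (not taken from any engine's real format): pages are full,
    and each entry costs key_bytes plus a fixed pointer_bytes.
    """
    fanout = page_bytes // (key_bytes + pointer_bytes)
    depth = 1
    capacity = fanout  # entries reachable with the current number of levels
    while capacity < row_count:
        capacity *= fanout
        depth += 1
    return depth

# Small table: both indexes fit in a single level.
small_int = estimated_depth(100, key_bytes=4)
small_char = estimated_depth(100, key_bytes=32)

# Large table: the wider key has a smaller fanout, hence a deeper index.
large_int = estimated_depth(100_000_000, key_bytes=4)
large_char = estimated_depth(100_000_000, key_bytes=32)

print(small_int, small_char, large_int, large_char)
```

Under these assumptions the INT index holds far more entries per page than the CHAR(32) index, so it needs fewer levels to cover the same number of rows.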

Continue reading » “The depth of an index: primer”

7 ways to convince MySQL to use the right index

Sometimes MySQL gets it wrong. It doesn’t use the right index.

It happens that MySQL generates a query plan which is really bad (EXPLAIN says it’s going to explore some 10,000,000 rows), when another plan (I’ll soon show how it was generated) says: “Sure, I can do that with 100 rows using a key”.

A true story

A customer had issues with his database. Queries were taking 15 minutes to complete, and the db in general was not responsive. Looking at the slow query log, I found the criminal query. Allow me to bring you up to speed:

A table is defined like this:

CREATE TABLE t (
  id INT UNSIGNED AUTO_INCREMENT,
  type INT UNSIGNED,
  level TINYINT UNSIGNED,
  ...
  PRIMARY KEY(id),
  KEY `type` (type)
) ENGINE=InnoDB;

The offending query was this:

SELECT id FROM t
WHERE type=12345 AND level > 3
ORDER BY id

The facts were:

  • `t` has about 10,000,000 rows.
  • The index on `type` is selective: about 100 rows per value on average.
  • The query took a long time to complete.
  • EXPLAIN showed that MySQL uses the PRIMARY KEY, hence searches 10,000,000 rows, filtered “using where”.
  • Another EXPLAIN showed that by using the `type` key, only 110 rows are expected, to be filtered “using where”, then sorted “using filesort”.

So MySQL acknowledged it was generating the wrong plan. The other plan was better by its own standards.
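
To get a feel for the difference between the two plans, here is a toy simulation in Python. It is not how MySQL executes anything: the table is scaled down to 100,000 in-memory rows, the data distribution is made up, and a plain dictionary stands in for the index on `type`. It only counts how many rows each strategy has to examine.

```python
from collections import defaultdict

ROWS = 100_000   # scaled-down stand-in for the 10,000,000-row table
TYPES = 1_000    # gives about 100 rows per type value

# (id, type, level) tuples, generated in id order like a PRIMARY KEY scan.
table = [(i, i % TYPES, (i // TYPES) % 10) for i in range(1, ROWS + 1)]

# Stand-in for the secondary index: type value -> rows having that type.
type_index = defaultdict(list)
for row in table:
    type_index[row[1]].append(row)

def plan_primary_key(wanted_type, min_level):
    """Walk every row in id order and filter ("using where")."""
    examined = 0
    result = []
    for row_id, row_type, level in table:
        examined += 1
        if row_type == wanted_type and level > min_level:
            result.append(row_id)
    return result, examined

def plan_type_index(wanted_type, min_level):
    """Fetch candidates via the index, filter, then sort ("using filesort")."""
    candidates = type_index[wanted_type]
    examined = len(candidates)
    result = sorted(row_id for row_id, _, level in candidates if level > min_level)
    return result, examined

pk_result, pk_examined = plan_primary_key(345, 3)
idx_result, idx_examined = plan_type_index(345, 3)
print(pk_examined, idx_examined, pk_result == idx_result)
```

Both strategies return the same ids in the same order; the second one pays for a small sort but examines only the rows that share the requested type.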

Solving the problem

Let’s walk through 7 ways to solve the problem, starting with the more aggressive solutions, refining to achieve desired behavior through subtle changes.

Continue reading » “7 ways to convince MySQL to use the right index”

Online ALTER TABLE now available in openark kit

A new utility in openark kit allows for online ALTER TABLE operation. That is, the modification of table structure without locking down the entire table for the duration of the operation. The oak-online-alter-table utility works under the following restrictions:

  • The table has at least one single-column UNIQUE KEY [*]
  • Altered table shares a single-column UNIQUE KEY with the original table [*]
  • No ‘AFTER’ triggers are defined on the table (the utility creates its own triggers for the duration of the operation)
  • The table has no FOREIGN KEYs [*][#]
  • Table name is no longer than 57 characters

[*]: Restriction is scheduled to be removed or partly removed.

[#]: ‘Child-side’ foreign keys may actually work, but have not been tested.

What follows is a mini FAQ which attempts to introduce the utility.

So what exactly does this utility provide?

  • First and foremost, the ability to perform a non-blocking ALTER TABLE. This has long been an issue with MySQL, and complex Master-Master, application-aware solutions are currently required in order to perform an ALTER TABLE with minimal downtime. The utility offers a no-downtime solution, although there is a performance penalty for the duration of its runtime, and some requirements to meet.
  • It also supports a ‘null’ ALTER. That is, an ALTER TABLE which does not change anything. This effectively means rebuilding the table. For InnoDB tables with innodb_file_per_table, for example, this could be the means of regaining disk space after removing many rows from the table. Also, while it does not strictly act like OPTIMIZE TABLE, running this utility should build a better organized table on disk (this is as yet unverified).
  • Another thing this utility supports is the building of a ghost table: a duplicate of a given table, which keeps mirroring the original table via triggers. [May be removed in future versions]

Continue reading » “Online ALTER TABLE now available in openark kit”

LOCK TABLES in MyISAM is NOT a poor man’s transactions substitute

I get to hear that a lot: that LOCK TABLES with MyISAM is some sort of replacement for transactions; some model we can work with which gives us ‘transactional flavor’.

It isn’t, and here’s why.

When we speak of a transactional database/engine, we check out its ACID compliance. Let’s break out the ACID and see what LOCK TABLES provides us with:

  • A: Atomicity. MyISAM does not provide atomicity. If we have LOCK TABLES followed by two statements, then closed by UNLOCK TABLES, then it follows that a crash between the two statements will have the first one applied, the second one not applied. No mechanism ensures an “all or nothing” behavior.
  • C: Consistency. An error in a statement would roll back the entire transaction in a transactional database. This won’t work on MyISAM: every statement is “committed” immediately.
  • I: Isolation. Without LOCK TABLES, working with MyISAM resembles using the read uncommitted, or dirty read isolation level. With LOCK TABLES – it depends. If you were to use LOCK TABLES … WRITE on all tables in all statements, you would get the serializable isolation level. Actually it would be more than serializable. It would be truly serial.
  • D: Durability. Did the INSERT succeed? And did the power go down just after? MyISAM provides no guarantees that the data will be there.
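
The “all or nothing” point can be demonstrated with a small Python sketch. Note the assumptions: it uses an in-memory SQLite database, not MySQL; SQLite in autocommit mode merely stands in for the “every statement is committed immediately” behavior, and an exception stands in for a failure between the two statements. The `accounts` table and the helper functions are hypothetical.

```python
import sqlite3

def transfer(conn, fail_midway):
    """Move 100 from account 1 to account 2 using two separate statements."""
    conn.execute("UPDATE accounts SET balance = balance - 100 WHERE id = 1")
    if fail_midway:
        raise RuntimeError("simulated failure between the two statements")
    conn.execute("UPDATE accounts SET balance = balance + 100 WHERE id = 2")

def balances_after_failure(transactional):
    # isolation_level=None: autocommit mode, each statement is committed at once
    # unless we explicitly open a transaction with BEGIN.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO accounts VALUES (1, 500), (2, 500)")
    try:
        if transactional:
            conn.execute("BEGIN")
        transfer(conn, fail_midway=True)
        if transactional:
            conn.execute("COMMIT")
    except RuntimeError:
        if transactional:
            conn.execute("ROLLBACK")  # undoes the first UPDATE as well
    rows = conn.execute("SELECT balance FROM accounts ORDER BY id").fetchall()
    conn.close()
    return [balance for (balance,) in rows]

with_transaction = balances_after_failure(transactional=True)
without_transaction = balances_after_failure(transactional=False)
print(with_transaction, without_transaction)
```

With a transaction, the failed transfer leaves both balances untouched. Without one, the first statement stays applied and the money is simply gone, which is the situation LOCK TABLES cannot protect against.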

Continue reading » “LOCK TABLES in MyISAM is NOT a poor man’s transactions substitute”

MySQL User Group Meetings in Israel

This is a short note that the MySQL User Group Meetings in Israel are established (well, re-established after a very long period).

Thanks to Eddy Resnick from Sun Microsystems Israel who has set up the meetings. So far, we’ve had 2 successful meetings, and we intend to have more! First one was in Sun’s offices in Herzlia; second one, held last week, was at Interbit (a MySQL training center) in Ramat Gan. We hope to hold these meetings on a monthly basis, and the next ones are expected to be held at Interbit.

A new (blessed) law in Israel forbids us from sending invitations for these meetings via email without prior consent of the recipient (this law was passed as a means of stopping spam). We do realize there are many users out there who would be interested in these meetings. For those users: please stay tuned to Interbit’s website, where future meetings will be published – or just give them a call!

It was my honor to present a short session, one of three in this last meeting. Other presenters were Erad Deutch, who presented “MySQL Success Stories”, and Moshe Kaplan, who presented “Sharding Solutions”. I presented “MyISAM & InnoDB Tuning Fundamentals”, where I laid down the basics behind parameter tuning for these storage engines.

As per audience request, here’s the presentation in PDF format:

I intend to give sessions in future meetings, and have already started working on my next one. So please come, it’s a fun way to pass a nice afternoon. See you there!