7 ways to convince MySQL to use the right index

Sometimes MySQL gets it wrong. It doesn’t use the right index.

It happens that MySQL generates a query plan which is really bad (EXPLAIN says it’s going to explore some 10,000,000 rows), when another plan (soon to show how was generated) says: “Sure, I can do that with 100 rows using a key”.

A true story

A customer had issues with his database. Queries were taking 15 minutes to complete, and the db in general was not responsive. Looking at the slow query log, I found the criminal query. Allow me to bring you up to speed:

A table is defined like this:

CREATE TABLE t (
  id INT UNSIGNED AUTO_INCREMENT,
  type INT UNSIGNED,
  level TINYINT unsigned,
  ...
  PRIMARY KEY(id),
  KEY `type` (type)
) ENGINE=InnoDB;

The offending query was this:

SELECT id FROM data
WHERE type=12345 AND level > 3
ORDER BY id

The facts were:

  • `t` has about 10,000,000 rows.
  • The index on `type` is selective: about 100 rows per value on average.
  • The query took a long time to complete.
  • EXPLAIN has shown that MySQL uses the PRIMARY KEY, hence searches 10,000,000 rows, filtered “using where”.
  • The other EXPLAIN has shown that by using the `type` key, only 110 rows are expected, to be filtered “using where”, then sorted “using filesort”

So MySQL acknowledged it was generating the wrong plan. The other plan was better by its own standards.

Solving the problem

Let’s walk through 7 ways to solve the problem, starting with the more aggressive solutions, refining to achieve desired behavior through subtle changes. Continue reading » “7 ways to convince MySQL to use the right index”

Useful database analysis queries with INFORMATION_SCHEMA

A set of useful queries on INFORMATION_SCHEMA follows. These queries can be used when approaching a new database, to learn about some of its properties, or they can be regularly used on an existing schema, so as to verify its integrity.

I will present queries for:

Two storage engines; different plans, Part II

In Part I of this article, we have seen how the internal structure of the storage engine’s index can affect an execution plan. We’ve seen that some plans are inherent to the way engines are implemented.

We wish to present a second scenario in which execution plans vary for different storage engines. Again, we will consider MyISAM and InnoDB. Again, we will use the world database for testing. This time, we will see how confident the storage engines are in their index search capabilities.

Many newcomers to databases often believe that an index search is always preferable to full table scan. This is not the case. If I were to look for 10 rows in a 1,000,000 rows table, using an indexed column – I could benefit from an index search. However, if I’m looking for 200,000 rows on that table (that’s 20% of the rows) – an index search can actually be much more expensive than a full table scan. Continue reading » “Two storage engines; different plans, Part II”

Two storage engines; different plans, Part I

A popping question is: “Can an execution plan change for different storage engines?”

The answer is “Yes”. I will present two such cases, where the MySQL optimizer will choose different execution plans, based on our choice of storage engine.

We will consider MyISAM and InnoDB, the two most popular engines. The two differ in many respects, and in particular, the way they implement indexes and statistics: two major players in the optimizer’s point of view. Continue reading » “Two storage engines; different plans, Part I”