7 ways to convince MySQL to use the right index

Sometimes MySQL gets it wrong. It doesn’t use the right index.

It happens that MySQL generates a query plan which is really bad (EXPLAIN says it’s going to explore some 10,000,000 rows), when another plan (soon to show how was generated) says: “Sure, I can do that with 100 rows using a key”.

A true story

A customer had issues with his database. Queries were taking 15 minutes to complete, and the db in general was not responsive. Looking at the slow query log, I found the criminal query. Allow me to bring you up to speed:

A table is defined like this:

CREATE TABLE t (
  id INT UNSIGNED AUTO_INCREMENT,
  type INT UNSIGNED,
  level TINYINT unsigned,
  ...
  PRIMARY KEY(id),
  KEY `type` (type)
) ENGINE=InnoDB;

The offending query was this:

SELECT id FROM data
WHERE type=12345 AND level > 3
ORDER BY id

The facts were:

  • `t` has about 10,000,000 rows.
  • The index on `type` is selective: about 100 rows per value on average.
  • The query took a long time to complete.
  • EXPLAIN has shown that MySQL uses the PRIMARY KEY, hence searches 10,000,000 rows, filtered “using where”.
  • The other EXPLAIN has shown that by using the `type` key, only 110 rows are expected, to be filtered “using where”, then sorted “using filesort”

So MySQL acknowledged it was generating the wrong plan. The other plan was better by its own standards.

Solving the problem

Let’s walk through 7 ways to solve the problem, starting with the more aggressive solutions, refining to achieve desired behavior through subtle changes.

Solution #1: OPTIMIZE

If MySQL got it wrong, it may be because the table was frequently changed. This affects the statistics. If we can spare the time (table is locked during that time), we could help out by rebuilding the table.

Solution #2: ANALYZE

ANALYZE TABLE is less time consuming, in particular on InnoDB, where it is barely noticed. An ANALYZE will update the index statistics and help out in generating better query plans.

But hold on, the above two solutions are fine, but in the given case, MySQL already acknowledges better plans are at hand. The fact was I tried to run ANALYZE a few times, to no avail.

Solution #3: USE INDEX

Since the issue was urgent, my first thought went for the ultimate weapon:

SELECT id FROM data USE INDEX(type)
WHERE type=12345 AND level > 3
ORDER BY id

This instructs MySQL to only consider the indexes listed; in our example, I only want MySQL to consider using the `type` index. It is using this method that generated the other (good) EXPLAIN result. I could have gone even more ruthless and ask for FORCE INDEX.

Solution #4: IGNORE INDEX

A similar approach would be to explicitly negate the use of the PRIMARY KEY, like this:

SELECT id FROM data IGNORE INDEX(PRIMARY)
WHERE type=12345 AND level > 3
ORDER BY id

A moment of thinking

The above solutions are “ugly”, in the sense that this is not standard SQL. It’s too MySQL specific.

I’ve asked the programmers to do a quick rewrite, and had a few moments to consider: why did MySQL insist on using the PRIMARY KEY. Was it because I’ve asked it for the `id` column only? I rewrote as follows:

SELECT id, type, level FROM data
WHERE type=12345 AND level > 3
ORDER BY id

Nope. EXPLAIN got me the same bad plan. Then it must be the ORDER BY clause:

SELECT id FROM data
WHERE type=12345 AND level > 3

Sure enough, EXPLAIN now  indicates using the `type` index, only reading 110 rows. So MySQL preferred to scan 10,000,000 rows, just so that the rows are generated in the right ORDER, and so no sorting is required, when it could have read 110 rows (where each row is a mere INT) and sort them in no time.

Armed with this knowledge, a few more options come at hand.

Solution #5:Move some logic to the application

At about that point I got a message that the programmers were unable to add the USE INDEX part. Why? They were using the EJB framework, which limits your SQL-like queries to something very generic. Well, you can always drop the ORDER BY part and sort on the application side. That isn’t fun, but it’s been done.

Solution #6: Negate use of PRIMARY KEY

Can we force MySQL to use the `type` index, retain the ORDER BY, and do it all with standard SQL? Sure. The following query does this:

SELECT id, type, level FROM data
WHERE type=12345 AND level > 3
ORDER BY id+0

id+0 is a function on the `id` column. This makes MySQL unable to utilize the PRIMARY KEY (or any other index on `id`, had there been one).

In his book “SQL Tuning“, Dan Tow dedicates a chapter on hints and tips like the above. He shows how to control the use or non-use of indexes, the order by which subqueries are calculated, and more.

Unfortunately, the EJB specification said this was not allowed. You could not ORDER BY a fucntion. Only on normal column.

Solution #7: Make MySQL think the problem is harder than it really is

Almost out of options. Just a moment before settling for sorting on the application side, another issue can be considered: since MySQL was fooled once, can it be fooled again to make things right? Can we fool it to believe that the PRIMARY KEY would not be worthwhile to use? The following query does this:

SELECT id, type, level FROM data
WHERE type=12345 AND level > 3
ORDER BY id, type, level

Let’s reflect on this one. What is the order by which the rows are returned now? Answer: exactly as before. Since `id` is PRIMARY KEY, it is also UNIQUE, so no two `id` values are the same. Therefore, the secondary sorting column is redudant, and so is the following one. We get exactly the same result as “ORDER BY id”.

But MySQL didn’t catch this. This query caused MySQL to say: “Mmmmm. ‘ORDER BY id, type, level’ is not doable with the PRIMARY KEY only. Well, in this case, I had better used the `type` index”. Is this a weakness of MySQL? I guess so. Maybe it will be fixed in the future. But this was the fix that made the day.

39 thoughts on “7 ways to convince MySQL to use the right index

  1. Excellent analysis and still relevant with latest stable version of MySQL. Had similar issue with a new client. Sure enough, ordering by the primary key only was resulting in perfectly good indices not being used where they should be. Came across this following a similar process to the above.

  2. Faced this issue where mysql uses the wrong index and resulting with a simple query executed in 5 seconds.

    I am using php laravel5 framework whose database library (Eloquent ORM) and query builder does not allow to specify use index/force index in the queries. But it allows to write raw order by clauses so i just added order by id + 0 as in SOLUTION #6 to prevent it using the wrong index. Now my query executes in 0.002 seconds.

    Thanks for this great analysis and clever solution coming from 2009!

  3. Facing issue Facing slow query on master even same query is running more faster on slave . need your inputs to resolve this issue .

    Key Points :
    Mysql version 5.6.33 on the both ( master & slave ) version are same .
    cross validated show status o/p is same on both server .
    But getting different execution plan on master & slave .
    Slave is not using sorting & temporary while master is using for same query .
    schema & index are same on both servers even also validated ordered .
    Already run on three table analyse & optimise on master server but still planner is different than slave .

    Query : explain extended select this_.solution_type as y0_, this_.lead_id as y1_, this_.created_at as y2_, this_.cust_id as y3_, this_.mobile_number as y4_, node2_.sub_stage as y5_, node2_.stage as y6_ from user_business_mapping this_ inner join workflow_status wfstatus1_ on this_.id=wfstatus1_.user_business_mapping_id inner join workflow_node node2_ on wfstatus1_.workflow_node_id=node2_.id where this_.created_at>=’2018-01-19′ and this_.created_at<='2018-01-25' and this_.solution_type='bc_merchant' and wfstatus1_.is_active=1 order by this_.id asc limit 5

    Master server Execution Plan :

    *************************** 1. row ***************************
    id: 1
    select_type: SIMPLE
    table: node2_
    type: index
    possible_keys: PRIMARY
    key: idx_wfn_stage_substage_is_active
    key_len: 305
    ref: NULL
    rows: 43
    filtered: 100.00
    Extra: Using index; Using temporary; Using filesort
    *************************** 2. row ***************************
    id: 1
    select_type: SIMPLE
    table: wfstatus1_
    type: ref
    possible_keys: FK_ws_rbsm,FK_wfs_wf_nodeId_idx
    key: FK_wfs_wf_nodeId_idx
    key_len: 9
    ref: onboarding_engine.node2_.id
    rows: 927
    filtered: 100.00
    Extra: Using where
    *************************** 3. row ***************************
    id: 1
    select_type: SIMPLE
    table: this_
    type: eq_ref
    possible_keys: PRIMARY,created_at_idx
    key: PRIMARY
    key_len: 8
    ref: onboarding_engine.wfstatus1_.user_business_mapping_id
    rows: 1
    filtered: 100.00
    Extra: Using where
    3 rows in set, 1 warning (0.00 sec)

    Slave Server Execution plan :

    *************************** 1. row ***************************
    id: 1
    select_type: SIMPLE
    table: this_
    type: index
    possible_keys: PRIMARY,created_at_idx
    key: PRIMARY
    key_len: 8
    ref: NULL
    rows: 5
    filtered: 4705840.00
    Extra: Using where
    *************************** 2. row ***************************
    id: 1
    select_type: SIMPLE
    table: wfstatus1_
    type: ref
    possible_keys: FK_ws_rbsm,FK_wfs_wf_nodeId_idx
    key: FK_ws_rbsm
    key_len: 8
    ref: onboarding_engine.this_.id
    rows: 4
    filtered: 100.00
    Extra: Using where
    *************************** 3. row ***************************
    id: 1
    select_type: SIMPLE
    table: node2_
    type: eq_ref
    possible_keys: PRIMARY
    key: PRIMARY
    key_len: 8
    ref: onboarding_engine.wfstatus1_.workflow_node_id
    rows: 1
    filtered: 100.00
    Extra: NULL
    3 rows in set, 1 warning (0.00 sec)

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.