Categories
Tips

MySQL: Are you actually utilizing your indexes?

Reading Time: 2 minutes

This might seem elementary to those of you that are DBAs or something similar, but it was fascinating to find out (not to mention it greatly helped what I had to do), so I decided to post it, in case it helps someone else too. A few moments ago I found out that whereas a query along the lines of…

SELECT title, COUNT(1) as replies
FROM post INNER JOIN thread USING(threadid)
WHERE UNIX_TIMESTAMP(NOW()) - post.dateline < 86400
GROUP BY threadid
ORDER BY replies DESC
LIMIT 5

took a whopping ~10 seconds on a post table of around 2,000,000 rows and a thread table of around 40,000 rows, the following:

SELECT title, COUNT(1) as replies
FROM post INNER JOIN thread USING(threadid)
WHERE post.dateline > UNIX_TIMESTAMP(NOW()) - 86400
GROUP BY threadid
ORDER BY replies DESC
LIMIT 5

took a mere 0.03 seconds!

Probably, MySQL wasn’t able to utilize the B+ tree index of the dateline column in the first query, whereas in the second, things were a bit more obvious to it. This can also be observed by examining the information about the execution plan that EXPLAIN provides:

mysql> explain select t.threadid, t.title, count(1) as replies from vb3_post as p inner join vb3_thread as t using(threadid) where unix_timestamp(now()) - p.dateline < 86400 group by p.threadid order by replies desc limit 5;
+----+-------------+-------+------+---------------+----------+---------+------------+-------+---------------------------------+
| id | select_type | table | type | possible_keys | key      | key_len | re         | rows  | Extra                           |
+----+-------------+-------+------+---------------+----------+---------+------------+-------+---------------------------------+
|  1 | SIMPLE      | t     | ALL  | PRIMARY       | NULL     | NULL    | NULL       | 39859 | Using temporary; Using filesort |
|  1 | SIMPLE      | p     | ref  | threadid      | threadid | 4       | t.threadid |    49 | Using where                     |
+----+-------------+-------+------+---------------+----------+---------+------------+-------+---------------------------------+
2 rows in set (0.01 sec)

mysql> explain select t.threadid, t.title, count(1) as replies from vb3_post as p inner join vb3_thread as t using(threadid) where p.dateline > UNIX_TIMESTAMP(NOW()) - 86400 group by p.threadid order by replies desc limit 5;
+----+-------------+-------+--------+-------------------+----------+---------+------------+------+----------------------------------------------+
| id | select_type | table | type   | possible_keys     | key      | key_len | ref        | rows | Extra                                        |
+----+-------------+-------+--------+-------------------+----------+---------+------------+------+----------------------------------------------+
|  1 | SIMPLE      | p     | range  | threadid,dateline | dateline | 4       | NULL       | 1171 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | t     | eq_ref | PRIMARY           | PRIMARY  | 4       | p.threadid |    1 |                                              |
+----+-------------+-------+--------+-------------------+----------+---------+------------+------+----------------------------------------------+
2 rows in set (0.00 sec)

So, don’t rest assured that MySQL will use your indexes every time it should. It seems that sometimes you have to explicitly point it out.