Categories
Tips

MySQL: Are you actually utilizing your indexes?

Reading Time: 2 minutes

This might seem elementary to those of you that are DBAs or something similar, but it was fascinating to find out (not to mention it greatly helped what I had to do), so I decided to post it, in case it helps someone else too. A few moments ago I found out that whereas a query along the lines of…

SELECT title, COUNT(1) as replies
FROM post INNER JOIN thread USING(threadid)
WHERE UNIX_TIMESTAMP(NOW()) - post.dateline < 86400
GROUP BY threadid
ORDER BY replies DESC
LIMIT 5

took a whopping ~10 seconds on a post table of around 2,000,000 rows and a thread table of around 40,000 rows, the following:

SELECT title, COUNT(1) as replies
FROM post INNER JOIN thread USING(threadid)
WHERE post.dateline > UNIX_TIMESTAMP(NOW()) - 86400
GROUP BY threadid
ORDER BY replies DESC
LIMIT 5

took a mere 0.03 seconds!

Probably, MySQL wasn’t able to utilize the B+ tree index of the dateline column in the first query, whereas in the second, things were a bit more obvious to it. This can also be observed by examining the information about the execution plan that EXPLAIN provides:

mysql> explain select t.threadid, t.title, count(1) as replies from vb3_post as p inner join vb3_thread as t using(threadid) where unix_timestamp(now()) - p.dateline < 86400 group by p.threadid order by replies desc limit 5;
+----+-------------+-------+------+---------------+----------+---------+------------+-------+---------------------------------+
| id | select_type | table | type | possible_keys | key      | key_len | re         | rows  | Extra                           |
+----+-------------+-------+------+---------------+----------+---------+------------+-------+---------------------------------+
|  1 | SIMPLE      | t     | ALL  | PRIMARY       | NULL     | NULL    | NULL       | 39859 | Using temporary; Using filesort |
|  1 | SIMPLE      | p     | ref  | threadid      | threadid | 4       | t.threadid |    49 | Using where                     |
+----+-------------+-------+------+---------------+----------+---------+------------+-------+---------------------------------+
2 rows in set (0.01 sec)

mysql> explain select t.threadid, t.title, count(1) as replies from vb3_post as p inner join vb3_thread as t using(threadid) where p.dateline > UNIX_TIMESTAMP(NOW()) - 86400 group by p.threadid order by replies desc limit 5;
+----+-------------+-------+--------+-------------------+----------+---------+------------+------+----------------------------------------------+
| id | select_type | table | type   | possible_keys     | key      | key_len | ref        | rows | Extra                                        |
+----+-------------+-------+--------+-------------------+----------+---------+------------+------+----------------------------------------------+
|  1 | SIMPLE      | p     | range  | threadid,dateline | dateline | 4       | NULL       | 1171 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | t     | eq_ref | PRIMARY           | PRIMARY  | 4       | p.threadid |    1 |                                              |
+----+-------------+-------+--------+-------------------+----------+---------+------------+------+----------------------------------------------+
2 rows in set (0.00 sec)

So, don’t rest assured that MySQL will use your indexes every time it should. It seems that sometimes you have to explicitly point it out.

Categories
Rants Thoughts

9 reasons why I prefer MySQL to MS SQL Server

Reading Time: 2 minutes

In the past, I used MySQL for any of my DBMS needs. It wasn’t really an informed decision based on solid facts, actually I had never really given it any thought. It was what most developers used, it was what vBulletin used (one of the main projects of my company is based on vBulletin), it was what most hosts had pre-installed, in other words, it was the popular choice and I went with the crowd.

Unlike most decisions taken that way, this one turned out to be correct (so far at least). In the university where I study (yeah, I do that too occasionally 😛 ), there is a great and extremely useful class on Database Systems offered in my semester. The only drawback is that it’s done on MS SQL Server. Consequently, I had to work with it quite a lot, and my conclusion was that MySQL is far superior (mostly syntax-wise as I don’t have the deep knowledge required to judge them fairly for other things, so don’t expect a deep analysis about performance or security – as far as I’m concerned, they are equally good at those).