Covering indices

Warning: This blogpost has been posted over two years ago. That is a long time in development-world! The story here may not be relevant, complete or secure. Code might not be complete or obsoleted, and even my current vision might have (completely) changed on the subject. So please do read further, but use it with caution.
Posted on 22 Dec 2009
Tagged with: [ covering ]  [ index ]  [ MySQL

It’s almost to easy to use a SELECT * FROM query in your code. First of all, you instantly get all the fields from your database so you don’t have to worry about changing your queries when you decide to use other fields (in case you don’t use a DAL). However, there are some drawbacks on a SELECT * method,.. the most famous one: it takes more time to retrieve all fields instead of the fields you actually use.. but that’s NOT the most important reason why you should not SELECT * FROM queries..

The main reason? Covering indices…

What is it?

As you all know a database is primarily a fast resource because it uses indexes (or indices depending on what your favorite plural name is). It’s a small part of the system where it can find other data just like the index in the back of a book. You know the term you are looking for, you browse through index, find the page number and go directly to that page. This off course is much faster than browsing the book from page 1 on and checking every page if your term you are looking for is on the page. More or less, a database does the same thing if you have indices on your tables.

However, sometimes the database will have enough information available from the indices already. In those cases it can skip the part of looking up the pages (the main data) itself and just return the information from the index.. A lot of time is saved and thus both your database and code will be faster.. That’s basically the whole deal with covering indices: the index will cover everything you need to know..

How to check

When you EXPLAIN your queries take a look in the Extra field. When it says Using index it means your query will use the covering index and doesn’t fetch the data blocks.

So, you might ask the question: why not always add all fields to my indices? Basically it’s the same reason you don’t add indices to all your fields: it will decrease performance since your index tables will be too large. Also be careful for which queries you want to place covering indices on. A query that is run once every hour is not a very good candidate. A query that will run a few times every page view for instance probably is..

But as always: when you are modifying indices make sure your data sets are big enough. Cardinality and selectivity are very important issues and MySQL could behave very differently between a 10-record data set and a 10.000 record set. Always check your queries at live data to make sure things are used the way you want…