Performance Optimization and Indexing in Rails

What is indexing?

Database queries can be slow and make your application painful to use. One way to speed things up is to use indexing.

Indexing is the process of building a data structure from a table column in order to search through that table more efficiently. The type of data structure that gets built depends on what type of database you’re using. Different databases have different options. In PostgreSQL there are 4 options: binary tree, hash, generalized search tree, or generalized inverted index. If you don’t specify a specific type, PostgreSQL defaults to binary tree.

If a column on a table is indexed, and you search a table based on that column (eg. Post.where(title: “My Post”) – this searches the Post table based on the title column), the database query will traverse the data structure set up by indexing to find what it’s looking for.

Using data structures like a binary tree to search increases the speed of that search immensely. Without it, every single row in a table would have to be looked at. With it, we can greatly reduce the number of table entries that need to be looked at. You can learn more about why binary tree’s are faster here.

How do I index a column?

Assuming you have your database set up with at least one table, indexing a column isn’t too complicated.

First, you need to create a rails migration:

$ rails g migration AddIndexToTable

Then open the migration, and use Rails’s add_index method to create the index. The first parameter is the name of the table you want to add the index to. The second parameter is the name of the column you want to index:

def change
  add_index :posts, :title
end

Run the migration, and your database queries should be much faster!

 

Should I index all of my columns?

No. While indexing speeds up reading from a database, it also slows down writing to a database. This is because every time it has to add something to a table, it now also has to update the data structure that was made for each indexed column. If your application does a lot more reading than writing, then this trade off is usually a good one, but you should only index columns that you actually query on. If you never need to search for a post based on its title, then it doesn’t make sense to index the title column of the Post table.

Generally, join tables and foreign keys should be indexed. Primary ids are indexed automatically in Rails.

Additional reading/sources: PostgreSQL docs, Rake Routes Blog, Turing’s Performance Dojo Lesson

 

Leave a comment