The Many Problems with SQL Server’s Index Recommendations

These days, I jump back & forth a lot between SQL Server and Postgres. (We use AWS Aurora Postgres to store SQL ConstantCare®’s data.) Whenever I come back to the sweet, sweet graphical execution plans in SQL Server Management Studio, I breathe a sigh of relief. Dang, these things are so much easier for me to interpret. It’s like coming home.

Graphical plans include missing index recommendations, telling you how you should store your data for faster queries.

And if you don’t have time to review one query at a time, SQL Server makes wide-ranging analysis easy too, letting you query dynamic management views like sys.dm_db_missing_index_details to get index recommendations for the entire database. You can even use tools like sp_BlitzIndex to analyze and report on ’em.
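
If you want to eyeball that raw data, a query along these lines is the usual starting point. This is a minimal sketch – the DMV and column names are real, but the ORDER BY math is just a popular convention for prioritizing recommendations, not an official formula:

    SELECT d.statement AS table_name,
           d.equality_columns, d.inequality_columns, d.included_columns,
           gs.user_seeks, gs.avg_total_user_cost, gs.avg_user_impact,
           gs.last_user_seek
    FROM sys.dm_db_missing_index_details AS d
    INNER JOIN sys.dm_db_missing_index_groups AS g
        ON g.index_handle = d.index_handle
    INNER JOIN sys.dm_db_missing_index_group_stats AS gs
        ON gs.group_handle = g.index_group_handle
    WHERE d.database_id = DB_ID()
    ORDER BY gs.user_seeks * gs.avg_total_user_cost
             * (gs.avg_user_impact / 100.0) DESC;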

Except…

Both of these – the index recommendations in the query plan and the ones in the DMVs – suffer from some pretty serious drawbacks.

They’ll recommend huge indexes. They don’t give a damn about the overhead of multiplying the table’s size. They’re laser-focused on improving the response time of queries that need to find rows, but they don’t care about how much slower your deletes, updates, and inserts go.

The suggested key columns aren’t even in order. I know, you think they have something to do with selectivity, but they don’t: they’re just a comma-delimited list of columns ordered by equality vs inequality search, then by their column order in the table. Don’t believe me? Here’s a sketch to reproduce it.
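
This is a minimal repro, with made-up table and column names – the low-selectivity column is deliberately declared first in the table, and both predicates are equality searches:

    CREATE TABLE dbo.OrderingDemo (
        LowSelectivity  INT NOT NULL,  -- only 3 distinct values
        HighSelectivity INT NOT NULL   -- 100,000 distinct values
    );

    INSERT INTO dbo.OrderingDemo (LowSelectivity, HighSelectivity)
    SELECT TOP 100000
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) % 3,
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM sys.all_objects AS a CROSS JOIN sys.all_objects AS b;

    -- Selectivity says HighSelectivity should lead, but the missing index
    -- hint asks for (LowSelectivity, HighSelectivity): equality columns
    -- first, then ordered by their position in the table.
    SELECT *
    FROM dbo.OrderingDemo
    WHERE HighSelectivity = 42
      AND LowSelectivity = 1;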

The indexes may not even get used. Because the key columns aren’t necessarily in the right order, there are times when you’ll see SQL Server recommend an index, only to realize that the key order doesn’t even make sense for the filtering, grouping, joining, and ordering that the query really does. But does SQL Server go back and tell you to drop that index? Nope – it’ll just keep recommending indexes to support your hoarding habit.

The indexes may make the query slower. I’m not talking about unintended side effects where one query gets faster and another gets slower – I’m talking about SQL Server recommending an index for one specific query, and then that very query gets slower and uses more CPU. That demo’s always an eye-opener in my Fundamentals of Index Tuning class.

They’ll recommend redundant indexes. You’ll often see SQL Server recommend one index on ColumnA, ColumnB, and then another separate index recommendation for just ColumnA, and then yet another recommendation on ColumnA + include ColumnB. Even worse, they’ll recommend indexes that are subsets of existing indexes, or tell you to make a wider index (but not tell you to drop a narrower subset index that would be replaced by the new one).
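
You can catch some of those subset overlaps yourself. Here’s a rough sketch that flags recommendations whose equality columns are a leading prefix of another recommendation on the same table – string matching on the column lists is approximate (it ignores inequality and included columns), so treat it as a starting point:

    SELECT narrow.statement AS table_name,
           narrow.equality_columns AS narrower_keys,
           wide.equality_columns AS wider_keys
    FROM sys.dm_db_missing_index_details AS narrow
    INNER JOIN sys.dm_db_missing_index_details AS wide
        ON  wide.database_id = narrow.database_id
        AND wide.object_id = narrow.object_id
        AND wide.index_handle <> narrow.index_handle
        -- narrow's key list is a leading prefix of wide's key list:
        AND LEFT(wide.equality_columns, LEN(narrow.equality_columns))
              = narrow.equality_columns
        AND LEN(wide.equality_columns) > LEN(narrow.equality_columns);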

They’ll recommend indexes on huge tables. Whether the table has ten thousand rows or ten billion rows, you get the same basic index recommendations. I dunno about you, but I gotta approach indexing a lot more carefully when I know that the index would take an hour to create.

They’ll recommend nonclustered (but not clustered) indexes on heaps. I can almost hear SQL Server talking itself through this. “Well, sure, the entire table is literally organized in random order, but … I sure would like a copy of it sorted by just the primary key.”
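
Curious how many of those you have? A quick sketch – in sys.indexes, index_id 0 means the table is a heap:

    SELECT OBJECT_SCHEMA_NAME(i.object_id) AS schema_name,
           OBJECT_NAME(i.object_id) AS table_name,
           d.equality_columns, d.inequality_columns
    FROM sys.indexes AS i
    INNER JOIN sys.dm_db_missing_index_details AS d
        ON  d.object_id = i.object_id
        AND d.database_id = DB_ID()
    WHERE i.index_id = 0;  -- heaps only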

They’ll recommend indexes for rarely-executed queries. If you let an analyst loose in a database, they’ll run all kinds of crazy queries in just one day’s time. SQL Server sees those crazy queries, recommends all kinds of indexes to make ’em go faster, and then…stores those recommendations, showing them over and over again whenever you check index design for a table, even weeks or months later. Sure, the index recommendations will get reset eventually, but…

They don’t tell you when this data was reset. You have no idea how much history you’re looking at – maybe a day’s worth of queries, maybe six months.
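
The closest thing you get is the server’s start time – the DMV contents don’t survive a restart, so this is at least an upper bound on how old the data could be:

    SELECT sqlserver_start_time
    FROM sys.dm_os_sys_info;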

They don’t let you reset the data on your terms. Sure, you could restart the entire SQL Server, or set the database offline, but who on earth can actually do that just to do index analysis? It’s left up to you, dear reader, to log this data to a table regularly and then do differentials on it to understand when the indexes would have been used.
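
A minimal sketch of that logging approach – the table name is made up, and the schedule is up to you (say, a daily Agent job); the idea is to snapshot the stats with a timestamp, then diff user_seeks between consecutive snapshots:

    CREATE TABLE dbo.MissingIndexSnapshots (
        snapshot_time      DATETIME2      NOT NULL DEFAULT SYSDATETIME(),
        table_name         NVARCHAR(4000) NULL,
        equality_columns   NVARCHAR(4000) NULL,
        inequality_columns NVARCHAR(4000) NULL,
        included_columns   NVARCHAR(4000) NULL,
        user_seeks         BIGINT         NULL,
        last_user_seek     DATETIME       NULL
    );

    INSERT INTO dbo.MissingIndexSnapshots
        (table_name, equality_columns, inequality_columns,
         included_columns, user_seeks, last_user_seek)
    SELECT d.statement, d.equality_columns, d.inequality_columns,
           d.included_columns, gs.user_seeks, gs.last_user_seek
    FROM sys.dm_db_missing_index_details AS d
    INNER JOIN sys.dm_db_missing_index_groups AS g
        ON g.index_handle = d.index_handle
    INNER JOIN sys.dm_db_missing_index_group_stats AS gs
        ON gs.group_handle = g.index_group_handle
    WHERE d.database_id = DB_ID();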

“But it gets better in Azure SQL DB, right?”

I got really excited when Microsoft announced automatic indexing in Azure SQL DB. After all, if the robots are going to do our job for us, they’re gonna need better data to work with, right? I excitedly devoured the white paper about it to see what new instrumentation they’d added in order to help solve the problem.

And they didn’t actually fix the missing index DMVs.

They just wrote code to work around the limitations.

The bad news is that they’re probably not going to fix the broken missing index DMVs anytime soon. Sure, we got some incremental improvements in SQL Server 2019, but they don’t address any of the above stuff. That’s a bummer.

The good news is that if they can work around it, so can we. I’ll tell you about the newest feature of SQL ConstantCare® in the next post.


6 Comments

  • richardarmstrong-finnerty
    November 14, 2019 8:51 am

    Being a fabulously lucky winner of a free 6-course prize, I have been extremely impressed with the quality & content.

    (Please, please, please let UK citizens get the ability to actually buy your course soon)

    Anyway, one of the gems that took me by surprise in the Fundamentals of Index Tuning course was that SQL Server secretly, slyly, doesn’t necessarily display the most significant missing index. Thank you Microsoft!

  • Microsoft’s focus on new features no one will use (Big Data) over fixing obvious shortcomings like this… arrgh

  • When you look at the missing index recommendations in their strictest scope as “x index will make y query run faster”, rather than “we recommend you apply x index to your database immediately”, the advice makes much more sense. They’re a great starting point for a DBA to optimise their indexes, as long as you recognise their limitations.

  • Thanks for the post! Always appreciate you sharing your knowledge. Just curious; what is the technical reason you use PostgreSQL over MS SQL Server for the Consultant Toolkit data? Or is it just a matter of licensing costs?

