Why SQL Server does not guarantee sorting results without ORDER BY

Hello again. OTUS launches course again in June “MS SQL Server Developer”, traditionally, in anticipation of the start of the course, we begin to share with you material on the topic.

If your query does not contain ORDER BY, then you cannot be sure that the sorting of the results will not change over time.

Of course, at first everything will be pretty predictable, but as changes occur (in indexes, tables, server configuration, amount of your data), you may encounter some unpleasant surprises.

Let’s start with something simple: do a SELECT for the Users table of the Stack Overflow database. This table has a cluster index for the Id column, which starts from one and increases to a trillion. For this query, data is returned in cluster index order:

But if you create an index on DisplayName and Location, then SQL Server suddenly decides to use a new index, not a clustered one:

Here is the execution plan:

Why did SQL Server decide to use this index, although it did not need to sort by DisplayName and Location? Because this index is the smallest copy of the data that needs to be obtained. Let’s look at the sizes of indices using sp_BlitzIndex:

The cluster index (CX / PK) has about 8.9 million rows and its size 1.1 GB.

In the non-clustered index for DisplayName, Location is also about 8.9 million rows, but its size is only 368 MB. If you need to do a scan to get the query results, then why not choose the smallest data source, since it will be faster. It is for this reason that SQL Server did this.

“Yes, but my request contains WHERE.”

Ok, now that we have an index for DisplayName and Location, let’s try a query that looks for a specific name (DisplayName). Results are sorted by DisplayName:

The execution plan shows that the index used is DisplayName and Location:

But if you search by a different value, then the results will no longer be sorted by DisplayName:

SQL Server found that Alex’s many and more wise to perform a Clustered Index Scan instead of Index Seek + Key Lookup:

Even in these really simple cases, you cannot guarantee that SQL Server will always use the copy of data that you expect.

Recently, I have come across much more complicated cases:

  • Removing the index used in the query
  • Enabling Forced Parameterization, which changes the estimate of the expected number of rows, forcing SQL Server to select a different index
  • Change the compatibility level of the database (Compatibility Level) with the inclusion of a new mechanism for assessing cardinality (Cardinality Estimator), which gives another version of the plan.

Learn more about the course.


Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *