As a developer I often find myself troubleshooting SQL queries that are not performing well and need optimization. I would like to share some interesting tips on how SQL Server handles queries and why some execution plans are better than others. A query execution plan is an ordered set of steps used to access data in SQL Server. If you are working in SQL Server Management Studio, press CTRL+M in a new query window to enable execution plans. In the examples below I will be using tables from the AdventureWorks database, a sample database provided by Microsoft.
Filters are not necessarily faster. In this first example we are going to query the Person table, comparing two different queries. Surprisingly, as we will find out, they produce identical execution plans, which means SQL Server has to go through exactly the same steps and the same amount of work to fetch the requested data.
The first query selects all the data from the Person table, and its execution plan consists of a single step: a clustered index scan. SQL Server stores the table data in a special format sorted by the primary key; this structure is called the clustered index. So what the execution plan tells us is that SQL Server has to traverse the full data as stored by the clustered index. Hovering over the SELECT node gives us a little more information, and the Estimated Subtree Cost value gives us a baseline for comparing the performance of other queries.
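The first query likely looks something like the following; the exact statement is an assumption reconstructed from the description, using AdventureWorks' Person.Person table:

```sql
-- Assumed form of the first query: every column, every row.
SELECT *
FROM Person.Person;
-- Plan: a single Clustered Index Scan over the table's primary key.
```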
The second query still queries the Person table, but this time we select only the primary key column and filter on ModifiedDate. At first glance we would expect faster performance, since we are selecting less data and expecting fewer records back. As the execution plan shows, though, the steps are exactly the same, and so is the Subtree Cost. So even though the queries differ, and at first glance the second one looks less intensive, SQL Server has to do exactly the same amount of work to return the results.
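A sketch of the second query, again an assumption based on the description (the date value is only an example):

```sql
-- Assumed form of the second query: key column only, filtered on ModifiedDate.
SELECT BusinessEntityID
FROM Person.Person
WHERE ModifiedDate >= '2014-01-01';
-- Without an index on ModifiedDate, the plan is still a full
-- Clustered Index Scan: every row must be examined to evaluate the filter.
```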
Filters would be faster with an index. Besides the clustered index that we already discussed, we can add additional non-clustered indexes to our tables. A non-clustered index tells SQL Server to store a copy of part of that same data in a different order. Imagine we create a new non-clustered index on the Person table's ModifiedDate field: SQL Server stores a copy of the data, but only the ModifiedDate and primary key fields, sorted by ModifiedDate.
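A minimal sketch of that index; the index name is an assumption, and any valid identifier works:

```sql
-- Non-clustered index on ModifiedDate. The clustered key
-- (BusinessEntityID) is carried in every non-clustered index
-- automatically as the row locator, so both fields are available.
CREATE NONCLUSTERED INDEX IX_Person_ModifiedDate
    ON Person.Person (ModifiedDate);
```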
If we now run the same query, the execution plan is different, and so is the Subtree Cost value. SQL Server seeks the new index and does far less work to return the result records. A seek is a much faster operation than a scan because it touches only qualifying rows: SQL Server navigates directly to the rows that satisfy the predicate (the filter condition).
Read your statistics. Another way to assess queries and their performance is to execute SET STATISTICS IO ON before running them.
This lets us see, under the Messages tab of SQL Server Management Studio, the logical reads performed when the queries run. Remember the example with the Person table? Let's expand the search to a greater date range so that we get more records back. I will compare the exact same query with and without the non-clustered index on the ModifiedDate field.
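The comparison might look like this; the date range is an illustrative assumption, chosen wide enough to return a noticeable number of rows:

```sql
-- Report logical reads per statement under the Messages tab.
SET STATISTICS IO ON;

-- Same query, run once before and once after creating the
-- non-clustered index on ModifiedDate.
SELECT BusinessEntityID
FROM Person.Person
WHERE ModifiedDate >= '2014-01-01'
  AND ModifiedDate <  '2014-07-01';
```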
We get 3,820 reads without the index versus 47 reads with it. This makes it pretty obvious how much more work SQL Server has to do without the non-clustered index we introduced.
Be aware: indexes can slow you down. I intentionally chose to filter on ModifiedDate, a field whose value changes frequently. That is exactly the problem with indexing it: if the field changes often, the data has to be maintained in two places, once in the clustered index and once in the non-clustered index. In other words, every change causes two inserts, updates or deletes. You will often have to make a judgement call about which fields to index and whether the index will actually help or harm down the road.
Moreover, SQL Server always tries to use the narrowest index that satisfies the query. If that index has never been used, SQL Server has to read it and load it into memory. With too many indexes per table, you may find that SQL Server is not reusing data cached in memory but is loading a different index every time. The more indexes you have, the more memory and disk space they take.
Sorting can be slow. Let's take a step back and look at another example that deals with sorting. We often like to use ORDER BY in our queries, but depending on the data being sorted it may be more beneficial to sort outside SQL Server. In the query below I reference a table with a field containing error messages, which is essentially a very large field. If we sort on the primary key, the cost and execution plan are identical to running the same query without the ORDER BY, because the data in the clustered index is already in the correct order.
If we sort on the large error-message field instead, SQL Server first performs the scan and then goes ahead and sorts the data. Notice how much the cost has grown. So ORDER BY does matter when it comes to performance, and it is something to be aware of.
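The two sorts can be sketched as follows; the table and column names follow AdventureWorks' dbo.ErrorLog, which matches the description, but the exact queries are assumptions:

```sql
-- Sorting on the clustered primary key: no extra work, the
-- clustered index already stores rows in this order.
SELECT ErrorLogID, ErrorMessage
FROM dbo.ErrorLog
ORDER BY ErrorLogID;

-- Sorting on the large message field: the plan gains an explicit
-- Sort operator after the scan, and the subtree cost grows.
SELECT ErrorLogID, ErrorMessage
FROM dbo.ErrorLog
ORDER BY ErrorMessage;
```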
These are just a few simple examples that help us understand query performance and how to analyze execution plans.
SVETLANA DUDINA-CHESHMEDJIEVA | .Net Business Solutions Team Leader
Svetlana provides project direction, oversees tasks and project work orders, and is responsible for delivering projects on time and on budget. She works closely with the whole development team, provides guidance, and ensures that the latest technologies are being utilized and that Agile and test-driven methodologies are followed. She has been involved in all aspects of the application development life cycle, including architecture, design, development, implementation, testing, troubleshooting, deployment and documentation. She has extensive knowledge in building custom web and Windows applications as well as integrating existing systems. Svetlana has more than 13 years of experience, including nine years of .NET development. She is proficient in technologies such as C#, ASP.NET, Microsoft SQL Server and Transact-SQL, Entity Framework, WPF, WCF, Windows Services, jQuery and JavaScript. She has gained more than seven years of experience with solutions, integrations and development revolving around Microsoft Dynamics Great Plains and CRM. She is always eager to learn the latest technologies and to resolve complicated and challenging development or design problems.