I encountered an odd situation where appending
OPTION (RECOMPILE) to my query causes it to run in half a second, while omitting it causes the query to take well over five minutes.
There are times that using
OPTION(RECOMPILE) makes sense. In my experience the only time this is a viable option is when you are using dynamic SQL. Before you explore whether this makes sense in your situation I would recommend rebuilding your statistics. This can be done by running the following:
And then recreating your execution plan. This will ensure that when your execution plan is created it will be using the latest information.
OPTION(RECOMPILE) rebuilds the execution plan every time that your query executes. I have never heard that described as
creates a new lookup strategy but maybe we are just using different terms for the same thing.
When a stored procedure is created (I suspect you are calling ad-hoc sql from .NET but if you are using a parameterized query then this ends up being a stored proc call) SQL Server attempts to determine the most effective execution plan for this query based on the data in your database and the parameters passed in (parameter sniffing), and then caches this plan. This means that if you create the query where there are 10 records in your database and then execute it when there are 100,000,000 records the cached execution plan may no longer be the most effective.
In summary – I don’t see any reason that
OPTION(RECOMPILE) would be a benefit here. I suspect you just need to update your statistics and your execution plan. Rebuilding statistics can be an essential part of DBA work depending on your situation. If you are still having problems after updating your stats, I would suggest posting both execution plans.
Often when there is a drastic difference from run to run of a query I find that it is often one of 5 issues.
- STATISTICS – Statistics are out of date. A database stores statistics on the range and distribution of the types of values in various column on tables and indexes. This helps the query engine to develop a “Plan” of attack for how it will do the query, for example the type of method it will use to match keys between tables using a hash or looking through the entire set. You can call Update Statistics on the entire database or just certain tables or indexes. This slows down the query from one run to another because when statistics are out of date, its likely the query plan is not optimal for the newly inserted or changed data for the same query (explained more later below). It may not be proper to Update Statistics immediately on a Production database as there will be some overhead, slow down and lag depending on the amount of data to sample. You can also choose to use a Full Scan or Sampling to update Statistics. If you look at the Query Plan, you can then also view the statistics on the Indexes in use such using the command DBCC SHOW_STATISTICS (tablename, indexname). This will show you the distribution and ranges of the keys that the query plan is using to base its approach on.
- PARAMETER SNIFFING – The query plan that is cached is not optimal for the particular parameters you are passing in, even though the query itself has not changed. For example, if you pass in a parameter which only retrieves 10 out of 1,000,000 rows, then the query plan created may use a Hash Join, however if the parameter you pass in will use 750,000 of the 1,000,000 rows, the plan created may be an index scan or table scan. In such a situation you can tell the SQL statement to use the option OPTION (RECOMPILE) or an SP to use WITH RECOMPILE. To tell the Engine this is a “Single Use Plan” and not to use a Cached Plan which likely does not apply. There is no rule on how to make this decision, it depends on knowing the way the query will be used by users.
- INDEXES – Its possible that the query haven’t changed, but a change elsewhere such as the removal of a very useful index has slowed down the query.
- ROWS CHANGED – The rows you are querying drastically changes from call to call. Usually statistics are automatically updated in these cases. However if you are building dynamic SQL or calling SQL within a tight loop, there is a possibility you are using an outdated Query Plan based on the wrong drastic number of rows or statistics. Again in this case OPTION (RECOMPILE) is useful.
- THE LOGIC Its the Logic, your query is no longer efficient, it was fine for a small number of rows, but no longer scales. This usually involves more indepth analysis of the Query Plan. For example, you can no longer do things in bulk, but have to Chunk things and do smaller Commits, or your Cross Product was fine for a smaller set but now takes up CPU and Memory as it scales larger, this may also be true for using DISTINCT, you are calling a function for every row, your key matches don’t use an index because of CASTING type conversion or NULLS or functions… Too many possibilities here.
In general when you write a query, you should have some mental picture of roughly how certain data is distributed within your table. A column for example, can have an evenly distributed number of different values, or it can be skewed, 80% of the time have a specific set of values, whether the distribution will varying frequently over time or be fairly static. This will give you a better idea of how to build an efficient query. But also when debugging query performance have a basis for building a hypothesis as to why it is slow or inefficient.