
SQL Server 2019 CE, Local Variables and Nullable Columns

In this article, I will demonstrate a small change in the Cardinality Estimator component under the latest compatibility level – 150 (SQL Server 2019).

Let’s first create and populate a sample table:

DROP TABLE IF EXISTS dbo.T;
CREATE TABLE dbo.T(
id INT IDENTITY(1,1) NOT NULL,
c1 INT NOT NULL,
c2 INT NULL,
c3 CHAR(5) DEFAULT 'test',
CONSTRAINT PK_T PRIMARY KEY CLUSTERED (id)
);
GO
CREATE INDEX ix1 ON dbo.T(c1);
GO
CREATE INDEX ix2 ON dbo.T(c2);
GO
INSERT INTO dbo.T(c1)
SELECT 1 + ABS(CHECKSUM(NEWID())) % 10000
FROM dbo.GetNums(1000000);
GO

--update the c2 column to non-NULL values for about 5% of rows
UPDATE dbo.T SET c2 = c1 WHERE c1 >= 9500;
GO

Ensure that the database runs under the compatibility level 140 and consider the following simple query with a local variable:

ALTER DATABASE New SET COMPATIBILITY_LEVEL = 140;
GO
DECLARE @v INT = 600000;
SELECT * FROM dbo.T WHERE c1 = @v;
Here is its execution plan:

e00
This is a well-known situation – an estimation for an equality predicate with a local variable. The estimated number of rows is calculated as the All density attribute from the stats density vector multiplied by the Rows value from the stats header.

e02


SELECT 0.00010002*1000000;
--100.02000000
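
The two inputs to that calculation come straight from the statistics object on ix1. If you prefer not to rely on the screenshot, the header and density vector can be listed directly:

-- Rows comes from the stats header, All density from the density vector
DBCC SHOW_STATISTICS ('dbo.T', ix1) WITH STAT_HEADER, DENSITY_VECTOR;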

The c1 column does not allow NULLs. What would the estimation be for a nullable column? The c2 column in our table is nullable. Let’s check:

DECLARE @v INT = 600000;
SELECT * FROM dbo.T WHERE c2 = @v;


e01

The estimation is significantly higher and even leads to a plan with a Clustered Index Scan! However, it follows the same estimation formula. Let’s have a look at the statistics object:

e03

By using the same formula, the estimated number of rows is:

SELECT 0.001992032*1000000
--1992.032000000

This is exactly what we see in the execution plan. OK, that was CL 140, let’s see how SQL Server 2019 handles this simple case.

When we switch to CL 150, the plan and estimations for the c1 column (non-nullable) are the same. However, the estimation for the nullable column is changed!

ALTER DATABASE New SET COMPATIBILITY_LEVEL = 150;
GO
DECLARE @v INT = 600000;
SELECT * FROM dbo.T WHERE c2 = @v;
Here is the plan:

e04

This time, the estimation is not so high, and we are getting a more appropriate plan, based on the Nested Loops join operator. But where does 104.397 come from? Let’s look at the statistics again:

e05

Since the column is nullable, the CE applies the All density to the non-NULL rows only. Thus, the formula seems to be All density x (Rows – EQ_ROWS for the RANGE_HI_KEY = NULL step).

SELECT 0.001992032*(1000000-947697);
--104.189249696
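
If you want to verify the NULL count yourself rather than read it from the screenshot, the histogram of the ix2 statistics shows it in its first step:

DBCC SHOW_STATISTICS ('dbo.T', ix2) WITH STAT_HEADER, HISTOGRAM;
-- the first step has RANGE_HI_KEY = NULL; its EQ_ROWS (947697 here) is the number of NULLs in c2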

Well, it is not exactly 104.397, but very, very close to it. I did not find official documentation for this, therefore I cannot confirm that the mentioned calculation is always correct. However, this new behavior makes more sense: the average value from the stats density vector refers to the non-NULL set of values, and it is wrong to multiply it by all rows.

Fun fact – the behavior is actually not new at all – that’s exactly how the old CE (under the compatibility levels <= 110) handles this case.

I would like to thank my colleague Vladimir Đorđević for making me aware of this behavior.

Thanks for reading.

Beyond Statistics Histogram Part 4 – Significant Performance Regression in SQL Server 2014

In the previous articles we’ve covered differences in estimations between the old and new CE for queries that use predicates with literals whose values are beyond the statistics histogram. We saw some issues with key ascending columns in SQL Server 2012 which can be solved with the trace flags 2389, 2390 and 4139. And we saw that the same queries perform well in SQL Server 2014 without the flags. It seems that this is the reason why these flags are not supported in the newest SQL Server version.

You can find several articles about the new CE, and all of them describe improved estimations for key ascending column queries in SQL Server 2014. You might think that there is no problem with key ascending columns anymore. This is only partially true. In this article I will show you another query pattern with a significant performance regression in SQL Server 2014.

Key Ascending Problem Solved?

In one of my previous articles, Beyond Statistics Histogram – Part 1, we saw that the same query performs significantly better in SQL Server 2014 (from more than 4M down to 455 logical reads) and that the new CE, unlike the old CE, does not estimate 1 row whenever a value beyond the stats histogram is used in a predicate. It estimates significantly more – enough to steer the Query Optimizer toward the correct index. The problem described in that article (the most common ascending key problem) is caused by the estimation of 1 for the orderdate predicate. In this concrete case it would be enough to estimate more than 110 rows (the estimate for the custid predicate) and the Query Optimizer would use the index on custid, as we initially expected. So, the new CE did it – it estimated 90,000, significantly more than needed, and the plan was OK – problem solved.

But where did this estimation come from? That’s an interesting question, maybe irrelevant in this particular example, but it definitely deserves an answer.

Let’s create a new sample table, populate it initially with 1M rows, and then add an additional 100K rows by using this code. As we saw in previous posts, this is not enough to trigger a statistics update, so the statistics object still sees only orders up to 31.12.2013. Now that we have a sample table and sample data, let’s implement a very simple requirement: return all orders from yesterday and today, sorted by amount descending. Here is the solution:
SELECT * FROM dbo.Orders WHERE orderdate >= DATEADD(DAY,-1,GETDATE()) ORDER BY amount DESC;

A very basic query with an expected Index Seek on the orderdate index followed by a Key Lookup, since we don’t expect more than 500-600 rows. And this is exactly what we get. In SQL Server 2012.

L1

What about SQL Server 2014? Well, “a little bit different”…

L2

A Clustered Index Scan and a memory grant of 900 MB to return and sort only 60 rows! And yes, this is again a “key ascending problem”. And it is not solved – it is actually introduced with the new CE! There is no such problem under the old CE! OK, the old estimation of 1 was not correct, but it’s definitely better than the 30,000 given by the new CE. You can even fix the estimation of 1 under the old CE by using TF 2390, which was the solution for the key ascending problem before SQL Server 2014. Let’s execute the same query with TF 2390:
SELECT * FROM dbo.Orders WHERE orderdate >= DATEADD(DAY,-1,GETDATE()) ORDER BY amount DESC OPTION (QUERYTRACEON 9481, QUERYTRACEON 2390);

And now observe the execution plan:

L3

You can see an excellent estimation and the expected execution plan with an appropriate memory grant. So, with the old CE we have one acceptable and one excellent execution plan. With the new CE, a suboptimal one with a significantly overestimated memory grant.

The new CE simply ignores the literals if they are beyond the statistics histogram. It estimates 30% of the total number of modifications of the leading statistics column since the last statistics update. In our case this number is 100,000 and the estimation is therefore 30,000. The new CE produces the same estimation (30,000) for the following two queries:
SELECT * FROM dbo.Orders WHERE orderdate >= '20150314' OPTION (RECOMPILE);
SELECT * FROM dbo.Orders WHERE orderdate >= '34450101' OPTION (RECOMPILE);

In case of the BETWEEN operator the estimation is 9% of the modification_counter value, but again the literals are completely ignored. I am not happy with this at all. The estimation done by the old CE was hardcoded to 1, but the new CE hardcodes it too – not to a constant, but to an expression (i.e. modification_counter * 0.3 for > or <, modification_counter * 0.09 for BETWEEN, etc.). To make the problem even worse, the trace flags 2389, 2390 and 4139 are not supported in SQL Server 2014, so you cannot force the optimizer to recalculate the statistics object on the fly. This functionality would help here, but it is removed. I created the following Connect item for this issue. If you think this could be useful, please vote for it.
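
If you want to check the inputs to this formula on your own system, the modification counter is exposed through sys.dm_db_stats_properties. A small sketch (the 30% factor is just my reading of the observed behavior, not a documented constant):

-- rows and modification_counter for all statistics on dbo.Orders
SELECT s.name AS stats_name,
       sp.rows,
       sp.modification_counter,
       sp.modification_counter * 0.30 AS projected_estimate_for_gt_lt
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID('dbo.Orders');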

Conclusion

The key ascending problem caused by constant underestimation in SQL Server 2012 can lead to significant performance problems with the default QO behavior. However, with TF 2389, 2390 and 4139 the problem can be solved at acceptable cost. With the new Cardinality Estimator in SQL Server 2014 the problem disappears for common query patterns, but some other queries, which perform well in old SQL Server versions, show significant performance regressions. The new CE significantly overestimates the number of rows for predicates using literals beyond the statistics histogram, which usually induces an execution plan that scans large tables. Unfortunately, the trace flags that are helpful in previous versions are not supported anymore. We hope, however, that this functionality can still come back, at least while the corresponding Connect item is active.

Thanks for reading.

Beyond Statistics Histogram – Part 3 (TF 2389 feat. TF 4139)

In the previous article we’ve shown how we can use TF 2390 to handle queries with literals beyond the statistics histogram in SQL Server 2012. We saw that SQL Server creates some statistics on the fly for all columns used in filter expressions. If we enabled the trace flag at the instance level, whenever SQL Server creates an execution plan it would touch the indexes on all filtered columns in all queries. And not all columns have problems with ascending statistics. It would make more sense to apply this action only to columns branded by SQL Server as “Ascending” columns. This is exactly the use case for trace flag 2389. It performs the same action and ends up with the same result as TF 2390, but it is applied only to columns that SQL Server knows as ascending columns.

SQL Server marks a column as an ascending column in a somewhat esoteric way. The statistics object on the column has to be updated three times in a row with the FULL SCAN option, and for each update more than 90% of the newly added rows must have a value in that column greater than the maximum value in the current statistics histogram. If this is true, the column is branded as ascending; if not, the column is marked as stationary.
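
For illustration only, the branding pattern boils down to something like the following block, repeated three times (this assumes dbo.OrdersAsc has the same columns as the dbo.Orders table used elsewhere in this series; the real load is in the downloadable script):

-- add a batch of rows that all land above the current histogram maximum...
INSERT INTO dbo.OrdersAsc (custid, orderdate, amount, note)
SELECT TOP (100000)
       1 + ABS(CHECKSUM(NEWID())) % 10000,
       DATEADD(DAY, 1, (SELECT MAX(orderdate) FROM dbo.OrdersAsc)),
       50 + ABS(CHECKSUM(NEWID())) % 1000,
       'note'
FROM sys.all_columns AS a CROSS JOIN sys.all_columns AS b;
-- ...and update the statistics with a full scan
UPDATE STATISTICS dbo.OrdersAsc ix_orderdate WITH FULLSCAN;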

To demonstrate this let’s create a table and populate it with 10M rows following the procedure we described above. Click here to download the code. After we have updated statistics three times we can see that SQL Server marked the statistics object on the orderdate column as ascending.

DBCC TRACEON(2388);
DBCC SHOW_STATISTICS ('OrdersAsc','ix_orderdate');
DBCC TRACEOFF(2388);

Here is the output that confirms that the leading column Type is Ascending.

1201

Now we add an additional 1M rows into the table. Again, this is not enough for an automatic statistics update, and the statistics object remains unchanged.

Let’s recall our query from the previous post: get all orders for the customer with ID 160 in 2014. We will create two queries, one that uses TF 2390 and the other with TF 2389, plus an Extended Events session to check which on-the-fly calculations are performed during the query execution in both cases. Here is the code for the trace session:

CREATE EVENT SESSION TF2389 ON SERVER
ADD EVENT sqlserver.sp_statement_completed(SET collect_statement=(1)
WHERE (sqlserver.database_id=10))
ADD TARGET package0.ring_buffer
WITH (MAX_MEMORY=4096 KB,MAX_DISPATCH_LATENCY=1 SECONDS)
GO
ALTER EVENT SESSION TF2389 ON SERVER STATE = START;
GO
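
Once the session is started, the collected statements can be read from the ring buffer target, for example like this (a sketch; the XML shredding follows the usual pattern for ring buffer targets):

SELECT xe.event_data.value('(@timestamp)[1]', 'datetime2') AS event_time,
       xe.event_data.value('(data[@name="statement"]/value)[1]', 'nvarchar(max)') AS statement_text
FROM (
    SELECT CAST(st.target_data AS XML) AS target_data
    FROM sys.dm_xe_sessions AS s
    INNER JOIN sys.dm_xe_session_targets AS st ON st.event_session_address = s.address
    WHERE s.name = 'TF2389' AND st.target_name = 'ring_buffer'
) AS rb
CROSS APPLY rb.target_data.nodes('RingBufferTarget/event') AS xe(event_data);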

Let’s execute both queries:

SELECT * FROM dbo.OrdersAsc
WHERE custid = 160 AND orderdate >= '20140101' AND orderdate < '20150101' OPTION(QUERYTRACEON 2390, RECOMPILE);
SELECT * FROM dbo.OrdersAsc
WHERE custid = 160 AND orderdate >= '20140101' AND orderdate < '20150101' OPTION(QUERYTRACEON 2389, RECOMPILE);

As mentioned above, we got exactly the same execution plan and execution details for both. The only difference can be seen in the trace output:

1202

TF 2389 instructs SQL Server to check the value in the orderdate column only, while TF 2390 checks the custid column, too. So, if you want to apply this workaround only against truly ascending columns, use TF 2389. However, TF 2389 works only for columns branded as ascending; if the column status changes back to Unknown, it does not work anymore. TF 2390 works in both cases. So, if you are sure that the column will always be ascending, use TF 2389.

And what happens if the column is marked as Stationary? Well, neither 2390 nor 2389 can help in this case! The plan is suboptimal, the same as without trace flags, with a lot of logical reads and an unacceptably long execution time. Until July 2014, SQL Server 2012 didn’t have a workaround for dealing with stationary columns. Since then you can use another trace flag (TF 4139) to get the same behavior with stationary columns. TF 4139 is available from SQL Server 2012 SP1 CU10 and SQL Server 2012 SP2 CU1. You can find more info about it in this KB article.

Use this code to populate a table identical to the previous one, but where the orderdate column is marked as Stationary.

Now we’ll execute the same query, this time in three different versions:

SELECT * FROM dbo.OrdersStat
WHERE custid = 160 AND orderdate >= '20140101' AND orderdate < '20150101' OPTION(QUERYTRACEON 2390, RECOMPILE);
SELECT * FROM dbo.OrdersStat
WHERE custid = 160 AND orderdate >= '20140101' AND orderdate < '20150101' OPTION(QUERYTRACEON 2389, RECOMPILE);
SELECT * FROM dbo.OrdersStat
WHERE custid = 160 AND orderdate >= '20140101' AND orderdate < '20150101' OPTION(QUERYTRACEON 4139, RECOMPILE);

Let’s observe the execution plans:

1203

As we can see trace flags 2389 and 2390 are helpless, and only TF 4139 ensures the correct execution plan.

Conclusion

Let’s summarize dealing with ascending columns with the help of trace flags so far:

TF 2389 helps the Query Optimizer to come up with better estimations and correct execution plans only for columns branded as Ascending.

TF 2390 instructs SQL Server to perform on-the-fly calculation for statistics objects for columns marked as Ascending or Unknown.

TF 4139 works with all three Leading Column Types: Ascending, Unknown and Stationary. However, it is available from SQL Server 2012 SP1 CU10 or SP2 CU1

And one note regarding the status of these flags in the latest SQL Server version: SQL Server 2014 does not support them. It ignores all of them, since the behavior we described in this and the previous post is implemented in the new database engine, and SQL Server 2014 without trace flags comes up with execution details similar to SQL Server 2012 with the flags. It seems that we don’t need these flags in SQL Server 2014. It seems. Until the next post.

Thanks for reading.

Beyond Statistics Histogram – Part 2 (TF 2390)

In the previous article of this series we introduced the problem with queries using literals which are beyond the statistics histogram but exist in the table. We saw that this can be a serious problem for large tables in SQL Server 2012. We also saw that SQL Server 2014 handles it better.

In this article we’ll see how to solve this problem in previous versions of SQL Server.

Whenever performance problems are caused by stale statistics, one of the possible solutions is to update the statistics. However, this process is resource and IO intensive, and as soon as the update is finished new rows come into the table and the statistics are again out of date. Of course, the amount of newly added or updated data not covered by the histogram is significantly reduced with regular updates.
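
For completeness, the straightforward (if expensive) option for our sample table would simply be something like:

-- a full scan gives the most accurate histogram, but is the most expensive option
UPDATE STATISTICS dbo.Orders IX_Date WITH FULLSCAN;
-- a cheaper alternative reuses the most recent sampling rate
UPDATE STATISTICS dbo.Orders IX_Date WITH RESAMPLE;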

It is recommended to schedule statistics updates on the most frequently used key ascending columns in volatile tables to reduce the discrepancy between the actual data distribution and the statistics histogram. But even if you schedule statistics updates for a large table, the most recent entries are again beyond the statistics histogram. With statistics updates we can reduce the impact of rows missing from the histogram, but what can we do if the update is too expensive? How can we help the optimizer when statistics on key ascending columns are stale? The answer: by using trace flags.

Let’s show again the query and the execution plan from the previous post:

SELECT * FROM dbo.Orders
WHERE custid = 160 AND orderdate >= '20140101' AND orderdate < '20150101';

1001

1002

If we execute the same query with trace flag 2390, the estimations and execution details look significantly better:

SELECT * FROM dbo.Orders
WHERE custid = 160 AND orderdate >= '20140101' AND orderdate < '20150101' OPTION(QUERYTRACEON 2390);

1003

1004

Using TF 2390 helped the optimizer choose the proper execution plan with an Index Seek on the custid column, and the execution details look as we expected.

What is the estimated number of rows for the orderdate predicate alone with TF 2390? Let’s check the estimated execution plan:

SELECT * FROM dbo.Orders WHERE orderdate >= '20140101' AND orderdate < '20150101' OPTION(QUERYTRACEON 2390);

1005

The old CE in SQL Server 2012 with TF 2390 estimates about 666K rows. Just to remind you, the new CE comes up with 90K. This difference is important and we’ll discuss it in the next post. For this query it’s clear that the other predicate is the winner when we use selectivity as the criterion, and there is no doubt for the Query Optimizer regarding index usage.

How does it get this devilish number of Ks for the Estimated Number of Rows? What’s the magic behind TF 2390?

To check this we’ll create an Extended Events session and add the event sp_statement_completed. It is recommended to limit the session to collect only events for the specific database. Here is what we collected when we executed the above query:

1006

We see two queries doing some statistics calculations by using the max value from the actual table for both columns involved in the query: custid and orderdate. The estimated number of rows of 666K strongly correlates with the expression (last order date from the actual table – max RANGE_HI_KEY from the stats histogram) * average number of orders per day. It assumes a uniform distribution of data beyond the stats histogram, which in the case of an ordering system is a reasonable assumption. The estimation of 666K compared to 1M actual rows is a good estimation (especially compared to the default estimation (without the TF) of 1, or the 90K from SQL Server 2014). Our statistics object was created automatically when we created the index on the orderdate column (see the appropriate code in the previous post). If we had only a statistics object (auto- or manually created) but no index, TF 2390 would not improve performance. It requires an index on the column; otherwise it is not able to efficiently find the maximum value. The statistics are created on the fly and do not persist in the system. You can find more details in Dmitry Pilugin’s (blog | twitter) excellent article series.
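
Just to illustrate the correlation (this is my reading of the numbers, not an official formula), a figure in the same ballpark can be reconstructed directly from the table:

-- rough reconstruction of the ~666K estimate (illustration only)
DECLARE @max_hist DATETIME = '20131230';              -- last RANGE_HI_KEY in the histogram
DECLARE @max_actual DATETIME, @avg_per_day FLOAT;
SELECT @max_actual = MAX(orderdate) FROM dbo.Orders;  -- found efficiently via the index on orderdate
SELECT @avg_per_day = COUNT_BIG(*) * 1.0 / NULLIF(DATEDIFF(DAY, MIN(orderdate), MAX(orderdate)), 0)
FROM dbo.Orders
WHERE orderdate <= @max_hist;
SELECT DATEDIFF(DAY, @max_hist, @max_actual) * @avg_per_day AS estimated_rows_beyond_histogram;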

Now we have a good estimation for the orderdate predicate; for custid we had one anyway. The literal used for custid (160) isn’t beyond the histogram. Why then did SQL Server calculate statistics on the fly for it, too? TF 2390 instructs SQL Server to perform this calculation against all columns which participate in the filter criteria, regardless of the literals used. For instance, even if we looked for orders from 2013, which are in the histogram, the on-the-fly stats calculation would still be performed. Actually, not for all columns: it works for all columns whose Leading column Type is Ascending or Unknown. For columns with the leading type Stationary no stats are calculated on the fly, and therefore these columns cannot benefit from TF 2390. I will cover all three leading types and how to handle the differences in the next post.

How do we find out the column leading type? This is described in detail in the excellent article Statistics on Ascending Columns written by Fabiano Amorim (blog | twitter). Again we need a trace flag, this time TF 2388.

DBCC TRACEON(2388);
DBCC SHOW_STATISTICS(Orders, IX_Date);
DBCC TRACEOFF(2388);

When TF 2388 is switched on, DBCC SHOW_STATISTICS shows additional information about statistics objects. One of the additional attributes is the leading column type.

1007

We can see that SQL Server does not know the leading column type for the orderdate column; it is marked as Unknown. Therefore TF 2390 does the on-the-fly statistics calculation when this column participates in filter expressions. The default behaviour of the query optimizer is to use metadata and statistics information during plan generation; the table or index structures are not touched at all. With TF 2390 this is no longer true: SQL Server touches the appropriate index in order to calculate statistics on the fly as part of finding a good enough plan.

Conclusion

The problem with queries using literals which are beyond the statistics histogram can be solved in SQL Server 2012 by using TF 2390. Instead of an estimation of 1 row, the QO estimates by a non-persistent statistics calculation on the fly. This brings a small overhead, since the on-the-fly calculation is performed against all columns participating in filter expressions (not only those where it is required), but it is still better than the default CE behavior prior to SQL Server 2014. The solutions provided by TF 2390 with the old CE and by the new CE in SQL Server 2014 look similar for this query pattern, but some companies don’t allow the usage of trace flags in the production system. For them the recommended solution is to use the SQL Server 2014 CE. But again, for this query pattern, not for all of them!

Thanks for reading.

Beyond Statistics Histogram – Part 1

In this article I will start to examine estimations and execution plans for queries that compare column values with literal values which are beyond the statistics histogram for that column. The problem is also known as the key ascending column statistics problem.

As in previous posts, we remain conservative regarding sample tables. We’ll create again our Orders table and populate it with 10M rows. All orders will have an order date from 1998 to 2013. In addition we’ll create indexes on the custid and orderdate columns. Use the following link to download the code to create and populate the table.

Please be aware that the above script will run a few minutes and will create a table of about 1.5 GB, plus almost 500 MB for the two indexes.

Since we created the indexes after populating the table, statistics are up to date at this point. And only at this point. As soon as data changes or new orders are inserted into the table, the statistics are out of date; thus they are practically always out of date, at least for volatile tables. That means that the most recent orders are not seen by the statistics objects. Can this cause a serious problem? Yes, for statistics on columns where most newly added rows have a value greater than the maximum value in the statistics histogram for that column. If the values are monotonically increasing, these columns are called key ascending columns. For example, transaction date, insert timestamp, and sequence number columns are typical key ascending columns.

Let’s get back to our table and simulate this behavior. We’ll insert another 1M rows into the Orders table. All of them will get an order date from 2014, so that every new row has a value greater than the maximum value in the statistics histogram. You can get the code here. Please note that the order dates are populated randomly and the column is not a true key ascending column; it just meets the above criterion that newly added rows have a value greater than the maximum value in the statistics histogram on the order date column.

One million rows is a large number, but still not enough to trigger an automatic statistics update. Therefore our statistics object is still the same as it was after the initial load:

Beyond1_1

The most recent order in the statistics histogram is still 30th December 2013, although we added 1M rows after that!
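
You can verify this yourself with DBCC SHOW_STATISTICS (IX_Date is the name of the index on orderdate used throughout this series):

DBCC SHOW_STATISTICS ('dbo.Orders', IX_Date) WITH STAT_HEADER, HISTOGRAM;
-- the last RANGE_HI_KEY is still 2013-12-30, even though 1M newer rows exist in the table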

Now we need to execute some common queries against such a table. For instance, let’s help our customer with the ID 160 to find all his or her orders from 2014! This is a pretty easy task, isn’t it? Some of our customers would even know how to write such a simple query themselves:

SELECT * FROM dbo.Orders
WHERE custid = 160 AND orderdate >= '20140101' AND orderdate < '20150101';

A simple query with two predicates; we have an index on each of them. Imagine that our customer is a typical customer (i.e. his number of orders is not far from the average number of orders). We expect a fast response time; in the execution plan we expect to see an Index Seek on the customer id index and not too many logical reads (<10K in this case).

What do we get?

Beyond1_2

Beyond1_3

Ooops… Elapsed time 1.5 seconds! More than 4 million logical reads!? Only one expectation is met: the plan contains an Index Seek operator. Unfortunately, on the “wrong” index. In the plan we can see that the query engine takes all orders from 2014 and then (in the Key Lookup operator) checks whether each order was made by the customer 160. The opposite of what we expected! We expected that the QO would first get all orders for customer 160 and then extract those from 2014. It’s very uncommon, even impossible, that one customer has more orders than all customers together for a whole year. It is more efficient to get rid of most of the orders in the first step and then apply the other filter in the second step. That’s how we think it should work. And that’s exactly how it works! Where is the problem then?

How does SQL Server decide in this case whether an index should be used at all, and which of the two? It estimates the selectivity of the individual predicates and then makes a decision according to those selectivities. If too many rows would be returned, it will not use an index at all. Otherwise it will use the index on the column with the higher selectivity. Let’s decompose our query into two queries with the individual predicates and choose Display Estimated Execution Plan.

SELECT * FROM dbo.Orders WHERE custid = 160;
SELECT * FROM dbo.Orders WHERE orderdate >= '20140101' AND orderdate < '20150101';

Here is the result:

Beyond1_4

Now it’s clear why SQL Server decided to use the index on the order date column: it expected only one row! Actually it did not expect any rows at all, but when SQL Server does not expect rows it returns 1 and not 0. This is strange behavior and has to do with basic assumptions in the old cardinality estimator.

So, 1 is less than 100 and therefore it makes more sense to use the index on orderdate than the one on custid. Of course, this is great when the estimations are more or less correct, but in this case the one for orderdate is significantly wrong. The only problem is that SQL Server does not know it. It does not know about any order from 2014, although we are in November 2014! This is very dangerous, as we can see from the execution plan: almost 2 seconds and 4 million logical reads. And imagine that you have 10K customers logged in to the system and 100 of them check their recent orders…

OK, it’s clear that the default behavior in SQL Server 2012 leads to serious performance problems.

Let’s see what happens when we execute the same query against SQL Server 2014 with the database compatibility level 120.

Beyond1_5

Beyond1_6

The elapsed time dropped to 26 milliseconds! Instead of millions of loaded pages, only 455 logical reads were needed. And an Index Seek against the “right” column. Impressive! The new cardinality estimator rocks!

Let’s look at the estimation process in the new CE. We’ll display again the estimated execution plan for individual predicates as we did for SQL Server 2012. Here is the result:

Beyond1_7

The new CE expects significantly more rows for the date predicate (90K compared to the old CE’s 1), so the query optimizer had an easy decision to make and used an Index Seek on the customer id. The new CE still underestimates the number of rows – the actual number is 1M – but to solve the estimation problem for this query pattern it was enough to get rid of the severe underestimation of 1 row. It would be enough (in this particular case) to estimate, for instance, 200 rows and the expected index would be used. The new CE comes up with 90K and solves the problem. In one of the next posts we will look at this estimation in more detail. You will see the other side of the medal of the new CE.

One of our predicates (orderdate) was affected by the outdated statistics; the other predicate wasn’t affected. However, from the estimated plans you can see that the estimated number of rows is not the same for the old and new CE. The old CE estimates 99.7, while the new CE comes up with a little bit more: 109.7. A “little bit” is 10% in this case. You can find a detailed explanation of this change in the new CE in my previous post.

Conclusion

The new CE solves problems for queries against large tables which are caused by the conservative, hardcoded underestimation of 1 row done by the old CE. It estimates significantly more and therefore avoids the problem we described here. However, while the way the new CE estimates is good enough to avoid this problem, there are some other query patterns too. We’ll cover them in the next posts.

Thanks for reading.

A Funny Bug with New CE or Introduction to Out-of-Histogram Estimations

Since the release of SQL Server 2014 you can find several articles and blog posts about the changes in the new, redesigned cardinality estimator. An excellent and detailed guide about the changes in the new CE can be found in the white paper Optimizing Your Query Plans with the SQL Server 2014 Cardinality Estimator. Almost all of them describe changes related to the cardinality estimation process for queries with multiple predicates, some of them with details about the exponential back-off algorithm implementation. This is an important change, a change in the CE’s assumption set, but only very rarely will it lead the new CE to use an execution plan different from the one chosen with the old CE. Estimations done with the new CE are higher than before SQL Server 2014, but they usually do not differ by more than one order of magnitude.

The most important and significantly (sometimes dramatically) different estimations can be found in queries with predicates referring to values which are outside the statistics histogram. The old CE estimates 1 in this case; the new CE comes up with significantly different values. In the mentioned white paper and in some posts and presentations you can find the statement that the new CE comes up with better estimations (since we already know that the old CE’s estimation of 1 is often wrong), but I would use the word different instead of better. In the next few posts I will try to explain why.

In this article I will describe a funny bug with a query whose predicate refers to values which are not in the range covered by the appropriate statistics histogram. The regression is significant; however, the data distribution and query pattern are so rare and strange that I have used the word “funny” in the bug description.

First let me introduce my sample table named atTestNewCE. The size is about 1.5 GB and it has 8.7 million rows. Here are a few of them:

9_1

9_2

The table has an index on the columns fStatus and fC1 and the appropriate statistics are up to date. Here are the statistics details:

9_3

You can immediately see a strange thing: the column fStatus has the value 2 for all rows. This is strange and very rare, but as you can see it exists, and in a large company you will regularly find unusual data distributions and strange queries in the system.

OK, where is the problem with this column? Let’s imagine that we need to do something with rows which have the value 1 or 3 in this column. We’ll write a query to extract these rows. And we’ll write it twice, to execute it with the old and the new CE. Since my test database is in 120 compatibility mode, I will use TF 9481 to simulate the old CE behavior. Here are the queries:

SELECT fC1, fC2TimestampGMT
FROM atTestNewCE
WHERE fStatus IN (1,3) ORDER BY fC1 OPTION(QUERYTRACEON 9481);

SELECT fC1, fC2TimestampGMT
FROM atTestNewCE
WHERE fStatus IN (1,3) ORDER BY fC1;

And we are, of course, curious to see the execution plans. We have an index on the predicate column and the query is very selective, so what we expect here is a Nested Loops join plan with a Sort operator, but with a very small memory grant. Let’s see if the CEs meet our expectations:

9_4

9_5

Well, isn’t that interesting? The old CE with an estimation of 1 and the new CE with the table cardinality! And both of them know (the stats are up to date) that there are no rows that meet the predicate criteria.

What’s the difference? The old CE believes that the statistics are up to date, while the new CE expects that the statistics are always out of date and always expects something beyond the statistics histogram. The assumption that the statistics are out of date is realistic, and this change in the new CE is not conceptually wrong. You should not expect 1 row (even though in this specific case that estimation would be almost correct; the correct value is 0), but 8.7 million rows – definitely not!

Of course, with such different estimations we end up with different plans and a significant memory grant overestimation in the case of the new CE.

Workaround

As mentioned, this query pattern and data distribution are not so frequent, but I will show you a workaround anyway. This workaround is at the same time an argument why I think that the described behavior is not a different assumption, but a bug.

When you write the expression A IN (1, 2), the query engine rewrites it to an equivalent form: A = 1 OR A = 2. It seems that the new CE has troubles with expressions containing the OR operator (see my previous posts). If you instead use UNION ALL and rewrite the query:

SELECT fC1, fC2TimestampGMT
FROM atTestNewCE
WHERE fStatus = 1
UNION ALL
SELECT fC1, fC2TimestampGMT
FROM atTestNewCE
WHERE fStatus = 3
ORDER BY fC1;

The execution plan and details look OK:

9_6

So, again it seems to be improper handling of the OR operator.

This is an issue ONLY if the stats histogram contains a single step! As soon as a second entry comes into the histogram (regardless of how uneven the data distribution is), both CEs come up with the same estimations. Strange again, but at the same time it’s good, since it significantly reduces the surface for potential regressions.
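
A quick way to check whether a statistics object falls into this single-step category is to look at its histogram (the index name ixStatus below is a placeholder for the actual index on fStatus, fC1):

DBCC SHOW_STATISTICS ('dbo.atTestNewCE', ixStatus) WITH HISTOGRAM;
-- a single step (here RANGE_HI_KEY = 2 covering all 8.7M rows) is the precondition for this behavior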

Conclusion

This was an introductory article about the differences between the new and old CE regarding values which are beyond the statistics histogram. These are the most significant differences, with potentially significant execution regressions. In the next few articles I will describe in detail the problems related to key ascending columns in large tables in both SQL Server 2012 and 2014, where the old and new CE come up with significantly different estimations leading to different execution details.

Thanks for reading.

New Cardinality Estimator Part 8 – Bug No. 2

In this article I will describe another query pattern where the new CE in SQL Server 2014 brings a significant regression. And, similar to the case in this post, I would interpret the behavior of the new cardinality estimator in this case as a bug, too.

Let’s recreate our usual sample table and populate it this time with 10M rows. Use the following code to create and populate the test table (the helper function dbo.GetNums was developed by Itzik Ben-Gan and you can download it here).

IF OBJECT_ID('dbo.Orders', 'U') IS NOT NULL DROP TABLE dbo.Orders;
CREATE TABLE dbo.Orders(
id INT IDENTITY(1,1) NOT NULL,
custid INT NOT NULL,
orderdate DATETIME NOT NULL,
amount MONEY NOT NULL,
note CHAR(100) NULL,
CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (id ASC)
)
GO
-- Fill the table with 10M rows
DECLARE @date_from DATETIME = '19980101';
DECLARE @date_to DATETIME = '20131231';
DECLARE @number_of_rows INT = 10000000;
INSERT INTO dbo.Orders(custid,orderdate,amount, note)
SELECT 1 + ABS(CHECKSUM(NEWID())) % 10000 AS custid,
(
SELECT(@date_from +(ABS(CAST(CAST( NEWID() AS BINARY(8) )AS INT))%CAST((@date_to - @date_from)AS INT)))) orderdate,
50 + ABS(CHECKSUM(NEWID())) % 1000 AS amount,
'note' AS note
FROM dbo.GetNums(@number_of_rows)
GO
CREATE INDEX IX_Cust ON dbo.Orders(custid);
GO
CREATE INDEX IX_Date ON dbo.Orders(orderdate);
GO

Please be aware that the above script will run a few minutes and will create a table of about 1.5 GB.

8_1

Now we want to get some data from the table. We’ll provide a small list of customer IDs and dates and want to get all orders for those customers with an order date after the date given in the input list. Let’s assume that we have the following list:
8_2

We want to get all orders for the customer with ID 897 with an order date after 1st October 2013, and all orders for the customer with ID 2112 with an order date after 1st September 2013. Let’s write a query:

DECLARE @T AS TABLE(
custid INT NOT NULL,
orderdate DATETIME NOT NULL
);
INSERT INTO @T(custid,orderdate) VALUES(897, '20131001'), (2112,'20130901');
SELECT o.*
FROM @T t
INNER JOIN dbo.Orders o ON o.custid = t.custid AND o.orderdate > t.orderdate;

 

Here are the execution plan and execution details for the above query generated by SQL Server 2012 database engine:

8_3

8_4

It looks good. Whenever you join a large table with a table variable you want to see a Nested Loops Join in the execution plan; logical reads and execution time are OK, too.

Let’s see what happens when we execute the same query against SQL Server 2014 with the database compatibility level 120:

8_5

8_6

You don’t want to see a Hash Join with large tables for a query which returns a few rows. The execution took more than 2 seconds with 167K logical reads, all just to return 25 rows!

Similar to the problem discussed in this post, it seems that the new CE has a bug when it has to estimate rows for a JOIN with multiple predicates when at least one of them is a non-equality predicate or at least one column is nullable. The following queries will end up with a Hash (or Merge) Join regardless of table size.

SELECT * FROM A
INNER JOIN B ON A.id = B.id AND A.c1 > B.c1;
SELECT * FROM A
INNER JOIN B ON A.id = B.id AND (A.c1 = B.c1 OR A.c1 IS NULL);

Why do I use the term “bug” rather than just “regression” in this case? Let’s remove the second part of the requirement and ask for all orders for customers with IDs from the given list. This is a superset of the results we got from our original query.

DECLARE @T AS TABLE(
custid INT NOT NULL,
orderdate DATETIME NOT NULL
);
INSERT INTO @T(custid,orderdate) VALUES(897, '20131001'), (2112,'20130901');
SELECT o.*
FROM @T t
INNER JOIN dbo.Orders o ON o.custid = t.custid;

Let’s observe the execution plan under the new SQL Server 2014 database engine:

8_7

It looks good! So, for the less restrictive filter it estimates significantly fewer rows, and the estimation is correct with a desirable plan. If we removed the predicate with the equality operator and left the other one, i.e.

SELECT o.*
FROM @T t
INNER JOIN dbo.Orders o ON o.orderdate > t.orderdate;

SQL Server would come up with the cardinality of the large table (10M), which is expected and common for both CEs.

The point of difference is how they combine the two predicates. It could be that the new CE does not know how to apply the exponential back-off algorithm in this case, and I would accept some guessed value, but this value should be between 1 and the cardinality of the first predicate (1901). Even if the predicates were fully correlated, the resulting cardinality (pred1 AND pred2) cannot be greater than the lowest individual predicate cardinality. The resulting cardinality here is simply the cardinality of the large table, which is not only statistically but also logically incorrect. Therefore I would interpret this as a bug.

Workaround

Is there any workaround? Yes, we can split our query into two steps. In the first step we process only the first predicate, return all orders for all customer IDs from the list, and store them in a temp structure; in a separate statement we then remove the orders from there according to the second predicate:

DECLARE @T AS TABLE(
custid INT NOT NULL,
orderdate DATETIME NOT NULL
);
INSERT INTO @T(custid,orderdate) VALUES(897, '20131001'), (2112,'20130901');
DECLARE @Result AS TABLE(
id INT NOT NULL PRIMARY KEY,
custid INT NOT NULL,
orderdate DATETIME NOT NULL,
amount MONEY NOT NULL,
note CHAR(100) NULL,
orderdate2 DATETIME NOT NULL
);
INSERT INTO @Result
SELECT o.*, t.orderdate AS orderdate2
FROM @T t
INNER JOIN dbo.Orders o ON o.custid = t.custid;
DELETE FROM @Result WHERE orderdate2 >= orderdate;
SELECT id, custid, orderdate, amount, note
FROM @Result;

Let’s observe the execution plan:

8_8

A little bit more code, less intuitive, not exactly as fast and elegant as under the old CE, but the Nested Loops Join is here again. And generally, when you work with very large tables in OLTP you don’t want to see the other join operators. Of course, another option is to use the old compatibility level for the database or to use OPTION(QUERYTRACEON 9481) for the statement execution, both with the limits and restrictions already discussed in this post series.

Conclusion

The new CE definitely brings some regressions, and in a few cases it comes up with logically incorrect estimations which can significantly affect queries where large tables are joined with table variables. In the case described here the performance degradation is dramatic and unacceptable. Fortunately, there is a workaround which allows us to solve the problem and still use the new compatibility level for our database. And again, it’s very important to test all important queries from your workload. In the end, what matters to you is how the new CE behaves against your workload, rather than against most of the workloads of most Microsoft customers most of the time. Thanks for reading.

New Cardinality Estimator Part 7 – Single Table, Multiple Predicates and Trace Flags

This week I got a question regarding my post New Cardinality Estimator Part 4 – Single Table and Multiple Predicates. The question was about the possibilities to affect the estimation for multiple predicates done by the new cardinality estimator.

To demonstrate these possibilities we will again use the AdventureWorks2012 database and the SalesOrderHeader table.

SQL Server 2012
Let’s execute the following queries in SQL Server 2012. The first two queries have only one predicate, and in the third query the two predicates are combined with the AND operator:

SELECT * FROM Sales.SalesOrderHeader WHERE OnlineOrderFlag = 0;
SELECT * FROM Sales.SalesOrderHeader WHERE SalesPersonID = 282;
SELECT * FROM Sales.SalesOrderHeader WHERE OnlineOrderFlag = 0 AND SalesPersonID = 282;

Let’s observe the estimation details:

Figure7_1

As we saw in one of the previous posts, SQL Server 2012 assumes that there is no correlation between predicates, and every additional predicate significantly reduces the estimated number of rows. SQL Server estimates (accurately) that 271 orders were made by the sales person with ID 282, but when we additionally filter by the online flag, the estimation decreases to only 33 rows.
However, in this particular case we know that an order having a sales person as an attribute is always made offline, so its OnlineOrderFlag attribute is, from a business point of view, always 0. These two predicates are therefore strongly correlated, and with the assumption of independence between predicates the Query Optimizer cannot make a good estimation. Can we do something to get a better estimation? Yes, we can use the trace flag 4137 to enforce full correlation between the predicates (which results in the lowest cardinality of the individual predicates).

SELECT * FROM Sales.SalesOrderHeader WHERE OnlineOrderFlag = 0 AND SalesPersonID = 282 OPTION (QUERYTRACEON 4137);

Figure7_2

SQL Server 2014
In SQL Server 2014 predicates are considered correlated. They are not independent anymore, but they are also not fully correlated. Here are the estimations for the first three queries made by the new cardinality estimator:

Figure7_3

The estimated number of rows is higher than with SQL Server 2012, but still far away from the actual number of rows. Again, we can use a trace flag. This time the trace flag for enforcing full correlation is 9471.
Since predicates in SQL Server 2014 are treated as partially correlated, if you know that your predicates are actually not correlated there is another flag, 9472, to force the same estimation for multiple predicates as with the old CE. Here are the estimations with the new CE:

Figure7_4
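
As a side note, the gap between all these numbers is easy to reproduce manually. Here is a sketch of the three combination rules, using actual row counts as a stand-in for the statistics-based selectivities (so the results will only approximate what the plans show):

DECLARE @rows FLOAT, @s1 FLOAT, @s2 FLOAT, @smin FLOAT, @smax FLOAT;
SELECT @rows = COUNT_BIG(*) FROM Sales.SalesOrderHeader;
SELECT @s1 = COUNT_BIG(*) / @rows FROM Sales.SalesOrderHeader WHERE SalesPersonID = 282;
SELECT @s2 = COUNT_BIG(*) / @rows FROM Sales.SalesOrderHeader WHERE OnlineOrderFlag = 0;
SELECT @smin = CASE WHEN @s1 < @s2 THEN @s1 ELSE @s2 END,
       @smax = CASE WHEN @s1 < @s2 THEN @s2 ELSE @s1 END;
SELECT @rows * @s1 * @s2           AS independence_old_ce,        -- SQL Server 2012 / TF 9472
       @rows * @smin * SQRT(@smax) AS exponential_backoff_new_ce, -- SQL Server 2014 default
       @rows * @smin               AS full_correlation;           -- TF 4137 / TF 9471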

Conclusion
In SQL Server 2012 predicates are independent. If you want to force full correlation (simple containment) use the trace flag 4137.

SQL Server 2014 assumes that predicates are correlated. To enforce full correlation use the trace flag 9471. For the same estimation as with SQL Server 2012 (no correlation) use the trace flag 9472.

New Cardinality Estimator Part 6 – Simple JOIN And Foreign Key

In this article we will check whether the PK-FK relationship is respected by the new cardinality estimator in the case of a simple JOIN between tables in such a relationship. The old CE did not meet expectations; we hope that the new CE will handle it better.

We will again use the AdventureWorks2012 database and the JOIN between the Production.TransactionHistory and Production.Product tables. The transaction table has a foreign key column (ProductID) which references the key of the product table. So, each row in the transaction table has some value in ProductID, and this value has a corresponding row in the product table.

First we’ll look at how the old CE deals with it. Therefore we set trace flag 9481 at the statement level to instruct the optimizer to use the old CE. Let’s consider the following query:

SELECT  t.TransactionID, t.TransactionDate, t.ActualCost, t.ProductID
FROM Production.TransactionHistory t
INNER JOIN Production.Product p ON t.ProductID = p.ProductID
OPTION (QUERYTRACEON 9481);

Let’s observe the execution plan:

Figure6_1

In the execution plan we can see the transaction table only. Since only columns from this table are returned, the product table is used merely as a filter. However, the optimizer is aware of the foreign key relationship and uses this metadata to determine that it is not required to access the product table at all. This is expected behavior (and the same happens with the new CE), and so far everything looks perfect.
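
As far as I know, this table elimination only happens when the foreign key is enabled and trusted, so it is worth checking that before relying on it (a quick metadata query):

SELECT fk.name, fk.is_disabled, fk.is_not_trusted
FROM sys.foreign_keys AS fk
WHERE fk.parent_object_id = OBJECT_ID('Production.TransactionHistory');
-- both flags must be 0 for the optimizer to skip the Product table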

However, in reports or lists IDs are not welcome; users would rather have descriptions instead of IDs. Therefore we will update our query according to these expectations: we will look up the product name in the product table and show it instead of the product id.

SELECT t.TransactionID, t.TransactionDate, t.ActualCost, p.Name
FROM Production.TransactionHistory t
INNER JOIN Production.Product p ON t.ProductID = p.ProductID
OPTION (QUERYTRACEON 9481);

Let’s observe the plan again:

Figure6_2

This time the product table appears in the plan, which is of course expected – otherwise the query engine would not be able to return the product name. However, the Estimated Number of Rows is underestimated. The fact that we now return the product name in addition to the transaction table attributes does not change the number of returned rows. The product table should not change the cardinality of the join; the final cardinality is simply the cardinality of the transaction table. Each row in the transaction table has exactly one corresponding row in the product table, and this is ensured by the PK-FK relationship. So, the optimizer does not respect the foreign key in this case. This is definitely a bug. It is reported in this Connect item. An estimation improvement can be achieved by using the additional trace flag 2301 to enable advanced optimization decisions.

SELECT t.TransactionID, t.TransactionDate, t.ActualCost, p.Name
FROM Production.TransactionHistory t
INNER JOIN Production.Product p ON t.ProductID = p.ProductID
OPTION (QUERYTRACEON 9481, QUERYTRACEON 2301);

Let’s check the plan:

Figure6_3

The execution plan is of course the same, but the estimation is closer to the actual number of rows. However, it’s still not correct, and in this case (a PK-FK relationship) it should be. We should not expect the implementation of a cardinality estimation model to always come up with accurate estimations – that is not possible – but in this particular case we do expect it, because we provide all the necessary metadata (PK, FK) which can help the optimizer to get correct estimations. The final status of the Connect item is “won’t fix”, but I was curious to see the same query executed under the new CE. Let’s execute it with the new CE (just remove the option at the end of the statement) and look at the execution plan and estimations:

Figure6_4

It seems that it’s fixed in the new version! This is an expected result and it looks promising for this query pattern. In order to see if the same happens for other queries, I executed 86 such queries in the AdventureWorks2012 database (joins of two tables with a PK-FK relationship) with the old and new CE, and here are the results.

Figure6_5

With the old CE only 18 queries (of 86) have an absolutely accurate estimation, while with the new CE 78 queries (more than 90%) have a perfect estimation. This is a significant improvement, but as we can see, the estimated number of rows does not match the actual one for all queries under the new CE. In almost all cases (84 of 86) the new CE comes up with a better or equal estimation compared to the old CE, but for two queries the old CE did better. If you are curious to see when the old CE handles a join with a foreign key relationship better, check the following queries:

SELECT sp.BusinessEntityID, p.LastName
FROM Sales.SalesPerson sp
INNER JOIN Person.Person p ON sp.BusinessEntityID = p.BusinessEntityID
OPTION (QUERYTRACEON 9481);
GO
SELECT sp.BusinessEntityID, p.LastName
FROM Sales.SalesPerson sp
INNER JOIN Person.Person p ON sp.BusinessEntityID = p.BusinessEntityID;

The first query is executed under the old CE and the second with the newest version available in SQL Server 2014 CTP2. Let’s compare the executions:

Figure6_6

In this case the old CE made a correct estimation and the new CE did not handle it well. The query pattern is the same and the tables are small. I am still investigating why this query is so specific that the new CE struggled with it and will post the results in one of the future articles. I am playing with Extended Events sessions, including the XE event described in this post by Bob Beauchemin (blog | twitter), where you can find how to explore cardinality estimation details under the new CE via Extended Events.

New Cardinality Estimator Part 5 – Bug?

In this article I will describe a case which is not properly handled by the new cardinality estimator, introduced in SQL Server 2014 CTP2. The discrepancy between estimated and actual values is significant, the generated execution plans are suboptimal, and the execution details are 5 – 50x worse compared to the old CE plans. Since the performance regression is significant, I would interpret the behavior of the new cardinality estimator in this case as a bug.

First we need a sample table. Let’s create it and populate it with 1M rows. Use the following code to create and populate the test table (the helper function dbo.GetNums was developed by Itzik Ben-Gan and you can download it here).

IF OBJECT_ID('dbo.Orders', 'U') IS NOT NULL DROP TABLE dbo.Orders;
CREATE TABLE dbo.Orders(
id INT IDENTITY(1,1) NOT NULL,
custid INT NOT NULL,
orderdate DATETIME NOT NULL,
amount MONEY NOT NULL,
note CHAR(100) NULL,
duedate DATETIME NULL,
CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (id ASC)
)
GO
-- Fill the table with 1M rows
DECLARE @date_from DATETIME = '19980101';
DECLARE @date_to DATETIME = '20130630';
DECLARE @number_of_rows INT = 1000000;
INSERT INTO dbo.Orders(custid,orderdate,amount, note)
SELECT 1 + ABS(CHECKSUM(NEWID())) % 10000 AS custid,
(
SELECT(@date_from +(ABS(CAST(CAST( NEWID() AS BINARY(8) )AS INT))%CAST((@date_to - @date_from)AS INT)))) orderdate,
50 + ABS(CHECKSUM(NEWID())) % 1000 AS amount,
'note' AS note
FROM dbo.GetNums(1,@number_of_rows)
ORDER BY 1
GO
UPDATE dbo.Orders SET duedate = DATEADD(DAY, 14, orderdate)
GO
CREATE INDEX IXo1 ON dbo.Orders(custid)
GO

Prior to SQL Server 2008 we could find a lot of stored procedures with a comma-separated list of IDs packed into a VARCHAR(4000) or an XML, or with more than 20 parameters. SQL Server 2008 introduced table-valued parameters, which allow us to pack all input parameters into a table structure and use this structure as a stored procedure parameter. Within the stored procedure the table-valued parameter behaves as a table variable and can participate in set-based operations. Therefore, filtering a large table is sometimes implemented not with predicates in the WHERE clause but by joining the table to a table variable. For the purpose of this article we need a query filtering a large persistent table by joining it to a table variable. Let’s define a table variable with two attributes, an integer and a datetime, and populate it with only one row:

DECLARE @T AS TABLE(
custid INT NOT NULL,
ord_date DATETIME NOT NULL
);
INSERT INTO @T(custid,ord_date) VALUES(1, '20070806');

Now we want to extract order details for customers whose ID is in the table variable, but only with the order dates specified in the same variable, i.e.

SELECT o.*
FROM @T t
INNER JOIN dbo.Orders o
ON o.custid = t.custid AND o.orderdate = t.ord_date;

The execution plan and estimation details are shown in the following Figure:

Figure5_1

The plan and estimations are identical for both cardinality estimators. The Estimated Number of Rows output from a table variable did not change in the new CE; it is still assumed to be 1 row. The execution plan shows that SQL Server scans the table variable and for each entry performs an Index Seek to find out whether orders exist for this customer ID, and then for each identified order performs a Key Lookup operation, which is used for additional filtering – to extract only orders with an order date from the table variable – and to get the additional columns. This is the expected execution plan – we are OK with it, both CEs come up with it – La Vita è Bella…

Let’s slightly change our requirements. Instead of the orderdate column we will use the duedate column. The difference between these two columns is that the values in the duedate column are shifted by two weeks. An additional difference is that the duedate column is nullable. That requires us to clarify the requirements for cases where the duedate is not available for some orders. So, we want to return order details for customers whose IDs are in the table variable, but only for orders with a duedate matching the value from the table variable. If the duedate is not available for an order but the customer ID matches the value from the table variable, the order should be returned, too. (I hope it’s clear what I wanted to say here.) Let’s make it clear and write the query:

DECLARE @T AS TABLE(
custid INT NOT NULL,
ord_date DATETIME  NULL
);
INSERT INTO @T(custid,ord_date) VALUES(1, '20090701');
SELECT o.*
FROM @T t
INNER JOIN dbo.Orders o
ON o.custid = t.custid AND (o.duedate= t.ord_date OR o.duedate IS NULL)
OPTION(QUERYTRACEON 9481); --old CE
GO
DECLARE @T AS TABLE(
custid INT NOT NULL,
ord_date DATETIME  NULL
);
INSERT INTO @T(custid,ord_date) VALUES(1, '20090701');
SELECT o.*
FROM @T t
INNER JOIN dbo.Orders o
ON o.custid = t.custid AND (o.duedate= t.ord_date OR o.duedate IS NULL); --new CE

Let’s observe the execution plans and the most important execution details, first with the old CE:

Figure5_2

And now the plan and execution details under new CE:

Figure5_3

The execution details with the new CE definitely don’t look good. Instead of the expected Nested Loops Join operator (expected because we have only one customer id and an additional filter on duedate) we see a Hash Match Join operator handling the join between a table variable with one row and the Orders table. We can see that the Orders table is fully scanned and that the index on custid is completely ignored. As a result we see an estimation discrepancy of six orders of magnitude! Of course, the wrong estimation resulted in 50 times more logical reads and a 5 times longer response time with significant CPU activity. At the same time, the old CE handles this situation correctly and according to expectations. The same problem occurs if a temp table is used instead of a table variable. It also does not help to replace the OR expression with the functions ISNULL or COALESCE – the same issue.

Workaround

Is there any workaround? Yes, we could use the LOOP JOIN query hint and force the optimizer to use a Nested Loops Join operator instead of Hash Match Join. Let’s see if it works:

DECLARE @T AS TABLE(
custid INT NOT NULL,
ord_date DATETIME  NULL
);
INSERT INTO @T(custid,ord_date) VALUES(1, '20090701');
SELECT o.*
FROM @T t
INNER LOOP JOIN dbo.Orders o
ON o.custid = t.custid AND (o.duedate= t.ord_date OR o.duedate IS NULL); --new CE

Let’s observe the execution plan:

Figure5_4

So, the execution plan is identical to the plan generated by the old CE, and the logical reads and execution times are also OK. It seems that we have found a workaround. The only remaining problem is the bad estimation. Our query hint affects the choice of the physical join operator, but all estimations remain untouched. However, in this case it does not matter, judging by the execution parameters: we got the same performance as with the old CE. Intuitively, we suspect that something could still be wrong – we don’t want to see such an estimation discrepancy, even if the plan and execution look OK. Or are estimations irrelevant when we use query hints? No, they are still relevant and can still affect performance. Soon we’ll see how.

Let’s slightly change our query (the last version, with the LOOP JOIN hint) and put ORDER BY o.amount at the end of it. Now let’s execute the modified query and look at the execution plan:

Figure5_5

Adding the ORDER BY to the query affected the execution plan and now we can see a Sort operator. The Sort operator requires memory, and we can see that for this purpose the query requested 228 MB. Just to remind you, our query returns only one row, and since the Sort operator is at the end of the data flow (after all unnecessary rows have been discarded) we would expect a minimal memory grant (1 MB). However, memory is not granted according to our expectations but according to the cardinality estimations, and SQL Server expects 1M rows to be sorted – therefore 228 MB is requested (and granted) for sorting purposes. It actually requests a memory grant large enough to sort the whole Orders table. This waste of memory can be important and can significantly degrade overall performance! So, be careful whenever you see a significant discrepancy between the Estimated and Actual Number of Rows. It could be that you don’t have a problem with it yet, but the potential for one is great.

Conclusion

It seems that the new optimizer does not handle well joins between table variables (or temp tables) and regular tables when nullable columns are involved in the join conditions. Regardless of the number of predicates in the ON clause, if a nullable column is included and its predicate has to return true when the value is missing (NULL), the new CE ignores all the other predicates and selectivities and performs a full table scan. As a result, the execution plan uses a Hash Match Join operator even if only a few rows are returned. In contrast, the old CE estimates well and the execution plans are good. A workaround for this problem could be the LOOP JOIN query hint if there is no operator which requires memory, or to use OPTION(QUERYTRACEON 9481) to instruct the optimizer to use the old cardinality estimator. Or to wait for a fix in the RTM version or a cumulative update.