
SQL Server 2019 CE, Local Variables and Nullable Columns

In this article, I will demonstrate a small change in the Cardinality Estimator component under the latest compatibility level – 150 (SQL Server 2019).

Let’s first create and populate a sample table:

DROP TABLE IF EXISTS dbo.T;
CREATE TABLE dbo.T(
id INT IDENTITY(1,1) NOT NULL,
c1 INT NOT NULL,
c2 INT NULL,
c3 CHAR(5) DEFAULT 'test',
CONSTRAINT PK_T PRIMARY KEY CLUSTERED (id)
);
GO
CREATE INDEX ix1 ON dbo.T(c1);
GO
CREATE INDEX ix2 ON dbo.T(c2);
GO
INSERT INTO dbo.T(c1)
SELECT 1 + ABS(CHECKSUM(NEWID())) % 10000
FROM dbo.GetNums(1000000);
GO
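The population script relies on dbo.GetNums, a numbers (tally) helper function that is not defined in the post. A minimal sketch of such a function, assuming a single-parameter, cross-joined-CTE implementation; only the name and the call signature are taken from the script above:

-- Hypothetical helper: returns the integers 1..@n in a column named n
CREATE OR ALTER FUNCTION dbo.GetNums(@n AS BIGINT) RETURNS TABLE
AS
RETURN
WITH
L0 AS (SELECT 1 AS c UNION ALL SELECT 1),               -- 2 rows
L1 AS (SELECT 1 AS c FROM L0 AS a CROSS JOIN L0 AS b),  -- 4 rows
L2 AS (SELECT 1 AS c FROM L1 AS a CROSS JOIN L1 AS b),  -- 16 rows
L3 AS (SELECT 1 AS c FROM L2 AS a CROSS JOIN L2 AS b),  -- 256 rows
L4 AS (SELECT 1 AS c FROM L3 AS a CROSS JOIN L3 AS b),  -- 65,536 rows
L5 AS (SELECT 1 AS c FROM L4 AS a CROSS JOIN L4 AS b),  -- ~4.3 billion rows
Nums AS (SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS n FROM L5)
SELECT TOP (@n) n
FROM Nums
ORDER BY n;
GO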

--update the c2 column to a non-NULL value for about 5% of the rows
UPDATE dbo.T SET c2 = c1 WHERE c1 >= 9500;
GO
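A quick sanity check of the 5% figure: c1 is uniformly distributed between 1 and 10,000, so values of 9,500 and above should cover roughly 5% of the rows:

SELECT COUNT(*) AS non_null_c2
FROM dbo.T
WHERE c2 IS NOT NULL;
--expected: roughly 50,000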

Ensure that the database runs under the compatibility level 140 and consider the following simple query with a local variable:

ALTER DATABASE New SET COMPATIBILITY_LEVEL = 140;
GO
DECLARE @v INT = 600000;
SELECT * FROM dbo.T WHERE c1 = @v;
Here is its execution plan:

[Figure: execution plan for the query against c1]
This is a well-known situation – an estimation for an equality predicate with a local variable. The estimated number of rows is calculated as the All density attribute from the stats density vector multiplied by Rows from the stats header.

[Figure: stats header and density vector of the ix1 statistics]


SELECT 0.00010002*1000000;
--100.02000000
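The All density and Rows values used above come from the statistics created for the ix1 index (the statistics object shares the index name); you can inspect them with:

DBCC SHOW_STATISTICS ('dbo.T', ix1) WITH STAT_HEADER, DENSITY_VECTOR;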

The c1 column does not allow NULLs. What would the estimation be for a nullable column? The c2 column in our table is nullable. Let's check:

DECLARE @v INT = 600000;
SELECT * FROM dbo.T WHERE c2 = @v;


[Figure: execution plan for the query against c2]

The estimation is significantly higher and even leads to a plan with a Clustered Index Scan! However, it follows the same estimation formula. Let's have a look at the statistics object:

[Figure: stats header and density vector of the ix2 statistics]

By using the same formula, the estimated number of rows is:

SELECT 0.001992032*1000000
--1992.032000000

This is exactly what we see in the execution plan. OK, that was CL 140, let’s see how SQL Server 2019 handles this simple case.

When we switch to CL 150, the plan and estimations for the c1 column (non-nullable) are the same. However, the estimation for the nullable column is changed!

ALTER DATABASE New SET COMPATIBILITY_LEVEL = 150;
GO
DECLARE @v INT = 600000;
SELECT * FROM dbo.T WHERE c2 = @v;
Here is the plan:

[Figure: execution plan for the query against c2 under compatibility level 150]

This time, the estimation is not so high, and we get a more appropriate plan, based on the Nested Loops join operator. But where does 104.397 come from? Let's look at the statistics again:

[Figure: histogram of the ix2 statistics]

Since the column is nullable, the CE applies the All density to the non-NULL rows only. Thus, the formula seems to be All density x (Rows - EQ_ROWS for RANGE_HI_KEY = NULL).

SELECT 0.001992032*(1000000-947697);
--104.189249696
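The 947,697 figure is the EQ_ROWS value of the NULL step (RANGE_HI_KEY = NULL) in the histogram of the ix2 statistics. You can check it, and compare it with the actual NULL count (the two may differ slightly if the statistics were sampled):

DBCC SHOW_STATISTICS ('dbo.T', ix2) WITH HISTOGRAM;
SELECT COUNT(*) AS null_rows FROM dbo.T WHERE c2 IS NULL;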

Well, it is not exactly 104.397, but it is very, very close to it. I did not find official documentation for this, so I cannot confirm that the mentioned calculation is always correct. However, this new behavior makes more sense: the average value from the stats density vector refers to the non-NULL set of values, and it is wrong to multiply it by all rows.

Fun fact – the behavior is actually not new at all; that's exactly how the old CE (under compatibility levels <= 110) handles this case.

I would like to thank my colleague Vladimir Đorđević for making me aware of this behavior.

Thanks for reading.

Quo vadis, or: to hell with the feature that does everything for me so I have to do nothing

This post has too many swear words to be written in English, but I figure it is no small thing that it will at least reach those who speak Serbo-Croatian.

It has become almost unbearable how representatives of many a software vendor (and not only from the marketing branch) sell the newest features, tools and services using the phrase "it all works for you under the hood, you don't have to lift a finger, you don't have to know anything about it, you don't have to program, everything is transparent, two or three clicks, a bit of drag and drop, and the feature will do everything for you". More and more speakers and so-called experts join in, emphasizing first and foremost that some feature is awesome because its user hardly has to know anything and no longer has to worry about the system. And true, he doesn't have to worry, right up until the system smacks him across the forehead.

There are so many new things you don't know that you simply don't know where and how to start learning them, and there is never enough time. And then someone tells you: you don't need to know this at all, it just works by itself; of course you will devote that bit of time to something else, relying completely on that mighty feature. As long as the thing does what the user expects, and does it fast enough, there is no problem at all. But if unwanted effects appear, or the thing is too slow, the company expects you to do something about it. And if you don't know how it works, if you don't understand the concept, have no workaround and haven't prepared for that situation, then you are screwed.

New software versions and new tools are only instruments that can help you reach a production goal: faster development, a scalable system, better performance. But they are just damn tools. They don't work for you or instead of you; you use them, and you are the one who has to work with them. You need to know them well, to know what they can do and how much, and perhaps even more importantly, what they cannot do. And it is worth peeking under the hood to see how it all works, so you can tell whether it is for you at all.

When you publicly say "this feature is awesome, your query will be up to 100 times faster", and you don't say when it won't be (and for most people it damn well won't) and that under some conditions it can even be slower, you have lulled the attendee of your session and potentially steered him toward a bad business decision. He usually has no time to dig further into the topic after the conference: he was there, he heard it from an expert, and he will take it as the truth. And when he runs into difficulties with that feature, the vendor or the speaker will promise him a newer, faster, stronger and better one.

That is why people should be reminded that it is really nice to have great features and tools, and that we should undoubtedly invest in them, but that knowledge is still by far the most important thing; that they should learn and know as much as possible and take a critical view of everything, instead of falling for individual things like teenagers and then being publicly disappointed in blog posts. And that they should invest in knowledge.

 

Workshop: Performance and Query Tuning with SQL Server 2017 and 2019

As part of the SQL Saturday Linz conference, I will deliver a performance tuning workshop on 17th January at the Tabakfabrik in Linz. The full workshop name is Performance and Query Tuning with SQL Server 2017 and 2019. As the name suggests, the workshop covers performance tuning techniques specific to the two most recent SQL Server versions. It is aimed at application and database developers who already work, or plan to work, with the latest versions of SQL Server – 2017 and 2019. The first part of the workshop focuses on execution plans, statistics and the query optimizer. It covers common mistakes made by developers that lead to poorly performing queries and offers tips on how to avoid them.

The second part of the workshop includes a detailed look at Query Store, a powerful troubleshooting tool introduced in SQL Server 2016 and improved in later versions. You will see how to use it not only to troubleshoot, but also to fix regressed queries, and its power will be demonstrated in the context of an upgrade to a new SQL Server version or a change of compatibility level. At the end, you will learn about the database engine improvements in SQL Server 2019 CTP2 grouped under the feature with the most promising name – Intelligent Query Processing.

The workshop will cover the following topics:

  • How SQL Server uses statistics and what are common issues with them
  • Quick overview of index usage; how to create a useful index
  • Typical mistakes made by developers that lead to poor performing queries
  • Troubleshooting parameter sniffing issues in different SQL Server versions
  • Cardinality estimation issues and tuning queries with large tables
  • Tips for writing well-performed queries
  • Query Store as game changer in SQL Server performance troubleshooting
  • Fixing execution plans for regressed queries by using Query Store
  • Improvements in SQL Server 2019 Intelligent Query Processing

Level: 300 – Advanced

Here is the link for registration: https://bit.ly/2rkj4wX.

Hope to see you in Linz!

Rgds,
M.

Altering an Indexed Text Column

In this post, I am going to demonstrate how increasing the size of a text column with an index on it can lead to serious trouble without warning. I tested it on two versions of SQL Server: SQL Server 2016 and 2017, but the same behavior should be expected with older versions; only the maximum allowed index key size would be different (900 instead of 1,700 bytes).

 

Use the following code to create a sample table and populate it with 1M rows:

DROP TABLE IF EXISTS dbo.AlterColumnTest;
CREATE TABLE dbo.AlterColumnTest(
id INT NOT NULL,
descr NVARCHAR(500) NOT NULL DEFAULT 'test',
CONSTRAINT PK_AlterColumnTest PRIMARY KEY CLUSTERED (id ASC)
);
GO
INSERT INTO dbo.AlterColumnTest(id)
SELECT n
FROM dbo.GetNums(1000000)
GO
CREATE INDEX ix1 ON dbo.AlterColumnTest(descr)
GO

As you can see, there is a nonclustered index on the descr column. Starting with SQL Server 2016, the maximum size of a nonclustered index key has been increased to 1,700 bytes. That means the size of an indexed NVARCHAR column cannot exceed 850 characters. However, when you try to increase it to a value greater than 850, the change is accepted. The following command executes successfully:

ALTER TABLE dbo.AlterColumnTest ALTER COLUMN descr NVARCHAR(1000) NOT NULL;
GO 
Commands completed successfully.

 

The command executes instantly since it updates only metadata; existing rows are not touched at all. And there is no warning about potentially exceeding the allowed index key size!
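Since no warning is raised, you may want to find such risky combinations yourself. A hedged sketch that sums the declared maximum byte lengths of nonclustered index key columns and flags indexes whose keys could exceed the 1,700-byte limit (use 900 for versions prior to SQL Server 2016):

SELECT OBJECT_NAME(i.object_id) AS table_name,
       i.name AS index_name,
       SUM(c.max_length) AS max_key_length_bytes
FROM sys.indexes AS i
JOIN sys.index_columns AS ic
  ON ic.object_id = i.object_id AND ic.index_id = i.index_id
JOIN sys.columns AS c
  ON c.object_id = ic.object_id AND c.column_id = ic.column_id
WHERE i.type = 2                  -- nonclustered indexes only
  AND ic.is_included_column = 0   -- key columns only
GROUP BY i.object_id, i.name
HAVING SUM(c.max_length) > 1700;  -- declared key size exceeds the limit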

When you update a row so that it contains an 850-character-long text value, the update works:

UPDATE dbo.AlterColumnTest SET descr = REPLICATE(N'a', 850) WHERE id = 1;

(1 row affected)

However, if the text is longer than this, you will get an error:

UPDATE dbo.AlterColumnTest SET descr = REPLICATE(N'a', 851) WHERE id = 1;

Msg 1946, Level 16, State 3, Line 34
Operation failed. The index entry of length 1702 bytes for the index 'ix1' exceeds the maximum length of 1700 bytes for nonclustered indexes.

OK, so you got an exception, but only after trying to store data longer than the limit, and now you have to alter the text column back to the allowed size (850):

ALTER TABLE dbo.AlterColumnTest ALTER COLUMN descr NVARCHAR(850) NOT NULL;

Instead of the column size being decreased in an instant, metadata-only operation, you will get the following error:

Msg 5074, Level 16, State 1, Line 42
The index 'ix1' is dependent on column 'descr'.
Msg 4922, Level 16, State 9, Line 42
ALTER TABLE ALTER COLUMN descr failed because one or more objects access this column.

To proceed with this action, you need to drop the underlying index, alter the column size, and then rebuild the index!

DROP INDEX IX1 ON dbo.AlterColumnTest;
GO
ALTER TABLE dbo.AlterColumnTest ALTER COLUMN descr NVARCHAR(850) NOT NULL WITH (ONLINE = ON);
GO
CREATE INDEX IX1 ON dbo.AlterColumnTest(descr) WITH (ONLINE = ON);
GO

For large tables, this action can take some time, and during it you will have no index on that column. This time the ALTER TABLE statement is not a metadata-only operation; since the column size is decreased, the action affects all rows of the table, although not a single row holds a value longer than the new length! For the ALTER statement, you can choose between an offline and an online operation. In the first case, it will be faster, but the table will be locked and unavailable for querying and modifications; an online operation leaves the table available for these actions, but it takes more time and resources (a new table is created in the background and then replaces the original one).
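Before decreasing the size, you can also confirm the premise that not a single row holds a value longer than the new limit with a simple check:

SELECT COUNT(*) AS too_long_values
FROM dbo.AlterColumnTest
WHERE LEN(descr) > 850;
--expected: 0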

This is definitely not something you would expect after a successful initial command which increased the column size to 1,000 characters. At least a warning should be shown in this case.

Thanks for reading.

M.

#Tarabica 18 Conference – Slide Decks and Sample Code

As always, I had a great time in Belgrade at the big IT conference in Serbia, #Tarabica18. I was very happy to see so many people in the room for my session SQL Server 2017 za developere and want to thank all the attendees. It was a real pleasure to deliver the session to that audience. Here you can download the slide deck and sample code from it.

Regards,

M.

SQL Community Meeting, 24 May

On Thursday, 24 May 2018, our next SQL Community Meeting takes place at Microsoft Österreich in Vienna, starting at 17:30, with two exciting talks:

Apache Spark on Azure: Introduction to Azure DataBricks

Azure DataBricks is one of the newest services in Microsoft Azure. It provides a fast, easy-to-use and collaborative platform to run Apache Spark workloads in Azure, thereby integrating very well with other Azure services like Data Factory or various storage services. In this session I will give you a short introduction to DataBricks and also show some real-world scenarios.

presented by Gerhard Brückl, MVP, Data Engineer @ pmOne Analytics GmbH

Performance Tuning with SQL Server 2017

Each new SQL Server version brings some performance improvements. In this session, you will see new SQL Server 2017 features that can not only improve the performance of your queries, but also improve the performance and query troubleshooting process.

presented by Miloš Radivojević, MVP, Principal Database Consultant @ bwin

Registration at http://passaustriameeting.eventbrite.de/

 

SQL Server 2017 Developer’s Guide

Together with Dejan Sarka (blog | twitter) and William Durkin (blog | twitter) I have finished writing the SQL Server 2017 Developer’s Guide book.

The book is available at the Packt Publishing portal https://www.packtpub.com/big-data-and-business-intelligence/sql-server-2017-developer%E2%80%99s-guide and Amazon https://www.amazon.com/SQL-Server-2016-Developers-Guide-ebook/dp/B01MS5L01Q.

Here is a very brief book description:

Microsoft SQL Server 2017 is the next big step in the data platform history of Microsoft as it brings in the power of R and Python for machine learning and containerization-based deployment on Windows and Linux. Compared to its predecessor, SQL Server 2017 has evolved with Machine Learning Services, offering R for statistical analysis and Python packages for analytical processing. This book prepares you for more advanced topics by starting with a quick introduction to SQL Server 2017's new features and a recapitulation of the possibilities you may have already explored with previous versions of SQL Server. The next part introduces you to enhancements in the Transact-SQL language and new database engine capabilities and then switches to a completely new technology inside SQL Server: JSON support. We also take a look at the Stretch database, security enhancements, and temporal tables.

Furthermore, the book focuses on implementing advanced topics, including Query Store, columnstore indexes, and In-Memory OLTP. Towards the end of the book, you’ll be introduced to R and how to use the R language with Transact-SQL for data exploration and analysis. You’ll also learn to integrate Python code in SQL Server and graph database implementations along with deployment options on Linux and SQL Server in containers for development and testing.

By the end of this book, you will have the required information to design efficient, high-performance database applications without any hassle.

What You Will Learn

  • Explore the new development features introduced in SQL Server 2017
  • Identify opportunities for In-Memory OLTP technology
  • Use columnstore indexes to get storage and performance improvements
  • Extend database design solutions using temporal tables
  • Exchange JSON data between applications and SQL Server
  • Migrate historical data to Microsoft Azure by using Stretch database
  • Use the new security features to encrypt or mask the data
  • Control the access to the data on the row levels
  • Simplify performance troubleshooting with Query Store
  • Discover the potential of R and Python integration
  • Model complex relationships with the graph databases in SQL Server 2017

I hope you'll enjoy reading it.