Can You Trust Your Constraints?

Thumb
  • November 05, 2015

Can You Trust Your Constraints?

When a Check Constraint exists on a column, it checks the insertion of new records to determine that the value for that column conforms to the limitations of that constraint.  Check Constraints allow us to define what are acceptable values for that column.  They are of great importance for ensuring Data Integrity. 

For instance, a sales manager may determine that discount levels cannot be greater than 30%, so the DBA puts a Check Constraint on the Discount column to only allow values less than or equal to 30.

            CREATE TABLE Customers

                        (CustID int Identity(1,1)

                        , Name nvarchar(50)

                        , Discount int

                          CONSTRAINT chkDiscount CHECK(Discount <= 30));

 

Now, if a salesperson attempts to enter a record and gives their new customer a big discount, the record will not be inserted and an error will be returned indicating that “The INSERT statement conflicted with the CHECK Constraint ‘chkDiscount’”.

Perfect, we have accomplished our goal of not allowing big discounts.  However, we are still left with a few problems. 

First, if you are adding the constraint to an existing table the newly added constraint will check existing data and if any records violate the constraint, the attempt to add the constraint will fail.  You can add the constraint WITH NOCHECK to avoid checking existing data, but that is probably a bad idea, since data integrity goes down the drain.

CREATE TABLE Customers2

                        (CustID int Identity(1,1)

                        , Name nvarchar(50)

                        , Discount int);

            ALTER TABLE Customers2 WITH NOCHECK

            ADD CONSTRAINT chkDiscount CHECK(Discount <= 30);

 

Of course if no existing records violate the constraint, it can be added with no problem. Also, if you add the constraint at the time you create the table, no possible conflict exists.

 

The second problem occurs when you attempt to insert lots of records into the table, such as in a Bulk event.  In this case, each record that is inserted, must be evaluated against the Check Constraint which causes significant slowdown. 

 

To avoid this, most folks will disable the constraint and then re-enable it after the Bulk Insert. 

This brings us to the third problem and the purpose of this blog.  When you created the constraint in the first place, that constraint was considered to be “Trusted”.  That means that SQL Server knows for a fact that no records in the table contain a value for Discount that is > 30.  Therefore, the query optimizer can quickly evaluate a query like this:

 

SELECT * FROM Customers WHERE Discount = 45;

 

Because there can’t be any records that satisfy the query, the optimizer can do what is called a Constant Scan, basically meaning that it didn’t even bother to read the table because it knew for a fact that the table contained no records that could satisfy it. 

 

But what happens when we disable the constraint, insert some records and then re-enable it? 

 

SQL Server can no longer know whether or not the table contains records that violate the constraint.  Therefore it has to go back to doing a Table Scan or Seek, depending on your indexes; this is considerably slower. 

 

So, how does SQL Server know whether or not to trust the validity of your data?  It checks the “Trusted” property of the constraint.  You can examine that same thing by looking at sys.check_constraints.

 

            SELECT name

                             ,OBJECT_NAME(parent_object_id) AS TableName

                             ,is_not_trusted

            FROM sys.check_constraints;

A “1” in the is_not_trusted column indicates that the constraint is not trusted by the query optimizer and therefore won’t be evaluated using the fast method of a Constant Scan.  A “0” indicates that the constraint can be trusted.

 

So, what’s the answer to our dilemma?  Option 1 is to never disable the constraint so that it will always be trusted.  Not a particularly good option because of the performance hit when adding many records.  Option 2 has to do with how you re-enable the constraint after your Bulk Insert (or whatever).  There are two ways to re-enable a constraint.  The first looks like this:

           

ALTER TABLE Customers CHECK CONSTRAINT ALL;

 

This re-enables the constraint without validating the existing records against the constraint.

 

The second is:

 

ALTER TABLE Customers WITH CHECK CHECK CONSTRAINT ALL;

 

This re-enables the constraint and verifies that all existing records comply with the constraint.  Of course, that will make re-enabling the constraint slower, but it also set the is_not_trusted property of the constraint back to “0” so that the optimizer can go back to using a Constant Scan on your table, making queries run faster.

 

Hope that helps!

Previous Post Next Post
Hit button to validate captcha