#0282 – SQL Server – T-SQL – Best Practice – Evaluate costliest condition towards the end of the logical AND operation


Knowing the internals of how a system works has always fascinated me. Although a bit theoretical, knowing about query evaluation is a very exciting topic and I attempt to learn something new about it every now and then.

One of my seniors had a mantra which I remember till date – when evaluating a logical “AND” condition, the costliest condition should always be the last one to be evaluated. Today, I will try to demonstrate why this is a good practice.

The evaluation of the logical AND is actually quite simple. A set of checks evaluate to FALSE even if one of the checks returns FALSE. This simple rule means that the AND evaluation must be a “short-circuit” operation, i.e. as soon as a condition is encountered which determines the end result, all checks beyond that point can be ignored.

The test

The test below demonstrates this concept. What I have in the script is a simple query that returns results based on the evaluation of two conditions – one of which is bound to raise an exception.

USE AdventureWorks2008R2;
GO
BEGIN TRY
    SELECT *
    FROM HumanResources.Employee
    WHERE (1 = 0)    --This one is false making the entire AND condition FALSE
      AND (1/0 = 0); --This one results in an exception
PRINT 'Execution was successful.';
END TRY
BEGIN CATCH
PRINT 'All conditions were evaluated resulting in an exception!'
+ CHAR(10) + 'Error Message : ' + ERROR_MESSAGE()
+ CHAR(10) + 'Error Number : ' + CAST(ERROR_NUMBER() AS VARCHAR(10));
END CATCH
GO

/*********************
RESULTS
*********************/
/*
(0 row(s) affected)
Execution was successful.
*/

In this first query, the first condition itself evaluates to FALSE. Hence, there is no reason for the evaluation of the logical AND to continue. The query therefore returns no results. Had both conditions been evaluated, we would have received an exception.

USE AdventureWorks2008R2;
GO
BEGIN TRY
    SELECT *
    FROM HumanResources.Employee
    WHERE (1 = 1)    --This one is TRUE,
                     --because of which the next condition will be evaluated
      AND (1/0 = 0); --This one results in an exception
PRINT 'Execution was successful.';
END TRY
BEGIN CATCH
PRINT 'All conditions were evaluated resulting in an exception!'
+ CHAR(10) + 'Error Message : ' + ERROR_MESSAGE()
+ CHAR(10) + 'Error Number : ' + CAST(ERROR_NUMBER() AS VARCHAR(10));
END CATCH
GO
/*********************
RESULTS
*********************/
/*
(0 row(s) affected)
All conditions were evaluated resulting in an exception!
Error Message : Divide by zero error encountered.
Error Number : 8134
*/

Because the first condition retuned a TRUE in the query above, it returned an exception because the evaluation of the logical AND operation had to evaluate both conditions.

Conclusion

This simple test proves that the logical AND operation is a “short-circuit” operation. Therefore, the costliest condition (maybe a sub-query) in a set to be evaluated with a logical AND must be placed towards the end of the sequence to prevent un-necessary evaluations. Only if the preceding conditions are TRUE, will the more complex conditions be evaluated under this logic.

Until we meet next time,

Be courteous. Drive responsibly.

Advertisement

4 thoughts on “#0282 – SQL Server – T-SQL – Best Practice – Evaluate costliest condition towards the end of the logical AND operation

  1. harsh

    When I am swapping conditions in 1st query, still it is give same result.

    SELECT *
    FROM HumanResources.Employee
    WHERE (1/0 = 0)
    AND (1 = 0);

    PRINT ‘Execution was successful.’;

    Like

    Reply
  2. Nakul Vachhrajani

    @harsh: Check the execution plans (estimated plan will also be fine):

    USE AdventureWorks2008R2;
    GO
    SELECT *
    FROM HumanResources.Employee
    WHERE (1 = 0) AND (1/0 = 0);

    You will find that SQL Server does a constant scan – it does not even go to the table (this can be seen from the Actual Execution plan also).

    However, in the following case, SQL Server attempts to evaluate the entire query:

    USE AdventureWorks2008R2;
    GO
    SELECT *
    FROM HumanResources.Employee
    WHERE (1 = 1) AND (1/0 = 0);

    Considering this, let’s attempt to run a batch of 2 queries – one where the short-circuit should prevent evaluation of a complex sub-query, and other in which it would be executed:

    USE AdventureWorks2008R2;
    GO
    SELECT *
    FROM HumanResources.Employee
    WHERE (1 = 0) AND ((SELECT EmailPromotion
    FROM Person.Person
    WHERE Person.BusinessEntityID = Employee.BusinessEntityID) = 1);
    GO

    SELECT *
    FROM HumanResources.Employee
    WHERE (1 = 1) AND ((SELECT EmailPromotion
    FROM Person.Person
    WHERE Person.BusinessEntityID = Employee.BusinessEntityID) = 1);
    GO

    You can try executing these queries and validate the execution plan & performance to understand the importance of the placement.

    Like

    Reply
  3. harsh

    It is not guaranteed that placement of conditions affects result and/or performance.
    1)

    SELECT *
    FROM HumanResources.Employee
    WHERE (1 = 0) AND (1/0 = 0);
    SELECT *
    FROM HumanResources.Employee
    WHERE (1/0 = 0) AND (1 = 0);

    Here, Result is same and execution plan is same.

    2)

    SELECT *
    FROM HumanResources.Employee
    WHERE (1 = 0) AND ((SELECT EmailPromotion
    FROM Person.Person
    WHERE Person.BusinessEntityID = Employee.BusinessEntityID) = 1);

    SELECT *
    FROM HumanResources.Employee
    WHERE ((SELECT EmailPromotion
    FROM Person.Person
    WHERE Person.BusinessEntityID = Employee.BusinessEntityID) = 1)
    AND (1 = 0)

    Again, result is same and execution plan also. Even, 1=0 condition is written second, SQL Server performs constant scan first and don’t scan table.

    3)

    SELECT *
    FROM HumanResources.Employee
    WHERE (BusinessEntityID = 0) AND ((SELECT EmailPromotion
    FROM Person.Person
    WHERE Person.BusinessEntityID = Employee.BusinessEntityID) = 1);

    SELECT *
    FROM HumanResources.Employee
    WHERE ((SELECT EmailPromotion
    FROM Person.Person
    WHERE Person.BusinessEntityID = Employee.BusinessEntityID) = 1)
    AND (BusinessEntityID = 0)

    (Note: I have not checked this queries with Actual AdentureWorks DB.I have assumed no records will have BusinessEntityID =0 )

    Here, even if first criteria (BusinessEntityID =0) evaluate to FALSE , still it perform 2nd criteria. And also, this time its not constant scan.

    4)

    SELECT * FROM TableName WHERE ISNUMERIC(ColumnName)=1 AND CAST(ColumnName AS DECIMAL)>0

    SELECT * FROM TableName WHERE CAST(ColumnName AS DECIMAL)>0 AND ISNUMERIC(ColumnName)=1

    Here, it actually affects result.

    If column has any non-numeric data, then 2nd query will throw error.

    Like

    Reply
  4. Nakul Vachhrajani

    @harsh: I guess we are both saying the same thing, but in different ways.

    Because the logical AND is a short-circuit operator, in most cases one can gain some benefit by writing a query such that the database engine can easily determine whether to continue the evaluation of all the expressions in the AND or not. This is why I propose that the most complex query, which may not be the most decisive may be placed towards the end.

    Of course, these recommendations depend upon the nature of data in a system and the design of the quries (for example, if all conditions are simple comparisions, one may not gain at all. But, if sub-queries are involved, a gain may be seen if the preceeding conditions tend to evaluate to “false” often).

    Like

    Reply

Let me know what you think about this post by leaving your feedback here!

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.