Category Archives: Imported from BeyondRelational

These posts are imported from my old blog page: http://beyondrelational.com/modules/2/blogs/77/nakuls-blog.aspx

#0183 – SQL Server – MIN_ACTIVE_ROWVERSION – What is the currently active ROWVERSION value?


Helping a team with an implementation has it’s own unique challenges, and the biggest challenge is – they ask questions! I get a kick out of this challenge because it gives me and the teams around me a lot to learn. Currently while helping a team to implement optimistic concurrency, somebody asked me a question:

What is the next value of ROWVERSION at any given instant of time?

The question makes absolute sense. If we can determine the active value of an identity column and determine which value would be used next, there is no reason why we cannot determine the active value of a ROWVERSION column. After all, ROWVERSION is an auto-incrementing number.

The answer comes in the form of a non-deterministic function – MIN_ACTIVE_ROWVERSION. This function returns the lowest active ROWVERSION value in the current database. A ROWVERSION value is active if it is being used in a transaction that has not yet been committed.

Explanation

The definition from Books-On-Line is a little too heavy to digest in one go. So, here’s a break-down of the sentence.

A new ROWVERSION value is typically generated when an INSERT or UPDATE is performed on a table that contains a column using the ROWVERSION data-type. A transaction may be performing multiple INSERTs and UPDATEs – causing multiple changes to the ROWVERSION value. However, the lowest of these values (effectively, the value that was first used when the transaction was started) is what is returned by the function – MIN_ACTIVE_ROWVERSION().

For all practical purposes, this is same as the value of @@DBTS + 1.

Demonstration

Here’s a demonstration of the function – MIN_ACTIVE_ROWVERSION. This demo requires two active sessions to the SQL Server instance.

Let’s begin by creating a test table, and inserting some test data into it. We will use the MIN_ACTIVE_ROWVERSION() to determine the value of the ROWVERSION column that will be used by SQL Server.

USE tempdb
GO

--Safety Check
IF OBJECT_ID('SourceData') IS NOT NULL
BEGIN
    DROP TABLE SourceData
END
GO

--Create the demo (source) table
CREATE TABLE SourceData (RowId INT IDENTITY (1,1),
                         RowVal VARCHAR(20),
                         DataVersion ROWVERSION
                        )
GO

--Fetch the currently active value of ROWVERSION
SELECT MIN_ACTIVE_ROWVERSION() AS MinActiveRowVersion

--Now, insert some test data
INSERT INTO SourceData (RowVal) VALUES ('Windows'),
                                       ('SQL Server'),
                                       ('BizTalk Server'),
                                       ('Exchange Server')
GO

SELECT RowId, RowVal, DataVersion
FROM SourceData

--Fetch the currently active value of ROWVERSION
SELECT MIN_ACTIVE_ROWVERSION() AS MinActiveRowVersion

MIN_ACTIVE_ROWVERSION() gives us the ROWVERSION that will be used next

Now, switch to another session connected to the same SQL Server instance, and execute the following query:

-- In a separate window, begin a transaction, insert some test data, but do not commit
-- Session 02
USE tempdb
GO
BEGIN TRANSACTION MinRowVersion
    --Fetch the currently active value of ROWVERSION
    SELECT MIN_ACTIVE_ROWVERSION() AS MinActiveRowVersion
    
    INSERT INTO SourceData (RowVal) VALUES ('System Center'),
                                           ('Office')
    
    SELECT RowId, RowVal, DataVersion
    FROM SourceData
    
    --Fetch the currently active value of ROWVERSION
    SELECT MIN_ACTIVE_ROWVERSION() AS MinActiveRowVersion
    
--Session 02 - We are NOT performing a COMMIT on the transaction
--COMMIT TRANSACTION MinRowVersion

Value of MIN_ACTIVE_TRANSACTION() within an "open" transaction

Although the transaction inserts two (2) records, the MIN_ACTIVE_ROWVERSION() returns the same value as seen at the end of the statements executed in session #01. This is because the transaction in session #02 is not yet committed.

Committing the transaction in session #02 should now reflect the updated value of ROWVERSION column value.

--Session 02
COMMIT TRANSACTION MinRowVersion

--Return back to Session 01
--Session 01
--Fetch the currently active value of ROWVERSION
SELECT MIN_ACTIVE_ROWVERSION() AS MinActiveRowVersion

--Cleanup
IF OBJECT_ID('SourceData') IS NOT NULL
BEGIN
    DROP TABLE SourceData
END
GO

Once "open" transactions are "committed", the MIN_ACTIVE_ROWVERSION() value changes

References:

Until we meet next time,
Be courteous. Drive responsibly.

#0182 – SQL Server – ROWVERSION – Uniqueness Myth – Possibility of duplicate values exists


Recently, I was helping a team implement optimistic concurrency into their application. In my previous post (SQL Server – DDL – ROWVERSION v/s TIMESTAMP – a key difference), we reviewed a critical DDL difference with respect to table definition between the ROWVERSION and TIMESTAMP columns.

Many DBAs and developers believe that ROWVERSION is a unique value. This statement causes an interpretation that a table with a ROWVERSION column will always have unique values across the database.

The above statement is not accurate – the ROWVERSION column does get a new value during an insert, however, there is a possibility that ROWVERSION columns can hold duplicate values. This can happen during a bulk insert (SELECT…INTO).

Here’s a simple demonstration:

USE tempdb
GO

--Safety Check
IF OBJECT_ID('RowVersionDemo') IS NOT NULL
BEGIN
    DROP TABLE RowVersionDemo
END
GO

IF OBJECT_ID('SourceData') IS NOT NULL
BEGIN
    DROP TABLE SourceData
END
GO

--Create the demo (source) table
CREATE TABLE SourceData (RowId INT IDENTITY (1,1),
                         RowVal VARCHAR(20),
                         DataVersion ROWVERSION
                        )
GO

--Insert some test data
INSERT INTO SourceData (RowVal) VALUES ('Windows'),
                                       ('SQL Server'),
                                       ('BizTalk Server'),
                                       ('Exchange Server')

--Bulk Insert the source data into destination table
SELECT SourceData.RowId, SourceData.RowVal, SourceData.DataVersion
INTO RowVersionDemo FROM SourceData

--Select the data inserted (for demo purposes)
SELECT RowId, RowVal, DataVersion
FROM SourceData

SELECT RowId, RowVal, DataVersion
FROM RowVersionDemo
GO

--Check the data-types of the columns created as part of the BULK-INSERT:
SELECT isc.TABLE_NAME,
       isc.COLUMN_NAME,
       isc.DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS AS isc
WHERE isc.TABLE_NAME = 'RowVersionDemo'
GO

--Cleanup
IF OBJECT_ID('RowVersionDemo') IS NOT NULL
BEGIN
    DROP TABLE RowVersionDemo
END
GO

IF OBJECT_ID('SourceData') IS NOT NULL
BEGIN
    DROP TABLE SourceData
END
GO

0182

We can see here that SQL Server inserted the column data “as-is” into the destination tables. A look at the data-types of the destination table shows us that it has been created as a “timestamp” column, which is a documented bug within Microsoft SQL Server.

In short, using ROWVERSION does not guarantee uniqueness – one can have duplicate values amongst the ROWVERSION columns within the same database.

References:

Until we meet next time,
Be courteous. Drive responsibly.

#0181 – SQL Server – DDL – ROWVERSION v/s TIMESTAMP – a key difference


A couple of years ago when I started blogging, I wrote about an approach for implementing optimistic concurrency in an application (Link). The approach revolves around the storage, retrieval and comparison of the ROWVERSION value – which is an improvement upon the TIMESTAMP data-type, which has been marked as deprecated.

There are a handful of differences between the ROW VERSION and TIMESTAMPS columns. Yet, most accidental DBAs and novice database developers make the mistake of thinking that they are one and the same. While I had mentioned all the differences in my post on optimistic concurrency, here’s a recap:

  1. The ROWVERSION data type is just an incrementing number and does not preserve a date or a time
  2. A table can have only one ROWVERSION column
  3. ROWVERSION should not be used as keys, especially primary keys
  4. Using a SELECT INTO statement has the potential to generate duplicate ROWVERSION values
  5. Finally for those who came from SQL 2005, the TIMESTAMP is deprecated. Please avoid using this in new DDL statements
  6. Unlike TIMESTAMP, the ROWVERSION column needs a column name in the DDL statements

The last difference is what throws off most people – they simply replace all instances of “TIMESTAMP” in their DDL code and expect it to work – which will not be the case.

Here’s a simple demonstration.

A typical TIMESTAMP implementation would look like this:

USE tempdb
GO

--Safety Check
IF OBJECT_ID('RowVersionDemo') IS NOT NULL
BEGIN
    DROP TABLE RowVersionDemo
END
GO

--Create a test table
CREATE TABLE RowVersionDemo (Id INT IDENTITY(1,1),
                             Data VARCHAR(20),
                             TIMESTAMP
                            )
GO

--Insert some test data
INSERT INTO RowVersionDemo (Data) VALUES ('Windows'),
                                         ('SQL Server')
GO

--Fetch Data
SELECT RowVersionDemo.Id, RowVersionDemo.Data, RowVersionDemo.TIMESTAMP
FROM RowVersionDemo

TIMESTAMP value inserted into a table

As you can see, we just placed a TIMESTAMP column in the table definition – without really specifying the column name. Simply replacing all instances of TIMESTAMP with ROWVERSION will result in an error:

--ROWVERSION - ERROR
USE tempdb
GO

--Safety Check
IF OBJECT_ID('RowVersionDemo') IS NOT NULL
BEGIN
    DROP TABLE RowVersionDemo
END
GO

--Create a test table
CREATE TABLE RowVersionDemo (Id INT IDENTITY(1,1),
                             Data VARCHAR(20),
                             ROWVERSION
                            )
GO

Msg 173, Level 15, State 1, Line 5

The definition for column ‘ROWVERSION’ must include a data type.

ROWVERSION is a data-type and needs a column name to be associated with it. Hence, the following will work:

--ROWVERSION - FIXED
USE tempdb
GO

--Safety Check
IF OBJECT_ID('RowVersionDemo') IS NOT NULL
BEGIN
    DROP TABLE RowVersionDemo
END
GO

--Create a test table
CREATE TABLE RowVersionDemo (Id INT IDENTITY(1,1),
                             Data VARCHAR(20),
                             DataVersion ROWVERSION
                            )
GO

--Insert some test data
INSERT INTO RowVersionDemo (Data) VALUES ('Windows'),
                                         ('SQL Server')
GO

--Fetch Data
SELECT RowVersionDemo.Id, RowVersionDemo.Data, RowVersionDemo.DataVersion
FROM RowVersionDemo

ROWVERSION values stored in the database

The above simple demonstration proves that TIMESTAMP and ROWVERSION cannot be used in the same way when defining the database tables. Conversion from TIMESTAMP to ROWVERSION is not a direct find-replace operation.

There are more knowledge nuggets coming up related to the ROWVERSION data-type. Stay tuned to this blog for more!

Further Reading

  • Implementing Optimistic Concurrency using ROWVERSION:

Until we meet next time,
Be courteous. Drive responsibly.

#0180 – SQL Server – Profiler – SQL:StmtCompleted shows parameters, not their values


The SQL Server Profiler is a very powerful diagnostic tool that helps you take a look at what’s happening inside of SQL Server when you execute a query. I hope that you are finding the 10-part tutorial series on the SQL Server Profiler (http://beyondrelational.com/modules/12/tutorials/631/getting-started-with-sql-server-profiler.aspx?tab=tutorials&bs=57&ts=46) useful. The aim of the tutorial is to help you get started with the SQL Server profiler and debug any database issues within your application with ease.

However, the more you explore, the more queries and questions you come up with and attempting to answer them is perhaps the best way to learn.

One of the kind readers asked me a question the other day via the ASK Module (http://beyondrelational.com/ask/nakul/default.aspx) related to the SQL Server Profiler (http://beyondrelational.com/ask/nakul/questions/1557/sql-server-profiler-trace-query.aspx), the answer to which I will discussing in today’s post.

The Question

Assume that one of your applications is executing the following query to get a list of all salaried employees for AdventureWorks Cycles.

USE AdventureWorks2012
GO
SELECT * FROM HumanResources.Employee WHERE SalariedFlag = 1

However, when looked at from the SQL Server Profiler, you see that instead of the value “1”, you see a parameterized version of the query if the “SQL:StmtStarting” or “SQL:StmtCompleted” events are captured:

SELECT * FROM HumanResources.Employee WHERE SalariedFlag = @p1

The actual parameter value (“1”) is available when the “RPC:Completed” event is captured. The question therefore is:

Why do “SQL:StmtStarting” and “SQL:StmtCompleted” not show the parameter values? Is there a way by which they can show the parameter values, so that additional events don’t need to be traced?

The Answer

The question is very interesting, and the answer lies in the internals of how SQL Server prepares itself for multiple executions of the same query in the future. However, let’s answer the second part of the question first. The answer to the second part of the question is quite simply, no. To the best of my knowledge, there is no way that we can have both the parameterized version of the query and it’s values captured the same event – “SQL:StmtStarting” or “SQL:StmtCompleted”.

That being said, let’s look at “why” the parameterized version of the query is seen.

One of the first things that SQL Server’s database engine does is to create a query plan for execution. Preparing a query plan is an expensive operation and therefore, SQL Server caches this plan for future re-use. Subsequent executions of the query will re-use the already available query plan and not spend time in creating a new plan – the specific values used in the comparisons may differ from instance to instance, but the plan itself remains the same.

We are therefore looking at two different components to the execution of a query:

  1. A generic, parameterized query plan – which is what is seen in the “SQL:StmtStarting” and “SQL:StmtCompleted” events
  2. A set of instance-specific variables/parameter values – which is what is seen in the instance specific “RPC:Completed” event

The following is an easy set of queries for better understanding of the concept.

/* WARNING */
/*
This query is provided "as-is" and without warranty.
The query is provided for demonstration purposes only.

PLEASE RUN THIS ON A DEVELOPMENT ENVIRONMENT ONLY.
EXECUTING THIS ON A QA/PRODUCTION ENVIRONMENT MAY RESULT IN SERIOUS PERFORMANCE COMPLICATIONS.
*/
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS

USE AdventureWorks2008R2
GO
DECLARE @sqlstring NVARCHAR(100)
DECLARE @paramDefinition NVARCHAR(50)
DECLARE @paramValue BIT

SET @sqlstring = N'SELECT * FROM HumanResources.Employee WHERE SalariedFlag = @p1'
SET @paramDefinition = N'@p1 BIT'
SET @paramValue = 1

EXEC sp_executesql @sqlstring, @paramDefinition, @p1 = @paramValue

--Check if the SQL Server is re-using the query plan
-- USECOUNTS value should be 1
SELECT usecounts, cacheobjtype, objtype, text 
FROM sys.dm_exec_cached_plans 
CROSS APPLY sys.dm_exec_sql_text(plan_handle) 
WHERE usecounts > 0 AND 
    	text like '%SELECT * FROM HumanResources.Employee%'
ORDER BY usecounts DESC;

/********* RESULTS ****************/
/* (Compressed for sake of brevity)
usecounts   cacheobjtype	objtype	     text
  1         Compiled Plan	Prepared     (@p1 BIT)SELECT * FROM HumanResources.Employee WHERE SalariedFlag = @p1
  1         Compiled Plan	Adhoc        DECLARE @sqlstring ...............
*/

We can see that the version SQL Server has cached is the one with the generalized parameters – not the one with the parameter values.

image

Let us execute the same query again – this time with a different parameter value.

--Execute the query again, but with a different value
SET @paramValue = 0
EXEC sp_executesql @sqlstring, @paramDefinition, @p1 = @paramValue

--Check if the SQL Server is re-using the query plan
-- USECOUNTS value should be 2
SELECT usecounts, cacheobjtype, objtype, text 
FROM sys.dm_exec_cached_plans 
CROSS APPLY sys.dm_exec_sql_text(plan_handle) 
WHERE usecounts > 0 AND 
    	text like '%SELECT * FROM HumanResources.Employee%'
ORDER BY usecounts DESC;

/********* RESULTS ****************/
/* (Compressed for sake of brevity)
usecounts   cacheobjtype	objtype	     text
  2         Compiled Plan	Prepared    (@p1 BIT)SELECT * FROM HumanResources.Employee WHERE SalariedFlag = @p1
  1         Compiled Plan	Adhoc       DECLARE @sqlstring ...............
*/
GO

The results show that the SQL Server was able to reuse the generic query plan.

image

Reference:

Until we meet next time,

Be courteous. Drive responsibly.

#0179-SQL Server-Parameter proofing your stored procedure execution-Using ordinal positions v/s Parameter names


Experiences teach us a lot of things – and one of them is to always take some more time when developing code to make sure it is isolated from future changes that one may make. Today’s post is based on one such incident that recently took place at the office.

How do you pass parameters to your stored procedures?

Stored procedures can be called in multiple ways, but when we talk about passing parameters to a procedure, there are essentially only two ways:

  1. Using ordinal position, for e.g. EXEC proc_MyProcedure 1,2
  2. Using parameter names, for e.g. EXEC proc_MyProcedure @param1 = 1, @param2 = 2

Now, we know that using ordinal positions is not a recommended best practice when using them in a query. Yet, I am quite sure that a large portion of the code currently being used production uses ordinal positions to pass parameters to a stored procedure. I will honestly admit here that I used to be one of the many who are victims of this bad practice – but all it takes is one experience to understand the importance of doing things right the first time around.

An example

Let’s take an example to understand why passing parameters using ordinal positions is not a recommended practice. We will create a simple procedure as shown below.

USE AdventureWorks2012
GO

CREATE PROCEDURE proc_GetEmployeesBySalariedFlagAndOrganizationalLevel
            @salariedFlag BIT,
            @organizationLevel TINYINT
AS
    BEGIN
        SET NOCOUNT ON

        SELECT Employee.BusinessEntityID,
               Employee.OrganizationLevel,
               Employee.SalariedFlag,
               Employee.NationalIDNumber,
               Employee.BirthDate,
               Employee.JobTitle,
               Employee.SickLeaveHours,
               Employee.VacationHours
        FROM HumanResources.Employee
        WHERE Employee.OrganizationLevel = @organizationLevel
          AND Employee.SalariedFlag = @salariedFlag
    END
GO

As you can see, our demo procedure has 2 parameters – a flag indicating whether we want salaried employees or not and the other to indicate the organizational level of the employees that we need to fetch the data for.

Passing parameters using ordinal positions

In order to execute this stored procedure, we can simply use the ordinal positions as shown below:

--Executing using ordinal positions
USE AdventureWorks2012
GO
EXEC proc_GetEmployeesBySalariedFlagAndOrganizationalLevel 0, 4

Execution of the stored procedure returns us some 187 odd records.

image

Passing parameters using parameter names

In order to execute the procedure using parameter names, we can use the following statement (takes more number of key strokes to write, but will reap benefits in the future):

--Executing using parameter names
USE AdventureWorks2012
GO
EXEC proc_GetEmployeesBySalariedFlagAndOrganizationalLevel @salariedFlag = 0, @organizationLevel = 4

Executing this also returns us the same 187 records.

What can go wrong?

As part of a review, assume that the architect asks the developers to swap out the two parameters – the stored procedure remains the same, but the architect believes that the organization level holds a higher priority from a business perspective and therefore requests the change.

USE AdventureWorks2012
GO

ALTER PROCEDURE proc_GetEmployeesBySalariedFlagAndOrganizationalLevel
            @organizationLevel TINYINT,
            @salariedFlag BIT
AS
    BEGIN
        SELECT Employee.BusinessEntityID,
               Employee.OrganizationLevel,
               Employee.SalariedFlag,
               Employee.NationalIDNumber,
               Employee.BirthDate,
               Employee.JobTitle,
               Employee.SickLeaveHours,
               Employee.VacationHours
        FROM HumanResources.Employee
        WHERE Employee.OrganizationLevel = @organizationLevel
          AND Employee.SalariedFlag = @salariedFlag
    END
GO

Which of the two methods to you think would continue to work after the change? Let’s run them and see:

--Executing using ordinal positions
USE AdventureWorks2012
GO
EXEC proc_GetEmployeesBySalariedFlagAndOrganizationalLevel 0, 4

--Executing using parameter names
USE AdventureWorks2012
GO
EXEC proc_GetEmployeesBySalariedFlagAndOrganizationalLevel @salariedFlag = 0, @organizationLevel = 4

Result:

The execution using ordinal positions now returns only 1 result, whereas the execution mode with the parameter names continues to return 187 records.

image

Conclusion

As a matter of fact, if one would observe the prepared statements being fired against a database through a data adapter or via any automated code generator, one would observe that they are always parameterized using parameter names, not using ordinal positions.

In conclusion, all I would say is that using ordinal positions can lead to troubles in case the underlying object changes sometime in the future. Named referencing and parameterization is always the safer approach, and while it takes a lot more key strokes (unwelcome with the development community), they save a lot of rework when the change is actually to be made.

This experience was enough to make me use named parameterization when making stored procedure calls always.

How do you execute your stored procedures – I am interested to know. Do share your observations/thoughts/approaches before you leave.

Until we meet next time,

Be courteous. Drive responsibly.