Recently, I was helping a team implement optimistic concurrency into their application. In my previous post (SQL Server – DDL – ROWVERSION v/s TIMESTAMP – a key difference), we reviewed a critical DDL difference with respect to table definition between the ROWVERSION and TIMESTAMP columns.
Many DBAs and developers believe that ROWVERSION is a unique value. This statement causes an interpretation that a table with a ROWVERSION column will always have unique values across the database.
The above statement is not accurate – the ROWVERSION column does get a new value during an insert, however, there is a possibility that ROWVERSION columns can hold duplicate values. This can happen during a bulk insert (SELECT…INTO).
Here’s a simple demonstration:
USE tempdb GO --Safety Check IF OBJECT_ID('RowVersionDemo') IS NOT NULL BEGIN DROP TABLE RowVersionDemo END GO IF OBJECT_ID('SourceData') IS NOT NULL BEGIN DROP TABLE SourceData END GO --Create the demo (source) table CREATE TABLE SourceData (RowId INT IDENTITY (1,1), RowVal VARCHAR(20), DataVersion ROWVERSION ) GO --Insert some test data INSERT INTO SourceData (RowVal) VALUES ('Windows'), ('SQL Server'), ('BizTalk Server'), ('Exchange Server') --Bulk Insert the source data into destination table SELECT SourceData.RowId, SourceData.RowVal, SourceData.DataVersion INTO RowVersionDemo FROM SourceData --Select the data inserted (for demo purposes) SELECT RowId, RowVal, DataVersion FROM SourceData SELECT RowId, RowVal, DataVersion FROM RowVersionDemo GO --Check the data-types of the columns created as part of the BULK-INSERT: SELECT isc.TABLE_NAME, isc.COLUMN_NAME, isc.DATA_TYPE FROM INFORMATION_SCHEMA.COLUMNS AS isc WHERE isc.TABLE_NAME = 'RowVersionDemo' GO --Cleanup IF OBJECT_ID('RowVersionDemo') IS NOT NULL BEGIN DROP TABLE RowVersionDemo END GO IF OBJECT_ID('SourceData') IS NOT NULL BEGIN DROP TABLE SourceData END GO
We can see here that SQL Server inserted the column data “as-is” into the destination tables. A look at the data-types of the destination table shows us that it has been created as a “timestamp” column, which is a documented bug within Microsoft SQL Server.
In short, using ROWVERSION does not guarantee uniqueness – one can have duplicate values amongst the ROWVERSION columns within the same database.
References:
- Connect Cases related to ROWVERSION being displayed as “TIMESTAMP”
- Optimistic Concurrency using ROWVERSION: http://beyondrelational.com/modules/2/blogs/77/posts/11268/optimistic-concurrency-a-conceptual-introduction.aspx
Until we meet next time,
Be courteous. Drive responsibly.
If one reads the Books Online entry for rowversion, it clearly states:
> Duplicate rowversion values can be
> generated by using the SELECT INTO
> statement in which a rowversion column
> is in the select list. We do not
> recommend using rowversion in this
> manner.
http://msdn.microsoft.com/en-us/library/ms182776.aspx
LikeLike
@Marc: True. But the sad part is that 80% of the SQL developers out there have a misconception that ROWVERSION are unique values. The intention of this post is to bust the uniqueness myth around ROWVERSION as is indicated by the title.
LikeLike
I find it hard to believe that 80% of SQL developers are using a relatively new feature of the product without at least looking at BOL. If so, then the state of the SQL development community has severely degraded since I joined the industry.
The first thing I learned when learning to program was **RTFM** I am shocked how often blog posts are often word-for-work, copy-and-paste entries from BOL and the community reacts as if it is the first time they’ve seen the information.
You may want to check BOL yourself… check out the differences between SELECT… INTO and bulk insert. They are very different beasts.
LikeLike