#0273 – SQL Server – IDENTITY columns – Myths – IDENTITY columns cannot be added to existing tables – Part 02


As seen in the first part of this post, we can add IDENTITY columns to existing tables without any issue. IDENTITY columns cannot be NULL. Hence, when we add a new column as an IDENTITY column, every record existing in the table has to be assigned an IDENTITY value. So, the question arises that which record should be the first one to be assigned with the Identity seed, which record will be the second record and so on. In short:



How existing records are assigned the IDENTITY value?


We saw in the first part of the post that because a table was ordered via a clustered primary key, the IDENTITY specification was directly applied in that order. However, when the table is a heap (i.e. does not have a clustered index defined on it and is therefore unordered), the IDENTITY values are assigned based on the order in which records were inserted into the table.


The script and it’s results below prove this point.

USE tempdb;
GO

SET NOCOUNT ON;

--1. Prepare the environment
--   Create a table, and add some test data into it

--Safety Check
IF OBJECT_ID('tIdentity','U') IS NOT NULL
    DROP TABLE dbo.tIdentity;
GO

--Create a table, this time WITHOUT a Clustered Key
CREATE TABLE dbo.tIdentity (RecordId INT,
                            IdentityValue VARCHAR(20),
                            CONSTRAINT pk_tIdentityRecordId PRIMARY KEY NONCLUSTERED (RecordId)
                           );
GO

--2. Add some test data
INSERT INTO dbo.tIdentity (RecordId, IdentityValue)
VALUES (3, 'Three'),
       (5, 'Five'),
       (1, 'One'),
       (0, 'Zero'),
       (4, 'Four');
GO

--2b. As a test, inserting some interleaving data separately
INSERT INTO dbo.tIdentity (RecordId, IdentityValue)
VALUES (2, 'Two'),
       (6, 'Six');
GO

--3. Check the values inserted into the table
--   Observe the order of the records that come up in the results
SELECT RecordId, IdentityValue
FROM dbo.tIdentity;
GO

--4. Alter the table to add an IDENTITY Column
ALTER TABLE dbo.tIdentity
    ADD IdentityId INT IDENTITY(1,1);
GO    

--4. Check the values inserted into the table
--   Observe the order of the records that come up in the results
--   v/s the value of the IdentityId
SELECT RecordId, IdentityId, IdentityValue
FROM dbo.tIdentity;
GO

----4b.One may also want to check the value of IDENT_CURRENT
--SELECT IDENT_CURRENT('dbo.tIdentity') AS IdentityValueAfterDBCC;
--GO

/**********************************************
               RESULTS
**********************************************/
/*
RecordId    IdentityValue
----------- --------------
3           Three
5           Five
1           One
0           Zero
4           Four
2           Two
6           Six

RecordId    IdentityId  IdentityValue
----------- ----------- --------------
3           4           Three
5           6           Five
1           2           One
0           1           Zero
4           5           Four
2           3           Two
6           7           Six
*/

Now, let us define a CLUSTERED INDEX on this table such that the records are re-arranged. Once the clustered index is defined, we will insert a couple of records and then look at the assignment of the IDENTITY column. We can see that IDENTITY values continue to be assigned in the order in which records are inserted, however, the records are now ordered based on the clustered index specification.

USE tempdb;
GO

-- 5. Define a clustered index for the Primary Key
ALTER TABLE dbo.tIdentity
    DROP CONSTRAINT pk_tIdentityRecordId;
GO

ALTER TABLE dbo.tIdentity
    ADD CONSTRAINT pk_tIdentityRecordId PRIMARY KEY CLUSTERED (RecordId);
GO

--6. As a test, inserting some interleaving data separately
INSERT INTO dbo.tIdentity (RecordId, IdentityValue)
VALUES (-1, 'Minus One'),
       (7, 'Seven'),
       (8, 'Eight');
GO

--7. Check the values inserted into the table
--   Observe the order of the records that come up in the results
--   v/s the value of the IdentityId
SELECT RecordId, IdentityId, IdentityValue
FROM dbo.tIdentity;
GO

----7b.One may also want to check the value of IDENT_CURRENT
--SELECT IDENT_CURRENT('dbo.tIdentity') AS IdentityValueAfterDBCC;
--GO  

--4. Cleanup
IF OBJECT_ID('tIdentity','U') IS NOT NULL
    DROP TABLE dbo.tIdentity;
GO

/**********************************************
               RESULTS
**********************************************/
/*
RecordId    IdentityId  IdentityValue
----------- ----------- --------------
-1          8           Minus One
0           1           Zero
1           2           One
2           3           Two
3           4           Three
4           5           Four
5           6           Five
6           7           Six
7           9           Seven
8           10          Eight
*/

Conclusion (Part 02)


The behavior of IDENTITY columns and of SQL Server seen in this post is quite interesting – IDENTITY values are always generated in the order in which the records are inserted. When adding IDENTITY column to an existing table, this is true only if the table is not ordered, i.e. does not have a clustered index. Once the IDENTITY values have been assigned to all existing records in the table, all future values will be generated in the order of insertion.


Until we meet next time,


Be courteous. Drive responsibly.

Advertisement

1 thought on “#0273 – SQL Server – IDENTITY columns – Myths – IDENTITY columns cannot be added to existing tables – Part 02

  1. Pingback: #0390 – SQL Server – SSIS – OLE DB Destination – Table Fast Load – Keep Identity option | SQLTwins by Nakul Vachhrajani

Let me know what you think about this post by leaving your feedback here!

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.