As seen in the first part of this post, we can add IDENTITY columns to existing tables without any issue. IDENTITY columns cannot be NULL. Hence, when we add a new column as an IDENTITY column, every record existing in the table has to be assigned an IDENTITY value. So, the question arises that which record should be the first one to be assigned with the Identity seed, which record will be the second record and so on. In short:
How existing records are assigned the IDENTITY value?
We saw in the first part of the post that because a table was ordered via a clustered primary key, the IDENTITY specification was directly applied in that order. However, when the table is a heap (i.e. does not have a clustered index defined on it and is therefore unordered), the IDENTITY values are assigned based on the order in which records were inserted into the table.
The script and it’s results below prove this point.
USE tempdb; GO SET NOCOUNT ON; --1. Prepare the environment -- Create a table, and add some test data into it --Safety Check IF OBJECT_ID('tIdentity','U') IS NOT NULL DROP TABLE dbo.tIdentity; GO --Create a table, this time WITHOUT a Clustered Key CREATE TABLE dbo.tIdentity (RecordId INT, IdentityValue VARCHAR(20), CONSTRAINT pk_tIdentityRecordId PRIMARY KEY NONCLUSTERED (RecordId) ); GO --2. Add some test data INSERT INTO dbo.tIdentity (RecordId, IdentityValue) VALUES (3, 'Three'), (5, 'Five'), (1, 'One'), (0, 'Zero'), (4, 'Four'); GO --2b. As a test, inserting some interleaving data separately INSERT INTO dbo.tIdentity (RecordId, IdentityValue) VALUES (2, 'Two'), (6, 'Six'); GO --3. Check the values inserted into the table -- Observe the order of the records that come up in the results SELECT RecordId, IdentityValue FROM dbo.tIdentity; GO --4. Alter the table to add an IDENTITY Column ALTER TABLE dbo.tIdentity ADD IdentityId INT IDENTITY(1,1); GO --4. Check the values inserted into the table -- Observe the order of the records that come up in the results -- v/s the value of the IdentityId SELECT RecordId, IdentityId, IdentityValue FROM dbo.tIdentity; GO ----4b.One may also want to check the value of IDENT_CURRENT --SELECT IDENT_CURRENT('dbo.tIdentity') AS IdentityValueAfterDBCC; --GO /********************************************** RESULTS **********************************************/ /* RecordId IdentityValue ----------- -------------- 3 Three 5 Five 1 One 0 Zero 4 Four 2 Two 6 Six RecordId IdentityId IdentityValue ----------- ----------- -------------- 3 4 Three 5 6 Five 1 2 One 0 1 Zero 4 5 Four 2 3 Two 6 7 Six */
Now, let us define a CLUSTERED INDEX on this table such that the records are re-arranged. Once the clustered index is defined, we will insert a couple of records and then look at the assignment of the IDENTITY column. We can see that IDENTITY values continue to be assigned in the order in which records are inserted, however, the records are now ordered based on the clustered index specification.
USE tempdb; GO -- 5. Define a clustered index for the Primary Key ALTER TABLE dbo.tIdentity DROP CONSTRAINT pk_tIdentityRecordId; GO ALTER TABLE dbo.tIdentity ADD CONSTRAINT pk_tIdentityRecordId PRIMARY KEY CLUSTERED (RecordId); GO --6. As a test, inserting some interleaving data separately INSERT INTO dbo.tIdentity (RecordId, IdentityValue) VALUES (-1, 'Minus One'), (7, 'Seven'), (8, 'Eight'); GO --7. Check the values inserted into the table -- Observe the order of the records that come up in the results -- v/s the value of the IdentityId SELECT RecordId, IdentityId, IdentityValue FROM dbo.tIdentity; GO ----7b.One may also want to check the value of IDENT_CURRENT --SELECT IDENT_CURRENT('dbo.tIdentity') AS IdentityValueAfterDBCC; --GO --4. Cleanup IF OBJECT_ID('tIdentity','U') IS NOT NULL DROP TABLE dbo.tIdentity; GO /********************************************** RESULTS **********************************************/ /* RecordId IdentityId IdentityValue ----------- ----------- -------------- -1 8 Minus One 0 1 Zero 1 2 One 2 3 Two 3 4 Three 4 5 Four 5 6 Five 6 7 Six 7 9 Seven 8 10 Eight */
Conclusion (Part 02)
The behavior of IDENTITY columns and of SQL Server seen in this post is quite interesting – IDENTITY values are always generated in the order in which the records are inserted. When adding IDENTITY column to an existing table, this is true only if the table is not ordered, i.e. does not have a clustered index. Once the IDENTITY values have been assigned to all existing records in the table, all future values will be generated in the order of insertion.
Until we meet next time,
Pingback: #0390 – SQL Server – SSIS – OLE DB Destination – Table Fast Load – Keep Identity option | SQLTwins by Nakul Vachhrajani