Tag Archives: Series

Articles that are too big to fit into one part.

#0273 – SQL Server – IDENTITY columns – Myths – IDENTITY columns cannot be added to existing tables – Part 02


As seen in the first part of this post, we can add IDENTITY columns to existing tables without any issue. IDENTITY columns cannot be NULL. Hence, when we add a new column as an IDENTITY column, every record existing in the table has to be assigned an IDENTITY value. So, the question arises that which record should be the first one to be assigned with the Identity seed, which record will be the second record and so on. In short:



How existing records are assigned the IDENTITY value?


We saw in the first part of the post that because a table was ordered via a clustered primary key, the IDENTITY specification was directly applied in that order. However, when the table is a heap (i.e. does not have a clustered index defined on it and is therefore unordered), the IDENTITY values are assigned based on the order in which records were inserted into the table.


The script and it’s results below prove this point.

USE tempdb;
GO

SET NOCOUNT ON;

--1. Prepare the environment
--   Create a table, and add some test data into it

--Safety Check
IF OBJECT_ID('tIdentity','U') IS NOT NULL
    DROP TABLE dbo.tIdentity;
GO

--Create a table, this time WITHOUT a Clustered Key
CREATE TABLE dbo.tIdentity (RecordId INT,
                            IdentityValue VARCHAR(20),
                            CONSTRAINT pk_tIdentityRecordId PRIMARY KEY NONCLUSTERED (RecordId)
                           );
GO

--2. Add some test data
INSERT INTO dbo.tIdentity (RecordId, IdentityValue)
VALUES (3, 'Three'),
       (5, 'Five'),
       (1, 'One'),
       (0, 'Zero'),
       (4, 'Four');
GO

--2b. As a test, inserting some interleaving data separately
INSERT INTO dbo.tIdentity (RecordId, IdentityValue)
VALUES (2, 'Two'),
       (6, 'Six');
GO

--3. Check the values inserted into the table
--   Observe the order of the records that come up in the results
SELECT RecordId, IdentityValue
FROM dbo.tIdentity;
GO

--4. Alter the table to add an IDENTITY Column
ALTER TABLE dbo.tIdentity
    ADD IdentityId INT IDENTITY(1,1);
GO    

--4. Check the values inserted into the table
--   Observe the order of the records that come up in the results
--   v/s the value of the IdentityId
SELECT RecordId, IdentityId, IdentityValue
FROM dbo.tIdentity;
GO

----4b.One may also want to check the value of IDENT_CURRENT
--SELECT IDENT_CURRENT('dbo.tIdentity') AS IdentityValueAfterDBCC;
--GO

/**********************************************
               RESULTS
**********************************************/
/*
RecordId    IdentityValue
----------- --------------
3           Three
5           Five
1           One
0           Zero
4           Four
2           Two
6           Six

RecordId    IdentityId  IdentityValue
----------- ----------- --------------
3           4           Three
5           6           Five
1           2           One
0           1           Zero
4           5           Four
2           3           Two
6           7           Six
*/

Now, let us define a CLUSTERED INDEX on this table such that the records are re-arranged. Once the clustered index is defined, we will insert a couple of records and then look at the assignment of the IDENTITY column. We can see that IDENTITY values continue to be assigned in the order in which records are inserted, however, the records are now ordered based on the clustered index specification.

USE tempdb;
GO

-- 5. Define a clustered index for the Primary Key
ALTER TABLE dbo.tIdentity
    DROP CONSTRAINT pk_tIdentityRecordId;
GO

ALTER TABLE dbo.tIdentity
    ADD CONSTRAINT pk_tIdentityRecordId PRIMARY KEY CLUSTERED (RecordId);
GO

--6. As a test, inserting some interleaving data separately
INSERT INTO dbo.tIdentity (RecordId, IdentityValue)
VALUES (-1, 'Minus One'),
       (7, 'Seven'),
       (8, 'Eight');
GO

--7. Check the values inserted into the table
--   Observe the order of the records that come up in the results
--   v/s the value of the IdentityId
SELECT RecordId, IdentityId, IdentityValue
FROM dbo.tIdentity;
GO

----7b.One may also want to check the value of IDENT_CURRENT
--SELECT IDENT_CURRENT('dbo.tIdentity') AS IdentityValueAfterDBCC;
--GO  

--4. Cleanup
IF OBJECT_ID('tIdentity','U') IS NOT NULL
    DROP TABLE dbo.tIdentity;
GO

/**********************************************
               RESULTS
**********************************************/
/*
RecordId    IdentityId  IdentityValue
----------- ----------- --------------
-1          8           Minus One
0           1           Zero
1           2           One
2           3           Two
3           4           Three
4           5           Four
5           6           Five
6           7           Six
7           9           Seven
8           10          Eight
*/

Conclusion (Part 02)


The behavior of IDENTITY columns and of SQL Server seen in this post is quite interesting – IDENTITY values are always generated in the order in which the records are inserted. When adding IDENTITY column to an existing table, this is true only if the table is not ordered, i.e. does not have a clustered index. Once the IDENTITY values have been assigned to all existing records in the table, all future values will be generated in the order of insertion.


Until we meet next time,


Be courteous. Drive responsibly.

#0272 – SQL Server – IDENTITY columns – Myths – IDENTITY columns cannot be added to existing tables – Part 01


Generally, IDENTITY columns are used when developing new tables so that when records are inserted into a table, they are assigned the appropriate values.


Hence if ever the need arises to insert an IDENTITY column into an existing question, the big question is – can it be done?



Is it really possible to add an IDENTITY column to an existing table?


The answer is quite simple – Yes!


The only restrictions that ALTER TABLE enforces upon us are that a new column must be added and that it should be of a data-type compatible with IDENTITY Columns. To demonstrate these points, I will perform two experiments shown below.


Modifying an existing column to an IDENTITY column


Other than recreating the entire table, an existing column cannot be modified to an IDENTITY column. Copy-Paste the query below into SSMS and simply parse it by using Ctrl+ F5 key combination or the blue check mark on the SSMS toolbar. The following error will be encountered:

USE tempdb;
GO

SET NOCOUNT ON;

--1. Prepare the environment
--   Create a table, and add some test data into it

--Safety Check
IF OBJECT_ID('tIdentity','U') IS NOT NULL
    DROP TABLE dbo.tIdentity;
GO

--Create a table
CREATE TABLE dbo.tIdentity (RecordId INT,
                            IdentityValue VARCHAR(20)
                           );
GO

--2. Add some test data
INSERT INTO dbo.tIdentity (RecordId, IdentityValue)
VALUES (3, 'Three'),
       (2, 'Two'),
       (1, 'One'),
       (0, 'Zero'),
       (4, 'Four'),
       (5, 'Five'),
       (6, 'Six');
GO

--3. Check the values inserted into the table
SELECT RecordId, IdentityValue
FROM dbo.tIdentity;
GO

--4. Alter the table to add an IDENTITY Column
ALTER TABLE dbo.tIdentity
    ALTER COLUMN RecordId INT NOT NULL IDENTITY(1,1) PRIMARY KEY CLUSTERED;
GO    

--4. Check the values inserted into the table
SELECT RecordId, IdentityValue
FROM dbo.tIdentity;
GO

----4b.One may also want to check the value of IDENT_CURRENT
--SELECT IDENT_CURRENT('dbo.tIdentity') AS IdentityValueAfterDBCC;
--GO

--4. Cleanup
IF OBJECT_ID('tIdentity','U') IS NOT NULL
    DROP TABLE dbo.tIdentity;
GO

/**********************************************
               RESULTS
**********************************************/
/*
Msg 156, Level 15, State 1, Line 4
Incorrect syntax near the keyword 'IDENTITY'.
*/

Tip: If your design ever needs to be modified to convert an existing column into an IDENTITY column, this can be achieved using the following high-level steps:



  1. Create an identical table under a different name, but with an IDENTITY column (e.g. tmp_tIdentity)
  2. Set IDENTITY_INSERT to ON for this new table
  3. Pump data from the old table to the new table (i.e. from tIdentity to tmp_tIdentity)
  4. Set IDENTITY_INSERT to OFF for this new table
  5. Drop the old table
  6. Rename the newly created table (i.e. rename from tmp_tIdentity to tIdentity)

Adding a new Identity column to an existing table


While existing columns cannot be converted to IDENTITY columns, new IDENTITY column can always be added to an existing table (provided there are no other IDENTITY columns on it).


The below script quite simply creates a table and then add an IDENTITY column to it. As can be seen from the results below, although the records were inserted randomly, the identity values are assigned based on the way in which the records are ordered. This is an important consideration, as we will see in part 02 of this post. In this case, we have a clustered key on the RecordId column and therefore, the values in the IDENTITY column follow the same order as the RecordId.

USE tempdb;
GO

SET NOCOUNT ON;

--1. Prepare the environment
--   Create a table, and add some test data into it

--Safety Check
IF OBJECT_ID('tIdentity','U') IS NOT NULL
    DROP TABLE dbo.tIdentity;
GO

--Create a table
CREATE TABLE dbo.tIdentity (RecordId INT PRIMARY KEY CLUSTERED,
                            IdentityValue VARCHAR(20)
                           );
GO

--2. Add some test data
INSERT INTO dbo.tIdentity (RecordId, IdentityValue)
VALUES (3, 'Three'),
       (5, 'Five'),
       (1, 'One'),
       (0, 'Zero'),
       (4, 'Four'),
       (2, 'Two'),
       (6, 'Six');
GO

--3. Check the values inserted into the table
SELECT RecordId, IdentityValue
FROM dbo.tIdentity;
GO

--4. Alter the table to add an IDENTITY Column
ALTER TABLE dbo.tIdentity
    ADD IdentityId INT IDENTITY(1,1);
GO    

--4. Check the values inserted into the table
SELECT RecordId, IdentityId, IdentityValue
FROM dbo.tIdentity;
GO

----4b.One may also want to check the value of IDENT_CURRENT
--SELECT IDENT_CURRENT('dbo.tIdentity') AS IdentityValueAfterDBCC;
--GO

--4. Cleanup
IF OBJECT_ID('tIdentity','U') IS NOT NULL
    DROP TABLE dbo.tIdentity;
GO

/**********************************************
               RESULTS
**********************************************/
/*
--Before adding the IDENTITY value
RecordId    IdentityValue
----------- --------------
0           Zero
1           One
2           Two
3           Three
4           Four
5           Five
6           Six

--After adding the IDENTITY value
RecordId    IdentityId  IdentityValue
----------- ----------- --------------
0           1           Zero
1           2           One
2           3           Two
3           4           Three
4           5           Four
5           6           Five
6           7           Six
*/

Conclusion (Part 01)


The above experiments show that Microsoft SQL Server allows us to define IDENTITY columns on existing tables, under certain restrictions. This can be a very useful feature in case an original, legacy design needs to be changed for a universal adoption of IDENTITY columns in your product.


Until we meet next time,


Be courteous. Drive responsibly.

#0271 – SQL Server – IDENTITY columns – Myths – IDENTITY values (including SEED & INCREMENT) cannot be negative


When one thinks of an incremental number, a series of positive numbers comes to mind. Since IDENTITY columns are an auto-incrementing series of numbers, it is human to associate IDENTITY columns with positive numbers. However when we talk about numbers and the number line, we can go to infinity from both sides of 0 (-Infinity….-5, -4, -3, -2, -1, 0, +1, +2, +3, +4, +5,….+Infinity).


Hence, the question that comes up is whether IDENTITY values (including the seed & increment values) can be negative. The only way to test it out is to use an example:

USE tempdb;
GO

SET NOCOUNT ON;

--1. Prepare the environment
--   Create a table, and add some test data into it

--Safety Check
IF OBJECT_ID('tIdentity','U') IS NOT NULL
    DROP TABLE dbo.tIdentity;
GO

--Create a table
CREATE TABLE dbo.tIdentity (IdentityId INT IDENTITY (-1, -1),
                            IdentityValue VARCHAR(20)
                           );
GO

--2. Add some test data
INSERT INTO dbo.tIdentity (IdentityValue)
VALUES ('Minus One'),
       ('Minus Two'),
       ('Minus Three'),
       ('Minus Four'),
       ('Minus Five');
GO

--3. Check the values inserted into the table
SELECT IdentityId, IdentityValue
FROM dbo.tIdentity;

----3b.One may also want to check the value of IDENT_CURRENT
--SELECT IDENT_CURRENT('dbo.tIdentity') AS IdentityValue;
--GO

--4. Cleanup
IF OBJECT_ID('tIdentity','U') IS NOT NULL
    DROP TABLE dbo.tIdentity;
GO

/**********************************************
               RESULTS
**********************************************/
/*
IdentityId  IdentityValue
----------- --------------
-1          Minus One
-2          Minus Two
-3          Minus Three
-4          Minus Four
-5          Minus Five
*/

As can be seen from the result of the script above, it is possible for the SEED, INCREMENT and the IDENTITY column value itself to be negative. Keeping in mind the rules of the number line (from elementary school), please note that as one moves away from 0 (in either direction), the number keeps increasing in value and is therefore incrementing.


Conclusion


As shown by the quick experiment above, it is quite possible to have negative values for IDENTITY columns. As of writing this post, I do not have a practical implementation of this in mind. However, if you have ever encountered a situation where auto-incrementing negative values might be useful, do drop a line in the blog comments below.


Until we meet next time,


Be courteous. Drive responsibly.

#0270 – SQL Server – IDENTITY columns – Myths – The value for SEED and INCREMENT must be 1


When I interview students and ask them about IDENTITY values, they are often under the impression that they have a fixed starting value (i.e. SEED) and increment (i.e. IDENTITY values go from 1, 2, 3, 4, 5… and so on).


For those who read my Introduction to this series on IDENTITY values, I had mentioned that it is a common implementation to have the IDENTITY value increment by 1, but it is not the only implementation possible. Let’s look at this by means of some examples:


SEED & INCREMENT values – DEFAULT behavior


It might come as a surprise to many that the customary definition of IDENTITY (1,1) is used for consistency purposes only. Simply defining a column as IDENTITY would cause the database engine to assume a default value of 1 for the seed and identity.


The script below shows this by not specifying anything for the seed & increment values when defining the IDENTITY column.

USE tempdb;
GO

SET NOCOUNT ON;

--1. Prepare the environment
--   Create a table, and add some test data into it

--Safety Check
IF OBJECT_ID('tIdentity','U') IS NOT NULL
    DROP TABLE dbo.tIdentity;
GO

--Create a table
--NOTE! No values have been explicitly defined for the SEED & INCREMENT
CREATE TABLE dbo.tIdentity (IdentityId INT IDENTITY,
                            IdentityValue VARCHAR(10)
                           );
GO

--2. Add some test data
INSERT INTO dbo.tIdentity (IdentityValue)
VALUES ('One'),
       ('Two'),
       ('Three');
GO

--3. Check the values inserted into the table
SELECT IdentityId, IdentityValue
FROM dbo.tIdentity;

----3b.One may also want to check the value of IDENT_CURRENT
--SELECT IDENT_CURRENT('dbo.tIdentity') AS IdentityValueAfterDBCC;
--GO

--4. Cleanup
IF OBJECT_ID('tIdentity','U') IS NOT NULL
    DROP TABLE dbo.tIdentity;
GO

/**********************************************
               RESULTS
**********************************************/
/*
IdentityId  IdentityValue
----------- -------------
1           One
2           Two
3           Three
*/

As can be seen from the results shown above, the database engine used default values of 1 for both seed & increment.


SEED & INCREMENT values – Custom values


Now, let us attempt a similar experiment but with a seed value of 10 and an increment value of 5.

USE tempdb;
GO

SET NOCOUNT ON;

--1. Prepare the environment
--   Create a table, and add some test data into it

--Safety Check
IF OBJECT_ID('tIdentity','U') IS NOT NULL
    DROP TABLE dbo.tIdentity;
GO

--Create a table
CREATE TABLE dbo.tIdentity (IdentityId INT IDENTITY (10, 5),
                            IdentityValue VARCHAR(10)
                           );
GO

--2. Add some test data
INSERT INTO dbo.tIdentity (IdentityValue)
VALUES ('Ten'),
       ('Fifteen'),
       ('Twenty');
GO

--3. Check the values inserted into the table
SELECT IdentityId, IdentityValue
FROM dbo.tIdentity;

----3b.One may also want to check the value of IDENT_CURRENT
--SELECT IDENT_CURRENT('dbo.tIdentity') AS IdentityValueAfterDBCC;
--GO

--4. Cleanup
IF OBJECT_ID('tIdentity','U') IS NOT NULL
    DROP TABLE dbo.tIdentity;
GO

/**********************************************
               RESULTS
**********************************************/
/*
IdentityId  IdentityValue
----------- -------------
10          Ten
15          Fifteen
20          Twenty
*/

As can be seen from the results shown above, the identity columns worked perfectly fine with seed and increment values other than 1.


The only limitation that this approach might end up with is the limited range of allowed values. For example, if the seed is kept at 1000, the column can accommodate an equivalent number of lesser records before the max. value for the given data-type is reached.


Conclusion


As shown by the experiments above, it is not necessary for any application to use the default values of IDENTITY columns – they can be set to values acceptable by the business and domain rules applicable to the application.


Until we meet next time,


Be courteous. Drive responsibly.

#0269 – SQL Server – IDENTITY columns – Myths – Duplicate Values cannot exist IDENTITY columns


The SQL Server database engine automatically generates the next value for use by the IDENTITY column on a table. This often leads to a misconception that IDENTITY columns always contain unique values.


The truth is that they don’t have to be unique. It is perfectly valid to have duplicate values in IDENTITY columns and today, I present two such scenarios under which this is possible:


Duplicate values in IDENTITY columns – Explicit Inserts


As already know, explicit values can be inserted into IDENTITY columns. It is therefore possible to insert duplicate values in this scenario:

USE tempdb;
GO

SET NOCOUNT ON;

--1. Prepare the environment
--   Create a table, and add some test data into it

--Safety Check
IF OBJECT_ID('tIdentity','U') IS NOT NULL
    DROP TABLE dbo.tIdentity;
GO

--Create a table
CREATE TABLE dbo.tIdentity (IdentityId INT IDENTITY(1,1),
                            IdentityValue VARCHAR(10)
                           );
GO

--Add some test data
INSERT INTO dbo.tIdentity (IdentityValue)
VALUES ('One'),
       ('Two'),
       ('Three');
GO

--2. Check the current identity value of the table using IDENT_CURRENT()
SELECT IDENT_CURRENT('dbo.tIdentity') AS IdentityValueBeforeTest;
GO

--3. Turn Explicit IDENTITY_INSERT ON and insert duplicate IDENTITY values
SET IDENTITY_INSERT dbo.tIdentity ON;
GO

INSERT INTO dbo.tIdentity (IdentityId, IdentityValue)
VALUES (2, 'Two2'),
       (3, 'Three2');
GO

SET IDENTITY_INSERT dbo.tIdentity OFF;
GO

--4. Check the current identity value of the table using IDENT_CURRENT()
SELECT IDENT_CURRENT('dbo.tIdentity') AS IdentityValueAfterExplicitInsert;
GO

SELECT * FROM dbo.tIdentity;
GO

--5. Run DBCC CHECKIDENT and see what happens
DBCC CHECKIDENT ('dbo.tIdentity');
GO
SELECT IDENT_CURRENT('dbo.tIdentity') AS IdentityValueAfterDBCC;
GO

--Cleanup
IF OBJECT_ID('tIdentity','U') IS NOT NULL
    DROP TABLE dbo.tIdentity;
GO

/**********************************************
               RESULTS
**********************************************/
/*
IdentityValueBeforeTest
-----------------------
3

IdentityValueAfterExplicitInsert
--------------------------------
3

IdentityId  IdentityValue
----------- -------------
1           One
2           Two
3           Three
2           Two2
3           Three2

Checking identity information: current identity value '3', current column value '3'.
DBCC execution completed. If DBCC printed error messages, contact your system administrator.

IdentityValueAfterDBCC
----------------------
3
*/

As can be seen from the script above, we were successfully able to insert duplicate values for the IdentityId column in the table. Also, the current IDENTITY value continues to remain 3 although there are about 5 records inserted in the table.


Duplicate values in IDENTITY columns – Re-seed IDENTITY column


The DBCC CHECKIDENT can be used to get the current value of the IDENTITY column value. By default, this DBCC statement does not reseed the IDENTITY column. However, if the RESEED parameter is specified, it allows the administrator to reset the “seed” value of the column to any value supplied (Refer the Introductory Post for a refresher).


The query shown below:



  • Creates a test table and inserts some test data into it

  • Verifies the value of the IDENTITY column

  • Reseed the IDENTITY column using DBCC CHECKIDENT with the RESEED parameter

  • Insert some test data into the test table

  • Verify the values in the table and the current value of the IDENTITY column

USE tempdb;
GO

SET NOCOUNT ON;

--1. Prepare the environment
--   Create a table, and add some test data into it

--Safety Check
IF OBJECT_ID('tIdentity','U') IS NOT NULL
    DROP TABLE dbo.tIdentity;
GO

--Create a table
CREATE TABLE dbo.tIdentity (IdentityId INT IDENTITY(1,1),
                            IdentityValue VARCHAR(10)
                           );
GO

--Add some test data
INSERT INTO dbo.tIdentity (IdentityValue)
VALUES ('One'),
       ('Two'),
       ('Three');
GO

--2. Check the current identity value of the table using IDENT_CURRENT()
SELECT IDENT_CURRENT('dbo.tIdentity') AS IdentityValueBeforeTest;
GO

--3. Reseed the IDENTITY column
DBCC CHECKIDENT ('dbo.tIdentity', RESEED, 1);
GO

INSERT INTO dbo.tIdentity (IdentityValue)
VALUES ('Two2'),
       ('Three2');
GO

--4. Check the current identity value of the table using IDENT_CURRENT()
SELECT IDENT_CURRENT('dbo.tIdentity') AS IdentityValuesAfterReseed;
GO

SELECT * FROM dbo.tIdentity;
GO

--5. Run DBCC CHECKIDENT and see what happens
DBCC CHECKIDENT ('dbo.tIdentity');
GO
SELECT IDENT_CURRENT('dbo.tIdentity') AS IdentityValueAfterDBCC;
GO

--Cleanup
IF OBJECT_ID('tIdentity','U') IS NOT NULL
    DROP TABLE dbo.tIdentity;
GO

/**********************************************
               RESULTS
**********************************************/
/*
IdentityValueBeforeTest
-----------------------
3

Checking identity information: current identity value '3', current column value '1'.
DBCC execution completed. If DBCC printed error messages, contact your system administrator.

IdentityValuesAfterReseed
-------------------------
3

IdentityId  IdentityValue
----------- -------------
1           One
2           Two
3           Three
2           Two2
3           Three2

Checking identity information: current identity value '3', current column value '3'.
DBCC execution completed. If DBCC printed error messages, contact your system administrator.

IdentityValueAfterDBCC
----------------------
3
*/

Duplicate values in IDENTITY columns – BCP/BULK INSERT/OPENROWSET – -E, KEEPIDENTITY, -f switches


A long time ago, I wrote about the BCP utility being an underappreciated feature of Microsoft SQL Server.


When data is exported or inserted in bulk via the bcp and BULK INSERT statements, it is possible to preserve the value of IDENTITY columns using the following switches:



  • BCP uses the –E switch

  • BULK INSERT uses the KEEPIDENTITY switch

  • OPENROWSET uses the –f switch

Conclusion


As is the case with all columns in SQL Server, uniqueness can only be guaranteed by the use of a UNIQUE constraint. Without the constraint, IDENTITY columns may end up having duplicate values.


Often, applications and T-SQL code use IDENTITY columns as a unique sequence of numbers – this should be done with due care as duplicate values may cause the code to break.


Until we meet next time,


Be courteous. Drive responsibly.