One of the most frequently used, but perhaps one of the least understood feature of Microsoft SQL Server has to be the IDENTITY columns.
IDENTITY columns are any columns that are used as identifier columns. They create an automatically incrementing identification number on a given table.
Over the course of time, I have seen many novice engineers make statements that imply certain common implementations of IDENTITY to be the default and only behavior that these columns exhibit. For example, it’s a common implementation to have the IDENTITY value increment by 1, but it is not the only implementation possible. As we will see in the upcoming posts, increments can have a lot of different flavours.
In order for Microsoft SQL Server to create an automatically incrementing identification number the IDENTITY property consists of two attributes:
- SEED – this specifies the value that should be used by the first record inserted into the table
- INCREMENT – this specifies the increment or the value to be added to the seed to determine successive identity numbers
Here are some general rules around IDENTITY columns:
- A table may have only one column defined with the IDENTITY property
- The column must be defined as NOT NULL and must not have a DEFAULT constraint defined on it
- Valid data types for IDENTITY are decimal, int, bigint, numeric, smallint and tinyint
- Default value for both seed and increment is 1
The question therefore comes up is:
Why are IDENTITY columns useful?
Generally, IDENTITY columns are used as primary key values. When I created my first table that used the IDENTITY property, I was unaware of its significance – I had used it because it was part of the design standard used by the product. Upon study, I later realized the following:
By their very nature, IDENTITY values make for “non-intelligent” keys
IDENTITY values are auto-generated when a record is inserted. What this means is that IDENTITY ensures that the values generated are naturally ordered – making them ideal candidates for a clustered index
Generating the clustered index on an IDENTITY column essentially means that the logical ordering of records essentially remains the same
Similarly, record retrieval is faster and key management also becomes easier
Because IDENTITY can be used on BIGINT datatypes, it has a very vast addressing capacity – generally sufficient for most OLTP systems
These are the primary reasons why IDENTITY columns have become an important and commonly used design choice for integer-based primary key values.
From my next post, I will be presenting a series on the various aspects of IDENTITY and myths that surround the IDENTITY column. Are there any questions that come to your mind when working with IDENTITY columns? Do leave a note and I will try my best to answer them as I move along.
Until we meet next time,