Category Archives: Imported from BeyondRelational

These posts are imported from my old blog page: http://beyondrelational.com/modules/2/blogs/77/nakuls-blog.aspx

#0265 – SQL Server – An introduction to IDENTITY columns


One of the most frequently used, but perhaps one of the least understood feature of Microsoft SQL Server has to be the IDENTITY columns.


IDENTITY columns are any columns that are used as identifier columns. They create an automatically incrementing identification number on a given table.


Over the course of time, I have seen many novice engineers make statements that imply certain common implementations of IDENTITY to be the default and only behavior that these columns exhibit. For example, it’s a common implementation to have the IDENTITY value increment by 1, but it is not the only implementation possible. As we will see in the upcoming posts, increments can have a lot of different flavours.


An Introduction


In order for Microsoft SQL Server to create an automatically incrementing identification number the IDENTITY property consists of two attributes:



  • SEED – this specifies the value that should be used by the first record inserted into the table

  • INCREMENT – this specifies the increment or the value to be added to the seed to determine successive identity numbers

Here are some general rules around IDENTITY columns:



  • A table may have only one column defined with the IDENTITY property

  • The column must be defined as NOT NULL and must not have a DEFAULT constraint defined on it

  • Valid data types for IDENTITY are decimal, int, bigint, numeric, smallint and tinyint

  • Default value for both seed and increment is 1

The question therefore comes up is:



Why are IDENTITY columns useful?


Why IDENTITY?


Generally, IDENTITY columns are used as primary key values. When I created my first table that used the IDENTITY property, I was unaware of its significance – I had used it because it was part of the design standard used by the product. Upon study, I later realized the following:




  • By their very nature, IDENTITY values make for “non-intelligent” keys


  • IDENTITY values are auto-generated when a record is inserted. What this means is that IDENTITY ensures that the values generated are naturally ordered – making them ideal candidates for a clustered index



    • Generating the clustered index on an IDENTITY column essentially means that the logical ordering of records essentially remains the same


    • Similarly, record retrieval is faster and key management also becomes easier


  • Because IDENTITY can be used on BIGINT datatypes, it has a very vast addressing capacity – generally sufficient for most OLTP systems

These are the primary reasons why IDENTITY columns have become an important and commonly used design choice for integer-based primary key values.



From my next post, I will be presenting a series on the various aspects of IDENTITY and myths that surround the IDENTITY column. Are there any questions that come to your mind when working with IDENTITY columns? Do leave a note and I will try my best to answer them as I move along.


Until we meet next time,


Be courteous. Drive responsibly.

#0264 – SQL Server – How to check if a database belongs to your product?


One question that I often face from teams is:



Given a set of SQL server databases, how do we identify that a given database belong to our product?


There are various mechanisms that can be used, depending upon how the underlying database design is. I have two mechanisms that I use frequently.




  1. Check for the existance of a particular table (a table that stores your product version or some table which holds product-specific static data)


  2. Many database designs (especially those derived from legacy systems) are such that all tables have a column that is used for various generic purposes (to indicate whether a particular record was modified as part of a data clean-up, or a GUID column to facilitate replication, etc)



    • In such database designs, one can check if all tables have the particular column. Pseudo-code to achieve this check efficiently is shown below:
      USE AdventureWorks2008R2 ;
      GO
      IF NOT EXISTS ( SELECT st.*
      FROM sys.tables AS st
      LEFT OUTER JOIN sys.columns AS sc ON st.object_id = sc.object_id
      AND sc.name = ‘BusinessEntityId’
      WHERE st.type = ‘U’
      AND st.object_id IS NOT NULL
      AND sc.object_id IS NULL )
      BEGIN
      PRINT ‘All tables contain the column – BusinessEntityId.’ ;
      END
      ELSE
      BEGIN
      PRINT ‘Some or all tables do not contain the column – BusinessEntityId.’ ;
      END
      GO


    • The benefit of the above query is that it queries the meta-data and is therefore faster than a custom iterative mechanism

I trust this post was helpful. I would also like to know what is the method you use for detecting your product database?


Until we meet next time,


Be courteous. Drive responsibly.

#0263 – SQL Server – Interview Question – Script to Identify DML Triggers (Active/Disabled) and Trigger Events


For an entry level DBA position, some of the common interview questions around the subject of database DML triggers are:



  • How many triggers exist in your database?

  • Can you classify them into INSERT, UPDATE and DELETE triggers?

This post presents a script that answers all the questions above. The script below involves querying the Catalog Views – sys.triggers and sys.trigger_events to answer these questions:

USE AdventureWorks2008R2 ;
GO
SELECT st.name AS TriggerName,
OBJECT_NAME(st.parent_id) AS ParentTableName,
st.is_ms_shipped AS IsMSShipped,
st.is_disabled AS IsDisabled,
st.is_not_for_replication AS IsNotForReplication,
st.is_instead_of_trigger AS IsInsteadOfTrigger,
te.type AS TypeId,
te.type_desc AS TypeDescription,
te.is_first AS IsTriggerFiredFirst,
te.is_last AS IsTriggerFiredLast
FROM sys.triggers AS st
INNER JOIN sys.trigger_events AS te ON te.object_id = st.object_id
WHERE st.parent_id > 0
AND st.is_ms_shipped = 0
ORDER BY st.parent_id, te.type ASC;
GO

/* Results:
TriggerName ParentTableName IsMS Is IsNotFor IsInstead Type Type IsTrigger IsTrigger
Shipped Disabled Replication Trigger Id Description FiredFirst FiredLast
——————– ——————– ——- ——– ———– ——— —- ———– ———– ———
dVendor Vendor 0 0 1 1 3 DELETE 0 0
iWorkOrder WorkOrder 0 0 0 0 1 INSERT 0 0
uWorkOrder WorkOrder 0 0 0 0 2 UPDATE 0 0
iPurchaseOrderDetail PurchaseOrderDetail 0 0 0 0 1 INSERT 0 0
uPurchaseOrderDetail PurchaseOrderDetail 0 0 0 0 2 UPDATE 0 0
uPurchaseOrderHeader PurchaseOrderHeader 0 0 0 0 2 UPDATE 0 0
iduSalesOrderDetail SalesOrderDetail 0 0 0 0 1 INSERT 0 0
iduSalesOrderDetail SalesOrderDetail 0 0 0 0 2 UPDATE 0 0
iduSalesOrderDetail SalesOrderDetail 0 0 0 0 3 DELETE 0 0
dEmployee Employee 0 0 1 1 3 DELETE 0 0
uSalesOrderHeader SalesOrderHeader 0 0 1 0 2 UPDATE 0 0
iuPerson Person 0 0 1 0 1 INSERT 0 0
iuPerson Person 0 0 1 0 2 UPDATE 0 0
*/


Here are some of the ways in which the query given above can be helpful in addition to answering interview questions:



  • Detemine if a trigger exists on a database

  • The IsDisabled flag shows whether a trigger is active or not

  • Is the trigger an INSTEAD OF trigger? 


Further Reading:



  • sys.triggers: [Link]

  • sys.trigger_events: [Link]

  • Catalog views and DMVs – An introduction: [Link]

  • SQL Server – 2012 – What’s new? – Dynamic Management Views (DMV) and Catalog View changes: [Link]

Until we meet next time,


Be courteous. Drive responsibly.

#0262 – SQL Server – Scripts to generate and parse multi-part database object names – PARSENAME() function


In my previous post, we looked at the various naming conventions available to us when working with database objects.

Often we may face the need to either generate three-part object names for all objects in the database, or parse a three part name into it’s constituent parts. This post provides quick scripts to achieve this.

Generating Multi-Part Names

Generating multi-part names is a simple affair of querying the sys.objects table as shown in the script below:

USE AdventureWorks2012;
GO

SELECT CAST(SERVERPROPERTY('ServerName') AS VARCHAR(100)) + '.' + 
           DB_NAME() + '.' + 
           SCHEMA_NAME(so.schema_id) + '.' + 
           so.name AS FourPartName,
       DB_NAME() + '.' + 
           SCHEMA_NAME(so.schema_id) + '.' + 
           so.name AS ThreePartName,
       SCHEMA_NAME(so.schema_id) + '.' + 
           so.name AS TwoPartName,
       so.schema_id AS SchemaId,
       so.name AS ObjectName,
       so.type AS ObjectType,
       so.type_desc AS ObjectTypeDescription
FROM sys.objects AS so
WHERE so.is_ms_shipped = 0 AND so.type IN ('U','P','V','TR')
ORDER BY so.schema_id ASC, so.type DESC;
GO

Parsing Multi-Part Names

Parsing multi-part names does not involve any fancy string manipulation, but can be easily done using a system function – PARSENAME().

The PARSENAME function accepts two parameters – the multi-part object name to be parsed and a number from 1-4 indicating the part that needs to be fetched from the supplied name, as shown below:

  • 1 = Object Name
  • 2 = Schema Name
  • 3 = Database Name
  • 4 = Server Name

If a particular name is not found in the string supplied, the PARSENAME() function returns NULL.

The query to parse multi-part names into their constituents is available below:

USE AdventureWorks2012;
GO

DECLARE @objectNames TABLE (MultiPartName VARCHAR(100));

INSERT INTO @objectNames (MultiPartName)
VALUES ('W8SQLSERVERSQL2K12.AdventureWorks2012.HumanResources.Employee'),
       ('AdventureWorks2012.HumanResources.Employee'),
       ('HumanResources.Employee')

SELECT oname.MultiPartName AS SuppliedMultiPartName,
       PARSENAME(oname.MultiPartName,4) AS ServerName,
       PARSENAME(oname.MultiPartName,3) AS DatabaseName,
       PARSENAME(oname.MultiPartName,2) AS SchemaName,
       PARSENAME(oname.MultiPartName,1) AS ObjectName
FROM @objectNames AS oname;
GO

Notice that the PARSENAME() function returned NULL for the ServerName column when interpreting a three-part parameter. Similarly, we get NULL for ServerName and DatabaseName when working with a two-part name.

I trust the scripts above were useful to you.

Further Reading

Until we meet next time,

Be courteous. Drive responsibly.

#0261 – SQL Server – An Introduction to Multi-part naming standards for Object name referencing


The topic of this blog post is not something that is new, but is intended to address a common issue I have found in most database documentation. Most database documentations fail to address name resolution errors – i.e. the ability to ensure that a name uniquely identifies a database object.

For example, it is quite possible to create an Employee table in the Person or dbo schema in the AdventureWorks2008R2 database even though another table with the same name already exists in the HumanResources schema.

Good database documentation is one which leaves very little room for interpretation – especially when ownership of objects and security relationships is concerned. This post looks at the most commonly used naming conventions when documenting databases and finally comes up with a recommendation for their usage.

Three-Part Naming

The three part naming is the standard for unique identification of any database object. The three part naming convention essentially consists of:

  • Database Name
  • Schema Name
  • Object Name

The standard representation is like: [Database Name].[Schema Name].[Object Name].

Four Part Naming

A common variant of the 3-part naming convention is the 4 part naming convention, which also adds the SQL Server instance name to the mix, making the representation like: [SQL Instance Name].[Database Name].[Schema Name].[Object Name].

This naming convention is essential when attempting to access data across remote/linked servers. However, it is strongly tied to the SQL Server instance name, and therefore I do not prefer it when documenting my databases.

Two Part Naming

Most applications do not have a restriction on the database name (this was not the case for legacy systems). In such cases, having the [Database Name] in the three part naming convention does not make sense. Most documentation therefore adopt what is called as the two-part naming standard.

Under the two-part naming standard, a database object is represented as: [Schema Name].[Object Name].

It is my humble request to the kind reader to please use the two-part naming convention at the very least (even when using the default “dbo” schema).

Other Variants

Those coming directly from the world of SQL Server 2000 or those used to working within the default “dbo” schema would omit the “Schema Name” and the representation would be like: [Database Name]..[Object Name].

I do not subscribe to this theory as it makes the documentation open to interpretation.

For example, in cases when the database design is such that all objects use a schema other than “dbo” does omitting the schema name indicate that all objects use the available user-defined schema or that they use “dbo”? It is therefore, my recommendation that when writing database documentation, a two-part naming convention should be used at the very least.

Conclusion

Name resolution is not a problem limited to the database documentation. When ambiguous naming is encountered in database code, the results can be confusing and make troubleshooting even more difficult. The recommendation for usage of naming conventions in documentation or in database code is therefore:

  • The two-part naming convention is sufficient to prevent name resolution errors within the same database
  • When more than one database is involved, the three-part naming convention should be used
  • Similarly, when more than one SQL Server instance is involved, the full length four-part naming convention is required

All other variants leave room for interpretation, and are not something I would recommend.

I would be happy to hear about the naming convention followed by you, the kind reader. Do leave a note in the blog comments as you go.

Until we meet next time,

Be courteous. Drive responsibly.