Category Archives: Imported from BeyondRelational

These posts are imported from my old blog page: http://beyondrelational.com/modules/2/blogs/77/nakuls-blog.aspx

PIVOT & Dynamic Cross-tabs – Underappreciated features of Microsoft SQL Server


This is in continuation to my series on the Underappreciated features of Microsoft SQL Server. The series has been inspired from Andy Warren’s editorial on SQLServerCentral.com of the same name.

After a number of posts on my visit to Tech-Ed 2011 (India), we are now back on track and will continue from where we left off. We were discussing the T-SQL related underappreciated features and the next one in line was the PIVOT operator. Because PIVOT has been around since SQL Server 2005, and is simplifies a very important functionality, I was quite surprised to see this on the list of underappreciated features. Hence, I will start off with an introduction, provide and explain an example, and then point out the reader to a couple of reference resources that I used when I was learning this T-SQL enhancement.

Cross-tabs are a very important piece of any OLTP application. These systems typically have large amounts of data – all in great detail, but arranged “row-wise”. This arrangement is very much required for a high-performance OLTP system because it generally involves more incoming transactions and less of read operations. However, not every “row-wise” detail can be put up on a report – it makes great technical sense, but not business sense. For the data to have some business value, it needs to have the ability to be aggregated effectively when requested. What this means is that we need to convert one table-valued expression into another table.

The problem

Let’s take an example. In a sales setup, it is important to measure the number of orders placed by certain employees. In the AdventureWorks2008R2 sample database, this information is stored such that all unique values of interest (EmployeeId and VendorId) are in individual columns.

USE AdventureWorks2008R2
GO
SELECT poh.PurchaseOrderID, poh.EmployeeID, poh.VendorID
FROM Purchasing.PurchaseOrderHeader poh

image

What we need is that the unique values from one column in the expression (EmployeeID) are converted or transformed into multiple columns in the output, and an aggregation is performed on the remaining columns in the output. The unique values in the EmployeeID column themselves need to become fields in the final result set.

  EmployeeID as Columns
VendorID as Rows Aggregations – COUNT of PurchaseOrderIDs for the vendor in the row key for the employee in the column key

The Conventional Solution

Conventionally (before SQL 2005), we would have ended up doing something like:

USE AdventureWorks2008R2
GO
SELECT poh.VendorID,
       (SELECT COUNT(poh1.PurchaseOrderID) 
        FROM Purchasing.PurchaseOrderHeader poh1 
        WHERE poh.VendorID = poh1.VendorID AND poh1.EmployeeID = 250) AS Employee1,
       (SELECT COUNT(poh1.PurchaseOrderID) 
        FROM Purchasing.PurchaseOrderHeader poh1 
        WHERE poh.VendorID = poh1.VendorID AND poh1.EmployeeID = 251) AS Employee2,
       (SELECT COUNT(poh1.PurchaseOrderID) 
        FROM Purchasing.PurchaseOrderHeader poh1 
        WHERE poh.VendorID = poh1.VendorID AND poh1.EmployeeID = 256) AS Employee3,
       (SELECT COUNT(poh1.PurchaseOrderID) 
        FROM Purchasing.PurchaseOrderHeader poh1 
        WHERE poh.VendorID = poh1.VendorID AND poh1.EmployeeID = 257) AS Employee4,
       (SELECT COUNT(poh1.PurchaseOrderID) 
        FROM Purchasing.PurchaseOrderHeader poh1 
        WHERE poh.VendorID = poh1.VendorID AND poh1.EmployeeID = 260) AS Employee5
FROM Purchasing.PurchaseOrderHeader poh
WHERE poh.EmployeeID IN ( 250, 251, 256, 257, 260 )
GROUP BY poh.VendorID

Such a query would have given us the result as shown in the screen-shot below, which is what we require.

image

When we look at the execution plan of this query, we find that it is anything but efficient – in fact, it’s terrible! Due to space constraints, I am only showing the core of the execution plan (i.e. where most computations are concentrated).

image

Depending upon the situation, you may also want to use a complex series of the SELECT…CASE statements.

The solution – PIVOT & UNPIVOT operators

Come SQL 2005, we had a exciting T-SQL enhancements introduced by Microsoft. Some of them are the introduction of the PIVOT and UNPIVOT operators.

The following is the T-SQL query using PIVOT to carry out the same computation demonstrated above:

USE AdventureWorks2008R2;
GO
SELECT VendorID, 
       [250] AS Emp1, 
       [251] AS Emp2, 
       [256] AS Emp3, 
       [257] AS Emp4, 
       [260] AS Emp5
FROM (SELECT PurchaseOrderID, EmployeeID, VendorID
      FROM Purchasing.PurchaseOrderHeader) p
PIVOT (COUNT (PurchaseOrderID)
       FOR EmployeeID IN ( [250], [251], [256], [257], [260] )
      ) AS pvt
ORDER BY pvt.VendorID;

In this example, the results of the following sub-query are PIVOT’ed on the EmployeeID column.

USE AdventureWorks2008R2;
GO
SELECT PurchaseOrderID, EmployeeID, VendorID
FROM Purchasing.PurchaseOrderHeader

Here is a walk-through of the above query:

  1. The PurchaseOrderID column serves as the grouping column along with the EmployeeID
  2. This set is then aggregated to produce an output as under:

image

Some points to consider:

  1. When PIVOT and UNPIVOT are used against databases that are upgraded to SQL Server 2005 or later, the compatibility level of the database must be set to 90 or higher
  2. When aggregate functions are used with PIVOT, the presence of any null values in the value column are not considered when computing an aggregation

To demonstrate UNPIVOT, let’s store the output of the PIVOT operation above into a temporary table variable and then execute the UNPIVOT operation on it. By doing so, there is a very important point I want to demonstrate.

USE AdventureWorks2008R2;
GO
--Declare a temporary table variable to hold the output of the PIVOT
DECLARE @pvt TABLE (VendorID INT, Emp1 INT, Emp2 INT, Emp3 INT, Emp4 INT, Emp5 INT)

INSERT INTO @pvt
SELECT VendorID, 
       [250] AS Emp1, 
       [251] AS Emp2, 
       [256] AS Emp3, 
       [257] AS Emp4, 
       [260] AS Emp5
FROM (SELECT PurchaseOrderID, EmployeeID, VendorID
      FROM Purchasing.PurchaseOrderHeader) p
PIVOT (COUNT (PurchaseOrderID)
       FOR EmployeeID IN ( [250], [251], [256], [257], [260] )
      ) AS pvt
ORDER BY pvt.VendorID

--UNPIVOT the table.
SELECT VendorID, Employee, Orders
FROM 
   (SELECT VendorID, Emp1, Emp2, Emp3, Emp4, Emp5
   FROM @pvt) p
UNPIVOT
   (Orders FOR Employee IN 
      (Emp1, Emp2, Emp3, Emp4, Emp5)
)AS unpvt;
GO

The following is the output:

image

At this point, I would like you to understand the following major differences between PIVOT and UNPIVOT:

  1. UNPIVOT is not the exact reverse of PIVOT. This is because during a PIVOT operation, we aggregate data, which causes loss of data-granularity by merging multiple rows into a single row
  2. Also, because PIVOT ignores NULL values during the aggregations, they would no longer be present in the output of the UNPIVOT operation

Comparing PIVOT with the conventional query

Any feature is only useful if it has it’s own benefits. Obviously, PIVOT comes with the benefit of being more read-able and maintainable, but is it any better in terms of performance over the older, more conventional query?

So, I put both queries together, turned on the execution plan and executed them. The below was what the execution plan showed me. Do I need to say anything more?

image

Some great reference resources

I hope that you can now appreciate the benefits of using newer features of T-SQL as they come along. While there is an age-old saying that “If it isn’t broken, don’t fix it”, but in this case, we are not fixing anything – we are doing value addition – which is always welcome. As we say good-bye today, I will leave you with some great reference resources that helped me understand PIVOT:

  1. Getting started with PIVOT Queries in SQL Server 2005/2008 by Jacob Sebastian
  2. Another PIVOT Query example by Jacob Sebastian
  3. Posts on PIVOT & UNPIVOT tables by Pinal Dave
  4. Dynamic PIVOT in SQL Server 2005 by Madhivanan
  5. Dynamic cross-tab with multiple PIVOT columns by Madhivanan

Until we meet next time,

Be courteous. Drive responsibly.

AdventureWorks documentation – Data Dictionary


Today is a special day. It is Holy Thursday (or Maundy Thursday) – the day before Good Friday and the long Easter week-end, which is the Christian celebration of the Resurrection. It is the day when the Lord’s Supper or Holy Communion/Eucharist was instituted by Jesus Christ. This was made immortal by Leonardo Da Vinci in his famous fresco – “The Last Supper”.

Anyway, it’s still a working day in most parts of the world, and speaking of work reminds me that all of us have used the AdventureWorks family of sample databases at one time or the other. Ever since SQL Server 2005 came out, the AdventureWorks database has for me been the go-to place for studying about SQL Server, doing a quick test or demonstrating my thoughts to my managers or my team.

While the AdventureWorks family of databases is easily available from CodePlex, there is no well-known place where we can go to for lookup to the description of the schema and the tables. Recently, while researching for one of my posts, I stumbled upon the entire data dictionary for the AdventureWorks family of databases.

A data dictionary is a centralized repository of information about data such as meaning, relationships to other data, origin, usage, and format. Data dictionaries are commonly used to circulate copies of the database schema to vendors and technology partners, and once a product is released, this may be made available to the end customers depending upon the need (eg. to allow for customization or study).

You can get the data dictionary for AdventureWorks sample databases at: http://technet.microsoft.com/en-us/library/ms124438(SQL.100).aspx

For those who would like to download the AdventureWorks database for SQL 11 (“Denali”), head over to: http://msftdbprodsamples.codeplex.com/releases/view/55330

I trust that the AdventureWorks data dictionary will help you a lot in finding your way through the database. Next week, we will be resuming the Underappreciated features of Microsoft SQL Server series.

Have a Happy Easter!

Until we meet next time,

Be courteous. Drive responsibly.

Want to ask me a question? Do so at: http://beyondrelational.com/ask/nakul/default.aspx

A script to verify a database backup


Today, I will be sharing a very small, but important script. Recently, one of the database backups we had received failed to restore. I was faced with a problem of determining whether the problem was with the backup itself, or whether it was an I/O subsystem issue or some other failure.

Like with all tools & utilities, SQL Server provides great options when used via commands instead of the UI. Similarly, the RESTORE command provides the facility to very easily validate a backup for you. Please find below the script I used to validate my backup and was able to determine that the backup received was indeed, corrupt.

USE MASTER
-- Add a new backup device
-- Ensure that the SQL Server can read from the physical location where the backup is placed
--                    TYPE      NAME		   PHYSICAL LOCATION
EXEC SP_ADDUMPDEVICE 'disk','networkdrive','\VPCW2K8Database BackupTest.bak'

-- Execute the Restore operation in VERIFY ONLY mode
-- Provide the actual paths where you plan to restore the database.
-- This is because VERIFYONLY also checks for available space
RESTORE
VERIFYONLY
FROM  networkdrive
WITH
MOVE N'TESTDB_DATA'    TO N'E:TestDBTestDB_Data.mdf',  
MOVE N'TESTDB_INDEXES' TO N'E:TestDBTestDB_Idx.mdf',  
MOVE N'TESTDB_LOG'     TO N'E:TestDBTestDB_LOG.ldf'

-- DROP THE DEVICE
--                   Name         , Physical File (OPTIONAL - if present, the file is deleted)
EXEC SP_DROPDEVICE 'networkdrive'

The checks performed by RESTORE VERIFYONLY include (per Books On Line):

  • That the backup set is complete and all volumes are readable
  • Some header fields of database pages, such as the page ID (as if it were about to write the data)
  • Checksum (if present on the media)
  • Checking for sufficient space on destination devices

What methods do you use to validate your backups? Do leave a small note as your comments.

Until we meet next time,

Be courteous. Drive responsibly.

XPS Error – Your current security settings do not allow this file to be downloaded.


This Friday, I will take a break from the SQL world and share with you a scare that I recently overcame.

XPS or XML Paper Specification is gaining popularity as an alternate document format for exchange over the Internet and E-mail. Sometime in the first week of April, somebody sent me a XPS file for my review via E-mail.

As soon as I attempted to open the file (in Internet Explorer), I encountered the following message:

Security Error – Your current security settings do not allow this file to be downloaded.” 

I did not expect this. Nothing had changed on my computer – no new installations, no changes to security settings – simply nothing! And yet, I was able to open XPS documents that I generated just fine – but this document that I had received did not open. How is this even possible?

After much toiling (about 20minutes!), I figured out the solution. To protect our security, Microsoft does not trust anything that comes from external sources and was hence blocking the opening of the XPS document because the operating system thought it to be a potential threat.

So, this is what I did:

  1. Right-click on the file
  2. Click the “Unblock” button in the “Security” section of the “General” file  properties tab
  3. Once done, I attempted to open the file, and success!

 

It’s a different thing altogether that I was not able to complete my review on that day, but it’s okay.

Finally, have you ever come across such an issue? If yes, what was your solution? Do let us know.

Until we meet next time,

Be courteous. Drive responsibly.

Getting started with Contained Databases in SQL 11 (“Denali”) – Part 04 – Converting a non-contained database


Since the start of the week, I am attempting to explore the concept of contained databases – something that I learnt about during a hands-on lab conducted by Jacob Sebastian. Please read my posts Part 01, Part 02 and Part 03 in case you missed them.

To get us back into context, let’s revisit the definition of a contained database:

"A contained database has no external dependencies on server-based settings or metadata to define the database. Users defined in a contained database do not have a dependency on logins created on the server, and therefore contain all information necessary to authenticate a user within the database."

On Day #2, we saw how to create our first contained database in the very latest SQL 11 (“Denali”) CTP01, and also learnt a bit about the changes made to the ALTER DATABASE clause in order to support contained databases. Yesterday, we saw how to move a contained database from database server instance #1 to #2.

But, most (I believe the number to be close to 80%) of the applications today are pre-existing applications, which means that they are already up and running on SQL Server 2008 or below. This also means that post-release, they will make the move to SQL 11 (“Denali”). At that time, would it not be good to know how to migrate or convert a non-contained database to a contained database?

For our exercise today, we will attempt to convert the uncontained AdventureWorks2008R2 databases to a contained database.

Creating uncontained users on an uncontained database

Pre-requisites: Please restore the AdventureWorks2008R2 on the SQL 11 (“Denali”) from the CodePlex links – http://msftdbprodsamples.codeplex.com/releases/view/55330

  1. Login to the SSMS and connect to our test instance of SQL 11 (“Denali”)
  2. Run the following script against the master database
    --Create a login on the server
    CREATE LOGIN UncontainedUser
    WITH PASSWORD = 'pass@word1'
    
    --Create a "non-contained" users for the newly created login on the server
    USE AdventureWorks2008R2
    GO
    CREATE USER UncontainedUser FOR LOGIN UncontainedUser
    GO
  3. This query creates a new server-level login, and an associated database-level user mapped to the server-level login

Identifying uncontained objects

Now it is time to identify uncontained objects that will cause issues with the migration. Our diagnostic tool in this exercise would be nothing else except the Dynamic Management Views (DMV)! With each passing release of SQL Server, DMVs are gathering immense power, and we will find them especially useful when working with contained databases.

The DMV that we will be using is sys.dm_db_uncontained_entities. This view is new to SQL 11 (“Denali”) and according to Books On Line,

Shows any uncontained objects used in the database. Uncontained objects are objects that cross the application boundary in a contained database. This view is accessible from both a contained database and a non-contained database. If sys.dm_db_uncontained_entities is empty, your database does not use any uncontained entities.

  1. In the Object Explorer pane, expand the Databases node, select the AdventureWorks2008R2 database, and then click the New Query button
  2. In the query window, run the following T-SQL command
    USE AdventureWorks2008R2
    GO
    SELECT class_desc,statement_type,feature_name,feature_type_name 
    FROM sys.dm_db_uncontained_entities
  3. Here are the results from executing the query
    image

    As discussed earlier in the week, we do not need to worry about the entry related to the ROUTE class. Besides this, we have the following uncontained objects in our AdventureWorks2008R2 database:

    1. A computed column based on the output of a system built-in function
    2. A database principal (the user “UncontainedUser” that we created above
    3. Three (3) full-text indices
  4. Run the following query against the AdventureWorks2008R2 database. This query produces a list of any users that have been configured in the database for SQL Server authentication and are mapped to a login that is not disabled
    USE AdventureWorks2008R2
    GO
    SELECT dp.name 
    FROM sys.database_principals dp 
    JOIN sys.server_principals sp ON dp.sid = sp.sid
    WHERE dp.authentication_type = 1
      AND sp.is_disabled = 0 
  5. In our case, the result is the one uncontained user that we created above
    image

Migrating AdventureWorks2008R2 database to a contained database

  1. Before we begin, please ensure that the SQL 11 (“Denali”) instance is all setup to handle contained databases (refer my posts from earlier this week for details)
  2. In the Object Explorer pane, expand the System Databases node, select the Master database, and then click the New Query button
  3. After ensuring that there are no open connections to AdventureWorks2008R2, run the following T-SQL command
    USE master
    GO
    ALTER DATABASE AdventureWorks2008R2 SET CONTAINMENT=PARTIAL;
    GO
  4. After the query completes successfully, select the AdventureWorks2008R2 database in the database list
  5. Run the following T-SQL command:
    USE AdventureWorks2008R2
    GO
    EXEC sp_migrate_user_to_contained @username = N'UnContainedUser', 
                                      @rename = N'keep_name', 
                                      @disable_login = N'disable_login'
    Per Books On Line (refer BOL page here):

    Converts a database user that is mapped to a SQL Server login, to a contained database user with password. In a contained database, use this procedure to remove dependencies on the instance of SQL Server where the database is installed.

  6. Now run the following query which we originally used to identify uncontained users in a database
    USE AdventureWorks2008R2
    GO
    SELECT dp.name 
    FROM sys.database_principals dp 
    JOIN sys.server_principals sp ON dp.sid = sp.sid
    WHERE dp.authentication_type = 1
      AND sp.is_disabled = 0 

Note that “UnContainedUser” no longer shows up in the list. This means that it has been migrated to a contained user and the login has been disabled at the server level. The server-level login has now been disabled.

image

An attempt to login conventionally fails:

image

Validating the move to Partially Contained database

To validate whether or not the partial containment succeeded, let’s see if we can login using the newly contained login to the AdventureWorks2008R2 in the contained mode.

  1. In the Object Explorer, click on the Connect drop-down list and choose “Database engine”
  2. Change the authentication to “SQL Server Authentication”
  3. In the login name and password boxes, pass in the login name and the password of the “UncontainedUser”
  4. Do not hit “Enter” key or click on “Connect”. Instead, click on the “Options” button on the “Connect to Server” dialog box
  5. In the Connect to Database box, type in the name of the contained database, in our case “AdventureWorks2008R2”

image

Now, click on “Connect” and confirm that we are indeed in “contained mode”.

image

In conclusion

Contained databases hold a lot of promise in reducing the administrative overhead associated with a database. I hope that you liked all the posts this week, and hope that you will now go and experiment with this wonderful concept. If you have any interesting findings or and/or links to other great posts on contained databases, do feel free to share them in the comments on this page.

Until we meet next time,

Be courteous. Drive responsibly.