Category Archives: #SQLServer

All about Microsoft SQL Server

SQL Server – Profiler – Part 2 – Profiler Templates, Template Types and Creating Customized Templates

Profiler Templates

We talked yesterday about SQL Traces and it’s components. SQL Server Profiler ships with templates that define the event classes and data columns to include in traces. It is important to understand that when we execute the SQL Server Profiler, we are capturing a SQL Trace and the template gives body to the trace, i.e. the template is used to represent the events and data columns captured in the trace, but the template is itself never “executed”.

Template Types

SQL Server Profiler offers predefined trace templates that allow you to easily configure the event classes that you will most likely need for specific traces. To view the list of available templates, click on the “New Trace” button on the toolbar, and then expand the “Use the template” drop-down.

NOTE: If you are using a version of the SQL Server Profiler that’s different than the SQL Server instance, then the template drop-down may be blank, and you may be required to use a customized template.

Below is a brief description of the various template types available in the SQL Server Profiler. To know more about the various Event Classes associated to each, please refer the Books On Line documentation referenced in the “References” section of this post. The ones I have used most in my career till date are: TSQL_Duration, TSQL_Locks and TSQL_SPs.

Template Name	Description
Blank	This is a blank template that can be used during template customization
SP_Counts	Captures the event raised whenever a stored procedure starts execution
Standard	This is the default trace. This is the most generic and commonly used trace that captures: 1. Connections (connect and disconnects) 2. Completion of Remote Procedure Calls 3. Start & completion of T-SQL statements & batches
TSQL	Captures all client-submitted T-SQL statements and the time issued. Use to debug client applications.
TSQL_Duration	Captures all client-submitted T-SQL statements, their execution time (in milliseconds), and groups them by duration. Use to identify slow queries.
TSQL_Grouped	Captures all client-submitted T-SQL statements and the time they were issued. Groups information by user or client that submitted the statement. Use to investigate queries from a particular client or user.
TSQL_Locks	Captures the client-submitted T-SQL statements and associated exceptional lock information. Use to troubleshoot deadlocks, lock time-out, and lock escalation events.
TSQL_Replay	Captures detailed information about Transact-SQL statements that is required if the trace will be replayed. Use to perform iterative tuning, such as benchmark testing.
TSQL_SPs	Although not the default, this is perhaps the most useful trace for most systems (provided it is not to be replayed). Captures detailed information about all executing stored procedures. Use to analyze the component steps of stored procedures. Add the SP:Recompile event if you suspect that procedures are being recompiled.
Tuning	Captures information about stored procedures and Transact-SQL batch execution. Use to produce trace output that Database Engine Tuning Advisor can use as a workload to tune databases.

Customizing a Profiler trace

To have a best of all worlds, one might want to create a custom Profiler template. To create a custom template, one can follow the following steps:

Go to File –> Templates –> New Template
Select “Server Type”
Supply a name for the template, and also choose whether the template should be based on an existing template or not
Go to “Events” tab and add/remove events captured using the “Show All Events” checkbox. NOTE that I have removed a few default events and have added 2 new ones.
Using the “Show All Columns”, I added one additional data column capture
Go to “Column Filters” to apply filters on the data columns NOTE: If multiple criteria are defined, the “AND” operator is used
Go to “Organize columns” to define the sequence in which columns should appear and also to group by any particular column
Once all adjustments are done, click on “Save” to save the template
The trace will then appear in the “Use the template” drop-down when starting a new trace for the given server type

Setting the Default Trace

Let’s assume that we need to set the trace we just created as the default trace. Here are the very simple sequence of steps to follow:

Navigate to File –> Templates –> Edit Template

Select the Server Type and Template Name to edit

Check the “Use as a default template for selected server type”.

NOTE: You can also use this interface to delete a trace template.

Storage & Exporting/Importing templates

Customization of templates is machine and user specific. That means that the template must be stored on the disk as a physical file. A simple file-system search reveals the location of the template file:

For SQL 2012 (code named: “Denali”), the template is available at: C:Users<<<user name>>>AppDataRoamingMicrosoftSQL Profiler11.0TemplatesMicrosoft SQL Server110

So, here’s a tip – the next time you backup your system, also backup your templates so that you don’t have to set them up again! You can Export or Import your existing templates by going to File->Templates->Export Template/Import Template:

In my next post…

In the next post, I will be writing about the various options available to execute the trace. Finally, towards the end of the week we would look at other uses of Profiler (specifically, trace replay).

References

Please find below some of the reference Books On Line documentation that I found when researching for this post:

Until we meet next time,

Be courteous. Drive responsibly.

Subscribe to my posts via E-mail: Subscribe here | Read my posts via your favourite RSS reader: Click Here!

SQL Server – Profiler – Part 1 – What is a SQL Trace? Permissions, space requirements and launching the Profiler

SQL Trace

All of this is achieved by “tracing” the activities of the SQL Server Database engine. Such an instance is called a “SQL Trace”. A SQL Trace is most commonly described by using the following terms. We will be using these as we talk more about the SQL Server Profiler

Trace = A collection of events & data returned by the SQL Server database engine
Events = An Event is the occurrence of an event within the SQL Server Database engine
Event Class = A type of event that can be traced. The event class contains all of the data columns that can be reported by an event
Event Category = A group of related Event Classes
Data column = An attribute of an event
Filter = Criteria that limit the events collected on a trace

Permissions

Any user requiring to run a SQL Trace using SQL Server Profiler must have the ALTER TRACE permissions on the SQL Server instance. Talking about permissions, it is important to remember that such users will be able to see sensitive information and therefore such access must be restricted to members of the db_owner fixed database role, or members of sysadmin fixed server role.

Access to trace data – Minimum disk space requirements

For efficient access to the trace data, SQL Server Profiler uses the path specified by the TEMP environment variable to store the trace data. It is required that SQL Server Profiler has a minimum 10MB of free space. If by chance, the free space drops below 10MB, all SQL Server Profiler functions stop.

Launching the SQL Server Profiler

Now that we have the basics, let’s see the 3 ways by which we can invoke SQL Server Profiler.

Via Command Prompt

This has got to be the most used method to launch the SQL Server Profiler. One can go to the “Run” prompt or launch the command prompt and type the following based on the version of Microsoft SQL Server Profiler that one wants to launch:

profiler90 = Profiler for Microsoft SQL Server 2005
profiler = Profiler for Microsoft SQL Server 2008/2008 R2/SQL 2012 (code named: “Denali”)

Via SQL Server Management Studio (SSMS)

There are 2 ways in which the Profiler is launched from the SSMS:

Via the toolbar

Go to Tools –> SQL Server Profiler

NOTE: No SPID filtering is performed.

Via the Query editor window

In the Query editor window, right-click
Choose “Trace Query in SQL Server Profiler” (As an alternate to the above steps, you can use the keyboard short-cut: Ctrl+Alt+P)

NOTE: Note that the SPID being traced by the Profiler is automatically set to the SPID of the Query editor window.

Via Activity Monitor

Launch the Activity Monitor (refer my post: Activity Monitor – Underappreciated Features of Microsoft SQL Server)
Under the “Process” pane, select the activity of interest
Right-click and choose “Trace Process in SQL Server Profiler”

NOTE: Note that the the SPID being traced by the Profiler is automatically set to the SPID selected in the Processes pane.

In my next post:

In my next post, I will be looking at Profiler Templates, Template types and how to create customized templates.

References:

I have tried to provide the reader with the essentials to get started with SQL Server Profiler. The reader may refer the following Books On Line pages for advanced reading.

Until we meet next time,

Be courteous. Drive responsibly.

Subscribe to my posts via E-mail: Subscribe here | Read my posts via your favourite RSS reader: Click Here!

SQL Server – Administration – Cleanup Database Backup History

4 Replies

Earlier this week, I had explored why the sequence of index creation is important in a rather long post on performance tuning. You can read the post here: SQL Server – Performance – Best Practice – Create Index Before or After a data insert?

Today’s post, therefore, will be a nice, small one – something that you might want to convert into a scheduled “spring-cleaning” task.

Viewing Backup Information

Microsoft SQL Server uses MSDB to maintain a record for each completed backup. In production environments, it is generally required that all backup history is maintained. However, in Development and Quality Assurance environments, maintaining database backup history is generally not that important – after all, databases change in these environments often.

The backup history is maintained in the following tables within the MSDB:

backupfile
backupfilegroup
backupmediafamily
backupmediaset
backupset
restorefile
restorefilegroup
restorehistory

To learn more about viewing backup history information, refer the Books On Line page at: http://technet.microsoft.com/en-us/library/ms188653.aspx

Cleaning up the backup history

In order to cleanup the backup history, we can use one of the following two system stored procedures:

sp_delete_backuphistory – To remove backup history for all databases within a SQL Server instance
sp_delete_database_backuphistory – To remove backup history for a specific database

Some Key points to remember:

Running these system stored procedures requires membership to the sysadmin fixed server role
Running these procedures may be performance intensive
Please do not use these without guidance from your DBA!

Until we meet next time,

Be courteous. Drive responsibly.

Subscribe to my posts via E-mail: Subscribe here | Read my posts via your favourite RSS reader: Click Here!

SQL Server – Performance – Best Practice – Create Index Before or After a data insert?

10 Replies

As a part database developer myself, I am sure that what I am about to document today will be useful to almost everyone who has ever written a stored procedure for with indexed tables for SSRS reporting or massive batch processing. What I will be trying today is to find an answer to the dilemma – Should indexes be created before or after the insert to the table?

The moment of index creation impacts table statistics, and therefore should impact performance. The point that got me thinking towards this test is the following line from Books On Line (http://msdn.microsoft.com/en-us/library/ms190397.aspx), Section: “Determining When to Create Statistics”, which quite clearly mentions that: “The query optimizer creates statistics for indexes on tables or views when the index is created.”

The approach that I will be taking today is:

Create two identical tables in the TempDB
- On one table, indexes will be created before the insert of test data is done
- On the other, I will first insert some data, and then create indexes
Later, I will fire simple select queries against both, while monitoring the performance via monitoring statistics usage by:
- setting of SET STATISTICS IO ON and,
- Using the SQL Profiler

WARNING: The scripts provided in this post are provided “as-is” and without warranty for understanding purposes only. Using the DBCC scripts provided in the post on your QA or production environments can have serious side effects.

Creating the test environment

In order to create the test environment, I will first create a test table, create the necessary indexes and finally add some test data to it.

USE tempdb
GO

/*****************************************************************************************************************
   STEP 01: Create a table, complete with a primary key (to create a clustered index), and a non-clustered index
*****************************************************************************************************************/
USE tempdb
GO
CREATE TABLE DontUpdateStatistics
( StatId INT IDENTITY(1,1),
  RandomId INT DEFAULT (RAND()*1000),
  StatName VARCHAR(10) DEFAULT 'DEFAULT',
  CONSTRAINT PK_DontUpdateStatistics_StatId PRIMARY KEY CLUSTERED 
  (
    StatId
  )
)
GO

CREATE NONCLUSTERED INDEX NC_DontUpdateStatistics_RandomId ON DontUpdateStatistics(RandomId)
GO

/*****************************************************************************************************************
   STEP 03: Insert some test data into the table
*****************************************************************************************************************/
USE tempdb
GO
INSERT INTO DontUpdateStatistics DEFAULT VALUES
GO 100000

Next, I will create another table. However indexes will be created after test data is inserted.

USE tempdb
GO
/*****************************************************************************************************************
   STEP 04: Create a table - this time, do not create any indexes
*****************************************************************************************************************/
CREATE TABLE UpdateStatistics
( StatId INT IDENTITY(1,1),
  RandomId INT DEFAULT (RAND()*1000),
  StatName VARCHAR(10) DEFAULT 'DEFAULT'
)
GO

/*****************************************************************************************************************
   STEP 05: Insert some test data into the table
*****************************************************************************************************************/
USE tempdb
GO
INSERT INTO UpdateStatistics DEFAULT VALUES
GO 100000

/*****************************************************************************************************************
   STEP 06: Now create the primary key clustered index and the non-clustered index
*****************************************************************************************************************/
ALTER TABLE UpdateStatistics
    ADD CONSTRAINT PK_UpdateStatistics_StatId PRIMARY KEY CLUSTERED 
    (
        StatId
    )
GO
CREATE NONCLUSTERED INDEX NC_UpdateStatistics_RandomId ON UpdateStatistics(RandomId)
GO

Run the Test

The first thing I will do is to check if statistics for both tables are updated or not.

Confirm statistics updates

/*****************************************************************************************************************
   STEP 07: See if the statisics on the table in tempDB were updated
Script source: http://blog.sqlauthority.com/2010/01/25/sql-server-find-statistics-update-date-update-statistics/
             : by Pinal Dave
*****************************************************************************************************************/
USE tempdb
GO
SELECT name AS index_name,
STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated
FROM sys.indexes
WHERE OBJECT_ID = OBJECT_ID('DontUpdateStatistics') OR
      OBJECT_ID = OBJECT_ID('UpdateStatistics')
GO

As can be seen, the statistics for Table #1 (“DontUpdateStatistics”) did not update:

The Test Queries

Next, allow me to present a set of SELECT queries that would fetch data from the tables of interest.

USE tempdb
GO
/*****************************************************************************************************************
   STEP 08 A: Clear out the Buffers and the procedure cache.
            : Make sure to run a CHECKPOINT!
Reference source: 
http://beyondrelational.com/blogs/nakul/archive/2011/10/03/dbcc-dropcleanbuffers-needs-to-be-preceded-by-a-checkpoint-to-remove-all-buffers.aspx
*****************************************************************************************************************/
CHECKPOINT
GO
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE

/*****************************************************************************************************************
   STEP 08 B: Turn STATISTICS IO ON, and select some data from the DontUpdateStatistics table
*****************************************************************************************************************/
SET STATISTICS IO ON

SELECT * FROM DontUpdateStatistics WHERE StatId BETWEEN 10 AND 1000 
                                     AND RandomId > 300
GO

SET STATISTICS IO OFF

/*****************************************************************************************************************
   STEP 08 C: Clear out the Buffers and the procedure cache.
            : Make sure to run a CHECKPOINT!
Reference source: 
http://beyondrelational.com/blogs/nakul/archive/2011/10/03/dbcc-dropcleanbuffers-needs-to-be-preceded-by-a-checkpoint-to-remove-all-buffers.aspx
*****************************************************************************************************************/
CHECKPOINT
GO
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
GO

/*****************************************************************************************************************
   STEP 08 D: Turn STATISTICS IO ON, and select some data from the UpdateStatistics table
*****************************************************************************************************************/
SET STATISTICS IO ON

SELECT * FROM UpdateStatistics WHERE StatId BETWEEN 10 AND 1000 
                                 AND RandomId > 300
GO

SET STATISTICS IO OFF

Configure the SQL Profiler

To use the SQL Server Profiler in order to help me back my case, I connected to my SQL Server instance, and selected only one event:

Event Group: Errors & Warnings

Event Class: Missing Column Statistics

The Result

With the SQL Profiler running, I turned on the Actual Execution Plan in SSMS (by pressing Ctrl+M) and then executed the test query. The result was as shown below:

The Execution Plan
Statistics Output
Profiler Output

Confirm statistics updates

/*****************************************************************************************************************
   STEP 09: See if the statisics on the table in tempDB were updated
Script source: http://blog.sqlauthority.com/2010/01/25/sql-server-find-statistics-update-date-update-statistics/
             : by Pinal Dave
*****************************************************************************************************************/
USE tempdb
GO
SELECT name AS index_name,
STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated
FROM sys.indexes
WHERE OBJECT_ID = OBJECT_ID('DontUpdateStatistics') OR
      OBJECT_ID = OBJECT_ID('UpdateStatistics')
GO

As can be seen, the statistics for Table #1 (“DontUpdateStatistics”) are now updated. The statistics for Table #2 (“UpdateStatistics”) are unchanged:

Post-Mortem

First of all, I would like to draw attention to a line mentioned in the Profiler’s description of the “Missing Column statistics” event class. The line states – “You will not see this event produced unless the option to auto-compute statistics is turned off”. For the TempDB, auto-compute statistics is ON, so that’s not the root cause to this issue.

If this is not the issue, then what is? Summarizing the results, I can derive the following conclusions:

Both queries appear to have executed with equal performance impact – with a 50% split across them according to the execution plan
However, looking at the statistics IO output, I can see that:

Read-Ahead Reads for table #1 (“DontUpdateStatistics”) = 16
Read-Ahead Reads for table #2 (“UpdateStatistics”) = 4
The query against Table #1 (“DontUpdateStatistics”) is requires 4 times more reads when compared to the query against query #2

The SQL Profiler warns me that column statistics are not available, and therefore, a sub-optimal plan can be chosen by the optimizer

Revisiting Books On Line, (http://msdn.microsoft.com/en-us/library/ms190397.aspx), the section: “Determining When to Create Statistics”, clearly mentions that: The query optimizer creates statistics for indexes on tables or views when the index is created.

More precisely, statistics, if missing, are created when a select is performed on the columns for which missing statistics were found.

In our case, we created the indexes on table #2 (“UpdateStatistics”) after data was inserted. Because creating indexes involves selecting data from the table, a statistics update is performed. For table #1 (“DontUpdateStatistics”) though, the indexes were created before inserting the data. When the query #1 was executed, the statistics are generated, and then the optimization happens – which makes the query a little bit slower as it spends more time on the read-ahead reads.

If we look up on Books On Line for STATISTICS IO (http://msdn.microsoft.com/en-us/library/ms184361.aspx), we can see that the read-ahead read count gives us the number of pages placed into the cache. In our case, all pages for table #1 have to be placed in the cache for the statistics update.

Conclusion

It’s been a rather long post, with a simple conclusion. It is important to think about when index creation should be performed because using the proper sequence is important from the performance and IO perspective.

Until we meet next time,

Be courteous. Drive responsibly.

SQL Server – SSMS – Table Designer v/s ALTER TABLE – Best Practices – Msg 1701 – Creating or altering table failed….This exceeds

A recap

Assume that we have the following setup:

CREATE DATABASE foo
GO

USE foo
GO
CREATE TABLE MyTbl (MyName CHAR(4030))
GO

INSERT INTO MyTbl
SELECT REPLICATE('a',4030)
GO

Let us assume that we need to modify the structure of our table:

ALTER TABLE MyTbl ALTER COLUMN MyName CHAR(4031)
GO

The result of this would be a Message #1701 error:

Msg 1701, Level 16, State 1, Line 2

Creating or altering table ‘MyTbl’ failed because the minimum row size would be 8068, including 7 bytes of internal overhead. This exceeds the maximum allowable table row size of 8060 bytes.

The Root Cause

The root cause of the error was that the maximum allowed row size in Microsoft SQL Server is 8060 bytes (calculations presented in the original post here).

The ALTER TABLE statement attempts to expand the column “in-place”, i.e. it copies the data over to a new column with the new length specifications on the same row, making the row total now 8061 bytes, which is more than the 8060 maximum – resulting in the error seen.

The Solution – SSMS Table Designer

Besides modifying the storage to have smaller, more granular tables,the only solution I see is to apply the change in the following sequence:

Create a new table with the new specifications
Copy all the data over from the existing table to the new table
Drop the old table and rename the newly created table

Let’s see how the table designer can help us overcome the issue at hand with the row size increasing the 8060-byte limit.

Right-click the table in question and choose “Design” to launch the Table Designer.
In the Designer, simply change the column size.
Script the changes to a file
Save the changes & confirm in the Object Explorer that the size change completed successfully

For those interested, the scripted output of SSMS is as under, with descriptive comments added by me as required to demonstrate the essentials.

/* To prevent any potential data loss issues, you should review this script in detail before running it outside the context of the database designer.*/
BEGIN TRANSACTION
SET QUOTED_IDENTIFIER ON
SET ARITHABORT ON
SET NUMERIC_ROUNDABORT OFF
SET CONCAT_NULL_YIELDS_NULL ON
SET ANSI_NULLS ON
SET ANSI_PADDING ON
SET ANSI_WARNINGS ON
COMMIT
BEGIN TRANSACTION
GO
---Nakul - Step 01: Creating the new table
CREATE TABLE dbo.Tmp_MyTbl
	(
	MyName char(4031) NULL
	)  ON [PRIMARY]
GO
ALTER TABLE dbo.Tmp_MyTbl SET (LOCK_ESCALATION = TABLE)
GO
---Nakul - Step 02: Pump data from old table to new table
IF EXISTS(SELECT * FROM dbo.MyTbl)
	 EXEC('INSERT INTO dbo.Tmp_MyTbl (MyName)
		SELECT MyName FROM dbo.MyTbl WITH (HOLDLOCK TABLOCKX)')
GO
---Nakul - Step 03: Drop the old table and rename the new table
DROP TABLE dbo.MyTbl
GO
EXECUTE sp_rename N'dbo.Tmp_MyTbl', N'MyTbl', 'OBJECT' 
GO
COMMIT

Conclusion

As we just saw, dropping and recreating tables is the standard scripting mechanism used by the Table Designer. While it does make our life easy, the fact remains that:

The Table Designer method is performance intensive

Assume that we are trying to change the size of a character based column in a table that has a million or more rows
When using the table designer, we might end up waiting for what would look like infinity, with I/O shooting through the roof and at the end, we would have a bloated transaction log

The Table Designer performs a table recreate even if it is not required (while a recreate is essential in our case, it may not be the case when changing a column from NOT NULL to a NULL

The bottom line is: If efficient DDL can be written to make the changes required, avoid using the Table Designers as far as possible.

There is no denying that the designer is supposed to make life simpler, and that’s where I would share my philosophy around these. I use the designers as a stepping stone using the scripts generated as a base, and then trying to improve upon them. After practicing this a couple of times, I was able to churn out pretty decent, standardized DDL without using the designer at all.

Microsoft SQL Server comes with a set of some pretty good designers (Table & Query designer to name a few) which are useful productivity enhancers under most conditions, but might end up causing issues in other cases. Use them with caution.

Until we meet next time,

Be courteous. Drive responsibly.

Subscribe to my posts via E-mail: Subscribe here | Read my posts via your favourite RSS reader: Click Here!