Category Archives: Imported from BeyondRelational

These posts are imported from my old blog page: http://beyondrelational.com/modules/2/blogs/77/nakuls-blog.aspx

SQL Server Myth – Table Name cannot be same as Database Name


Microsoft SQL Server is home to many myths and legends, and I have developed an interest in exploring them. It is a very interesting exercise to learn about these myths and, where possible, prove them incorrect.

The Myth

One such myth is that a table name in a Microsoft SQL Server database cannot be the same as the database name.

The application that I work on does not have such a scenario, and I had never heard of anything remotely similar, so I had to investigate this one. Let’s see whether it is true or not via a simple DIY ("Do It Yourself") test.

The DIY Test

I have always believed that a simple test can go a long way in clarifying a concept. So, let’s begin by connecting to a SQL Server 2008 R2 instance and checking whether SQL Server allows us to create a table with the same name as the database.

/*
    WARNING - THIS CODE IS PRESENTED AS-IS AND WITHOUT WARRANTY.
              PLEASE DO NOT USE IT IN PRODUCTION/QA ENVIRONMENTS
*/
USE AdventureWorks2008R2
GO

CREATE TABLE AdventureWorks2008R2 (MyKey INT)
GO

Next, let’s insert a few values:

/*
    WARNING - THIS CODE IS PRESENTED AS-IS AND WITHOUT WARRANTY.
              PLEASE DO NOT USE IT IN PRODUCTION/QA ENVIRONMENTS
*/
USE AdventureWorks2008R2
GO
INSERT INTO AdventureWorks2008R2 VALUES (1), (2), (3)
GO

Finally, let’s attempt to select from this new table, followed by the necessary cleanup:

/*
    WARNING - THIS CODE IS PRESENTED AS-IS AND WITHOUT WARRANTY.
              PLEASE DO NOT USE IT IN PRODUCTION/QA ENVIRONMENTS
*/
USE AdventureWorks2008R2
GO
SELECT * FROM AdventureWorks2008R2
GO

--Cleanup
DROP TABLE AdventureWorks2008R2
GO

Surprised? Don’t be. It’s perfectly legal for a table to have the same name as the database it lives in.
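The shared name also causes no ambiguity in name resolution. As a quick illustration (re-creating the table for a moment, since the cleanup above dropped it), both one-part and three-part names resolve correctly, because SQL Server resolves each part of an object name independently:

```sql
/*
    WARNING - THIS CODE IS PRESENTED AS-IS AND WITHOUT WARRANTY.
              PLEASE DO NOT USE IT IN PRODUCTION/QA ENVIRONMENTS
*/
USE AdventureWorks2008R2
GO
CREATE TABLE AdventureWorks2008R2 (MyKey INT)
GO

-- One-part name: resolved as an object in the current database
SELECT * FROM AdventureWorks2008R2
GO

-- Three-part name: database.schema.object is equally valid
SELECT * FROM AdventureWorks2008R2.dbo.AdventureWorks2008R2
GO

--Cleanup
DROP TABLE AdventureWorks2008R2
GO
```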

So, what’s going on here?

Great question! That’s what one should be after.

A table is ultimately a database object, and a database object’s name is referred to as its identifier. If we look at MSDN for the rules of naming identifiers (http://msdn.microsoft.com/en-us/library/ms175874.aspx), we see that the rules for naming database objects can be summarized as follows:

  1. First character must be one of:
    • A Unicode letter (Latin characters from a-z, A-Z and characters from other languages)
    • The Underscore ("_"), at sign ("@") or the number sign ("#")
  2. Subsequent characters can be
    • Letters
    • Decimal numbers
    • Dollar sign ("$"), number sign ("#"), at sign ("@") or underscore ("_")
  3. The identifier must not be a Transact-SQL reserved word. SQL Server reserves both the uppercase and lowercase versions of reserved words.
  4. Embedded spaces or special characters are not allowed
  5. Supplementary characters are not allowed

Nowhere is it mentioned that identifiers cannot be the same as the database name.

NOTE: When an identifier name conflicts with a reserved word, it must be enclosed in square brackets. So, a table name like "TABLE" is illegal, while "[TABLE]" is perfectly legal.
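A quick sketch of the reserved-word rule in action. CREATE TABLE TABLE (…) fails with a syntax error, but the delimited form works:

```sql
/*
    WARNING - THIS CODE IS PRESENTED AS-IS AND WITHOUT WARRANTY.
              PLEASE DO NOT USE IT IN PRODUCTION/QA ENVIRONMENTS
*/
-- "TABLE" is a reserved word, so it must be delimited with square brackets
CREATE TABLE [TABLE] (MyKey INT)
GO

INSERT INTO [TABLE] VALUES (1)
SELECT * FROM [TABLE]
GO

--Cleanup
DROP TABLE [TABLE]
GO
```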

Special Thanks to…

This post has been inspired from Pinal Dave’s (blog) series – SQL SERVER – Tips from the SQL Joes 2 Pros Development Series – Efficient Query Writing Strategy – Day 4 of 35

Until we meet next time,

Be courteous. Drive responsibly.

Performance Best practice – Transaction log must be on a different drive. But WHY?


It is a well-known recommendation and best practice that the transaction log of any database must be on a drive different from the one holding the data files. This is especially useful for improving transaction log performance, a bottleneck that manifests itself as a high value for the LOGBUFFER wait type. Refer to Pinal’s post (blog) on the LOGBUFFER wait type here. Pinal has demonstrated this at multiple community meets (Tech Ed 2011, CTD – June 2011), and every time he performs the demo, the crowd erupts in admiration.

So, all’s well. But the question that kept coming back to me was – WHY? Why does moving the transaction log to its own dedicated drive benefit SQL Server performance?

To understand the WHY behind this best practice, it is imperative for us to understand the differences in the physical architecture of the transaction log and the data files.

SQL Server uses a mechanism called write-ahead logging (WAL). What this means is that before any change is persisted to the data file, a record of that change is written to the transaction log. When data is modified, the change happens in memory (the buffer pool) and the corresponding log records are hardened to the transaction log. Later, when a background checkpoint occurs, the dirty pages are written from memory to the data file. Therefore, data file performance does not directly limit the throughput of the database; the transaction throughput ultimately depends upon the performance of the transaction log.

Since users can read or write any data in the data files, the read and write activity against them is essentially random in nature. Physically, the read-write heads inside the disk jump all over the place, moving from one sector to another randomly – which slows down the drive and reduces throughput.

The transaction log, on the other hand, is written to serially. This is one of the reasons why instant file initialization cannot be used for transaction logs (refer to Paul Randal’s post here), but that’s a different story. Because the transaction log is written to serially, and read from only during check-pointing, a log backup or restart recovery, it is much more beneficial to place the transaction log on a drive whose heads do not need to move around randomly.
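One way to see whether the log file is suffering write stalls relative to the data files is to query the virtual file stats DMV for the current database. A minimal sketch (column meanings per Books On Line; the numbers are cumulative since the instance started):

```sql
/*
    WARNING - THIS CODE IS PRESENTED AS-IS AND WITHOUT WARRANTY.
              PLEASE DO NOT USE IT IN PRODUCTION/QA ENVIRONMENTS
*/
-- Compare cumulative write I/O and write stalls on data vs. log files
SELECT mf.name            AS logical_file,
       mf.type_desc       AS file_type,      -- ROWS (data) or LOG
       vfs.num_of_writes,
       vfs.io_stall_write_ms                 -- total ms spent waiting on writes
FROM sys.dm_io_virtual_file_stats(DB_ID(), NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id     = vfs.file_id
GO
```

A high io_stall_write_ms on the LOG row relative to its num_of_writes is the symptom that moving the log to its own drive is meant to address.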

This is why moving the transaction log to its own dedicated drive benefits SQL Server performance.

You can read more on the physical architecture of the transaction log in Books On Line at: http://msdn.microsoft.com/en-us/library/ms179355.aspx

Now that I understand the reason why this arrangement works, I feel much more confident in implementing the same in my development, quality assurance and production environments.

Until we meet next time,

Be courteous. Drive responsibly.

BCP & large BULK operations made faster – switch between FULL & BULK-LOGGED recovery models


Recently, a friend told me that they were experiencing performance issues in production when importing data from third-party interfaces at the end of the day. The entire import was done via BCP, at a time when no users were logged into the system. Also, since it was a disconnected import, the entire import could simply be repeated in case of any errors or issues.

After some discussion and research on MSDN, we came across the following page: Considerations for Switching from the Full or Bulk-Logged Recovery Model. Sure enough, the issues were resolved by a simple process change and the addition of fewer than 10 lines of T-SQL code on his end.

The Secret of faster Bulk operations

At this point, allow me to mention that a detailed discussion of the recovery models within SQL Server is out of the scope of this post. Please refer to Books Online for more details on each recovery model.

What the MSDN page suggests is that while the FULL recovery model is ideal for normal usage, the BULK_LOGGED recovery model can help speed up large bulk imports. The sequence that BOL recommends is:

  1. Under the FULL recovery model, take a log backup (in addition to the normal scheduled backup)
  2. Switch to the BULK_LOGGED recovery model
  3. Undertake the bulk operation
  4. Switch back to the FULL recovery model
  5. Take a log backup (in addition to the normal scheduled backup)
  6. Database available for normal operations
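The steps above can be sketched in T-SQL as follows. This is an illustrative outline only, not a production script – the database name (MyDatabase) and the backup paths are placeholders, and the bulk operation itself (BCP, BULK INSERT, etc.) runs at step 3:

```sql
/*
    WARNING - THIS CODE IS PRESENTED AS-IS AND WITHOUT WARRANTY.
              PLEASE DO NOT USE IT IN PRODUCTION/QA ENVIRONMENTS
*/
-- 1. Under FULL recovery, take a log backup first
BACKUP LOG MyDatabase TO DISK = N'X:\Backups\MyDatabase_PreBulk.trn'
GO

-- 2. Switch to the BULK_LOGGED recovery model
ALTER DATABASE MyDatabase SET RECOVERY BULK_LOGGED
GO

-- 3. Perform the bulk operation here (e.g. the BCP import)

-- 4. Switch back to the FULL recovery model
ALTER DATABASE MyDatabase SET RECOVERY FULL
GO

-- 5. Take another log backup to re-establish the log chain
BACKUP LOG MyDatabase TO DISK = N'X:\Backups\MyDatabase_PostBulk.trn'
GO
```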

Using Bulk Logged Recovery model to speed up bulk operations

While under the BULK_LOGGED recovery model, the scheduled database, log and differential backups can continue to occur.

The secret lies in the fact that under the BULK_LOGGED recovery model, some operations are minimally logged, whereas the same operations are fully logged under the FULL recovery model. Because these operations are minimally logged, a bulk operation completes faster under the BULK_LOGGED model than under the FULL recovery model.

The most commonly used operations that can benefit from the BULK_LOGGED recovery model are the bulk import operations (BCP, SELECT…INTO, BULK INSERT and INSERT…SELECT).

IMPORTANT:

While following these steps it is strongly recommended that:

  1. No users should be in the system
  2. No modifications should be made that cannot be recovered without a log or full database backup. Simply re-executing the bulk operation should be sufficient to restore the database from the last backup, if necessary

What this would not benefit

There are many myths & legends that surround SQL Server. One of them is that switching to the BULK_LOGGED recovery model speeds up log shipping because the transaction log backups would be smaller. This myth has been busted here – http://blogs.msdn.com/b/sqlserverfaq/archive/2011/01/07/using-bulk-logged-recovery-model-for-bulk-operations-will-reduce-the-size-of-transaction-log-backups-myths-and-truths.aspx

I hope that the above has helped you in setting up your bulk import processes.

Until we meet next time,

Be courteous. Drive responsibly.

SQL Server Configuration Manager – Hide your SQL Server Instance


Data is one of the most critical and valuable assets for any industry, and its protection is every individual’s responsibility. Data lives within Microsoft SQL Server, and therefore it is the duty of DBAs to ensure that their SQL Server is safe from unwanted detection and intrusion.

Today, I will demonstrate a method by which you can isolate your SQL Server instance from unwanted detection. I believe that this is a best practice and must be followed for all production systems.

SQL Server Configuration Manager

Per Books-On-Line,

  • SQL Server Configuration Manager is a tool to manage the services associated with SQL Server, to configure the network protocols used by SQL Server, and to manage the network connectivity configuration from SQL Server client computers
  • It is a Microsoft Management Console (MMC) snap-in; for SQL Server 2008, it uses the SQLServerManager10.msc snap-in component
  • System administrators: please note that Configuration Manager uses WMI to monitor and change SQL Server settings

Managing Protocol access

After all, client connectivity – the ability to access, manipulate and store data – is what SQL Server is all about. SQL Server Configuration Manager allows you to configure server and client network protocols, and connectivity options.

The SQL Server Configuration Manager allows you to control how applications on the network access your database – which protocol is to be used, whether remote connections are allowed, and the IP addresses & ports that SQL Server listens on for incoming requests. However, there is one more key aspect that can be controlled when configuring protocols – the discovery of a SQL Server instance on the network.

How do applications “discover” SQL Server?

Applications on a network discover a SQL Server instance by sending out something called an “enumeration request”. If a SQL Server instance is visible on the network, the SQL Server Browser service responds to this enumeration request, allowing the instances on the server to appear in the list of available instances to connect to.

Hiding a SQL Server instance

In a production environment, you may want to hide the production instances of SQL Server to avoid detection. Essentially, all that needs to be done is to instruct the SQL Server Browser not to respond to an enumeration request for that particular SQL Server instance. Here are the brief steps on how you can go about achieving this objective:

  1. Launch the SQL Server Configuration Manager
  2. Expand the node for “SQL Server Network Configuration”
  3. Browse out to “Protocols for <<SQL Server instance name>>”
  4. Right-click and go to “Properties”
  5. On the “Flags” tab of the “Protocols for <<SQL Server instance name>>” properties, set “Hide Instance” to “Yes” to hide this instance from detection over the network

As you can see, in just 5 steps you can hide your SQL Server. Users and applications can still connect to this instance normally, but anyone attempting to enumerate the network for a list of available SQL Servers will not find this instance on their list.

Other “How-to” topics

You can do a lot with the SQL Server Configuration Manager. You can find a list of common operations, and how to go about them on the Books-On-Line page here: http://msdn.microsoft.com/en-us/library/ms188707.aspx

By the way…

I have often seen developers using their trusted Windows Services applet to restart the SQL Server service. This was acceptable until SQL Server 2008 came about; starting with SQL Server 2008, this practice should not be followed.

This is because SQL Server uses a service master key for all its operations, and this key needs to be regenerated upon a service restart. These keys are not regenerated if the services are restarted using the Windows Services applet. The SQL Server Configuration Manager is the right place to go if you need to restart a SQL Server service.

Until we meet next time,

Be courteous. Drive responsibly.

Msg 1701 – Creating or altering table failed….This exceeds the maximum allowable table row size of 8060 bytes.


In a perfect world, a table, once designed, would never undergo any modification for the entire life of the product. However, nothing’s perfect, and as they say, the only thing constant is change. Requirements change, and so do table designs. However, there is one little aspect of SQL Server here that most developers and novice administrators do not quite understand correctly.

Assume that we have the following setup:

CREATE DATABASE foo
GO

USE foo
GO
CREATE TABLE MyTbl (MyName CHAR(4030))
GO

INSERT INTO MyTbl
SELECT REPLICATE('a',4030)
GO
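Before altering the column, it can be instructive to check how large the rows currently are. A hedged sketch using sys.dm_db_index_physical_stats in DETAILED mode, whose max_record_size_in_bytes column reports the largest record on the heap:

```sql
/*
    WARNING - THIS CODE IS PRESENTED AS-IS AND WITHOUT WARRANTY.
              PLEASE DO NOT USE IT IN PRODUCTION/QA ENVIRONMENTS
*/
-- Report the current maximum record size (in bytes) for MyTbl
SELECT ips.alloc_unit_type_desc,
       ips.max_record_size_in_bytes
FROM sys.dm_db_index_physical_stats(
         DB_ID('foo'), OBJECT_ID('MyTbl'), NULL, NULL, 'DETAILED') AS ips
GO
```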

Let us assume that we need to modify the structure of our table:

ALTER TABLE MyTbl ALTER COLUMN MyName CHAR(4031)
GO

While the ALTER TABLE statement should work just fine, it does not. What we encounter is:

Msg 1701, Level 16, State 1, Line 2

Creating or altering table ‘MyTbl’ failed because the minimum row size would be 8068, including 7 bytes of internal overhead. This exceeds the maximum allowable table row size of 8060 bytes.

So, what went wrong? We just wanted to increase one byte!

The post-mortem

Most of us have a tendency not to read error messages. Whatever the message, we just click “OK”. Similarly, if SQL Server throws anything in red, we simply hit F5 again, just in case SQL was not in the right spirits the first time around.

However, in this case, a close look at the message is the key to understanding what’s going on. Let’s start with the final sentence: This exceeds the maximum allowable table row size of 8060 bytes.

What is this number, 8060 bytes? It is the maximum amount of data that a single row can occupy on a data page.

Per Books On Line (Understanding Pages & Extents), here’s how the numbers work out:

  • Data page length: 8192 bytes
  • Page header: 96 bytes
  • Other overhead: 36 bytes
  • Total data space available on a page = 8192 – 96 – 36 = 8060 bytes

Once we have established this, let’s study what we are trying to do:

  • Current column: CHAR(4030) – 4030 bytes per row
  • New column: CHAR(4031) – 4031 bytes per row
  • Total during the alteration: 4030 + 4031 = 8061 bytes

So, we are trying to take a row (because we only have one column, the column size is the row size) of 4030 bytes, and increase it to 4031 bytes, which should be okay because it ultimately remains under the 8060 byte limit. But that’s where looks are deceiving.

The ALTER TABLE statement attempts to expand the column “in place”, i.e. it copies the data over to a new column with the new length specification on the same row, making the row total 8061 bytes – more than the 8060-byte maximum.

Now that we know we have a row that’s larger than the maximum number of bytes a page can hold, we move down the Books On Line article to a section on large row support, which mentions – “Rows cannot span pages, however portions of the row may be moved off the row’s page so that the row can actually be very large.”

“Rows cannot span pages” – everything boils down to this simple statement. Because our request causes the row to exceed 8060 bytes “in transit”, the ALTER TABLE fails.

You can get more information on Rows exceeding the 8K limit here: http://msdn.microsoft.com/en-us/library/ms186981.aspx

Workarounds

There are a couple of workarounds to this problem. However, all are fairly crude mechanisms with their own pros & cons and should not be used unless absolutely necessary.

  1. The most obvious one – if possible, reduce the data within a row so that we do not exceed the allowed limits
  2. Create a new table with the new specifications, copy all the data over, transforming it on the way, and finally drop the old table
  3. Export the existing data to other table(s) and truncate the existing table. Make the necessary changes and import the data back into the original table
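Workaround 2 can be sketched as follows. This is an illustrative outline only – the staging table name (MyTbl_New) is a placeholder, and the widening from CHAR(4030) to CHAR(4031) happens implicitly during the copy:

```sql
/*
    WARNING - THIS CODE IS PRESENTED AS-IS AND WITHOUT WARRANTY.
              PLEASE DO NOT USE IT IN PRODUCTION/QA ENVIRONMENTS
*/
-- Rebuild the table instead of ALTERing the column in place
CREATE TABLE MyTbl_New (MyName CHAR(4031))
GO

-- Copy the data over; values widen implicitly to CHAR(4031)
INSERT INTO MyTbl_New (MyName)
SELECT MyName FROM MyTbl
GO

-- Drop the old table and give the new one its name
DROP TABLE MyTbl
GO
EXEC sp_rename 'MyTbl_New', 'MyTbl'
GO
```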

I hope that you found this behaviour interesting and also enjoyed learning about the internal workings of Microsoft SQL Server as much as I did.

Until we meet next time,

Be courteous. Drive responsibly.