Tag Archives: Development

Articles on Microsoft SQL Server development

#0420 – SQL Server – SSIS – Does SSIS use TCP/IP by default?


A while ago, I wrote a post on how to force SSMS to use TCP/IP for connections to the local SQL Server instance. This led me to a question: what is the default connection protocol used by SQL Server Integration Services (SSIS) when connecting to SQL Server? Does SSIS have the same affinity for Shared Memory over TCP/IP?

Does SSIS always use TCP/IP?

To find the answer, I devised a simple test: I wrote a basic SSIS package and checked the properties of the connections it opened.

The test package

The test package is quite simple. I have a Data Flow task with two (2) OLE DB connection managers – one connected to the [AdventureWorks2019] sample database and the other connected to the [tempdb] database.

The Data Flow task simply pumps all rows of the [AdventureWorks2019].[dbo].[Employee] table to an identical test table [tempdb].[dbo].[dEmployee] that I had created before the test.

To allow for easier monitoring of the connection, I have also added a Script Component that sleeps for 1 second (1000 milliseconds) while processing each row.

The image shows the internals of the Data Flow task: an OLE DB Source pumping data to an OLE DB Destination through a Script Component.
Sample SSIS Package – Internals of the Data Flow task

If you are interested, here’s how the Script Component has been configured:

Script Component configuration showing all input columns selected and transferred directly to the output stream.

Here’s the code for the 1 second delay (1000 ms = 1 sec):

/// <summary>
/// This method is called once for every row that passes through the component from Input0.
///
/// Example of reading a value from a column in the row:
///  string zipCode = Row.ZipCode
///
/// Example of writing a value to a column in the row:
///  Row.ZipCode = zipCode
/// </summary>
/// <param name="Row">The row that is currently passing through the component</param>
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    //Introduce a 1 second sleep for each row (Thread requires the System.Threading namespace)
    Thread.Sleep(1000);
}

Test Execution & Results

The test strategy is quite simple – to run the package and monitor the connections opened by the package based on the hosting process.

From Visual Studio

When run from Visual Studio, the SSIS package executes under the debug host process (DtsDebugHost.exe). So, with the package running, I used the Windows Task Manager to find the process Id of that host process.

Finding the host process PID for the DtsDebugHost.exe process.

Once the process Id has been identified, we head over to SQL Server Management Studio (SSMS) and run the following queries (after substituting the appropriate Ids).

SELECT [session_id],
       [host_name],
       [program_name],
       [host_process_id],
       [client_interface_name],
       [database_id],
       DB_NAME([database_id]) AS [DatabaseName],
       [row_count]
FROM [sys].[dm_exec_sessions]
WHERE [is_user_process] = 1
  AND [host_process_id] = 4388; --This is the PID from the Task Manager
GO

SELECT * 
FROM [sys].[dm_exec_connections]
WHERE [session_id] IN (63, 66); --This is the list of session_id from the query above
GO

Here’s what the queries returned in my case:

T-SQL Query output showing that SSIS connections open Shared Memory connections by default.
The SSIS connections open Shared Memory connections by default

As can be seen, SSIS used Shared Memory connections by default when the package was debugged from Visual Studio.
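
If you prefer to do this check with a single query, the two DMVs can be joined on [session_id] so that the transport protocol shows up alongside the session details. Here’s a minimal sketch (substitute the PID reported by your Task Manager):

SELECT [es].[session_id],
       [es].[host_process_id],
       [es].[program_name],
       [ec].[net_transport],   --Shared memory / TCP / Named pipe
       [ec].[auth_scheme]
FROM [sys].[dm_exec_sessions] AS [es]
INNER JOIN [sys].[dm_exec_connections] AS [ec]
    ON [ec].[session_id] = [es].[session_id]
WHERE [es].[is_user_process] = 1
  AND [es].[host_process_id] = 4388; --The PID from the Task Manager
GO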

Independent execution of the SSIS package

To confirm that the Shared Memory connections were not caused by the SSIS debug host process, I simply executed the same package by double-clicking on it (which invokes DTExecUI.exe, the Execute Package Utility).

Getting the host PID for the DTExecUI.exe process

Using the same set of queries from above, here’s the output:

T-SQL Query output showing that SSIS connections open Shared Memory connections by default.
The SSIS connections open Shared Memory connections by default

As can be seen, SSIS opens Shared Memory connections by default here as well.
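
As an aside – if you ever need SSIS to use TCP/IP even against a local instance, the same trick from the SSMS post should work here as well: prefix the server name with tcp: in the OLE DB connection manager’s connection string. The sketch below is illustrative only; the server and database names are placeholders, and you should keep the provider and other properties exactly as your connection manager generated them:

Data Source=tcp:YourServerName;Initial Catalog=AdventureWorks2019;Provider=SQLNCLI11.1;Integrated Security=SSPI;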

References

  • SSMS uses Shared Memory connections by default [Blog Link]
  • How to download the AdventureWorks2019 sample database [MSDN Link]

Until we meet next time,

Be courteous. Drive responsibly.

#0419 – 2 things I didn’t know about Notepad


Today’s post is a quick one based on an experience I had recently. In one hour, I learnt two (2) things I didn’t know about Windows Notepad.

I was in an Azure training recently, working my way through a lab exercise. As I was building my environment, I collected a bunch of tokens and connection strings in a Notepad file. And that’s when I ran into the first thing I didn’t know about Notepad.

The maximum length of text allowed in a line is 1024 characters

The following is simulated text that I generated by replicating the English alphabet and the digits (0-9) until the resulting string was 1044 characters long. In Notepad, the string automatically wraps at 1024 characters.

The workaround was simply to open the file in the Visual Studio IDE (which does not have the same limitations).

Image showing automatic wrapping of strings to 1024 characters in Notepad

Text searches only work for the first 128 characters

Immediately after I discovered the word-wrap limit, I was trying to search for a connection string (which was 133 characters in length) and ended up with multiple hits (which I should not have).

After triple-checking everything, I looked at what was actually being searched, and that’s when I realized that the search box only takes 128 characters.

Screengrab showing that the text in the Notepad search box stops at 128 characters
Screengrab showing that the text in the Notepad search box stopped at 128 characters (green lines)

If you want to see it for yourself

You can use copies of the following string (36 characters in length) and an instance of Notepad:

abcdefghijklmnopqrstuvwxyz0123456789
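
Since this is a SQL Server blog, here’s one way (a throwaway sketch) to generate the 1044-character test string – 29 copies of the 36-character string – using T-SQL’s REPLICATE(), instead of pasting it by hand:

--29 copies of the 36-character string = 1044 characters
SELECT REPLICATE('abcdefghijklmnopqrstuvwxyz0123456789', 29) AS [TestString],
       LEN(REPLICATE('abcdefghijklmnopqrstuvwxyz0123456789', 29)) AS [StringLength];
GO

Copy the [TestString] value into Notepad to see the automatic wrap at 1024 characters.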

Until we meet next time,

Be courteous. Drive responsibly.

#0417 – SQL Server – Select row count of local temp tables


I was recently contacted by a fellow team member who was interested in finding out the number of records in a temporary table which was being used as part of a long-running script.

As I mentioned in one of my previous posts, local temporary tables are only accessible to the connection that created them. Hence, if a script is already running, the only connection that can query the local temporary table is that connection itself, which makes this requirement a tricky one to work with.

The Solution

The key is to realize that all tables – permanent or temporary – consume storage, either in a system/user database or in [tempdb]. Hence, we can access this metadata to fulfill our requirement. Let’s check it out with a demo.

Demo

From any query window, we can access the dynamic management view (DMV) [sys].[dm_db_partition_stats]. This DMV returns page and row-count information for every partition in the current database.

So, let’s open a new query window in the SQL Server Management Studio and run the following query:

--Create a new test database
IF (DB_ID('SQLTwins') IS NULL)
BEGIN
    CREATE DATABASE [SQLTwins];
END
GO

USE [SQLTwins];
GO

--Window 01
BEGIN
    IF OBJECT_ID('tempdb..#SQLTwinsDemo','U') IS NOT NULL
    BEGIN
        DROP TABLE [dbo].[#SQLTwinsDemo];
    END

    CREATE TABLE [dbo].[#SQLTwinsDemo] ([Number] INT         NOT NULL,
                                        [Value]  VARCHAR(50) NOT NULL
                                       );

    INSERT INTO [dbo].[#SQLTwinsDemo] ([Number],
                                       [Value]
                                      )
    VALUES (9, 'Nine' ),
           (8, 'Eight'),
           (7, 'Seven'),
           (6, 'Six'  ),
           (5, 'Five' ),
           (4, 'Four' ),
           (3, 'Three'),
           (2, 'Two'  ),
           (1, 'One'  );
END

Now, in another window, try to run a simple row count query. As expected, it returns an error.

USE [SQLTwins];
GO
SELECT COUNT(*) FROM [dbo].[#SQLTwinsDemo];
GO
Msg 208, Level 16, State 0, Line 3
Invalid object name '#SQLTwinsDemo'.

Now, let’s use the DMV: [sys].[dm_db_partition_stats] in another window to get the row count information.

USE [SQLTwins];
GO
--Now, do this in Window #2
BEGIN
    SELECT [st].[name] AS [TableName],
           [partitionStatistics].[row_count] AS [RowCount]
    FROM [tempdb].[sys].[dm_db_partition_stats] AS [partitionStatistics]
    INNER JOIN [tempdb].[sys].[tables] AS [st] ON [st].[object_id] = [partitionStatistics].[object_id]
    WHERE [st].[name] LIKE '%SQLTwinsDemo%'
      AND ([partitionStatistics].[index_id] = 0  --Table is a heap
           OR
           [partitionStatistics].[index_id] = 1  --Table has a clustered index
          )
END
Fetching RowCount for local temporary tables using SQL Server DMVs
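
One refinement worth noting: in [tempdb], a local temporary table’s internal name is padded and suffixed (so several sessions can each have their own #SQLTwinsDemo), and a table can have more than one partition. A sketch that sums the row counts per object and returns the full internal name, so you can tell similarly-named temp tables apart, could look like this:

SELECT [st].[name] AS [InternalTableName],
       SUM([partitionStatistics].[row_count]) AS [TotalRowCount]
FROM [tempdb].[sys].[dm_db_partition_stats] AS [partitionStatistics]
INNER JOIN [tempdb].[sys].[tables] AS [st]
    ON [st].[object_id] = [partitionStatistics].[object_id]
WHERE [st].[name] LIKE '#SQLTwinsDemo%'
  AND [partitionStatistics].[index_id] IN (0, 1)  --Heap or clustered index only
GROUP BY [st].[name];
GO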

Hope it helps!

Until we meet next time,

Be courteous. Drive responsibly.

#0416 – SQL Server – Msg 8101 – Use column lists when working with IDENTITY columns


I have often written about IDENTITY columns on my blog. Identity columns, most commonly used to implement auto-incrementing keys, have been around for more than a decade now. Yet, I often see teams run into interesting issues, especially when data is being migrated from one system to another.

Today’s post is based on one such incident that came to my attention.

The team was trying to migrate data from one table to another as part of an exercise to restructure the database for better efficiency. When moving the data, they were using SET IDENTITY_INSERT ON in order to insert explicit values into the identity column. However, they were running into an error.

Msg 8101, Level 16, State 1, Line 24
An explicit value for the identity column in table 'dbo.tIdentity' can only be specified when a column list is used and IDENTITY_INSERT is ON.

Here is a simulation of what they were doing:

USE tempdb;
GO
SET NOCOUNT ON;
--Prepare the environment
--   Create a table, and add some test data into it
--Safety Check
IF OBJECT_ID('tIdentity','U') IS NOT NULL
    DROP TABLE dbo.tIdentity;
GO
--Create a table
CREATE TABLE dbo.tIdentity (IdentityId INT IDENTITY(1,1),
                            IdentityValue VARCHAR(10)
                           );
GO
--Turn Explicit IDENTITY_INSERT ON and insert duplicate IDENTITY values
SET IDENTITY_INSERT dbo.tIdentity ON;
GO
--NOTICE: No column list has been supplied in the INSERT
INSERT INTO dbo.tIdentity
VALUES (1, 'One'),
       (2, 'Two');
GO

--RESULTS
--Msg 8101, Level 16, State 1, Line 24
--An explicit value for the identity column in table 'dbo.tIdentity' can only be specified when a column list is used and IDENTITY_INSERT is ON.

The Solution

Let’s re-read the error. It clearly indicates what the issue is – if we need to insert explicit values into an identity column, we must use a column list in our INSERT statement, as shown below.

USE tempdb;
GO
SET NOCOUNT ON;
--Prepare the environment
--Create a table, and add some test data into it
--Safety Check
IF OBJECT_ID('tIdentity','U') IS NOT NULL
    DROP TABLE dbo.tIdentity;
GO

--Create a table
CREATE TABLE dbo.tIdentity (IdentityId INT IDENTITY(1,1),
                            IdentityValue VARCHAR(10)
                           );
GO

--Turn Explicit IDENTITY_INSERT ON and insert duplicate IDENTITY values
SET IDENTITY_INSERT dbo.tIdentity ON;
GO

--NOTE: Column list has been supplied in the INSERT,
--      so, no errors will be encountered    
INSERT INTO dbo.tIdentity ([IdentityId], [IdentityValue])
VALUES (1, 'One'),
       (2, 'Two');
GO

--Confirm that data has been inserted
SELECT IdentityId,
       IdentityValue
FROM dbo.tIdentity;
GO

--Now that data has been inserted, turn OFF IDENTITY_INSERT
SET IDENTITY_INSERT dbo.tIdentity OFF;
GO

-----------------------------------------------------------------
--RESULTS
----------
--IdentityId  IdentityValue
--1           One
--2           Two
 -----------------------------------------------------------------
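
One optional follow-up after a migration like this: once IDENTITY_INSERT is turned OFF, verify the current identity seed so that the next auto-generated value does not collide with the values just inserted. SQL Server normally adjusts the seed automatically when a larger explicit value is inserted, but checking is cheap. A quick sketch using DBCC CHECKIDENT:

--Report the current identity value without changing it
DBCC CHECKIDENT ('dbo.tIdentity', NORESEED);
GO

--If required, reseed so that the next generated value is 3
DBCC CHECKIDENT ('dbo.tIdentity', RESEED, 2);
GO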

Hope you will find this helpful.

Until we meet next time,

Be courteous. Drive responsibly.

#0415 – SQL Server – Performance Tuning – Use STRING_AGG to generate comma separated strings


With more and more data being exchanged over APIs, generating comma-separated strings is becoming a much more common requirement.

A few years ago, I wrote about two different ways to generate comma-separated strings. The one I most commonly find in use when generating comma-separated values from a table is the intermediate conversion to XML. This, however, is a very costly mechanism, and the query can potentially take minutes to run depending upon the amount of data involved.
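
For reference, the XML-based pattern mentioned above typically looks something like the sketch below (using the same sample data as the STRING_AGG examples that follow). It works on older versions, but the STUFF()/FOR XML PATH('') combination is both harder to read and noticeably slower on large data sets:

--Pre-SQL Server 2017 pattern: FOR XML PATH('') + STUFF() to strip the leading separator
DECLARE @NamesTable TABLE ([Id] INT,
                           [Name] NVARCHAR(50)
                          );

INSERT INTO @NamesTable
VALUES (1, 'A'), (2, 'D'), (2, 'C'),
       (3, 'E'), (3, 'H'), (3, 'G');

SELECT STUFF((SELECT ',' + [tbl].[Name]
              FROM @NamesTable AS [tbl]
              FOR XML PATH('')
             ), 1, 1, '') AS [CommaSeparatedString];
GO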

SQL Server 2017 brings a new aggregate function that can be used to generate comma-separated values extremely fast. The function is STRING_AGG().

Here’s a sample of its usage:


--WARNING: THIS SCRIPT IS PROVIDED AS-IS AND WITHOUT WARRANTY.
--FOR DEMONSTRATION PURPOSES ONLY

--Step 01: Generate Temp table to store source data
DECLARE @NamesTable TABLE ([Id] INT,
                           [Name] NVARCHAR(50)
                          );

--Step 02: Generate test data
INSERT INTO @NamesTable
VALUES (1, 'A'),
       (2, 'D'),
       (2, 'C'),
       (3, 'E'),
       (3, 'H'),
       (3, 'G');

--Step 03: Using STRING_AGG to generate comma-separated strings
SELECT STRING_AGG(tbl.Name, ',') AS [CommaSeparatedString]
FROM @NamesTable AS tbl;
GO

/*RESULTS
CommaSeparatedString
A,D,C,E,H,G
*/

Advantages of STRING_AGG:

  • Can be used just like any other aggregate function in a query
  • Can work with any user supplied separator – doesn’t necessarily have to be a comma
  • No manual cleanup step required – a trailing separator is not added at the end of the concatenated string
  • STRING_AGG() is significantly faster than using XML based methods
  • Can be used under any database compatibility level, as long as the version is SQL Server 2017 (or higher) or Azure SQL Database

Here’s an example of how STRING_AGG can be used with any separator:

--WARNING: THIS SCRIPT IS PROVIDED AS-IS AND WITHOUT WARRANTY.
--FOR DEMONSTRATION PURPOSES ONLY

--Step 01: Generate Temp table to store source data
DECLARE @NamesTable TABLE ([Id] INT,
                           [Name] NVARCHAR(50)
                          );

--Step 02: Generate test data
INSERT INTO @NamesTable
VALUES (1, 'A'),
       (2, 'D'),
       (2, 'C'),
       (3, 'E'),
       (3, 'H'),
       (3, 'G');

--Step 03: Using STRING_AGG to generate strings with a custom separator
SELECT STRING_AGG(tbl.Name, '-*-') AS [CustomSeparatorString]
FROM @NamesTable AS tbl;
GO

/*RESULTS
CustomSeparatorString
A-*-D-*-C-*-E-*-H-*-G
*/

A minor challenge

As with every new feature, there is a small usability challenge with STRING_AGG: one cannot use keywords like DISTINCT to ensure that only distinct values are used when generating the comma-separated string. There is, however, an Azure feedback item open where you can cast your vote if you feel this feature would be useful.
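
Until such support arrives, one workaround (just a sketch, with its own sample data that includes a duplicate value) is to de-duplicate in a derived table before aggregating:

--Workaround: remove duplicates in a derived table before using STRING_AGG
DECLARE @NamesTable TABLE ([Id] INT,
                           [Name] NVARCHAR(50)
                          );

INSERT INTO @NamesTable
VALUES (1, 'A'), (2, 'D'), (2, 'C'),
       (3, 'E'), (3, 'A'), (3, 'G');  --Note the duplicate 'A'

SELECT STRING_AGG([dedup].[Name], ',') AS [DistinctCommaSeparatedString]
FROM (SELECT DISTINCT [tbl].[Name]
      FROM @NamesTable AS [tbl]
     ) AS [dedup];
GO

--Returns something like: A,C,D,E,G (the duplicate 'A' appears only once)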

Further Reading

  • Different ways to generate a comma-separated string from a table [Blog Link]
  • STRING_AGG() Aggregate Function [MSDN BOL]

Until we meet next time,

Be courteous. Drive responsibly.