Tag Archives: HowTo

All topics “HowTo” in Microsoft SQL Server.

#0395 – SQL Server – SSIS – Adding date and time to a file name after processing


As we have been discussing over the last couple of weeks, moving data from a “source” to a “destination” is the crux of SSIS. The data source can be literally anything – a SQL Server  database, a web-service, a script, an XML or even a conventional delimited file.

Picking up files from a folder and processing them is one of the most common use-case that I have seen. In order to avoid processing the same file over and over again, the most common requirement is to append the file name with a date-time value. In this post, I will provide an expression I use to achieve this goal.

Appending a Date/Time and renaming a file in SSIS

The easiest way to append a date/time value to a file name and renaming the file in SSIS is via SSIS expressions.

Put every simply – an expression is a formula, a combination of variables, literals and operators that work together and evaluate to a single value.

We can use the following expression to yield a date value:

SUBSTRING(@[User::FileName], 1, FINDSTRING(@[User::FileName],".",1) -1 ) + (DT_WSTR,4)YEAR(GETDATE()) + (DT_WSTR,2)MONTH(GETDATE()) + (DT_WSTR,2)DAY(GETDATE())

This can then be appended to the file name and a simple File System Task can rename the file for us on the file system. Let’s see this in action with an example.

Demo

Assume a scenario where I have a set of files in a folder, and I need to do some processing on them. After processing, I need to update the file names with the date. For the sake of brevity of this example, I will not be performing any other operation on the files other than renaming them.

My folder containing the input files looks like this:

Folder with input files which need to be renamed once processing is complete

Folder with input files which need to be renamed once processing is complete

Now, I create an SSIS package that contains a simple For Each file iterator and a File System task.

For Each Loop configuration

The For Each Loop (configured as a file iterator) uses an expression populated by a variable to get the source directory information.

The file name and extension is fetched into a variable which will be used during the processing and subsequent renaming.

0395-sqltwins-overall-package

Overall package configuration showing the For Each File Iterator

0395-sqltwins-foreachloop-collection-configuration

For Each Iterator – Collection configuration using a variabel for the source directory

0395-sqltwins-foreachloop-fetching-file-name

Fetching the individual file name & extension into a variable

Now, I create a new user variable “OutputFileName” and use the expression below to generate the output file name. The expression has essentially 3 distinct parts:

  1. Fetch the file name (without the extension)
  2. Append the date to this string
  3. Fetch the file extension and append to the modified/new file name
@[User::SourceFolder] + 
SUBSTRING(@[User::FileName], 1, FINDSTRING(@[User::FileName],".",1) -1 ) + (DT_WSTR,4)YEAR(GETDATE()) + (DT_WSTR,2)MONTH(GETDATE()) + (DT_WSTR,2)DAY(GETDATE()) + 
SUBSTRING(@[User::FileName], FINDSTRING(@[User::FileName],".",1), LEN(@[User::FileName]))
Output File Path expression

Output File Path expression

This variable is now used in the configuration of the file system task which is responsible for renaming the file.

File System Task configuration

The file system task is an extremely flexible task component in SSIS. It can operate on the file system not only by using file connections but also on the basis of variables! For our problem, we will leverage this flexibility of the File System task.

As can be seen from the screenshot below, my File System task has been configured as follows:

  • Source File Path is a variable
  • Destination File Path is a variable
  • File Operation type = “Rename File”

The “Rename File” operation renames the file specified by the old file path and renames it to the name specified by the new file path.

File System Task configuration using “Rename File” mode where input & output file names are supplied via variables

Once the package is executed, we can see that the files in the source folder are now updated as expected.

Source Folder with the File Names updated as expected

Source Folder with the File Names updated as expected

Further Reading

Until we meet next time,

Be courteous. Drive responsibly.

High-level usage of the File Ordering script task

#0394 – SQL Server – SSIS – ForEach Loop – Controlling processing order


One of the SSIS related questions that I see often in the forums is around the ability to control the sequence in which a For Each Loop processes the files from a directory. When performing data movement as part of an ETL or an integration, it may be required to work asynchronously. The “sender” of the data may place the files in a folder for the data processing to pickup. These files generally contain ordered data that needs to be loaded in a specific sequence.

The problem is that the SSIS For Each Loop container does not have the provision that allows a user to specify the order in which the files are to be processed. Today, I present one of the workarounds that I came up with (there may be others, and I would be very interested to know what do you prefer in these scenarios).

Environment Prep

To demonstrate the ordered loading of files, I created a set of simple files  and a table to which I will be adding the data read from these files.

USE [tempdb];
GO

IF OBJECT_ID('dbo.ImportedData','U') IS NOT NULL
BEGIN
    DROP TABLE dbo.ImportedData;
END
GO


CREATE TABLE dbo.ImportedData 
   (RecordId           INT          NOT NULL IDENTITY(1,1),
    ProcessedFileName  NVARCHAR(255) NOT NULL,
    RecordValue        NVARCHAR(100)     NULL,
    ProcessedDate      DATETIME     NOT NULL,
    CONSTRAINT pk_ImportedData PRIMARY KEY CLUSTERED (RecordId, ProcessedDate)
   );
GO

To demonstrate the fix, I will not be processing the files in order of their file names, but I will instead be processing them in order of  the date on which they were last saved. The screenshot below shows the order in which files will be processed.

Folder showing files to process in the required non-alphabetical order

Folder showing files to process in the required non-alphabetical order

Core Logic – Identify the Order of the Files to Process

The For Each container is just a mechanism to loop through a set (of objects/files/records). It does not have the provision to control the order in which the set is to be iterated through.

Hence, the workaround that I came up with was to supply a pre-ordered set to the For Each container. I therefore used a script task to order the filenames in a recordset object which I will pass to the For Each container.

As shown below, I pass the path to the folder where the files are located, and accept the sorted list into an object.

High-level usage of the File Ordering script task

High-level usage of the File Ordering script task

The script used within the script task which is also the heart of this solution is provided below.

public void Main()
{
    // TODO: Add your code here
    DirectoryInfo di = new DirectoryInfo((string)Dts.Variables["User::FolderPath"].Value);
    List<string> files = new List<string>();

    foreach (var fi in di.EnumerateFiles().OrderBy(t=>t.LastWriteTime))
    {
        files.Add(fi.FullName);
    }

    Dts.Variables["User::FileListing"].Value = files;

    Dts.TaskResult = (int)ScriptResults.Success;
}

Process the File

The  next stage is simply to process the file.

  • The sorted set of files is iterated upon by the For Each container
  • I have a script task that simply prints out a message box with the file being processed currently
  • The data flow task simply reads from the file and writes to a conventional OLE DB destination

Screenshots showing these bits of the configuration are provided below.

0394-sqltwins-ssis-foreachcontainer

For Each Loop Iteration Configuration

0394-sqltwins-ssis-foreachcontainer2

For Each Loop File Name configuration

0394-sqltwins-ssis-dfd_fileconnection

Setting the file connection based on the File Name set by For Each Iterator

If I execute the SSIS package and then query the destination table, we see that the files were indeed ordered in the expected order.

USE [tempdb];
GO
SELECT * 
FROM dbo.ImportedData 
WHERE RecordValue LIKE '%1%';
GO
0394-sqltwins-ssis-output

Output of script to confirm that data was loaded in the required sequence

Further Reading

Until we meet next time,

Be courteous. Drive responsibly.

Showing how to debug a DML trigger by stepping into a DML trigger during a SSMS T-SQL Debugging session

#0382 – SQL Server – SSMS – Debugging a T-SQL DML Trigger


I often receive questions related to debugging code using SSMS. Based on these questions, I have written a lot of blog posts in the past, however I was recently asked the question:

How can I debug a DML trigger in SSMS?

I realized that while I had encountered the same question and researched it in my initial days (when I worked on SQL Server 2000), I had never written about it. So, here goes.

In order to demonstrate how to debug a trigger, all we need to do is debug the statement that initiates the trigger.

For this demonstration, I will fire an update on the [Sales].[SalesOrderHeader] table of the AdventureWorks database.

USE AdventureWorks2012;
GO
UPDATE soh
SET soh.SubTotal += 2
FROM Sales.SalesOrderHeader AS soh
WHERE soh.SalesOrderNumber = 'SO43697';

When we debug this query (using F10), the SSMS debugger launches and we can step through the code using the same key combinations as we have in Visual Studio.

Demonstrating how to debug DML triggers using SSMS

Debugging T-SQL script using SSMS

Executing the update should fire the DML trigger [uSalesOrderHeader] which can be debugged like any other T-SQL code by stepping into the trigger (press F11 when executing the update).

Showing how to debug a DML trigger by stepping into a DML trigger during a SSMS T-SQL Debugging session

Stepping into a DML trigger during a SSMS T-SQL Debugging session

That’s it. Debugging a trigger is no different than debugging a stored procedure or any other T-SQL script.

In case you would like to learn more about debugging in SSMS, please do refer my previous posts (links below). If there is anything specific that you would like to know more about, do drop in a line.

Further Reading

  • T-SQL Debugger – Underappreciated Features of Microsoft SQL Server [Blog Link]
  • Enabling T-SQL Debugger in SQL Server Management Studio (SSMS) [Blog Link]
  • SSMS – Tutorial – Part 03 – The T-SQL Debugger (L200) [Blog Link]
  • T-SQL Debugging – Connection Errors & Firewall settings [Blog Link]
  • T-SQL Debugging – SSMS errors – MSDBG2.DLL [Blog Link]
  • Setting, Labeling, Using & Searching Conditional Breakpoints [Blog Link]
  • Export/Import Breakpoints [Blog Link]
  • Using the “When Hit, Do Something” breakpoint configuration [Blog Link]
  • Using Data-Tips [Blog Link]
  • Editing, Exporting and Importing Data Tips [Blog Link]

Until we meet next time,
Be courteous. Drive responsibly.

#0374 – SQL Server – Removing string terminators (“\0”) using REPLACE during string concatenation


Recently, I was called upon to troubleshoot a strange behaviour demonstrated by a data conditioning script that involved string concatenations. The script ran fine without any errors, but the script did not appear to be concatenating string.

The entire script for this post is shared towards the end of the post. Because the script involves creating specific data which would give away the root cause, I will not be presenting the snippets beforehand.

Assume that we have a simple table with two columns, “FirstName” and “LastName”:

Simple table with test data for string concatenation demo

Simple table with test data for string concatenation demo

The data conditioning script involved populating the “FullName” column in the table with a simple combination of the First and the Last names.

Output of String Concatenation Script demonstrating the problem

Output of String Concatenation Script demonstrating the problem

The Problem: If we look at the output carefully, there is a problem with the FullNames for rows # 2 and 4. Although the LastName is present, only the FirstName is seen in the concatenation result.

The Theory:

After about an hour of troubleshooting, we decided to check out the length of the strings in the table, and that’s when we hit gold. Although we could “see” only a couple of characters, the length was turning out to be a higher than what we expected.

As can be seen from the screenshot below, although the First Name “John” has a length of 4, we get 5 in the length. Similarly, although the FullName shows up as “John”, we get a length of 9.

Screenshot showing the length of the strings in the table

Screenshot showing the length of the strings in the table

This is the moment when the light bulb went off and we realized what was going on. The strings were inserted by a legacy application based on C/C++ code. In such legacy applications, we need to explicitly handle termination of strings by adding the string termination character. That would very well account for the presence of an additional character in the FirstName.

In order to explain the mystery behind the length of the FullName, let’s walk through the concatenation of a record.

  • FirstName = John, 4 characters
  • LastName = Doe, 3 characters
  • Expected length of FullName = First Name + a space + Last Name = 4 + 1 + 3 = 8 characters
  • Actual length of FullName = 9 characters

The difference can be accurately explained when we believe that the concatenation did actually happen – only thing is that we are unable to see the LastName part of the string because the system encounters the string termination character, causing it to stop displaying more characters from the string.

In order to confirm our theory, we replaced the string termination character with a hyphen (-), using the REPLACE function.

UPDATE pt
SET pt.FullName = REPLACE(pt.FullName COLLATE Latin1_General_BIN, CHAR(0) COLLATE Latin1_General_BIN, '-')
FROM @personTable AS pt;

In order for this to work in all environments, we changed the collation to binary when performing the replace (because ultimately, string terminators are just a set of bits when performing binary manipulation).

Finding and Replacing the String Terminator in a string.

Finding and Replacing the String Terminator in a string.

As can be seen from the screenshot above, the REPLACE was successful, and we were able to see the entire string.

In Conclusion

  • When working with data created by legacy code, it is useful to understand how the code works. In this case, we realized that the string terminator was causing a problem and were able to overcome it – but it could have led to hours of troubleshooting (an option of re-creating data manually was also put on the table)
  • SQL Server, and T-SQL can be trusted when it comes to data manipulation. Almost always it’s the system or the human element that is missing something critical

Until we meet next time,
Be courteous. Drive responsibly.

Script for this post:

USE tempdb;
GO
--Creating the sample table
DECLARE @personTable TABLE (FirstName VARCHAR(50),
                            LastName  VARCHAR(50),
                            FullName  VARCHAR(100)
                           );

--Insert some test data
INSERT INTO @personTable (FirstName, LastName)
VALUES ('Nakul','Vachhrajani'),
       ('John' + CHAR(0),'Doe'),
       ('Jack','Smith'),
       ('FirstName' + CHAR(0),'LastName');

--Check out the data
SELECT pt.FirstName,
       pt.LastName
FROM @personTable AS pt;

--Perform the string concatenation
UPDATE pt
SET pt.FullName = pt.FirstName + ' ' + pt.LastName
FROM @personTable AS pt;


--Check out the data
SELECT pt.FirstName,
       pt.LastName,
       pt.FullName
FROM @personTable AS pt;

--Checking the length of the data
SELECT pt.FirstName,
       LEN(pt.FirstName) AS FirstNameLength,
       pt.LastName,
       LEN(pt.LastName) AS LastNameLength,
       pt.FullName,
       LEN(pt.FullName) AS FullNameLength
FROM @personTable AS pt;

--Confirming presence of string termination characters
SELECT pt.FirstName,
       LEN(pt.FirstName) AS FirstNameLength,
       CHARINDEX(CHAR(0),pt.FirstName,1) AS LocationOfStringTerminator,
       pt.LastName,
       LEN(pt.LastName) AS LastNameLength,
       pt.FullName,
       LEN(pt.FullName) AS FullNameLength,
       CHARINDEX(CHAR(0),pt.FullName,1) AS LocationOfStringTerminatorInFullName
FROM @personTable AS pt;

--Replace the string termination character with a hyphen
UPDATE pt
SET pt.FullName = REPLACE(pt.FullName COLLATE Latin1_General_BIN, CHAR(0) COLLATE Latin1_General_BIN, '-')
FROM @personTable AS pt;

--Confirming that string termination characters are no longer present
SELECT pt.FirstName,
       LEN(pt.FirstName) AS FirstNameLength,
       CHARINDEX(CHAR(0),pt.FirstName,1) AS LocationOfStringTerminator,
       pt.LastName,
       LEN(pt.LastName) AS LastNameLength,
       pt.FullName,
       LEN(pt.FullName) AS FullNameLength,
       CHARINDEX(CHAR(0),pt.FullName,1) AS LocationOfStringTerminatorInFullName
FROM @personTable AS pt;
GO

#0367 – SQL Server – Fetching connection session options using SESSIONPROPERTY


A couple of years ago, I wrote an article which demonstrated that the SET options defined for a connection influence query execution and query results. One of the questions that came up in the research was how to determine which SET options are being used for a given session. It was then when I learnt about the system function – SESSIONPROPERTY().

The SESSIONPROPERTY() function can be used to return the current session value of the following SET options:

  • ANSI_NULLS
  • ANSI_PADDING
  • ANSI_WARNINGS
  • ARITHABORT
  • CONCAT_NULL_YIELDS_NULL
  • NUMERIC_ROUNDABORT
  • QUOTED_IDENTIFIER

Here’s an example:

SELECT SESSIONPROPERTY('ANSI_NULLS') AS IsAnsiNullsOn,
       SESSIONPROPERTY('ANSI_PADDING') AS IsAnsiPaddingOn,
       SESSIONPROPERTY('ANSI_WARNINGS') AS IsAnsiWarningsOn,
       SESSIONPROPERTY('ARITHABORT') AS IsArithAbortOn,
       SESSIONPROPERTY('CONCAT_NULL_YIELDS_NULL') AS IsConcatNullYieldsNull,
       SESSIONPROPERTY('NUMERIC_ROUNDABORT') AS IsNumericRoundabortOn,
       SESSIONPROPERTY('QUOTED_IDENTIFIER') AS IsQuotedIdentifierOn;
GO
Output demonstrating the usage of SESSIONPROPERTY() system function

Output demonstrating the usage of SESSIONPROPERTY() system function

Please do keep in mind that the SET options take effect based on a combination of server-level, database-level, and user-specified options. Hence, manipulating these options needs to be done with care.

Further Reading

  • SQL Server Stored Procedures and SET options [Link]
  • SESSIONPROPERTY [MSDN Link]

Until we meet next time,
Be courteous. Drive responsibly.