Tag Archives: Best Practices

Articles related to performance and usability best practices in Microsoft SQL Server.

#0415 – SQL Server – Performance Tuning – Use STRING_AGG to generate comma separated strings


With more and more data being exchanged over APIs, generating comma-separated strings are becoming a much more common requirement.

A few years ago, I wrote about two different ways to generate comma-separated strings. The most common one I find to be in use when generating comma-separated values from a table is the intermediate conversion of XML. This however, is a very costly mechanism and can potentially take minutes for the query to run depending upon the amount of data involved.

SQL Server 2017 brings a new aggregate function that can be used to generate comma-separated values extremely fast. The function is STRING_AGG().

Here’s a sample of it’s usage:


 --WARNING: THIS SCRIPT IS PROVIDED AS-IS AND WITHOUT
-- WARRANTY.
-- FOR DEMONSTRATION PURPOSES ONLY
--Step 01: Generate Temp table to store source data
DECLARE @NamesTable TABLE ([Id] INT,
[Name] NVARCHAR(50)
);
--Step 02: Generate test data
INSERT INTO @NamesTable
VALUES (1, 'A'),
(2, 'D'),
(2, 'C'),
(3, 'E'),
(3, 'H'),
(3, 'G');
--Step 03: Using STRING_AGG to generate comma-separated strings
SELECT STRING_AGG(tbl.Name, ',') AS [CommaSeparatedString]
FROM @NamesTable AS tbl;
GO
/RESULTS**
CommaSeparatedString
A,D,C,E,H,G
*/

Advantages of STRING_AGG:

  • Can be used just like any other aggregate function in a query
  • Can work with any user supplied separator – doesn’t necessarily have to be a comma
  • No manual step required – Separators are not added at the end of the concatenated string
  • STRING_AGG() is significantly faster than using XML based methods
  • Can be used with any compatibility level as long as the version is SQL Server 2017 (or higher) and Azure SQL database

Here’s an example of how STRING_AGG can be used with any separator:

 --WARNING: THIS SCRIPT IS PROVIDED AS-IS AND WITHOUT
-- WARRANTY.
-- FOR DEMONSTRATION PURPOSES ONLY
--Step 01: Generate Temp table to store source data
DECLARE @NamesTable TABLE ([Id] INT,
[Name] NVARCHAR(50)
);
--Step 02: Generate test data
INSERT INTO @NamesTable
VALUES (1, 'A'),
(2, 'D'),
(2, 'C'),
(3, 'E'),
(3, 'H'),
(3, 'G');
--Step 03: Using STRING_AGG to generate comma-separated strings
SELECT STRING_AGG(tbl.Name, '-*-') AS [CustomSeparatorString]
FROM @NamesTable AS tbl;
GO
/RESULTS**
CustomSeparatorString
A--D--C--E--H--G /

A minor challenge

As with every new feature, there may be a small usability challenge with STRING_AGG. One cannot use keywords like DISTINCT to ensure that only distinct values are used for generating the comma-separated string. There is however a Azure feedback item open where you can exercise your vote if you feel this feature is useful.

Further Reading

  • Different ways to generate a comma-separated string from a table [Blog Link]
  • STRING_AGG() Aggregate Function [MSDN BOL]

Until we meet next time,

Be courteous. Drive responsibly.

Advertisements
Screenshots showing that objects have been given default constraint names by SQL Server in case a name was not supplied by the user

#0401 – SQL Server – Script to validate object naming convention


A few weeks ago, I ran into a question on one of the forums asking for a script that can help the team validate object naming conventions. Immediately, I was able to sympathize with the team.

What happens is that when developers use the graphical (GUI) tools in the SQL Server Management Studio (SSMS) or via a simple script, they often fail to specify a name to each individual constraint. These slips are not intentional – developers don’t often realize that each constraint is an independent object because they are ultimately related to  another user defined object (a table).

However, when a name is not explicitly specified for a particular constraint, what Microsoft SQL Server does is provide a name by combining the following:

  1. A standard prefix indicating the object (e.g. “DF” for default constraints)
  2. 9 characters of the object name
  3. 5 characters of the field name
  4. Finally, the unique Id of the object, represented in hexa-decimal format

While this format will always generate a unique value, it would generate names that may not be intuitive. It is therefore a common  practice to review the database code and review for compliance with naming conventions  that have been defined in the product/project.

This logic can be leveraged during code reviews/audits to identify objects where standard project naming conventions are not met.

To demonstrate the functionality of the script, I create one table with a wide range of constraints – none of which have a name specified.

USE [tempdb];
GO
IF OBJECT_ID('dbo.ConstraintsWithoutNames','U') IS NOT NULL
BEGIN
    DROP TABLE dbo.ConstraintsWithoutNames;
END
GO

CREATE TABLE dbo.ConstraintsWithoutNames 
    ([RecordId]     INT          NOT NULL IDENTITY(1,1) 
                                 PRIMARY KEY CLUSTERED,
     [RecordName]   VARCHAR(255)     NULL,
     [RecordStatus] TINYINT      NOT NULL DEFAULT (0) 
                    CHECK ([RecordStatus] IN (0, 2, 4, 8))
    );
GO

Now, the following script is a simple string search that looks for strings ending with the hexa-decimal representation of the parent object.

USE [tempdb];
GO
SELECT * 
FROM [sys].[objects] AS [so]
WHERE [so].[is_ms_shipped] = 0 --Considering user objects only
  AND [so].[name] LIKE ('%' + REPLACE(CONVERT(NVARCHAR(255),CAST([so].[object_id] AS VARBINARY(MAX)),1),'0x',''))
                        --Only those objects whose names end with the hexadecimal
                        --representation of their object Id

Screenshots showing that objects have been given default constraint names by SQL Server in case a name was not supplied by the user

Objects given default constraint names

I  hope you found this script useful. Please do  share your ideas/scripts that you may be using in your day-to-day activities.

Until we meet next time,

Be courteous. Drive responsibly.