#0305 – SQL Server – Myths – UNION and UNION ALL – need same name, number, sequence & data-type of columns

Recently, we were working on a stored procedure which involved combining similar result sets from various queries into a single result set with the help of the UNION operator. As I was reviewing the query written by one of the team members, I noticed that they had explicitly specified column aliases for all the result-sets being consumed in the UNION.

In order to test the team’s understanding about the fundamentals, I simply mentioned that the same column name across all result sets in the UNION was not required. The conversation that followed went something like this:

Sample query being reviewed:

USE tempdb;
GO

--Table variables to simulate the result sets 
--being combiend using UNION
DECLARE @productSet1 TABLE (ProductId INT 
                                      NOT NULL 
                                      IDENTITY(1,1),
                            ProductName VARCHAR(50) 
                                        NULL
                           );

DECLARE @productSet2 TABLE (Id INT 
                               NOT NULL 
                               IDENTITY(1,1),
                            Product VARCHAR(50) 
                                    NULL
                           );

--Insert some test data
INSERT INTO @productSet1 (ProductName)
VALUES ('Camshaft'),
       ('Connecting Rod'),
       ('Starter Motor');

--Insert some test data
INSERT INTO @productSet2 (Product)
VALUES ('Piston'),
       ('Distributor'),
       ('10A Fuse'),
       ('16A Fuse');

--Query that was written by the developers
SELECT ps1.ProductId AS ProductId,
       ps1.ProductName AS ProductName
FROM @productSet1 AS ps1
UNION
SELECT ps2.Id AS ProductId,
       ps2.Product AS ProductName
FROM @productSet2 AS ps2;
GO

Me: What has been done is ideal in the sense that ensuring the same column names in all the result sets makes the query much more readable. But, UNION does not need that column names are the same across all result sets.

Team: No? But then how will SQL Server combine the result sets if it does not know which column in the subsequent result sets to map with which column from the first result set?

Me: All SQL Server needs is that the number and order (i.e. sequence) of columns in the result sets being consumed by the UNION are the same and the data types are compatible. It is not required that the column names are the same.

Team: In that case, what will be the column names in the result set?

Me: That’s quite simple – the final result set follows the column names from the first result set.

Team: Hmm… can you show an example?

Me: Sure! Let’s manipulate the given script itself for the quick demo.

--Assumption: Table variables referenced by the query already exist.

SELECT ps1.ProductId AS ProductId,
       ps1.ProductName AS ProductName
FROM @productSet1 AS ps1
UNION
SELECT ps2.Id,
       ps2.Product
FROM @productSet2 AS ps2;
GO

If we run this statement, we will see that the UNION succeeds and the results follow the column names from the first result set:

Team: This is very interesting! So we can name the columns of result sets involved in a UNION to names of our choice.

Me: Yes. However, it may have an impact on the readability of the query. It’s better to have the name, number, sequence and data-types of the columns involved in a UNION same, but it’s not necessary to do so.

Team: Ok, now we get it. Thank-you, Nakul!

Me: You’re welcome!

Conclusion

Whenever two or more result sets are being combined using a UNION or UNION ALL, all one has to ensure is that:

The number & order of the columns being consumed in the UNION are the same across all queries
The data types of the columns across the queries must be compatible

It is NOT required (but is recommended from a readability perspective) to have the same column names across all queries.

Let me know what you think about this post by leaving your feedback here! Cancel reply

marc_jellinek@hotmail.com October 28, 2013 at 8:30 AM

Here’s another interesting item:

While the column names will come from the first resultset, it appears that the data type will come from the largest datatype for a given column.

DECLARE @table_1 TABLE ( [id_1] [int] )
DECLARE @table_2 TABLE ( [id_2] [bigint] )

INSERT INTO @table_1 ([id_1])
VALUES (1), (2), (3)

INSERT INTO @table_2 ([id_2])
VALUES (1), (2), (3), (10000000000)

SELECT *
INTO #tempresult
FROM ( SELECT [id_1] FROM @table_1
UNION ALL
SELECT [id_2] FROM @table_2
) [data]

SELECT [c].[name],
[t].[name]
FROM [tempdb].[sys].[columns] [c]
INNER JOIN [tempdb].[sys].[types] [t] ON [c].[user_type_id] = [t].[user_type_id]
WHERE [c].[object_id] = OBJECT_ID(‘tempdb..#tempresult’)

DROP TABLE #tempresult

Here you see that @table_2 has a value that is too large for an [int].
Querying the system view [tempdb].[sys].[columns] shows that the column [id_1] is of type [bigint]

LikeLike

Reply ↓
Nakul Vachhrajani October 28, 2013 at 11:20 AM

That’s really interesting! It does make sense for SQL Server to choose the largest data-size applicable to the column.

I did not know this. Thank-you very much for sharing.

LikeLike

Reply ↓

This site uses Akismet to reduce spam. Learn how your comment data is processed.

SQLTwins by Nakul Vachhrajani

SQL Server tips and experiences dedicated to my twin daughters.

#0305 – SQL Server – Myths – UNION and UNION ALL – need same name, number, sequence & data-type of columns

Conclusion

Further Reading

2 thoughts on “#0305 – SQL Server – Myths – UNION and UNION ALL – need same name, number, sequence & data-type of columns”

Let me know what you think about this post by leaving your feedback here! Cancel reply

Conclusion

Further Reading

Spread the word around! Share via:

Related

2 thoughts on “#0305 – SQL Server – Myths – UNION and UNION ALL – need same name, number, sequence & data-type of columns”

Let me know what you think about this post by leaving your feedback here! Cancel reply