#0305 – SQL Server – Myths – UNION and UNION ALL – need same name, number, sequence & data-type of columns


Recently, we were working on a stored procedure which involved combining similar result sets from various queries into a single result set with the help of the UNION operator. As I was reviewing the query written by one of the team members, I noticed that they had explicitly specified column aliases for all the result-sets being consumed in the UNION.


In order to test the team’s understanding about the fundamentals, I simply mentioned that the same column name across all result sets in the UNION was not required. The conversation that followed went something like this:


Sample query being reviewed:

USE tempdb;
GO

--Table variables to simulate the result sets 
--being combiend using UNION
DECLARE @productSet1 TABLE (ProductId INT 
                                      NOT NULL 
                                      IDENTITY(1,1),
                            ProductName VARCHAR(50) 
                                        NULL
                           );

DECLARE @productSet2 TABLE (Id INT 
                               NOT NULL 
                               IDENTITY(1,1),
                            Product VARCHAR(50) 
                                    NULL
                           );

--Insert some test data
INSERT INTO @productSet1 (ProductName)
VALUES ('Camshaft'),
       ('Connecting Rod'),
       ('Starter Motor');

--Insert some test data
INSERT INTO @productSet2 (Product)
VALUES ('Piston'),
       ('Distributor'),
       ('10A Fuse'),
       ('16A Fuse');

--Query that was written by the developers
SELECT ps1.ProductId AS ProductId,
       ps1.ProductName AS ProductName
FROM @productSet1 AS ps1
UNION
SELECT ps2.Id AS ProductId,
       ps2.Product AS ProductName
FROM @productSet2 AS ps2;
GO

Me: What has been done is ideal in the sense that ensuring the same column names in all the result sets makes the query much more readable. But, UNION does not need that column names are the same across all result sets.


Team: No? But then how will SQL Server combine the result sets if it does not know which column in the subsequent result sets to map with which column from the first result set?


Me: All SQL Server needs is that the number and order (i.e. sequence) of columns in the result sets being consumed by the UNION are the same and the data types are compatible. It is not required that the column names are the same.


Team: In that case, what will be the column names in the result set?


Me: That’s quite simple – the final result set follows the column names from the first result set.


Team: Hmm… can you show an example?


Me: Sure! Let’s manipulate the given script itself for the quick demo.

--Assumption: Table variables referenced by the query already exist.

SELECT ps1.ProductId AS ProductId,
       ps1.ProductName AS ProductName
FROM @productSet1 AS ps1
UNION
SELECT ps2.Id,
       ps2.Product
FROM @productSet2 AS ps2;
GO

If we run this statement, we will see that the  UNION succeeds and the results follow the column names from the first result set:


image


Team: This is very interesting! So we can name the columns of result sets involved in a UNION to names of our choice.


Me: Yes. However, it may have an impact on the readability of the query. It’s better to have the name, number, sequence and data-types of the columns involved in a UNION same, but it’s not necessary to do so.


Team: Ok, now we get it. Thank-you, Nakul!


Me: You’re welcome!


Conclusion


Whenever two or more result sets are being combined using a UNION or UNION ALL, all one has to ensure is that:



  • The number & order of the columns being consumed in the UNION are the same across all queries
  • The data types of the columns across the queries must be compatible

It is NOT required (but is recommended from a readability perspective) to have the same column names across all queries.


Further Reading



  • UNION operator [Books On Line, Link]

Until we meet next time,


Be courteous. Drive responsibly.

2 thoughts on “#0305 – SQL Server – Myths – UNION and UNION ALL – need same name, number, sequence & data-type of columns

  1. marc_jellinek@hotmail.com

    Here’s another interesting item:

    While the column names will come from the first resultset, it appears that the data type will come from the largest datatype for a given column.

    DECLARE @table_1 TABLE ( [id_1] [int] )
    DECLARE @table_2 TABLE ( [id_2] [bigint] )

    INSERT INTO @table_1 ([id_1])
    VALUES (1), (2), (3)

    INSERT INTO @table_2 ([id_2])
    VALUES (1), (2), (3), (10000000000)

    SELECT *
    INTO #tempresult
    FROM ( SELECT [id_1] FROM @table_1
    UNION ALL
    SELECT [id_2] FROM @table_2
    ) [data]

    SELECT [c].[name],
    [t].[name]
    FROM [tempdb].[sys].[columns] [c]
    INNER JOIN [tempdb].[sys].[types] [t] ON [c].[user_type_id] = [t].[user_type_id]
    WHERE [c].[object_id] = OBJECT_ID(‘tempdb..#tempresult’)

    DROP TABLE #tempresult

    Here you see that @table_2 has a value that is too large for an [int].
    Querying the system view [tempdb].[sys].[columns] shows that the column [id_1] is of type [bigint]

    Like

    Reply
  2. Nakul Vachhrajani

    That’s really interesting! It does make sense for SQL Server to choose the largest data-size applicable to the column.

    I did not know this. Thank-you very much for sharing.

    Like

    Reply

Let me know what you think about this post by leaving your feedback here!

This site uses Akismet to reduce spam. Learn how your comment data is processed.