Home | About Nakul Vachhrajani | Archives
April 14, 2014 9:00 AM
I have often seen a recommendation that using DISTINCT instead of GROUP BY is better from a performance perspective. DISTINCT is definitely a more readable alternative, but I wanted to test for myself as to what the truth was.
To compare the performance of the two, I wrote the following simple queries.
The query below helps in identifying the number of records that SQL Server will need to process. On my machine, when executed without the GROUP BY or the DISTINCT clause, the query generates records in excess of 380,000.
USE AdventureWorks2012;
GO
SELECT COUNT(so1.object_id)
FROM sys.objects AS so1
CROSS JOIN sys.objects AS so2 ;
GO
The queries below fetch the same information – one uses the GROUP BY whereas the other query uses a DISTINCT.
USE AdventureWorks2012;
GO
–Clean the buffers and procedure cache
DBCC DROPCLEANBUFFERS;
DBCC FREEPROCCACHE;
GO–Evaluate GROUP BY
SELECT so1.object_id,
so1.name
FROM sys.objects AS so1
CROSS JOIN sys.objects AS so2
GROUP BY so1.object_id, so1.name ;
GO–Clean the buffers and procedure cache
DBCC DROPCLEANBUFFERS;
DBCC FREEPROCCACHE;
GO–Evaluate DISTINCT
SELECT DISTINCT
so1.object_id,
so1.name
FROM sys.objects AS so1
CROSS JOIN sys.objects AS so2 ;
GO
Before executing the two queries (one with the GROUP BY and the other with the DISTINCT), I had turned on the actual execution plan.
Upon studying the execution plan, the following observations can be made:
In conclusion,
DISTINCT is not faster than GROUP BY. In fact, Both DISTINCT and GROUP BY demonstrate equal performance impact.
When aggregations are not being used, DISTINCT is more readable when compared to GROUP BY.
Until we meet next time,
Be courteous. Drive responsibly.
Posted by nakulvachhrajani
Categories: #SQLServer, Blog, Imported from BeyondRelational
Tags: #SQLServer
Mobile Site | Full Site
Get a free blog at WordPress.com Theme: WordPress Mobile Edition by Alex King.