SQL Server – Using Select Distinct Statement
Using Distinct statement we can retrieve distinct values from a table. Distinct eliminates all the duplicate records for a column or columns and returns only distinct or unique records for them. Distinct statement can be used over a single column or multiple columns.
Example of using Distinct on a single column:
Select distinct CustomerId from orders
This query will return all the distinct customer ids from orders table.
Result:
Using SQL Distinct on multiple columns:
We can specify multiple column names with distinct in a select query and sql will return all the distinct records using the combination of column names specified in the query.
Example:
select distinct CustomerId,ShipPostalCode from orders order by ShipPostalCode asc
Result:
Here a combination CustomerId, ShipPostcode
is considered for generating a distinct result set. That’s why, you can see ShipPostalCode
data (05033, 1010
) is repeated in the result set but CustomerId
against all those records is unique, so a row is treated as a unique entry.
We can use order by clause with a distinct query as shown in the query.
Distinct with null values:
Distinct query considers a null value, so if a table contains a null value, distinct will treat null value as a distinct record and return it in the result set. This is a difference between count(column_name)
, count will eliminate null values while computing the result.
Distinct vs Group By:
Group by and Distinct query can both return same result set.
Group by query:
SELECT CustomerId, ShipVia, Freight FROM Orders GROUP BY CustomerId, ShipVia, Freight ORDER BY CustomerId, ShipVia, Freight
Distinct query:
Select distinct CustomerId,ShipVia,Freight from orders
Both of the these queries will return same result set when executed in SQL. However, when you try to use aggregate functions like count(column_name) in distinct query, it will return an error.
We cannot use aggregate functions with distinct queries. Like:
Select distinct CustomerId,ShipVia,Freight,count(CustomerId) from orders
Error returned: Column ‘orders.CustomerID’ is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
While group by query would produce a result set.
We use Group By query when we have to apply aggregate functions like count, avg, min, max in our query. Distinct query does not supports aggregate functions.
Also see:
Using SQL DISTINCT on multiple columns
This example uses NorthWind database.