SQL Server – SELECT DISTINCT Statement

  • by
sql-distinct-clause-use-on-multiple-columns

SQL Server – Using Select Distinct Statement

Using Distinct statement we can retrieve distinct values from a table. Distinct eliminates all the duplicate records for a column or columns and returns only distinct or unique records for them. Distinct statement can be used over a single column or multiple columns.

Example of using Distinct on a single column:

Select distinct CustomerId from orders

This query will return all the distinct customer ids from orders table.

Result:

sql-distinct-single-column-example

sql-distinct-single-column-example

Using SQL Distinct on multiple columns:

We can specify multiple column names with distinct in a select query and sql will return all the distinct records using the combination of column names specified in the query.

Example:

select distinct CustomerId,ShipPostalCode from orders order by ShipPostalCode asc

Result:

sql-distinct-clause-use-on-multiple-columns

sql-distinct-clause-use-on-multiple-columns

Here a combination CustomerId, ShipPostcode is considered for generating a distinct result set. That’s why, you can see ShipPostalCode data (05033, 1010) is repeated in the result set but CustomerId against all those records is unique, so a row is treated as a unique entry.

We can use order by clause with a distinct query as shown in the query.

Distinct with null values:

Distinct query considers a null value, so if a table contains a null value, distinct will treat null value as a distinct record and return it in the result set. This is a difference between count(column_name), count will eliminate null values while computing the result.

Distinct vs Group By:

Group by and Distinct query can both return same result set.

Group by query:

SELECT CustomerId, ShipVia, Freight FROM Orders GROUP BY
CustomerId, ShipVia, Freight ORDER BY
CustomerId, ShipVia, Freight

Distinct query:

Select distinct CustomerId,ShipVia,Freight from orders

Both of the these queries will return same result set when executed in SQL. However, when you try to use aggregate functions like count(column_name) in distinct query, it will return an error.

We cannot use aggregate functions with distinct queries. Like:

Select distinct CustomerId,ShipVia,Freight,count(CustomerId) from orders

Error returned: Column ‘orders.CustomerID’ is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

While group by query would produce a result set.

We use Group By query when we have to apply aggregate functions like count, avg, min, max in our query. Distinct query does not supports aggregate functions.

Also see:

Using SQL DISTINCT on multiple columns

This example uses NorthWind database.