SQL: Understanding Group By Clause, Distinct Keyword, and Sorting Results

The GROUP BY clause is used in SQL to group the results of a SELECT statement based on one or more columns. It works in conjunction with aggregate functions, such as SUM or AVG, to calculate the aggregate value for each group. The HAVING clause is used with the GROUP BY clause to filter the results of a query based on aggregate values. The DISTINCT keyword is used in the SELECT statement to eliminate duplicate rows from the result set. It returns only unique rows for the selected columns. Sorting the results based on one or more columns can be achieved using the ORDER BY clause, which can sort the results in ascending or descending order. Understanding how to use these clauses is crucial when querying data in SQL and can help you to extract the information you need from a database efficiently.

GROUP BY Clause

The GROUP BY clause in SQL is used in conjunction with aggregate functions (such as SUM, COUNT, AVG, etc.) to group the results of a SELECT statement based on one or more columns. The basic syntax for using the GROUP BY clause is:

SELECT column1, aggregate_function(column2), …
FROM table_name
GROUP BY column1, column2, …;

For example, to select the total salary of employees group by department:

SQL
SELECT department, SUM(salary)
FROM employees
GROUP BY department;

This query will group the employees by department and calculate the total salary for each department.

You can also use the HAVING clause to filter the results of a query that includes a GROUP BY clause based on aggregate values. The basic syntax for using the HAVING clause is:-

SELECT column1, aggregate_function(column2), …
FROM table_name
GROUP BY column1, column2, …
HAVING aggregate_function(column2) condition;

For example, to select the total salary of employees grouped by department having a total salary greater than 100000:

SQL
SELECT department, SUM(salary)
FROM employees
GROUP BY department
HAVING SUM(salary) > 100000;

This query will group the employees by department, calculate the total salary for each department and retrieve only the department having a total salary greater than 100000.

DISTINCT Keyword

The DISTINCT keyword in SQL is used in the SELECT statement to eliminate duplicate rows from the result set. The basic syntax for using the DISTINCT keyword is:

SELECT DISTINCT column1, column2, …
FROM table_name;

For example, to select all unique “last_name” values from the “employees” table:

SQL
SELECT DISTINCT last_name FROM employees;

This query will return all the unique last_name values from the employees’ table, eliminating any duplicate values.

You can also use the DISTINCT keyword with multiple columns. For example, to select all unique combinations of “first_name” and “last_name” values from the “employees” table:

SQL
SELECT DISTINCT first_name, last_name FROM employees;

This query will return all the unique combinations of first_name and last_name from the employee’s table, eliminating any duplicate combinations.

It’s important to note that the DISTINCT keyword only applies to the selected columns, so if you have multiple rows with identical values in the selected columns but different values in other columns, all of those rows will be considered as duplicate and only one will be returned.

Sorting Results

In SQL, the ORDER BY clause is used to sort the results of a SELECT, UPDATE, or DELETE statement based on one or more columns. The basic syntax for using the ORDER BY clause is:

SELECT column1, column2, …
FROM table_name
ORDER BY column1 [ASC | DESC], column2 [ASC | DESC], …;

For example, to select all columns and rows from the “employees” table and sort the results by the “last_name” in ascending order:

SQL
SELECT * FROM employees
ORDER BY last_name;

The default sorting order is ascending (ASC), but you can also sort the results in descending order (DESC) by specifying the keyword after the column name.

SQL
SELECT * FROM employees
ORDER BY last_name DESC;

You can also sort the results by multiple columns. For example, to select all columns and rows from the “employees” table and sort the results first by the “last_name” in ascending order and then by the “first_name” in descending order:

SQL
SELECT * FROM employees
ORDER BY last_name, first_name DESC;

You can also use expressions or functions in the ORDER BY clause. For example, to select all columns and rows from the “employees” table and sort the results by the length of “last_name” in descending order:

SQL
SELECT * FROM employees
ORDER BY LENGTH(last_name) DESC;

It’s essential to consult the official documentation of the SQL implementation you are using for specific information on the syntax and options that are available for using the ORDER BY clause in conjunction with the SELECT, UPDATE, or DELETE statement.

Sorting the results based on a column is useful when you want to see the data in a certain order or when you want to get the top or bottom values of a column.

In conclusion, the GROUP BY clause, DISTINCT keyword, and ORDER BY clause are essential SQL commands that help organize and refine data in a database. The GROUP BY clause is used in conjunction with aggregate functions to group the results of a SELECT statement based on one or more columns, allowing for more efficient data analysis. The DISTINCT keyword is used to eliminate duplicate rows from a result set, while the ORDER BY clause is used to sort the results of a SELECT, UPDATE, or DELETE statement based on one or more columns. These commands allow for more precise and efficient data retrieval and analysis, making them critical tools for anyone working with SQL databases.

Share The Tutorial With Your Friends
Twiter
Facebook
LinkedIn
Email
WhatsApp
Skype
Reddit

Check Our Ebook for This Online Course

Advanced topics are covered in this ebook with many practical examples.