3.4 SQL: SELECT
Retrieving Data is done through the SELECT statement. This is by far the most versatile, complex and important SQL query you will work with. It is very important to become as familiar as you can with the different functions, joins, clauses and techniques which can be employed within a SELECT statement, as this will add a lot of power to your application programming ability.
At a basic level, the most common form of a SELECT query is:
mysql> SELECT `field1`, `field2`, `field3` FROM `table_name` WHERE where_condition_1 AND where_condition_2;
You will see that the above query is made up of three parts:
- The field list: This is a comma separated list of the fields you wish to have returned. These can be fields from a table, functions or constants. We will talk more about these later.
- The table list: This is the table(s) from which the fields originate. If there are multiple tables being used, as in the case of a join, then they are to be comma separated.
- Selection conditions: This is an optional clause, allowing for the data to be filtered according to one or more conditions. This will be covered shortly.
If you wish to simply select all data, indiscriminately, from a table, the following query will suffice:
mysql> SELECT * FROM `table_name`;
To better illustrate the usage of the three main clauses of the SELECT statement, let’s examine the following bit of SQL:
mysql> SELECT `description`, `due_date` FROM `tasks` WHERE `user_id` = 1 AND `due_date` < NOW();
It this SELECT statement, we can clearly see the three clauses/sections as mentioned previously. First, we have a field list, in which we specify we only want the
due_date fields returned. Secondly, we have the table list, where we specify that the fields in the field list are to be retrieved from the
tasks table. Lastly, there are two WHERE conditions specified; the first is to only return rows from the
tasks table where the
user_id field is set to 1, and where the value of the
due_date field is set to any date that came before today’s date.
You would notice that today’s date is obtained dynamically through the use of the NOW() MySQL function. This is one of many functions in MySQL which we are able to make use of. Let’s take some time now to look at some of the more useful functions.
This function is used to retrieve the current time stamp in the format ‘YYYY-MM-DD HH:ii:ss’.
mysql> SELECT NOW(); Output: 2012-06-10 20:30:46
Concat takes multiple comma-separated parameters and concatenates them all together into a single string.
mysql> SELECT CONCAT(‘Hello’, ‘ ‘, ‘World’, ‘!’); Output: Hello World!
The substring function is used to return a portion of a particular field’s string value. The parameters it takes are firstly the string, as either a field or constant, secondly the starting position, and thirdly the length of the substring to return.
mysql> SELECT CONCAT(‘Hello World’, 1, 5); Output: Hello
The following functions belong to a class of functions called aggregate functions. These all have the characteristic of operating on values across multiple rows.
Often, the result of this is only a single row is returned. In some situations, this is what we are looking for. In other situations, we would want more than one record returned. In these situations, we need to specify a aggregation scheme, to inform MySQL as to how to group the records on which the aggregate function is operating. This is done by using the GROUP BY
field2 clause after the WHERE clause.
It is important to be careful to not confuse using aggregate functions in a non-aggregated way. If you find yourself selecting a number for fields as well as an aggregate function, just take a moment to consider if what you are doing really does make sense, and if you are wanting aggregated values.
This returns only unique values for a particular field.
mysql> SELECT DISTINCT(`status`) FROM `tasks`; Output: open resolved
To get the smallest value of a particular field, we use the MIN() function. It takes one parameter, which is the field for which you are wanting to get the smallest value.
myqsl> SELECT MIN(`due_date`) FROM `tasks`; Output: 2012-05-01
Similar to the MIN() function, the MAX() function takes in a field as a parameter and returns the largest value for that field in the table.
mysql> SELECT MAX(`due_date`) FROM `tasks`; Output: 2013-12-31
This function is used to return the number of rows that have been aggregated together.
myqsl> SELECT COUNT(*) FROM `tasks`; Output: 28
This function adds together all the values of a particular field which are aggregated together.
mysql> SELECT SUM(`amount`) FROM `sales`; Output: 2598.45
When designing our databases, it is rare that we would have all data we need in a single table. If this happens, you should take another look at the manner in which you have designed the tables to ensure that you have eliminated the potential for data anomalies. So, as we have our data spread across multiple tables, we often come across situations when we need to select fields from multiple tables. In such situations, we accomplish this through joins.
There are a number of different types of joins, such as inner, outer, self, left, right, and implicit. However, for our purposes at this level, we shall just cover the implicit join, as it is the most versatile join, and should be able to suite most of your needs.
The implicit join is conceptually quite simple. When performing a SELECT query using an implicit join, there are three simple steps we need to do over-and-above the normal writing of a SELECT query:
1) When listing fields, we need to prefix them with the table to which they belong.
mysql> SELECT `tasks`.`description`, `users`.`username`, `queues`.`name`
2) Any tables mentioned in the field listing needs to be referenced in the FROM clause.
mysql> FROM `tasks`, `users`, `queues`
3) In the WHERE clause, we need to add in the primary-foreign key relationships to connect the tables together. There needs to be at least (number_of_tables - 1) joining conditions in the WHERE clause.
mysql> WHERE `tasks`.`user_id` = `users`.`uid` AND `tasks`.`queue_id` = `queues`.`uid`
If we follow these three additional steps, we will successfully be able to select data from multiple tables. Here is an example of a complete SELECT query which performs an implicit join:
mysql> SELECT `tasks`.`description`, `users`.`username`, `queues`.`name` FROM `tasks`, `users`, `queues` WHERE `tasks`.`user_id` = `users`.`uid` AND `tasks`.`queue_id` = `queues`.`uid` AND `tasks`.`due_date` = NOW();
The above query will return the task description, associated user’s username, and name of the queue to which the task belongs for all tasks which are due today.
Although table joins are capable of solving most of our needs, there come times when we need to take the output of one query, and use it within another query. This is accomplised through Sub Queries.
Let’s consider the following example:
mysql> DELETE FROM `tasks` WHERE `uid` IN (SELECT `uid` FROM `archived_tasks`);
In the above example, we can see that the subquery is in parentheses and returns a list of all
uid values in the
archived_tasks table. This output is then used within the WHERE clause of the DELETE statement. The result is that all the rows in the
tasks table with
uid which exist in the
archived_tasks table will be deleted.
Often, we will need to provide a bridge between the the sub query and the outer query. This is achieved by aliasing the outer table name, and referencing that alias from the sub query’s WHERE clause.
mysql> SELECT i.`invoice_number`, i.`invoice_date`, (SELECT SUM(`line_total`) FROM `invoice_items` WHERE `invoice_id` = i.`uid`) as ‘total’ FROM `invoices` i;
In the above example, we are listing invoices along with the total for each invoice which is calculated from a
invoice_items table, which holds the line items for each invoice. The sub query calculates the totals from the line items and is joined to the outer invoice record through the table alias "i".
Below is an ERD for a generic library system.
1) Write the SELECT statements to return the number of books available, and the number of books out on loan.
2) Write an SQL SELECT query to produce the following report: