MariaDB Crash Course - Ben Forta [41]
WHERE orders.cust_id = customers.cust_id
* * *
New Term: Correlated subquery
A subquery that refers to the outer query.
* * *
The type of subquery is called a correlated subquery. This syntax—the table name and the column name separated by a period—must be used whenever there is possible ambiguity about column names. Why? Well, let’s look at what happens if fully qualified column names are not used:
Input
SELECT cust_name,
cust_state,
(SELECT COUNT(*)
FROM orders
WHERE cust_id = cust_id) AS orders
FROM customers
ORDER BY cust_name;
Output
+----------------+------------+--------+
| cust_name | cust_state | orders |
+----------------+------------+--------+
| Coyote Inc. | MI | 5 |
| E Fudd | IL | 5 |
| Mouse House | OH | 5 |
| Wascals | IN | 5 |
| Yosemite Place | AZ | 5 |
+----------------+------------+--------+
Analysis
Obviously the returned results are incorrect (compare them to the previous results), but why did this happen? There are two cust_id columns, one in customers and one in orders, and those two columns need to be compared to correctly match orders with their appropriate customers. Without fully qualifying the column names, MariaDB assumes you are comparing the cust_id in the orders table to itself. And
SELECT COUNT(*) FROM orders WHERE cust_id = cust_id;
always returns the total number of orders in the orders table (because MariaDB checks to see that every order’s cust_id matches itself, which it always does, of course).
Analysis
Although subqueries are useful in constructing this type of SELECT statement, care must be taken to properly qualify ambiguous column names.
* * *
Note: Always More Than One Solution
As explained earlier in this chapter, although the sample code shown here works, it is often not the most efficient way to perform this type of data retrieval. You revisit this example in a later chapter.
* * *
* * *
Tip: Build Queries with Subqueries Incrementally
Testing and debugging queries with subqueries can be tricky, particularly as these statements grow in complexity. The safest way to build (and test) queries with subqueries is to do so incrementally, in much the same way as MariaDB processes them. Build and test the innermost query first. Then build and test the outer query with hard-coded data, and only after you have verified that it is working embed the subquery. Then test it again. And keep repeating these steps as for each additional query. This takes just a little longer to construct your queries, but doing so saves you a lot of time later (when you try to figure out why queries are not working) and significantly increases the likelihood of them working the first time.
* * *
Summary
In this chapter, you learned what subqueries are and how to use them. The most common uses for subqueries are in WHERE clauses, in IN operators, and for populating calculated columns. You saw examples of both of these types of operations.
15. Joining Tables
In this chapter, you learn what joins are, why they are used, and how to create SELECT statements using them.
Understanding Joins
One of SQL’s most powerful features is the capability to join tables on-the-fly within data retrieval queries. Joins are one of the most important operations you can perform using SQL SELECT, and a good understanding of joins and join syntax is an extremely important part of learning SQL.
Before you can effectively use joins, you must understand relational tables and the basics of relational database design. What follows is by no means a complete coverage of the subject, but it should be enough to get you up and running.
Understanding Relational Tables
The best way to understand relational tables is to look at a real-world example.
Suppose you had a database table containing