MariaDB Crash Course - Ben Forta [47]
Now look at the same query using a join:
Input
SELECT p1.prod_id, p1.prod_name
FROM products AS p1, products AS p2
WHERE p1.vend_id = p2.vend_id
AND p2.prod_id = 'DTNTR';
Output
+---------+----------------+
| prod_id | prod_name |
+---------+----------------+
| DTNTR | Detonator |
| FB | Bird seed |
| FC | Carrots |
| SAFE | Safe |
| SLING | Sling |
| TNT1 | TNT (1 stick) |
| TNT2 | TNT (5 sticks) |
+---------+----------------+
Analysis
The two tables needed in this query are actually the same table, and so the products table appears in the FROM clause twice. Although this is perfectly legal, any references to table products would be ambiguous because MariaDB could not know to which instance of the products table you are referring.
To resolve this problem, table aliases are used. The first occurrence of products has an alias of p1, and the second has an alias of p2. Now those aliases can be used as table names. The SELECT statement, for example, uses the p1 prefix to explicitly state the full name of the desired columns. If it did not, MariaDB would return an error because there are two columns named prod_id and prod_name. It cannot know which one you want (even though, in truth, they are one and the same). The WHERE clause first joins the tables (by matching vend_id in p1 to vend_id in p2), and then it filters the data by prod_id in the second table to return only the desired data.
* * *
Tip: Self Joins Instead of Subqueries
Self joins are often used to replace statements using subqueries that retrieve data from the same table as the outer statement. Although the end result is the same, sometimes these joins execute far more quickly than do subqueries. It is usually worth experimenting with both to determine which performs better.
* * *
Natural Joins
Whenever tables are joined, at least one column appears in more than one table (the columns being joined). Standard joins (the inner joins you learned about in the previous chapter) return all data, even multiple occurrences of the same column. A natural join simply eliminates those multiple occurrences so only one of each column is returned.
How does it do this? The answer is it doesn’t—you do it. A natural join is a join in which you select only columns that are unique. This is typically done using a wildcard (SELECT *) for one table and explicit subsets of the columns for all other tables. The following is an example:
Input
SELECT c.*, o.order_num, o.order_date,
oi.prod_id, oi.quantity, OI.item_price
FROM customers AS c, orders AS o, orderitems AS oi
WHERE c.cust_id = o.cust_id
AND oi.order_num = o.order_num
AND prod_id = 'FB';
Analysis
In this example, a wildcard is used for the first table only. All other columns are explicitly listed so no duplicate columns are retrieved.
The truth is, every inner join you have created thus far is actually a natural join, and you will probably never even need an inner join that is not a natural join.
Outer Joins
Most joins relate rows in one table with rows in another. But occasionally, you want to include rows that have no related rows. For example, you might use joins to accomplish the following tasks:
• Count how many orders each customer placed, including customers who have yet to place an order
• List all products with order quantities, including products not ordered by anyone
• Calculate average sale sizes, taking into account customers who have not yet placed an order
In each of these examples, the join includes table rows that have no associated rows in the related table. This type of join is called an outer join.
The following SELECT statement is a simple inner join. It retrieves a list of all customers and their orders:
Input
SELECT customers.cust_id, orders.order_num
FROM customers INNER JOIN orders
ON customers.cust_id = orders.cust_id;