DB2 Join

DB2 Join: An INNER JOIN finds and returns matching data from tables, while an outer join finds and returns matching data and some dissimilar data from tables. An inner join finds and returns matching data from tables, while an outer join finds and returns matching data and some dissimilar data from tables. For combining two sets of columns with/without removing duplicates refer to UNION vs UNION ALL use in SQL Queries. For returning the first non-null value in a list of the input expressions use COALESCE function in Join Queries.

DB2 Join : Inner join

Combines each row of the left table with each row of the right table, keeping only the rows in which the join condition is true.

DB2 Join : Outer join

Includes the rows that are produced by the inner join, plus the missing rows, depending on the type of outer join:

DB2 Join : Left outer join

Includes the rows from the left table that were missing from the inner join.

DB2 Join : Right outer join

Includes the rows from the right table that were missing from the inner join.

DB2 Join : Full outer join

Includes the rows from both tables that were missing from the inner join.

Cartesian product

When two or more tables are referenced in the FROM clause of a query, the database server joins the tables. If neither the FROM clause nor the WHERE clause specifies a predicate for the join, the server computes a Cartesian product that contains m * n rows, where m is the number of rows in the first table and n is the number of rows in the second table. This product is the set of all possible combinations formed by concatenating a row from the first table with a row from the second table.

  Table-(T1)            Table-(T2)         
 ID    NAME            ID    TITLE         
 10     Sandy          20     Sales Mgr      
 20     Sam            30     Clerk          
 30     Cindy          30     Manager        
                       40     Sales Rep    
                       50     Manager
Example of Standard Join
 SELECT *                        Result
 FROM T1,                ID   NAME    ID  TITLE      
 WHERE T1.ID = T2.ID     20   Sam     20  Sales Mgr
 ORDER BY T1.ID          30   Cindy   30  Clerk
        , T2.TITLE;      30   Cindy   30  Manager

Example of Inner Join
 SELECT *                        Result         
 FROM T1               ID   NAME     ID     TITLE     
 INNER JOIN T2         20   Sam      20     Sales Mgr
 ON T1.ID = T2.ID      30   Cindy    30     Clerk
 ORDER BY T1.ID        30   Cindy    30     Manager
        , T2.TITLE;

ON and WHERE Usage

In an inner join only, an ON and a WHERE check work much the same way. Both define the nature of the join, and because in an inner join, only matching rows are returned, both act to exclude all rows that do not match the join.

Inner join, using ON check

Below is an inner join that uses an ON check to exclude “Manager”:

Inner join, using ON check
 Below is an inner join that uses an ON check to exclude “Manager”:
 SELECT *                     Result  
 FROM T1                ID   NAME     ID     TITLE  
 INNER JOIN T2          20    Sam     20     Sales Mgr
 ON T1.ID = T2.ID       30    Cindy   30     Clerk
 AND T2.TITLE <> ’Manager’
 ORDER BY T1.ID
        , T2.TITLE;
 
Inner join, using WHERE check
 Here is the same query written using a WHERE clause
 SELECT *                  Result           
 FROM T1              ID   NAME     ID     TITLE
 INNER JOIN T2        20    Sam     20     Sales Mgr
 ON T1.ID = T2.ID     30    Cindy   30     Clerk
 WHERE T2.TITLE <> ’Manager’
 ORDER BY T1.ID
        , T2.TITLE;

Left Outer Join

A left outer join is the same as saying that I want all of the rows in the first table listed, plus any matching rows in the second table:

Table-(T1)       Table-(T2)                  Result
ID   NAM     ID    TITLE      ID   NAME    ID   TITLE     
10   Sandy   20    Sales Mgr  10   Sandy   --   --------- 
20   Sam     30    Clerk      20   Sam     20   Sales Mgr 
30   Cindy   30    Manager    30   Cindy   30   Clerk
             40     Sales Rep 30   Cindy   30   Manager
             50     Manager

Example of Left Outer Join
 SELECT *                                      
 FROM T1
 LEFT OUTER JOIN T2
 ON T1.ID = T2.ID
 ORDER BY T1.ID
        , T2.TITLE;

ON and WHERE Usage

In a partial outer join (i.e. left or right), an ON check works differently, depending on what table (field) it refers to:

  • If it refers to a field in the table being joined to, it determines whether the related row matches the join or not.
  • If it refers to a field in the table being joined from, it determines whether the related row finds a match or not. Regardless, the row will be returned.

In the next example, those rows in the table being joined to (i.e. the T2 view) that match on ID, and that is not for a manager are joined to:

ON check on table being joined to
 SELECT *                        Result
 FROM T1                   ID  NAME    ID   TITLE  
 LEFT OUTER JOIN T2        10  Sandy   --   --------- 
 ON T1.ID = T2.ID          20  Sam     20   Sales Mgr  
 AND T2.TITLE <> ’Manager’ 30  Cindy   30   Clerk
 ORDER BY T1.ID
        , T2.TITLE;

WHERE check on table being joined to

If we rewrite the above query using a WHERE check we will lose a row (of output) because the check is applied after the join is done, and a null JOB does not match:

SELECT *                                  Result
 FROM T1                      ID  NAME   ID   TITLE   
 LEFT OUTER JOIN T2           20  Sam    20   Sales Mgr 
 ON T1.ID = T2.ID             30  Cindy  30   Clerk
 WHERE T2.TITLE <> ’Manager’  
 ORDER BY T1.ID
        , T2.TITLE;
We could make the WHERE equivalent to the ON, if we also checked for nulls:
 SELECT *                                  Result
 FROM T1                      ID   NAME    ID   TITLE  
 LEFT OUTER JOIN T2           10   Sandy   --   ---------
 ON T1.ID = T2.ID             20   Sam     20   Sales Mgr
 WHERE (T2.TITLE <> ’Manager’ 30   Cindy   30   Clerk
    OR T2.TITLE IS NULL)
 ORDER BY T1.ID
        , T2.TITLE;

In the next example, those rows in the table being joined from (i.e. the T1 view) that match on ID and have a NAME > ’D’ participate in the join. Note however that T1 rows that do not participate in the join (i.e. ID = 30) are still returned:

SELECT *                             Result
 FROM T1                  ID   NAME    ID   TITLE 
 LEFT OUTER JOIN T2       10   Sandy   --   --------- 
 ON T1.ID = T2.ID         20   Sam     20   Sales Mgr
 AND T2.NAME > ‘D’        30   Cindy   --   ---------
 ORDER BY T1.ID
        , T2.TITLE;

If we rewrite the above query using a WHERE check (on NAME) we will lose a row because now the check excludes rows from the answer-set, rather than from participating in the join:

SELECT *                        Result
 FROM T1               ID   NAME     ID     TITLE 
 LEFT OUTER JOIN T2    10   Sandy    --     --------- 
 ON T1.ID = T2.ID      20   Sam      20     Sales Mgr
 WHERE T2.NAME > ‘D’          
 ORDER BY T1.ID
        , T2.TITLE;

Unlike in the previous example, there is no way to alter the above WHERE check to make it logically equivalent to the prior ON check. The ON and the WHERE are applied at different times and for different purposes and thus do completely different things.

Right Outer Join

A right outer join is the inverse of a left outer join. One gets every row in the second table listed, plus any matching rows in the first table:

Table-(T1)       Table-(T2)                  Result
 ID   NAME   ID     TITLE     ID   NAME   ID   TITLE    
 10   Sandy  20     Sales Mg  20   Sam    20   Sales Mgr 
 20   Sam    30     Clerk     30   Cindy  30   Clerk
 30   Cindy  30     Manager   30   Cindy  30   Manager
             40     Sales Rep --   ----   40   Sales Rep
             50     Manager   --   -----  50   Manager
 
Example of Right Outer Join
 SELECT *                         Result
 FROM T1                ID   NAME   ID    TITLE          
 RIGHT OUTER JOIN T2    20   Sam    20    Sales Mgr 
 ON T1.ID = T2.ID       30   Cindy  30    Clerk
 ORDER BY T1.ID         30   Cindy  30    Manager
        , T2.TITLE;     --   -----  40    Sales Rep   
                        --   -----  50    Manager        

ON and WHERE Usage

The rules for ON and WHERE usage are the same in a right outer join as they are for a left outer, except that the relevant tables are reversed.

Full Outer Joins

A full outer join occurs when all of the matching rows in two tables are joined, and there is also returned one copy of each non-matching row in both tables.

   Table-(T1)    Table-(T2)           Result
 ID   NAME    ID  TITLE     ID   NAME   ID   TITLE     
 10   Sandy   20  Sales Mgr 10   Sandy  --   ---------  
 20   Sam     30  Clerk     20   Sam    20   Sales Mgr  
 30   Cindy   30  Manager   30   Cindy  30   Clerk
              40  Sales Rep 30   Cindy  30   Manager
              50  Manager   --   -----  40   Sales Rep   
                            --   -----  50   Manager  

Example of Full Outer Join
 SELECT *                        Result
 FROM T1                ID   NAME     ID     TITLE  
 FULL OUTER JOIN T2     10    Sandy     --   ---------
 ON T1.ID = T2.ID       20    Sam       20   Sales Mgr
 ORDER BY T1.ID         30    Cindy     30   Clerk
        , T2.ID         30    Cindy     30   Manager
        , T2.TITLE;     --    -----     40   Sales Rep 
                        --    -----     50   Manager 

ON and WHERE Usage

In a full outer join, an ON check is quite unlike a WHERE check-in in that it never results in a row being excluded from the answer set. All it does is categorize the input row as being either matching or non-matching. For example, in the following full outer join, the ON check joins those rows with equal key values:

Full Outer Join, match on keys
 SELECT *                     Result
 FROM T1               ID   NAME     ID     TITLE  
 FULL OUTER JOIN T2    10   Sandy    --     --------- 
 ON T1.ID = T2.ID      20   Sam      20     Sales Mgr 
 ORDER BY T1.ID        30   Cindy    30     Clerk
        , T2.ID        30   Cindy    30     Manager
        , T2.TITLE;    --   -----    40     Sales Rep 
                       --   -----    50     Manager  

Full Outer Join, match on keys > 20

In the next example, we have deemed that only those IDs that match, and that also have a value greater than 20, are a true match:  

SELECT *                         Result
 FROM T1               ID   NAME       ID     TITLE  
 FULL OUTER JOIN T2    10    Sandy     --     --------- 
 ON T1.ID = T2.ID      20    Sam       --     ---------
 AND T1.ID > 20        30    Cindy     30     Clerk
 ORDER BY T1.ID        30    Cindy     30     Manager
        , T2.ID        --    -----     20     Sales Mgr 
        , T2.TITLE;    --    -----     40     Sales Rep  
                       --    -----     50     Manager

Observe how in the above statement we added a predicate, and we got more rows! This is because in an outer join an ON predicate never removes rows. It simply categorizes them as being either matching or non-matching. If they match, it joins them. If they don’t, it passes them through. In the next example, nothing matches. Consequently, every row is returned individually. This query is logically similar to doing a UNION ALL on the two views:

Full Outer Join, match on keys (no rows match)
 SELECT *                     Result
 FROM T1              ID   NAME       ID     TITLE  
 FULL OUTER JOIN T2   10    Sandy     --     --------- 
 ON T1.ID = T2.ID     20    Sam       --     --------- 
 AND +1 > -1          30    Cindy     --     --------- 
 ORDER BY T1.ID       --    -----     20     Sales Mgr 
        , T2.ID       --    -----     30     Clerk
        , T2.TITLE;   --    -----     30     Manager
                      --    -----     40     Sales Rep
                      --    -----     50     Manager  

ON checks are somewhat like WHERE checks in that they have two purposes. Within a table, they are used to categorize rows as being either matching or non-matching. Between tables, they are used to define the fields that are to be joined on.

In the prior example, the first ON check defined the fields to join on, while the second join identified those fields that matched the join. Because nothing matched (due to the second predicate), everything fell into the “outer join” category. This means that we can remove the first ON check without altering the answer set:

Full Outer Join, don’t match on keys (no rows match)
 SELECT *                      Result
 FROM T1               ID   NAME       ID     TITLE 
 FULL OUTER JOIN T2    10    Sandy     --     --------- 
 ON T1.ID = T2.ID      20    Sam       --     --------- 
 AND +1 > -1           30    Cindy     --     --------- 
 ORDER BY T1.ID        --    -----     20     Sales Mgr
        , T2.ID        --    -----     30     Clerk
        , T2.TITLE;    --    -----     30     Manager
                       --    -----     40     Sales Rep
                       --    -----     50     Manager 

What happens if everything matches and we don’t identify the join fields? The result in a Cartesian Product:

Full Outer Join, don’t match on keys (all rows match)
 SELECT *                        Result
 FROM T1               ID   NAME       ID     TITLE  
 FULL OUTER JOIN T2    10    Sandy     20     Sales Mgr
 ON T1.ID = T2.ID      10    Sandy     30     Clerk
 AND +1 <> -1          10    Sandy     30     Manager
 ORDER BY T1.ID        10    Sandy     40     Sales Rep 
        , T2.ID        10    Sandy     50     Manager    
        , T2.TITLE;    20    Sam       20     Sales Mgr 
                       20    Sam       30     Clerk
                       20    Sam       30     Manager
                       20    Sam       40     Sales Rep  
                       20    Sam       50     Manager 
                       30    Cindy     20     Sales Mgr
                       30    Cindy     30     Clerk
                       30    Cindy     30     Manager
                       30    Cindy     40     Sales Rep
                       30    Cindy     50     Manager      

In an outer join, WHERE predicates behave as if they were written for an inner join. In particular, they always do the following:

  • WHERE predicates defining join fields enforce an inner join on those fields.
  • WHERE predicates on non-join fields are applied after the join, which means that when they are used on not-null fields, they negate the outer join.

Here is an example of a WHERE join predicate turning an outer join into an inner join:

Full Outer Join, turned into an inner join by WHERE
 SELECT *                       Result
 FROM T1             ID    NAME      ID     TITLE  
 FULL JOIN T2        20    Sam       20     Sales Mgr  
 ON T1.ID = T2.ID    30    Cindy     30     Clerk
 WHERE T1.ID = T2.ID 30    Cindy     30     Manager  
 ORDER BY T1.ID                
        , T2.ID
        , T2.TITLE;       

To illustrate some of the complications that WHERE checks can cause, imagine that we want to do a FULL OUTER JOIN on our two test views (see below), limiting the answer to those rows where the “T1 ID” field is less than 30. There are several ways to express this query, each giving a different answer:   

In our first example, the “T1.ID < 30” predicate is applied after the join, which effectively eliminates all “T2” rows that don’t match (because their “T1.ID” value is null):

Outer join T1.ID < 30, check applied in WHERE (after join)
 SELECT *                      Result
 FROM T1             ID   NAME       ID     TITLE 
 FULL JOIN T2        10    Sandy     --     --------- 
 ON T1.ID = T2.ID    20    Sam       20     Sales Mgr 
 WHERE T1.ID <30               
 ORDER BY T1.ID                
        , T2.ID
        , T2.TITLE;     

In the next example the “T1.ID < 30” check is done during the outer join where it does not any eliminate rows but rather limits those that match in the two views:

Outer join T1.ID < 30, check applied in ON (during join)
 SELECT *                        Result
 FROM T1             ID   NAME       ID     TITLE 
 FULL JOIN T2        10    Sandy     --     --------- 
 ON T1.ID = T2.ID    20    Sam       20     Sales Mgr 
 AND T1.ID < 30      30    Cindy     --     --------- 
 ORDER BY T1.ID      --    -----     30     Clerk
        , T2.ID      --    -----     30     Manager
        , T2.TITLE;  --    -----     40     Sales Rep  
                     --    -----     50     Manager 

Imagine that what really wanted to have the “T1.ID < 30” check only apply to those rows in the “T1” table. Then one has to apply the check before the join, which requires the use of a nested-table expression:

Outer join T1.ID < 30, check applied in WHERE (before join)
 SELECT *                      Result
 FROM (SELECT *             ID   NAME     ID   TITLE    
       FROM T1              10    Sandy   --   ---------
       WHERE ID < 30) AS T1 20    Sam     20   Sales Mgr  
 FULL OUTER JOIN T2         --    -----   30   Clerk     
 ON T1.ID = T2.ID           --    -----   30   Manager 
 ORDER BY T1.ID             --    -----   40   Sales Rep
        , T2.ID             --    -----   50   Manager 
        , T2.TITLE;                 

Observe how in the above query we still got a row back with an ID of 30, but it came from the “T2” table. This makes sense because the WHERE condition had been applied before we got to this table. There are several incorrect ways to answer the above question. In the first example, we shall keep all non-matching T2 rows by allowing to pass any null T1.ID values:                     

Outer join T1.ID < 30, (gives wrong answer - see text)
 SELECT *                     Result
 FROM T1                ID    NAME   ID   TITLE    
 FULL OUTER JOIN T2     10    Sand   --   --------- 
 ON T1.ID = T2.ID       20    Sam    20   Sales Mgr
 WHERE T1.ID < 30       --    -----  40   Sales Rep 
 OR T1.ID IS NULL       --    -----  50   Manager    
 ORDER BY T1.ID               
        , T2.ID               
        , T2.TITLE;                               

There are two problems with the above query: First, it is only appropriate to use when the T1.ID field is defined as not null, which it is in this case. Second, we lost the row in the T2 table where the ID equaled 30. We can fix this latter problem, by adding another check, but the answer is still wrong:   

Outer join T1.ID < 30, (gives wrong answer - see text)
 SELECT *                     Result
 FROM T1               ID    NAME      ID     TITLE 
 FULL OUTER JOIN T2    10    Sandy     --     --------- 
 ON T1.ID = T2.ID      20    Sam       20     Sales Mgr 
 WHERE T1.ID < 30      30    Cindy     30     Clerk
 OR T1.ID = T2.ID      30    Cindy     30     Manager  
 OR T1.ID IS NULL      --    -----     40     Sales Rep
 ORDER BY T1.ID        --    -----     50     Manager    
        , T2.ID               
        , T2.TITLE;           

The last two checks in the above query ensure that every T2 row is returned. But they also have the effect of returning the NAME field from the T1 table whenever there is a match. Given our intentions, this should not happen.

SUMMARY: Query WHERE conditions are applied after the join. When used in an outer join, this means that they are applied to all rows from all tables. In effect, this means that any WHERE conditions in a full outer join will, in most cases, turn it into a form of inner join.

INNER JOIN: Click Here OUTER JOIN: Click Here DB2 Manual :Click Here

Scroll to Top