DB2

INNER JOIN – step by step walkthrough with examples

 

There are four basic types of SQL JOINS: Inner, Outer (Left, Right, Full), Self, and Cross Join. The INNER JOIN combines records from two tables based on a specified condition (join predicate). It returns only the records that have matching values in both tables, omitting unmatched rows. This method enables precise data merging and efficient analysis across multiple tables.

Why Use INNER JOIN?

  • Combine Related Data: Merge columns from multiple tables using shared keys.
  • Filter Results: Return only matching rows from both tables, enhancing query precision.
  • Simplify Queries: Avoid complex subqueries by joining tables directly based on logical relationships.

INNER JOIN Syntax

SELECT columns
FROM Left_Table
INNER JOIN Right_Table
ON Left_Table.column1 = Right_Table.column2;

Explanation:

  • SELECT columns: Choose specific data fields from the joined tables.
  • FROM Left_Table, Right_Table: Specify the tables participating in the join.
  • INNER JOIN: Indicates the type of join operation.
  • ON condition: Defines the matching columns for joining.

The INNER JOIN is equivalent to the default JOIN clause when no type is explicitly stated. Both can be used interchangeably to achieve the same result.

Logical vs Physical Join Operations

Joins in SQL involve two kinds of operations:

  • Logical operations: Types used in a query—INNER JOIN, OUTER JOIN, CROSS JOIN.
  • Physical operations: Chosen by the SQL optimizer to execute the query efficiently.

Types of Physical Join Operations

  • Nested Loop Join: Efficient when one table is small and the other is indexed. Best for small result sets with low I/O overhead.
  • Merge Join: Fastest for large, sorted datasets. Ideal when join columns are already ordered.
  • Hash Join: Best for unsorted data or when dealing with large datasets where nested loops are inefficient.
  • Adaptive Join: Dynamically chooses between Nested Loop and Hash Join during execution based on runtime statistics.

INNER JOIN vs Standard JOIN

Feature INNER JOIN Standard JOIN (CROSS JOIN)
Join Condition Explicitly specified using ON clause. Implicitly combines all rows from both tables.
Result Rows Only rows where the join condition is met. All possible combinations of rows from both tables.
Purpose Combining tables with shared data. Creating a Cartesian product of all rows.
Common Uses Merging customer and order data. Testing scenarios with multiple variables.
Example SELECT * FROM customers INNER JOIN orders ON customers.customerID = orders.customerID SELECT * FROM customers, orders

Key Differences between INNER JOIN and Standard JOIN (CROSS JOIN):

  • Join Condition: INNER JOIN requires an explicit ON clause to match rows, whereas Standard JOIN (CROSS JOIN) combines every row from both tables by default.
  • Resulting Data: INNER JOIN returns only rows with matching keys, while CROSS JOIN outputs all possible row combinations, often creating large datasets.
  • Purpose: INNER JOIN is designed for combining related data across tables; Standard JOIN is mostly used in testing or simulations to explore permutations.

 

INNER JOIN using ON clause

Learn how to use SQL INNER JOIN with ON clause effectively through an example that demonstrates matching records from two related tables using a shared column.

Input Data
EmployeeName Table-(T1) & Position Table-(T2)
ID NAME ID TITLE TEAM 10 Sandy 20 Sales Mgr 5 20 Sam 30 Clerk 10 30 Cindy 30 Manager 2 40 Sales Rep 7 50 Sr. Manager 11
-- SQL INNER JOIN Query Using ON Clause
SELECT T1.NAME, 
       T1.ID, 
       T2.ID, 
       T2.TITLE
FROM EmployeeName T1
INNER JOIN Position T2 
ON T1.ID = T2.ID
ORDER BY T1.ID,
         T2.TITLE;

The above SQL query performs an INNER JOIN on the EmployeeName and Position tables using the ID column. This is equivalent to a standard join written as follows:

-- Equivalent SQL Join Syntax
SELECT T1.NAME, 
       T1.ID, 
       T2.ID, 
       T2.TITLE
FROM EmployeeName T1,
     Position T2 
WHERE T1.ID = T2.ID
ORDER BY T1.ID,
         T2.TITLE;
Result         
ID   NAME     ID     TITLE  
20   Sam      20     Sales Mgr
30   Cindy    30     Clerk
30   Cindy    30     Manager

This INNER JOIN result set returns all matching rows between EmployeeName and Position tables. Note that ID values 10, 40, and 50 are excluded because they do not have matching entries in both tables.

From the result, we can identify unmatched data — e.g., employee Sandy has no assigned position, and roles like Sales Rep and Sr. Manager remain unfilled. This illustrates a common use case for SQL INNER JOIN in data validation and relationship analysis.

ON vs WHERE Clause in SQL: Key Differences Explained

Understand the key differences between the ON clause and the WHERE clause in SQL. This comparison table helps you learn how these clauses function in JOIN operations and filtering result sets.

Feature ON Clause WHERE Clause
Purpose Specifies the join condition between tables Filters rows from the resulting joined table
Location Part of the JOIN statement Part of the SELECT statement
Execution Order Executed before the WHERE clause Executed after the JOIN clause
Applicability Used with JOIN statements Used with SELECT, UPDATE, and DELETE statements
Result Rows Affects which rows are included in the join Affects which rows are returned from the already joined table
Example SELECT * FROM customers INNER JOIN orders ON customers.customerID = orders.customerID SELECT * FROM customers WHERE country = 'USA'

Key Differences Between ON and WHERE Clauses

  • Purpose: ON defines how tables are connected, while WHERE filters the final result set.
  • Execution Order: ON is executed first to create the joined table, then WHERE filters it.
  • Applicability: ON is specifically for JOINs, while WHERE has broader use across SQL operations.
  • Result Rows: ON determines which rows are included in the join, while WHERE determines which rows are returned from the joined table.

INNER JOIN using GROUP BY clause

Using GROUP BY with INNER JOIN allows grouping the result set by one or more columns after combining rows from multiple tables. This is often used with aggregate functions like SUM(), COUNT(), and AVG() to produce summarized insights from related tables.

SELECT T1.ID,
       T1.NAME, 
       SUM(T1.TEAM) AS EMPCOUNT        
FROM EmployeeName T1          
INNER JOIN Position T2  
    ON T1.ID = T2.ID    
GROUP BY T1.ID
ORDER BY T1.ID, T2.TITLE;

Result Comparison:

Without GROUP BY clause:

ID   NAME     EMPCOUNT
20   Sam           5
30   Cindy        10
30   Cindy         2

With GROUP BY clause:

ID   NAME     EMPCOUNT
20   Sam          5
30   Cindy       12

INNER JOIN with Multiple Tables

Scenario: Analyze employee performance and satisfaction across departments, projects, and managers.

Tables Involved:

  • employees (employeeID, name, departmentID, managerID)
  • departments (departmentID, name)
  • projects (projectID, name, managerID)
  • performance_reviews (employeeID, rating)
  • satisfaction_surveys (employeeID, satisfaction_score)
SELECT
    employees.name AS employee_name,
    departments.name AS department,
    projects.name AS project,
    performance_reviews.rating,
    satisfaction_surveys.satisfaction_score
FROM employees
INNER JOIN departments 
    ON employees.departmentID = departments.departmentID
INNER JOIN projects 
    ON employees.employeeID = projects.managerID
INNER JOIN performance_reviews 
    ON employees.employeeID = performance_reviews.employeeID
INNER JOIN satisfaction_surveys 
    ON employees.employeeID = satisfaction_surveys.employeeID
WHERE departments.name = 'Sales';

INNER JOIN with GROUP BY for Sales Analysis

Scenario: Identify top-selling products across different stores along with supplier and stock information.

Tables Involved:

  • products (productID, name, supplierID)
  • suppliers (supplierID, name)
  • stores (storeID, name)
  • sales (productID, storeID, quantity_sold)
  • inventory (productID, storeID, quantity_in_stock)
SELECT
    products.name AS product_name,
    suppliers.name AS supplier,
    stores.name AS store,
    SUM(sales.quantity_sold) AS total_sales,
    SUM(inventory.quantity_in_stock) AS current_stock
FROM products
INNER JOIN suppliers 
    ON products.supplierID = suppliers.supplierID
INNER JOIN sales 
    ON products.productID = sales.productID
INNER JOIN stores 
    ON sales.storeID = stores.storeID
INNER JOIN inventory 
    ON products.productID = inventory.productID 
    AND stores.storeID = inventory.storeID
WHERE suppliers.name = 'Acme Inc.'
GROUP BY product_name, supplier, store
ORDER BY total_sales DESC;

Common Pitfalls

Ambiguous Column Names: Always prefix columns with table aliases when names repeat.

Missing Matches: INNER JOIN excludes unmatched rows. Use LEFT JOIN if you must retain all left-table rows.

Performance Considerations:

  • Ensure indexed join columns.
  • Be cautious with large tables—filter early using WHERE clauses.

Practical Use Cases:

  • Reporting: Combine sales, products, and customer tables for dashboards.
  • Data Validation: Identify records present in both systems.
  • Filtering: Fetch only overlapping dataset entries.

Conclusion

INNER JOIN is a powerful SQL operation for combining data from multiple tables based on specified conditions. It forms the backbone of relational databases, enabling efficient data retrieval and analysis. Mastering INNER JOINs empowers you to delve deeper into your database, uncovering valuable insights and forging connections that bring your data to life. So, unleash your inner data detective and start querying with confidence!

OUTER JOINs: Step-by-Step Guide | IBM DB2 Manual: Click Here

Admin

Share
Published by
Admin
Tags: Inner Join

Recent Posts

AI Product Owner : Use Cases with Examples and Tools

The Product Owner role has shifted from just being a requirements proxy to a strategic,…

7 months ago

Business Value: Crafting User Stories for Measurable Impact

Business Value: In the world of Agile development, the user story has long been the…

8 months ago

SAFe Scrum Master (SSM) Certification with FAQ and Tips

The SAFe Scrum Master certification has become one of the most sought-after credentials for Agile…

12 months ago

PSK Practice Exam Practice Mode Questions

The Professional Scrum with Kanban (PSK) course enhances your organization's ability to deliver value efficiently…

1 year ago

Effective User Interviews in Scrum Framework

Effective User interviews play a crucial role in Scrum methodology, helping Product Owners and Scrum…

1 year ago

User Research Tools and Techniques for Product Owners

Product Owners should be well-versed in various user research tools and techniques to effectively understand…

1 year ago