DB2

DB2 EXPLAIN: Access Path for Query Optimization

  • Determine the access path that Db2 chooses for a query.
  • Design databases, indexes, and application programs.
  • Determine when to rebind an application.

In practice, this makes EXPLAIN useful for answering questions like:

  • Why is this query slow?
  • Why did Db2 choose a table scan instead of an index?
  • Did the optimizer sort rows before joining them?
  • Would a different index or statistics profile improve the plan?

The access path is the route Db2 takes to retrieve the data requested by SQL. It describes which tables and indexes are accessed, how they are accessed, and the order in which Db2 processes them. A good access path usually means less I/O, fewer sorts, and faster response time. A poor access path can lead to table scans, unnecessary sorting, and slower joins.

 

Key DB2 EXPLAIN tables

The central table for EXPLAIN is PLAN_TABLE, which contains one or more rows per query block describing the chosen access path. PLAN_TABLE columns show, for example, access type, index usage, join method, join sequence, estimated row counts, and whether sorts or parallelism are used.

Additional EXPLAIN tables enrich this information:

  • DSN_STATEMNT_TABLE stores estimated statement costs (CPU, elapsed time, service units) aligned with related PLAN_TABLE rows, useful for comparing alternative versions of a query.
  • DSN_FUNCTION_TABLE and related tables hold details about functions and other statement components when present.

Together, these tables let performance specialists both see the shape of the access path and quantify its expected cost, which is critical for deciding whether further tuning is justified.

  1. Before you can use EXPLAIN, you must create a plan table to hold the results of EXPLAIN. It contains information about access paths for queries that were explained or hints. The DDL for the broader EXPLAIN table set (PLAN_TABLE, DSN_STATEMNT_TABLE, DSN_FUNCTION_TABLE, and others) is shipped in sample members such as DSNTESC in Db2 libraries, and can be executed once per schema that will run EXPLAIN. You can create it by using the below SQL.

   CREATE TABLE [userid].PLAN_TABLE LIKE DEFAULT.PLAN_TABLE;

 

  1. Populate the plan table. You can populate the plan table by executing the SQL statement EXPLAIN. You can also populate a plan table when you bind or rebind a plan or package by specifying the option EXPLAIN(YES). EXPLAIN obtains information about the access paths for all explainable SQL statements in a package or in the DBRMs of a plan.

    This causes Db2 to insert rows describing the access path for the SELECT portion of the statement into PLAN_TABLE (and related tables if present), tagged with QUERYNO = 1. Alternatively, specifying EXPLAIN(YES) on BIND or REBIND populates EXPLAIN tables for every explainable statement in a package or plan, which is useful for broader access path reviews.

   EXPLAIN PLAN SET QUERYNO = 1 FOR [your sql statement here]

  1. Select information from the plan table. Several processes can insert rows into the same plan table. To understand access paths, you must retrieve the rows for a particular query in an appropriate order.

    Ordering rows in this way helps reconstruct the sequence of operations that Db2 will perform, matching how many visual tools display access plans. For many tasks, only a subset of columns is required, so users often build custom views or queries that project core attributes such as table name, ACCESSTYPE, MATCHCOLS, METHOD, and estimated row counts.

    Execute this SQL command to view the Explain information:

   SELECT * 
    FROM PLAN_TABLE 
    WHERE QUERYNO = 1 
    ORDER BY TIMESTAMP, QUERYNO, QBLOCKNO, PLANNO, MIXOPSEQ 
    WITH UR;

Look at these fields for important information:

  1. PLANNO – Number of steps necessary to process the query indicated in QBLOCKNO. Indicates order in which the steps were executed.
  2. METHOD – Indicate joins method used for the step (PLANNO).
    • 0 = First table accessed, continuation of previous table accessed, or not used
    • 1 = Nested loop join. For each row of the present composite table, matching rows of a new table are found and joined
    • 2 = Merge scan join. The present composite table and the new tables are scanned in the order of the join columns and matching rows are joined. 3 = Sorts neededby ORDER BY, GROUP BY, SELECT DISTINCT, UNION, a quantified predicate or an IN predicate. Does not access a new table.
  3. ACCESTYPE – Method used to access the table.
    • DI = An intersection of multiple DOCID lists to return final DOCID list.
    • DU = Union of multiple DOCID lists to return the final DOCID list.
    • DX = An XML index scan of the index named in ACCESSNAME to return a DOCID list.
    • E = Direct row using a row change timestamp column.
    • H = Hash access. IF an overflow condition occurs, hash overflow index identified by ACCESSCREATOR and ACCESSNAME is used.
    • HN = Hash access using an IN predicate, or an IN predicate that DB2 generates. If a hash overflow condition occurs, hash overflow index identified in ACCESSCREATOR AND ACCESSNAME is used.
    • IN = Index scan when matching predicate contains an IN predicate and the IN-list is accessed through an in-memory table
    • I = An index (identified in ACCESSCREATOR and ACCESSNAME).
    • I1 = One-fetch index scan
    • M = Multiple index scan (followed by MX, MI, MH, or MU).
    • MH = Hash overflow index named in ACCESSNAME
    • MX = Index scan on index named in ACCESSNAME. When the access method MX follows the access method DX, DI, or DU, the table is accessed by the DOCID index using the DOCID list returned by DX, DI, or DU
    • MI = Intersection of multiple indexes
    • MU = Union of multiple indexes
    • N = Index scan when the matching predicate contains the IN keyword or by an index scan when DB2 rewrites a query using the IN keyword.
    • NR = Range list access.
    • O = Work file scan, as a result of a subquery.
    • P = Dynamic pair-wise index scan
    • R = Table space scan.
    • RW = work file scan of materialized user-defined table function
    • V = Buffers for an INSERT statement within a SELECT.
    • Blank = Not applicable to the current row.
  4. MATCHCOLS – Number of index key used for index scan (when ACCESTYPE is I, I1, N, NR, MX, or DX, number of index keys used in an index scan).
  5. ACCESSNAME – Name of the index used for index scan (when ACCESTYPE is I, I1, H, MH, N, NR, MX, or DX, name of index).
  6. INDEXONLY – If access to an index alone is enough to carry out the step, or if the data, too, must be accessed. Y = Yes; N = No.
  7. PREFETCH – Indicates if data pages can be read in advance by prefetch.
    • D = optimizer expects dynamic prefetch
    • S = Pure sequential prefetch
    • L = Prefetch through a page list
    • U = List prefetch with an unsorted RID list
    • Blank = unknown at bind time or no prefetch
  8. COLUMN_FN_EVAL – Indicates when aggregate functions are evaluated.
    • R = while data is being read from the table or index
    • S = While performing a sort to satisfy a GROUP BY clause
    • X = While data is read from a table or index, for aggregate functions when an OFFSET clause is specified
    • Y = While performing a sort, for aggregate functions when an OFFSET clause if specified.
    • Blank = After data retrieval after any sorts

 

Query to find Total Cost:
 
 SELECT SUBSTR(PL.COLLID,1,10) AS COLLID, 
        SUBSTR(PL.PROGNAME,1,10) AS PROGNAME, 
        DATE(PL.EXPLAIN_TIME) AS DATE, 
        TIME(PL.EXPLAIN_TIME) AS TIME, 
        COUNT(PL.QUERYNO) AS "STMT COUNT", 
        DEC(SUM(ST.TOTAL_COST),8,2) AS "TOTAL COST" 
 FROM SJD.PLAN_TABLE PL,
      SJD.DSN_STATEMNT_TABLE ST 
 WHERE PL.PROGNAME = ST.PROGNAME
   AND PL.COLLID = ST.COLLID
   AND PL.EXPLAIN_TIME = ST.EXPLAIN_TIME 
   AND PL.QUERYNO = ST.QUERYNO
 GROUP BY PL.COLLID, 
          PL.PROGNAME, 
          PL.EXPLAIN_TIME 
 ORDER BY PL.PROGNAME;
 

Result:
 ————+---------+---------+---------+---------+-------------
 COLLID PROGNAME DATE TIME       STMT COUNT TOTAL COST 
 +---------+---------+---------+---------+-----------------
 MYCOLL MYPACK   05/08/2014 11.19.38   5    10.10
 MYCOLL MYPACK   05/08/2014 17.36.17   8    19.11
 


Query to Compare Old & New Cost:

 SELECT ST1.QUERYNO AS QUERYNO, 
        COALESCE(DEC(ST1.TOTAL_COST,8,2),0) AS "OLD COST",   
        COALESCE(DEC(ST2.TOTAL_COST,8,2),0) AS "NEW COST" 
 FROM SJD.DSN_STATEMNT_TABLE ST1 FULL JOIN
      SJD.DSN_STATEMNT_TABLE ST2
   ON ST1.QUERYNO = ST2.QUERYNO
 WHERE ST1.COLLID = 'MYCOLL' 
   AND ST2.COLLID = 'MYCOLL'
   AND ST1.PROGNAME = 'MYPACK'
   AND ST2.PROGNAME = 'MYPACK'
   AND DATE(ST1.EXPLAIN_TIME) = '2014-05-08' 
   AND TIME(ST1.EXPLAIN_TIME) = '17.36.17' 
   AND DATE(ST2.EXPLAIN_TIME) = '2014-05-15' 
   AND TIME(ST2.EXPLAIN_TIME) = '13.05.14' 
   AND ST1.TOTAL_COST <> ST2.TOTAL_COST
 ORDER BY QUERYNO;
 

Result:
 ---------+---------+---------+---------+----- 
 QUERYNO OLD COST NEW COST 
 ---------+---------+---------+---------+-----
 353         .15       .11
 360        8.07     16.24
 


Query to find Total Cost and Explain Time:

 SELECT QUERYNO, 
        TOTAL_COST, 
        EXPLAIN_TIME 
 FROM SJD.DSN_STATEMNT_TABLE
 WHERE PROGNAME = 'ID14SQP' 
 AND COLLID = 'ID14SQL'
 ORDER BY TOTAL_COST DESC, 
          EXPLAIN_TIME DESC; 



Result:
 -----+---------+---------+---------+---------+---------+---- 
 QUERYNO TOTAL_COST               EXPLAIN_TIME 
 -----+---------+---------+---------+---------+---------+----
  88     +0.5616997729372477E+05  2014-11-02-18.03.04.993751 
 131     +0.1453335050780900E+01  2014-11-02-18.03.04.993751 
 131     +0.1453335050780900E+01  2014-10-27-16.55.51.641367
 

 
 SELECT QUERYNO,
        COALESCE(DEC(TOTAL_COST,8,0),0) AS "COST", 
        EXPLAIN_TIME
 FROM SJD.DSN_STATEMNT_TABLE 
 WHERE COLLID = ‘ID14SQL'
   AND PROGNAME = ‘ID14SQP' 
 ORDER BY EXPLAIN_TIME DESC;
 

Result:
 ---------+---------+---------+---------+---------+ 
 QUERYNO   COST          EXPLAIN_TIME 
 ---------+---------+---------+---------+---------+
 88        434758        2014-11-02-18.03.04.993751
 88         23184        2014-10-27-16.55.51.641367

Common pitfalls and how EXPLAIN helps

Several recurring issues in Db2 performance work can be uncovered or confirmed by studying EXPLAIN output.

  • Missing or inappropriate indexes: If EXPLAIN reports table space scans for large tables with selective predicates, it often indicates missing indexes or indexes that do not align well with predicate columns.
  • Low MATCHCOLS values: When MATCHCOLS is low relative to index definition, Db2 might not fully exploit composite indexes, suggesting a need to reorder index columns or rewrite predicates.
  • Overly complex queries: Large, monolithic queries may produce convoluted access paths with many joins and sorts; in such cases, EXPLAIN can justify breaking the logic into simpler steps or using temporary tables.
  • Stale statistics: Counterintuitive access paths that contradict data characteristics often trace back to outdated statistics; EXPLAIN, combined with catalog inspection, can help confirm this diagnosis.

By scrutinizing these patterns in EXPLAIN output, DBAs can prioritize which schema changes, query rewrites, or statistics maintenance jobs are likely to deliver the biggest performance gains.

Optimization workflow with DB2 EXPLAIN

An effective query tuning workflow with EXPLAIN generally follows a cycle of measurement, analysis, and adjustment.

  • Capture the baseline access path: Run EXPLAIN for the problematic query (or BIND with EXPLAIN(YES)) to capture the current access path and costs in PLAN_TABLE and DSN_STATEMNT_TABLE.
  • Review access methods and join order: Examine ACCESSTYPE, MATCHCOLS, join methods, and join sequence to spot inefficient table scans, poor index usage, or suboptimal join ordering.
  • Verify statistics: Ensure RUNSTATS or equivalent utilities have collected up-to-date table and index statistics; stale or missing statistics often cause poor access path choices.
  • Refactor SQL or indexes: Adjust predicates, rewrite joins or subqueries, or add/modify indexes to better support typical query patterns, guided by what EXPLAIN shows.
  • Re-EXPLAIN and compare: Rerun EXPLAIN after changes and compare PLAN_TABLE and DSN_STATEMNT_TABLE entries to confirm that access paths and cost estimates have improved.

For dynamic SQL cached in the dynamic statement cache, facilities such as EXPLAIN STMTCACHE can extract access path information for statements currently or previously executed, which is useful for tuning workloads that are not statically bound.

Best practices for Query Optimization

Good query tuning starts with accurate statistics and predicates that support efficient access. If statistics are stale or incomplete, Db2 may choose a plan that looks reasonable but performs poorly in production.

A practical tuning approach is:

  • Validate the SQL predicate structure.
  • Confirm supporting indexes exist.
  • Refresh table and index statistics.
  • Run EXPLAIN before and after changes.
  • Compare the access path, not just elapsed time.

Conclusion

DB2 EXPLAIN is a foundational tool for query performance tuning, giving visibility into the optimizer’s access path choices through PLAN_TABLE and related EXPLAIN tables. By systematically capturing, interpreting, and acting on EXPLAIN output—often aided by Visual Explain diagrams—DBAs and developers can ensure that critical queries use efficient access paths and that schema and statistics evolve in alignment with workload demands.

Read DB2 blogs : Click Here IBM DB2 Manual :Click Here

Admin

Share
Published by
Admin

Recent Posts

AI Product Owner : Use Cases with Examples and Tools

The Product Owner role has shifted from just being a requirements proxy to a strategic,…

8 months ago

Business Value: Crafting User Stories for Measurable Impact

Business Value: In the world of Agile development, the user story has long been the…

10 months ago

SAFe Scrum Master (SSM) Certification with FAQ and Tips

The SAFe Scrum Master certification has become one of the most sought-after credentials for Agile…

1 year ago

PSK Practice Exam Practice Mode Questions

The Professional Scrum with Kanban (PSK) course enhances your organization's ability to deliver value efficiently…

1 year ago

Effective User Interviews in Scrum Framework

Effective User interviews play a crucial role in Scrum methodology, helping Product Owners and Scrum…

1 year ago

User Research Tools and Techniques for Product Owners

Product Owners should be well-versed in various user research tools and techniques to effectively understand…

1 year ago