DB2 Predicates

The DB2 Predicates affect how DB2 selects the access path for the statement. Predicates are found in the WHERE, HAVING, or ON clauses of SQL statements; they describe attributes of data. Most DB2 predicates are based on the columns of a table. They either qualify rows (through an index) or reject rows (returned by a scan) when the table is accessed. The resulting qualified or rejected rows are independent of the access path that is chosen for that table.

DB2 Predicates in SQL statements are classified. These classifications dictate how DB2 processes the predicates and how much data is filtered during the process. These classifications are as follows:

Stage 1 DB2 Predicate

An indexable predicate can match index entries; predicates that cannot match index entries are said to be non-indexable.

The DB2 Data Manager understands your indexes and tables and can use an index for efficient access to your data. Only a stage 1 predicate can limit the range of data accessed on a disk. Stage 1 is responsible for translating the data stored on pages into a result set of rows and columns.

Stage 1 predicates are better than stage 2 because they qualify rows earlier and reduce the amount of processing needed at stage 2.

If the predicate, which is classified as a stage 1, is evaluated after a join operation, then it is a stage 2 predicate. So all indexable predicates are stage 1, but not all stage 1 predicates are indexable.

Examples of Stage 1 DB2 Predicate

Syntax – col op value

col – Indicates a column of a table.

Op – Indicates an operator such as =, >, <, >=, <=, and so on.

Value – Indicates an expression that does not contain a column from the table (a non-column expression).

Let’s consider index exists on the EMPNO column of the EMP table.

Example of indexable stage 1 predicate:

SELECT LASTNAME, FIRSTNME
FROM  EMP
WHERE EMPNO = '000010'

Predicates containing NOT BETWEEN, NOT IN (for a list of values), NOT LIKE (without a leading search character), or LIKE (with a leading search character) can also be stage 1 indexable.

Example of non-indexable stage 1 predicate:

SELECT LASTNAME, FIRSTNME 
FROM EMP
WHERE EMPNO <> '000010'

Stage 2 DB2 Predicate

The stage 2 engine processes functions and expressions. But, it is unable to access data in indexes and tables directly. Data from stage 1 is passed to stage 2 for further processing. So, stage 1 predicates are more efficient than stage 2 predicates. Stage 2 predicates cannot use an index, and thus cannot limit the range of data retrieved from a disk.

Stage 2 (Relational Data Services) handles more complex predicates, data transformations, and computations. These Stage 2 predicates are much more expensive for DB2 to resolve than Stage 1 due to additional processing and additional code path. Additionally, RDS cannot make effective use of indexes.

Generally, stage 2 occurs after data accesses and performs such actions as sorting and evaluation of functions and expressions. Stage 2 predicates generally contain column expressions, correlated subqueries, CASE expressions, and so on.

Examples of a Stage 2 DB2 Predicate

Let’s consider EMPNO is a character column of fixed length 6. Below example is a range predicate comparing a character column to a character value that exceeds the length of the column which makes it a stage 2 predicate.

SELECT LASTNAME, FIRSTNME
FROM  EMP
WHERE EMPNO > '00000010' 

SELECT LASTNAME, FIRSTNME
FROM EMP 
WHERE SUBSTR(LASTNAME,1,1) = 'B'

Order of DB2 Predicates Evaluation

 The first set of rules:

  1. Indexable predicates are applied first. All matching predicates on index key columns are applied first and evaluated when the index is accessed.
  2. Other stage 1 predicates are applied next.
  3. First, stage 1 predicates that have not been picked as matching predicates but still refer to index columns are applied to the index. This is called index screening. In general, DB2 chooses the most restrictive predicate as the matching predicate. All other predicates become index-screening predicates.
  4. After data page access, stage 1 predicates are applied to the data.
  5. Finally, the stage 2 predicates are applied on the returned data rows

The second set of rules describes the order of predicate evaluation within each of the above stages:

  1. All equal predicates (including column IN list, where list has only one element).
  2. All range predicates and predicates of the form column IS NOT NULL.
  3. All other predicate types are evaluated.

After both sets of rules are applied, predicates are evaluated in the order in which they appear in the query. Because you specify that order, you have some control over the order of evaluation.

Types of DB2 Predicates

The type of a predicate depends on its operator or syntax, as listed below. The type determines what precessing and filtering occurs when the predicate is evaluated.

  • Subquery – Any predicate that includes another SELECT statement. Example: C1 IN (SELECT C10 FROM TABLE1)
  • Equal – Any predicate that is not a subquery predicate, has an equal operator and no NOT operator. Also included are predicates of the form C1 IS NULL.  Example: C1=100
  • Range – Any predicate that is not a subquery predicate and has an operator in the following list: >, =>, <, <=, LIKE, or BETWEEN.
  • IN – A predicate of the form: column IN (list of values). Example: C1 IN (5,10,15)
  • NOT – Any predicate that is not a subquery predicate and contains a NOT operator. Example: COL1 <> 5 or COL1 NOT BETWEEN 10 AND 20.

Combining DB2 Predicates

When simple predicates are connected by an OR condition, the resulting compound predicate is evaluated at the higher stage of the two simple predicates. The following example contains two simple predicates that are combined by an OR. The first predicate is stage 1 indexable, and the second is non-indexable stage 1. The result is that the entire compound predicate is stage 1 and not indexable:

SELECT EMPNO
FROM EMP
WHERE WORKDEPT = 'C01' OR SEX <> 'M'

In the next example, the first simple predicate is stage 1 indexable, but the second (connected again by an OR) is stage 2. Thus, the entire compound predicate is stage 2:

SELECT EMPNO 
FROM EMP 
WHERE WORKDEPT = 'C01' OR SUBSTR(LASTNAME,1,1) = 'L'

BETWEEN is usually more efficient than <= predicate and the >= predicate except when comparing a host variable to 2 columns.

Stage 1: WHERE Col1 <= :hostvar AND col2 >= :hostvar
Stage 2 : WHERE :hostvar BETWEEN col1 and col2

Read DB2 blogs : Click Here IBM DB2 Manual :Click Here

Scroll to Top