Left / Right / Outer Join Performance

Document ID : KB000018518
Last Modified Date : 14/02/2018
Show Technical Document Details

Description:

Achieving the best optimization of an outer join may require a review of the access path and adjustment of the query specifications.

Solution:

Whenever any query is not performing as desired, it is important to use the EXPLAIN function to see which access path is being chosen by the optimizer to satisfy the logical request. For an Outer join of any sort (LEFT JOIN, RIGHT JOIN, or PRESERVE) this can be particularly important. If the goal is to have the join processed by accessing the preserved table (or record) using a Calckey, then ensure that the JOIN clause specifies all elements in the Calckey definition.

As part of the process of creating a result table for an outer join, IDMS SQL will first create a Cartesian product of the tables. A Cartesian product is a join operation which returns a set that contains all possible ordered pairs from the two tables being joined. This will mean reading every row of both tables, joining them as indicated in the query, and then selecting from that temporary table the rows which satisfy the query selection criteria. Because of this intermediate step, ensuring that the table access is efficient for both tables involved, is very important.

For example, if a LEFT JOIN specifies only the low-order element of the CALC key in the join criteria, the Calckey will not be used to access that record. If the join criteria are expanded to include all fields in the Calckey (even if this means that the high-level field is compared to a literal), EXPLAIN will show that CALC access was used to access the table. If the different Calckey elements are specified in different places of the query (for example, some in the join expression & others in the WHERE clause), a Calckey will not be used. To illustrate this, consider the structure with records SQLSCHEM.R813 and SQLSCHEM.R054, where the Calckey of the R054 record is made up of fields R054_GROUP_CODE and R054_CODE_TEXT. In that case, the following query would not utilize the Calckey because the join criteria specified in the ON clause names only one of the fields in the Calckey:

SELECT R813_ITEM_NBR , R813_ITEM ,
R813_ITEM_SER_NBR ,R813_ROUTE ,
R813_WRHOUSE ,R813_LOC ,
R813_INITIAL_DATE ,R813_INITIAL_CODE ,
R813_EMPL_ASSIGNED_NUM ,R054_ADD_TEXT_AREA ,
R813_EMP_DEPT ,R813_ACTION_CD
FROM SQLSCHEM.R813
LEFT JOIN SQLSCHEM.R054
ON R054_CODE_TEXT = CAST(R813_EMPL_ASSIGNED_NUM AS CHAR(10))
WHERE R813_STATUS_CD = 'O'
AND R054_GROUP_CODE = 'EMPNBR'

The query below would utilize the Calckey because the join criteria specified in the ON clause names both fields that make up the Calckey:

SELECT R813_ITEM_NBR ,R813_ITEM,
R813_ITEM_SER_NBR ,R813_ROUTE,
R813_WRHOUSE ,R813_LOC,
R813_INITIAL_DATE ,R813_INITIAL_CODE,
R813_EMPL_ASSIGNED_NUM ,R054_ADD_TEXT_AREA,
R813_EMP_DEPT ,R813_ACTION_CD
FROM SQLSCHEM.R813
LEFT JOIN SQLSCHEM.R054
ON (R054_GROUP_CODE = 'EMPNBR' AND R054_CODE_TEXT = CAST(R813_EMPL_ASSIGNED_NUM AS CHAR(10)))
WHERE R813_STATUS_CD = 'O'