Maximizing Data Set Integrity

Document ID : KB000027979
Last Modified Date : 14/02/2018

EDIF - Enhanced Data Set Integrity Facility

EDIF provides powerful integrity features above and beyond those provided by the base z/OS operating system and access methods.

The EDIF features described in this article can be generally applied or excluded for all data sets, or can be selectively applied or excluded by data set name, data set name mask, or data set type (DSORG).

Once you have selected the data sets you want to protect, this additional integrity is applied to every access of those data sets. However, you can bypass the additional integrity for access from certain programs, if desired.

You choose what action is to be taken when EDIF detects an exception. One option would be to allow the exception to occur and to log it in SMF. Many first-time EDIF users select this "quiet mode" option. Another option would be to prevent the exception by abnormally terminating the offending program with a selectable ABEND code. With the simultaneous DISP=SHR update protection option, you can also elect to have EDIF automatically and transparently handle the integrity exception.
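As a rough illustration of the first two choices, the sketch below models "log and continue" versus "terminate with an ABEND". It is not EDIF code: the names `Action`, `Abend`, and `handle_exception`, the list standing in for SMF, and the placeholder code `U0999` are all assumptions made for the example. The third, transparent-handling option is not modeled here.

```python
from enum import Enum


class Action(Enum):
    LOG = "log"      # allow the access and record it ("quiet mode")
    ABEND = "abend"  # stop the offending program


class Abend(Exception):
    """Stands in for abnormal termination; the code is user-selectable."""

    def __init__(self, code):
        super().__init__(code)
        self.code = code


def handle_exception(action, dsname, detail, smf_log, abend_code="U0999"):
    """Apply the configured action to one detected integrity exception.

    smf_log is a plain list standing in for SMF recording (assumption).
    abend_code is a hypothetical placeholder, not a documented default.
    """
    if action is Action.LOG:
        smf_log.append((dsname, detail))
        return True  # the access is allowed to proceed
    if action is Action.ABEND:
        raise Abend(abend_code)
    raise ValueError(f"unknown action: {action!r}")
```

Starting with `Action.LOG` corresponds to the quiet mode that many first-time users choose: nothing is blocked, but every exception leaves a record that can be reviewed before switching to `Action.ABEND`.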

EDIF provides you with the ability to display or modify your EDIF processing options dynamically, at any time, using operator commands. EDIF also provides an operator command (EDITEST) to display what processing options, if any, apply to the data set specified on the EDITEST command.



A) Simultaneous DISP=SHR Update Protection

For data sets specified in DD statements in JCL, a multiple-resource ENQ is issued for resources whose QNAME is SYSDSN and whose RNAMEs are the data set names specified in the JCL. The type of access (EXC or SHR) for each resource is governed by the DISP= value in the JCL. However, nothing ties the access type (EXC or SHR) on the SYSDSN ENQ to the mode in which the file (DCB) is subsequently opened within the application program.

For example, if two jobs specify DISP=SHR in their JCL for a data set, and both programs open the associated DCB for output, data corruption is possible unless additional serialization is employed.

Most of us are taught that, if you update an existing data set, you should specify DISP=OLD or DISP=MOD. Unfortunately, users often specify DISP=SHR for activity that includes update processing. There is no edit checking to prevent this from occurring.

EDIF provides the additional serialization required to eliminate the possibility of data corruption. EDIF tracks open and close activity of specified data sets. If an OPEN is issued for one of these data sets and the OPEN specifies a mode that allows update processing, EDIF issues an exclusive ENQ with a QNAME of EDIDSN, and an RNAME containing the VOLSER and the data set name. This eliminates the potential data corruption caused by incorrect specification of DISP in the JCL.

Furthermore, the serialization is more granular. Some jobs require a series of short updates to a common file. Without EDIF, those jobs would either need to be single-threaded through the initiators, or you would need to write customized code that performs dynamic allocation and de-allocation for the shared file. With EDIF, you can intentionally specify DISP=SHR in the JCL, run the jobs concurrently, and allow EDIF to "interleave" the updates to the common file in a secure fashion without loss of data integrity and without rewriting application code.
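The idea of open-time serialization can be sketched in a few lines of Python. This is a model, not EDIF's implementation: threads stand in for jobs, a `threading.Lock` stands in for the exclusive EDIDSN ENQ, and the function names, the volume and data set names, and the exact set of open modes treated as "update" are assumptions made for the example.

```python
import threading

# Open modes treated as allowing update in this sketch (assumed set).
UPDATE_MODES = {"OUTPUT", "UPDAT", "INOUT", "OUTIN", "EXTEND"}

_guard = threading.Lock()  # protects the table of per-data-set locks
_enq = {}                  # (volser, dsname) -> lock modeling the exclusive ENQ


def edif_open(volser, dsname, mode):
    """Model an OPEN: update-mode opens wait for exclusive ownership."""
    if mode not in UPDATE_MODES:
        return None  # read-only opens are not serialized here
    with _guard:
        lock = _enq.setdefault((volser, dsname), threading.Lock())
    lock.acquire()   # suspended until the current updater closes
    return lock


def edif_close(handle):
    """Model a CLOSE: release the exclusive resource, if one was held."""
    if handle is not None:
        handle.release()


def run_job(job_name, shared_file, updates=3):
    """A job that makes several short updates to a common file."""
    for i in range(updates):
        handle = edif_open("VOL001", "PROD.COMMON.FILE", "OUTPUT")
        try:
            shared_file.append((job_name, i, "begin"))
            shared_file.append((job_name, i, "end"))
        finally:
            edif_close(handle)


def demo():
    """Run two jobs concurrently against the same shared file."""
    shared_file = []
    jobs = [threading.Thread(target=run_job, args=(name, shared_file))
            for name in ("JOBA", "JOBB")]
    for job in jobs:
        job.start()
    for job in jobs:
        job.join()
    return shared_file
```

Because the lock is held only between open and close, the two jobs can run at the same time and their updates interleave at the granularity of one open/close pair. Each "begin" record is always followed by the matching "end" record from the same job.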

Many years ago, IBM added serialization (SYSZDSCB ENQ) to prevent such simultaneous updates from corrupting a PDS. Only one DISP=SHR open for output is allowed for a PDS. Subsequent DISP=SHR openers for output are terminated with a S213-30 ABEND. At the z/OS 1.5 level, IBM added serialization (SYSZEDI ENQ) to prevent simultaneous updates from corrupting sequential files with the same S213 ABEND results.

With EDIF, when the EDIDSN resource is not available, the opening task is suspended in EDIF code until the resource is available. Therefore, EDIF transparently serializes these updates and eliminates the possibility of S213-30 ABENDs. Furthermore, the EDIF solution protects data sets of other organizations as well, not just DSORG=PO data sets.
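The difference between the two behaviors, rejecting the second opener versus suspending it, can be shown with a small Python model. The lock stands in for the serialization resource, and `OpenRejected`, `open_fail_fast`, and `open_and_wait` are hypothetical names used only for this sketch.

```python
import threading


class OpenRejected(Exception):
    """Stands in for an S213-30 ABEND in this sketch."""


def open_fail_fast(lock):
    """Reject the open at once if another updater holds the resource."""
    if not lock.acquire(blocking=False):
        raise OpenRejected("S213-30")
    return lock


def open_and_wait(lock):
    """Suspend the opener until the resource becomes available."""
    lock.acquire()
    return lock
```

With the fail-fast policy the second job must be rerun after the first one finishes. With the waiting policy the second job simply continues once the first updater closes the data set.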

Data sets may be specified as eligible for EDIF processing by default, by data set name with wild cards, by data set organization, or by any combination of the above.


B) Data Set Attribute Verification Feature 

EDIF can detect and/or prevent the erroneous modification of data set attributes that could render a data set unusable to other programs. EDIF performs this by comparing the attributes in the DCB for the opening program with the attributes in the DSCB for the data set.
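A comparison of this kind might look like the following Python sketch. It is only an illustration: the `Attrs` type, the `verify_open` function, and the choice of DSORG, RECFM, LRECL, and BLKSIZE as the compared fields are assumptions, and the sketch ignores the normal merging of unspecified DCB fields from the data set label.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Attrs:
    dsorg: str    # e.g. "PS" or "PO"
    recfm: str    # e.g. "FB"
    lrecl: int
    blksize: int


def verify_open(dcb: Attrs, dscb: Attrs,
                member: Optional[str] = None) -> List[str]:
    """Return the mismatches between the program's DCB and the DSCB.

    An empty list means an output open would leave the data set's
    attributes unchanged.
    """
    problems = []
    for name in ("recfm", "lrecl", "blksize"):
        wanted, actual = getattr(dcb, name), getattr(dscb, name)
        if wanted != actual:
            problems.append(f"{name.upper()}: DCB={wanted} DSCB={actual}")
    # Writing one member of a PDS sequentially is legitimate; a sequential
    # open of the PDS itself, with no member name, is not.
    sequential_member_access = (
        dscb.dsorg == "PO" and dcb.dsorg == "PS" and member is not None
    )
    if dcb.dsorg != dscb.dsorg and not sequential_member_access:
        problems.append(f"DSORG: DCB={dcb.dsorg} DSCB={dscb.dsorg}")
    return problems
```

For a PDS defined as `Attrs("PO", "FB", 80, 27920)`, an open with LRECL=133 and BLKSIZE=1330 reports two mismatches, and an open with DSORG=PS and no member name reports a DSORG mismatch.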

One example of DSCB attribute modification involves partitioned data sets (PDSs). PDSs are usually the "repository" for critical operational data such as procedure libraries, run libraries, and CICS libraries. PDS directories contain data that describe the members within the PDS. Damage to a PDS directory can destroy the mechanism by which the members (and the critical data) are accessed. The following are examples of how PDSs and/or their directories can be destroyed:

  • Copying a data set with attributes different from those of the PDS (LRECL, BLKSIZE) into the PDS
  • Not specifying a member name on the DD statement when the program opens the DD for output with DSORG=PS in its DCB

In the first case, the attributes of the PDS are changed to the attributes of the "copied" data set. None of the other members within the data set can be accessed, because their attributes are now different from those of the PDS. One way to fix this problem is to copy a data set with the correct attributes back into the PDS.

Unfortunately, in a multi-image environment many other updates to the PDS are probably taking place, and those new "updates" may be suspect once the file is "fixed". In addition, the solution involves someone doing something after the problem occurs. What if the problem occurred at 2:00 A.M.? Do operators or application programmers know how to fix it? In an automated operations environment, problems should be prevented, not solved after they occur.

In the second case above, a user updates a PDS and forgets to include a member name on the PDS as the object of the update. For example:

//* 'MONTHLY.PAYROLL.DATA' is a sequential file

In this example, the program PAYME01 expects that the data set associated with DDNAME PAYLOG is a sequential file, and will open it with DSORG=PS in its DCB. Therefore, the directory for 'PAYROLL.DATA' will be destroyed. The only solution is to restore the PDS from a previous back-up copy. Careful programming can prevent errors that could cost your IS organization significant amounts of time and money.