Using the IBM Health Checker with CA IDMS.

Document ID : KB000021586
Last Modified Date : 14/02/2018

Question:

Does CA IDMS provide checks for use with the IBM z/OS Health Checker facility?

 

Answer:

 

IBM Health Checker Exploitation

This section describes the health checks that report on CA IDMS systems. The product owner for all CA IDMS health checks is CA_IDMS.

Each Health Check issues its output as messages to the IBM Health Checker message buffer, which you can view using SDSF, the HZSPRINT utility, or a log stream that collects a history of check output. The HCHECKER facility in CA SYSVIEW provides a convenient method for displaying and modifying the status and attributes of a health check.

If a check finds a potential problem, it issues an exception message. Each exception message is issued both as a WTO and to the message buffer. The WTO version contains only the message text, while the version in the message buffer includes the text, an explanation of the potential problem and its severity, and information on how to fix it.

CA IDMS will always write information to the JES log about the status of its related health checks whenever a CA IDMS Central Version is started. This information is provided even if the IBM Health Checker facility is not active on the z/OS system.

Note that the full name of each health check for CA IDMS systems is in the format:

<check-name>@<system-name> where
   <check-name>  is the basic check name as documented below and
   <system-name> is the name of the system as defined by the SYSTEM ID clause on the
                 SYSTEM statement entered during System Generation.
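
As a minimal sketch of the naming rule above (written in Python purely for illustration; the helper names full_check_name and split_check_name are hypothetical and are not part of CA IDMS or the IBM Health Checker), the full name is the basic check name and the system name joined by "@":

```python
from typing import Tuple


def full_check_name(check_name: str, system_name: str) -> str:
    """Join a basic check name and a system name with '@'."""
    return f"{check_name}@{system_name}"


def split_check_name(full_name: str) -> Tuple[str, str]:
    """Split a full check name back into (check-name, system-name)."""
    check_name, _, system_name = full_name.partition("@")
    return check_name, system_name


# PROD100 is the system id used in the console command sample in this document.
print(full_check_name("IDMS_CPU_EFFECTIVENESS", "PROD100"))
```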

You can use IBM Health Checker syntax to modify the default attributes of a particular check for a single CA IDMS system by specifying the specific system id as part of the check name. Alternatively, you can modify the defaults for a group of systems or for all systems by using the special wildcard characters "*" or "?" as shown in the samples at the end of this section.
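
The wildcard behavior can be approximated with the following Python sketch. It assumes that "*" matches zero or more characters and "?" matches exactly one character, and it uses Python's fnmatchcase as a stand-in for the Health Checker's own matching. The system ids TEST01QA and TESTDEV1 are hypothetical.

```python
from fnmatch import fnmatchcase


def matches(pattern: str, full_check_name: str) -> bool:
    """Return True if the full check name fits the wildcard pattern.

    Approximation only: '*' = zero or more characters, '?' = one character.
    """
    return fnmatchcase(full_check_name, pattern)


# Hypothetical system ids appended to a documented check name.
names = [
    "IDMS_CPU_EFFECTIVENESS@PROD100",
    "IDMS_CPU_EFFECTIVENESS@TEST01QA",
    "IDMS_CPU_EFFECTIVENESS@TESTDEV1",
]

for pattern in ("IDMS_CPU_EFFECTIVENESS*",
                "IDMS_CPU_EFFECTIVENESS@TEST*",
                "IDMS_CPU_EFFECTIVENESS@TEST??QA"):
    selected = [n for n in names if matches(pattern, n)]
    print(pattern, "selects", selected)
```

Under these assumptions, the first pattern selects every system, the second only the systems whose id begins with TEST, and the third only TEST01QA.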

IDMS_SCRATCH_IN_MEMORY Health Check

Description

IDMS_SCRATCH_IN_MEMORY checks whether scratch is in memory. This Health Check issues a medium-level exception if it is not, since the CA IDMS system runs more efficiently when scratch is kept in memory. Because this is a static check, it is run once during startup.

CA IDMS utilizes a temporary working storage area referred to as the scratch area. In severely storage constrained environments, this area can be supported on DASD through standard file I/O. In most cases, however, the performance of the CA IDMS system can be significantly improved by enabling this scratch area to reside in main storage and optionally permitting dynamic extensions to provide for the self-tuning of storage use.

Best Practice

Specify SCRATCH IN STORAGE IS YES on the CA IDMS SYSTEM statement.

Parameters Accepted

None.

Debug Support

No.

Verbose Support

No.

Reference

See the CA IDMS System Generation Guide for detailed information on setting the SCRATCH IN STORAGE parameter.

Messages

See the CA IDMS Messages and Codes Guide.

IDMS_CPU_EFFECTIVENESS Health Check

Description

The CPU effectiveness of a CA IDMS system is the ratio of CPU received to the elapsed time that CPU was required. This provides an indication of the amount of time that CA IDMS was forced into an involuntary wait for CPU resources. Low values of CPU effectiveness indicate serious performance degradation and are often observed as significant response time increases by users of the system. This number should usually exceed 90% for production systems, which means that CA IDMS was reliably provided with at least 90% of the CPU that it required to perform its assigned tasks. A lower ratio may be adequate for development or test systems. The ratio that raises an exception can be adjusted for some or all CA IDMS systems.

This check is run every 15 minutes and the CPU effectiveness calculation is based on this interval. A longer interval is not recommended since the variation in effectiveness over a longer time period makes the calculation less meaningful.
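
The ratio described above can be illustrated with a short Python sketch. This is not how CA IDMS measures CPU internally, which this document does not describe. The function names and the sample figures are hypothetical, and the sketch assumes that an exception is raised only when effectiveness falls strictly below the configured ratio.

```python
def cpu_effectiveness(cpu_received_secs: float, cpu_required_secs: float) -> float:
    """Return CPU effectiveness as a percentage for one check interval.

    cpu_received_secs: CPU time actually received during the interval.
    cpu_required_secs: elapsed time during which CPU was required.
    """
    if cpu_required_secs <= 0:
        return 100.0  # no CPU was required, so none was withheld
    return 100.0 * cpu_received_secs / cpu_required_secs


def raises_exception(effectiveness_pct: float, cpu_ratio: float = 90.0) -> bool:
    """Assumption: an exception is raised when effectiveness < cpu_ratio."""
    return effectiveness_pct < cpu_ratio


# Hypothetical 15-minute interval: CPU was required for 400 seconds of
# elapsed time, but only 340 seconds of CPU were received.
pct = cpu_effectiveness(340.0, 400.0)
print(pct, raises_exception(pct), raises_exception(pct, cpu_ratio=70.0))
```

With these sample figures the effectiveness is 85%, which would raise an exception against a ratio of 90 but not against a ratio of 70.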

The main cause of low CPU effectiveness is insufficient priority assigned to the CA IDMS job or started task. Another possible cause is high paging rates, indicating that the machine does not have enough real storage to support its current work load. A final cause is a startup parameter on the EXEC statement that specifies SWAP=Y. Swapping should never be allowed for a CA IDMS system.

Best Practice

Ensure that the system is running non-swappable by removing any SWAP=Y parameter from the startup JCL. Assuming that paging rates are acceptable, either increase the IBM Work Load Manager (WLM) velocity goals for CA IDMS or reduce the amount of higher priority work being run on the LPAR. If this doesn't relieve the condition and the IBM Intelligent Resource Director (IRD) is in use, adjust its configuration to reduce the amount of higher priority work being run on the machine. Another alternative is to create a separate service class for CA IDMS that is CPU protected.

Parameters Accepted

CPURATIO(cpu-ratio)

cpu-ratio specifies the minimum CPU effectiveness that is acceptable for the CA IDMS system. This ratio can be modified to allow some systems to run at a lower effectiveness level without raising an exception.

Debug Support

No.

Verbose Support

No.

Reference

See the CA IDMS System Operations Guide for detailed information on setting EXEC parameters. See IBM documentation for information on WLM and IRD.

Messages

See the CA IDMS Messages and Codes Guide.

IDMS_CHANGE_TRACKING Health Check

Description

IDMS_CHANGE_TRACKING checks whether the SYSTRAK feature is in use. SYSTRAK is recommended since it facilitates recovery in certain cases and can be used to monitor and adjust CA IDMS system parameters. A medium-level exception is issued if SYSTRAK is not in effect. Because this is a static check, it is run once during startup.

Best Practice

It is strongly recommended that Change Tracking be enabled. Change Tracking permits changing the database environment of a CV in a fault-tolerant manner. If the CV fails, the runtime database definition is restored from SYSTRK files during restart, ensuring that the files being updated at the time of failure are the ones recovered by warmstart.

Parameters Accepted

None.

Debug Support

No.

Verbose Support

No.

Reference

See the CA IDMS System Operations Guide for detailed information on Change Tracking. Also see the Database Administration Guide for information on related administrative procedures.

Messages

See the CA IDMS Messages and Codes Guide.

IDMS_ZIIP_USAGE Health Check

Description

IDMS_ZIIP_USAGE checks whether zIIP processors are present and in use. If zIIP processors are available and they are not being used by CA IDMS, it issues a medium-level exception. This check is run once every 24 hours.

CA IDMS optionally exploits zIIP processors on the z9 series and above for the z/OS operating system. This feature permits offloading computing cycles to the zIIP processors, thereby increasing overall CPU throughput while decreasing operational costs.

Most CA IDMS system code is eligible to run on a zIIP processor. CA IDMS runtime processing ensures that a non-zIIP processor is selected to run non-eligible routines such as user exits, database procedures, SQL-invoked routines, and application programs.

Best Practice

Enable zIIP usage for CA IDMS systems where one or more zIIP processors are available.

Parameters Accepted

None.

Debug Support

No.

Verbose Support

No.

Reference

See the CA IDMS System Operations Guide section on "zIIP Exploitation".

Messages

See the CA IDMS Messages and Codes Guide.

Modifying CA IDMS Health Check Actions

The default actions for the CA IDMS health checks can be modified by changing the policy statements through IBM Health Check facilities. For detailed information on establishing or modifying policies to control individual Health Checks please refer to IBM's Health Checker for z/OS User's Guide. Two examples are shown below.

Example 1:

These sample policy statements could be added to a HZSPRMxx member in SYS1.PARMLIB to permanently change the required CPU effectiveness ratio based on the characteristics of particular CA IDMS systems.

/* This sample policy matches the default value for all IDMS        */
/* systems.                                                         */
/* ---------------------------------------------------------------- */ 
ADDREPLACE POLICY STMT(CPU_EFFECT)                                     
UPDATE CHECK(CA_IDMS,IDMS_CPU_EFFECTIVENESS*)                          
SEVERITY(HIGH)                                                          
PARM('CPURATIO(90)')                                                   
INTERVAL(00:15)
ACTIVE                                                        
DATE(20120217)                                                         
REASON('Monitor IDMS systems for adequate CPU resources')
/* This sample policy entirely deactivates CPU Effectiveness        */     
/* checking for test systems.                                       */
/* This sample assumes that the system ID for each test system      */
/* begins with the letters "TEST".                                  */
/* ---------------------------------------------------------------- */ 
ADDREPLACE POLICY STMT(CPU_EFFECT_TEST)                                     
UPDATE CHECK(CA_IDMS,IDMS_CPU_EFFECTIVENESS@TEST*)                          
INACTIVE                                                          
INTERVAL(00:15)                                                        
DATE(20120217)  
REASON('Our test systems can run at very low priority')                       
/* ---------------------------------------------------------------- */ 
/* This sample policy enables CPU effectiveness reporting for       */
/* certain test systems, but with a lower CPU ratio than the        */ 
/* default.  It also reduces the message severity if an             */          
/* exception occurs.                                                */
/* Since this policy is updated after the previous one, it          */
/* will take effect for TEST systems with system ids that contain   */                                    
/* the letters QA in columns 7 and 8 of the system id.              */
/* ---------------------------------------------------------------- */ 
ADDREPLACE POLICY STMT(CPU_EFFECT_QA)                                     
UPDATE CHECK(CA_IDMS,IDMS_CPU_EFFECTIVENESS@TEST??QA)                          
SEVERITY(MEDIUM)                                                          
PARM('CPURATIO(70)')                                                   
INTERVAL(00:15) 
ACTIVE                                                       
DATE(20120217)  
REASON('Test QA systems can run at lower priority than production')
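
The combined effect of the three sample statements can be sketched in Python as follows. This is a simplified model, not the Health Checker's implementation: it assumes that policy statements are applied in the order they appear and that a later matching statement overrides the attributes set by an earlier one, as the comments in the third sample imply. The system ids TEST01QA and TESTDEV1 and the function name effective_attributes are hypothetical.

```python
from fnmatch import fnmatchcase

# (statement name, check-name pattern, attributes the statement sets)
POLICY = [
    ("CPU_EFFECT", "IDMS_CPU_EFFECTIVENESS*",
     {"state": "ACTIVE", "severity": "HIGH", "cpuratio": 90}),
    ("CPU_EFFECT_TEST", "IDMS_CPU_EFFECTIVENESS@TEST*",
     {"state": "INACTIVE"}),
    ("CPU_EFFECT_QA", "IDMS_CPU_EFFECTIVENESS@TEST??QA",
     {"state": "ACTIVE", "severity": "MEDIUM", "cpuratio": 70}),
]


def effective_attributes(full_check_name: str) -> dict:
    """Apply every matching statement in order; later statements win."""
    attrs = {}
    for _stmt, pattern, updates in POLICY:
        if fnmatchcase(full_check_name, pattern):
            attrs.update(updates)
    return attrs


for system_id in ("PROD100", "TESTDEV1", "TEST01QA"):
    name = "IDMS_CPU_EFFECTIVENESS@" + system_id
    print(name, effective_attributes(name))
```

In this model, PROD100 keeps the high-severity check at a ratio of 90, TESTDEV1 ends up inactive, and TEST01QA is reactivated with medium severity and a ratio of 70.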

Example 2:

This sample console command can be used to temporarily lower the interval at which CPU effectiveness is checked. This might be useful on a critical system where an alert is desired if CPU effectiveness falls below the desired level even for a short period of time.

F HZSPROC,UPDATE,CHECK(CA_IDMS,IDMS_CPU_EFFECTIVENESS@PROD100),
INTERVAL=(00:01) 

The following information should be added to the CA IDMS Messages and Codes Guide.

Add a paragraph at the end of the "Reason:" explanation for each of the following messages:

DC045000, DC045003, DC200024, DC200025, DC016107, DC016108, DC285900, and DC285905.  

Each added paragraph contains the following text:

This message is produced via the IBM Health Checker facility. See the CA IDMS System Operations Guide for more information about health checks related to CA IDMS.