EZD0811I message with info reason code not being read by NSMEMSTR

Document ID : KB000044518
Last Modified Date : 14/02/2018

Problem:

We capture all EZD0811I messages written to syslogd with the Event Management caiopr service (NSMEMSTR). To capture all of these messages, we had to update the /etc/syslog.conf file to specify *.debug rather than *.info on the statement that Event Management adds to define the pipe.

Syslog message priorities, from highest to lowest, are:
emerg/panic, alert, crit, err(or), warn(ing), notice, info, debug
 
 
********************************* Top of Data ***************************
#.* /tmp/syslog.log -N SYSLOGD(+1)
*.* /tmp/syslog.log
*.debug.../cai/nsmem/opr/config/NODENAME/pipe/oprpipe0052           <-- change to capture all messages
#*.info.../cai/nsmem/opr/config/NODENAME/pipe/oprpipe0051            <-- default statement added when Event Management starts the caiopr process
******************************** Bottom of Data ************************* 
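
The effect of changing *.info to *.debug can be illustrated with a minimal sketch. It assumes the conventional syslog selector behavior, where a rule written with a given priority passes messages of that priority and every higher one. The names SEVERITIES, ALIASES, severity_rank and selector_matches are hypothetical and exist only for this illustration; they are not part of syslogd or Event Management.

```python
# Syslog priorities, highest first (index 0 is the most severe).
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]

# Alternate spellings listed in the priority line above.
ALIASES = {"panic": "emerg", "error": "err", "warn": "warning"}


def severity_rank(name):
    """Return the position of a priority name; lower means more severe."""
    name = ALIASES.get(name, name)
    return SEVERITIES.index(name)


def selector_matches(selector_priority, message_priority):
    """Return True if a '*.<selector_priority>' rule would pass the message.

    Assumption: a rule passes its own priority and anything more severe.
    """
    return severity_rank(message_priority) <= severity_rank(selector_priority)


# A *.info rule drops debug-level messages; a *.debug rule passes everything.
print(selector_matches("info", "debug"))
print(selector_matches("debug", "debug"))
```

Under that assumption, the default *.info statement would never deliver debug-level messages to the pipe, which is why the statement was changed to *.debug.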

 

We experienced a situation on one of our LPARs where an EZD0811I message came out with a reason code of 3 (which should be classified as informational), but it was not picked up by Event Management and therefore not passed to CA OPS/MVS for our defined alerting rules.

 

Cause:

The following messages were found in the syslog.log file:

Jun 7 04:27:17 NODENAME/BPXOINIT SYSLOGD syslogd: FSUM1247 Pipe /opt/lpp/cai/nsmem/opr/config/NODENAME/pipe/oprpipe0052 has been closed 

Jun 7 04:27:17 NODENAME/BPXOINIT SYSLOGD syslogd: /opt/lpp/cai/nsmem/opr/config/NODENAME/pipe/oprpipe0052: EDC5112I Resource temporarily unavailable. Errno/Rsn=112/059D0135 

 

syslogd issues FSUM1247 to indicate that the pipe has been closed; the pipe then remains disabled until syslogd is restarted. Event Management continues to try reading from the pipe and gets an EOF, resulting in the EDC5112I message. In this case, the pipe filled up and became blocked because of an unusual flood of messages.

 

Resolution:

  1. Identify and prevent the underlying problem that caused the flood of messages.
  2. Use a system automation tool to detect this situation and take the necessary action to reestablish normal functionality:
     a. Recycle Event Management
     b. Recycle syslogd
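
As an illustration of the detection part of step 2, the following minimal sketch scans syslog lines for the FSUM1247 message shown in the Cause section and returns the affected pipe paths. The pattern is derived only from the sample message text above. The names PIPE_CLOSED and find_closed_pipes are hypothetical, and how the result is handed to an automation product to trigger the recycle actions is site-specific and not shown.

```python
import re

# Pattern based on the FSUM1247 sample line in the Cause section.
PIPE_CLOSED = re.compile(r"FSUM1247 Pipe (?P<pipe>\S+) has been closed")


def find_closed_pipes(lines):
    """Return the distinct pipe paths that syslogd reported as closed."""
    closed = []
    for line in lines:
        match = PIPE_CLOSED.search(line)
        if match and match.group("pipe") not in closed:
            closed.append(match.group("pipe"))
    return closed


sample = [
    "Jun 7 04:27:17 NODENAME/BPXOINIT SYSLOGD syslogd: FSUM1247 Pipe "
    "/opt/lpp/cai/nsmem/opr/config/NODENAME/pipe/oprpipe0052 has been closed",
]
print(find_closed_pipes(sample))
```

A non-empty result indicates that syslogd has stopped writing to the pipe, so the recycle actions in step 2 are needed before Event Management receives messages again.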

 

Additional Information:

For more information, refer to the chapter dedicated to Event Management Configuration in the CA Common Services for z/OS Installation Guide.

https://support.ca.com/cadocs/0/CA%20Common%20Services%2014%201-z%20OS-ENU/Bookshelf.html