Maintenance mode not working

Document ID : KB000095622
Last Modified Date : 07/06/2018
Issue:
The following is recorded in the nas.log at the debug = 5 logging level when alarms start appearing from devices that are supposed to be in maintenance:

May 10 08:52:09:316 [29020] nas: maint: In reRegisterWithMaintenanceProve Got proposedStartTime
May 10 08:52:09:316 [29020] nas: maint: Set to clearIntervalMaps
May 10 08:52:09:316 [29020] nas: maint: Interval Maps Cleared, set to createIntervalMap
May 10 08:52:09:316 [29020] nas: maint: Interval created, set to registerWithMaintenanceModeProbe
May 10 08:52:09:316 [29020] nas: maint: Maintenance Mode Probe Path: maintenance_mode
May 10 08:52:09:316 [29020] nas: maint: maintenanceModeProbePath exists, trying to start session
May 10 08:52:09:336 [29020] nas: maint: Creating PDS
May 10 08:52:09:336 [29020] nas: maint: nimSessionRequest to be called
May 10 08:52:12:758 [26216] nas: maint: entering inMaintenanceMode function
May 10 08:52:29:986 [29020] nas: maint: nimSessionRequest returned 2
May 10 08:52:29:986 [29020] nas: maint: PDS deleted
May 10 08:52:29:986 [29020] nas: maint: Registration error rc: '2' to 'maintenance_mode'
May 10 08:52:29:986 [29020] nas: maint: MM State: UNHEALTHY Maintenance Mode Re-Registration **FAILURE** Return Code: 2

At the same time, the following error messages are recorded in the maintenance_mode.log at loglevel = 5:

May 10 08:52:31:794 WARN [attach_socket, com.nimsoft.monitor.probe.MaintenanceModeProbe] Failure registering /hms/Primary_Hub/WPAPPUIM001/nas .(2) communication error, Error when trying to send on session (S) com.nimsoft.nimbus.NimServerSession(Socket[addr=/10.30.230.4,port=53541,localport=48042]): Software caused connection abort: socket write error
May 10 08:52:31:795 WARN [attach_socket, com.nimsoft.nimbus.NimServerSession] An exception occurred while processing a message from Socket[addr=/10.30.230.4,port=53541,localport=48042].
May 10 08:52:31:796 WARN [attach_socket, com.nimsoft.nimbus.NimServerSession] (2) communication error, I/O error on nim session (S) com.nimsoft.nimbus.NimServerSession(Socket[addr=/10.30.230.4,port=53541,localport=48042]): Software caused connection abort: recv failed

May 10 08:52:31:796 ERROR [attach_socket, com.nimsoft.nimbus.NimServerSession] Exception in NimServerSessionThread.run. Closing session.
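
To check whether a given nas.log contains this signature, the failure line can be searched for and its timestamp extracted, so that it can be compared with the WARN/ERROR entries in maintenance_mode.log. The following is a minimal Python sketch, not a supported tool. The function name and the regular expression are illustrative assumptions based only on the excerpt above, so the pattern may need adjusting if the log format differs in other versions.

```python
import re

# Pattern derived from the failure line in the nas.log excerpt above.
# This is an assumption about the format, not a documented log specification.
FAILURE_RE = re.compile(
    r"^(?P<ts>\w{3} +\d{1,2} \d{2}:\d{2}:\d{2}:\d{3}) \[(?P<tid>\d+)\] nas: maint: "
    r"MM State: UNHEALTHY Maintenance Mode Re-Registration \*\*FAILURE\*\* "
    r"Return Code: (?P<rc>\d+)"
)


def find_reregistration_failures(lines):
    """Return (timestamp, thread_id, return_code) for each failure line found."""
    hits = []
    for line in lines:
        m = FAILURE_RE.match(line.strip())
        if m:
            hits.append((m.group("ts"), int(m.group("tid")), int(m.group("rc"))))
    return hits


# Sample input copied from the excerpt above.
sample = [
    "May 10 08:52:29:986 [29020] nas: maint: PDS deleted",
    "May 10 08:52:29:986 [29020] nas: maint: Registration error rc: '2' to 'maintenance_mode'",
    "May 10 08:52:29:986 [29020] nas: maint: MM State: UNHEALTHY "
    "Maintenance Mode Re-Registration **FAILURE** Return Code: 2",
]
failures = find_reregistration_failures(sample)
for ts, tid, rc in failures:
    print(f"{ts} thread={tid} rc={rc}")
```

To scan a real log, pass an open file object for the nas.log in place of the sample list. The log location depends on the installation.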

 
Environment:
UIM Server:  8.51
maintenance_mode:  8.53, 8.52, 8.52HF1
nas:  8.56, 8.57, 9.0HF1
OS:  Windows 2012 64-bit and Linux RHEL 6.9 64-bit
Cause:
The nas clears its internal maps (the current list of devices in maintenance mode), then fails to re-register with the local maintenance_mode probe. No devices are treated as being in maintenance until the nas next re-registers successfully with the maintenance_mode probe and rebuilds its internal maps. The cause of the communication error behind the re-registration failure is unknown at this time.
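
The effect of this ordering (clear first, then re-register) can be illustrated with a toy model. This is not nas code: the class, method, and device names below are hypothetical and exist only to show why a failed re-registration leaves every device out of maintenance until the next successful attempt.

```python
class ToyAlarmServer:
    """Toy stand-in for the behaviour described above; not the nas implementation."""

    def __init__(self):
        self.maintenance_devices = set()

    def reregister(self, fetch_devices):
        # The internal map is cleared before the registration attempt,
        # mirroring the "Interval Maps Cleared" step in the log.
        self.maintenance_devices.clear()
        try:
            self.maintenance_devices.update(fetch_devices())
            return True
        except ConnectionError:
            # Registration failed: the map stays empty until the next success.
            return False

    def in_maintenance(self, device):
        return device in self.maintenance_devices


def failing_fetch():
    # Simulates the communication error (return code 2) seen in the logs.
    raise ConnectionError("communication error (2)")


server = ToyAlarmServer()
ok_first = server.reregister(lambda: {"router-01"})
suppressed_before = server.in_maintenance("router-01")

ok_second = server.reregister(failing_fetch)
suppressed_after = server.in_maintenance("router-01")

ok_third = server.reregister(lambda: {"router-01"})
suppressed_recovered = server.in_maintenance("router-01")
```

In this model, alarms for "router-01" would stop being suppressed between the failed attempt and the next successful one, which matches the symptom described under Issue.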
Resolution:
The defect is resolved in the following nas hot fixes:

nas 8.56HF5
nas 9.00HF3

Contact CA Support for the nas hot fix appropriate to your UIM environment.