How can CA Spectrum be used to monitor and alarm on individual CPU and memory for devices?

Document ID : KB000004218
Last Modified Date : 14/02/2018
Issue:

Before Spectrum 10.2, CA Spectrum monitors individual CPU and memory instances out of the box, but it raises a High CPU/Memory Utilization alarm only on the instance that violates the threshold first (if there are multiple instances). CA Spectrum does not monitor the aggregate value of a device's CPU and memory by default. To activate aggregate value monitoring, uncomment the bottom half of the event configuration below. The <SPECROOT>/SS/CsVendor/Cabletron/EventDisp file has the following entries:
 
# Comment out the following lines if you do not want
# CPU Threshold alarms to occur for individual instances
0x00010f01      R Aprisma.EventPairTimeAttr, 0x00010f02, "0x00010f03 -:-", 0x12bce
0x00010f03 E 50 A 2, 0x00010f03
0x00010f02 E 50 C    0x00010f03
 
# Comment out the following lines if you do not want
# Memory Threshold alarms to occur for individual instances
0x00010f04      R Aprisma.EventPairTimeAttr, 0x00010f05, "0x00010f06 -:-", 0x12bcf
0x00010f06 E 50 A 2, 0x00010f06
0x00010f05 E 50 C    0x00010f06

 

# Uncomment the following lines if you want
# CPU Threshold alarms to occur based on the
# aggregate utilization of all the instances
# 0x00010f07      R Aprisma.EventPairTimeAttr, 0x00010f08, "0x00010f09 -:-", 0x12bce
# 0x00010f09 E 50 A 2, 0x00010f09
# 0x00010f08 E 50 C    0x00010f09

# Uncomment the following lines if you want
# Memory Threshold alarms to occur based on the
# aggregate utilization of all the instances
# 0x00010f0a      R Aprisma.EventPairTimeAttr, 0x00010f0b, "0x00010f0c -:-", 0x12bcf
# 0x00010f0c E 50 A 2, 0x00010f0c
# 0x00010f0b E 50 C    0x00010f0c
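
The uncommenting step can be scripted. The following is a minimal Python sketch, not a CA-supplied tool. The function name uncomment_aggregate_entries is hypothetical, and the sketch assumes the commented entries look exactly like the lines above (a "#" prefix followed by one of the event codes 0x00010f07 through 0x00010f0c). It only transforms text; backing up EventDisp and writing the result back are left to you.

```python
import re

# Event codes of the aggregate CPU and memory entries shown above.
AGGREGATE_CODES = {
    "0x00010f07", "0x00010f08", "0x00010f09",
    "0x00010f0a", "0x00010f0b", "0x00010f0c",
}

# A commented-out entry: "#", optional whitespace, then an 8-digit hex event code.
_COMMENTED_ENTRY = re.compile(r"^#\s*(0x[0-9a-fA-F]{8})\b")


def uncomment_aggregate_entries(text: str) -> str:
    """Return text with the commented aggregate entries uncommented."""
    out = []
    for line in text.splitlines():
        match = _COMMENTED_ENTRY.match(line)
        if match and match.group(1).lower() in AGGREGATE_CODES:
            # Drop the leading "#" and the whitespace that follows it.
            line = line[match.start(1):]
        out.append(line)
    # Preserve a trailing newline if the input had one.
    return "\n".join(out) + ("\n" if text.endswith("\n") else "")
```

Descriptive comment lines and entries for other event codes are left untouched, so the individual-instance entries in the top half of the file are not affected.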

From Spectrum 10.2 onward, CA Spectrum monitors only the aggregate value of a device's CPU and memory out of the box. The event configuration for individual instances has been removed, and the <SPECROOT>/SS/CsVendor/Cabletron/EventDisp file has the following entries:

# CPU Threshold alarms to occur based on the
# aggregate utilization of all the instances
0x00010f07      R Aprisma.EventPairTimeAttr, 0x00010f08, "0x00010f09 -:-", 0x12bce
0x00010f09 E 50 A 2, 0x00010f09
0x00010f08 E 50 C    0x00010f09

# Memory Threshold alarms to occur based on the
# aggregate utilization of all the instances
0x00010f0a      R Aprisma.EventPairTimeAttr, 0x00010f0b, "0x00010f0c -:-", 0x12bcf
0x00010f0c E 50 A 2, 0x00010f0c
0x00010f0b E 50 C    0x00010f0c

If you run CA Spectrum 10.2 or later and want to monitor individual CPU and memory instances, re-add the following event configuration for individual instances to <SPECROOT>/custom/Events/EventDisp. Then reload the event configuration from the OneClick Console: select the VNM model -> Information tab -> SpectroSERVER Control -> [Update Event Configuration] button.

# Comment out the following lines if you do not want
# CPU Threshold alarms to occur for individual instances
0x00010f01      R Aprisma.EventPairTimeAttr, 0x00010f02, "0x00010f03 -:-", 0x12bce
0x00010f03 E 50 A 2, 0x00010f03
0x00010f02 E 50 C    0x00010f03
 
# Comment out the following lines if you do not want
# Memory Threshold alarms to occur for individual instances
0x00010f04      R Aprisma.EventPairTimeAttr, 0x00010f05, "0x00010f06 -:-", 0x12bcf
0x00010f06 E 50 A 2, 0x00010f06
0x00010f05 E 50 C    0x00010f06
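
If you maintain the custom EventDisp file with a script, the re-add step can be made idempotent so that running it twice does not duplicate entries. The sketch below is an illustration only: the names INDIVIDUAL_ENTRIES and add_individual_entries are hypothetical, and it assumes that an event code is "already mapped" when an uncommented line starts with that code. It works on the file contents as a string and does not touch the file system or the SpectroSERVER.

```python
# The six individual-instance entries listed above, without the comment lines.
INDIVIDUAL_ENTRIES = [
    '0x00010f01      R Aprisma.EventPairTimeAttr, 0x00010f02, "0x00010f03 -:-", 0x12bce',
    '0x00010f03 E 50 A 2, 0x00010f03',
    '0x00010f02 E 50 C    0x00010f03',
    '0x00010f04      R Aprisma.EventPairTimeAttr, 0x00010f05, "0x00010f06 -:-", 0x12bcf',
    '0x00010f06 E 50 A 2, 0x00010f06',
    '0x00010f05 E 50 C    0x00010f06',
]


def add_individual_entries(custom_eventdisp: str) -> str:
    """Append each entry whose event code is not yet mapped in the given text."""
    # Collect the event codes of all uncommented, non-blank lines.
    mapped = {
        line.split()[0].lower()
        for line in custom_eventdisp.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    }
    missing = [e for e in INDIVIDUAL_ENTRIES if e.split()[0].lower() not in mapped]
    if not missing:
        return custom_eventdisp
    prefix = custom_eventdisp
    if prefix and not prefix.endswith("\n"):
        prefix += "\n"
    return prefix + "\n".join(missing) + "\n"
```

After writing the result to the custom EventDisp file, the event configuration still has to be reloaded as described above.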

 

Environment:
Spectrum on all OS platforms
Cause:

Monitoring of individual CPU and memory instances raises an alarm only on the instance that violates the threshold first (if there are multiple instances). If another instance exceeds the threshold while the High CPU/Memory Utilization alarm still exists on the first violating instance, the newly violating instance does not receive an alarm. It receives an alarm only if no other instance is in violation at that time. The monitoring works this way to avoid introducing performance issues when it is enabled for all instances at once.
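
The behavior described above can be illustrated with a small toy model. This is not how the SpectroSERVER is implemented; it is a sketch that only encodes the rule stated in this article (a violation raises an alarm only when no other instance is in violation). The function name and the event format are assumptions made for the illustration.

```python
def simulate_default_alarming(events):
    """Replay (instance, state) events, where state is "violate" or "clear".

    Returns the instances that received an alarm, in order.
    """
    violating = set()  # instances currently above their threshold
    alarms = []
    for instance, state in events:
        if state == "violate":
            if not violating:
                # No other instance is in violation, so this one is alarmed.
                alarms.append(instance)
            violating.add(instance)
        elif state == "clear":
            violating.discard(instance)
    return alarms


events = [
    (1, "violate"),  # first violation: alarmed
    (2, "violate"),  # instance 1 is still in violation: no alarm
    (1, "clear"),
    (2, "clear"),
    (3, "violate"),  # nothing else in violation: alarmed
]
print(simulate_default_alarming(events))  # [1, 3]
```

In this model, instance 2 never receives an alarm even though it exceeded its threshold, which is the gap that Watches are meant to close.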

To monitor individual CPU and memory instances regardless of any existing violating instance, you need to create Watches manually.  This lets you enable the functionality in a controlled manner and limits the risk of introducing major performance issues on the SpectroSERVER.

 

Resolution:

To monitor individual CPU and/or memory instances regardless of any existing violating instance, create a Watch from the "Thresholds and Watches" section of the device.  For CPU, create a Watch using the NRM_DevicCPUUUtilization (0x12aaa) attribute.  For memory, create a Watch using the NRM_DeviceMemoryUtilization (0x12ac6) attribute.  These are instanced attributes, so you need to either create the Watch for all instances (by appending .#) or specify a single instance at the end of the attribute (by appending .<instanceid>).

For example, if you want to create a Watch to monitor all instances of CPU, the expression would be:

NRM_DevicCPUUUtilization.#

For example, if you want to create a Watch to monitor CPU instance 768, the expression would be:

NRM_DevicCPUUUtilization.768
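
If you generate Watch expressions for many attributes or instances, the two forms above can be built with a small helper. This is a sketch only: the function name watch_expression is hypothetical, it assumes integer instance IDs as in the example above, and it only builds the expression text. Creating the Watch itself is still done in the "Thresholds and Watches" section of the device.

```python
from typing import Optional


def watch_expression(attribute: str, instance_id: Optional[int] = None) -> str:
    """Build '<attribute>.#' for all instances or '<attribute>.<id>' for one."""
    if instance_id is None:
        # ".#" selects every instance of the attribute.
        return f"{attribute}.#"
    if instance_id < 0:
        raise ValueError("instance_id must not be negative")
    return f"{attribute}.{instance_id}"


print(watch_expression("NRM_DeviceMemoryUtilization"))       # NRM_DeviceMemoryUtilization.#
print(watch_expression("NRM_DeviceMemoryUtilization", 768))  # NRM_DeviceMemoryUtilization.768
```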

Additional Information:

Enabling Watches puts a large resource burden on the SpectroSERVER. If you create the Watch as "Active" by default, saving the Watch may cause a performance issue.  It is better to create it as "Inactive" by default and then activate it in a controlled manner (you can do this by multi-selecting the models and using the attribute editor to activate the Watch, or by creating a Policy in Policy Manager to activate it).


Watches are created per model type, so you need to create the Watch on each model type that you want to monitor.

If you use the all-instance specification (.#), you need to set the event to generate a unique alarm for each event in the Event Configuration GUI.  This setting is under Alarm Details -> Alarm Options -> "Generate a Unique Alarm for Each Event".  If you do not do this and multiple instances violate the threshold at the same time, you get only one (1) alarm, for the first instance that violated.

Watch creation information is available in the CA Spectrum online documentation.  As of the time of this writing, the link is available here:

https://docops.ca.com/ca-spectrum/10-2-1/en/managing-network/watches/create-and-edit-a-watch

Information on generating a unique alarm for each event is here:

https://docops.ca.com/ca-spectrum/10-2-1/en/managing-network/event-configuration/working-with-events-and-alarms