QUESTION: Some alarms are mission critical to me, so I would like to resend them via email every few hours while the alarm persists, or even recreate them every few hours to alert administrators. This resending/recreating should continue until the problem is fixed. How do I configure Spectrum to reassert an alarm periodically until the issue is resolved?
ANSWER: Customize the event that generates the alarm so that it does the following, in the order specified (the order matters for this to work correctly):
1. Clear the alarm that it generates
2. Assert an alarm
3. Run an Event Pair rule that looks for the clear event within X seconds and, if it is not seen, generates the alarm event again
Here is an example:
0xfff00000 E 0 C 0xfff00000 A 3,0xfff00000 R CA.EventPair, 0xfff00001, "0xfff00000 -:-", 60
0xfff00001 E 0 C 0xfff00000
In the above example, event 0xfff00000 first CLEARS any alarm with a Probable Cause ID of 0xfff00000. It then generates a Critical alarm with a Probable Cause ID of 0xfff00000. Finally, it instantiates an Event Pair rule that waits for event 0xfff00001 to occur within 60 seconds. If event 0xfff00001 does not occur within 60 seconds of event 0xfff00000, the rule generates another 0xfff00000 event, and the cycle starts again.
Event 0xfff00001 clears the alarm with a Probable Cause ID of 0xfff00000 and also stops the Event Pair rule started by event 0xfff00000.
[Screenshot: events generated on the model by the above event configuration]
Notice that 60 seconds after the first 0xfff00000 event is logged with a Critical alarm, a new 0xfff00000 event is logged, which clears the previous alarm for event 0xfff00000 AND asserts a new Critical alarm. This repeats for several minutes. Then, once the 0xfff00001 event is logged, no more 0xfff00000 events are generated.
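To make the timing concrete, the following Python sketch models the cycle described above. It is only a simplified simulation, not Spectrum code: the function name `alarm_event_times` and its parameters are hypothetical, and it assumes the regenerated event is logged exactly when the window expires and that a clear event arriving exactly at the deadline counts as "within" the window.

```python
from typing import List, Optional


def alarm_event_times(start: int, interval: int,
                      clear_at: Optional[int], horizon: int) -> List[int]:
    """Return the times (in seconds) at which the alarm event is logged.

    start    -- time of the first alarm event
    interval -- Event Pair window in seconds (60 in the example)
    clear_at -- time the clear event is logged, or None if it never arrives
    horizon  -- stop simulating after this time
    """
    times = [start]
    while True:
        deadline = times[-1] + interval
        # Clear event seen inside the window: the rule stops, nothing more is logged.
        if clear_at is not None and clear_at <= deadline:
            break
        # End of the simulated period.
        if deadline > horizon:
            break
        # No clear event seen: the rule generates the alarm event again.
        times.append(deadline)
    return times


# Alarm event at t=0, 60-second window, clear event logged at t=250.
print(alarm_event_times(0, 60, 250, 600))  # [0, 60, 120, 180, 240]
```

In this model each entry in the returned list corresponds to one clear-and-reassert of the Critical alarm, and the list ends as soon as the clear event falls inside the current window.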
If you are using SANM to send e-mail notification for alarms, a new e-mail will be sent for each 0xfff00000 Critical alarm.
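For the "every few hours" case in the question, the only value that should need to change is the last field of the Event Pair rule, which the example sets to 60 seconds. The Python sketch below simply renders the two entries in the same layout as the example for a chosen interval. The helper `build_event_disp` is hypothetical and not part of Spectrum; it assumes, as in the example, that the Probable Cause ID equals the alarm event code, and the source only demonstrates a 60-second window, so test a longer value in your own environment.

```python
from typing import List


def build_event_disp(alarm_event: int, clear_event: int,
                     severity: int, interval_seconds: int) -> List[str]:
    """Render the alarm and clear entries in the layout used by the example."""
    alarm = f"0x{alarm_event:08x}"
    clear = f"0x{clear_event:08x}"
    # Order matters: clear the alarm, assert it, then start the Event Pair rule.
    alarm_line = (
        f"{alarm} E 0 C {alarm} A {severity},{alarm} "
        f'R CA.EventPair, {clear}, "{alarm} -:-", {interval_seconds}'
    )
    # The clear event only clears the alarm raised by the alarm event.
    clear_line = f"{clear} E 0 C {alarm}"
    return [alarm_line, clear_line]


# Severity 3 (Critical, as in the example), reasserted every 4 hours (14400 seconds).
lines = build_event_disp(0xFFF00000, 0xFFF00001, severity=3,
                         interval_seconds=4 * 60 * 60)
for line in lines:
    print(line)
```

With a 4-hour interval, SANM would send one e-mail per reasserted alarm, roughly every 4 hours, until the clear event is logged.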