Alarm Notifier SANM Fault Tolerant

Document ID : KB000099723
Last Modified Date : 14/06/2018
Introduction:
Alarm Notifier is not configured to be fault tolerant out of the box. However, you can adjust the Set, Clear, and Update scripts on both the Primary and Secondary servers so that Alarm Notifier works in a fault tolerant environment.

What this will do : 

Whenever an alarm is generated, the model in the DB is checked and the attribute 0x12c0a is read. If it is 10 (Primary server precedence), the Primary Alarm Notifier sends the notification and the Secondary writes a line to the NOTIFIER.OUT log file saying the Primary is running.

If 0x12c0a is 20, the Secondary server's Alarm Notifier sends the mail and the Primary writes a line to the NOTIFIER.OUT file saying the Secondary is running.
Instructions:
1. The first step is to make sure both the Primary and Secondary Alarm Notifiers are configured the same. If you have not yet set up Alarm Notifier, see this Knowledge Article: https://comm.support.ca.com/kb/how-to-configuring-spectrum-alarm-notifier-to-send-email-notifications/kb000021373 

2. Next step is to add a new attribute to the .alarmrc:

Every alarm has an attribute with the associated precedence. On a Primary SpectroSERVER this value is normally "10"; on the Secondary SpectroSERVER it is typically "20". Add this alarm attribute to the .alarmrc of the AlarmNotifier on both the Primary and Secondary servers, so that the attribute is forwarded to the scripts as an environment variable: 

EXTRA_ATTRS_AS_ENVVARS=0x12c0a 
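
To confirm that the attribute actually reaches the scripts after this change, a temporary debug helper along the following lines could be added near the top of a script. This is only a sketch: the function name show_precedence_var is hypothetical and not part of the shipped scripts. It matches the variable name case-insensitively, so it also shows which spelling your platform uses.

```shell
# Hypothetical debug helper (not part of the shipped Alarm Notifier scripts).
# Prints the precedence environment variable, whatever the case of its name,
# or a short message if it was not passed to the script.
show_precedence_var() {
    env | grep -i '^SANM_0X12C0A=' || echo "SANM_0X12C0A is not set"
}
```

Calling show_precedence_var from a script writes one line to that script's standard output. Remove it once you have verified the variable.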

3. Then, add these lines to the Set, Update, and Clear scripts for SANM on both Primary and Secondary. Be sure to add the Primary code to the Primary's scripts and the Secondary code to the Secondary's scripts.

On Primary add this code: 

if [[ "$SANM_0X12C0A" = "20" ]] 
then 
echo "SS Secondary is running" 
echo "Precedence = $SANM_0X12C0A" 
exit 0 
fi 


On Secondary add this code:

if [[ "$SANM_0X12C0A" = "10" ]] 
then 
echo "SS Primary is running" 
echo "Precedence = $SANM_0X12C0A" 
exit 0 
fi 


NOTE: On Windows machines, the attribute code in the variable name is all UPPERCASE: $SANM_0X12C0A. On Linux machines, however, the hexadecimal letters MUST be LOWERCASE: $SANM_0X12c0a. Thus, on a Linux Primary:

if [[ "$SANM_0X12c0a" = "20" ]] 
then 
echo "SS Secondary is running" 
echo "Precedence = $SANM_0X12c0a" 
exit 0 
fi 
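
Because the variable name differs in case between platforms and the comparison value differs between servers, one option is a single guard that can be pasted unchanged into every script. The following is a minimal sketch, not part of Alarm Notifier: LOCAL_PRECEDENCE, get_precedence, and is_active_server are hypothetical names, and it assumes the precedence values 10 and 20 described above.

```shell
# Hypothetical setting: 10 in the Primary's scripts, 20 in the Secondary's.
LOCAL_PRECEDENCE="${LOCAL_PRECEDENCE:-10}"

# Look up the precedence attribute regardless of the case of its name
# (SANM_0X12C0A on Windows, SANM_0X12c0a on Linux).
get_precedence() {
    env | awk -F= 'toupper($1) == "SANM_0X12C0A" { print $2; exit }'
}

# Succeeds (exit status 0) when the alarm's precedence matches this server.
is_active_server() {
    [ "$(get_precedence)" = "$LOCAL_PRECEDENCE" ]
}

# Possible use inside a script:
# if ! is_active_server; then
#     echo "Precedence = $(get_precedence); the other server is active"
#     exit 0
# fi
```

Note one behavioral difference from the snippets above: if the variable is not passed to the script at all, this guard treats the server as inactive and exits, whereas the original checks would continue and send the notification.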


4. Prevent Duplicate emails:

When you start receiving alarms via email, you may see duplicate emails. This is because Alarm Notifier has no built-in check to verify which server, Primary or Secondary, is running. In addition, the Alarm Notifier running on each server connects to its respective Archive Manager. If your Secondary is configured for a warm or hot standby, its Archive Manager may be running. This is normal, but it causes duplicate emails to be sent. 

To work around this, add another if/then statement that checks the precedence. It goes right before the $MAIL line in the scripts, so that sending an email becomes part of the precedence check. Only emails that match the currently active server are then sent. 

Add these lines to EACH script on both Primary and Secondary. The example below is for the PRIMARY; change the precedence to "20" when adding the lines to the Secondary's scripts. This section is near the bottom of the script. Add the new lines right after the "echo_info | tee -i /tmp/set_alarm.$PID" line, which is right above the "$MAIL" line. You must also add a closing "fi" for the new if/then statement; otherwise the script fails with a syntax error. 

Remember the case sensitivity across platforms mentioned above: $SANM_0X12c0a on Linux, $SANM_0X12C0A on Windows.

if [ "$RCVRS" -a "$RCVRS" != " " ] 
then 
    echo " " 
    echo "*******************************************************************" 
    echo "Sending mail to $RECIPIENTS:" 
    echo "" 
    echo "($RCVRS)" 
    echo "*******************************************************************" 
    echo_info | tee -i /tmp/set_alarm.$PID 

    # New check: send mail only when the alarm's precedence matches
    # this server (10 = Primary; use 20 in the Secondary's scripts).
    if [ "$SANM_0X12C0A" = "10" ] 
    then 
        $MAIL -s "A $SEV alarm has occurred on $SERVER (Model Name=$MNAME)(Model Type=$MTYPE)" $RCVRS < /tmp/set_alarm.$PID 
    fi 

    # Remove the temporary file whether or not mail was sent.
    rm -f /tmp/set_alarm.$PID 
else 
    echo " " 
    echo "*****************************************************" 
    echo "NO $RECIPIENTS assigned - no mail sent" 
    echo "*****************************************************" 
    echo_info 
fi 


Next, you can also add echo statements to indicate in the email which server is active. In the example below this would be added to the Secondary scripts only. If this is to be added to Primary scripts, be sure to change the text to "Primary is running". 

echo "Severity: " $SEV 
echo "ProbableCauseID: " $CAUSE 
echo "RepairPerson: " $REPAIRPERSON 
echo "AlarmStatus: $STATUS" 
echo "SpectroSERVER: " $SERVER 
echo "Landscape: " $LANDSCAPE 

echo "SS Secondary is running" 
echo "Precedence = $SANM_0X12C0A" 
echo "ModelHandle: " $MHANDLE 
echo "ModelTypeHandle: " $MTHANDLE 
echo "IPAddress: " $IPADDRESS 
echo "SecurityString: " $SECSTR 
echo "AlarmState: " $ALARMSTATE 
echo "Acknowledged: " $ACKD 
echo "UserClearable: " $CLEARABLE 
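
If you would rather keep these echo lines identical on both servers, the status text can be derived from the precedence value instead of being hard-coded. This is a sketch under the assumption that only the values 10 and 20 described earlier occur; server_label is a hypothetical helper name, not something the shipped scripts define.

```shell
# Hypothetical helper: map a precedence value to a status line.
server_label() {
    case "$1" in
        10) echo "SS Primary is running" ;;
        20) echo "SS Secondary is running" ;;
        *)  echo "Unknown precedence: ${1:-unset}" ;;
    esac
}
```

It would be called as server_label "$SANM_0X12C0A" on Windows or server_label "$SANM_0X12c0a" on Linux, in place of the hard-coded "is running" line.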


5. Lastly, save the files and restart both the Primary and Secondary Alarm Notifiers.
Additional Information:
Resource: https://communities.ca.com/docs/DOC-91641005?et=watches.email.document#jive_content_id_Redundant_AlarmNotifier 

Optional methods: https://communities.ca.com/thread/110961384

Set up Alarm Notifier: https://comm.support.ca.com/kb/how-to-configuring-spectrum-alarm-notifier-to-send-email-notifications/kb000021373