What should a Live Exceptions baseline rule threshold be set to?

Document ID : KB000023518
Last Modified Date : 14/02/2018


How should I set my Live Exceptions baseline rule threshold?



To determine the proper threshold settings for baseline rules, users can run a report against the rule and element to see what the data normally looks like. This allows more useful thresholds to be set so that these types of rules trigger alarms when they should. To do this, please complete the following:


Step 1 - Create the rules you need with any threshold value. A guess is fine at this point. The goal here is simply to create the rules and apply them to the elements. Once this is done, allow enough time for the Live Exceptions Baseline job to calculate the baseline so that the baseline data exists in the database. By default this job runs once a day at 7 PM. Check the Scheduled Jobs UI in your eHealth console for more information on when this job runs.


Step 2 - Using some additional information that must be gathered first, we can get a version of the Alarm Detail Report that is normally not available for baseline-based rules unless an alarm is already active. To see this report without using the command line, right-click any active alarm in LE and choose the 'Trend Report' option, which runs the 'Alarm Detail' version of a Standard Trend Report. Run one to see what it looks like, and note that it shows not just the data involved in the rule violation but also the threshold set in the rule. The command to get this report outside of the LE browser is:


nhReport -rptType drilldown -rptName LiveExAlarmDetail -subjType element -subjName elementName -elemType elementType -fromDate mm/dd/yy -fromTime 0:00 -toDate mm/dd/yy -toTime 23:59:59 -screen -ascii liveEx1.csv -pdf liveEx1.pdf -liveExProfile profileID -liveExAlarm alarmRuleID


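The placeholder values in this command (elementName, elementType, mm/dd/yy, liveEx1.csv, liveEx1.pdf, profileID, and alarmRuleID) are items whose values you need to know before running the command. To determine these values: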


A. elementName = The name of the element in question.


B. elementType = To determine this, we first need to know the valid element types for the -elemType flag. These are listed by running this command:


$NH_HOME/bin/nhListElements -showTypes


The next step in determining the element type is to find out which MTF file the element uses. To do this, run this command:


$NH_HOME/bin/nhExportConfig -groupType <groupType> -subjName <groupName> -dciOut <dciOutFile>

Where <groupType> is a valid groupType. Find these by running the nhExportConfig command with the -h option.
Where <groupName> is the name of the group the element in question is a member of.
Where <dciOutFile> is the name of the output file to write the data to. This can be anything you want.


Now find the element in question in <dciOutFile> and note which MTF it uses. Then open that MTF file in the $NH_HOME/poller directory and look for the value of elementType. Use this value for the -elemType flag in the nhReport command.


C. mm/dd/yy is the date that the data in the report should represent.


D.  liveEx1.csv is the name of the ASCII version of the report created. It can be modified to have any name desired.


E. liveEx1.pdf is the name of the PDF version of the report created. It can be modified to have any name desired.


F. profileID is the database ID of the profile that contains the rule for which you are trying to determine an appropriate threshold. To find it, run this command:


$NH_HOME/bin/nhDebugLiveEngine -filename <filename> -profile all

Where <filename> can be whatever you want.


Review the output file and you will see entries like these:

profileID: 1000005, name: _SystemTestRules

In this case 1000005 is the profileID for my _SystemTestRules profile.
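When the output file lists many profiles, the lookup can be scripted. The following is a minimal Python sketch, not part of eHealth, that assumes every profile entry in the file has exactly the form shown above ("profileID: <number>, name: <profile name>"). The function name find_profile_ids is hypothetical.

```python
import re

# Assumption: profile entries look exactly like the sample in this article,
# e.g. "profileID: 1000005, name: _SystemTestRules".
PROFILE_LINE = re.compile(r"^\s*profileID:\s*(\d+),\s*name:\s*(.+?)\s*$")


def find_profile_ids(text):
    """Return a {profile name: profileID} mapping found in the dump text."""
    profiles = {}
    for line in text.splitlines():
        match = PROFILE_LINE.match(line)
        if match:
            profiles[match.group(2)] = int(match.group(1))
    return profiles


if __name__ == "__main__":
    sample = "profileID: 1000005, name: _SystemTestRules\n"
    print(find_profile_ids(sample))
```

In practice you would read the contents of <filename> and pass them to find_profile_ids instead of the sample string.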

G. alarmRuleID is the database ID of the rule for which you are trying to determine an appropriate threshold. To find it, run this command:

nhDebugLiveEngine -filename <filename> -profile <profileID> -detail

Where <filename> can be whatever you want.
Where <profileID> is the ID found in item F.

Review the alarm rules returned. It is easiest to identify the rule by its rule message, which is labeled description in the nhDebugLiveEngine command output. Here is an example:

alarmId: 1000020, profileID: 1000005, active: Yes, elementType: system, severity: warning, disabled: No, description: An abnormal condition has been detected, ruleType: Event, duration: 0, analysisWindow: 0

In this case my alarmId is 1000020, which is the value I would use for alarmRuleID in the nhReport command.
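If a profile contains many rules, the matching alarmId can be picked out by searching on the description text. The Python sketch below is an illustration only. It assumes each rule is written on one line as comma-separated "key: value" pairs, as in the example above, and that the description itself does not contain text of the form ", word: ". The function names parse_alarm_rules and find_alarm_ids are hypothetical.

```python
import re

# Assumption: one rule per line, "key: value" pairs separated by ", ".
# A value ends where the next ", key: " begins or at the end of the line.
FIELD = re.compile(r"(\w+):\s*(.*?)(?=,\s*\w+:\s|$)")


def parse_alarm_rules(text):
    """Return one dict of fields per line that starts with 'alarmId:'."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("alarmId:"):
            rules.append(dict(FIELD.findall(line)))
    return rules


def find_alarm_ids(text, description_part):
    """Return the alarmId of every rule whose description contains the text."""
    return [
        int(rule["alarmId"])
        for rule in parse_alarm_rules(text)
        if description_part in rule.get("description", "")
    ]


if __name__ == "__main__":
    sample = (
        "alarmId: 1000020, profileID: 1000005, active: Yes, "
        "elementType: system, severity: warning, disabled: No, "
        "description: An abnormal condition has been detected, "
        "ruleType: Event, duration: 0, analysisWindow: 0"
    )
    print(find_alarm_ids(sample, "abnormal condition"))
```

Searching on a distinctive part of the rule message keeps the lookup short. If several rules share the same message, all of their IDs are returned and you will need another field, such as severity, to tell them apart.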

Armed with this information, once a baseline-based rule is applied to an element and the Live Exceptions Baseline job has run, you should be able to get a report that shows what the data normally looks like and therefore what threshold values would be needed to trigger the alarm.
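Once items A through G are known, the nhReport command can be assembled programmatically to avoid typing mistakes. The Python sketch below only builds and prints the command and does not run it. The flags are copied from the nhReport command shown in Step 2. The function name build_nhreport_command and the element name myRouter-cpu-1 are hypothetical, and the IDs are the sample values from items F and G.

```python
import shlex


def build_nhreport_command(element_name, element_type, report_date,
                           profile_id, alarm_rule_id,
                           ascii_file="liveEx1.csv", pdf_file="liveEx1.pdf"):
    """Return the nhReport argument list for one element and one day.

    report_date is a string in mm/dd/yy form and is used for both
    -fromDate and -toDate, as in the command shown in Step 2.
    """
    return [
        "nhReport",
        "-rptType", "drilldown",
        "-rptName", "LiveExAlarmDetail",
        "-subjType", "element",
        "-subjName", element_name,
        "-elemType", element_type,
        "-fromDate", report_date,
        "-fromTime", "0:00",
        "-toDate", report_date,
        "-toTime", "23:59:59",
        "-screen",
        "-ascii", ascii_file,
        "-pdf", pdf_file,
        "-liveExProfile", str(profile_id),
        "-liveExAlarm", str(alarm_rule_id),
    ]


if __name__ == "__main__":
    # Hypothetical element name; IDs are the sample values from this article.
    cmd = build_nhreport_command("myRouter-cpu-1", "system", "02/14/18",
                                 1000005, 1000020)
    print(shlex.join(cmd))  # shlex.join needs Python 3.8 or later
```

On an eHealth system the returned list could be passed to subprocess.run to execute the report. Keeping the arguments as a list avoids quoting problems with element names that contain spaces.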