Troubleshooting Large Delta polling errors on statistics elements

Document ID : KB000023365
Last Modified Date : 14/02/2018

Problem:

Large Delta errors appear in system.log.

 

Environment:

eHealth on Solaris, Windows and Linux; all versions

 

Cause:

A Large Delta error is reported when the difference between the counter value from the previous poll and the counter value from the current poll reaches or exceeds 50% of the counter's maximum value. For a 32-bit counter, the error is reported when this difference is 2^31 (2147483648) or greater. For a 64-bit counter, the error is reported when this difference is 2^63 (9223372036854775808) or greater.
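The threshold rule above can be illustrated with a short Python sketch. It is not eHealth's implementation: it assumes the delta is taken modulo the counter width (so an ordinary wrap yields a small delta), and the function names are hypothetical.

```python
def counter_delta(previous, current, bits=32):
    """Delta between two counter samples, modulo the counter width (assumed)."""
    modulus = 1 << bits
    return (current - previous) % modulus


def is_large_delta(previous, current, bits=32):
    """True when the delta reaches or exceeds half of the counter range."""
    half_range = 1 << (bits - 1)  # 2^31 for 32-bit, 2^63 for 64-bit
    return counter_delta(previous, current, bits) >= half_range


# Ordinary growth between two polls: small delta, no error.
print(is_large_delta(1000, 5000))
# A counter that went backwards by 1: the modular delta is 2^32 - 1.
print(is_large_delta(5000, 4999))
```

Under this assumption, a counter that decrements even slightly produces a delta close to the full counter range, which is why a broken counter shows up as a Large Delta error.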

 

Resolution:

      Instructions on tracking down the cause:

There are four possible causes for this type of error message. The most common cause is a broken counter. The steps below are designed to help you find the cause that applies to your case; work through them in order.

The four possible causes are:

 1. Index Shift

 The index has shifted on the device, and eHealth is now polling a port that is incorrect for that element type. Rediscover the device and review the discover log to see whether this element is updated. If there are no changes to this element, go to step #2.

 

2. Fast Network Link

 The link is so fast (> 10MB) that the counters wrap within the polling interval. Switch the element to Fast Polling. Sometimes a device (such as a core network router) carries so much traffic that normal counter growth appears to be a problem. Fast polling shortens the time the counter has to change between polls and therefore reduces the likelihood that link speed is the cause. If the Large Delta messages continue, go to step #3.
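 As a rough illustration of why link speed matters, the following Python sketch estimates how long a counter takes to advance by half of its range. It assumes the counter counts octets and that the link runs at full line rate; the function name and the example speeds are hypothetical, not eHealth settings.

```python
def seconds_to_half_range(link_bits_per_second, bits=32):
    """Seconds for an octet counter to advance by half its range at line rate."""
    half_range_octets = 1 << (bits - 1)
    octets_per_second = link_bits_per_second / 8
    return half_range_octets / octets_per_second


# Example link speeds (assumed values, for illustration only).
for label, speed in [("10 Mbit/s", 10e6), ("100 Mbit/s", 100e6), ("1 Gbit/s", 1e9)]:
    print(f"{label}: {seconds_to_half_range(speed):.1f} s")
```

 If the result is shorter than the polling interval in use, a fully loaded link can produce a Large Delta error without any fault on the device, and a shorter polling interval is the appropriate fix.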

 

3. The device may not be certified, or may not be certified for these specific statistics.

 Check the Concord Web Site to confirm that the device is certified. If the device is not certified, you may request certification through the Concord Web Site. If the device is certified, go to step #4.

 

4. MIB Agent Failure

 The MIB agent on the device in question is decrementing the counter. Counters should never decrement, so this is considered a software bug in the vendor's software. To verify a broken (decrementing) counter, run the following command from $NH_HOME as $NH_USER and redirect the results to a file:

 Command syntax (the -s switch is the polling interval in seconds):

 nhSnmpTool -n -c <comm_string> -POLLsa -s 60 -o <Object_IDentifier> <ip_address> > <OutPutFileName>

 

Let the command run for about an hour, then press CTRL-C or CTRL-Z to stop it; the command polls the device for the specified OID every 60 seconds until it is interrupted. The output contains the counter values polled directly from the device's MIB agent, which documents the decrementing counters and eliminates eHealth as a variable. You can use this information when contacting the device manufacturer. If you require any assistance with this process, please contact Concord Technical Support.
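Once the counter values have been collected, a short script can help spot the decrements. The Python sketch below makes no assumption about the nhSnmpTool output format: it expects the counter values to have been extracted into a list already, in polling order. The function name and sample values are hypothetical. A drop is flagged only when the modular delta is at least half the counter range, so an ordinary wrap is not reported.

```python
def find_suspect_samples(samples, bits=32):
    """Return (index, previous, current) for samples that look like decrements."""
    modulus = 1 << bits
    half_range = modulus >> 1
    suspects = []
    for i in range(1, len(samples)):
        previous, current = samples[i - 1], samples[i]
        # A lower value with a small modular delta is a plausible wrap;
        # a lower value with a huge modular delta suggests a decrement.
        if current < previous and (current - previous) % modulus >= half_range:
            suspects.append((i, previous, current))
    return suspects


# Hypothetical values copied by hand from the output file.
samples = [1000, 1600, 2300, 2250, 2900]
print(find_suspect_samples(samples))  # [(3, 2300, 2250)]
```

Each flagged entry gives the position of the sample together with the previous and current values, which can be quoted when reporting the problem to the device manufacturer.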