Bandwidth utilization is exceeding 100%

Document ID : KB000023506
Last Modified Date : 14/02/2018

Question: 

Why is my report showing Bandwidth Utilization exceeding 100%?

 

Causes: 

  • The element is either Frame-Relay or an ATM subinterface
  • The element's speed is incorrect in the eHealth poller, or there was a recent change to the element's speed
  • The element is full duplex but is discovered and treated as half duplex
  • The device's SNMP counters were reset between polls

 

Environment:  

eHealth 6.2.x and above
All supported platforms

 

Answer:

  1. If the element is Frame-Relay or an ATM subinterface, it can exceed 100% utilization.  Please see TEC513327 - Frame Relay or ATM subtinterface Bandwidth Utilization exceeds 100% for more details.

  2. The bandwidth utilization calculation uses the element's speed in its equation, so if the configured speed is too low, the result will be too high.  Check the speed set in the poller config GUI against the element's actual speed.  Also, if the element's speed changed recently, some of the older data may be reported incorrectly.  The eHealth database does not store a history of speed values: every element has only one speed value, so after a change the equation applies the new speed value to older polled data as well.

  3. If the element is truly full duplex but is calculated as half duplex, the result will be too high. Run a Trend report for the device showing bandwidth utilization in, out, and total. If in plus out equals total, the element is being treated as half duplex. If total equals the average of in and out, it is being treated as full duplex.

    If the element is full duplex but is being discovered as half duplex, see TEC513511 - An element's operational speed characteristics are incorrectly set to 'half-duplex' instead of 'full-duplex'

    For an explanation of full duplex bandwidth calculations see TEC519832 - The total bandwidth calculation for a full duplex line

  4. The device counters have been reset.


As a simplified illustration, assume that the device MIB counters only go up to 1000, rather than 2^32 or 2^64. eHealth polls for values every 5 minutes by default. If the counter was at 10 on one poll and 50 on the next, eHealth would calculate that 40 occurred during that time frame. If the counter was at 970 on one poll and 10 on the next, eHealth would assume the counter wrapped and again calculate that 40 occurred during that time frame: (1000 - 970) + 10 = 40.

Now assume that the counter was at 600 on one poll. The counter then reaches 670 but is reset to 0, and at the next poll it is at 10. eHealth calculates (1000 - 600) + 10 = 410, since it has no way to know that the counter was reset, whereas the actual count for the interval was (670 - 600) + 10 = 80.
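
The arithmetic in the two examples above can be sketched in a few lines of Python. This is only an illustration of the calculation as described in this article, not eHealth's actual implementation; the function name counter_delta and the counter_max parameter are hypothetical, and the counter limit of 1000 is the simplified value assumed above.

```python
def counter_delta(previous: int, current: int, counter_max: int = 1000) -> int:
    """Delta between two polls; a drop in value is assumed to be one wrap.

    Hypothetical helper that mirrors the article's arithmetic.
    """
    if current >= previous:
        return current - previous
    # The counter went backwards, so assume it rolled over counter_max once.
    return (counter_max - previous) + current


normal = counter_delta(10, 50)          # counter simply increased
wrapped = counter_delta(970, 10)        # genuine wrap: (1000 - 970) + 10
after_reset = counter_delta(600, 10)    # reset mistaken for a wrap: (1000 - 600) + 10
actual_after_reset = (670 - 600) + 10   # what really occurred in the reset interval

print(normal, wrapped, after_reset, actual_after_reset)
```

The reset case produces a delta roughly five times the real count, which is how a single reset between polls can push the calculated utilization for that interval above 100%.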

If the delta between the first and second polled values is greater than half of the counter's range (for example, (2^32)/2, or 2^31, for a 32-bit counter), eHealth displays a large delta message and drops the polled value.  If the delta is less than this, eHealth populates the database with the calculated values.  So if the counters were reset when the value was greater than 2^31, the inflated delta falls below the threshold and is stored, and it is possible to see a skew in the datapoint.
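
The large delta check can be illustrated the same way for a 32-bit counter. The sketch below is an assumption-laden illustration rather than eHealth code: the names wrapped_delta and handle_delta are hypothetical, the polled values are made up for the example, and a delta exactly equal to the threshold is treated as stored because the article does not specify that case.

```python
COUNTER_BITS = 32                          # assumed 32-bit counter
COUNTER_RANGE = 2 ** COUNTER_BITS
LARGE_DELTA_THRESHOLD = COUNTER_RANGE // 2  # 2^31


def wrapped_delta(previous: int, current: int) -> int:
    """Delta between polls; a drop in value is assumed to be one wrap."""
    if current >= previous:
        return current - previous
    return (COUNTER_RANGE - previous) + current


def handle_delta(delta: int) -> str:
    """Hypothetical stand-in for the large delta check described above."""
    return "dropped" if delta > LARGE_DELTA_THRESHOLD else "stored"


# Counter reset while below 2^31: the bogus delta is large and gets dropped.
low_reset = wrapped_delta(1_000_000_000, 1_000_000)
# Counter reset while above 2^31: the bogus delta is small enough to be stored.
high_reset = wrapped_delta(3_500_000_000, 1_000_000)

print(low_reset, handle_delta(low_reset))
print(high_reset, handle_delta(high_reset))
```

In the second case the stored delta is far larger than the traffic that actually occurred after the reset, which is the skewed datapoint described above.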