Bad Polls Vs. All Errors

Document ID : KB000019335
Last Modified Date : 14/02/2018

Description:

The OCE 'Bad Polls' and 'All Errors' counts do not always match. The 'All Errors' count is produced by processing several different kinds of polling errors, whereas the 'Bad Polls' count is simply the total number of elements that did not get a good poll status. The two counts come from different data sources, and different SQL queries are used to retrieve them from the database. In some situations, such as when the element configuration is corrupted, the 'All Errors' count is affected and does not represent the actual number of errors.

Solution:

A technical explanation follows.

NOTE 1: OCE users should remember that the 'All Errors' list shows only the errors that occurred on the last poll, whereas messages.stats.log contains errors from many polls. It would be a mistake to compare the total number of errors logged in the file after a certain poll with the number shown on the 'All Errors' page. Also, certain errors are aggregated in the messages.stats.log file, so that only one message per device is logged.

NOTE 2: As with any software product, there might be issues with the 'Bad Polls' and 'All Errors' values. However, the log files and DB data must always be investigated to confirm whether there is an actual error. For example, a possible root cause of incorrect 'All Errors' counts is a broken element configuration (invalid parent-child associations). The nh_element, NH_MTF_VARIABLE, and nh_elem_assoc tables are involved in processing these values.

NOTE 3: OCE users should take care when comparing the 'All Errors' count of the OCE tree item with the 'Bad Polls' count on the Polling Status page. The Polling Status page is updated automatically, whereas the 'All Errors' tree item has to be refreshed manually.

  1. Per PRD54499, OCE uses different DB entities to retrieve 'All Errors' and 'Bad Polls':

    • The 'All Errors' count is retrieved from the NHV_STATS_POLL_ERROR_ELEMENT view;

    • The 'Bad Polls' count is retrieved from the NH_STATS_POLL_HISTORY table.

  2. Customers should pay attention to which counts are actually shown on the OCE Polling Status page and on the 'All Errors' page. The Bad/Good polls number shown near the Y axis of the corresponding polling chart is the MAXIMUM of all the poll counts currently shown on the Polling Status page. That is, the actual bad poll count of the last poll can differ from the number shown at the left of the Bad Polls chart.

    Example: assume that 10 polls have completed. The number near the Y axis will be the maximum of the bad poll counts of all 10 completed polls. This is explained in detail in PRD48389.

  3. The NHV_STATS_POLL_ERROR_ELEMENT view is defined so that if the LOCAL_COLLECTION_TIME field of the base table NH_STATS_POLL_INFO is ZERO, the table entry is not included in the view, i.e. it will not be shown on the 'All Errors' page. See the view definition code. LOCAL_COLLECTION_TIME can be ZERO in certain cases, and in general this means that the element has never been polled. If the LOCAL_COLLECTION_TIME value is greater than ZERO, it represents the time when polling started.

    To find out why LOCAL_COLLECTION_TIME is ZERO, the log files must be investigated and the element configuration checked.
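
The behaviour described in items 2 and 3 can be illustrated with a minimal Python sketch. It is not OCE code and does not reproduce the actual view definition. The sample data is made up, ELEMENT_ID is a hypothetical column name used only for illustration, and both helper functions are assumptions that mimic the rules stated above: the Y-axis number is the maximum over the displayed polls, and entries with a ZERO LOCAL_COLLECTION_TIME are left out of 'All Errors'.

```python
# Illustrative sketch only: sample values are invented, and ELEMENT_ID is a
# hypothetical column name. This is not the product's SQL or view definition.

def y_axis_label(bad_poll_counts):
    """Return the number shown near the Y axis: the maximum over the displayed polls."""
    return max(bad_poll_counts)


def rows_visible_in_all_errors(poll_info_rows):
    """Keep only rows with a non-zero LOCAL_COLLECTION_TIME, mimicking the view's filter."""
    return [row for row in poll_info_rows if row["LOCAL_COLLECTION_TIME"] != 0]


# Bad poll counts for 10 completed polls, oldest first (hypothetical values).
bad_counts = [3, 7, 12, 5, 4, 4, 9, 2, 6, 5]
print("Number near the Y axis:", y_axis_label(bad_counts))
print("Bad polls on the last poll:", bad_counts[-1])

# Hypothetical NH_STATS_POLL_INFO entries; element 102 has never been polled.
rows = [
    {"ELEMENT_ID": 101, "LOCAL_COLLECTION_TIME": 1518600000},
    {"ELEMENT_ID": 102, "LOCAL_COLLECTION_TIME": 0},
    {"ELEMENT_ID": 103, "LOCAL_COLLECTION_TIME": 1518600300},
]
visible = rows_visible_in_all_errors(rows)
print("Elements listed in 'All Errors':", [row["ELEMENT_ID"] for row in visible])
```

With this sample data the Y-axis number is larger than the last poll's bad count, and the never-polled element is missing from the 'All Errors' list. Both effects can make the two counts look inconsistent even when nothing is wrong.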