Discovery fails against large MIB data tables

Document ID : KB000010143
Last Modified Date : 14/02/2018
Introduction:

The Data Aggregator (DA) has a default discovery timeout of 15 minutes. If a query against a MIB table does not complete within that 15-minute window, the process halts and discovery of the related component element fails.

This 15-minute window is sufficient in most cases, but some devices take much longer to iterate over a few thousand MIB table entries. This is most commonly seen on devices with hundreds or thousands of interfaces.

In these cases, the timeout can be increased to give a slow-responding device enough time to complete its response.

Background:

This is an undocumented r3.0 feature. It is useful, but it should be used with great care.

The timeout is controlled per Vendor Certification, not per device, Monitoring Profile, or Metric Family. Changing it increases the timeout for all devices that use the Vendor Certification the change is made against.

Environment:

CA Performance Manager releases r3.0 and newer only

Instructions:

To configure an override value for messengerTimeoutPerFacet:

1: Create the file /DA_HOME/IMDataAggregator/apache-karaf*/etc/com.ca.im.item.impl.distributed.DistributedItemRepositoryConfig.cfg. The default location for DA_HOME is /opt.

2: Edit the new file to add a comma-separated list of Vendor Certification qnames and their timeout values, in seconds. All facet overrides must be in the same list on a single line, in the following format:

messengerTimeoutPerFacetOverride=<QNameForFacet1>=<timeoutInSeconds>,<QNameForFacet2>=<timeoutInSeconds> 

For example, to increase the timeout to 30 minutes (1800 seconds) for the IfXTableMib Vendor Certification and to 40 minutes (2400 seconds) for the IfTableMib Vendor Certification:

messengerTimeoutPerFacetOverride={http://im.ca.com/certifications/snmp}IfXTableMib=1800,{http://im.ca.com/certifications/snmp}IfTableMib=2400 

3: Restart the Data Aggregator dadaemon service for the change to take effect.
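Because the override must be a single line with values in seconds, it can be less error-prone to generate the line than to type it by hand. The following is a minimal Python sketch, not part of the product: the function name build_override_line and the minutes-based input are illustrative assumptions, while the property name and qnames are taken from the example in step 2.

```python
# Illustrative helper (not part of CA Performance Manager): builds the
# single-line override property described in step 2.
PROPERTY = "messengerTimeoutPerFacetOverride"


def build_override_line(timeouts_minutes):
    """Return the property line for a {qname: timeout in minutes} mapping."""
    parts = []
    for qname, minutes in timeouts_minutes.items():
        if minutes <= 0:
            raise ValueError(f"timeout for {qname} must be positive")
        # The file expects seconds, so convert minutes to whole seconds.
        parts.append(f"{qname}={int(minutes * 60)}")
    # All facet overrides go in one comma-separated list on one line.
    return f"{PROPERTY}=" + ",".join(parts)


overrides = {
    "{http://im.ca.com/certifications/snmp}IfXTableMib": 30,
    "{http://im.ca.com/certifications/snmp}IfTableMib": 40,
}
line = build_override_line(overrides)
print(line)
```

The printed line can be pasted into the .cfg file created in step 1. It reproduces the example from step 2, with 30 and 40 minutes converted to 1800 and 2400 seconds.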

 

To back out these changes, rename or delete the com.ca.im.item.impl.distributed.DistributedItemRepositoryConfig.cfg file created in step 1, then restart the dadaemon service.
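The rename option of the back-out can be sketched in Python as follows. This is an illustration under stated assumptions: the function name disable_override and the ".bak" suffix are arbitrary choices, and DA_HOME is assumed to be the default /opt unless another path is passed in. The wildcard mirrors the apache-karaf* path from step 1.

```python
from pathlib import Path

# File name taken from step 1 of the instructions.
CFG_NAME = "com.ca.im.item.impl.distributed.DistributedItemRepositoryConfig.cfg"


def disable_override(da_home="/opt"):
    """Rename the override file(s) under da_home; return the new paths."""
    # Match the versioned Karaf directory with a wildcard, as in step 1.
    pattern = f"IMDataAggregator/apache-karaf*/etc/{CFG_NAME}"
    renamed = []
    for cfg in Path(da_home).glob(pattern):
        # ".bak" is an arbitrary suffix chosen for this sketch.
        backup = cfg.with_name(cfg.name + ".bak")
        cfg.rename(backup)
        renamed.append(backup)
    return renamed
```

Calling disable_override() with no arguments searches under /opt and returns an empty list if no override file exists. The dadaemon service still has to be restarted afterwards for the default timeout to apply again.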

Additional Information:

If you are unsure how to apply this change, or are concerned that it may not be the right fix, contact CA Support and reference this Knowledge Base article for further assistance.

Additional information about stopping and starting the Data Aggregator dadaemon service can be found in Knowledge Base article ID TEC1382101.