False Positive Alerts / Can't restart SNMPCollector Probe

Document ID : KB000124124
Last Modified Date : 04/01/2019
Issue:
We are having issues with our snmpcollector probe sending false positive alerts. Attempts to restart the probe fail.

snmpcollector false alarms, e.g., alarm received on 12/31/2018
10.xx.xxx.xx: XYZ-B30-1-MDF01-ASR2.xxx.xx.xxx interface Tu20x(Tunnel205) is down. 
The alarm count was 931.

snmpcollector probe error: "Failed to restart probe
/xxx/xxx/xxxhub/snmpcollector.
Communication failure."

Template editor error when trying to open it:
"Templates were unable to be retrieved.
Click Retry to attempt to save your configuration again.
Click Close to close this Window."

Log files opened in IM showed no text, and the IM client did not seem to be working correctly, so we used AC instead. The AC log files also showed no text on the snmpcollector hub. We also noticed this Windows error, which seemed to be thrown by the hub: System Error &H8004023 A Microsoft Software Installer error was encountered

Deployment of snmpcollector v3.44 failed toward the end of the package deployment with "Installation of section generic failed", so we tried AC instead, but only after resolving another issue seen in the local hub's controller log:

     fopen failed for probes/service/discovery_agent/jacob-1.17-M2-x64.dll' 

- Firewalls were already disabled on the core hub and the remote hub where snmpcollector is deployed. 
- The only Windows event error we saw was a single crash of the distsrv. 
- Deployment of the snmpcollector probe v3.44 failed via both IM and the Admin Console: either inst_pkg failed or the transfer of the generic section hung.
- Errors in the controller.log:

Jan 3 16:34:09:809 [7000] Controller: sockWrite Failed: 0000000001B0AF30 127.0.0.1/59708 10053 (rc = -1, sfd = 1644, lenleft = 365) 
Jan 3 16:34:09:809 [7000] Controller: nimISend: failed to write 92 bytes (rc=-1) [single buffer] 
...
Jan 3 16:34:09:854 [7000] Controller: failed to send probeport list to hub CSH(10.16.205.185) - communication error 
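
When the log viewers show no text, the controller.log file can still be scanned offline. The following is a minimal sketch, not a supported tool, that counts sockWrite failures per peer address and socket error code. The regular expression is inferred only from the single sockWrite line in the excerpt above, and the function name is hypothetical. On Windows, 10053 is the Winsock code WSAECONNABORTED (connection aborted by software on the local machine), assuming the number logged here is a Winsock error code.

```python
import re
from collections import Counter

# Sample lines copied from the controller.log excerpt above.
SAMPLE_LOG = """\
Jan 3 16:34:09:809 [7000] Controller: sockWrite Failed: 0000000001B0AF30 127.0.0.1/59708 10053 (rc = -1, sfd = 1644, lenleft = 365)
Jan 3 16:34:09:809 [7000] Controller: nimISend: failed to write 92 bytes (rc=-1) [single buffer]
Jan 3 16:34:09:854 [7000] Controller: failed to send probeport list to hub CSH(10.16.205.185) - communication error
"""

# Assumption: field layout is "<handle> <peer>/<port> <error code> (...)",
# inferred from the one sockWrite line shown in the article.
SOCK_WRITE = re.compile(
    r"sockWrite Failed: \S+ (?P<peer>\S+)/(?P<port>\d+) (?P<code>\d+) "
)


def summarize_sock_write_failures(log_text):
    """Count sockWrite failures per (peer address, socket error code)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = SOCK_WRITE.search(line)
        if match:
            counts[(match.group("peer"), int(match.group("code")))] += 1
    return counts


if __name__ == "__main__":
    for (peer, code), count in summarize_sock_write_failures(SAMPLE_LOG).items():
        print(f"{peer} error {code}: {count} failure(s)")
```

A high count against 127.0.0.1 would suggest the failures are between processes on the hub machine itself rather than across the network, which is consistent with firewalls not being the cause here.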
Environment:
- snmpcollector v3.41
- DCD v2.40
- hub v7.92 HF3
Cause:
- Multiple contributing factors; the ultimate root cause is unknown.
Resolution:
For the error "fopen failed for probes/service/discovery_agent/jacob-1.17-M2-x64.dll' ":
We deactivated the discovery_agent probe, deleted its directory, and deleted the probe, after which the error was gone. discovery_agent can be reinstalled after snmpcollector is upgraded.

- Recommended adding two more virtual processors, since snmpcollector requires a quad-core machine.
- Recommended rebooting the hub machine where snmpcollector v3.41 was already deployed. Although the v3.44 distribution had seemed to fail, after the reboot it appeared that the distribution of v3.44 had completed successfully.

Changed the minimum and maximum memory settings (-Xms and -Xmx) to:

<startup> 
options = -Xms2048m -Xmx4096m -XX:+UseCompressedOops -XX:+UseG1GC -XX:StringTableSize=10003 -Duser.language=en -Duser.country=US 
</startup> 
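
Before saving a changed options line, it can help to confirm that the initial heap (-Xms) does not exceed the maximum heap (-Xmx), since a JVM will not start with that combination. The sketch below is illustrative only: the function names are hypothetical, and it parses just the two heap flags from the options string shown above.

```python
import re

# The options line from the <startup> section above.
OPTIONS = ("-Xms2048m -Xmx4096m -XX:+UseCompressedOops -XX:+UseG1GC "
           "-XX:StringTableSize=10003 -Duser.language=en -Duser.country=US")

# JVM size suffixes: k, m, g (case-insensitive); no suffix means bytes.
_UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def heap_bytes(options, flag):
    """Return the size in bytes given by -Xms or -Xmx, or None if absent."""
    match = re.search(rf"(?:^|\s)-{flag}(\d+)([kKmMgG]?)(?=\s|$)", options)
    if not match:
        return None
    number, unit = int(match.group(1)), match.group(2).lower()
    return number * _UNITS.get(unit, 1)


def check_heap(options):
    """Return (initial, maximum) heap sizes; raise if initial exceeds maximum."""
    initial = heap_bytes(options, "Xms")
    maximum = heap_bytes(options, "Xmx")
    if initial is not None and maximum is not None and initial > maximum:
        raise ValueError("-Xms must not exceed -Xmx")
    return initial, maximum


if __name__ == "__main__":
    initial, maximum = check_heap(OPTIONS)
    print(f"initial heap: {initial // 1024 ** 2} MiB, "
          f"maximum heap: {maximum // 1024 ** 2} MiB")
```

The maximum heap should also fit within the physical memory of the hub machine, which this sketch does not check.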

Also set: 
pollInterval = 900 

Added key:
MAX_BATCH = 3 

KB Article:
https://comm.support.ca.com/kb/snmpcollector-maxbatch-setting/kb000043290 

A test of opening the snmpcollector probe via AC was successful. Although the "Failed to restart probe" error appeared temporarily, it disappeared almost immediately and the probe remained green with a port and a PID.

Opening the Template Editor took several minutes.
To improve the load time of the snmpcollector configuration page and the Template Editor, we requested that the customer deploy the following package to the Primary hub (it will disappear), and/or to any hub/robot where the Admin Console is deployed.

The package can be downloaded from:
https://support.ca.com/us/product-content/recommended-reading/technical-document-index/ca-unified-infrastructure-management-hotfix-index.html

     mps_webapp-8.5.1_HF1.zip fixes the following issue:

- Improved performance opening Admin Console based probe GUI

Steps:
- Import this package into your archive.
- Deploy the package to the UIM primary hub.
- The wasp probe will restart automatically.

This package will also be included in UMP 9.0.x