Technical Details regarding discovery_server communications

Document ID : KB000014966
Last Modified Date : 14/02/2018
Introduction:

Technical details on how the discovery_server probe communicates

Question:

How does the discovery_server probe communicate?

Environment:
UIM 8.51
Answer:

- discovery_server probe asks the primary hub for a list of hubs using the 'gethubs' callback.

- discovery_server probe then issues the 'getrobots' callback to each hub in that list to get a list of all robots.

- discovery_server probe then does a 'nametoip' callback to find the IP address and port of the robot.

- If there IS a tunnel, nametoip returns the IP/port of a tunnel session. The discovery_server connects to that tunnel session, which handles the request to the robot.

- If there is NO tunnel, nametoip returns the actual IP address of the robot itself with port 48000, and the discovery_server tries to connect directly to the robot. This may fail if a) there is no route to the robot from the primary hub or b) there is no tunnel between the hubs.
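
The sequence above can be modeled as a short Python sketch. This is illustration only: the three callbacks are passed in as plain Python functions, and the hub names, robot names, addresses, and the tunnel session port are all hypothetical placeholders. It does not use any real UIM SDK call.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Address = Tuple[str, int]  # (ip, port)


@dataclass
class Target:
    robot: str        # robot name as returned by getrobots
    address: Address  # whatever nametoip returned for that robot


def plan_connections(
    gethubs: Callable[[], List[str]],
    getrobots: Callable[[str], List[str]],
    nametoip: Callable[[str], Address],
) -> List[Target]:
    """Walk hubs, then robots, then resolve each robot to an address."""
    targets = []
    for hub in gethubs():              # step 1: list of hubs
        for robot in getrobots(hub):   # step 2: robots on each hub
            # Step 3: with a tunnel this is a tunnel session address,
            # without one it is the robot's own IP on port 48000.
            targets.append(Target(robot, nametoip(robot)))
    return targets


# Hypothetical stand-in data, not taken from a real environment.
HUBS = {
    "/DOM/primary_hub": ["/DOM/primary_hub/robot_a"],
    "/DOM/remote_hub": ["/DOM/remote_hub/robot_b"],
}
ADDRESSES = {
    "/DOM/primary_hub/robot_a": ("10.0.0.5", 48000),  # direct, no tunnel
    "/DOM/remote_hub/robot_b": ("10.0.0.1", 48003),   # assumed tunnel session
}

targets = plan_connections(
    gethubs=lambda: list(HUBS),
    getrobots=lambda hub: HUBS[hub],
    nametoip=lambda robot: ADDRESSES[robot],
)
```

The point of the model is that the discovery_server connects to whatever address nametoip returns, so any routing problem to that address surfaces only at connection time.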

Additional Information:

Example scenario (no metrics displayed in USM for one or more robots):

In one particular customer case, there was no tunnel between the primary and remote hubs. The discovery_server relied on the nametoip callback, got the robot's address back in the result (e.g., 10.24.x.x), and then attempted to connect directly to that IP on port 48000.
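
To check whether such a direct connection is possible at all, a plain TCP test from the primary hub server to the robot on port 48000 can help. The following is a minimal sketch using only the Python standard library; the function name can_connect and the timeout value are assumptions for illustration. A successful TCP connect only shows that the port is reachable, not that the robot answers requests.

```python
import socket


def can_connect(host: str, port: int = 48000, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run it on the primary hub server with the address that nametoip returned, for example can_connect("<robot IP>"). A result of False is consistent with a missing route or a firewall blocking port 48000.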

In the discovery_server log, search for errors that mention the hostname of the affected robot and/or other hosts for which no metrics are displayed in USM. The example below shows the type of error you would expect to see: the discovery_server tried to communicate directly with the 10.24.x.xxx robot to fetch the niscache elements, but the connection failed.

Error showing failure to fetch the nis_cache elements on the given robot: 

15 Jan 2016 17:37:02,322 [robotWorker-2] WARN com.nimsoft.discovery.server.nimbus.scan.NisCacheUpdater - fetch nis cache failed on pass=0 with 0 total elems received for /HIXXX/XX_Secondary_Hub_Servers/<hostname> : (80) Session error, Unable to open a client session for 10.24.xx.xxx:48000: Connection timed out: connect 
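
When many robots are affected, it can be quicker to extract the failing robots and addresses from the log with a script. The sketch below is written in Python, and its regular expression is derived only from the single sample line above, so other probe versions or error types may be formatted differently. The function name find_fetch_failures is hypothetical.

```python
import re
from typing import Dict, Iterable, List

# Pattern based solely on the sample log line in this article.
FETCH_FAILED = re.compile(
    r"fetch nis cache failed .*? received for (?P<robot>\S+) : "
    r"\((?P<code>\d+)\) (?P<error>[^,]+), "
    r"Unable to open a client session for (?P<addr>[^:\s]+):(?P<port>\d+)"
)


def find_fetch_failures(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Return one dict per log line that reports a failed nis cache fetch."""
    failures = []
    for line in lines:
        match = FETCH_FAILED.search(line)
        if match:
            failures.append(match.groupdict())
    return failures


sample = (
    "15 Jan 2016 17:37:02,322 [robotWorker-2] WARN "
    "com.nimsoft.discovery.server.nimbus.scan.NisCacheUpdater - "
    "fetch nis cache failed on pass=0 with 0 total elems received for "
    "/HIXXX/XX_Secondary_Hub_Servers/<hostname> : (80) Session error, "
    "Unable to open a client session for 10.24.xx.xxx:48000: "
    "Connection timed out: connect"
)
failures = find_fetch_failures([sample])
```

To scan a real log, pass an open file object instead of the sample list; the path to the log file depends on the installation.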

Set the discovery_server loglevel to 5 with a large logsize (e.g., 20000), and check the actual log file on the file system to be sure.