Connectivity check failed. Reason: Could not find accessible execution server for agent. Agent may be offline.

Document ID : KB000048110
Last Modified Date : 11/05/2018
Issue:

Agent connectivity check fails with the error 'Connectivity Check failed. reason: Could not find accessible execution server for agent'. 

Figure 1

The NIMI.LOG on the Execution Server shows the following error lines:

 2014-08-19 17:51:14,385 [KeepAliveWorker-4] DEBUG (KeepAliveManagerImpl.java:283) 
 - Not updating [NodeInfo{nodeId=nid:<Node ID>, hostname='<Hostname>', 		
 addresses=[/10.x.x.x:6600], nodeType=NODE, version=4.5.1.59}] - this supernode is over max capacity
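
To see which agents are being turned away, the log can be searched for the message shown above. The following is a minimal Python sketch, not a product tool: the function name `rejected_hosts` is hypothetical, and it assumes the log entries have the same shape as the excerpt above (a `NodeInfo{...}` block followed by "this supernode is over max capacity"), whether each entry is on one line or wrapped across several.

```python
import re

# Matches one "over max capacity" entry as it appears in the excerpt above.
# [^}]* keeps the match inside a single NodeInfo{...} block, including
# entries that wrap across lines.
ENTRY_RE = re.compile(
    r"Not updating\s+\[NodeInfo\{(?P<info>[^}]*)\}\]"
    r"\s*-\s*this supernode is over max capacity"
)
HOSTNAME_RE = re.compile(r"hostname='(?P<host>[^']*)'")


def rejected_hosts(log_text):
    """Return the hostname of every agent refused for capacity reasons."""
    hosts = []
    for entry in ENTRY_RE.finditer(log_text):
        match = HOSTNAME_RE.search(entry.group("info"))
        hosts.append(match.group("host") if match else "<unknown>")
    return hosts


if __name__ == "__main__":
    import sys

    # Usage: python rejected_hosts.py <path to the Execution Server log>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for host in sorted(set(rejected_hosts(handle.read()))):
            print(host)
```

The excerpt is logged at DEBUG level, so these lines may only be present when debug logging is enabled for the Execution Server.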

Environment:
All versions of CA Release Automation
Cause:

This error occurs when the number of agents connected to the Execution Server (NES) has reached its maximum capacity.

An Execution Server is configured to serve a specific number of agents, defined in the nimi_config.xml configuration file (default: 200).

If more agents than the defined number try to connect to an Execution Server, several issues can occur.

One of these issues is that agents lose connectivity to the server at random times. This can cause jobs to fail during any step of process execution.


The current default definition of the number of agents can be found in <RA_EXECUTION_SERVER_INSTALL_DIR>/conf/nimi_config.xml

For Release Automation 4.7.1 and earlier, the values are defined as:

 <server> 		
  <capacity>200</capacity> <!-- how much nodes to accept--> 		
  <warn-capacity>190</warn-capacity> <!-- over this limit new connecting nodes will be asked to seek another supernode--> 		
 </server> 	 
 

For Release Automation 5.0, the values are defined as:

 <server>
  <capacity>200</capacity> <!-- how much nodes to accept-->
  <warn-capacity>200</warn-capacity> <!-- over this limit new connecting nodes will be asked to seek another supernode-->
 </server>
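
The values currently in effect can be read directly from the file. The sketch below is an illustration under stated assumptions: the function name `read_capacity` is hypothetical, the `<server>` element is assumed to appear once somewhere in nimi_config.xml without an XML namespace, and the enclosing elements are not known from this article, so the code searches the whole tree rather than relying on a fixed path.

```python
import xml.etree.ElementTree as ET


def read_capacity(xml_text):
    """Return (capacity, warn_capacity) from the first <server> element."""
    root = ET.fromstring(xml_text)
    # iter() also yields the root itself, so this works wherever <server> sits.
    server = next(root.iter("server"), None)
    if server is None:
        raise ValueError("no <server> element found")

    values = []
    for tag in ("capacity", "warn-capacity"):
        text = server.findtext(tag)
        if text is None:
            raise ValueError(f"<{tag}> is missing from <server>")
        values.append(int(text.strip()))
    return values[0], values[1]


if __name__ == "__main__":
    import sys

    # Usage: python read_capacity.py <path to nimi_config.xml>
    with open(sys.argv[1], encoding="utf-8") as handle:
        capacity, warn_capacity = read_capacity(handle.read())
    print(f"capacity={capacity} warn-capacity={warn_capacity}")
```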
Resolution:

1. Increase the number of agents that an Execution Server can handle by modifying nimi_config.xml.

Example: increasing the capacity to 250.

 <server>
  <capacity>250</capacity> <!-- how much nodes to accept-->
  <warn-capacity>250</warn-capacity> <!-- over this limit new connecting nodes will be asked to seek another supernode-->
 </server>
 

2. Restart the Nolio Services on the Execution Server.

Important Note: Make sure that there are no jobs running on the Execution Server before making the change.
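
The edit in step 1 can also be scripted. The following is a hedged sketch rather than a supported procedure: `set_capacity` and `apply_to_file` are hypothetical helper names, and the code assumes the file contains exactly one `<capacity>` and one `<warn-capacity>` element holding plain integers, as in the fragments above. It edits the text with regular expressions instead of re-serializing the XML so that the existing comments and layout are preserved, and it keeps a backup copy of the original file.

```python
import re
import shutil
from pathlib import Path


def set_capacity(xml_text, new_capacity, new_warn_capacity=None):
    """Return xml_text with <capacity> and <warn-capacity> replaced."""
    if new_warn_capacity is None:
        # Mirrors the example above, where both values are set to 250.
        new_warn_capacity = new_capacity

    text, n_cap = re.subn(
        r"(<capacity>\s*)\d+(\s*</capacity>)",
        rf"\g<1>{int(new_capacity)}\g<2>",
        xml_text,
    )
    text, n_warn = re.subn(
        r"(<warn-capacity>\s*)\d+(\s*</warn-capacity>)",
        rf"\g<1>{int(new_warn_capacity)}\g<2>",
        text,
    )
    if n_cap != 1 or n_warn != 1:
        raise ValueError("expected exactly one <capacity> and one <warn-capacity>")
    return text


def apply_to_file(config_path, new_capacity, new_warn_capacity=None):
    """Back up config_path next to itself, then write the updated values."""
    path = Path(config_path)
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    updated = set_capacity(path.read_text(encoding="utf-8"),
                           new_capacity, new_warn_capacity)
    path.write_text(updated, encoding="utf-8")
```

For example, `apply_to_file("<RA_EXECUTION_SERVER_INSTALL_DIR>/conf/nimi_config.xml", 250)` with the placeholder replaced by the real installation directory. The restart in step 2 is still required afterwards.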


 

Things to consider before increasing the number of agents:

- Current RA JVM heap size definitions

- RAM on the Execution Server

- Number of processors on the Execution Server

Note: If the maximum capacity (default: 200) has been reached, it is highly recommended to add a new Execution Server and balance the agents across the Execution Servers rather than only raising the limit.
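
As a rough planning aid for the recommendation above, the arithmetic can be sketched as follows. This is plain ceiling division and an even split, not a vendor sizing formula; the function name `spread_agents` is hypothetical, and the default of 200 agents per server is the default capacity quoted in this article.

```python
def spread_agents(total_agents, capacity_per_server=200):
    """Return an even agent count per server, using the fewest servers
    that keep every server at or below capacity_per_server."""
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    if total_agents <= 0:
        return []

    # Ceiling division: smallest number of servers that can hold all agents.
    servers = (total_agents + capacity_per_server - 1) // capacity_per_server
    base, extra = divmod(total_agents, servers)
    # The first `extra` servers take one additional agent each.
    return [base + 1] * extra + [base] * (servers - extra)


if __name__ == "__main__":
    print(spread_agents(450))
```

With 450 agents and the default capacity, this yields three Execution Servers with 150 agents each, which leaves headroom on every server instead of running one of them at its limit.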