Policy Server hangs after Web Agent communication failure.

Document ID : KB000051261
Last Modified Date : 14/02/2018

Description:

If a Web Agent drops off the network during a request and cannot notify the Policy Server of the communication failure, the Policy Server worker thread can wait indefinitely on a recv() call. Even after the Web Agent regains network connectivity, the connection is closed on the Web Agent side, but the Policy Server has no way of knowing this because it is still waiting for data that will never arrive.

Solution:

Enabling the TCP KeepAlive socket option resolves this. Once the connection has been idle for a system-configurable period of time, the Policy Server starts sending Keep Alive packets to the Web Agent. After several failures to receive a response (the number is also defined at the system level), the recv() function returns an error, the waiting thread is released, and the connection is closed.
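
As an illustration of the socket option involved, the following minimal Python sketch turns on TCP keepalive for a socket and reads the setting back. It is not the Policy Server's code; the helper names enable_keepalive and keepalive_enabled are hypothetical, and the idle time, probe interval, and probe count are assumed to come from the operating system defaults.

```python
import socket


def enable_keepalive(sock: socket.socket) -> None:
    """Ask the OS to probe this connection once it has been idle."""
    # SO_KEEPALIVE only switches probing on; the idle period and the number
    # of unanswered probes tolerated are taken from system-wide settings.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)


def keepalive_enabled(sock: socket.socket) -> bool:
    """Report whether SO_KEEPALIVE is currently set on the socket."""
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0


if __name__ == "__main__":
    # Hypothetical usage: a freshly created TCP socket, not yet connected.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(sock)
    print("keepalive enabled:", keepalive_enabled(sock))
    sock.close()
```

With the option set, a recv() call blocked on a dead peer fails with an error once the probes go unanswered (in Python this surfaces as an OSError), rather than waiting forever.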
A new environment variable, SM_ENABLE_TCP_KEEPALIVE, has been introduced; setting it to 1 enables this mechanism.
If TLI tracing is enabled (by setting the environment variable SM_TLI_LOG_FILE), an entry is logged indicating whether this option is enabled.
The TCP KeepAlive feature is disabled by default on the Policy Server.
This feature is available from the 12.0 SP2 CR01 and 6.0 SP6 versions of SiteMinder Policy Server.
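
The opt-in behavior of SM_ENABLE_TCP_KEEPALIVE can be pictured with the sketch below. It is only an assumption about how such a switch might be read: the functions keepalive_requested and configure_socket are hypothetical, and how the Policy Server actually parses the value is not documented here. The sketch treats exactly "1" as enabled and everything else, including an unset variable, as disabled.

```python
import os
import socket

ENV_VAR = "SM_ENABLE_TCP_KEEPALIVE"


def keepalive_requested(env=None) -> bool:
    """Return True only when the variable is set to 1 (assumed parsing rule)."""
    env = os.environ if env is None else env
    # An unset variable falls back to "0", so the feature stays off by default.
    return env.get(ENV_VAR, "0").strip() == "1"


def configure_socket(sock: socket.socket, env=None) -> bool:
    """Enable SO_KEEPALIVE on sock if the variable asks for it."""
    enabled = keepalive_requested(env)
    if enabled:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return enabled


if __name__ == "__main__":
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    state = "enabled" if configure_socket(sock) else "disabled"
    # Mirrors the idea of logging whether the option is on or off.
    print(f"TCP keepalive is {state}")
    sock.close()
```

Passing a plain dictionary as env makes the rule easy to check without touching the real environment, for example keepalive_requested({"SM_ENABLE_TCP_KEEPALIVE": "1"}).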