High CPU time for bash process id

Document ID : KB000099968
Last Modified Date : 04/06/2018
Issue:
We found that processes left behind by timed-out ssh sessions increase CPU usage.

Is there an algorithm that recycles obsolete process IDs, or does a timed-out process keep running?

Could you advise a resolution for this issue?
Environment:
CA API Gateway 9.2 and CA API Gateway 9.3 CR1
Resolution:
A workaround is in place, described in the Workaround section below.

In this case the client logs in to the APIM Gateway as another user, using a certificate, and then runs su - to switch to the ssgconfig user. The issue may also occur if the user disconnects the ssh session without logging off. The trigger is that the login bash shell does not receive a signal when the disconnect occurs.

It certainly occurs when a user logs in with PuTTY or a similar ssh client, runs "su - ssgconfig", and the ssh client then disconnects or times out.

When the ssh connection times out, it leaves the bash shell active with its input coming from /dev/null. The ~ssgconfig/.bash_profile sources and runs the script:
  configuser_profile_menu.sh
This script does not exit but runs in a tight loop, taking up CPU.

There are often several of these processes running and taking up CPU, and they have to be killed with kill -9 to make the gateway usable.
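
As a rough aid for finding these processes before killing them, the sketch below filters ps output for bash processes at or above a CPU threshold. The helper name high_cpu_bash and the threshold of 50 are assumptions chosen for illustration; they are not part of the gateway tooling. Note that the pcpu column from ps is an average over the life of the process, so treat the result as a list of candidates only.

```shell
#!/bin/bash
# Hypothetical helper: print PIDs of bash processes at or above a %CPU limit.
# Expects lines of "PID PPID %CPU COMMAND" on stdin, as produced by
# `ps -eo pid,ppid,pcpu,comm`. The header line is skipped because its
# fourth field is not "bash".
high_cpu_bash() {
    awk -v limit="${1:-50}" '$4 == "bash" && ($3 + 0) >= limit { print $1 }'
}

# List candidates; review them with `pstree -p` before using kill -9.
ps -eo pid,ppid,pcpu,comm | high_cpu_bash 50
```

Each PID printed can then be checked against the pstree -p output to confirm it belongs to an su/bash/bash/rpm chain before it is killed.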

Analysis 

This appears to be a defect in the APIM Gateway code.

Looking at the extra processes taking CPU, pstree -p showed the following process trees:

|-su(44597)---bash(48908)---bash(47367)---rpm(47368) 
|-su(111261)---bash(111698)---bash(47362)---rpm(47363) 


From these processes it was clear that su - ssgconfig was run after the login, and that the .bash_profile was running the user menu:

source /opt/SecureSpan/Platform/bin/configuser_profile_menu.sh 

The script contains a menu loop that uses the bash "read" command. It has a trap on INT, which would detect a normal logout, but it does not take the ssh/su type of logon into account.

Notes:
For "su", the INT trap is not sufficient to catch the timeout of the login shell.
The read, both the main "read" and the one in "confirmedMessage", returns immediately with an empty message.
The main task in the loop is the "rpm" call, which is why the process tree shows the su-bash-bash-rpm chain.
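
The behaviour of read in these notes can be reproduced outside the gateway. The following is a minimal sketch, assuming only a standard shell; the function name read_status_on_eof is made up for this example, and /dev/null stands in for the input of the disconnected session.

```shell
#!/bin/bash
# Show what `read` does once the shell's input is gone (simulated with /dev/null).
read_status_on_eof() {
    local choice status
    read choice < /dev/null
    status=$?
    # `read` did not block: it hit EOF, left the variable empty,
    # and returned a non-zero status.
    echo "status=${status} choice='${choice}'"
}

read_status_on_eof
```

In bash this reports status 1 with an empty choice. A loop that ignores that status and simply repeats the menu will spin without ever waiting for input.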

Workaround : 

a) Detect EOF in the read command
Read returns a non-zero code on EOF, so that can be used to detect end of input. This was the simplest fix we found.

Change the main "read choice" line in the file /opt/SecureSpan/Platform/bin/configuser_profile_menu.sh

From:
while [ "$isPatchListMenuValid" != "y" ]
do
clear
echo "Welcome to the CA API Gateway - $(rpm -q --qf '%{VERSION}\n' ssg)"
echo "running on $(cat /etc/redhat-release)"
.
.
.
echo -n "Please make a selection: "
read choice
To:
while [ "$isPatchListMenuValid" != "y" ]
do
clear
echo "Welcome to the CA API Gateway - $(rpm -q --qf '%{VERSION}\n' ssg)"
echo "running on $(cat /etc/redhat-release)"
.
.
.
echo -n "Please make a selection: "
read choice || doLogout

So the change is : 
- read choice 
+ read choice || doLogout 
This means that when read fails because of EOF, the script calls doLogout instead of repeating the loop.
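
The effect of the change can be tried in isolation with a cut-down loop. This is a sketch only: menu_loop is a hypothetical stand-in for the real menu, and the doLogout defined here is a placeholder, since the body of the gateway's own doLogout is not shown in this article.

```shell
#!/bin/bash
# Placeholder for the script's doLogout; the real implementation is not shown here.
doLogout() {
    echo "input closed, logging out"
    exit 0
}

# Hypothetical, simplified menu loop using the patched read line.
menu_loop() {
    local choice
    while true; do
        printf '%s' "Please make a selection: "
        # On EOF `read` returns non-zero, so doLogout runs and the loop ends
        # instead of spinning.
        read choice || doLogout
        echo "you chose: ${choice}"
    done
}

# Simulate the disconnected session: stdin is /dev/null.
# The subshell keeps the exit in doLogout from ending the calling shell.
( menu_loop < /dev/null )
echo "loop exited with status $?"
```

With "read choice" alone, the same loop would never terminate when its input is /dev/null, which matches the tight loop seen on the gateway.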

The CR patch will also update other "read choice" lines in that script.