Tim Status Displays the Message: "tim dead but subsys locked."

Document ID : KB000041764
Last Modified Date : 12/02/2019
Issue:

     Attempting to shut down the TIM with the command "service tim stop" fails. Running "service tim status" displays the message "tim dead but subsys locked".

 

Environment:

     Any supported TIM environment, APM 9.7 or higher.

Cause:

     The message "tim dead but subsys locked" appears when the TIM fails to update the files that tell the OS the process has terminated. The TIM's process ID is stored in /opt/CA/APM/tim/logs/tim.pid. This file is cross-referenced with /var/lock/subsys/tim, and the two files are expected to describe a consistent state. If the (empty) lock file /var/lock/subsys/tim exists, the first line of /opt/CA/APM/tim/logs/tim.pid is expected to contain the PID of the TIM process, indicating that the TIM is running. If the TIM has shut down, /var/lock/subsys/tim should no longer exist. When "tim dead but subsys locked" occurs, /var/lock/subsys/tim still exists even though the TIM process is no longer running.
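
     The consistency check described above can be sketched as a small shell function. This is only an illustration, not the logic the TIM init script actually uses: the function name tim_lock_state and its output strings are hypothetical, and it assumes it is run as root so that "kill -0" can probe the TIM process.

```shell
#!/bin/sh
# Hypothetical helper: classify the TIM state from the lock file and pid file.
# Paths default to the ones named in this article; pass others to override.
tim_lock_state() {
    lock_file="${1:-/var/lock/subsys/tim}"
    pid_file="${2:-/opt/CA/APM/tim/logs/tim.pid}"

    pid=""
    if [ -r "$pid_file" ]; then
        # Only the first line of the pid file is expected to hold the pid.
        pid=$(head -n 1 "$pid_file" | tr -d '[:space:]')
    fi

    running=no
    case "$pid" in
        ''|*[!0-9]*) ;;   # empty or non-numeric: treat as not running
        *) if kill -0 "$pid" 2>/dev/null; then running=yes; fi ;;
    esac

    if [ -e "$lock_file" ]; then
        if [ "$running" = yes ]; then
            echo "running"
        else
            echo "dead but subsys locked"
        fi
    else
        if [ "$running" = yes ]; then
            echo "running but not locked"
        else
            echo "stopped"
        fi
    fi
}
```

     Calling tim_lock_state with no arguments on the TIM host reports one of the four states; "dead but subsys locked" corresponds to the situation in this article.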

Resolution:

There are two ways to handle this situation. To get the TIM running again quickly, perform the Workaround. To address the underlying cause, perform the Resolution.

Workaround: 

      To correct the issue, remove the /var/lock/subsys/tim file and restart the TIM with the command "service tim start".
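
      As a minimal sketch, the workaround can be wrapped in a guard that removes the lock file only when no process matches the recorded PID. The function name clear_stale_tim_lock is hypothetical, and the sketch assumes it is run as root on the TIM host.

```shell
#!/bin/sh
# Hypothetical helper: remove the TIM lock file only if the process recorded
# in the pid file is no longer alive. Paths default to those in this article.
clear_stale_tim_lock() {
    lock_file="${1:-/var/lock/subsys/tim}"
    pid_file="${2:-/opt/CA/APM/tim/logs/tim.pid}"

    pid=""
    if [ -r "$pid_file" ]; then
        pid=$(head -n 1 "$pid_file" | tr -d '[:space:]')
    fi

    case "$pid" in
        ''|*[!0-9]*) ;;   # no usable pid recorded: nothing to protect
        *)
            if kill -0 "$pid" 2>/dev/null; then
                echo "Process $pid is still running; lock left in place" >&2
                return 1
            fi
            ;;
    esac

    rm -f "$lock_file"
}

# Typical use on the TIM host, as root:
#   clear_stale_tim_lock && service tim start
```

      The guard avoids deleting the lock file out from under a TIM that is in fact still running.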

Resolution:

     One possible reason the TIM does not shut down gracefully is a timeout in its shutdown logic. By default, the TIM waits 60 seconds for its process and subprocesses to terminate. If they do not terminate within 60 seconds, the TIM reports an error such as "Stopping tim: /sbin service: line <line number>: <pid> Killed...." and fails to delete /var/lock/subsys/tim as described above. This issue is especially prevalent when there is a large number of TIM worker processes (typically one per processor available on the server).
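
     To illustrate the kind of bounded wait described above, the following sketch polls a process until it exits or a limit is reached. It is not the TIM's actual shutdown code; the function name wait_for_exit and the one-second polling interval are assumptions made for the example.

```shell
#!/bin/sh
# Illustration only: wait up to a limit (in seconds) for a process to exit.
# Returns 0 if the process is gone, 1 if it is still alive at the limit.
wait_for_exit() {
    pid="$1"
    limit="${2:-60}"   # 60 mirrors the default wait described in this article
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$limit" ]; then
            return 1   # still alive when the limit ran out
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

     With many worker processes, the time needed for all of them to exit can exceed a fixed limit of this kind, which is why raising the limit helps.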

To permanently address this issue, it is recommended that you raise the timeout in the shutdown logic of the TIM. To do this:

     -Navigate to http://<TIM_Computer_Name or IP_Address>:<Port_Number>/cgi-bin/ca/apm/tim/index
     -Click Configure TIM Settings
     -Click Define a new value.
     -Type the name: ExitWaitSeconds
     -Enter a value in seconds. When ExitWaitSeconds is undefined, the default is 60, so enter a value higher than 60.
     -Click Add. The new value takes effect immediately. Restarting the TIM is not necessary.

Additional Information:

If these suggestions do not help, you may need to investigate further for locks in the filesystem. You can use the lslk command to do this.
A useful article on this topic is "Finding and removing locked files on a GNU/Linux system": http://dnaeon.github.io/locked-files-linux/