VM:Tape STAM Enqueue Contention

Document ID : KB000045329
Last Modified Date : 14/02/2018

Introduction: 

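This document explains enqueue contention on the STAM Control File in CA VM:Tape, the messages that report it, and how to recover when a system holds the lock abnormally.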

 

Environment:   

STAM allows multiple CA VM:Tape servers to share tape drives dynamically. STAM manages this using a purpose-built database called the STAM Control File, which resides on a minidisk that must be write-shared among all the CA VM:Tape systems. The area containing the STAM Control File can be on a dedicated one-cylinder minidisk or on the last cylinder of the TMC minidisk that has been formatted with the RECOMP option. The STAM Control File (which is not a CMS file) is created by the VMTSTF utility and consists of a number of disk blocks, each containing control information for one of the up to 24 CA VM:Tape systems that can participate in a STAM collection.

Whenever a process needs to do anything involving a tape drive (for example, allocate a drive for a tape mount operation), it must obtain a physical lock on the STAM Control File. This is done by means of a special I/O operation that reads all of the STAM Control File data blocks and then writes the block corresponding to the system obtaining the lock. That block is written with a flag indicating that the system holds the physical lock. The reading and writing are done in a single I/O operation, which does not permit any other system to access the minidisk while it is in progress. On completion of the I/O operation, the blocks are examined. If the lock flag is not set for any of the other systems, the physical lock has been obtained. If another system holds the physical lock, the requesting system rewrites its own block to reset its lock flag and waits three seconds before trying again.

Within a CA VM:Tape server, there may be multiple processes or threads executing simultaneously. Any one of these processes might need to obtain the STAM Control File lock, but only one process at a time can attempt to do so. That process is said to control the logical STAM lock. When more than one process is attempting to lock the STAM Control File, the processes must queue up and wait their turn.
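The physical lock protocol described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not CA VM:Tape code: the class ControlFileModel, the functions try_physical_lock and acquire_physical_lock, and the use of a Python threading.Lock to stand in for the single exclusive I/O operation are all hypothetical names and choices made for illustration.

```python
import threading
import time

MAX_SYSTEMS = 24  # upper limit on systems in a STAM collection, per the text


class ControlFileModel:
    """Hypothetical in-memory stand-in for the STAM Control File."""

    def __init__(self, system_count):
        if not 1 <= system_count <= MAX_SYSTEMS:
            raise ValueError("a STAM collection holds 1 to 24 systems")
        # One lock flag per system block.
        self.lock_flags = [False] * system_count
        # Assumption: models the single I/O operation that excludes
        # all other systems while it is in progress.
        self._io = threading.Lock()

    def read_all_and_write_own(self, system_index):
        """Read every block, then set our own flag, as one atomic step."""
        with self._io:
            snapshot = list(self.lock_flags)  # flags seen before our write
            self.lock_flags[system_index] = True
            return snapshot

    def reset_own_flag(self, system_index):
        """Rewrite our own block with the lock flag cleared."""
        with self._io:
            self.lock_flags[system_index] = False


def try_physical_lock(control_file, system_index):
    """Make one attempt; return True if the physical lock was obtained."""
    snapshot = control_file.read_all_and_write_own(system_index)
    others_hold = any(
        flag for i, flag in enumerate(snapshot) if i != system_index
    )
    if others_hold:
        # Another system has the lock: back out by resetting our flag.
        control_file.reset_own_flag(system_index)
        return False
    return True


def acquire_physical_lock(control_file, system_index,
                          retry_seconds=3.0, sleep=time.sleep):
    """Retry every three seconds until the lock is obtained.

    Returns the number of attempts made.
    """
    attempts = 1
    while not try_physical_lock(control_file, system_index):
        sleep(retry_seconds)
        attempts += 1
    return attempts
```

In this model, a second system that calls try_physical_lock while the first still holds the lock sees the first system's flag in its snapshot, clears its own flag, and fails. The sleep parameter is injectable only so the retry loop can be exercised without real three-second delays.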

The following message:

 VMTSTM000IA STAM CF already enqueued by OPERATOR (QUERY); PCB at 01DE21C0.

is a debugging message that is generated when a process attempts to get the STAM Control File lock but finds that the logical lock is already owned (or enqueued) by another process. This does not indicate a pathological condition. It reflects normal use that is causing some degree of contention on that CA VM:Tape server. In this particular case, the logical lock is held by another process executing a QUERY command invoked by the OPERATOR virtual machine. That process may not actually have the physical lock. It could be trying to get the physical lock, which is owned by another CA VM:Tape server. In that case, the QUERY command process retains the logical lock and waits three seconds before again attempting to get the physical lock.

When the process holding the physical lock is done, it first releases the physical lock and only then gives control of the logical lock to the next waiting process. If there is such a process, it proceeds to get the physical lock. When multiple CA VM:Tape servers are contending for the physical lock, it is essentially unpredictable which one will obtain it next, although chance favors the system that previously held the physical lock over the other systems. Excessive QUERY commands can lead to very long waits that might be mistaken for an actual pathological condition.

The design of the STAM Control File does include a mechanism that detects the pathological conditions that could result if a CA VM:Tape server abends or begins looping while holding the physical lock. Other CA VM:Tape systems that are working normally will begin to report a long-standing lock being held by a particular system:

 VMTSTD0376W System VMT1 has long standing lock on STAM control file.

If this message is displayed because VMT1 has abended, restarting that system will cause the lock to be cleared. However, if that CA VM:Tape server cannot be restarted for some reason, the STAM RELEASE VMT1 CLEAR command should be issued from one of the other systems. This causes any tape drives that were allocated to that CA VM:Tape system (VMT1) to be released so they can be used by the other systems. CLEAR indicates that the physical lock held by the released system should be cleared, allowing the other systems to resume normal STAM operation.

Another way to reset things is to reformat the STAM Control File with the VMTSTF utility. You should stop STAM first by issuing the STAM STOPALL command. If you do not, STAM may notice that the Control File has been formatted and abruptly terminate on all systems.
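For contrast with the pathological case, the normal interplay between the logical and physical locks within one server, as described earlier, can be sketched as a simple model. Everything here is hypothetical and for illustration only: the class ServerLockModel, its method names, the event strings (which are not real CA VM:Tape messages), and in particular the first-come, first-served order of waiters, which is an assumption since the text says only that processes queue up and wait their turn.

```python
from collections import deque


class ServerLockModel:
    """Hypothetical model of one server's logical STAM lock queue."""

    def __init__(self):
        self.logical_owner = None   # process that controls the logical lock
        self.waiters = deque()      # assumption: waiters are served FIFO
        self.holds_physical = False
        self.events = []            # illustrative trace, not real messages

    def request(self, process):
        """Ask for the logical lock; return True if granted immediately."""
        if self.logical_owner is None:
            self.logical_owner = process
            self.events.append(f"{process}: owns logical lock")
            return True
        # Logical lock already enqueued by another process: wait in line.
        self.waiters.append(process)
        self.events.append(
            f"{process}: waiting behind {self.logical_owner}"
        )
        return False

    def physical_obtained(self):
        """Record that the logical owner now holds the physical lock."""
        if self.logical_owner is None:
            raise RuntimeError("no process controls the logical lock")
        self.holds_physical = True
        self.events.append(f"{self.logical_owner}: holds physical lock")

    def release(self):
        """Release the physical lock first, then hand over the logical lock.

        Returns the new logical owner, or None if nobody was waiting.
        """
        if self.holds_physical:
            self.holds_physical = False
            self.events.append(
                f"{self.logical_owner}: released physical lock"
            )
        self.logical_owner = (
            self.waiters.popleft() if self.waiters else None
        )
        if self.logical_owner is not None:
            self.events.append(f"{self.logical_owner}: owns logical lock")
        return self.logical_owner
```

The point of the model is the ordering in release: the physical lock is dropped before the next process takes over the logical lock, so the new owner must compete for the physical lock again, possibly against other servers.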

 

Additional Information:

This section has been added to the Shared Tape Allocation Manager (STAM) Facility chapter of the CA VM:Tape Administrators Guide, which can be accessed at:

  https://docops.ca.com/display/VMTZVM/Shared+Tape+Allocation+Manager+%28STAM%29+Facility