Many organizations today host CA Service Management software on virtual machines running on VMware platforms. VMware can take "Snapshots", which are restore points that a virtual machine can be rolled back to if necessary. Snapshots come in handy for testing and for short-term backups. However, creating, restoring, or deleting snapshots in VMware can affect a running CA Service Management environment, leading to performance issues, network or connectivity issues, and file access issues, to name a few.
VMware Inc. states that snapshot activity, such as taking a snapshot while a VMware machine is in a running state, or deleting a snapshot while a VMware machine is running and in use, may cause an outage.
CA's RECOMMENDATION is NOT to take "Running Snapshots" on servers where CA Service Management applications are in a running state. It is recommended to stop CA Service Management services prior to taking a snapshot, deleting a snapshot, or reverting to a snapshot on a VMware machine.
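The recommended order of operations (stop services, perform the snapshot activity, start services again) can be sketched as a small wrapper script. This is only an illustrative sketch under stated assumptions, not a CA or VMware supplied tool: the function name, the command placeholders, and the snapshot callback are all hypothetical and must be replaced with whatever stop, start, and snapshot procedures your environment actually uses.

```python
import subprocess
from typing import Callable, Sequence

Command = Sequence[str]


def run_command(cmd: Command) -> None:
    """Run one command and raise CalledProcessError if it exits non-zero."""
    subprocess.run(list(cmd), check=True)


def snapshot_with_services_stopped(
    stop_cmds: Sequence[Command],
    start_cmds: Sequence[Command],
    snapshot_action: Callable[[], None],
    run: Callable[[Command], None] = run_command,
) -> None:
    """Stop services, perform the snapshot activity, then start services.

    All arguments are supplied by the caller; nothing here is specific to
    CA Service Management or VMware (hypothetical helper for illustration).
    """
    # Stop every service first. If a stop command fails, the exception
    # propagates and no snapshot activity is attempted.
    for cmd in stop_cmds:
        run(cmd)
    try:
        # Take, delete, or revert the snapshot while services are down.
        snapshot_action()
    finally:
        # Start the services again even if the snapshot activity failed.
        for cmd in start_cmds:
            run(cmd)


# Example call (placeholders only, not real commands):
# snapshot_with_services_stopped(
#     stop_cmds=[["<your-stop-command>"]],
#     start_cmds=[["<your-start-command>"]],
#     snapshot_action=lambda: run_command(["<your-snapshot-command>"]),
# )
```

The services are started again in a `finally` block so that a failed snapshot operation does not leave CA Service Management stopped. The `run` parameter exists only so that the sequence can be rehearsed without executing real commands.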
Things that we have noted through our experience when taking a "Running Snapshot" with CA Service Management products in a running state:
- CA Service Desk Manager may disconnect from the MS SQL database server, causing a backlog and possibly disrupting users working in CA Service Desk during that time
- CA Service Desk Manager may become unresponsive for a period of time while the virtual memory of the VMware machine is written to a snapshot file, at times causing the CA Service Desk processes to crash and become unrecoverable
- Network connectivity may be interrupted while a snapshot is being taken, disconnecting the CA Service Desk Manager processes from one another. The resulting system outage may not be recoverable and may require a complete cycle of services or a reboot of the server
Things that we have noted through our experience when deleting a snapshot while CA Service Management products are in a running state:
- Web servers such as Apache Tomcat may crash or hang and become unresponsive
- Network performance may degrade, leading to disconnections and a possible outage on the system
- Storage I/O performance may degrade, leading to system processes becoming unresponsive and causing a system outage that may require a reboot of the server to recover