What happens when the CA Workload Automation DE (DSeries) scheduler/timer is cold started

Document ID : KB000012076
Last Modified Date : 14/02/2018
Introduction:

An abrupt shutdown of CA Workload Automation DE can leave the database in an inconsistent state.

Database inconsistencies can also result from a poorly maintained system, for instance a database with no regular housekeeping activity to clear out old files.

Either situation can require a cold start of the CA Workload Automation DE server. A cold start clears out the inconsistencies and errors in the database.

To cold start CA Workload Automation DE, rename runonce.properties.bak to runonce.properties in the install directory.
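The rename step can be scripted. The following is a minimal sketch, not a vendor-supplied tool: the function name enable_cold_start is hypothetical, and the install directory must be supplied by you because this article does not state its path. The sketch refuses to overwrite an existing runonce.properties so that an earlier, unreviewed file is not lost.

```python
from pathlib import Path


def enable_cold_start(install_dir: str) -> Path:
    """Rename runonce.properties.bak to runonce.properties in install_dir.

    install_dir is an assumption: pass the directory where the server
    is installed on your system.
    """
    backup = Path(install_dir) / "runonce.properties.bak"
    target = Path(install_dir) / "runonce.properties"
    if not backup.is_file():
        raise FileNotFoundError(f"{backup} not found")
    if target.exists():
        # Do not silently replace a file that may hold other settings.
        raise FileExistsError(f"{target} already exists; review it first")
    backup.rename(target)
    return target
```

Call it with the server stopped, for example enable_cold_start("/path/to/install"), where the path is a placeholder for your own install directory.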

Then start CA Workload Automation DE.

Note: In a high availability configuration, do this on both servers.

Question:

What happens behind-the-scenes when a CA Workload Automation DE server is cold started?

What does each cold start level do?

 

Environment:
CA Workload Automation DE (dSeries) 11.3.x and higher
Answer:

First, we need to understand which Workload Automation DE (dSeries) components are affected by a cold start.

These are the scheduler (anything to do with events) and the runtime (the active workload in the system, mainly jobs and applications).

 

a) What happens when a CA Workload Automation DE scheduler is cold started?

  • All timers associated with the scheduler are removed.

The main objective of the timers is to group events that should occur at the same time. For example, all the events that should be processed together (e.g. at 10:00:00 AM) are stored in a single timer.

  • All data in ESP_TDR_DATA table is truncated. This essentially wipes out all scheduled event triggers.
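To illustrate the idea of a timer grouping events that are due at the same moment, here is a toy model in Python. It is only a sketch of the concept: the event names, the function group_events_into_timers, and the data layout are all hypothetical and say nothing about how the product stores timers internally.

```python
from collections import defaultdict
from datetime import datetime


def group_events_into_timers(events):
    """Map each trigger time to the list of event names due at that time."""
    timers = defaultdict(list)
    for name, fire_at in events:
        timers[fire_at].append(name)
    return dict(timers)


# Hypothetical events: two are due at 10:00:00, one at 22:00:00.
events = [
    ("PAYROLL.DAILY", datetime(2018, 2, 14, 10, 0, 0)),
    ("BACKUP.NIGHTLY", datetime(2018, 2, 14, 22, 0, 0)),
    ("REPORTS.DAILY", datetime(2018, 2, 14, 10, 0, 0)),
]
timers = group_events_into_timers(events)

# A scheduler cold start corresponds to discarding every timer.
timers_after_cold_start = {}
```

In this model the two 10:00:00 events share one timer, so there are two timers in total before the cold start and none after it.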

 

b) What happens when a runtime (Distributed Manager) is cold started?

  • All incoming messages to be processed by the Distributed Manager and all outgoing messages to be sent from the Distributed Manager are cleared from the database. These include messages to and from the Agents or other internal dSeries components.
  • The ESP_RT_WOB table is truncated. All active workload is lost. Jobs that were running on an Agent may complete, but the manager will have lost track of them.
  • Any timers associated with the Distributed Manager are cleared. These include job time dependencies, external dependencies, etc.
  • If global.variables.cold.start is set to true (default is false) in runonce.properties, global variables are deleted.
  • All variable dependencies are removed.
  • All resources are reset to their initial values (run time values are lost).
  • If global.variables.cold.start is set to true (default is false), all invalid applications are removed from the ESP_APPLICATION table.
  • All desktop client related data is truncated, including the ESP_WSS_APPL and ESP_WSS_JOB tables.
  • Status message tables are truncated.

 

Now, what happens at each cold start level?

 

start.type.level = -2

Cold start the scheduler

Cold start the runtime

scheduleallevents command issued

 

start.type.level = -1

Cold start the scheduler

Cold start the runtime

scheduleallevents command issued

Application generation count for all applications is set to zero

 

start.type.level = 0 

Cold start the scheduler

Cold start the runtime

 

start.type.level = 1 

Cold start the runtime 

Warm start the scheduler. 

All scheduled events (ESP_TDR_DATA) are preserved and executed. 

All active workload is lost (ESP_RT_WOB).

 

start.type.level = 2 

Cold start the scheduler

Warm start the runtime 

This preserves all active workload.

All scheduled events are lost.
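The five levels above can be summarized as a lookup table. The Python sketch below only restates what this article lists for each level; the class StartLevel, the field names, and the helper preserved are hypothetical names chosen for illustration, not part of the product.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StartLevel:
    scheduler_cold: bool                   # True: scheduler is cold started
    runtime_cold: bool                     # True: runtime is cold started
    schedule_all_events: bool = False      # scheduleallevents command issued
    reset_generation_counts: bool = False  # generation counts set to zero


# Keys are the start.type.level values described in this article.
START_LEVELS = {
    -2: StartLevel(True, True, schedule_all_events=True),
    -1: StartLevel(True, True, schedule_all_events=True,
                   reset_generation_counts=True),
    0: StartLevel(True, True),
    1: StartLevel(scheduler_cold=False, runtime_cold=True),
    2: StartLevel(scheduler_cold=True, runtime_cold=False),
}


def preserved(level: int) -> List[str]:
    """Return what survives a start at the given level."""
    info = START_LEVELS[level]
    kept = []
    if not info.scheduler_cold:
        kept.append("scheduled events (ESP_TDR_DATA)")
    if not info.runtime_cold:
        kept.append("active workload (ESP_RT_WOB)")
    return kept
```

For example, preserved(1) reports that only the scheduled events survive, preserved(2) reports that only the active workload survives, and levels 0, -1 and -2 preserve neither.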

 

start.type.level 1 and 2 are useful when you want to keep part of the system state.

If you want to preserve your scheduled events, use 1; if you want to preserve your active workload, use 2.

Most of the time, a cold start is performed because of data corruption or a poorly maintained database whose performance has degraded to the point where it cannot be recovered.

Your data will dictate which start level to choose.

 

Additional Information: