The AutoSys scheduler was down for several hours. How can we restart it without having the scheduler run the jobs that were scheduled during the outage?

Document ID : KB000011013
Last Modified Date : 14/02/2018

Customer scenario:
The AutoSys scheduler was down for 12 hours and 900+ jobs are queued up to start. The users requested that we not run any jobs scheduled during the outage. If the scheduler is simply restarted, it will begin executing the jobs before anyone can manually issue JOB_ON_HOLD or JOB_ON_ICE events with sendevent, because those events would be stacked behind the older STARTJOB events.

Product: Workload Automation AE 11.x
Platform: Windows, UNIX, Linux
Database: Oracle, Sybase, MS SQL

Possible solutions:

Option #1
Start the scheduler using GlobalAutoHold.
Excerpt from the "CA Workload Automation AE Administrator Guide" version 11.3:

Starting the scheduler in Global Auto Hold mode prevents the system from being flooded with jobs that were scheduled to run during the down time. The scheduler evaluates all the jobs whose starting conditions are met and eligible to run. Instead of starting the jobs, the scheduler puts them in ON_HOLD status.

This approach lets you decide which jobs should run and selectively start them by using the sendevent command to send a FORCE_STARTJOB event. The only way to start a job when you start the scheduler in Global Auto Hold mode is to send a FORCE_STARTJOB event.
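As a rough illustration of the selective restart described above, the following sketch builds one `sendevent -E FORCE_STARTJOB -J <job>` command per approved job. The job names and the helper functions `force_start_commands` and `run_commands` are hypothetical and not part of the product. The script defaults to a dry run that only prints the commands; it assumes `sendevent` is on the PATH of an AutoSys-sourced shell if you choose to execute them.

```python
import shlex
import subprocess


def force_start_commands(job_names):
    """Return one sendevent argument list per job that should be force-started."""
    return [["sendevent", "-E", "FORCE_STARTJOB", "-J", name] for name in job_names]


def run_commands(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            # Assumes sendevent is available in the current environment.
            subprocess.run(argv, check=True)


# Hypothetical job names, chosen by the operator after reviewing the ON_HOLD jobs.
approved_jobs = ["payroll_nightly_load", "gl_extract_box"]
commands = force_start_commands(approved_jobs)
run_commands(commands)  # dry run: prints the commands without executing them
```

Reviewing the printed commands before calling `run_commands(commands, dry_run=False)` keeps the decision about which jobs run with the operator, which is the point of Global Auto Hold mode.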

Once you feel comfortable that the scheduler has caught up to the current time, stop the scheduler and restart without using GlobalAutoHold.

Option #2
If you do not want the scheduler to process the queued, past-due STARTJOB events that were based on time/date, you can resave all the job definitions before restarting the scheduler. This causes AutoSys to re-evaluate each job's next start time: it removes the job's current time-based STARTJOB events from the ujo_event table, recalculates the next future start time, and places a new future STARTJOB event in the ujo_event table.

This second option would NOT prevent jobs from starting when they are triggered by condition dependencies (not date_conditions) that were satisfied during the outage.
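The article does not say how to resave the definitions for Option #2. One possible approach, offered here as an assumption rather than the documented procedure, is to export the definitions as JIL (for example with `autorep -J ALL -q`), change each `insert_job:` statement to `update_job:`, and load the result back through `jil`. The sketch below covers only the text conversion; the helper name `to_update_jil` and the sample job are hypothetical. Whether every exported attribute is accepted on an `update_job` statement in your release should be verified on a test instance first.

```python
import re

# Matches the "insert_job:" keyword at the start of a JIL definition line.
_INSERT_JOB = re.compile(r"^([ \t]*)insert_job:", re.MULTILINE)


def to_update_jil(jil_text):
    """Rewrite insert_job statements as update_job so existing jobs are resaved."""
    return _INSERT_JOB.sub(r"\1update_job:", jil_text)


# Hypothetical sample of exported JIL; real input would come from an autorep dump.
sample = """/* ----------------- demo_job ----------------- */

insert_job: demo_job   job_type: CMD
command: /bin/true
machine: localhost
date_conditions: 1
days_of_week: all
start_times: "02:00"
"""

converted = to_update_jil(sample)
print(converted)
```

Only the statement keyword changes; every other attribute is passed through untouched so the resaved job keeps its original schedule.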