Clarity PPM: One or more jobs are not running immediately or processing

Document ID : KB000003995
Last Modified Date : 24/05/2018
Issue:

The job scheduler appears to have halted and does not process any jobs in the queue.
Jobs remain in the Waiting, Scheduled, Processing, or Cancelled status and never change status.

Cause:

This is caused by a disruption in the job scheduler which runs on the background (BG) service.
If the jobs were not paused, or were not allowed to complete, before the disruption, the job scheduler will not recover.

Any action such as a server upgrade, database refresh, outage, connection issue, or patch install will affect the CA PPM job scheduler running on the BG service.

For SaaS customers, a Service Incident Advisory will be sent to primary site contacts.

For On-Premise customers, please check with your internal IT team for any maintenance performed recently.

Resolution:

To get jobs processing again, perform the following steps in order.

Step 1 - Take the load off the job scheduler

1. Go to Reports and Jobs > Jobs > Scheduled Jobs

a. Filter for Job Status = 'Waiting'
Select the 'Waiting' jobs and click the 'Pause' button.

b. Filter for Job Status = 'Scheduled'
Select the 'Scheduled' jobs and click the 'Pause' button.

2. Filter for Job Status = 'Cancelled'

Make a note of each job's schedule, because the job will need to be re-entered later. A job in the 'Cancelled' status cannot be resumed and therefore must be deleted.

Delete all the 'Cancelled' instances of the following jobs:

-Time Slicing

-Post Timesheets

-Rate Matrix Extraction

-Investment Allocation

-Datamart Extraction/Datamart Rollup

-Load DWH

-Refresh Data Warehouse OData Model (for SaaS customers)
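
The triage in Step 1 can be sketched in code for administrators who keep an offline copy of the Scheduled Jobs list and want to plan the clean-up before touching the UI. This is a minimal sketch, not a CA PPM API: the `JobInstance` record, its field names, and the `plan_step1` helper are hypothetical, and it assumes that only Cancelled instances of the jobs listed above are to be deleted. Pausing and deleting are still performed in the UI.

```python
from dataclasses import dataclass

# Job names copied from the list in Step 1 ("Datamart Extraction/Datamart
# Rollup" is split into its two names).
JOBS_TO_DELETE_WHEN_CANCELLED = frozenset({
    "Time Slicing",
    "Post Timesheets",
    "Rate Matrix Extraction",
    "Investment Allocation",
    "Datamart Extraction",
    "Datamart Rollup",
    "Load DWH",
    "Refresh Data Warehouse OData Model",
})


@dataclass(frozen=True)
class JobInstance:
    """Hypothetical row of a Scheduled Jobs list (not a CA PPM type)."""
    name: str
    status: str    # "Waiting", "Scheduled", "Processing", "Cancelled", ...
    schedule: str  # free-text note of the recurrence, recorded by the admin


def plan_step1(jobs):
    """Return (to_pause, to_delete) following Step 1.

    to_pause:  names of instances in the Waiting or Scheduled status.
    to_delete: (name, schedule) pairs for Cancelled instances of the listed
               jobs, so the schedule can be re-entered in Step 3.
    """
    to_pause = [j.name for j in jobs if j.status.lower() in ("waiting", "scheduled")]
    to_delete = [
        (j.name, j.schedule)
        for j in jobs
        if j.status.lower() == "cancelled" and j.name in JOBS_TO_DELETE_WHEN_CANCELLED
    ]
    return to_pause, to_delete
```

Keeping the schedule next to each cancelled job name gives a ready-made list of what to re-enter once the scheduler is processing again.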

Step 2 - Check if the BG service needs a restart

Run an immediate instance of the Time Slicing job.

a. If the Time Slicing job goes to the 'PROCESSING' state, allow the instance to complete.

Proceed to Step 3.

b. If the Time Slicing job instance does not go to the 'PROCESSING' state immediately, but instead stays in the 'WAITING' or 'SCHEDULED' state indefinitely,
cancel and delete the instance.

Contact the Support team to gather and review the BG and BG-SYSTEM files from all servers in the cluster.

NOTE: If no jobs can be processed, the BG services will need to be restarted.
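
The decision in Step 2 can be written down as a small lookup, which may be useful in a runbook. This is a sketch only: the `next_action` function is hypothetical, the status strings are the ones named in this article, and any other status is rejected rather than guessed at.

```python
def next_action(time_slicing_status: str) -> str:
    """Map the status of the immediate Time Slicing instance to the next step."""
    status = time_slicing_status.strip().upper()
    if status == "PROCESSING":
        # Case a: the scheduler is picking up work.
        return "Allow the instance to complete, then proceed to Step 3."
    if status in ("WAITING", "SCHEDULED"):
        # Case b: the instance is stuck and the scheduler is not processing.
        return (
            "Cancel and delete the instance, then contact Support to review "
            "the BG and BG-SYSTEM files from all servers in the cluster."
        )
    # Statuses not covered by Step 2 are not interpreted here.
    raise ValueError(f"Status not covered by Step 2: {time_slicing_status!r}")
```

For example, `next_action("Waiting")` returns the instruction to cancel and delete the instance and contact Support.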


Step 3 - Resume the jobs

1. Select all the 'PAUSED' jobs and click the 'Resume' button so that they return to their normal schedule.

2. Re-enter the cancelled jobs that were deleted in Step 1.

 

Additional Information:

If the runtime of one or more jobs has increased after migration to a different database, such as Oracle 12.2, the DBA team will need to perform analysis to bring the runtime back down to where it was before the migration.

If a job fails, a subsequent run of the job should complete successfully. By design, the data is rolled back so that prior data is intact.

TIP: Ahead of an upgrade or patch install, put the job scheduler in a no-load state. This prevents orphan jobs, database sequencing issues, and out-of-sync job starts.

After the planned outage completes and the BG service(s) are started, perform the following on the Reports and Jobs > Jobs > Scheduled Jobs page:
a. Run an immediate Time Slicing job and confirm that it completes. Check the Administration > Time Slices page to ensure the 'Last Run' date reflects the current time.
b. Select all PAUSED jobs and click the RESUME button. The jobs will be rescheduled automatically.

These steps ensure a smooth job scheduler startup because they prevent jobs from getting stuck in the 'Processing', 'Waiting', and 'Scheduled' states.
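
The 'Last Run' check in step a amounts to comparing a timestamp with the current time. A minimal sketch follows, assuming the 'Last Run' value has been read off the Time Slices page by hand; the `last_run_is_current` helper and the 15-minute tolerance are illustrative assumptions, not values from CA PPM.

```python
from datetime import datetime, timedelta


def last_run_is_current(last_run: datetime, now: datetime,
                        tolerance: timedelta = timedelta(minutes=15)) -> bool:
    """Return True if last_run is no older than `tolerance` relative to now.

    The default tolerance is an arbitrary example; choose one that matches
    how often Time Slicing is scheduled at your site. A last_run slightly
    ahead of now (clock skew) also counts as current.
    """
    return now - last_run <= tolerance
```

A result of `False` suggests that the immediate Time Slicing run did not update the slices and that the scheduler should be checked again before the paused jobs are resumed.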

If a job in the Scheduled, Waiting, or Processing status is set to Cancelled, the job instance has to be rescheduled, because a cancelled job cannot be resumed.