Performance issues related to large Backlog count

Document ID : KB000088051
Last Modified Date : 24/04/2018
Issue:
Error Message :
N/A

Allowing the Backlog to grow to 3,000 or more Jobs is known to cause performance degradation. This can result in one or more of the following behaviors, among others:
  • Jobs stuck in QUEUED status
  • Slow Job status changes
  • Finished Jobs are slow to drop into History
  • Slowness in overall Java Web Client refresh
  • Jobs not processing
  • A continued increase in backlog count
  • Awapi times out during startup
  • Inability to launch the Java Web Client
Investigation

Applications Manager is a 32-bit, single-threaded application with only a limited number of Java threads available for database/agent communication, sub-var resolution, Job scheduling/status updates, condition handling, Java Web Client refreshes, etc.

It is highly recommended to avoid any situation where thousands of Jobs are scheduled into the Backlog at once, or where the Backlog is allowed to grow to 3,000 or more Jobs. Either situation can use up all available Java threads, causing one or more of the above symptoms.

NOTE: Keep in mind that the 3,000 Job threshold is ONLY an estimate. This number can vary greatly depending on hardware, network, and database performance, as well as the types of Jobs in the Backlog.
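
As an illustration of treating the threshold as an estimate rather than a hard limit, the following is a minimal sketch of a check that classifies a Backlog count into "ok", "warning", or "critical". It is not part of Applications Manager: the function name, the warning level at 80% of the threshold, and the way the Backlog count is obtained are all assumptions, and the threshold should be tuned for each environment.

```python
def backlog_status(backlog_count, threshold=3000, warn_at=None):
    """Classify a Backlog count against an estimated threshold.

    threshold: estimated Job count at which degradation is expected
               (3,000 in this article, but environment dependent).
    warn_at:   hypothetical early-warning level; defaults to 80% of threshold.
    """
    if backlog_count < 0:
        raise ValueError("backlog_count must not be negative")
    if warn_at is None:
        # Integer arithmetic keeps the default warning level exact.
        warn_at = threshold * 4 // 5
    if backlog_count >= threshold:
        return "critical"
    if backlog_count >= warn_at:
        return "warning"
    return "ok"
```

A site that sees symptoms at lower counts could call the function with a smaller threshold, for example backlog_status(count, threshold=1500).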
Environment:
OS Version: N/A
Cause:
Limited or no available Java threads due to a large Backlog count
Resolution:
Avoid any situation where thousands of Jobs are scheduled into the Backlog at once, or where the Backlog is allowed to grow to 3,000 or more Jobs. If an Applications Manager instance is found to be in this situation and the Backlog count continues to grow, a number of actions can be taken, such as:
  1. Deactivating high frequency Jobs
  2. Deleting high frequency Jobs stuck in SELF-WAIT
  3. Pausing Queues and releasing them slowly one by one
  4. Deleting Jobs that can be rescheduled to a later time
  5. Spreading out Job Schedules to run throughout the day instead of all within a small timeframe
  6. Cleaning up Jobs that are older than 1 Virtual Day
  7. Re-engineering Jobs to remove unnecessary use of conditions and notifications
  8. Increasing Java memory
  9. Clearing out the Backlog via SQL (please contact Support for the SQL)
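
To illustrate action 5 above, the following is a minimal sketch of how start times for a group of Jobs could be spaced evenly across a window instead of all falling at the same moment. It only computes candidate start times; it does not call Applications Manager, and the function name and Job names are hypothetical. The resulting times would still need to be entered into the Job Schedules by an administrator.

```python
from datetime import datetime, timedelta


def spread_start_times(job_names, window_start, window_end):
    """Return {job_name: start_time} with starts spaced evenly in the window.

    The first Job starts at window_start; the last Job starts one step
    before window_end, so no start time falls outside the window.
    """
    if window_end <= window_start:
        raise ValueError("window_end must be later than window_start")
    if not job_names:
        return {}
    # Dividing a timedelta by an int yields the spacing between starts.
    step = (window_end - window_start) / len(job_names)
    return {
        name: window_start + i * step
        for i, name in enumerate(job_names)
    }


if __name__ == "__main__":
    # Hypothetical Job names, spread over a four-hour window.
    plan = spread_start_times(
        ["JOB_A", "JOB_B", "JOB_C", "JOB_D"],
        datetime(2018, 4, 24, 0, 0),
        datetime(2018, 4, 24, 4, 0),
    )
    for name, start in plan.items():
        print(name, start.strftime("%H:%M"))
```

Even spacing is the simplest choice; Jobs with dependencies or fixed deadlines would need their own constraints applied before any such plan is used.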


Fix Status: No Fix

Fix Version(s):
N/A