Most tasks submitted are stuck in the "in progress" state

Document ID : KB000030193
Last Modified Date : 14/02/2018

Problem description

Almost all tasks submitted through the GUI are stuck "in progress", even after the server is restarted.

The server.log on the ENTM shows the following errors:

2014-06-26 00:18:00,553 ERROR [org.quartz.impl.jdbcjobstore.JobStoreTX] MisfireHandler: Error handling misfires: Couldn't store trigger: ORA-01438: value larger than specified precision allowed for this column
org.quartz.JobPersistenceException: Couldn't store trigger: ORA-01438: value larger than specified precision allowed for this column [See nested exception: org.quartz.JobPersistenceException: Couldn't store trigger: ORA-01438: value larger than specified precision allowed for this column [See nested exception: java.sql.SQLException: ORA-01438: value larger than specified precision allowed for this column]]
    at org.quartz.impl.jdbcjobstore.JobStoreTX.doRecoverMisfires(JobStoreTX.java:1354)
    at org.quartz.impl.jdbcjobstore.JobStoreSupport$MisfireHandler.manage(JobStoreSupport.java:2449)
    at org.quartz.impl.jdbcjobstore.JobStoreSupport$MisfireHandler.run(JobStoreSupport.java:2468)
* Nested Exception (Underlying Cause) ---------------
org.quartz.JobPersistenceException: Couldn't store trigger: ORA-01438: value larger than specified precision allowed for this column [See nested exception: java.sql.SQLException: ORA-01438: value larger than specified precision allowed for this column]

2014-06-26 14:18:04,123 ERROR [ims.llsdk.role.az.search] isUserMatchForDirectoryRule for user "CN=etrustadmu,OU=UnixGroups,OU=UNIX,DC=dmz,DC=prd" and rule "<MemberRule><Group name="CN=etrustadmview,OU=UnixGroups,OU=UNIX,DC=dmz,DC=prd"/></MemberRule>" throws error: [facility=4 severity=3 reason=0 status=6 message=Unrecognized command]
2014-06-26 14:18:04,124 ERROR [ims.llsdk.role.az.search] [facility=4 severity=3 reason=0 status=6 message=Unrecognized command]
    at com.netegrity.llsdk6.imsimpl.directory.jndi.JndiSearcher.isUserInGroups(JndiSearcher.java:1722)

It was also observed that the audit queue contained over a million messages with a total size of 1.1 GB.

Solution:

  • The first error may explain the Feeder Folder Polling task that is stuck in progress. This is a known issue that was resolved in the 12.6 SP1 CF3 release, so the solution is to apply that CF3 release.
  • The second error may explain the Modify Policy task that is stuck in progress. To resolve it, make sure that the group referenced in the rule, in this case CN=etrustadmview,OU=UnixGroups,OU=UNIX,DC=dmz,DC=prd, exists and that the user CN=etrustadmu,OU=UnixGroups,OU=UNIX,DC=dmz,DC=prd is a member of it (see the verification example at the end of this article).
  • In addition, a message queue of this size can degrade performance to the point of hampering communication, so it is recommended to purge the affected queues by performing the following actions:
    • Start tibemsadmin (located in the TIBCO program folder)
    • Connect to the TIBCO EMS server with connect ssl://<your_machine>:7243 or connect tcp://<your_machine>:7222
    • List the queues with show queues. The output will look similar to the following:

ssl://cmmanager:7243> show queues
  Queue Name                        SNFGXIBCT  Pre  Rcvrs     Msgs    Size
  >                                 ---------    5*     0        0     0.0 Kb
  $sys.admin                        +--------    5*     0        0     0.0 Kb
  $sys.lookup                       ---------    5*     0        0     0.0 Kb
  $sys.undelivered                  +--------    5*     0        0     0.0 Kb
* $TMP$.CMMANAGER.4C055238CE5E9.1   ---------    5      1        0     0.0 Kb
  ac_endpoint_to_server             +--+-----    5*     1        4     1.6 Kb
  ac_server_to_endpoint             +--+-----    5*     3        1     0.6 Kb
  ac_server_to_server               +--+-----    5*     3        0     0.0 Kb
  ac_server_to_server_local         +--------    5*     1    32369    11.9 MB
  com.netegrity.ims.msg.queue       +--+-----    5*     0        0     0.0 Kb
  queue.sample                      ---------    5*     0        0     0.0 Kb
  queue/A                           ---------    5*     0        0     0.0 Kb
  queue/audit                       +--+-----    5*     0     1622     1.1 MB
  queue/B                           ---------    5*     0        0     0.0 Kb
  queue/DLQ                         ---------    5*     0       19     9.2 Kb
  queue/snapshots                   +--+-----    5*     0        0     0.0 Kb
  RuntimeStatusDetailQueue          +--+-----    5*     0        0     0.0 Kb
  sample                            ---------    5*     0        0     0.0 Kb 

  • For each queue that shows a large number of pending messages, issue purge queue <queue_name>, for instance purge queue queue/audit (a full example session is shown after this list)
  • Restart CA PIM on the server
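
For illustration only, the full tibemsadmin command sequence for the sample environment above might look as follows. The host name cmmanager, the port 7243 and the queue names queue/audit and ac_server_to_server_local are taken from the sample output and are assumptions; replace them with the values from your own show queues output:

  tibemsadmin
  connect ssl://cmmanager:7243
  show queues
  purge queue queue/audit
  purge queue ac_server_to_server_local
  exit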

This should resolve the errors.
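
To check the group membership requirement mentioned in the second bullet of the solution, an LDAP search against the user directory can be used. The command below is only a sketch: it assumes an OpenLDAP command-line client is available, and <directory_host> and <bind_dn> are placeholders for your directory server and an account with read access to it:

  ldapsearch -H ldap://<directory_host>:389 -D "<bind_dn>" -W \
    -s base -b "CN=etrustadmview,OU=UnixGroups,OU=UNIX,DC=dmz,DC=prd" member

The returned member attribute values should include CN=etrustadmu,OU=UnixGroups,OU=UNIX,DC=dmz,DC=prd. If the group does not exist or the user is not listed, create the group or add the user before re-submitting the task.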