ITPAM_RT - Deadlock on db

Document ID : KB000094257
Last Modified Date : 03/05/2018
Show Technical Document Details

Designed a process definition that queries the Process Automation RUNTIME database to check to see if a specific process is already running.   This process works sometimes, and other times fails.   reviewing the logs shows that this process fails due to deadlocks in the database:

2018-05-02 11:23:09,763 WARN [com.optinuity.c2o.service.serviceoperation.jdbcserviceoperation.JDBCWrapper] [abc77795-79b1-4529-99f1-84433d84c1a5] Error reading result set: Transaction (Process ID 432) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction. Transaction (Process ID 432) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
Locking at the database is a normal function of the database. If 2 queries update the same record the data could be corrupted, so the first query gets a lock and subsequent queries to the same location is locked out.
Normally this is all behind the scenes and follow-up queries are locked very briefly while the prior query completes, then the 2nd query moves forwards.

Deadlocks begin to occur as more queries hit the database, and 2 queries end up both waiting for each other to complete. You can get summary from your DBA or through internet searches on deadlocking and why it occurs. 

We have optimized the Orchestrator engine so that outside of situations related to database performance, the Orchestrator should not normally encounter Deadlocking during normal instance execution.
A discussion should be had with the DBA to ensure that this isnt occurring due to the database taking excessive time for all queries.

But, the problem is likely due to the fact that the Process Automation runtime database is almost always being used by the Orchestrator engine while instances are running, so outside queries are going to lock, and very possibly deadlock.

For example:
The Orchestrator starts Process 'Instance123'; creates the database rows in the Runtime Database and has a lock on the rows for 'instance123' as the process is moves through its workflow.

Then another instance is launched that contains a queries to check if 'Instance123' is running.

When the instance reaches out to check the tables for 'instance123' they are locked by the Orchestrator to run the instance.   If the locking continues, the lock could become a deadlock, in which case the database will typically attempt to kill the youngest, or last run query, to allow the initial query to complete. This will result in the deadlock failure messages in the logs and a problem within the instance run that is running the query to check the database.

Unfortunately there is no way we can guarantee the Orchestrator is not locking tables when an external query is submitted.  If you query a table the orchestrator is using, that query will be locked out, and under some certain circumstances that lock will become a deadlock.

Support would recommend:
1.  Work with the DBA to ensure the database is performing optimally;

2.  If the database query is an occasional query that isnt run all that often then some simple exception handling added to the process to retry the database query after a short period would probably be enough to resolve this;

3. If you are frequently running the query our best recommendation would be to use one of the APIs instead of direct database queries into the runtime tables to get data on instance runs.

For instance, the getProcessStatus method;

or one of the Open API REST based methods, such as /v1/processinstances

will allow you to query the engine for details on a particular instance (or all instances) without having to worry about the tables being locked at the database.