Multicast problems showing up in Clarity 12.1.1

Document ID : KB000049546
Last Modified Date : 14/02/2018

Description:

A problem was found in JGroups 2.12.1 (used by Clarity 12.1.1) that can cause intermittent blocking that is unnecessary in the context of Clarity.

Due to CLRT-62608, thread deadlocking may occur in Clarity 12.1.0 or 12.1.1 when JGroups classes running in threads cannot be resolved.

JGroups is used for Clarity's multicast communication. It is most commonly exercised by process engine communication, by cache consistency updates (notifications), and by the admin services when they discover nodes within a cluster.

Thread deadlocking causes a thread of execution to appear hung, because the JVM cannot resolve which thread can advance next.

This can manifest in a number of ways, but in a thread dump it appears as a particular thread being waited upon by many other threads (see the example below). If a thread dump shows the behavior below, it is likely that the environment is running into this problem. On SQL Server based installs, it can also manifest as database blocking.

Example:

** Note: the classes org.jgroups.protocols.FC and org.jgroups.protocols.FRAG2 are good indicators of this problem when they appear in the stack of the thread that other threads are waiting on.

"Post Condition Transition Pipeline 2" daemon prio=10 tid=0x00002aab3ca5b800 nid=0xf25 waiting on condition [0x0000000043dd3000]
java.lang.Thread.State: TIMED_WAITING (parking)

at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2116)
at org.jgroups.protocols.FC.handleDownMessage(FC.java:549)
at org.jgroups.protocols.FC.down(FC.java:423)
at org.jgroups.protocols.FRAG2.down(FRAG2.java:154)
at org.jgroups.protocols.pbcast.STATE_TRANSFER.down(STATE_TRANSFER.java:215)
at org.jgroups.stack.ProtocolStack.down(ProtocolStack.java:894)
at org.jgroups.JChannel.down(JChannel.java:1623)
at org.jgroups.JChannel.send(JChannel.java:724)
at com.niku.union.utility.SimpleMessenger.broadcast(SimpleMessenger.java:325)
at com.niku.union.utility.SimpleMessenger.broadcast(SimpleMessenger.java:308)
at com.niku.union.utility.caching.CacheMessenger.remove(CacheMessenger.java:45)
at com.niku.union.utility.caching.CacheController.remove(CacheController.java:929)
at com.niku.union.utility.caching.CacheController.remove(CacheController.java:882)
at com.niku.security.cache.UserSessionCache.removeFromPersistence(UserSessionCache.java:294)
at com.niku.security.service.AuthenticationService.delete(AuthenticationService.java:217)
at com.niku.bpm.utilities.BpmUtils.logout(BpmUtils.java:93)
at com.niku.bpm.engine.exprevaluator.ExpressionEvaluator.evaluate(ExpressionEvaluator.java:231)
at com.niku.bpm.engine.rules.PostConditionTransitionPipeline.evaluatePostConditions(PostConditionTransitionPipeline.java:199)
at com.niku.bpm.engine.rules.PostConditionTransitionPipeline.execute(PostConditionTransitionPipeline.java:79)
at com.niku.bpm.engine.rules.Pipeline.run(Pipeline.java:221)
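
The check for the indicator classes can be automated when a dump contains many threads. The following is a minimal sketch, not a supported Clarity tool. It assumes a HotSpot-style thread dump in which each thread starts with a line beginning with its quoted name, as in the example above. The function names are hypothetical and chosen for illustration only.

```python
import re

# Indicator classes named in this article.
INDICATORS = ("org.jgroups.protocols.FC.", "org.jgroups.protocols.FRAG2.")


def split_threads(dump_text):
    """Split a thread dump into a list of (thread name, stripped lines)."""
    threads = []
    for line in dump_text.splitlines():
        header = re.match(r'^"([^"]+)"', line)
        if header:
            # A quoted name at column 0 starts a new thread section.
            threads.append((header.group(1), []))
        elif threads:
            threads[-1][1].append(line.strip())
    return threads


def threads_with_indicator_frames(dump_text):
    """Return names of threads whose stack contains every indicator class."""
    hits = []
    for name, lines in split_threads(dump_text):
        frames = [entry for entry in lines if entry.startswith("at ")]
        if all(any(ind in frame for frame in frames) for ind in INDICATORS):
            hits.append(name)
    return hits
```

A thread reported by this function is only a candidate. It still has to be confirmed that other threads are waiting on the same lock address.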

The blocked threads can be found by searching the thread dump for the same lock address:

[clarity1@app830 threaddumps]$ grep 0x00002aaab87ca690 bg2-6.txt

- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
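
When the contended address is not known in advance, the grep above can be generalized by counting how many "parking to wait for" entries refer to each address. This is a minimal sketch under the assumption that the dump uses the same line format as the output above. The function names and the `minimum` threshold are hypothetical.

```python
import re
from collections import Counter

# Matches lines such as:
#   - parking to wait for <0x00002aaab87ca690> (a java.util.concurrent...)
PARKING = re.compile(r"parking to wait for\s+<(0x[0-9a-fA-F]+)>")


def count_waiters(dump_text):
    """Count the 'parking to wait for' entries per lock address."""
    return Counter(PARKING.findall(dump_text))


def most_contended(dump_text, minimum=2):
    """Return (address, count) pairs with at least `minimum` waiters,
    ordered from most to least contended."""
    return [(address, count)
            for address, count in count_waiters(dump_text).most_common()
            if count >= minimum]
```

For example, `most_contended(open("bg2-6.txt").read())` would list the address from the grep above first if it is the most contended one in that file. A high count alone does not prove this problem; the indicator classes should also appear in the stacks of the waiting threads.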

Solution:

WORKAROUND:

Replace $CLARITY_HOME/lib/jgroups-all.jar with the file of the same name attached to CLRT-62608. Any Clarity support engineer can retrieve this jar file and deliver it.
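
The replacement can be scripted so that the original jar is kept for rollback. The sketch below is illustrative only. It assumes that the Clarity services on the node have been stopped before the swap, which this article does not state explicitly. The function name and the `backup_dir` parameter are hypothetical, and the path of the new jar is whatever location the file from support was saved to.

```python
import os
import shutil


def replace_jgroups_jar(clarity_home, new_jar, backup_dir):
    """Back up lib/jgroups-all.jar, then copy the new jar over it.

    Returns the path of the backup copy.
    """
    target = os.path.join(clarity_home, "lib", "jgroups-all.jar")
    if not os.path.isfile(target):
        raise FileNotFoundError(target)
    if not os.path.isfile(new_jar):
        raise FileNotFoundError(new_jar)

    # Keep the backup outside lib so it cannot be loaded by accident.
    os.makedirs(backup_dir, exist_ok=True)
    backup = os.path.join(backup_dir, "jgroups-all.jar.orig")
    shutil.copy2(target, backup)

    # Overwrite the original with the jar delivered by support.
    shutil.copy2(new_jar, target)
    return backup
```

The same steps would need to be repeated on every node in the cluster that runs Clarity services.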

STATUS/RESOLUTION:

Resolution provided with an updated jgroups-all.jar.
Resolved in Clarity 12.1.1 Generic Patch. Reference TEC553491
Resolved in Clarity 12.1.2

Keywords: CLARITYKB, CLRT-62608, clarity12resolved, clarity1212resolved.