Timeout putting events on internal event handler queue. This usually indicates an overloaded system.

Document ID : KB000110037
Last Modified Date : 14/09/2018
Issue:
After running for about 10 hours, the Coordinator started logging the following errors, which indicate an overloaded system.

2018-06-15 04:12:31,908Z (23:12) [Event Sink Thread Pool Thread 5] INFO com.itko.lisa.stats.MetricControllerImpl - Error retrieving metric 
com.itko.lisa.test.EventDeliveryException: Timeout putting events on internal event handler queue. This usually indicates an overloaded system. 
at com.itko.lisa.simulator.EventHandler.testEvent(EventHandler.java:244) 
at com.itko.lisa.stats.MetricControllerImpl.fireTestEvent(MetricControllerImpl.java:580) 
at com.itko.lisa.stats.MetricControllerImpl.eventReceipt(MetricControllerImpl.java:523) 
at com.itko.util.EventThread.execEvents(ThreadedEventSink.java:79) 
at com.itko.util.EventThread.run(ThreadedEventSink.java:59) 
2018-06-15 04:12:31,909Z (23:12) [Event Sink Thread Pool Thread 5] INFO com.itko.lisa.stats.MetricControllerImpl - Error retrieving metric 
java.lang.IllegalStateException: Could not put anything new on the event queue 
at com.itko.lisa.simulator.EventHandler.testEvent(EventHandler.java:249) 
at com.itko.lisa.stats.MetricControllerImpl.fireTestEvent(MetricControllerImpl.java:580) 
at com.itko.lisa.stats.MetricControllerImpl.eventReceipt(MetricControllerImpl.java:523) 
at com.itko.util.EventThread.execEvents(ThreadedEventSink.java:79) 
at com.itko.util.EventThread.run(ThreadedEventSink.java:59) 
Caused by: java.lang.InterruptedException 
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220) 
at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335) 
at java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:380) 
at com.itko.lisa.simulator.EventHandler.testEvent(EventHandler.java:242)
Resolution:
Add these two properties to the local.properties file on the DevTest 10.3.0 Coordinator machine: 

lisa.eventPool.maxQueueSize=131070 
lisa.pathfinder.on=false 

The Coordinator will have to be restarted to pick up the new properties.
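
Before restarting, it can help to confirm that both properties are actually present in the file. The following is a minimal sketch using only the standard java.util.Properties class. The class name LocalPropertiesCheck and the method hasRecommendedSettings are hypothetical helpers, not part of DevTest, and the path to local.properties is an assumption that has to be supplied for the installation in question.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper, not part of DevTest: checks a properties source
// for the two settings recommended in this article.
final class LocalPropertiesCheck {

    /** Returns true if both settings are present with the values shown above. */
    static boolean hasRecommendedSettings(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Properties.load keeps trailing whitespace in values, so trim before comparing.
        String queueSize = props.getProperty("lisa.eventPool.maxQueueSize", "").trim();
        String pathfinder = props.getProperty("lisa.pathfinder.on", "").trim();
        return queueSize.equals("131070") && pathfinder.equals("false");
    }

    public static void main(String[] args) throws IOException {
        // Pass the path to local.properties as the first argument (path is an assumption);
        // without an argument, an inline sample is checked instead.
        Reader source = args.length > 0
                ? Files.newBufferedReader(Paths.get(args[0]))
                : new StringReader("lisa.eventPool.maxQueueSize=131070 \nlisa.pathfinder.on=false \n");
        try (Reader reader = source) {
            System.out.println("Recommended settings present: " + hasRecommendedSettings(reader));
        }
    }
}
```

The check only confirms that the file contains the two entries. It does not confirm that the running Coordinator has loaded them, which still requires the restart.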

The timeout is not a bug; it indicates that the system is overloaded. Increasing lisa.eventPool.maxQueueSize does not fix the underlying overload, but it gives the event queue more capacity, so the timeout errors take longer to appear. 
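
The stack trace shows the failure occurring in java.util.concurrent.LinkedBlockingQueue.offer. The following sketch illustrates, under simplified assumptions, why a larger bounded queue only postpones the timeout when events are produced faster than they are consumed. It is not DevTest code: the class QueueOverloadSketch, the method firstRejectedEvent, and the simulated consumer (one event removed after every few offers) are all hypothetical and stand in for the real event handler.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration, not DevTest code: a bounded queue with a
// producer that is faster than the (simulated) consumer.
final class QueueOverloadSketch {

    /**
     * Returns the 1-based index of the first event that cannot be queued
     * within the timeout, or -1 if every event is accepted. The consumer is
     * simulated by removing one event after every consumeEvery offers, so it
     * falls behind whenever consumeEvery is greater than 1.
     */
    static int firstRejectedEvent(int capacity, int totalEvents, int consumeEvery) {
        LinkedBlockingQueue<Integer> queue = new LinkedBlockingQueue<>(capacity);
        try {
            for (int i = 1; i <= totalEvents; i++) {
                // Timed offer: returns false if the queue stays full for the whole timeout.
                boolean accepted = queue.offer(i, 1, TimeUnit.MILLISECONDS);
                if (!accepted) {
                    return i;
                }
                if (i % consumeEvery == 0) {
                    queue.poll(); // the slow consumer handles one event
                }
            }
            return -1;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while offering to the queue", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("capacity 4: first rejected event = " + firstRejectedEvent(4, 1000, 2));
        System.out.println("capacity 8: first rejected event = " + firstRejectedEvent(8, 1000, 2));
    }
}
```

In this model, doubling the capacity from 4 to 8 moves the first rejected event from event 8 to event 16. The rejection still happens, because the consumer never catches up; only reducing the event rate or speeding up event handling removes it.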

It may be useful to engage our professional services for further investigation.