Timeout putting events on internal event handler queue. This usually indicates an overloaded system.

Document ID : KB000110037
Last Modified Date : 08/10/2018
Issue:
After running for about 10 hours, the Coordinator began logging errors that indicate an overloaded system:

2018-06-15 04:12:31,908Z (23:12) [Event Sink Thread Pool Thread 5] INFO com.itko.lisa.stats.MetricControllerImpl - Error retrieving metric
com.itko.lisa.test.EventDeliveryException: Timeout putting events on internal event handler queue. This usually indicates an overloaded system. 
at com.itko.lisa.simulator.EventHandler.testEvent(EventHandler.java:244) 
at com.itko.lisa.stats.MetricControllerImpl.fireTestEvent(MetricControllerImpl.java:580) 
at com.itko.lisa.stats.MetricControllerImpl.eventReceipt(MetricControllerImpl.java:523) 
at com.itko.util.EventThread.execEvents(ThreadedEventSink.java:79) 
at com.itko.util.EventThread.run(ThreadedEventSink.java:59) 
2018-06-15 04:12:31,909Z (23:12) [Event Sink Thread Pool Thread 5] INFO com.itko.lisa.stats.MetricControllerImpl - Error retrieving metric 
java.lang.IllegalStateException: Could not put anything new on the event queue 
at com.itko.lisa.simulator.EventHandler.testEvent(EventHandler.java:249) 
at com.itko.lisa.stats.MetricControllerImpl.fireTestEvent(MetricControllerImpl.java:580) 
at com.itko.lisa.stats.MetricControllerImpl.eventReceipt(MetricControllerImpl.java:523) 
at com.itko.util.EventThread.execEvents(ThreadedEventSink.java:79) 
at com.itko.util.EventThread.run(ThreadedEventSink.java:59) 
Caused by: java.lang.InterruptedException 
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220) 
at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335) 
at java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:380) 
at com.itko.lisa.simulator.EventHandler.testEvent(EventHandler.java:242)
Resolution:
Add these two properties to the local.properties file on the DevTest 10.3.0 Coordinator machine:

lisa.eventPool.maxQueueSize=131070 
lisa.pathfinder.on=false 

The Coordinator will have to be restarted to pick up the new properties.

The timeout is not a bug; it indicates that the system is overloaded. Increasing lisa.eventPool.maxQueueSize does not fix the underlying overload, but it gives the event queue more room, so the timeout errors are delayed.
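
The stack trace shows the failure coming from a timed java.util.concurrent.LinkedBlockingQueue.offer call. The sketch below is not DevTest code. It is a minimal illustration, under the assumption that nothing drains the queue, of why a larger bounded queue only postpones the first timeout. The class name BoundedQueueDemo, the method acceptedBeforeTimeout, and the capacities used are hypothetical and chosen only for the example.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; it does not reproduce DevTest internals.
class BoundedQueueDemo {

    // Offers up to `attempts` events to a bounded queue that no consumer drains
    // and returns how many were accepted before the first offer timed out.
    static int acceptedBeforeTimeout(int capacity, int attempts, long timeoutMillis) {
        LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>(capacity);
        for (int i = 0; i < attempts; i++) {
            try {
                // offer(e, timeout, unit) returns false when no space frees up in time
                if (!queue.offer("event-" + i, timeoutMillis, TimeUnit.MILLISECONDS)) {
                    return i;
                }
            } catch (InterruptedException e) {
                // Restore the interrupt flag and report what was accepted so far
                Thread.currentThread().interrupt();
                return i;
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        // Small queue: fills quickly, so the producer times out early
        System.out.println(acceptedBeforeTimeout(4, 10, 10)); // prints 4
        // Larger queue: same producer, but the timeout arrives later
        System.out.println(acceptedBeforeTimeout(8, 10, 10)); // prints 8
    }
}
```

In both runs the producer eventually times out. The larger capacity only lets more events accumulate first, which matches the note above that raising the queue size delays the errors rather than removing the overload.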