List of DevTest Error Messages or Service Degrading Indicators to Base Splunk Alerts On

Document ID : KB000117380
Last Modified Date : 12/10/2018
Question:
We currently have a few error messages that Splunk is configured to look for to determine whether the DevTest Coordinator, Registry, and other components are becoming unstable. We would like to know whether these messages are still valid search terms and whether there are additional error messages we should look for to identify potential issues requiring administrator assistance. This is to continue proactive support for our DevTest implementation.

alert1: cadevtest_registry_out_of_memory
index=app_rental_onprem sourcetype=rental_ca_registry java.lang.OutOfMemoryError

alert2: cadevtest_coordinator_overloaded_system
index=app_rental_onprem sourcetype=rental_ca_coordinator "Timeout putting events on internal event handler queue. This usually indicates an overloaded system."

alert3: cadevtest_cannot_stage_the_instances
index=app_rental_onprem sourcetype=rental_ca_registry "java.lang.OutOfMemoryError: GC overhead limit exceeded"

alert4: cadevtest_CoordinatorOutofMemory
index=app_rental_onprem sourcetype=rental_ca_coordinator "java.lang.OutOfMemoryError"

 
Answer:
Suggested messages:

Coordinator alert:
"Error capturing report data for Database: The database writer cannot keep up with the number of events being generated
com.itko.lisa.test.EventDeliveryException: The database writer cannot keep up with the number of events being generated"

Broker alert:
"Heap size approaching limit" 

Enterprise Dashboard or Registry alert:
"java.sql.SQLException: Connections could not be acquired from the underlying database!" 

Registry alert:
"Error in query Solr" 

Registry, Coordinator or VSE alert:
"java.lang.OutOfMemoryError: unable to create new native thread"
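
The suggested messages could be turned into searches in the same form as the existing alerts. The following is only a sketch: the alert names are hypothetical, and the index and sourcetypes are simply reused from the searches in the question. Broker, VSE and Enterprise Dashboard searches are left out because no sourcetype for those components appears above; they would follow the same pattern with whatever sourcetype those logs are indexed under.

```
alert5: cadevtest_coordinator_db_writer_backlog (hypothetical name)
index=app_rental_onprem sourcetype=rental_ca_coordinator "The database writer cannot keep up with the number of events being generated"

alert6: cadevtest_registry_db_connections (hypothetical name)
index=app_rental_onprem sourcetype=rental_ca_registry "Connections could not be acquired from the underlying database!"

alert7: cadevtest_registry_solr_query_error (hypothetical name)
index=app_rental_onprem sourcetype=rental_ca_registry "Error in query Solr"

alert8: cadevtest_native_thread_exhaustion (hypothetical name)
index=app_rental_onprem (sourcetype=rental_ca_registry OR sourcetype=rental_ca_coordinator) "java.lang.OutOfMemoryError: unable to create new native thread"
```

Because the existing alert1 and alert4 search for java.lang.OutOfMemoryError without a qualifier, they would likely also match the "unable to create new native thread" message. A separate search such as alert8 is only useful if that condition should be reported under its own name.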