Data Aggregator hangs every 24-48 hours due to excessive Java heap memory consumption

Document ID : KB000096492
Last Modified Date : 01/06/2018
Issue:
Every 24-48 hours the Data Aggregator shows synchronization failures in the CA Performance Management UI. Investigation of the Data Aggregator dadaemon Java process shows that it is consuming 100% of its heap memory. Only a restart of the dadaemon service returns the system to normal functionality.
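To confirm the symptom before restarting, heap utilization of the dadaemon JVM can be sampled with the JDK's jstat tool. The following is a minimal sketch and not part of the product. The helper name old_gen_pct is hypothetical, and the pgrep pattern in the usage comment assumes the process command line contains "dadaemon". It also assumes a JDK that provides jstat is installed on the Data Aggregator host and that the command is run as the user that owns the JVM.

```shell
#!/bin/sh
# Hypothetical helper: print old-generation utilization (%) from
# `jstat -gcutil` output supplied on stdin.
# The "O" column is located by header name rather than by fixed position,
# because the column layout differs between JDK versions.
old_gen_pct() {
    awk 'NR == 1 { for (i = 1; i <= NF; i++) if ($i == "O") col = i; next }
         col     { print $col }'
}

# Usage (assumption: the DA JVM command line contains "dadaemon"):
#   pid=$(pgrep -f dadaemon | head -n 1)
#   jstat -gcutil "$pid" | old_gen_pct
```

A value that stays close to 100 across repeated samples would be consistent with the heap exhaustion described above.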
Environment:
CA Performance Management GA release r3.5 and r3.6
Cause:
A filter in a Monitoring Profile was applied during Change Detection. The filter matched a large number of elements, and inefficient handling of the filter caused excessive memory consumption.
Resolution:
Complete Resolution

This is resolved in the following Monthly Update Kits.
  • r3.5 June Monthly Update Kit
  • r3.6 July Monthly Update Kit
The Monthly Update Kits contain defect fixes only. They are only available upon request from Support. To request one of these patch kits please open a new Support Case referencing this Knowledge Base article.

Temporary Workaround

If upgrading isn't an immediate option to resolve this, a restart of the Data Aggregator will allow the system to function again until the next hang.

Stop and restart all services on CAPC, then do a full sync. If this doesn't resolve the issue, then do the following:

1. Stop activemq and the dadaemon on DA. 

/etc/init.d/activemq stop 
/etc/init.d/dadaemon stop 

2. Remove or move the data directory. To remove: 
rm -Rf /opt/IMDataAggregator/apache-karaf-2.3.0/data 
To move: 
mv /opt/IMDataAggregator/apache-karaf-2.3.0/data /opt/IMDataAggregator/apache-karaf-2.3.0/data.old 

3. Remove the local-jms-broker.xml file.
rm /opt/IMDataAggregator/apache-karaf-2.3.0/deploy/local-jms-broker.xml 

4. Start the DA (note that the dadaemon start command also starts the activemq process).
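Steps 2 and 3 above can be sketched as a small shell function. This is an illustrative sketch only: the function name reset_karaf_cache is hypothetical, it takes the "move" option rather than "remove" so the old data directory is preserved, and it refuses to run if a data.old backup already exists. The default path is the one used in this article; pass a different path as the first argument if the installation differs.

```shell
#!/bin/sh
# Hypothetical helper covering steps 2 and 3:
# move the Karaf data directory aside and remove local-jms-broker.xml.
reset_karaf_cache() {
    karaf_home="${1:-/opt/IMDataAggregator/apache-karaf-2.3.0}"

    if [ -d "$karaf_home/data" ]; then
        # Do not overwrite or nest into an earlier backup
        if [ -e "$karaf_home/data.old" ]; then
            echo "backup $karaf_home/data.old already exists" >&2
            return 1
        fi
        mv "$karaf_home/data" "$karaf_home/data.old" || return 1
    fi

    # -f: do not fail if the file has already been removed
    rm -f "$karaf_home/deploy/local-jms-broker.xml"
}

# Full sequence, run as root on the DA host:
#   /etc/init.d/activemq stop
#   /etc/init.d/dadaemon stop
#   reset_karaf_cache
#   /etc/init.d/dadaemon start   # assumed to mirror the stop command in step 1
```

Once the Data Aggregator is confirmed healthy, the data.old directory can be removed manually.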
Additional Information:
Resolved via defect DE362699