What is the work being done by the harvesting process with respect to MOM? Is the high harvest capacity causing calculators to run with high harvest time?

Document ID : KB000031045
Last Modified Date : 14/02/2018

Description:

This article covers EM Performance and the Harvest Cycle.

 

Solution:

Q: What is the work being done by the harvesting process with respect to MOM? Is the high harvest capacity causing calculators to run with high harvest time?

A: It is the reverse: high calculator harvest time drives up the harvest duration, which in turn uses up much of the harvest capacity. The target is for Harvest Duration (and SmartStor Duration, too) to stay well below 3.5 seconds so that ad-hoc queries can still be satisfied.
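As an illustration of the 3.5-second target, the sketch below flags samples of Harvest Duration (or SmartStor Duration) that reach it. The function name, the sample values, and the assumption that the values are already expressed in seconds are hypothetical; if the metric is exported in milliseconds, divide by 1000 first.

```python
# Target from this article: durations should stay well below 3.5 seconds.
HARVEST_TARGET_SECONDS = 3.5

def over_target(samples, target=HARVEST_TARGET_SECONDS):
    """Return (index, value) pairs for samples at or above the target."""
    return [(i, value) for i, value in enumerate(samples) if value >= target]

# Hypothetical Harvest Duration samples in seconds, one per time slice.
harvest_duration = [0.8, 1.2, 4.1, 10.5, 2.0]
print(over_target(harvest_duration))
```

Any sample reported here is a time slice in which the harvest cycle was too slow to leave room for ad-hoc queries.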

 

The harvest cycle is a complex real-time loop with many dependencies: Management Module calculators, JavaScript calculators, built-in calculators, alert processing, etc.

Some of it is broken down in the metrics under Internal and Internal|Harvest. A single JavaScript calculator can raise the harvest time to above 10 seconds, resulting in many aggregated time slices and a totally unresponsive EM cluster. To check this, remove all JS calculators from the scripts. If harvest and calculator time then drop significantly, add the calculators back one by one to identify the culprit(s). The cause could also be too many queries, e.g. from integrations.
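The "add them back one by one" procedure can be sketched as a simple loop. This is only a model of the manual process, not an EM API: `find_culprits`, the `measure` callback, the `max_increase` threshold, and the file names are all hypothetical. In practice, `measure` stands for deploying the given calculators and reading the harvest duration after a few harvest cycles.

```python
def find_culprits(calculators, measure, max_increase=1.0):
    """Re-enable calculators one at a time and report the expensive ones.

    measure(enabled) is a hypothetical callback that returns the observed
    harvest duration in seconds with only the `enabled` calculators deployed.
    """
    enabled = []
    baseline = measure(enabled)      # duration with all JS calculators removed
    culprits = []
    for calc in calculators:
        enabled.append(calc)
        duration = measure(enabled)
        if duration - baseline > max_increase:
            culprits.append(calc)
            enabled.pop()            # keep the offender out before continuing
        else:
            baseline = duration      # accept this calculator, move on
    return culprits

# Simulated per-calculator cost in seconds (made-up numbers for the demo).
costs = {"a.js": 0.1, "b.js": 9.5, "c.js": 0.2}

def simulated_measure(enabled):
    return 0.5 + sum(costs[name] for name in enabled)

print(find_culprits(list(costs), simulated_measure))
```

Leaving each identified culprit out before testing the next calculator keeps one expensive script from masking the effect of the others.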

 

Below are probable causes of EM performance degradation and the corresponding indicator metrics:

 

Agents Sending Too Many Metrics and/or Metric Leaks.

  • Number of Metrics Handled
  • Harvest: Metrics From All Agents (note: before clamping)
  • Number of Historical Metrics (going up)

Calculators/Alerts Matching Too Many Metrics.

  • Calculators: Total Number of Evaluated Metrics
  • Alerts: Total Number of Evaluated Metrics

Too Many Ongoing Queries or Transaction Trace Events.

  • Number of Registered Async Data Queries
  • Connections: Number of Events Processed

Too Many Broad Historical Queries.

  • SmartStor Queries Per Interval / Cached Queries Per Interval
  • Metric Matches per Interval
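
The cause-to-indicator mapping above can be kept as a small lookup table, for example when triaging which metrics look elevated. The sketch below is illustrative only: the metric and cause names are copied from the lists above, while `INDICATORS` and `probable_causes` are hypothetical names, and deciding that a metric is "elevated" is left to the reader.

```python
# Probable causes and their indicator metrics, as listed in this article.
INDICATORS = {
    "Agents sending too many metrics and/or leaks": [
        "Number of Metrics Handled",
        "Harvest: Metrics From All Agents",
        "Number of Historical Metrics",
    ],
    "Calculators/Alerts matching too many metrics": [
        "Calculators: Total Number of Evaluated Metrics",
        "Alerts: Total Number of Evaluated Metrics",
    ],
    "Too many ongoing queries or Transaction Trace events": [
        "Number of Registered Async Data Queries",
        "Connections: Number of Events Processed",
    ],
    "Too many broad historical queries": [
        "SmartStor Queries Per Interval",
        "Cached Queries Per Interval",
        "Metric Matches per Interval",
    ],
}

def probable_causes(elevated_metrics):
    """Return the causes whose indicator metrics include any elevated metric."""
    elevated = set(elevated_metrics)
    return [cause for cause, metrics in INDICATORS.items()
            if elevated & set(metrics)]

print(probable_causes(["Alerts: Total Number of Evaluated Metrics"]))
```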