Tomcat Agent leads to 100% CPU usage

Document ID : KB000100279
Last Modified Date : 06/06/2018
Issue:
This is an example stack trace showing the thread where most of the CPU time is spent:

com.wily.introscope.stat.gatherer.IntegerFluctuatingCounterGatherer.increment(IntegerFluctuatingCounterGatherer.java:77)
com.wily.introscope.agent.stat.IntegerFluctuatingCounterDataAccumulator.IIntegerFluctuatingCounterDataAccumulator_increment(IntegerFluctuatingCounterDataAccumulator.java:82)
com.wily.introscope.agent.trace.ConcurrentInvocationCounter.ITracer_startTrace(ConcurrentInvocationCounter.java:116)
com.wily.introscope.agent.trace.InvocationData.IMethodTracer_startTrace(InvocationData.java:1356)
sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java)
oracle.jdbc.driver.NTFConnection.readFromNetwork(NTFConnection.java:260)
Environment:
Information relevant to all supported agent releases
Resolution:
Let's look at the lowest line in the stack trace where Introscope code is mentioned, together with the line immediately below it:

com.wily.introscope.agent.trace.InvocationData.IMethodTracer_startTrace(InvocationData.java:1356)
sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java)

This identifies NIO as an area to investigate.
We can check toggles-typical.pbd and java2.pbd for areas of interest.

TurnOn: NIOSocketTracing
TurnOn: NIOSocketSummaryTracing
TurnOn: NIOSelectorTracing
TurnOn: NIODatagramTracing
TurnOn: NIODatagramSummaryTracing

Since the read method appears in the stack trace, we check for trace directives on the read method. All of the above tracer groups, apart from NIOSelectorTracing, include the read method:

TraceOneMethodWithParametersIfFlagged: NIOSocketTracing read NIOSocketBandwidthTracer "Input Bandwidth (Bytes Per Second)"
TraceOneMethodWithParametersIfFlagged: NIOSocketTracing read NIOSocketBackendTracer "Backends|System {hostname} on port {port}"
TraceOneMethodWithParametersIfFlagged: NIOSocketSummaryTracing read NIOSocketSummaryBandwidthTracer "NIO|Channels|Sockets:Input Bandwidth (Bytes Per Second)"
TraceOneMethodIfFlagged: NIOSocketSummaryTracing read NIOSocketSummaryConcurrentInvocationCounter "NIO|Channels|Sockets:Concurrent Readers"
TraceOneMethodWithParametersIfFlagged: NIODatagramSummaryTracing read NIODatagramSummaryBandwidthTracer "NIO|Channels|Datagrams:Input Bandwidth (Bytes Per Second)"
TraceOneMethodIfFlagged: NIODatagramSummaryTracing read NIODatagramSummaryConcurrentInvocationCounter "NIO|Channels|Datagrams:Concurrent Readers"

TraceOneMethodWithParametersIfFlagged: NIODatagramTracing read NIODatagramBandwidthTracer "Input Bandwidth (Bytes Per Second)"
TraceOneMethodWithParametersIfFlagged: NIODatagramTracing read NIODatagramCountTracer "Datagrams In"
TraceOneMethodWithParametersIfFlagged: NIODatagramTracing read NIODatagramConcurrentInvocationCounter "Concurrent Readers"
TraceOneMethodWithParametersIfFlagged: NIODatagramTracing read NIODatagramBackendTracer "Backends|System {hostname} on UDP port {port}"
TraceOneMethodWithParametersIfFlagged: NIODatagramTracing read DatagramClampManager

So the first step to confirm the cause is to comment out all NIO-related tracer groups in toggles-typical.pbd.
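As a minimal sketch, assuming the usual PBD convention of a leading # to disable a line, the TurnOn entries in toggles-typical.pbd would look like this once commented out:

#TurnOn: NIOSocketTracing
#TurnOn: NIOSocketSummaryTracing
#TurnOn: NIOSelectorTracing
#TurnOn: NIODatagramTracing
#TurnOn: NIODatagramSummaryTracing

PBD changes normally require a restart of the monitored JVM to take effect, unless dynamic instrumentation is enabled in the agent profile.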
Also observe the expected metric names; for example, compare:

TraceOneMethodWithParametersIfFlagged: NIOSocketTracing read NIOSocketBackendTracer "Backends|System {hostname} on port {port}"
TraceOneMethodWithParametersIfFlagged: NIODatagramTracing read NIODatagramBackendTracer "Backends|System {hostname} on UDP port {port}"

Look at the metrics you are receiving for the agent and check whether any of them mention a UDP port. If they do, you can assume the NIODatagramTracing group needs to be commented out; otherwise it is likely the NIOSocketTracing group.
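For illustration only (the hostname and port numbers here are hypothetical), the two backend tracers produce metric paths of the following form, and only the presence of "UDP port" distinguishes the datagram tracer:

Backends|System dbhost01 on port 1521        (from NIOSocketBackendTracer)
Backends|System dbhost01 on UDP port 514     (from NIODatagramBackendTracer)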

If you can identify one tracer group in particular from the toggles file, check the entries for that tracer group in java2.pbd and select only those that instrument the read method.
Then attempt to isolate the single trace directive that causes the performance issue, as sketched below.
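For example, if NIOSocketTracing were identified as the suspect group, you could disable its read directives in java2.pbd one at a time and retest after each change (again assuming the leading # comment convention):

#TraceOneMethodWithParametersIfFlagged: NIOSocketTracing read NIOSocketBandwidthTracer "Input Bandwidth (Bytes Per Second)"
TraceOneMethodWithParametersIfFlagged: NIOSocketTracing read NIOSocketBackendTracer "Backends|System {hostname} on port {port}"

If CPU usage remains high with the first directive disabled, re-enable it and comment out the second instead.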

Once the culprit is determined, review the metrics you are still receiving. If those metrics are sufficient, no further investigation is required.
If you have a strong need to report the metrics that had to be disabled, please contact CA Support with the agent logs, the agent profile, and details of the JVM if they are not present in the logs. If you can also indicate through screenshots how many responses there are on the socket, it gives us an indication of how busy this communication channel is.
Additional Information:
The same procedure can be used for any performance or CPU issue: look for the lowest mention of Introscope in the stack trace and check the class/method on the line below it.
Then use that information to investigate the PBDs for possible tracer groups that could be commented out, as in the example below.
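Applied to the case in this article, the procedure can be summarised as follows:

Lowest Introscope frame:  com.wily.introscope.agent.trace.InvocationData.IMethodTracer_startTrace(InvocationData.java:1356)
Line below it:            sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java)
Conclusion:               investigate the NIO tracer groups in toggles-typical.pbd and java2.pbd that instrument the read method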