Hi, we're seeing intermittent high response times in our PROD gateway cluster for specific services. After drilling down in our monitoring tool, we've identified that the method com.l7tech.server.transport.http.TimeoutInputStream:read sometimes takes a considerable amount of time. What is this method responsible for?
Inbound requests are attached to an input stream which wraps a TimeoutInputStream. This stream handles and tracks all read operations and the number of bytes read. Every read (i.e. native call to socket read) is tracked on entry and exit by the TimeoutInputStream.
From a timeout perspective, the logic is as follows:
There is a dedicated "InputTimeoutThread" that loops through all unread TimeoutInputStreams in the queue and checks whether any blocked read has taken longer than io.timeout. If so, it marks that stream as "timed-out", so that when the blocked read finally exits, or the next read operation on that TimeoutInputStream enters, it fails with a TimeoutIOException (i.e. "Stream timeout").
InputTimeoutThread also checks the read rate by measuring how many bytes per second were actually read. This check starts only after the configured io.rateTimeout milliseconds have elapsed (in a sense guaranteeing there is enough data to make a valid rate check).
A stream is removed from the InputTimeoutThread queue once all bytes are read, an exception is thrown during a read, the stream is closed, or the stream is marked as timed-out.
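To make the blocked-read check easier to picture, here is a minimal Java sketch of the pattern described above. It is not the gateway's actual code: the class names WatchedInputStream and InputTimeoutSweeper, the method names, and the use of a plain IOException in place of TimeoutIOException are all assumptions made for illustration. Only the blocked-read check is modelled; the rate check is left out.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for TimeoutInputStream; illustrative only. */
class WatchedInputStream extends InputStream {
    private final InputStream delegate;
    private volatile long readEnteredAt = -1;  // -1 means no read is in progress
    private volatile boolean timedOut = false;

    WatchedInputStream(InputStream delegate) {
        this.delegate = delegate;
    }

    @Override
    public int read() throws IOException {
        failIfTimedOut();                      // the next read entering a marked stream fails
        readEnteredAt = System.currentTimeMillis();
        int b;
        try {
            b = delegate.read();               // may block on the socket
        } finally {
            readEnteredAt = -1;                // record read exit
        }
        failIfTimedOut();                      // a blocked read that finally exits fails too
        return b;
    }

    /** Called by the sweeper: marks the stream if a read has been blocked too long. */
    boolean checkBlockedRead(long nowMillis, long timeoutMillis) {
        long entered = readEnteredAt;
        if (entered >= 0 && nowMillis - entered > timeoutMillis) {
            timedOut = true;
        }
        return timedOut;
    }

    private void failIfTimedOut() throws IOException {
        if (timedOut) {
            throw new IOException("Stream timeout"); // assumption: stands in for TimeoutIOException
        }
    }
}

/** Hypothetical stand-in for the InputTimeoutThread's queue and loop body. */
class InputTimeoutSweeper {
    private final long timeoutMillis;          // plays the role of io.timeout
    private final Set<WatchedInputStream> queue = ConcurrentHashMap.newKeySet();

    InputTimeoutSweeper(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    void register(WatchedInputStream stream) {
        queue.add(stream);
    }

    int size() {
        return queue.size();
    }

    /** One pass over the queue; a watchdog thread would call this repeatedly. */
    void sweep(long nowMillis) {
        for (WatchedInputStream stream : queue) {
            if (stream.checkBlockedRead(nowMillis, timeoutMillis)) {
                queue.remove(stream);          // timed-out streams leave the queue
            }
        }
    }
}
```

Note that in this sketch the sweeper never interrupts the blocked read itself; it only sets a flag that the reading thread notices later. If the real implementation behaves the same way, a long time spent inside TimeoutInputStream:read would point to the client or network delivering the request body slowly rather than to work done by the wrapper.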
The following cluster properties (or optional system properties) can tweak the TimeoutInputStream triggers:
1. io.timeout (or com.l7tech.server.timeout) defines the IO timeout for incoming request reads (in milliseconds) - default: 60000
2. io.rateLimit (or com.l7tech.server.rateLimit) defines the minimum permissible rate for incoming requests (in bytes per second) - default: 1024; used in combination with io.rateTimeout
3. io.rateTimeout (or com.l7tech.server.rateTimeout) defines the IO timeout for incoming request rate checking (in milliseconds) - default: 60000
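As a rough illustration of how io.rateLimit and io.rateTimeout might interact, the following Java sketch works through the arithmetic with the default values. The class RateCheck and the method isTooSlow are hypothetical names, and the exact comparison the gateway performs (including what happens exactly at the io.rateTimeout boundary) is an assumption, not something stated above.

```java
/** Hypothetical helper showing the rate-check arithmetic; not gateway code. */
class RateCheck {
    static final long DEFAULT_RATE_LIMIT = 1024;        // io.rateLimit, bytes per second
    static final long DEFAULT_RATE_TIMEOUT = 60_000;    // io.rateTimeout, milliseconds

    /**
     * Returns true when a stream still being read has an average rate below
     * rateLimit, but only once rateTimeout milliseconds have elapsed.
     */
    static boolean isTooSlow(long bytesRead, long elapsedMillis,
                             long rateLimit, long rateTimeout) {
        if (elapsedMillis < rateTimeout) {
            return false;                               // too early for a valid rate check
        }
        // Equivalent to bytesRead / (elapsedMillis / 1000.0) < rateLimit,
        // rearranged to stay in integer arithmetic.
        return bytesRead * 1000 < rateLimit * elapsedMillis;
    }
}
```

Under these assumptions and the default values, a request body that is still being read after 60 seconds would need to have delivered at least 1024 x 60 = 61,440 bytes to pass. For example, isTooSlow(50_000, 60_000, 1024, 60_000) is true (about 833 bytes per second), while isTooSlow(10, 5_000, 1024, 60_000) is false because the check has not started yet.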