A default compile of Apache does not produce an efficient web server/web agent deployment

Document ID : KB000051211
Last Modified Date : 14/02/2018

Description:

By default, Apache builds on Unix in "pre-fork" mode for its request processing. This mode does not use threads, and it creates an inefficient interface to the policy server.

The better solution is to select "worker" mode for the Apache build, which uses a threaded processing module. However, sites are often stuck with "pre-fork" mode because they cannot rebuild immediately. This article includes some methods for minimizing the impact.

Solution:

Overview of Problem

Regarding the choice of processing module, the Apache documentation itself recommends:

The server can be better customized for the needs of the particular site.
For example, sites that need a great deal of scalability can choose to
use a threaded MPM like worker, while sites requiring stability or
compatibility with older software can use a prefork. In addition, special
features like serving different hosts under different userids (perchild) can be provided.

So it is a good goal to change a setup from the pre-fork (non-threaded) module to the worker (threaded) processing module.

With SiteMinder, however, the pre-fork module causes several additional issues, mainly:

  1. Many excess socket connections from web server to the policy server

    Each Apache process creates its own connections to the policy server, and Apache tends to renew those processes regularly; at peak times a process will often handle one transaction and then die. Each process connects as specified in the Host Config Object, so each process may start up, perform a handshake with the policy server, connect two sockets to each policy server in the cluster for load balancing, process one request, and then die.

    It is not unusual to have 90-200 Apache processes active at one time, each with two connections to each policy server, and the web server continually cycling through dozens of different child processes. With 100 active processes, two sockets per policy server, and two policy servers, for example, that is 100 x 2 x 2 = 400 sockets from a single web server.

  2. The Policy Server will do a lot of additional processing

    Each initial connect request from a web agent is a high-priority message, and a new policy server thread is allocated to handle it (up to a maximum). So in these Apache/pre-fork environments the policy server often has MAX_THREADS allocated even though most of the threads are actually idle.

    In addition, there is the overhead of handling the handshake for each process.

  3. A lot of sockets on the policy server are in the TIME_WAIT state

    SiteMinder is geared to expect repeated use of the same socket, but when Apache closes a child process it does not appear to send a final close to the policy server. As a rule with Apache pre-fork clients, the policy server therefore tends to have a lot of TIME_WAIT connections, and a lot of IDLE timeout messages as the policy server closes them every 10 minutes (a quick way to check this is sketched after this list).

    This can strain the MAX_CONNECTIONS setting for both SiteMinder and the hardware: although a lot of sockets are open, most have timed out and are simply waiting for closure.

  4. A restart of the policy server can cause a flood of new requests on the policy server

    Since each process is independent, when a policy server restarts they will all (perhaps 100 on each web server) try to reconnect at once, which can cause an avalanche of new connections that overloads the policy server. What tends to happen next is that the clients time out, the sockets remain open, and the clients try new connections, which usually makes the startup process worse. If your policy server fails several times before it restarts successfully, this may be what is happening in your case.
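
To confirm that issue 3 above applies, the TCP states of the policy server's sockets can be counted directly on the policy server host. A minimal sketch, assuming a Linux netstat and the 44441-44443 ports used in the Host Config Object example later in this article (field positions differ on other Unix flavors):

    # Count policy server sockets by TCP state (ESTABLISHED, TIME_WAIT, ...).
    # Adjust the ports to match your own Host Config Object.
    netstat -an | grep ':4444[1-3]' | awk '{print $6}' | sort | uniq -c

A large TIME_WAIT count relative to ESTABLISHED is the signature of pre-fork child processes dying without cleanly closing their policy server connections.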

Why is worker mode better?

Apache using "worker" mode still uses process, but each process also has about 20 worker threads. The thread pool can go up and down, but the number of process is both smaller and from a SiteMinder viewpoint, there is only one handshake per process, and all the threads in a pool share the connections to the policy server so they actually get re-used, and also load balanced between policy servers.

Why is Apache pre-fork the default?

Unfortunately, the out-of-the-box setting for an Apache build is pre-fork, because it will compile on any Unix platform. It works on all Unix platforms because it does not use any threads. However, threads are now fairly consistent across most platforms, so hopefully the default will be changed in the near future.
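
To confirm which mode a given Apache binary was built with, the binary itself can be queried (it may be installed as httpd or apache2 depending on the platform):

    # List the modules compiled into the server; prefork.c or worker.c
    # in the output shows which processing module is built in.
    httpd -l

    # Alternatively, print the build settings, including the server MPM.
    httpd -V | grep -i mpm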

I am stuck with pre-fork for now; what can I do?

The best solution is to use Apache worker mode, not pre-fork; however, that usually requires a rebuild of the Apache executable (sketched below).
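
For reference, the processing module is chosen when Apache 2.0/2.2 is configured from source; the exact flags vary by version, and Apache 2.4 can instead load MPMs as shared modules. A minimal sketch, with an illustrative install prefix:

    # Build Apache with the worker MPM instead of the default pre-fork.
    ./configure --with-mpm=worker --prefix=/usr/local/apache2
    make
    make install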

If you are stuck with pre-fork mode for a while, there are two major settings that will help reduce the problems it causes:

  1. Change the Host Config Object

    For example, the agent log entry for the trusted host looks like this:

    hostname=''. maxsocketsperport='20'.minsocketsperport='2'. newsocketstep='2'.policyserver='policyserver1,44441,44442,44443'.policyserver='policyserver2,44441,44442,44443'.requesttimeout='60'
    The minsocketsperport and newsocketstep values should be reduced to 1:

    minsocketsperport='1'.newsocketstep='1'.
    This ensures that each of your 100 Apache child processes will initially establish only one connection to each policy server, rather than two. Since each process handles only one request at a time, there is no advantage to having more than one connection to each policy server. That will halve your connections.

    In addition, if you can set the Host Config Object to perform failover, it is often easiest to have two or more HCO objects, each failing over in a separate sequence through the policy server farm (see the sketch after this list). That will further reduce both the startup time for each new Apache process and the number of connections.

  2. Change the socket timeout

    The default setting is 10 minutes. Since the majority of these socket connections are Apache ones waiting for a timeout, delivering the timeout earlier will reduce the overall number of sockets open at any one time.

    Halving the timeout to 5 minutes reduces the total number of TIME_WAIT sockets resulting from idle connections from the web agents.
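
Putting the Host Config Object change together, the revised settings from the example above might read as follows (the host names and ports are simply those from the earlier example):

    minsocketsperport='1'. newsocketstep='1'. maxsocketsperport='20'.
    policyserver='policyserver1,44441,44442,44443'.
    policyserver='policyserver2,44441,44442,44443'. requesttimeout='60'

For the failover variant, one HCO could list policyserver1 first and a second HCO could list policyserver2 first, with the web servers split between the two, so that each Apache child opens a single connection and startup traffic is spread across the policy server farm.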