MTP TIM worker process load balancing via the balancer.cnf file configuration.
The 9.7 wiki page has some useful advice for the MTP TIM worker process load balancing.
How to Connect CA APM to Multi-Port Monitor : TIM in Multi-Process Mode
- Multi-Process mode is automatically set
- The out-of-the-box (OOTB) balancer.cnf file sets the number of workers to the number of available CPU cores and assumes "shared=client" as the default mode
If customization of the balancer.cnf is needed, then consider the following:
To check the load, view the "cpu-use" columns for each of the 16 workers in the TIM log files protocolstats*.csv. As a general rule, the TIM is not overloaded as long as the values do not exceed 90% for several successive 5-minute intervals.
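The rule of thumb above can be sketched in Python. This is a minimal illustration, not a TIM tool: it assumes the "cpu-use" values have already been read from the protocolstats*.csv files into one list of percentages per worker, one value per 5-minute interval. The function name, the worker labels, and the choice of 3 intervals for "several" are all assumptions made for the example.

```python
def overloaded_workers(cpu_use_by_worker, threshold=90.0, successive=3):
    """Return the workers whose CPU use stayed above `threshold`
    for at least `successive` consecutive intervals.

    `successive=3` is an assumed reading of "several"; adjust as needed.
    """
    flagged = []
    for worker, samples in cpu_use_by_worker.items():
        run = 0  # length of the current streak above the threshold
        for value in samples:
            run = run + 1 if value > threshold else 0
            if run >= successive:
                flagged.append(worker)
                break
    return flagged


# Hypothetical sample data: one value per 5-minute interval.
samples = {
    "worker-1": [95.0, 96.5, 93.2, 40.0],  # three intervals in a row above 90%
    "worker-2": [95.0, 20.0, 97.0, 30.0],  # spikes, but never consecutive
}
print(overloaded_workers(samples))  # ['worker-1']
```

Isolated spikes, as in the second worker, are ignored; only a sustained run above the threshold is reported.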
Client IP address:
In most environments "shared=client" is adequate. It ensures that the same worker process monitors all packets belonging to the same business transaction instance, and it also distributes load across the workers.
Server IP address:
"shared=server": If the TIM is positioned behind a proxy server or load balancer, so that all traffic appears to come from the same client IP address, then "shared=client" must be changed to the Server IP Address option "shared=server". This ensures that connections are distributed among the 16 TIM workers. Otherwise, all connections will be given to a single worker process and cause the TIM to crash.
- It is possible to assign a specific worker to a server. However, this is limited to one worker per server because packets from a single web server IP cannot be processed across multiple workers.
- Even if "shared=server" is set, with only 2 web servers, only 2 workers will be loaded (at best) no matter how many workers are configured. If the number of different IP addresses used for routing is small compared to the number of workers, some workers will likely remain unloaded, because the hashing algorithm does not distribute a small set of addresses uniformly enough. Unloaded workers still use some base CPU, causing unnecessary overhead. (That is the reason for the restriction of one worker per server.)
- For a datacenter with one highly loaded web server A and lightly loaded web servers B, C, and D, a profile can be created with 2 workers, with web server A assigned to worker 1 and web servers B, C, and D assigned to worker 2. There is no benefit to adding more workers in this scenario, as they would just use additional CPU or would be underloaded.
- The number of workers ("workers N") should not be greater than the number of CPU cores available. Based on CA's internal testing, choosing a smaller number of workers might give the best results. This is because not all the work is parallelized between workers, and each worker imposes a fixed CPU cost that is not shared. Adding more workers will increase base CPU usage on the server even if these additional workers are not loaded.
- Going from 1 worker to 2 has a great benefit, but it will not double the throughput. With each additional worker, the relative throughput benefit gets smaller, while CPU usage still grows linearly. Going from 15 workers to 16 may give no significant additional throughput and will just raise CPU usage.
- The optimum number of workers depends on the traffic profile and distribution and on the number of cores available. The recommendation is to try different settings on larger servers, but on a 16 core system, the optimum number of workers would probably be between 6 and 10.
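The effect of the routing key on worker load can be illustrated with a small Python simulation. This is only a sketch under stated assumptions: the TIM's actual hashing algorithm is not described here, so CRC32 is used as a stand-in, and the function names, IP addresses, and connection count are hypothetical. It models one proxy IP in front of two web servers and counts how many of 16 workers receive any connections when routing by client IP versus server IP.

```python
import zlib
from collections import Counter


def pick_worker(ip, workers):
    # Stand-in hash (CRC32); NOT the TIM's real algorithm, which is unknown here.
    return zlib.crc32(ip.encode("ascii")) % workers


def worker_load(connections, workers, shared):
    # connections: (client_ip, server_ip) pairs; shared: "client" or "server".
    key = 0 if shared == "client" else 1
    return Counter(pick_worker(conn[key], workers) for conn in connections)


# Hypothetical traffic: every connection arrives from the same proxy IP
# and alternates between two web servers.
proxy = "10.0.0.1"
servers = ["192.168.1.10", "192.168.1.11"]
connections = [(proxy, servers[i % 2]) for i in range(1000)]

by_client = worker_load(connections, 16, "client")
by_server = worker_load(connections, 16, "server")

print(len(by_client))  # 1: all connections share the proxy's IP
print(len(by_server))  # at most 2, however many workers are configured
```

With a single client IP, every connection hashes to the same worker. Routing by server IP spreads the load, but only across as many workers as there are distinct server IPs, so the remaining workers stay idle while still costing base CPU.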