PIM 12.81.0.2688: ControlMinder went down on the server again even after disabling the FILE class

Document ID : KB000100169
Last Modified Date : 21/06/2018
Question:
PIM 12.81.0.2688 is not working for us. We also made the following changes to see if they would help.

Changed the restart_enabled token from 0 to 1 in the ReportAgent section:
env config 
sr config accommon.ini section(ReportAgent) token(restart_enabled) 
er config accommon.ini section(ReportAgent) token(restart_enabled) value(1) 

Special Programs 
nr specialpgm /sbin/vxconfigd pgmtype(fullbypass) owner(nobody) 
nr specialpgm /sbin/multipathd pgmtype(fullbypass) owner(nobody) 
nr specialpgm /opt/hp/hp-snmp-agents/nic/bin/hpetfe pgmtype(fullbypass) owner(nobody) 
nr specialpgm /opt/hp/hp-snmp-agents/nic/bin/cmanicd pgmtype(fullbypass) owner(nobody) 
nr specialpgm /opt/hp/hp-snmp-agents/utils/usb-device.sh pgmtype(fullbypass) owner(nobody) 
nr specialpgm /opt/surapps/java/jdk1.7.0_17/bin/java pgmtype(fullbypass) owner(nobody) 
nr specialpgm /opt/surapps/faircom/ctree-ace/v11.2.1.16099.170104/bin/ace/sql/ctreesql pgmtype(fullbypass) owner(nobody) 

To disable the FILE class from selang please try the following: 

selang>SO CLASS-(FILE) 
(localhost) 
Successfully updated CA ControlMinder options 

and selang> so list should show the following: 
FILE : No 

/etc/seos.ini:
Blocking Interval changed to 160

AgentManager = yes, /opt/CA/AccessControlShared/lbin/agent_manager.sh start 
to 
AgentManager = no 

agent_manager_check_enabled = yes 
to 
agent_manager_check_enabled = no 
Answer:
Before the server crashes, you are seeing many communication timeout messages like the ones below, which show seoswd repeatedly trying to restart seosd.
seoswd: Communication time out to seosd. Executing seosd 
seoswd: Communication time out to seosd. Executing seosd 
seoswd: Communication time out to seosd. Executing seosd 

When we introduced multi-threading to seosd (build 2097), we saw issues where seoswd was unable to restart it properly after a communication timeout. Because seoswd could not kill the problem seosd process, it kept retrying. Eventually, the repeated attempts to kill and restart seosd would lead to database corruption and even server hangs or crashes. This behavior is identical to the issue you are experiencing:
seosd[37592]: Database seems corrupted, rebuilding it 
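
To check whether a log excerpt shows the same pattern, a minimal Python sketch such as the following could count the seoswd timeout messages and locate the first database corruption message. The function name summarize_seos_log is hypothetical, and the sketch only assumes that log lines contain the message text quoted above; the rest of the log format is not taken from the product.

```python
import re

# Substrings taken from the messages quoted in this article. Anything else
# about the log format (timestamps, host names) is an assumption.
TIMEOUT_RE = re.compile(r"seoswd: Communication time out to seosd")
CORRUPT_RE = re.compile(r"seosd\[\d+\]: Database seems corrupted")


def summarize_seos_log(lines):
    """Return (timeout_count, index_of_first_corruption_line_or_None)."""
    timeouts = 0
    first_corruption = None
    for index, line in enumerate(lines):
        if TIMEOUT_RE.search(line):
            timeouts += 1
        elif first_corruption is None and CORRUPT_RE.search(line):
            first_corruption = index
    return timeouts, first_corruption


if __name__ == "__main__":
    sample = [
        "seoswd: Communication time out to seosd. Executing seosd",
        "seoswd: Communication time out to seosd. Executing seosd",
        "seoswd: Communication time out to seosd. Executing seosd",
        "seosd[37592]: Database seems corrupted, rebuilding it",
    ]
    print(summarize_seos_log(sample))
```

A run of timeout messages followed by a corruption message would be consistent with the restart loop described here, although it does not prove the cause on its own.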

SE fixed this behavior by disabling multi-threading in seosd with fix T3E7240 (build 2919). With multi-threading disabled, seoswd has been able to successfully restart seosd without running into the issues that occurred with a multi-threaded seosd.
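
The build numbers above can be turned into a quick check. The following Python sketch is illustrative only: the helper names are hypothetical, and it assumes that the last component of the version string is the build number and that every build from 2097 up to, but not including, 2919 runs the multi-threaded seosd.

```python
MULTITHREADED_SINCE = 2097  # build that introduced multi-threading in seosd
FIXED_IN = 2919             # build of fix T3E7240, which disabled it


def build_number(version):
    """Return the last dotted component of a version string as an int."""
    return int(version.strip().split(".")[-1])


def is_in_affected_range(version):
    # Assumption: all builds in [2097, 2919) have the multi-threaded seosd.
    return MULTITHREADED_SINCE <= build_number(version) < FIXED_IN


if __name__ == "__main__":
    print(is_in_affected_range("12.81.0.2688"))  # True
```

Under these assumptions, build 2688 from the question falls inside the affected range, which matches the diagnosis given in this answer.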