How to properly restart the CA Performance Management environment

Document ID : KB000032043
Last Modified Date : 25/07/2018
Introduction:

This article lists the specific commands, and the correct order, for restarting the services on each CA Performance Management component server.

Background:

There are times when the system that a CA Performance Manager (CAPM) component runs on requires a restart or reboot.

This is often required for regular system maintenance, OS patches, or a variety of other reasons.

When this is done, some steps are required to maintain the health of the system and ensure a trouble-free restart.

Environment:
All supported CA Performance Management releases
Instructions:
Data Aggregator (DA) Host
The DA runs the dadaemon and activemq services on the DA host. When restarting, it is recommended that you run both the stop and start commands even if the processes are not running.
Stopping and starting the DA dadaemon service on its own is done using the 'service' command.
Run the commands as root, or via sudo as the user that owns the installation (the same user that ran the installation).
Check the status of the dadaemon and activemq services:
service dadaemon status
service activemq status

Stop and then restart the services in the following order:

Stop the dadaemon service:

service dadaemon stop
If it is still running, stop the activemq service:
service activemq stop
Start the dadaemon service:
service dadaemon start
Note that the activemq service should start when the dadaemon service is started.
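
The DA sequence above can be sketched as a small shell function. This is a minimal sketch, not a supported script: the function name restart_da is hypothetical, and it assumes 'service activemq status' exits with 0 only when the service is running (the usual init-script convention, not confirmed here for these services).

```shell
#!/bin/sh
# Sketch only: restart the DA services in the documented order.
# Run as root (or via sudo) on the DA host.

restart_da() {
    # Stop dadaemon first; the stop is issued even if it is not running.
    service dadaemon stop

    # ASSUMPTION: a zero exit status from "status" means "running".
    if service activemq status >/dev/null 2>&1; then
        service activemq stop
    fi

    # Starting dadaemon should also start activemq.
    service dadaemon start
}
```

Calling restart_da on the DA host runs the three steps in order. If the dadaemon cron job is configured, handle it before calling the function.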

Per the user guides, it is recommended that the dadaemon service have a cron job configured to automatically restart it after 60 seconds should it go down on its own. This is very important to note: if the goal is to stop the Data Repository (DR) DB, the DA dadaemon service must be shut down first to close any open connections it has with the DR DB. If this cron job is configured, disable or remove it before stopping the dadaemon, so that the dadaemon does not restart on its own.
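
One way to disable such a cron job temporarily is to comment out its crontab entry. The sketch below is an assumption-laden example: the exact text of the cron entry is not given in this article, so the filter simply matches any uncommented line containing the string "dadaemon". The function name comment_out_dadaemon is hypothetical. Review the crontab by hand before and after using it.

```shell
#!/bin/sh
# Sketch only: comment out crontab lines that mention dadaemon.
# ASSUMPTION: the restart job's crontab line contains the string "dadaemon".

comment_out_dadaemon() {
    # Reads crontab text on stdin and writes the edited text to stdout.
    # Lines that are already comments are left untouched.
    sed '/^[[:space:]]*#/!{/dadaemon/s/^/#/;}'
}

# Example usage as the crontab owner (not executed here):
#   crontab -l > /tmp/crontab.backup
#   crontab -l | comment_out_dadaemon | crontab -
# Restore afterwards with:
#   crontab /tmp/crontab.backup
```

Keeping a backup of the original crontab makes it simple to re-enable the job once the maintenance is complete.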

 
Data Collector (DC) Host
The DC runs the dcmd and activemq services on the DC host. When restarting, it is recommended that you run both the stop and start commands even if the processes are not running.
Stopping and starting the DC dcmd and activemq services on their own is done using the 'service' command.
Run the commands as root, or via sudo as the user that owns the installation (the same user that ran the installation).
Check the status of the dcmd and activemq services:
service dcmd status
service activemq status

Stop the dcmd service:
service dcmd stop
If it is still running, stop the activemq service:
service activemq stop
Start the dcmd service:
service dcmd start
Note that the activemq service should start when the dcmd service is started.
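
The DC sequence, with a check that activemq came back up alongside dcmd, might look like the following. It is a sketch under the same assumption as any status check by exit code: that 'service <name> status' exits with 0 only when the service is running. The function name restart_dc is hypothetical. activemq may need a few moments to start, so a failed check immediately after the start is not conclusive.

```shell
#!/bin/sh
# Sketch only: restart the DC services and confirm activemq is back.
# Run as root (or via sudo) on the DC host.

restart_dc() {
    service dcmd stop

    # ASSUMPTION: a zero exit status from "status" means "running".
    if service activemq status >/dev/null 2>&1; then
        service activemq stop
    fi

    service dcmd start

    # activemq should start with dcmd; report if it has not (yet).
    if ! service activemq status >/dev/null 2>&1; then
        echo "activemq is not running after dcmd start" >&2
        return 1
    fi
}
```

A non-zero return only signals that activemq was not reported as running at the moment of the check; re-run 'service activemq status' after a short wait before investigating further.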
 
Data Repository (DR) Host
The DR host only requires that the DB be stopped and shut down prior to system restarts. The Vertica processes that run the DB can be left as is during a restart. The key detail is ensuring the DR DB is stopped. Also, as noted above in the DA section, the dadaemon service for the DA MUST be shut down before stopping the DR DB. If it is still running when you try to stop the DB via the adminTools utility, the following message is displayed:
"Error: NOTICE 2519: Cannot shut down while users are connected"
If that message appears in a popup in adminTools when trying to stop the DB, check that the dadaemon service is shut down and not running.
To shut down the DR DB (can be done from any active node in a multi-node cluster):
Log into the DR DB host as the dradmin or equivalent DB admin user created during installation.
Go to the /opt/vertica/bin directory and run:
./adminTools
If the environment is configured properly you may alternatively run this from any location:
/opt/vertica/bin/adminTools
Choose option 4 "Stop the Database".
Choose the DB name to stop. Standard environments should only have one entry.
Enter the password and wait for the DB to stop.
When complete, return to the adminTools main menu and select option 1 "View Database Cluster State" to ensure all nodes show as down before they are rebooted.
To start the DB again after the host is rebooted, if it is not restarted automatically, choose option 3 "Start the Database", choose the DB to restart, enter the correct password and wait for the restart to complete.
To check the status once more, choose option 1 "View Database Cluster State" and ensure all nodes show as up after the reboot.
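
For scripted maintenance, the menu steps above have command-line equivalents in Vertica's adminTools via its -t (tool) option. The sketch below assumes that the stop_db, start_db and view_cluster tools are available in the Vertica version bundled with your release, which this article does not confirm. The function names and the ADMINTOOLS variable are hypothetical, and the database name and password are placeholders you must supply. Note that a password passed with -p is visible in the process list.

```shell
#!/bin/sh
# Sketch only: stop/start the DR DB with adminTools command-line tools.
# Run as the dradmin (or equivalent DB admin) user on a DR node.
# ASSUMPTION: "adminTools -t stop_db|start_db|view_cluster" is available.

ADMINTOOLS="${ADMINTOOLS:-/opt/vertica/bin/adminTools}"

stop_dr_db() {
    # $1 = database name, $2 = database password
    "$ADMINTOOLS" -t stop_db -d "$1" -p "$2" || return 1
    # Show the cluster state so every node can be confirmed as down.
    "$ADMINTOOLS" -t view_cluster -d "$1"
}

start_dr_db() {
    # $1 = database name, $2 = database password
    "$ADMINTOOLS" -t start_db -d "$1" -p "$2" || return 1
    # Show the cluster state so every node can be confirmed as up.
    "$ADMINTOOLS" -t view_cluster -d "$1"
}
```

The dadaemon service on the DA must already be stopped before stop_dr_db is called; otherwise the stop is expected to fail with the "users are connected" error described above.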
Note that the DB will not shut down while open connections from the dadaemon exist. Even once the dadaemon is shut down, we have seen instances where the DB did not fully complete its shutdown despite the status in adminTools showing as down. This is known as a 'dirty' DB shutdown, whereby the DB appears to be down to the user and the system but is not fully down. When the restart is then performed, the DB will not restart without restoration using the last known good epoch. There is no simple way to check the DB state outside of the adminTools UI, but patience will often help avoid this problem. When stopping the DB, allow an extra 20-40 minutes, if possible, before restarting the DR host.
 
CA Performance Manager (CAPM) Host
The CAPM host consists of four different primary services. They are:
  • caperfcenter_console
  • caperfcenter_devicemanager
  • caperfcenter_eventmanager
  • caperfcenter_sso
There are also MySQL services. Under normal conditions the MySQL services do not require a manual stop or start cycle.
The four primary services should be stopped and started in a specific order for best results. When stopping the services, do so in this order:
  1. Stop the caperfcenter_console service
  2. Stop the caperfcenter_devicemanager service
  3. Stop the caperfcenter_eventmanager service
  4. Stop the caperfcenter_sso service
Restart the services in the reverse order. Note the delays between the startup steps:
  1. Start the caperfcenter_sso service
  2. Wait one minute
  3. Start the caperfcenter_eventmanager service
  4. Start the caperfcenter_devicemanager service
  5. Wait one minute
  6. Start the caperfcenter_console service
To check the status of an individual CAPM service, run:
service caperfcenter_<serviceName> status
For example:
service caperfcenter_console status
To stop an individual CAPM service, run:
service caperfcenter_<serviceName> stop
For example:
service caperfcenter_console stop
To start an individual CAPM service, run:
service caperfcenter_<serviceName> start
For example:
service caperfcenter_console start
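
The stop and start order for the four CAPM services, including the one-minute delays, can be sketched as two shell functions. The function names and the CAPM_START_DELAY variable are hypothetical. The delay defaults to the 60 seconds described above.

```shell
#!/bin/sh
# Sketch only: stop and start the CAPM services in the documented order.
# Run as root (or via sudo) on the CAPM host.

# Seconds to wait between startup stages (one minute per the steps above).
CAPM_START_DELAY="${CAPM_START_DELAY:-60}"

stop_capm() {
    # Stop order: console, devicemanager, eventmanager, sso.
    for svc in console devicemanager eventmanager sso; do
        service "caperfcenter_$svc" stop
    done
}

start_capm() {
    # Start order is the reverse of the stop order, with delays.
    service caperfcenter_sso start
    sleep "$CAPM_START_DELAY"
    service caperfcenter_eventmanager start
    service caperfcenter_devicemanager start
    sleep "$CAPM_START_DELAY"
    service caperfcenter_console start
}
```

Running stop_capm followed by start_capm performs a full restart cycle. The MySQL services are deliberately left out, since they do not normally need a manual restart.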
 
Additional Information:
See Knowledge Base Article ID KB000010703 for a video showing similar details.

https://comm.support.ca.com/kb/how-to-restart-the-ca-performance-management-ca-performance-center-processes/kb000010703