An eHealth system that is a member of a cluster had to be rebuilt. No DB was loaded or restored on the system, so it is effectively a new install on the same host that was previously a member of the cluster.
How can the host be added back to the cluster again as an active member?
NOTE: If the situation is the same except that a DB was loaded into the rebuilt system, follow the steps in Knowledge Base article ID TEC515902 to add the system back to the cluster.
NOTE: This solution assumes the rebuilt system is using the same host name and IP addresses as it was before the rebuild. If that is not the case, contact Support for additional assistance.
The overview of the process is as follows:
1 - The rebuilt machine has a machine_id value of zero (0). We need to identify the original machine_id value and update it in the rebuilt system's database.
2 - The rebuilt machine now needs to get a copy of the cluster membership table and related values. We do this by copying that data from one of the other existing active cluster members to the newly rebuilt machine.
To execute this process follow these steps:
1 - Run nhListClusterMember on one of the Back End (BE) Pollers. Note the machine_id value for the RFE in the output. Run one of these commands to gather this information:
nhListClusterMember -all (shows the member table for all cluster members, including the local host it is run on)
nhListClusterMember -host <localHostName> (specify the name of the local host where the command is run; shows only the local system's table info)
2 - Update the machine_id value for the RFE, in the nh_machine table, in the RFE's Oracle DB.
2A - To log in to SqlPlus in eHealth releases 6.3.1 or newer:
- Where $NH_USER/$NH_USER is replaced with the $NH_USER name in both positions.
- Where '@HOSTNAME_DBNAME' should be replaced with the eHealth server's host name and the DB name.
If any of the necessary values are unknown, find them as follows:
The $NH_USER value, if unknown, can be found by running this command:
env | grep -i NH_USER
The Oracle DB name, if unknown, is in the environment variable ORACLE_SID. To find it, run this at the command prompt:
env | grep -i ORACLE_SID
As an alternative, if you run this command:
env | grep -i CONNECT_STRING
You should see an environment variable named NH_DB_CONNECT_STRING. Its value is what you would place after the '@' symbol in the command.
For example, if my user is 'eHealth', my host name is 'EHSVR1' and my DB name is 'DATA', the command would be:
The resulting output should show something similar to this:
SQL*Plus: Release 10.2.0.3.0 - Production on Fri Mar 7 11:32:57 2014
Copyright (c) 1982, 2006, Oracle. All Rights Reserved.
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
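The login argument described in step 2A can be assembled as in the following sketch. It assumes the login uses the standard SQL*Plus form, sqlplus user/password@connect_identifier, and that the password equals the user name, as the $NH_USER/$NH_USER placeholder implies. The helper names ehealth_login and ehealth_login_from_env are hypothetical and are not eHealth commands.

```shell
# Sketch only: build the SQL*Plus login argument from step 2A.
# Arguments: user name, host name, DB name.
ehealth_login() {
    user=$1 host=$2 db=$3
    # User name is used twice: once as the name, once as the password.
    printf '%s/%s@%s_%s\n' "$user" "$user" "$host" "$db"
}

# Variant that reads the environment variables named in the text.
# NH_DB_CONNECT_STRING holds what goes after the '@' symbol.
ehealth_login_from_env() {
    printf '%s/%s@%s\n' "$NH_USER" "$NH_USER" "$NH_DB_CONNECT_STRING"
}

# Example values from the text: user eHealth, host EHSVR1, DB name DATA.
ehealth_login eHealth EHSVR1 DATA
# prints: eHealth/eHealth@EHSVR1_DATA
```

Under these assumptions, the printed string would be passed as the single argument to sqlplus.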
2B - Once connected to the SqlPlus prompt, use this query to check and validate the current table information:
select * from nh_machine;
(verify machine_id value; should be 0 (zero) right now)
2C - Change it from the current value to the correct machine_id value with the command:
update nh_machine set machine_id=<oldNumber_from_nhListClusterMember_output> where machine_id=0;
2D - Verify the update was made correctly with the query:
select * from nh_machine;
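Steps 2B through 2D can be gathered into one short script, as sketched below. The helper name machine_id_sql is hypothetical, and the value 1234 in the example is a placeholder for the machine_id noted in step 1. The commit; line is an addition that is not in the steps above: Oracle does not make an update permanent until it is committed, so it is included here as a precaution.

```shell
# Sketch only: print the SQL for steps 2B to 2D.
# $1 is the original machine_id noted from the nhListClusterMember output.
machine_id_sql() {
    # Reject empty or non-numeric input before building any SQL.
    case $1 in
        ''|*[!0-9]*)
            echo "machine_id must be a positive integer" >&2
            return 1
            ;;
    esac
    # Zero is the placeholder value that is being replaced.
    if [ "$1" -eq 0 ]; then
        echo "machine_id 0 is the value being replaced" >&2
        return 1
    fi
    cat <<EOF
select * from nh_machine;
update nh_machine set machine_id=$1 where machine_id=0;
commit;
select * from nh_machine;
EOF
}

# Placeholder value; substitute the number noted in step 1.
machine_id_sql 1234
```

The printed statements can be pasted at the SqlPlus prompt. The first select should show a machine_id of 0 and the last should show the restored value.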
3 - To complete the process, run the nhJoinCluster command. It may show errors while trying to add itself to the cluster tables of both BE Poller systems, since it is already known to them. Those errors are benign: the command should still pull the member information from the BE Poller system's tables and place it into the rebuilt RFE system's database. Use the command:
nhJoinCluster -host <hostName-of-a-BE_Poller> -useStdDac yes
Once all steps are completed, the rebuilt RFE should be functioning properly as a member of the cluster. To validate that the correct information is set up in the rebuilt system, run this command on that system:
The output for the rebuilt RFE system should match what is returned from the BE Poller systems.
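One way to make this comparison less error-prone is to save the output from each system to a file and compare the files, as in the sketch below. It assumes both outputs were captured with the same command (for example nhListClusterMember -all, as used in step 1) and that matching systems produce identical text apart from spacing. The helper name member_tables_match and the file names are hypothetical.

```shell
# Sketch only: succeed if two saved member-table outputs match.
# $1 and $2 are text files holding the output from each system.
member_tables_match() {
    # diff -b treats runs of whitespace as equal and prints any differing lines.
    if diff -b "$1" "$2"; then
        echo "member tables match"
    else
        echo "member tables differ" >&2
        return 1
    fi
}

# Usage (file names are placeholders):
#   member_tables_match rfe_members.txt be_members.txt
```

Any lines printed by diff point to the entries that still differ between the rebuilt RFE and the BE Poller.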