Policy Server Fail-over within a cluster (Legacy_Onyx KB Id: 223414)

Document ID : KB000054883
Last Modified Date : 14/02/2018

Description:

When one policy server in a policy server cluster fails, WebAgents that list that policy server first in the SmHost.conf do not fail over to the remaining available policy server as expected.

Example

Configuration
WebAgentA points to pservera and then pserverb in the SmHost.conf
WebAgentB points to pserverb and then pservera in the SmHost.conf
pservera and pserverb are in a policy server cluster that should fail over to another cluster only if both policy servers fail. Both SmHost.conf files point to the same host configuration object, and this is the host configuration object that defines the cluster.

WebAgents are SiteMinder APACHE 2.0 WebAgents, Version 6.0 QMR03 HF2
Policy Servers are Version: 6.0; Update: 03; Build: 818

Test Scenario 1
pservera fails and pserverb is still working
WebAgentA requests fail 100% of the time
WebAgentB requests work 100% of the time

Test Scenario 2
pserverb fails and pservera is still working
WebAgentA requests work 100% of the time
WebAgentB requests fail 100% of the time

Test Scenario 3
pservera and pserverb are both working
WebAgentA requests work 100% of the time
WebAgentB requests work 100% of the time


We can repeat all three test scenarios over and over again. Can you confirm whether this is the expected result?


Solution:

The policy server cluster in question in the policy export that you provided is configured with a failover threshold percent of 100%.

The SiteMinder Policy Server Administration Guide explains the functionality of the failover threshold percent setting as follows:

<documentation>

12. In the Failover Threshold Percent field, enter a percentage of Policy Servers that must be active and click Apply. If the percentage of active servers falls below the percentage you specify, the cluster failovers to the next available cluster in the list of clusters. The Policy Server User interface automatically calculates the Failover Threshold values displayed in the column to the right of the lists of servers in each cluster. The number that appears in the Failover Threshold column is the minimum number of servers in the cluster that must be available. If the number of available servers falls below the specified number, failover occurs. When you set the Failover Threshold Percentage, it applies to all clusters that use the Host Configuration Object.

</documentation>

Because this cluster is configured with a failover threshold percent of 100%, SiteMinder fails over to the next policy server cluster associated with the host configuration object whenever fewer than 100% of the policy servers are available. In this case, if even one policy server is down, SiteMinder attempts to fail over to the next cluster. However, in the policy store uploaded to the case, only one policy server cluster is associated with this host configuration object, so there is nowhere to fail over to once even a single policy server is down.

You could likely resolve this by lowering the failover threshold percent below 50%. If you were to set the threshold to something like 40%, then with one policy server up and one down within the cluster, SiteMinder will not try to fail over to another cluster.
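The effect of the threshold can be illustrated with a short Python sketch. This is only a model of the rule quoted from the Administration Guide, not SiteMinder code: the function names are hypothetical, and rounding a fractional server count up to the next whole server is an assumption, since the guide does not state how the Policy Server User Interface rounds.

```python
def min_servers_required(threshold_percent: int, cluster_size: int) -> int:
    """Minimum number of servers that must be available in the cluster.

    Assumption: a fractional result is rounded up to the next whole server.
    Integer arithmetic is used to avoid floating-point rounding surprises.
    """
    return (threshold_percent * cluster_size + 99) // 100


def cluster_fails_over(threshold_percent: int, cluster_size: int, available: int) -> bool:
    """True when the available servers fall below the required minimum,
    which is the condition for failing over to the next cluster."""
    return available < min_servers_required(threshold_percent, cluster_size)


if __name__ == "__main__":
    # Two-server cluster with one server down, as in Test Scenarios 1 and 2.
    for percent in (100, 40):
        needed = min_servers_required(percent, 2)
        triggered = cluster_fails_over(percent, 2, available=1)
        print(f"threshold {percent}%: {needed} of 2 required, "
              f"cluster failover with 1 server up: {triggered}")
```

Under these assumptions, a 100% threshold requires both servers, so a single failure triggers cluster failover, while a 40% threshold requires only one server and the surviving policy server keeps serving requests.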

However, if you're going to have only two policy servers in place, it may make more sense to configure the PolicyServer attribute in the host configuration object as multi-valued, with each policy server set as one of the two values, and also to enable failover in that same host configuration object. This will circumvent any cluster-related complications. The only caveat that should accompany this step is the following, also from the Policy Server Administration Guide:

<documentation>

14. Note the Policy Server parameter in the Parameter Name column. The value of the Policy Server parameter is overridden by the contents of the Clusters tab. This parameter should be commented for clarity (preceded by a # symbol), since its value does not apply as long as Policy Servers are specified in the Clusters tab.
Note: In order for the value of the Policy Server parameter in the General tab to apply, no Policy Servers should be specified in the Clusters tab. If clusters are configured and you decide to remove the clusters in favor of a simple failover configuration using the Policy Server parameter in the General tab, be sure to delete all Policy Server information in the Clusters tab.

</documentation>

In other words, the simple failover configured through the Policy Server parameter takes effect only when no cluster is defined in the host configuration object in question.
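As a rough illustration of what simple (non-cluster) failover is expected to do, the Python sketch below walks an ordered list of policy servers and returns the first one that is available. It is a simplified model, not the WebAgent's actual implementation, and the function name and availability check are hypothetical.

```python
from typing import Callable, Optional, Sequence


def pick_policy_server(
    ordered_servers: Sequence[str],
    is_available: Callable[[str], bool],
) -> Optional[str]:
    """Return the first available server in the configured order,
    or None when every listed server is down."""
    for server in ordered_servers:
        if is_available(server):
            return server
    return None


if __name__ == "__main__":
    down = {"pservera"}  # Test Scenario 1: pservera has failed

    def is_up(server: str) -> bool:
        return server not in down

    # WebAgentA lists pservera first, then pserverb.
    print(pick_policy_server(["pservera", "pserverb"], is_up))
    # WebAgentB lists pserverb first, then pservera.
    print(pick_policy_server(["pserverb", "pservera"], is_up))
```

In this model, both agents end up on pserverb when pservera is down, which is the behavior the test scenarios expected to see.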

A third alternative would be to create an additional policy server cluster, so that when the first goes down, SiteMinder can fail over to the second.