Gateway logs indicate replication failing due to an error calculating delay

Document ID : KB000042828
Last Modified Date : 13/07/2018
Show Technical Document Details
Introduction:

Solution

Background

The API Gateway monitors the status of replication via the delay (in seconds) of replicated queries between two database hosts. If the Gateway is unable to check this value then it will log an error message that indicates an error calculating the delay in seconds. This error does not necessarily imply a failure in replication. It indicates that the Gateway is unable to check the status of replication which would obscure an error. Replication can always be checked by an administrator by running the SHOW SLAVE STATUS\G; query from the MySQL prompt of the Gateway appliance. This article will prescribe the steps necessary to resolve a common configuration error that may cause this problem.

Presentation

This article is applicable if the following error message occurs in one or both database nodes in the cluster:?WARNING 71 com.l7tech.server.cluster.DatabaseReplicationMonitor : 2381: Replication failing for host/database <node>: error calculating delay

The presence of this error indicates that the Gateway cannot check the replication status and is usually caused by a configuration problem with regards to MySQL user grants.

Troubleshooting

The grant configuration of the Gateway cluster should be examined to determine whether it is the source of the issue. Perform the following procedure to determine if MySQL grants are the cause of this issue:

  1. Log in to the primary?API Gateway database appliance as the?ssgconfig user
  2. Select Option #3: Use a privileged shell (root)
  3. Print out the MySQL user grants: mysql -e "SELECT user,host,password FROM mysql.user;"
  4. Log in to the secondary API Gateway database appliance as the?ssgconfig user
  5. Select Option #3: Use a privileged shell (root)
  6. Print out the MySQL user grants: mysql -e "SELECT user,host,password FROM mysql.user;"

The user grants should have the same information uniformly. The user account names, allowed hosts, and password fields should be the same across all records in the table. It is acceptable for the records to be in different orders but each record should match across both database hosts. An example grant configuration is illustrated below and a similar output should be present on both database nodes in the cluster. Each node in the cluster should have a privileged user (root), an unprivileged user (gateway), and an unprivileged replication user (repluser). These user names may vary by customer implementation but should all be present.

+----------+--------------------------+-------------------------------------------+
| user ? ? | host ? ? ? ? ? ? ? ? ? ? | password ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?|
+----------+--------------------------+-------------------------------------------+
| root ? ? | localhost ? ? ? ? ? ? ? ?| *78377EDD3733CC2048B9FBB4B0E0299C8720D4FA |
| repluser | gateway2.ca.com ? ? ? ? ?| *D98280F03D0F78162EBDBB9C883FC01395DEA2BF |
| gateway ?| % ? ? ? ? ? ? ? ? ? ? ? ?| *78377EDD3733CC2048B9FBB4B0E0299C8720D4FA |
| gateway ?| gateway2.ca.com ? ? ? ? ?| *78377EDD3733CC2048B9FBB4B0E0299C8720D4FA |
| gateway ?| gateway1.ca.com ? ? ? ? ?| *78377EDD3733CC2048B9FBB4B0E0299C8720D4FA |
| gateway ?| localhost ? ? ? ? ? ? ? ?| *78377EDD3733CC2048B9FBB4B0E0299C8720D4FA |
| gateway ?| localhost6 ? ? ? ? ? ? ? | *78377EDD3733CC2048B9FBB4B0E0299C8720D4FA |
| gateway ?| localhost.localdomain ? ?| *78377EDD3733CC2048B9FBB4B0E0299C8720D4FA |
| gateway ?| localhost6.localdomain6 ?| *78377EDD3733CC2048B9FBB4B0E0299C8720D4FA |
+----------+--------------------------+-------------------------------------------+

If the?database nodes do not have the same MySQL grants then the observed error may occur. An example of an incorrectly configured database node is illustrated below:

+----------+--------------------------+-------------------------------------------+
| user ? ? | host ? ? ? ? ? ? ? ? ? ? | password ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?|
+----------+--------------------------+-------------------------------------------+
| root ? ? | localhost ? ? ? ? ? ? ? ?| *78377EDD3733CC2048B9FBB4B0E0299C8720D4FA |

| repluser | gateway2.ca.com ? ? ? ? ?| *D98280F03D0F78162EBDBB9C883FC01395DEA2BF |
+----------+--------------------------+-------------------------------------------+


The absence of the unprivileged database user (gateway) is the source of the problem. Such an absence would result in the observed error message and should be corrected

Resolution

The following SQL queries should be run on all database nodes in the cluster while replication is active:

grant all on ssg.* to gateway@'%';
grant all on ssg.* to gateway@'localhost';
grant all on ssg.* to gateway@'localhost.localdomain';
grant all on ssg.* to gateway@'localhost6';
grant all on ssg.* to gateway@'localhost6.localdomain6';
UPDATE mysql.user SET password = PASSWORD('7layer') WHERE user = "gateway";
flush privileges;?

Please note that the database username and password (emphasized in bold) should be set to the known values in these SQL queries. The queries above use factory default values but the values used in an implementation may vary across environments. Restart the API Gateway appliance after making these changes to force the grants to take effect.

If this does not resolve the problem please look into the /opt/SecureSpan/Gateway/node/default/etc/conf/node.properties file
The 
main.host should point to the primary database in both primary and secondary node. 

Instructions:
Please Update This Required Field