GMU import leads to MySQL Database Deadlocks

Document ID : KB000107796
Last Modified Date : 25/07/2018
Show Technical Document Details
Question:
ENV Details: 
MySql 5-7-19, 5-7-21 
Gateway: 9.3 
GMU : 1.5 
running on AWS 

The customer is facing the following issue: they've set up an API Gateway in Container Form Factor in AWS running on a Kubernetes Cluster connected to an AWS MySQL RDS DB. 

The API Gateway can connect to the Database and it is also possible to create Services and Policies using the Policy Manager. 
When trying to import (migrateIn) a GMU Export Bundle the import fails with the error: "Unexpected end of file from server" 

In Gateway Logs we can see some MySQL Errors like: 

WARNING: SQL Error: 1213, SQLState: 40001 
Jul 18, 2018 1:50:44 PM org.hibernate.util.JDBCExceptionReporter logExceptions 
SEVERE: Deadlock found when trying to get lock; try restarting transaction 
and 
WARNING: SQL Error: 1205, SQLState: 41000 
Jul 18, 2018 1:51:35 PM org.hibernate.util.JDBCExceptionReporter logExceptions 
SEVERE: Lock wait timeout exceeded; try restarting transaction 

In the MySQL DB we can see Deadlocks, which were not present befor the GMU Import: 

mysql tables in use 1, locked 1 
LOCK WAIT 2 lock struct(s), heap size 1136, 2 row lock(s) 
MySQL thread id 194, OS thread handle 47778553521920, query id 158004 11.220.73.50 root update 
insert into folder (name, version, security_zone_goid, parent_folder_goid, goid) values ('mbbmock', 0, null, x'0000000000000000FFFFFFFFFFFFEC76', x'0BAD49997DE5BC52C76F99DFDE3E680A') 
Trx read view will not see trx with id >= 63347, sees < 63199 
------- TRX HAS BEEN WAITING 44 SEC FOR THIS LOCK TO BE GRANTED: 
RECORD LOCKS space id 296 page no 7 n bits 304 index PRIMARY of table `ssgclusterb`.`folder` trx id 63347 lock mode S waiting 
Record lock, heap no 27 PHYSICAL RECORD: n_fields 7; compact format; info bits 0 
0: len 16; hex 0bad49997de5bc52c76f99dfde3e680a; asc I } R o >h ;; 
1: len 6; hex 00000000f6df; asc ;; 
2: len 7; hex d0000001d40110; asc ;; 
3: len 4; hex 80000000; asc ;; 
4: len 7; hex 6d62626d6f636b; asc mbbmock;; 
5: len 16; hex 0000000000000000ffffffffffffec76; asc v;; 
6: SQL NULL; 

------------------------------------------------------------------- 
---TRANSACTION 63199, ACTIVE 165 sec 
1091 lock struct(s), heap size 123088, 42631 row lock(s), undo log entries 79542 
MySQL thread id 188, OS thread handle 47778587682560, query id 175152 11.220.73.50 root 
Trx read view will not see trx with id >= 63199, sees < 63199 
------------------------------------------------------------------- 

Have seen this kind of error before? Do you know how to solve it? 

MySQL Database Engine tested: 5.7.17, 5.7.19, 5.7.21 
DB Instance Class: db.t2.medium 

Custom MySQL Parameters set: 
log_bin_trust_function_creators = 1 
max_allowed_packet=16777216 


For example, if you create a TEST folder and a service on a source system, you can then successfully import the folder and its contents on the target. 

However, we also tried just importing the global policies and fails.with 
  
WARNING: SQL Error: 1452, SQLState: 23000 
Jul 19, 2018 12:56:03 PM org.hibernate.util.JDBCExceptionReporter logExceptions 
SEVERE: Cannot add or update a child row: a foreign key constraint fails (`ssgclusterb`.`policy`, CONSTRAINT `policy_folder` FOREIGN KEY (`folder_goid`) REFERENCES `folder` (`goid`) ON DELETE NO ACTION ON UPDATE NO ACTION) 
Jul 19, 2018 12:56:03 PM org.hibernate.util.JDBCExceptionReporter logExceptions 
WARNING: SQL Error: 1452, SQLState: 23000 
Jul 19, 2018 12:56:03 PM org.hibernate.util.JDBCExceptionReporter logExceptions 
SEVERE: Cannot add or update a child row: a for 
Environment:
ENV Details: 
MySql 5-7-19, 5-7-21 
Gateway: 9.3 
GMU : 1.5 
running on AWS 
Answer:


Having started to look deeper into the components in between the Internet and the Gateway Container. 
I found that the issue was caused by the Idle Timeout setting of the Amazon ELB. The timeout was set by default to 60s.

Since there's no communication between the Gateway and the Client until the migration is finished, the ELB closed the connection. 

I assume that GMU was retrying the request after the ELB closed the connection the first time. The Gatway was still working on the first request, when receiving the 2nd request from GMU. The second request then causes the MySQL Deadlocks. 

I increased the ELB Idle timeout to 1800s and now it's working fine.