SAP Manager Crash when child job handling is activated

Document ID : KB000085690
Last Modified Date : 14/04/2018
Issue:
Error Message :
In universe.log:

========================================================================================================
| 2014-11-05 04:00:08 |ERROR|X|SAP|pid=16029.140167968962304| uxsap_pchain_create | uxsap_disconnect returns in error
| 2014-11-05 04:03:03 |WARN |X|SAP|pid=16029.140167968962304| uxsap_pchain_create | Warning: unable to reconnect to system
| 2014-11-05 04:03:04 |ERROR|X|SAP|pid=16029.140167968962304| uxsap_pchain_create | uxsap_disconnect returns in error
| 2014-11-05 06:55:34 |INFO |X|SAP|pid=16029.140167952176896| uxsap_jnl_del_job | Job (BIBCAST1IE4HNC4B9IQCE56MA1U8S8BL) (06550400) not found in jnl file (0)
| 2014-11-05 06:55:34 |INFO |X|SAP|pid=16029.140168094672640| uxsap_jnl_del_job | Job (BIBCAST5BUDGQ2JSNC6Z2JPO8U3M8TG0) (06550400) not found in jnl file (0)
| 2014-11-05 06:56:04 |ERROR|X|sap|pid=3302.140114861700864| read_new_network_message | Network error -2: Connection closed
| 2014-11-05 06:56:04 |ERROR|X|sap|pid=46045.139761863300864| read_new_network_message | Network error -2: Connection closed
| 2014-11-05 06:56:04 |ERROR|X|END|pid=13037.140419450586880| k_handshakeHello | u_recv_msg in error [-1]: Errno syserror 104: Connection reset by peer (recv returns error)
| 2014-11-05 06:56:04 |ERROR|X|sap|pid=11241.140289622673152| read_new_network_message | Network error -2: Connection closed
| 2014-11-05 06:56:04 |ERROR|X|sap|pid=46049.140048344524544| read_new_network_message | Network error -2: Connection closed
| 2014-11-05 06:56:04 |ERROR|X|sap|pid=2246.140556836165376| read_new_network_message | Network error -2: Connection closed
| 2014-11-05 06:56:04 |ERROR|X|sap|pid=46058.139707936401152| read_new_network_message | Network error -2: Connection closed
| 2014-11-05 06:56:04 |ERROR|X|END|pid=13037.140419450586880| owls_connect_auth | k_connect_auth_timeout(uk3pw1/SAP) returns error [-1]
========================================================================================================
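
To check whether a node shows the same pattern, log lines like those above can be scanned with a script. The following is a minimal sketch in Python, assuming the pipe-delimited layout seen in this excerpt (timestamp, level, a flag column, component, pid, function, message). The field names, the `SYMPTOMS` list and the helper functions are illustrative and are not part of Dollar Universe. A match only indicates similar messages; it does not prove that this defect is the cause.

```python
import re
from typing import Iterable, List, NamedTuple, Optional


class LogEntry(NamedTuple):
    timestamp: str
    level: str
    component: str
    pid: str
    function: str
    message: str


# Layout assumed from the excerpt above:
# | timestamp |LEVEL|flag|component|pid=...| function | message
LINE_RE = re.compile(
    r"^\|\s*(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s*"
    r"\|(?P<level>[A-Z]+)\s*"
    r"\|[^|]*"
    r"\|(?P<component>[^|]*)"
    r"\|pid=(?P<pid>[\d.]+)\|\s*"
    r"(?P<function>[^|\s]+)\s*\|\s*"
    r"(?P<message>.*)$"
)

# Message fragments copied from the excerpt; a hypothetical watch list.
SYMPTOMS = (
    "uxsap_disconnect returns in error",
    "unable to reconnect to system",
    "Network error -2: Connection closed",
)


def parse_line(line: str) -> Optional[LogEntry]:
    """Split one log line into fields, or return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    fields = {name: value.strip() for name, value in match.groupdict().items()}
    return LogEntry(**fields)


def find_symptoms(lines: Iterable[str]) -> List[LogEntry]:
    """Return SAP component entries whose message contains a watched fragment."""
    hits = []
    for line in lines:
        entry = parse_line(line)
        if entry is None or entry.component.upper() != "SAP":
            continue
        if any(fragment in entry.message for fragment in SYMPTOMS):
            hits.append(entry)
    return hits
```

The function accepts any iterable of lines, so an open file handle on a copy of universe.log can be passed directly. Separator lines and lines from other components are ignored.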

The job log may also contain messages such as the following:

========================================================================================================
| 2014-12-24 00:14:20 |INFO |X|SAP|pid=28584.140060886296320| uxsap_sync_wait_for_job_c | Following job (CACRASH) (00075801) not child of father [CACRASH][00075804]. Skipping...
| 2014-12-24 00:14:20 |INFO |X|SAP|pid=28584.140060406044416| uxsap_sync_wait_for_job_c | Following job (CACRASH) (00082401) not child of father [CACRASH][00082400]. Skipping...
========================================================================================================
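
The "not child of father" messages can be extracted in the same way to see which jobs were skipped. This is a minimal sketch that assumes the message wording shown above; the group names (`job`, `job_number`, `father`, `father_number`) are hypothetical labels chosen for illustration, since the article does not say what the numeric values represent.

```python
import re
from typing import Dict, Optional

# Pattern assumed from the two sample messages above.
SKIP_RE = re.compile(
    r"Following job \((?P<job>[^)]*)\) \((?P<job_number>[^)]*)\) "
    r"not child of father \[(?P<father>[^\]]*)\]\[(?P<father_number>[^\]]*)\]"
)


def parse_skipped_child(line: str) -> Optional[Dict[str, str]]:
    """Return the job and father identifiers from a 'Skipping...' message."""
    match = SKIP_RE.search(line)
    return match.groupdict() if match else None
```

Lines that do not contain the message return `None`, so the function can be applied to every line of the job log.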

Analysis of the core file with gdb shows a segmentation fault, with a backtrace similar to the following (example taken on Linux):

========================================================================================================
Program terminated with signal 11, Segmentation fault.
#0 0x0000000000438ceb in uxsap_synchro_sys (p_conteneur_sys=0x744f30) at /etud_objp/indus/du_as/6.2.41/Linux_26_64/src/k_sap_mgr/src/server//uxsap_thread_sync.c:255
255 /etud_objp/indus/du_as/6.2.41/Linux_26_64/src/k_sap_mgr/src/server//uxsap_thread_sync.c: No such file or directory.
in /etud_objp/indus/du_as/6.2.41/Linux_26_64/src/k_sap_mgr/src/server//uxsap_thread_sync.c
(gdb) where
#0 0x0000000000438ceb in uxsap_synchro_sys (p_conteneur_sys=0x744f30) at /etud_objp/indus/du_as/6.2.41/Linux_26_64/src/k_sap_mgr/src/server//uxsap_thread_sync.c:255
#1 0x00000000004389d6 in uxsap_synchro () at /etud_objp/indus/du_as/6.2.41/Linux_26_64/src/k_sap_mgr/src/server//uxsap_thread_sync.c:159
#2 0x000000000043887d in uxsap_synchro_start () at /etud_objp/indus/du_as/6.2.41/Linux_26_64/src/k_sap_mgr/src/server//uxsap_thread_sync.c:104
#3 0x00000000004387c8 in uxsap_sync_thread_trt (arg=0x0) at /etud_objp/indus/du_as/6.2.41/Linux_26_64/src/k_sap_mgr/src/server//uxsap_thread_sync.c:69
#4 0x00007f9eb41707b6 in start_thread () from /lib64/libpthread.so.0
#5 0x00007f9eb3ecbc5d in clone () from /lib64/libc.so.6
#6 0x0000000000000000 in ?? ()
(gdb) quit
========================================================================================================

Patch level detected: Dollar Universe 6.2.00
Product Version: Dollar.Universe 6.2.0

Description : The SAP Manager process crashes when activity is high and SAP child job handling is activated. A core file is generated. The journal file may have to be deleted before the SAP Manager can be restarted.

Conditions necessary to encounter the problem:
-- High activity
-- SAP jobs with child jobs
Environment:
OS: All
OS Version: The core analysis above was taken on Linux, but other operating systems, such as AIX or HP-UX, may also be affected.
Cause:
Cause type:
Defect
Root Cause: N/A
Resolution:
Update to the fix version listed below or a newer version if available.

Fix Status: Released

Fix Version(s):
Component: Application.Server
Version: Dollar.Universe 6.3.41
Additional Information:
Workaround :
N/A