Jobs stop launching "OBT_RECDAT_ERROR"

Document ID : KB000084599
Last Modified Date : 29/01/2019
Issue:
Error Message :
|ERROR|X|IO |pid=p.t| OBtFile_GetRecord_nodat | o_btree_file_get err=[OBT_RECDAT_ERROR], file=[/duas_folder/data/exp/u_fmsb60]
Or:
|ERROR|X|IO |pid=p.t| u_read_direct | LBtreeFile_GetRecord_nodat error [OBT_RECDAT_FATAL_ERROR] for file [C:\Program Files (x86)\ORSYP\DUAS\NODE\data\exp\u_fmsb60]
Or:
|ERROR|X|IO |pid=p.t| u_read_direct | LBtreeFile_GetRecord_nodat error [OBT_RECDAT_FATAL_ERROR] for file [/duas_folder/data/exp/u_fmcx60]

Since Dollar Universe 6.3, the IO server may crash after errors containing the keywords "OBT_RECDAT_ERROR" or "OBT_RECDAT_FATAL_ERROR" are logged in universe.log.
Other symptoms may include:
- jobs cannot be submitted
- jobs remain in the same status
- some commands fail

The exact symptoms depend on which data file is impacted (the file named in the error message in universe.log).
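
To see which data files are impacted, the error lines can be extracted from universe.log. The following is a minimal Python sketch, not a product tool: the function name `impacted_files` is hypothetical, and the pattern only assumes the two message layouts shown above (`file=[...]` and `for file [...]`).

```python
import re
from collections import Counter

# Matches both layouts shown in this article:
#   err=[OBT_RECDAT_ERROR], file=[<path>]
#   error [OBT_RECDAT_FATAL_ERROR] for file [<path>]
PATTERN = re.compile(
    r"\[(OBT_RECDAT(?:_FATAL)?_ERROR)\].*?file\s*=?\s*\[([^\]]+)\]"
)


def impacted_files(lines):
    """Count OBT_RECDAT errors per data file name (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            # Keep only the last path component, for Unix or Windows paths.
            name = re.split(r"[\\/]", match.group(2))[-1]
            counts[name] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python scan_log.py /path/to/universe.log
    with open(sys.argv[1], errors="replace") as handle:
        for name, count in impacted_files(handle).most_common():
            print(f"{name}: {count} error(s)")
```

The file names reported this way are the ones to reorganize, or to list in the dump setting described under Resolution.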
Environment:
OS: All
Cause:
Cause type: Defect
Root Cause: Corruption of the idx (btree) file due to unknown reasons
Resolution:
Since Dollar Universe 6.9.01 Application Server, the IO server attempts a reorganization of the impacted data file if its index file is found to be corrupted while the node is running.

If the issue can be reproduced at will, a maintenance node setting (U_IO_IDX_DUMP) is available to activate an index dump (data file and log) for the specified files.

Set it under "Node Settings" - Maintenance on the impacted node, giving a comma-separated list of the files for which index debug/dump mode should be activated.
Example for the files u_fmsb60, u_fmhr60 and u_fmhs60:

Activate dump of index files in case of error in area X: u_fmsb60,u_fmhr60,u_fmhs60

After enabling this setting, restart the node for it to take effect.
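
When the list of files comes from a log scan or a script, the value for the setting can be assembled programmatically. This is a small Python sketch under assumptions: `build_idx_dump_value` is a hypothetical helper, and because it is not documented here whether the setting tolerates spaces or duplicates, the sketch removes both to match the example above.

```python
def build_idx_dump_value(names):
    """Join data file names into a comma-separated value (hypothetical helper).

    Names are stripped and de-duplicated, keeping first-seen order.
    Base names only: paths, commas and embedded spaces are rejected.
    """
    seen = []
    for raw in names:
        name = raw.strip()
        if not name:
            continue
        if any(ch in name for ch in ",/\\ "):
            raise ValueError(f"not a plain data file name: {raw!r}")
        if name not in seen:
            seen.append(name)
    return ",".join(seen)


if __name__ == "__main__":
    print(build_idx_dump_value(["u_fmsb60", "u_fmhr60", "u_fmhs60"]))
```

The resulting string is what goes into the "Activate dump of index files in case of error" field.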

If the issue occurs again:
  • The problematic files (.dta/.idx) are saved in the corresponding data directory (e.g. data/exp/u_fmsb60.dta_2018012214241300)
  • Dumps are saved in the log directory (e.g. bt_dump_u_fmsb60_2018012214241300.log)
  • Traces can be found in universe.log when the issue occurs:
    |INFO|X|IO |pid=p.t| reorganize | Reorganize file C:\Automic\DUAS\DU6901\data\exp\u_fmsb60
    |INFO|X|IO |pid=p.t| reorganize | File C:\Automic\DUAS\DU6901\data\exp\u_fmsb60 reorganized.
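
To match a saved data file with its dump log, the names can be parsed as in the Python sketch below. It is illustrative only: `parse_artifact` is a hypothetical helper, the patterns are inferred from the two example names above, and reading the first 14 digits of the suffix as YYYYMMDDhhmmss is an assumption (the meaning of the last two digits is not documented here).

```python
import re
from datetime import datetime

# Patterns inferred from the example names in this article.
SAVED = re.compile(r"^(?P<name>.+)\.(?:dta|idx)_(?P<stamp>\d{16})$")
DUMP = re.compile(r"^bt_dump_(?P<name>.+)_(?P<stamp>\d{16})\.log$")


def parse_artifact(filename):
    """Return (kind, data_file_name, timestamp), or None if not recognized."""
    for kind, pattern in (("saved", SAVED), ("dump", DUMP)):
        match = pattern.match(filename)
        if match:
            stamp = match.group("stamp")
            # Assumption: the first 14 digits are YYYYMMDDhhmmss.
            when = datetime.strptime(stamp[:14], "%Y%m%d%H%M%S")
            return kind, match.group("name"), when
    return None


if __name__ == "__main__":
    for example in (
        "u_fmsb60.dta_2018012214241300",
        "bt_dump_u_fmsb60_2018012214241300.log",
    ):
        print(example, "->", parse_artifact(example))
```

Entries that share the same data file name and timestamp belong to the same occurrence of the issue.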

 

Additional Information:
Workaround:
Manually launch a reorganization of the impacted data file, or run a full offline reorganization.