CAUAJM_E_10656 The database <xxxxxx> has encountered a critical error.

Document ID : KB000007148
Last Modified Date : 14/02/2018
Issue:

Several errors like the ones below appear in the application server log when processing a long query for WCC:

CAUAJM_E_18416 Event Server: <xxxxxx> Failed Query: <select jr.startime, jr.endtime, j.job_name, j.joid, jb.job_name, j.box_joid, j.as_group, j.as_applic, j.job_type, jr.run_machine, j.mach_name, j.owner, jr.status, i.timezone, jr.run_num, jr.ntry, jr.over_num, jr.exit_code, e.event, e.status, e.text, e.event_time_gmt, round((e.que_status_stamp-to_date('19700101','YYYYMMDD'))*24*60*60+(select int_val from ujo_alamode where type='gmt_offset')), j.description, j.job_ver, e.evt_num, js.status from ujo_job_runs jr inner join ujo_job ja on ja.joid=jr.joid and ja.job_ver=ja.job_ver and ja.is_active=1 and ja.is_currver=1 inner join ujo_job j on j.joid=ja.joid and j.job_ver=jr.job_ver and j.over_num=jr.over_num join ujo_job jb on jb.joid=j.box_joid and jb.job_ver=jb.job_ver and jb.over_num=-1 join ujo_sched_info i on i.joid=jr.joid and i.job_ver=jr.job_ver and i.over_num=jr.over_num join ujo_job_status js on js.joid=jr.joid left outer join ujo_proc_event e on e.event=e.event and e.status=e.status and e.joid=jr.joid and e.run_num=jr.run_num and e.ntry IN (0,jr.ntry) where (e.que_status = 2 and jr.endtime > 0 and (jr.endtime >= 1496860819 or e.event_time_gmt >= 1502038832) and jr.endtime >= 1496689060) union select jr.startime, jr.endtime, j.job_name, j.joid, jb.job_name, j.box_joid, j.as_group, j.as_applic, j.job_type, jr.run_machine, j.mach_name, j.owner, jr.status, i.timezone, jr.run_num, jr.ntry, jr.over_num, jr.exit_code, e.event, -1, e.text, e.event_time_gmt, 0, j.description, j.job_ver, e.evt_num, js.status from ujo_event e inner join ujo_job ja on ja.joid=e.joid and ja.is_active=1 and ja.is_currver=1 inner join ujo_job j on j.joid=ja.joid and j.job_ver=e.job_ver and j.over_num=-1 join ujo_job jb on jb.joid=j.box_joid and jb.job_ver=jb.job_ver and jb.over_num=-1 join ujo_sched_info i on i.joid = e.joid and i.job_ver=e.job_ver and i.over_num=j.over_num join ujo_job_status js on js.joid=e.joid left outer join ujo_job_runs jr on e.run_num = jr.run_num and e.joid=jr.joid 
and e.ntry IN (0,jr.ntry) where (e.que_status = 2 and jr.endtime > 0 and (jr.endtime >= 1496860819 or e.event_time_gmt >= 1502038832) and jr.endtime >= 1496689060) order by 4 ASC NULLS FIRST , 25 ASC NULLS FIRST , 15 ASC NULLS FIRST , 16 ASC NULLS FIRST , 22 ASC NULLS FIRST , 26 ASC NULLS FIRST > 

CAUAJM_E_18402 ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], [], [], [], [], [] 

CAUAJM_I_18403 Processing OCI function not used(4) 

CAUAJM_E_18400 An error has occurred while interfacing with ORACLE. 

CAUAJM_E_18401 Function <doExecute> invoked from <execute> failed <862> 

CAUAJM_W_10900 The database monitoring system has detected a potential problem with the database. 

CAUAJM_I_10901 The database monitoring system is beginning validation of database connections.

 

The same errors are seen in the event_demon log:

CAUAJM_E_18416 Event Server: <AUTSYSP> Failed Query: <BEGIN :RetVal := ujo_batch_pkg.ujo_batch_get_event (:I_time, :RefCursor); END;> 

CAUAJM_E_18402 ORA-20870: First update of ujo_batch_get_event failed - -600 -ERROR- ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], [], [], [], [], [] 

ORA-06512: at "AEDBADMIN.UJO_BATCH_PKG", line 34 

ORA-06512: at line 1 

CAUAJM_I_18403 Processing OCI function ODEFIN(34) 

CAUAJM_E_18400 An error has occurred while interfacing with ORACLE. 

CAUAJM_E_18401 Function <doExecute> invoked from <bind> failed <555> 

CAUAJM_W_40207 An unexpected problem occurred while fetching events. Continuing..
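
To gauge how often this signature occurs, a log file can be scanned for lines that carry both ORA-00600 and the [kdsgrp1] argument. The following is a minimal sketch, assuming a POSIX shell with grep available; count_kdsgrp1 is a hypothetical helper name, not part of the product, and the log path is passed in by the caller.

```shell
# Hypothetical helper (not shipped with the product): print the number of
# lines in a log file that contain ORA-00600 together with [kdsgrp1].
count_kdsgrp1() {
    # $1: path to the log file to scan (supplied by the caller)
    # grep -c prints the count of matching lines; "|| true" keeps the
    # function's exit status at 0 when the count is 0.
    grep -c 'ORA-00600.*\[kdsgrp1\]' "$1" || true
}
```

Calling count_kdsgrp1 with the path of the application server log or the event_demon log prints the number of matching lines. A line such as the ORA-20870 entry above counts once, because it embeds the ORA-00600 text.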

Environment:
CA WAAE 11.3.5 on Linux with Oracle Database 11g Release 2. It might also happen with any other AutoSys release.
Cause:

The Oracle DBA ran the following command and got the same Oracle error:

SQL> Analyze table AEDBADMIN.UJO_EVENT validate structure cascade online;

Analyze table AEDBADMIN.UJO_EVENT validate structure cascade online

*

ERROR at line 1:

ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [],

[], [], [], [], []

 

The error was caused by a table/index mismatch.

An entry in the index pointed to a row that did not exist in the table.

Recreating the table with export/import also recreated the index, which cleared the mismatch.

The root cause could be an Oracle bug or a storage issue that allowed the inconsistency to occur.

CA suggests reviewing Note 285586.1 - ORA-600 [kdsgrp1], which lists the known Oracle bugs.

CA also suggests upgrading the database to the most recent release of Oracle 11g, 11.2.0.4, and installing the latest Oracle patches.

Resolution:

The DBA exported the table, deleted it, and imported it again:

- expdp system tables=AEDBADMIN.UJO_EVENT directory=DATA_PUMP_DIR dumpfile=autosys_20170612.dmp logfile=autosys_20170612.dmp.log

- impdp system tables=AEDBADMIN.UJO_EVENT directory=DATA_PUMP_DIR dumpfile=autosys_20170612.dmp logfile=impdp_autosys_20170612.dmp.log

 

After that, the error disappeared:

SQL> Analyze table AEDBADMIN.UJO_EVENT validate structure cascade online;

Table analyzed.

SQL> exit

Additional Information:

The Oracle DBA worked with Oracle Support on this problem.

 

Before running expdp and impdp, CA suggests rebuilding the indexes with:

perl reindexDB.pl

If rebuilding the indexes does not fix the ORA-00600 error, run the Oracle expdp and impdp steps described in the Resolution section.
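
One way to tell whether the index rebuild was enough is to rerun the Analyze table statement from the Cause section and inspect its output. The sketch below only classifies that output text; it assumes a POSIX shell, and classify_analyze_output is a hypothetical helper name, not a product or Oracle utility. How the output is captured (for example from a SQL*Plus session) is left to the DBA.

```shell
# Hypothetical helper: classify the text returned by
# "Analyze table ... validate structure cascade online".
classify_analyze_output() {
    # $1: the captured output text of the Analyze statement
    case "$1" in
        # The mismatch is still present: the same internal error is raised.
        *"ORA-00600"*"[kdsgrp1]"*) echo "mismatch" ;;
        # The success message shown in the Resolution section.
        *"Table analyzed."*)       echo "clean" ;;
        # Anything else needs a manual look.
        *)                         echo "unknown" ;;
    esac
}
```

A result of "mismatch" after running reindexDB.pl would suggest moving on to the expdp and impdp steps; "clean" matches the outcome shown in the Resolution section.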