Best Practise: First steps preparing Server Logfiles for further analysis

Document ID : KB000088807
Last Modified Date : 14/04/2018
Show Technical Document Details
Issue:

First steps preparing Server Logfiles for Analysis in case of:

  • Production Down
  • Performance problems
  • Crash of Server Processes
  • Error messages in the messages window like "error in Serverroutine"

 

Resolution:

Analysis

Information needed for analysis:

  • Logfiles of the UC4 Server (CP and WP) from the time the problems occurred +/- an hour. (default location temp folder of the Automation Engine)

  • Trace files from the time the error occurred. (default location temp folder of the Automation Engine)

  • Dump files from the time the error occurred.

  • The exact error message that appeared in the messages window.

Logfile Analysis

  • Combine the WP Logfiles using the Tracemix tool
  • Combine the CP Logfiles using the Tracemix tool
  • Combine the Tracefiles using the Tracemix tool

! Note please separate WP and CP logfiles and WP and CP Tracfiles when combining them with the Tracemix tool.

! Note dont forget to rename or change the location of the output file of the Tracemix tool called "TraceOutput.txt" because this file will be overwritten by creating new Tracemixes, TraceOutput.txt is the default name the Tracemix tool uses.

(I usually rename the output file to WPlogtraceout.txt for WP-logfile-mixes, CPlogtraceoutput.txt for CP-logfile-mixes, WPtraceoutput for WP-trace-mixes and CPtraceoutput for CP-trace-mixes)

To combine the Log and Tracefiles simply select all the files you want to have combined and select "send to Tracemix".

.png

 

After the Tracemix tool has finished combining these files it will open a popup-window with the text "Done!".

Drag and drop the created Tracemix file that contains the mix of the WP logs into the RS File Viewer (RSVIEW.exe)

The easiest way to search and highlight lines in the RS File - Viewer is by using the select lines function:

.png

Enter the Textsearch to highlight all found entries:

.png

  • Compare the Initialdata and Automation engine Version

Search for the following entries:

To find the Automation Engine Version ->  U0003400 Autotmic Server 'UC4PROD#WP' version '9.00A218-441' started.

To find the Initial Data Version             ->  U0003621 Version of Automic database objects: '9.00A216-341'

! - Note in Automic Version 9, Initial Data, Automation Engine, Userinterface and Utility must be from the same servicepack

! - Note that the Agents of a Servicepack can only run with an Automation Engine and Initial Data of the same or an older Servicepack level.

  • Search for and Highlight Primary Mode settings:

PrimaryMode=

Automic reccomends to use Primary Mode = 1 setting in the UCSRV.ini.

The Primary mode "1" setting can provide a performance win in high system load situations. 

  • Search for and Highlight Operating System:

U0003395 Operating system

Operating System that Automic is installed on.

!Note Automic Nonstop Servers Operating Systems need to be of the same Versions.

  • Search for and Highlight Database Information:

 U0003535 DB INFO

 U0003535 DB INFO: 'DATABASE-NAME         = Automic'
 U0003535 DB INFO: 'DBMS-NAME             = Microsoft SQL Server'
 U0003535 DB INFO: 'DBMS-VER              = 09.00.1399'
 U0003535 DB INFO: 'ODBC-VER              = 03.80.0000'
 U0003535 DB INFO: 'DRIVER-NAME           = SQLNCLI.DLL'
 U0003535 DB INFO: 'DRIVER-ODBC-VER       = 03.52'
 U0003535 DB INFO: 'DRIVER-VER            = 09.00.4035'
 U0003535 DB INFO: 'ODBC-API-CONFORMANCE  = 2'
 U0003535 DB INFO: 'TXN-CAPABLE           = 2'
 U0003535 DB INFO: 'DEFAULT-TXN-ISOLATION = 2'
 U0003535 DB INFO: 'TXN-ISOLATION-OPTION  = 0000000000101111B ( 47)'
 U0003535 DB INFO: 'POS-OPERATIONS        = 0000000000011111B ( 31)'
 U0003535 DB INFO: 'POSITIONED-STATEMENTS = 0000000000000111B (  7)'
 U0003535 DB INFO: 'SCROLL-CONCURRENCY    = 0000000000001111B ( 15)'
 U0003535 DB INFO: 'SCROLL-OPTIONS        = 0000000000010111B ( 23)'
 U0003535 DB INFO: 'STATIC-SENSITIVITY    = 0000000000000101B (  5)'
 U0003535 DB INFO: 'LOCK-TYPES            = 0000000000000001B (  1)'
 U0003535 DB INFO: 'ODBC_TRACE            =0'

! Note compare Database Information with the settings in the connection string: SQLDRIVERCONNECT=ODBCVAR=NNNNNNRN,DSN=UC4;UID=dbo;PWD=???;Mars_Connection=YES

! Note incorrect settings in the connection sting can lead to System Failure.

  • Search for and Highlight Working Process Startup Performance test and Reference values: 

U0003533 Check of data source finished: No errors. Performance CPU/DB: '35953115'/'1871'
U0003544 Reference values tested with Windows 2003 on XEON 1500 MHz: CPU 813865, DB 470

Automic runs a Performance Test on Startup of the Server processes, the DB value states how many times a simple UPDATE, SELECT DELETE statement was able to be performed in a second.

!Note if the DB value drops below 100 it could be an indication for Database or Network (Connection to the Database) failures.

  • Search for and Highlight 'MQ_CHECK_TIME'  and  'MQ_BLOCK_COUNT':

 'MQ_CHECK_TIME'

 'MQ_BLOCK_COUNT'

Interval (seconds) in which message queues are checked (only Oracle databases).

From Version 9 Docu:

Setting for message-queue reorganization (only Oracle databases) 
The Automic Variable UC_SYSTEM_SETTINGS now contains two new settings which can be used to check the Automic Server's message queues regularly and have them reorganized if required. The new validity keywords are 
MQ_CHECK_TIME  and MQ_BLOCK_COUNT.

!Note this setting is responsible for the message-queue reorganization in Oracle, misconfiguration can lead to performance issues.

 

  • Search for and Highlight the Server Options settings:

 'SERVER_OPTIONS'

The Server_Options setting in the UC_SYSTEM_SETTINGS Varible in client 0.

!Note misconfiguration can lead to failure.

  • Search for and Highlight the setting 'WP_MIN_NUMBER' and Nodename 

Search for:

'WP_MIN_NUMBER'  -> U0011847 'WP_MIN_NUMBER' set to 'UC4_1=5'.

Nodename                                 -> nodename = uc4_1

The WP_MIN_NUMBER is a varible set in the UC_SYSTEM_SETTINGS and declares how many of the active Server Processes should be used as Working Processes

The maximum number of work processes that can perform as dialog processes can be specified for each "node name" in the Automic Variable UC_SYSTEM_SETTINGS (key WP_MIN_NUMBER). The same "node name" can be used in all the INI file for achieving if the Server processes are not limited to one computer in order to define a setting which is applicable to all computers.

!Note this setting can cause performance issues in an high load performance environment, when setting is not optimized and for example not enough Dialog Work Processes are being defined.

  • Search for and highlight Time critical DB calls:

"U0003524", "U0003525", and "===>"

For example:

U0003524 ===> Time critical DB call!       OPC: 'SLCO' time: '1:396.488.118'
U0003525 ===> 'SELECT TOP 1 MQPWP_PK FROM MQPWP with (NOLOCK) WHERE MQPWP_SchedTime<=? ORDER BY MQPWP_Priority,

Time critical DB calls state the time the Database needs to respond to a statement sent by Automic, the time from which on a time critical DB call is protocolled into Automic can be set in the SERVER_OPTIONS Key in the UC_CLIENT_SETTINGS.

!Note Time critical DB calls are the main indication for performance issues, and have the following main 3 root causes:

    1. Network performance issues (especially the connection between Automic Server and Database)
    2. Broken Indexes on Database
    3. Large unused spaces on Automic Tables 
  • Search for and highlight changes in Primary work Process:

U0003396 The Server did not respond for 'seconds' seconds. Server "Serverproces" is now the primary Server.

This message states that the Primary Work Process has been changed due to receiving no response from the previous PWP for a certain period of time that is defined in the UCSRV.ini under "alivetimeout"

alivetimeout= Duration in seconds in which the Automic.Nonstop-Server expects a message from the primary work process. If this time span is exceeded, the Automic.Nonstop-Server becomes the primary work process.

!Note, Changes in PWP can be an essential key to performance and production down analysis.

  • Search for and highlight oracle error messeges:

Search for: ORA-,

For example:

U0003590 DB error: 'OCIStmtExecute', 'ERROR   ', '', 'ORA-00942: table or view does not exist'.

Automic Protocolls responding Oracle error messages into the logfiles.

!Note, Oracle error messages can be an essential key in performance and production down analysis, if the Database fails, Automic fails.

 

 

  • Search for and highlight changes in Primary work Process:

Search for:; U0003590, DB error:, 'SQLExecDirect', ,'ERROR ,

For example:

U0003590 DB error: 'SQLExecDirect', 'ERROR   ', '42000', 'Internal Query Processor Error: The query processor encountered an unexpected error during execution.'

Automic Protocolls responding SQL error messages into the logfiles.

!Note, SQL error messages can be an essential key in performance and production down analysis, if the Database fails, Automic fails.

  • Search for and highlight Deadlocks:

Search for: Deadlock

For example:

U0003592 Status: '40001' Native error: '1205' Msg: 'Transaction (Process ID 67) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.'
U0000006 DEADLOCK

deadlock is a situation in which two or more competing actions are each waiting for the other to finish, and thus neither ever does, a deadlock is a situation which occurs when a process enters a waiting state because a resource requested by it is being held by another waiting process.

Time critical Database calls can be responsible for a deadlock situation.

!Note, Deadlock messages can be an essential key in performance and production down analysis.

 

  • Search for and highlight Socket Errors in the CPlogfiletracemix:

Search for:; Socket Error

for example:

U0003413 Socket call 'bind' returned error code '98'.
                 Address already in use

Socket errors are Operating System related Network errors, protocolled by Automic and written into the Logfiles.

!Note, socket errors can be essential for Performance and production down analysis, Automic cant perform without a Network connection.

 

Please save your RS File VIEWER Analysis into the Ticket folder and summarize your findings into your forward message.