How CA Support Reads Through CA Service Desk Manager (CA SDM) STDLOGs

Document ID : KB000029476
Last Modified Date : 14/02/2018

DESCRIPTION:

This TECDOC describes what CA Support looks for when reviewing the CA SDM STDLOGs to troubleshoot an issue.

Disclaimer:  This guide provides a general overview of CA SDM log file examination.  If you are experiencing problems with the install, it is strongly recommended that you open a CA Support issue and provide the log materials (NX_ROOT\LOG directory) so that CA Support can further troubleshoot the issue.

SOLUTION:

Whenever a problem arises, the CA SDM STDLOGs (NX_ROOT\LOG directory) are a commonly requested item.  CA Support often asks for the entire set of logs so that it can get an overall view of the install's history, including any prior occurrences of the given problem that went undetected or undiagnosed.

CA Support also asks to receive the logs directly so that the issue can be examined and the logs analyzed more thoroughly.  Trying to examine the logs over a remote session has pitfalls, such as false positives and wading through considerable amounts of irrelevant information.  Further, any later examination of the issue by Sustaining Engineering would require that the logs be sent in.

While the tools used to examine the logs may vary, the most frequently used technique is string and expression searching.  This may take the form of the findstr command on DOS/Windows or the grep command on Unix/Linux/MKS Tools.

Once CA Support receives the logs, the log materials are examined for specific concerns being reported.  Techniques and methods may vary depending on the issue.  The following are some of the more common concerns:

Performance

Depending on the nature of the performance issue, one could search for instances of the word "millisecond", which may turn up entries such as:

SERVER1      web:local            2292 SIGNIFICANT  session.c             3888 This request took 1829940 milliseconds to complete. session id:0 login name:User1 htmpl name:detail_cr_ro.htmpl

SERVER1      sqlagt:select2       3128 SIGNIFICANT  sqlclass.c            1049 The following statement took 4623 milliseconds: Clause (SELECT act_log.time_stamp...

SERVER1      bpvirtdb_srvr        5880 SIGNIFICANT  vdbagent.c             504 Select queue is currently backlogged 3012 millisecond.

SERVER1      sqlagt:select2       3128 SIGNIFICANT  sqlclass.c            1230 A FETCH for the following statement took 2040 milliseconds: Clause (SELECT call_req.id...

There may be multiple instances of any of the above messages in the logs.  One can look for a pattern across such instances.  For example, is the detail_cr_ro.htmpl form, or the query on the act_log or call_req table, always running slow?  Were any changes made to the given web form?  How large are the given tables?  If the problem falls within a specific timeframe, is everyone running a report at the same time, or even all coming in at once?
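
The pattern search described above can be sketched in Python.  This is only an illustration, not a CA Support tool: the regular expressions are derived solely from the four sample entries shown above, and the function name summarize_slow_entries and the threshold_ms cutoff are assumptions made for this example.

```python
import re
from collections import Counter

# Patterns inferred only from the sample entries in this article;
# real log entries may be worded differently.
DURATION = re.compile(r"(\d+) millisecond")
HTMPL = re.compile(r"htmpl name:(\S+)")
TABLE = re.compile(r"SELECT (\w+)\.")

def summarize_slow_entries(lines, threshold_ms=1000):
    """Count slow entries per web form or table.

    threshold_ms is a hypothetical cutoff chosen for illustration.
    """
    counts = Counter()
    for line in lines:
        match = DURATION.search(line)
        if not match or int(match.group(1)) < threshold_ms:
            continue
        # Prefer the web form name; fall back to the first table in the SELECT.
        target = HTMPL.search(line) or TABLE.search(line)
        counts[target.group(1) if target else "(other)"] += 1
    return counts

sample = [
    "SERVER1      web:local            2292 SIGNIFICANT  session.c             3888 This request took 1829940 milliseconds to complete. session id:0 login name:User1 htmpl name:detail_cr_ro.htmpl",
    "SERVER1      sqlagt:select2       3128 SIGNIFICANT  sqlclass.c            1049 The following statement took 4623 milliseconds: Clause (SELECT act_log.time_stamp...",
    "SERVER1      bpvirtdb_srvr        5880 SIGNIFICANT  vdbagent.c             504 Select queue is currently backlogged 3012 millisecond.",
    "SERVER1      sqlagt:select2       3128 SIGNIFICANT  sqlclass.c            1230 A FETCH for the following statement took 2040 milliseconds: Clause (SELECT call_req.id...",
]

for name, count in summarize_slow_entries(sample).items():
    print(name, count)
```

A form or table that appears repeatedly in the summary is a candidate for the questions above (recent form changes, table size, concurrent reporting).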


Database

If there are issues regarding the database, one can search for entries that contain "sqlclass" or "orclclass".  For instance, one may find:

SERVER1      sqlagt:select3       3984 ERROR        sqlclass.c             996 SQL Execute  failed: [Microsoft OLE DB Provider for SQL Server] [ SQL Code=11 SQL State=08S01] [DBNETLIB][ConnectionWrite (send()).]

SERVER1      sqlagt:select0       3640 ERROR        sqlclass.c             470 Failed to logon to SQL Server (DBSERVER\INSTANCE) Reason: [Microsoft OLE DB Provider for SQL Server] [ SQL Code=17 SQL State=08001] [DBNETLIB][ConnectionOpen (Connect()).]

In the above, the logging shows that CA SDM is having problems with the backend SQL Server.  Both example entries reference a SQL Code and a SQL State.  These are NOT CA SDM specific error values; they pertain to SQL Server.  One may perform a Google Search to find out what these error codes actually signify.
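
As a rough sketch of how the SQL Code and SQL State values could be pulled out of such entries for lookup, consider the following Python example.  The pattern is based only on the two SQL Server samples above; entries from orclclass may be formatted differently, and the function name extract_sql_errors is an assumption made for this illustration.

```python
import re

# Pattern inferred from the two sample entries in this article only.
SQL_ERROR = re.compile(r"SQL Code=(-?\d+) SQL State=(\w+)")

def extract_sql_errors(lines):
    """Return (sql_code, sql_state) pairs from database-related entries."""
    results = []
    for line in lines:
        # Only consider entries logged by the database classes named above.
        if "sqlclass" not in line and "orclclass" not in line:
            continue
        match = SQL_ERROR.search(line)
        if match:
            results.append((int(match.group(1)), match.group(2)))
    return results

sample = [
    "SERVER1      sqlagt:select3       3984 ERROR        sqlclass.c             996 SQL Execute  failed: [Microsoft OLE DB Provider for SQL Server] [ SQL Code=11 SQL State=08S01] [DBNETLIB][ConnectionWrite (send()).]",
    "SERVER1      sqlagt:select0       3640 ERROR        sqlclass.c             470 Failed to logon to SQL Server (DBSERVER\\INSTANCE) Reason: [Microsoft OLE DB Provider for SQL Server] [ SQL Code=17 SQL State=08001] [DBNETLIB][ConnectionOpen (Connect()).]",
]

for code, state in extract_sql_errors(sample):
    print("SQL Code", code, "SQL State", state)
```

The resulting code and state pairs are the values one would then look up in the database vendor's documentation.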

 

Web Access

Sometimes there is a question about the web interface, such as certain accesses (for example, logins or searches) taking a long time to complete.  One thing that CA Support may consider is searching for instances of the web process (usually web:local on the primary server, or web:SECONDARY-SERVER on a secondary server).

For instance:

SERVER1      web:local            3296 SIGNIFICANT  session.c            10468 Session 1933497577:0x0EADFB68 login by analyst user1 (cnt:ABCDEFGHIJKLMNOPQRSTUV); session count 78

SERVER1      web:local            3296 SIGNIFICANT  session.c             7823 Session User1-218221404:0x8f88838 ended without logout; session count 77

SERVER1      web:local            3296 SIGNIFICANT  session.c             6539 Web Statistics - Cumulative Sessions (340) Most Sessions (81) Current Sessions (77)

From the above, if one were to see multiple instances of "login by analyst" for the same user in a short time frame, one can infer that the user might be attempting multiple logins across workstations.  If there is an impossibly high number of logins by the same user in a very short amount of time, either the user has loaned their login out, or the login may belong to a third party monitoring tool that tests CA SDM's availability via login access.

Similarly, "ended without logout" may point to such a monitoring tool, especially if a pattern of the same user's session timing out is seen.  Likewise, if end users are in the habit of leaving their browsers open or just closing them without logging out, and the Timeout value in web.cfg is high, that may point to a performance issue.

The third line shows session statistics that are recorded for the given webengine.  A cause for concern is an entry that shows a high value for "Most Sessions" or "Current Sessions" (over 100 sessions), which may point to a heavily taxed environment given the number of sessions the server is handling.
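
The session checks above can be approximated with a short Python sketch.  The patterns are taken from the sample entries in this section only, the function names login_counts and session_stats are hypothetical, and the 100-session limit simply mirrors the figure mentioned above.

```python
import re
from collections import Counter

# Patterns inferred from the sample web:local entries in this article.
LOGIN = re.compile(r"login by analyst (\S+)")
STATS = re.compile(
    r"Cumulative Sessions \((\d+)\) Most Sessions \((\d+)\) Current Sessions \((\d+)\)"
)

def login_counts(lines):
    """Count "login by analyst" entries per user name."""
    counts = Counter()
    for line in lines:
        match = LOGIN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

def session_stats(lines):
    """Return (cumulative, most, current) tuples from Web Statistics entries."""
    stats = []
    for line in lines:
        match = STATS.search(line)
        if match:
            stats.append(tuple(int(value) for value in match.groups()))
    return stats

sample = [
    "SERVER1      web:local            3296 SIGNIFICANT  session.c            10468 Session 1933497577:0x0EADFB68 login by analyst user1 (cnt:ABCDEFGHIJKLMNOPQRSTUV); session count 78",
    "SERVER1      web:local            3296 SIGNIFICANT  session.c             7823 Session User1-218221404:0x8f88838 ended without logout; session count 77",
    "SERVER1      web:local            3296 SIGNIFICANT  session.c             6539 Web Statistics - Cumulative Sessions (340) Most Sessions (81) Current Sessions (77)",
]

print(login_counts(sample))
for cumulative, most, current in session_stats(sample):
    # 100 mirrors the "over 100 sessions" guideline mentioned in the text.
    busy = most > 100 or current > 100
    print(cumulative, most, current, "high load" if busy else "ok")
```

A user with an unusually high login count, or statistics lines flagged as high load, would be the entries to examine more closely.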

 

Time Specific

If there is a specific timeframe of interest, one can obtain a list of all records that fall within a specific date/time range.  For instance, if one knows a given issue occurred at or around Jan 6, 12:25, one could search for all records that start with "01/06 12:2".  This shows all activity that occurred from Jan 6, 12:20 to Jan 6, 12:29.  This is especially useful if the exact time is not known but a general range is.
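
The prefix search described above amounts to a simple starts-with filter, sketched here in Python.  The function name records_in_window is an assumption, and the sample records are shortened, hypothetical placeholders that only illustrate the leading date/time; they are not real STDLOG entries.

```python
def records_in_window(lines, prefix):
    """Return the records whose text starts with the given date/time prefix."""
    return [line for line in lines if line.startswith(prefix)]

# Hypothetical, abbreviated records used only to show the leading timestamp.
sample = [
    "01/06 12:19 (record before the window)",
    "01/06 12:20 (record at the start of the window)",
    "01/06 12:25 (record near the reported time)",
    "01/06 12:29 (record at the end of the window)",
    "01/06 12:30 (record after the window)",
]

# "01/06 12:2" covers 12:20 through 12:29 on Jan 6.
for record in records_in_window(sample, "01/06 12:2"):
    print(record)
```

Shortening the prefix widens the window; for example, "01/06 12:" would cover the whole hour.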

For more information on the content of the CA SDM STDLOGs, refer to TEC478009.