At times, SQL Server may be slow in returning results back to Service Desk.
This document describes methods to diagnose and resolve these types of issues.
Service Desk and MS SQL Server Performance
At times, SQL Server may be slow in returning results back to Service Desk.
This would impact the performance of the application, making searches and updates slow.
The Service Desk\log\ stdlog files will show messages like the following:
09/28 07:08:47.21 SERVERNAME sqlagt:select1404 17512 SIGNIFICANT sqlclass.c 1056 The following statement took 13515 milliseconds: Clause (SELECT call_req.open_date, call_req.id, min(attached_sla.time_to_violation) AS "mintime", min(usp_target_time.target_time) AS "mintgt", call_req.ref_num FROM (call_req LEFT JOIN attached_sla ON call_req.persid = attached_sla.mapped_cr LEFT JOIN usp_target_time ON call_req.persid = usp_target_time.mapped_cr) WHERE ( call_req.description LIKE ? AND call_req.active_flag = 1 ) AND ( call_req.type = ? ) GROUP BY call_req.open_date, call_req.id, call_req.ref_num ORDER BY call_req.open_date DESC) Input (<string>drive%|<string>I)
Before troubleshooting performance in SQL, be sure to understand:
- Is the SQL Server physical or virtual?
- If virtual, does it have resources dedicated to that instance?
- Be aware, CA does NOT recommend using SQL on a virtual platform.
- Are you using CA BI (BOXI) reporting? Does that point to the Service Desk production application server and database? Are you using other reporting tools that point to the production MDB?
- What other applications share the MDB with Service Desk?
- Does the Database server have databases for other applications?
Note: Much of what is covered here (with the exception of the section on the Virtual Database and connections) is the realm of an experienced MS SQL Database Administrator and not CA Support. Please consult your Database Administrator prior to running any tests in MS SQL as described in this document.
Slow Running SELECT statements
When you are testing, be sure to have SQL Server start from the same state each time. Clear the cache (sometimes referred to as the buffer) first; otherwise it prevents the data, execution, or both from being cached, thus corrupting the next test.
- To clear the SQL Server cache
Execute the following commands:
This command clears all the data from the cache.
This command clears the stored procedure cache.
- In SQL Manager, Turn on the following statistics from the "Query" menu (as shown in Figure 1):
"Include Actual Execution Plan"
"Include Client Statistics"
- On the Primary Service Desk server, execute the following commands to get a list of long running SQL Queries:
Switch to the Service Desk\log\ folder:
Run the following command:
type stdlog* | find /i "following statement" >> SQL_Statements_To_Test.txt
This will produce output like the following in the text file: "SQL_Statements_To_Test.txt":
10/08 11:36:25.58 walsh04-VM5397 sqlagt:select5 4412 MILESTONE sqlclass.c 1043 The following statement took 1870 milliseconds: Clause (SELECT call_req.open_date, call_req.id, call_req.ref_num, call_req.id FROM call_req WHERE ( call_req.priority IN (0,1,2) ) AND ( call_req.assignee IS NULL AND call_req.group_id IS NULL ) AND call_req.active_flag = 1 AND call_req.type = ? ORDER BY call_req.open_date DESC) Input (<string>I)
Note: Service Desk logs SQL statements that take longer than 1,000 milliseconds. If we want to capture timings for all queries, the following command can be used:
pdm_logstat -f sqlclass.c MILESTONE
This should be used with caution and only for short periods of time as it will fill the stdlogs quickly.
The SQL statement starts with the "(SELECT". This long string would need to be manipulated to run directly in SQL Server.
Service Desk executes all SQL queries within a Stored Procedure due to the usage of Microsoft's SQL OLEDB API calls. As a result, the query would need to be run using the following syntax to ensure accurate results:
exec sp_executesql N'SELECT call_req.open_date, call_req.id, call_req.ref_num, call_req.id FROM call_req WHERE ( call_req.priority IN (0,1,2) ) AND ( call_req.assignee IS NULL AND call_req.group_id IS NULL ) AND call_req.active_flag = 1 AND call_req.type = @P1 ORDER BY call_req.open_date DESC',N'@P1 nvarchar(1)',N'I'
- Run the SQL statements inside SQL Manager
For each SQL statement in the file: "SQL_Statements_To_Test.txt", please repeat this process.
Run each statement and measure the response time and where the bulk of the processing takes place (see Figure 2):
It should show where the bulk of the processing time was.
It should highlight an index and we can check if the index is fragmented or needs to be rebuilt.
Updates query optimization statistics on a table or indexed view. By default, the query optimizer already updates statistics as necessary to improve the query plan; in some cases you can improve query performance by using UPDATE STATISTICS or the stored procedure sp_updatestats to update statistics more frequently than the default updates.
Updating statistics ensures that queries compile with up-to-date statistics. However, updating statistics causes queries to recompile. We recommend not updating statistics too frequently because there is a performance tradeoff between improving query plans and the time it takes to recompile queries. The specific tradeoffs depend on your application. UPDATE STATISTICS can use tempdb to sort the sample of rows for building statistics.
Looking for Fragmentation on Indices
There's a general consensus that you should reorganize ("defragment") your indices as soon as index fragmentation reaches more than 5 (sometimes 10%), and you should rebuild them completely when it goes beyond 30%
Sample Script for seeing how fragmented the indices are:
SELECT t.NAME 'Table name', i.NAME 'Index name', ips.index_type_desc, ips.alloc_unit_type_desc, ips.index_depth, ips.index_level, ips.avg_fragmentation_in_percent, ips.fragment_count, ips.avg_fragment_size_in_pages, ips.page_count, ips.avg_page_space_used_in_percent, ips.record_count, ips.ghost_record_count, ips.Version_ghost_record_count, ips.min_record_size_in_bytes, ips.max_record_size_in_bytes, ips.avg_record_size_in_bytes, ips.forwarded_record_count FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'DETAILED') ips INNER JOIN sys.tables t ON ips.OBJECT_ID = t.Object_ID INNER JOIN sys.indexes i ON ips.index_id = i.index_id AND ips.OBJECT_ID = i.object_id WHERE AVG_FRAGMENTATION_IN_PERCENT > 0.0 ORDER BY AVG_FRAGMENTATION_IN_PERCENT, fragment_count
Sample output in Figure 3 below:
How Can We Defragment Indices?
DBCC INDEXDEFRAG (Transact-SQL)
DBCC INDEXDEFRAG defragments the leaf level of an index so that the physical order of the pages matches the left-to-right logical order of the leaf nodes,therefore improving index-scanning performance.
Slow Running INSERTs and UPDATEs
Two common reasons for slow inserts or updates are blocking and auto growth/shrink.
Is there any reporting on these tables or is there any blocking?
There are a few different tools that can be used to look for this.
"Sp_who2" (see Figure 3 for output):
USE Master GO EXEC sp_who2 GO
(see Figure 4 for output):
USE Master GO SELECT session_id, wait_duration_ms, wait_type, blocking_session_id FROM sys.dm_os_waiting_tasks WHERE blocking_session_id <> 0 GO
"SQL Server Management Studio Activity Monitor"
(see Figures 5 and 6 for output):
"SQL Server Mgmt. Studio Reports"
(see Figure 7):
"SQL Server Profiler"
(see Figure 8):
sp_configure 'show advanced options', 1 GO RECONFIGURE GO sp_configure 'blocked process threshold', 20 GO RECONFIGURE GO
- Auto Shrink / Auto growth
If these are not set appropriately, and the size of the data files changes frequently, this will cause delays.
(see Figure 9):
For certain tables it may make sense, to put those tables on their own disk or file group. The speed of the disks should be looked at.
For example, using MS SQL (as shown in Figure 10):
ALTER DATABASE [MDB] ADD FILEGROUP [Second]
- Service Desk Virtual Database and Database Agents
Pdm_vdbinfo is one of the best tools for determining when to increase the number of database agents. The pdm_vdbinfo command produces a report that shows you what each database agent is currently doing in your CA Service Desk Manager environment. One of the most important pieces of information from the pdm_vdbinfo report is whether work requests are being queued.
For example, when an end-user search request is backlogged (queued) in the Virtual Database (bpvirtdb_srvr), that user does not get a response until the backlog clears, and the longer the backlog, the longer the user has to wait for the response. This type of delay can cause the user to interpret the performance as a slow response or "hang" in the interim.
Check Agent Activity
The pdm_vdbinfo command produces a report that can provide you with a snapshot of what each database agent is currently doing in your CA Service Desk Manager environment.
To check agent activity
pdm_vdbinfo Report Output
- Open a command prompt on the primary server.
- Enter pdm_vdbinfo and direct the output to a text file for review. For example, enter the
pdm_vdbinfo > vdbinfo.out
Note: Because this report produces a snapshot in time, you will need to run this command on more than one occasion to see how the system is performing over time.
- After you execute this report, carefully note the number of the following:
- Pending requests
- Agents with lots of requests in Work
- Check for tables with many I/O requests that are performed on ID 00 database agent. ID 00 is the generic update agent. This agent updates all tables that do not have specific agents associated with them. If there are many updates to a specific table on agent 00, the agent is busy processing those updates and not available to process updates for other tables. Moving this table to a separate agent alleviates the overhead caused by updates to this single table. The primary function of this report is to identify the amount of work currently en route, or queued to the database agents. To make the most of this information, review it next to the stdlog output and in the context of the end-user perceived performance.
One of the most important pieces of information from the pdm_vdbinfo report is whether work requests are being queued. For example, when an end-user search request is backlogged (queued) in the Virtual Database (bpvirtdb_srvr), that user does not get a response until the backlog clears, and the longer the backlog, the longer the user has to wait for the response. This type of delay can cause the user to interpret the performance as a slow response or "hang" in the interim.
The pdm_vdbinfo report output is divided into several sections. The header section contains most of the information necessary for making preliminary judgments about the state of the bpvirtdb_srvr process. This information includes the following details:
- Min Config Agents -Specifies the minimum number of agents to start. The bpvirtdb_srvr -n parameter in the pdm_startup file sets this value. In this example, bpvirtdb_srvr is set to start 25 SELECT database agents upon startup.
- Max Config Agents -Specifies the maximum number of agents to start. The NX_MAX_DBAGENT variable sets this value. In this example, the bpvirtdb_srvr is configured to start up a maximum of 40 SELECT database agents. Upon startup, the minimum number of agents start; when all of those agents are busy, additional agents start (up to the maximum number of agents specified). When agents are no longer busy for a period, they stop until the specified minimum number of agents is reached. At times, during runtime, the system reduces the maximum number to a number that is one less than the configured number.
- Max DB Agents -Specifies the maximum number of database agents to start. This value is typically the same as Max Config Agents.
- Tgt num idle -Represents how many agents must be idle for a period before the bpvirtdb_srvr process starts terminating database agents down to the minimum configuration limit. This value is hardcoded to "2."
- Num Agents running -Identifies the total number of agents currently connected to the bpvirtdb_srvr process.
- Num Agents starting -Indicates the number of database agents that bpvirtdb_srvr is in the process of starting.
- Num Requests pending -Indicates the total number of SELECT requests currently queued in the bpvirtdb_srvr process waiting for an available database agent.
- Actual num idle -Indicates the number of SELECT agents waiting for work. In this example, there are no (0) agents waiting, which indicates that there is a backlog as all agents are doing work. If there is no queuing or if the queuing is only short term, generally you do not need to increase the number of agents. However, if database requests are constantly queued during peak times, this behavior can indicate a valid need to increase the number of agents.
Note: If query requests are being queued, determine the cause before making any adjustments. Otherwise, if you increase the number of database agents without knowing the actual cause of the increased queuing, the result may be degraded performance rather than improved performance. For example, if the database is already overworked as the result of bad index statistics, and cannot handle more requests efficiently, then increasing the number of requests sent to the database could result in a slower response, and ultimately, more queuing.
It may be useful to run pdm_vdbinfo under different load conditions to determine the effect of load level on queuing. Compare the results to the stdlogs to determine if long-running queries are being executed at the time. Remember to look both forward and backward in the log because long-running queries are logged only when the query completes (which may occur after you detect queuing). If throughput was acceptable and the backlog was based on load, try increasing the number of database agents.
The pdm_vdbinfo report also includes a Delayed ID Queue section, which identifies the "select short" cache usage. A select short query is one that has the WHERE clause of "WHERE id=?" and only returns one row. Consider the following example (in Figure 11):
This listing displays any table that is accessed as part of a select short query. For each table entry, the following information appears:
- Hash -Reserved for developer use only.
- Queue -Indicates the actual number of select short responses that are currently queued. If the queue "wraps," this number equals the MaxQueue value. A queue is limited to its MaxQueue value. On the MaxQueue +1 select short query, the oldest query is dropped and the queue wraps.
- MaxQueue -Configures maximum length of the queue for this table. The default value is 101 rows.
- Min -Indicates the shortest time the queue has taken to wrap since the bpvirtdb_srvr process started. If the queue has not wrapped, this value is 0 (zero).
- Max -Indicates the longest time the queue has taken to wrap since the bpvirtdb_srvr process started. If the queue has not wrapped, this value is 0 (zero).
- Cur -Indicates the age of the oldest entry in the queue. If both Min and Max values are 0 (zero), this value is the amount of time since the table was first referenced.
- When an update occurs, the queue should not wrap until all the domsrvr processes have requested the update. A good rule is to keep the Min value greater than two seconds. If the Min value is greater than two, increasing the number of queued entries only increases memory usage and does not improve performance.
Tables with many I/O requests use the ID 00 database agent. ID 00 is the generic update agent. This agent updates all tables that do not have specific agents associated with them. If there are many updates to a specific table on agent 00, the agent is busy processing those updates and not available to process updates for other tables. Moving this table to a separate agent alleviates the overhead caused by updates to this single table. The primary function of this report is to identify the amount of work currently en route, or queued to the database agents. To make the most of this information, review it next to the stdlog output and in the context of the end-user perceived performance.