sqlserver probe checkpoints - query timed out or failed to execute alarms

Document ID : KB000035010
Last Modified Date : 24/10/2018
Issue:

The UIM "sqlserver" probe generates alarms when a query times out or a profile fails to execute within its scheduled time interval.

- Code=0x80004005 Source=Microsoft OLE DB Provider for ODBC Drivers Description=[Microsoft][ODBC SQL Server Driver]Timeout expired
- query timed out! alarms

Example:
Profile xxxxxxx, instance xxxxxxx, checkpoint 'free_space' - query timed out!

or

If you are getting alarms such as
"Profile <Database Server>, failed to execute in scheduled time interval, delayed by <seconds> seconds",
this means that the profile is taking longer to execute than the configured heartbeat interval.

Resolution:

The following explains every timeout-related field in a sqlserver profile. Understanding these fields helps eliminate timeout failures and 'query timed out' alarms:

1. Heartbeat - Defines the interval at which all profile checkpoint schedules are tested; checkpoints that are due are then executed.
This number should be a common divisor of all check interval values in use.
The higher the value, the lower the profile overhead.

2. Check Interval - Default check interval for the profile.
This value is used if nothing else is defined in the checkpoint, and it overrides the default checkpoint list setting.

3. Profile Timeout - Defines the maximum processing time for all checkpoints in the profile.
If this timeout is reached, processing for the current interval ends and the probe waits for the next heartbeat to evaluate any checkpoint schedules. An alarm is issued.

4. SQL Timeout - Every checkpoint query runs asynchronously.
If a query reaches the SQL timeout, processing of that checkpoint is terminated and the next checkpoint is started. An alarm is issued.

5. Delay Threshold - Timeout threshold for the profile delay alarm.
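
The heartbeat guidance above (a value that divides every check interval, and as large as possible to keep overhead low) can be worked out with a greatest common divisor. The following is only a minimal sketch of that arithmetic, not part of the probe; the function name and the sample interval values are hypothetical.

```python
from functools import reduce
from math import gcd


def suggest_heartbeat(check_intervals_s):
    """Return the largest heartbeat (in seconds) that divides every check interval.

    check_intervals_s: check interval values in seconds (hypothetical inputs).
    """
    if not check_intervals_s:
        raise ValueError("at least one check interval is required")
    # The greatest common divisor is the largest common divisor, so it gives
    # the lowest profile overhead while still lining up with every interval.
    return reduce(gcd, check_intervals_s)


# Hypothetical check intervals of 2, 5 and 10 minutes.
print(suggest_heartbeat([120, 300, 600]))  # prints 60
```

With these sample values a 60-second heartbeat lines up with every check interval, while a larger value such as 90 seconds would not divide 120 or 300 evenly.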

Example:

If you are getting alarms such as
"Profile <Database Server>, failed to execute in the scheduled time interval, delayed by <seconds> seconds", it means that the profile is taking longer to execute than the configured heartbeat interval.
For example, if the profile is configured as follows:
1. Heartbeat - 60 seconds
2. Check Interval - 2 minutes (checkpoint execution)
3. Profile Timeout - 10 minutes
4. Delay Threshold - 15 seconds

You are getting the alarm because the profile takes, say, 2 minutes to execute (1 minute longer than the scheduled interval, i.e. the heartbeat). That is within the Profile Timeout limit but exceeds the Delay Threshold limit.
- Heartbeat is the interval at which profile execution starts.
- Check interval is the interval for checkpoint execution.
- Profile timeout is the time within which profile execution must complete; otherwise a new run of the profile starts.
- Delay threshold is the time within which the next scheduled run of the profile must start; otherwise an alarm is generated.
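
The relationship between these four values can be illustrated with a small model. This is a simplified sketch based only on the explanation above, not the probe's actual logic; the function name and the returned strings are hypothetical, and it assumes the delay is the run time beyond one heartbeat.

```python
def classify_run(execution_s, heartbeat_s, profile_timeout_s, delay_threshold_s):
    """Classify one profile run under a simplified model of the timeout fields.

    All arguments are in seconds. Assumption: the delay is the time by which
    the run overshoots the heartbeat interval.
    """
    if execution_s >= profile_timeout_s:
        # The run hit the profile timeout before finishing.
        return "profile timeout"
    delay_s = max(0, execution_s - heartbeat_s)
    if delay_s > delay_threshold_s:
        # The next scheduled run started later than the threshold allows.
        return f"delayed by {delay_s} seconds"
    return "ok"


# Values from the example: heartbeat 60 s, profile timeout 10 min,
# delay threshold 15 s, and a run that takes 2 minutes.
print(classify_run(120, 60, 600, 15))  # prints: delayed by 60 seconds
```

Under this model, a run of 70 seconds with the same settings is delayed by only 10 seconds, which stays under the 15-second threshold and raises no alarm.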

***The 'query timed out!' alarm is raised when the SQL Timeout in the profile is shorter than the time the query takes to run, so increase this value appropriately.
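
One way to pick an appropriate value is to compare measured checkpoint query durations against the SQL Timeout and add some headroom. The sketch below is only an illustration: the function names, the 1.5x headroom factor, the sample durations and the second checkpoint name are all hypothetical, and the durations would have to be measured separately (for example by running the checkpoint query manually).

```python
import math


def timed_out_checkpoints(durations_by_checkpoint, sql_timeout_s):
    """Return the checkpoint names whose query duration exceeds the SQL timeout."""
    return sorted(
        name
        for name, duration_s in durations_by_checkpoint.items()
        if duration_s > sql_timeout_s
    )


def suggest_sql_timeout(observed_durations_s, headroom=1.5):
    """Suggest a SQL timeout: the slowest observed query plus headroom (assumed factor)."""
    if not observed_durations_s:
        raise ValueError("at least one observed duration is required")
    return math.ceil(max(observed_durations_s) * headroom)


# Hypothetical measurements in seconds; 'free_space' is the checkpoint from the alarm example.
durations = {"free_space": 42.0, "hypothetical_fast_checkpoint": 3.5}
print(timed_out_checkpoints(durations, sql_timeout_s=30))  # prints ['free_space']
print(suggest_sql_timeout(list(durations.values())))       # prints 63
```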

If the database holds a large amount of data, a checkpoint takes longer to run, which increases profile execution time and can result in timeout alarms.

You can configure the remaining timeouts based on the above explanation and the alarm you are getting.