Using the SLM Get Statistics screen, I can see that QoS data for one of my robots is no longer being stored. Dr. Nimbus shows the data reaching the primary hub, but it is not being saved to the database. What is happening?

Document ID : KB000045860
Last Modified Date : 14/02/2018

Problem:

I can see QoS data coming from my probe to the primary hub in Dr. Nimbus, but I cannot see it in SLM or in PRD.
Steps already done:
- restarted data_engine
- restarted NAS
- redeployed the probe
 
What can be causing the issue?
 

Solution:

To find the cause, do the following:
1) Increase loglevel to 3 and logsize to 50000 on the data_engine.
2) Deactivate the data_engine and clear the current logs.
3) Activate the data_engine and wait 5 minutes.
4) Deactivate the problem probe.
5) Activate the problem probe and wait 5 minutes.
 
In the data_engine logs we saw entries such as the following:
 
Aug 18 16:39:52:238 [8288] de: ADO_QoSInsert::InsertQosObjectEx - exists already qos=QOS_CPU_USAGE, source=ComputerName, target=Total
Aug 18 16:39:52:239 [8288] de: QoSInsert::AddQoSObjectToMaps - table=886174 qos=QOS_CPU_USAGE source=ComputerName target=Total metric=MD82CBBD26C4661B3795896A4903BDBBA Complete: yes
Aug 18 16:39:52:239 [8288] de: QoSInsert::CreateQoSObjectDB - QoS object creation failed, unknown reason


Aug 18 16:39:52:258 [8288] de: HandleMessage: Insert Data - nimid: ZN95990931-01412 (2016-08-18 16:39:50) table_id: 0 value= 22275.00 time: 2016-08-18 16:39:50 qos=QOS_DISK_USAGE source=ComputerName target=D:\
Aug 18 16:39:52:259 [8288] de: ADO_QoSInsert::InsertQosObjectEx - exists already qos=QOS_DISK_USAGE_PERC, source=ComputerName, target=D:\
Aug 18 16:39:52:259 [8288] de: QoSInsert::AddQoSObjectToMaps - table=886187 qos=QOS_DISK_USAGE_PERC source=ComputerName target=D:\ metric=M8EB9F6BDCD43D9068B4C9C3CA1C7345E Complete: yes
Aug 18 16:39:52:259 [8288] de: QoSInsert::CreateQoSObjectDB - QoS object creation failed, unknown reason
Aug 18 16:39:52:259 [8288] de: QoSInsert::InsertData Failed to create data object in DB. id=-1 defid 2 qos=QOS_DISK_USAGE_PERC
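
With loglevel 3 the data_engine log can be large, so it may help to pull out the affected objects automatically. The following is a minimal sketch, assuming the log lines look like the excerpt above: it remembers the most recent "exists already" line and records its qos, source and target when a "QoS object creation failed" line follows. The function name and the regular expression are illustrative, not part of the product.

```python
import re

# Matches the "exists already" line that precedes each failure in the excerpt.
EXISTS_RE = re.compile(
    r"InsertQosObjectEx - exists already qos=(?P<qos>\S+), "
    r"source=(?P<source>[^,]+), target=(?P<target>.*)$"
)
FAILED_MARKER = "CreateQoSObjectDB - QoS object creation failed"


def find_failed_objects(lines):
    """Return sorted unique (qos, source, target) tuples whose creation failed."""
    failed = set()
    last_seen = None  # most recent object reported as "exists already"
    for line in lines:
        line = line.rstrip("\r\n")
        match = EXISTS_RE.search(line)
        if match:
            last_seen = (match["qos"], match["source"], match["target"].strip())
        elif FAILED_MARKER in line and last_seen is not None:
            failed.add(last_seen)
            last_seen = None
    return sorted(failed)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical path): python find_failed.py data_engine.log
    if len(sys.argv) > 1:
        with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
            for qos, source, target in find_failed_objects(handle):
                print(f"qos={qos} source={source} target={target}")
```

Each tuple printed identifies one S_QOS_DATA record worth inspecting.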
 
This was tracked down to corruption in the S_QOS_DATA table, where the checksum stored in the database is no longer valid.
When an entry in S_QOS_DATA is first created, a checksum is calculated over the following fields:
qos
source
target
 
If any of these values changes afterwards, the stored checksum no longer matches and you will see the messages above.
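
To illustrate why editing one of these fields breaks the record, here is a minimal sketch. The actual algorithm data_engine uses is not documented in this article; SHA-256, the field separator, the function name and the host names below are all assumptions chosen only to show the principle.

```python
import hashlib


def qos_checksum(qos, source, target):
    """Illustrative checksum over the three identifying fields.

    SHA-256 and the separator are assumptions, not the product's algorithm.
    """
    key = "\x1f".join((qos, source, target))
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


# Checksum stored when the record was first created with the FQDN as source.
stored = qos_checksum("QOS_CPU_USAGE", "host01.example.com", "Total")

# A manual UPDATE that rewrites only the source column leaves that value stale.
recomputed = qos_checksum("QOS_CPU_USAGE", "host01", "Total")

print(stored == recomputed)  # False: the row no longer matches its checksum
```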
 
The best way to correct this is to delete the affected record from S_QOS_DATA and then deactivate and activate the problem probe.
This causes a new record to be created, and the data can then be inserted again.
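
The delete can be sketched as follows. This is not a script to run against a production database: it uses an in-memory SQLite table as a stand-in, and the reduced column set (table_id, qos, source, target) and the helper name are assumptions for illustration. The sample values are taken from the log excerpt above.

```python
import sqlite3

# In-memory stand-in for the real database; the schema is a simplified assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE s_qos_data ("
    "table_id INTEGER PRIMARY KEY, qos TEXT, source TEXT, target TEXT)"
)
conn.executemany(
    "INSERT INTO s_qos_data VALUES (?, ?, ?, ?)",
    [
        (886174, "QOS_CPU_USAGE", "ComputerName", "Total"),
        (886187, "QOS_DISK_USAGE_PERC", "ComputerName", "D:\\"),
    ],
)


def delete_qos_object(conn, qos, source, target):
    """Delete exactly one s_qos_data row and return its table_id."""
    where = "qos = ? AND source = ? AND target = ?"
    params = (qos, source, target)
    rows = conn.execute(
        f"SELECT table_id FROM s_qos_data WHERE {where}", params
    ).fetchall()
    if len(rows) != 1:
        # Refuse to delete when the match is missing or ambiguous.
        raise ValueError(f"expected 1 matching row, found {len(rows)}")
    conn.execute(f"DELETE FROM s_qos_data WHERE {where}", params)
    conn.commit()
    return rows[0][0]


deleted_id = delete_qos_object(conn, "QOS_CPU_USAGE", "ComputerName", "Total")
print(deleted_id)  # 886174
```

Selecting before deleting guards against removing more than the one intended row.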
 
 
 
NOTE:
In this case the client had run a manual UPDATE query on the source field to change it from the FQDN to the short host name. This caused the problem.
 
To change from the FQDN to the short host name, the approach below is a better solution.
The default for any probe is to use the source of the robot that it sits on, which by default is the server's hostname (such as uslil740.am.jllnet.com). Some probes have a source override; these are usually remote monitoring probes such as interface_traffic, rsp, and net_connect, as well as cdm, since it can monitor remote shares. Probes that are limited to monitoring the robot machine itself generally do not have a source override and simply use the robot's source.

The robot's source is controlled through the config GUI, under Setup > Misc. If the "Set QoS source to robot name instead of computer hostname" is checked, it will use the short name. If it is unchecked, it will use the hostname from DNS (usually the FQDN). You can determine exactly what the controller's source is by using the controller's probe utility (highlight the controller and press ctrl+p) and issuing the get_info callback, which will have the source.

Where possible, I would avoid custom sources inside the probes and use them only where necessary, as with the remote monitoring probes mentioned earlier. Get the robot to report the desired source (FQDN or short name) and the rest of the probes will follow unless they override the source themselves.

As for whether FQDN or short name is the best practice, the direction of engineering and PM is to encourage FQDN. This is especially important when you work with multiple customers or domains, where the same name can appear more than once if it is not fully qualified (e.g. exchange.apj.jll.com and exchange.us.jll.com). This is more typical of MSPs than of single businesses. However, that does not preclude the use of the short name if it is determined to be a better answer.
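
Before switching robots to short names, it can be worth checking whether any short names would collide. A minimal sketch, assuming you have a plain list of FQDNs (the helper names are hypothetical; the host names come from the examples in this article):

```python
from collections import defaultdict


def short_name(fqdn):
    """Return the part of the host name before the first dot."""
    return fqdn.split(".", 1)[0]


def short_name_collisions(fqdns):
    """Map each short name shared by more than one FQDN to those FQDNs."""
    by_short = defaultdict(list)
    for fqdn in fqdns:
        by_short[short_name(fqdn).lower()].append(fqdn)
    return {name: hosts for name, hosts in by_short.items() if len(hosts) > 1}


hosts = ["exchange.apj.jll.com", "exchange.us.jll.com", "uslil740.am.jllnet.com"]
print(short_name_collisions(hosts))
# {'exchange': ['exchange.apj.jll.com', 'exchange.us.jll.com']}
```

An empty result means every short name in the list is unique; any entry returned would end up sharing a single QoS source.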