VMware probe Error Failed to execute - max query size

Document ID : KB000010294
Last Modified Date : 07/11/2018
Introduction:

When monitoring some vCenters with the vmware probe, errors similar to the following may appear in the vmware.log file:

Failed to execute single perf query for entity resgroup-1040. Max query size exceeded. Request processing is restricted by administrator (vpxd.stats.maxQueryMetrics). Follow VMware KB 2107096 to resolve.

The VMware KB 2107096 article specifies the following for setting maxQueryMetrics in the monitored vCenter:

"When you see the Request processing is restricted by the administrator message, if you are in doubt about the value you should set the maxQuerySize parameter, you can start with a value that is above the largest number of virtual machines or hosts in the cluster"

Setting maxQueryMetrics to 1024 did not resolve this error in a vCenter with 254 virtual machines.

Background:

The VMware KB 2107096 article indicates that setting maxQueryMetrics to -1 disables the limit, but this allows larger queries against the vCenter Server database, which may degrade the general responsiveness of vCenter Server or increase the time needed to display some of the charts. If this is a concern, better guidelines are needed for choosing the value of the maxQueryMetrics setting.

Environment:
- UIM Server 8.x
- vmware v7.11 and earlier
- vCenter versions supported by the v7.11 and earlier vmware probes

Instructions:

The probe makes two kinds of metric queries against the monitored vCenter:

- real time performance
- summary performance

The real time performance queries are made to each individual VM/ESX host managed by the monitored vCenter and do not touch the vCenter database. The number of metrics retrieved by these queries is controlled by the perf_request_batch_size key in the setup section of the vmware probe's configuration file. You can access this key through the probe's Raw Configure GUI. By default, it is set to 64 and typically does not need to be changed.

The summary performance metrics are collected separately for the following 4 vCenter items:

- datastores
- resource pools
- clusters
- vApps

These metrics are collected from the vCenter database and are affected by the maxQueryMetrics setting in your vCenter. If the vmware log files contain the "Max query size exceeded" error when the probe tries to pull the metrics for any of these 4 items, it means that more metrics are available for the item than the vCenter maxQueryMetrics setting allows.
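To see which items are affected, the vmware.log file can be scanned for the error text. The following is a minimal Python sketch, assuming the log lines match the sample message quoted in the Introduction. The function name affected_entities and the sample list are illustrative and are not part of the probe.

```python
import re
from collections import Counter

# Pattern derived from the sample vmware.log message quoted in this article.
# Assumption: other occurrences of the error use the same wording.
ENTITY_RE = re.compile(
    r"Failed to execute single perf query for entity (\S+?)\.\s+"
    r"Max query size exceeded"
)


def affected_entities(log_lines):
    """Count how often each entity ID hit the max query size error."""
    counts = Counter()
    for line in log_lines:
        match = ENTITY_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Illustrative input: one matching line and one unrelated line.
sample = [
    "Failed to execute single perf query for entity resgroup-1040. "
    "Max query size exceeded. Request processing is restricted by "
    "administrator (vpxd.stats.maxQueryMetrics). "
    "Follow VMware KB 2107096 to resolve.",
    "some unrelated log line",
]

for entity, hits in affected_entities(sample).items():
    print(entity, hits)
```

In practice the lines would be read from the probe's vmware.log file instead of the sample list.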

The recommendation in VMware KB 2107096 for setting maxQueryMetrics only works for the resource pool summary performance metrics if the virtual machines or hosts in the cluster have no resources associated with them. When the vmware probe retrieves the summary performance metrics for the resource pools, the query requests all resources associated with each virtual machine, including CPUs, memory, disks, and network adapters. So if your vCenter has 254 virtual machines, each with 4 CPUs plus memory, disks, and network adapters, and you set maxQueryMetrics to 1024, the vmware probe will continue to raise this error, because the 254 virtual machines together have more than 1024 associated resources.

If you look at the vpxd log in the vCenter, it will tell you the actual number of metrics associated with the query being made by the vmware probe.  The message will look something like the following:

error vpxd[7F1E957A1700] [Originator@6876 sub=MoPerfManager opID=2d0077b] The query size of 1964 metrics exceeded the vpxd.stats.maxQueryMetrics limit of 1024 metrics. Dropping.

Setting maxQueryMetrics to -1 disables the limit check. If you are concerned about potential performance degradation in the vCenter with this setting, you will need to establish a value that allows the vmware probe to retrieve these metrics without exceeding the maxQueryMetrics setting. You can use a formula like:

<average number of resources on a VM> * <number of VMs in the vCenter>
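As a rough illustration, the formula can be written as a small Python helper. This is a sketch only: the function name, the headroom parameter, and the average of 8 resources per VM are assumptions made for the example, not values taken from VMware or from the probe.

```python
import math


def estimate_max_query_metrics(avg_resources_per_vm, vm_count, headroom=0.10):
    """Apply <average resources per VM> * <number of VMs>.

    headroom is a hypothetical safety margin (10% by default); it is not
    a value recommended by VMware or by the probe documentation.
    """
    base = avg_resources_per_vm * vm_count
    return math.ceil(base * (1 + headroom))


# 254 VMs as in this article; 8 resources per VM is an illustrative guess.
print(estimate_max_query_metrics(8, 254, headroom=0))  # formula only
print(estimate_max_query_metrics(8, 254))              # formula plus headroom
```

With these illustrative inputs the bare formula gives 2032, which is already well above the 1024 setting that failed in this case.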

The other option is to check for the "query size" error messages in the vpxd.log file and use a value slightly larger than the biggest query size value reported.  In either case, you will have to keep an eye on the maxQueryMetrics setting if you start to increase the total number of VMs in the monitored vCenter.
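The log-based approach can be sketched in Python as follows, assuming the vpxd.log messages follow the sample line shown earlier in this article. The function names and the margin of 50 are hypothetical choices for illustration.

```python
import re

# Pattern derived from the sample vpxd.log message quoted in this article.
QUERY_SIZE_RE = re.compile(
    r"The query size of (\d+) metrics exceeded the "
    r"vpxd\.stats\.maxQueryMetrics limit of (\d+) metrics"
)


def largest_query_size(log_lines):
    """Return the biggest rejected query size, or None if none is found."""
    largest = None
    for line in log_lines:
        match = QUERY_SIZE_RE.search(line)
        if match:
            size = int(match.group(1))
            if largest is None or size > largest:
                largest = size
    return largest


def suggested_limit(log_lines, margin=50):
    """Largest reported query size plus a hypothetical safety margin."""
    largest = largest_query_size(log_lines)
    return None if largest is None else largest + margin


# Illustrative input taken from the sample message in this article.
sample = [
    "error vpxd[7F1E957A1700] [Originator@6876 sub=MoPerfManager "
    "opID=2d0077b] The query size of 1964 metrics exceeded the "
    "vpxd.stats.maxQueryMetrics limit of 1024 metrics. Dropping.",
]

print(largest_query_size(sample))
print(suggested_limit(sample))
```

Because a rejected query only reports its size at the time it was made, the scan would need to be repeated after VMs are added.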

If you have no interest in monitoring any of the summary performance metrics collected by the vmware probe and you do not want to increase the maxQueryMetrics setting in your vCenter, you can ignore the error message in the vmware.log file.

The probe, by design, will continue to attempt to retrieve these summary performance metrics every polling cycle.

In this case, one more change was required to resolve the issue: the hostname of the host that vmware was running on had to be added to the /etc/hosts file, because it was not resolving via DNS.

Additional Information:

If maxQueryMetrics is set to a high enough value and you were to configure the probe using the Configuration GUI from Infrastructure Manager, then the polled summary performance metrics would appear in the GUI.  They are not displayed in the Admin Console Configure Page, but the probe still attempts to collect these metrics every polling cycle.

The probe will publish the summary performance metric QoS if you define an active monitor to collect any of these metrics and maxQueryMetrics is set to a value that accommodates all of the summary performance metrics for all summary performance items on the monitored vCenter. If the maxQueryMetrics value is not set high enough, the probe will not be able to publish the active summary performance metric QoS, and you will not see these values in the probe's Configure GUI in Infrastructure Manager.

 

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2107096