Troubleshooting Sysload Agents: Collector common errors

Document ID : KB000110461
Last Modified Date : 16/08/2018
Introduction:
When a Sysload program encounters an issue, a message is displayed in the graphical interface and/or written to the log files.

Messages may have different severities:
  • Information: Message for information only
  • Warning: Something unexpected happened; the program handles it and continues to work normally, although some features may be deactivated
  • Error: Something unexpected happened; the program handles it but may no longer be in a stable state
  • Fatal error: Something unexpected happened and caused an unplanned stop of the program
The error message usually indicates where the problem is located.
For agents, the most common cases are:
  • Collector is unable to retrieve token(s)
  • Collector listener cannot be bound
  • Perfmon counters are not available (Windows platforms)
  • Historical files are corrupted
  • Historical files are not consistent with the agent’s name
  • Too many domains to monitor
  • File Server is unable to start
  • File Server cannot handle request
These are mainly due to inappropriate configuration, communication issues between the agent and other Sysload components, inability to retrieve the needed token(s), or corruption of history files.
Question:
How do you find the root cause of an issue with a Sysload Agent?

The most common error messages are listed below:

COLLECTOR ERROR MESSAGES

  • Issues related to the configuration:
MM DD HH:MM:SS YYYY : coll - vreq - ERROR, no listener has been bound to CNX Manager
 
  • Issues related to history data:
 
MM DD HH:MM:SS YYYY: hist - Cannot create file : c:\Program Files\Sysload\history\activity
MM DD HH:MM:SS YYYY: LPtrVerif OSldEvtHistFile.c[1245] - Error
 
  • Issues related to license retrieval:
 
MM DD HH:MM:SS YYYY: coll - WARNING : Could not connect to XXX.XXX.XXX.XXX:9555 (timeout=5 s). [OLicenseClient_SendRequestToTokenServer]
MM DD HH:MM:SS YYYY: coll -  [1872:1888] *** Error: cannot resolve Token Server Name 'hostname'
MM DD HH:MM:SS YYYY: coll - checking license: 'Failed to send request to token server (iRc=-201). [OLicenseClient_GetRequestResponseFromTokenServer]'
MM DD HH:MM:SS YYYY: [coll] WARNING : could not retrieve license (Failed to send request to token server (iRc=-201). [OLicenseClient_GetRequestResponseFromTokenServer])
MM DD HH:MM:SS YYYY: [coll] INFO : Trying to retrieve license...

FILE SERVER ERROR MESSAGES

  • Issues related to history data:
 
MM DD HH:MM:SS YYYY: fs_treq - *** Error in file 'C:\Program Files (x86)\Sysload\sldlmd\history\LogMonitor-ent64-Event.idx'; file header (þÿ) is INCORRECT
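
When triaging a log, the sample messages above can be matched mechanically. The following is a minimal Python sketch, assuming the substrings quoted in this article appear verbatim in the log lines. The category labels and the names classify_line and summarize are illustrative and are not part of Sysload.

```python
import re

# Substrings taken from the sample messages in this article, mapped to the
# issue category they are listed under. The category labels are our own.
PATTERNS = [
    (re.compile(r"no listener has been bound to CNX Manager"), "configuration"),
    (re.compile(r"Cannot create file|LPtrVerif|file header .* is INCORRECT"),
     "history data"),
    (re.compile(r"Could not connect to|cannot resolve Token Server Name|"
                r"Failed to send request to token server|"
                r"could not retrieve license"),
     "license retrieval"),
]


def classify_line(line):
    """Return the category of a log line, or None if no known pattern matches."""
    for pattern, category in PATTERNS:
        if pattern.search(line):
            return category
    return None


def summarize(lines):
    """Count how many lines fall into each known category."""
    counts = {}
    for line in lines:
        category = classify_line(line)
        if category is not None:
            counts[category] = counts.get(category, 0) + 1
    return counts
```

Passing an open log file to summarize gives a quick view of which family of issues dominates. Lines that match no pattern, such as informational messages, are ignored.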
 

 
Environment:
Sysload Agents
Answer:

Collector is unable to retrieve token(s)

Error(s) / Warning(s): Could not connect to <Token_Server_IP_Address>:<Port>
- Check: Is the Token Server started?
  Solution: Start the Token Server.
- Check: Is the Token Server listening on the socket specified in the agent configuration file (SLDTKD section)?
  Solution: Modify the Token Server configuration or the agent configuration accordingly, then restart the services.
- Check: Does “telnet <Token_Server_IP_Address> <Port>” launched from the agent’s host connect successfully?
  Solution: Modify the network configuration and/or firewall rules.

Error(s) / Warning(s): cannot resolve Token Server Name '<Token_Server_Hostname>'
- Check: Does “ping <Token_Server_Hostname>” launched from the agent’s host succeed?
  Solution: Check the DNS configuration.

Error(s) / Warning(s): Failed to send request to token server
- Check: Does “telnet <Token_Server_IP_Address> <Port>” launched from the agent’s host connect successfully?
- Check: Does “ping <Token_Server_Hostname>” launched from the agent’s host succeed?
  Solution: Check the DNS configuration.

Error(s) / Warning(s): Not enough token
- Check: Find the latest occurrence of “coll - Token(s) needed” in the sldrmd.log file.
- Check: Log on to the Token Server Web interface and check token availability on the “Token Status” page.
  Solution: Update the license file accordingly (license modifications must be handled by the Sales Administration team).
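
Where telnet or ping is not available on the agent's host, the name resolution and connection checks can be approximated with a short script. This is a minimal Python sketch using only the standard socket module. The function names are illustrative, and the 5-second default timeout simply mirrors the value shown in the sample log message. Unlike ping, can_resolve only tests name resolution and sends no ICMP packets.

```python
import socket


def can_resolve(hostname):
    """Return True if the hostname resolves to at least one address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Run from the agent's host, can_resolve("<Token_Server_Hostname>") returning False points to DNS. If can_connect fails although the name resolves, the cause is more likely a stopped Token Server, a wrong port, or a firewall rule.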
 

Collector listener cannot be bound

Error(s) / Warning(s): coll - vreq - ERROR, no listener has been bound to CNX Manager
Check(s):
- What does the command netstat -av | grep <Collector_port> (UNIX) or netstat -av | find "<Collector_port>" (Windows) return?
- Is the socket configured for the Collector listener available?
- Are all Collector processes stopped? (On Windows: sldrmd.exe and sldrmdhst.exe. On UNIX/Linux: sldrm_coll and sldrm_hist)
Solution:
- Modify the Collector’s configuration so that it starts on a different port.
- Make the currently configured socket available (if necessary, kill the Sysload processes).
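
As a complement to netstat, a script can test whether the configured port is free by attempting to bind to it. This is a minimal Python sketch. The function name port_is_free is illustrative, and the Collector port must be taken from the agent's own configuration. The test must be run while the Collector is stopped, otherwise the Collector itself holds the port.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
            return True
        except OSError:
            # Typically "address already in use": another process holds the port.
            return False
```

The check is deliberately conservative: SO_REUSEADDR is not set, so on some platforms a port with lingering connections may also be reported as busy.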
 

Perfmon counters are not available (Windows platforms)

Error(s) / Warning(s): Several occurrences of “coll - Reallocating 262144 bytes for PerfBuffer” at the end of the sldrmd.log file
Check(s):
- Launch Perfmon and check the availability of counters.
Solution:
- Rebuild the Performance counters (KE000276).
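
The symptom above, repeated reallocation messages at the end of the log, can be detected without opening the file by hand. This is a minimal Python sketch. The function name and the default window of 50 trailing lines are assumptions, since the article does not define how many occurrences count as "several".

```python
from collections import deque

MARKER = "Reallocating 262144 bytes for PerfBuffer"


def count_trailing_reallocations(lines, tail=50):
    """Count MARKER occurrences among the last `tail` lines of an iterable."""
    # A bounded deque keeps only the last `tail` lines, so large logs can be
    # streamed without loading them entirely into memory.
    last_lines = deque(lines, maxlen=tail)
    return sum(1 for line in last_lines if MARKER in line)
```

The function accepts any iterable of lines, including an open sldrmd.log file object. The location of that file depends on the installation.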

Historical files are corrupted

Check(s):
- Does the agent start if you temporarily move the historical files to a different directory?
Solution:
- Rebuild the historical files (sld_rebuild or sldrebuild).
- If the rebuild procedure fails, remove the historical files and start the agent (historical data will be lost).

Historical files are not consistent with the agent’s name

Error(s) / Warning(s): “Checking Station (xxxxxx) --> ERROR - In the Event History File the found station is 'yyyyyy' instead of xxxxxx” in the sldrmd.log file
Check(s):
- Does the agent start if you temporarily move the historical files to a different directory?
Solution:
- Apply the sld_rename/sldrename procedure.
- If the historical data is not needed, simply delete the history files.
 
Too many domains to monitor

Error(s) / Warning(s): “fnBuildTabCom - Too many domains psCom->Igen.sNb_dom(xxxx) >= g_nMaxDom(yyyy)” in the sldrmd.log file
Solution:
- Increase the maximum number of domains.
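
To decide how far the limit must be raised, the two values in the message can be extracted from the log. This is a minimal Python sketch, assuming that the xxxx and yyyy placeholders are plain decimal integers in real logs. The function name parse_domain_limit is illustrative.

```python
import re

# Pattern built from the message quoted in this article; the two captured
# groups are the number of domains found and the configured maximum.
DOMAIN_LIMIT_RE = re.compile(
    r"Too many domains psCom->Igen\.sNb_dom\((\d+)\) >= g_nMaxDom\((\d+)\)"
)


def parse_domain_limit(line):
    """Return (domains_found, max_domains), or None if the line does not match."""
    match = DOMAIN_LIMIT_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

The first value of the returned pair is the number of domains the agent tried to monitor, so the new maximum needs to be higher than that value.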






 