The problem types outlined above are discussed in detail below. Each cause has its own resolution, which follows the description of the problem and the steps to confirm that it applies to what you are seeing.
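As a quick first pass, the two log files can be scanned for the signature messages described in the numbered causes below. This is only a sketch: the function names triage_logs and log_has are hypothetical, the file paths are passed in by the caller, and a match narrows down the candidates rather than proving a cause.

```shell
#!/bin/sh
# Hypothetical helper: true when the fixed string $1 occurs in file $2.
log_has() {
    grep -qF -- "$1" "$2" 2>/dev/null
}

# Hypothetical helper: print the numbered cause(s) that the logs point to.
# Usage: triage_logs <nolio_all.log> <myImpersonateUserId_output.log>
triage_logs() {
    all_log=$1
    out_log=$2
    found=""

    log_has "cannot find process for user" "$all_log" && found="$found 1"
    log_has "java.net.ConnectException: Connection timed out" "$all_log" && found="$found 2"
    # Causes 3 and 7 log the same sudo message, so they cannot be told apart here.
    log_has "no tty present and no askpass program specified" "$out_log" && found="$found 3-or-7"
    log_has "Permission denied" "$out_log" && found="$found 4"
    # Cause 5: the output log was created but is empty.
    [ -f "$out_log" ] && [ ! -s "$out_log" ] && found="$found 5"
    log_has "you must have a tty to run sudo" "$out_log" && found="$found 6"
    log_has "fork: retry:" "$out_log" && found="$found 8"

    echo "possible causes:${found:- none}"
}
```

For example, triage_logs <NolioAgentInstallDir>/logs/nolio_all.log <NolioAgentInstallDir>/logs/myImpersonateUserId_output.log prints a line such as "possible causes: 1", which points to the matching numbered section.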
1. Timeout while communicating with a sub-process that has established a connection with the parent process
If this is the cause of the standard user interface error, the following messages are typically seen in the nolio_all.log file:
2016-02-01 10:53:41,937 [ProcessInvoker-Server-0] DEBUG (com.nolio.platform.shared.datamodel.execution.remote.ProcessesInvoker:485) - Handling remote process request...
2016-02-01 10:53:42,045 [ProcessInvoker-Server-0] INFO (com.nolio.platform.shared.datamodel.execution.remote.ProcessesInvoker:495) - new user process connected: myImpersonateUserId
2016-02-01 10:53:42,045 [ProcessInvoker-Server-0] ERROR (com.nolio.platform.shared.datamodel.execution.remote.ProcessesInvoker:510) - error handling remote process connection for user myImpersonateUserId
java.lang.IllegalStateException: cannot find process for user myImpersonateUserId
at com.nolio.platform.shared.datamodel.execution.remote.ProcessesInvoker.handleClient(ProcessesInvoker.java:498)
at com.nolio.platform.shared.datamodel.execution.remote.ProcessesInvoker.run(ProcessesInvoker.java:461)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
2016-02-01 10:53:41,940 [main] DEBUG (root:186) - [myImpersonateUserId] connect successfully. sending username myImpersonateUserId
2016-02-01 10:54:41,914 [Thread-0] INFO (root:394) - [myImpersonateUserId] process of user myImpersonateUserId is up
2016-02-01 10:55:17,352 [HealthMonitor] DEBUG
Also, the following messages are typically logged to the <NolioAgentInstallDir>/logs/myImpersonateUserId_output.log:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/LISAReleaseAutomationAgent/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/LISAReleaseAutomationAgent/actionslib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]NolioSocketAppender is not ready yet!
10:53:41,775 [main] DEBUG (ProcessesConfig.java:89) - remote process execution properties:{cmd.to.execute=./SudoActionsRunner.sh, max.processes=5, core.agent.instance.action.threads=10, max.process.action.threads=5, server.timeout=20000, ssh.port=22, grant.permissions=false, max.idle.time.in.seconds=900, process.timeout=20000}[NolioSocketAppender is not initialized yet] [null] remote process execution properties:{cmd.to.execute=./SudoActionsRunner.sh, max.processes=5, core.agent.instance.action.threads=10, max.process.action.threads=5, server.timeout=20000, ssh.port=22, grant.permissions=false, max.idle.time.in.seconds=900, process.timeout=20000}
10:53:41,812 [main] INFO (ProcessesConfig.java:128) - ProcessInvoker properties:processCreationTimeout-20000, serverInitializationTimeout-20000, grantPermissions-false, coreAgentInstanceActionThreads-10, maxAgentInstanceActionThreads-40, maxRemoteProcesses-5, maxRemoteProcessActionThreads-5, maxIdleTimeInSeconds-900, cmdToExe-./SudoActionsRunner.sh, sshPort-22[NolioSocketAppender is not initialized yet] [null] ProcessInvoker properties:processCreationTimeout-20000, serverInitializationTimeout-20000, grantPermissions-false, coreAgentInstanceActionThreads-10, maxAgentInstanceActionThreads-40, maxRemoteProcesses-5, maxRemoteProcessActionThreads-5, maxIdleTimeInSeconds-900, cmdToExe-./SudoActionsRunner.sh, sshPort-22ActionRunner is running
10:53:41,847 [main] INFO (ActionsRunner.java:137) - changing log configuration[NolioSocketAppender is not initialized yet] [null] changing log configuration
10:53:41,940 [main] DEBUG (ActionsRunner.java:186) - connect successfully. sending username myImpersonateUserId
10:54:41,914 [Thread-0] INFO (ActionsRunner.java:394) - process of user myImpersonateUserId is up
Note:
The "connect successfully. sending username myImpersonateUserId" message in the myImpersonateUserId_output.log confirms that the sub-process has established a connection with the parent process. The timeout in this case occurs while the parent process is sending the sub-process its instructions.
The message "cannot find process for user myImpersonateUserId" in the nolio_all.log file is the message that ties this cause to its respective solution.
Resolution:
Increase the value of "process.timeout" in <NolioAgentInstallDir>/conf/processes.properties. The default is 20000 milliseconds (20 seconds).
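The property can be edited by hand. For a scripted change, the following is a minimal sketch, assuming the file uses plain key=value lines with no spaces around the equals sign and that the new value is numeric. The function name set_property is hypothetical and is not part of the product.

```shell
#!/bin/sh
# Hypothetical helper: set KEY=VALUE in a key=value properties file.
# Keeps a copy of the original as FILE.bak and appends the key if absent.
# Assumes VALUE contains no characters special to sed (numeric values are fine).
set_property() {
    file=$1
    key=$2
    value=$3
    # Escape dots so "process.timeout" is matched literally.
    escaped_key=$(printf '%s' "$key" | sed 's/\./\\./g')
    cp "$file" "$file.bak" || return 1
    if grep -q "^${escaped_key}=" "$file.bak"; then
        sed "s/^${escaped_key}=.*/${key}=${value}/" "$file.bak" > "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}
```

For example, set_property <NolioAgentInstallDir>/conf/processes.properties process.timeout 60000 raises the timeout to 60 seconds and leaves the previous file in processes.properties.bak.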
2. Timeout while waiting for the sub-process to connect with the parent process
If this is the cause of the standard user interface error then the following messages will be found in the nolio_all.log file:
2018-08-13 13:05:58,521 [main] ERROR (root:195) - [myImpersonateUserId] error occurred
java.net.ConnectException: Connection timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at java.net.Socket.<init>(Unknown Source)
at java.net.Socket.<init>(Unknown Source)
at com.nolio.platform.shared.datamodel.execution.remote.ActionsRunner.init(ActionsRunner.java:183)
at com.nolio.platform.shared.datamodel.execution.remote.ActionsRunner.main(ActionsRunner.java:107)
This is seen after:
a. The exception that accompanies the standard user interface error. Example:
2018-08-13 13:05:55,351 [job-3735684-jobServer-3735797-6:Run as myImpersonateUserId(P8633694000.F8633698000.E8633699000):Run Command Line] ERROR (com.nolio.platform.shared.datamodel.Action:119) - Exception caught: com.nolio.platform.shared.datamodel.execution.remote.ProcessCreationFailedException: Unable to run action for user myImpersonateUserId. Verify the user running Nolio agent has privileges to impersonate other users, and that user myImpersonateUserId has permissions on Nolio installation folder.
b. A message that may or may not appear regarding the process being up. Example:
2018-08-13 13:05:55,521 [Thread-0] INFO (root:394) - [myImpersonateUserId] process of user myImpersonateUserId is up
The message indicating that the process is up is a little misleading. The process does start, but a successful connection between the sub-process and the agent (parent) process involves two connections: one connection/port that the parent process uses to send instructions, and another that the parent uses to read the stdout/stderr generated by the commands the sub-process runs. The second port is referred to as the "logger port". The problem reported to technical support was related to a timeout on the logger port, which generated the following additional message in the nolio_all.log file:
2018-08-13 13:05:58,531 [Thread-11] ERROR (com.nolio.platform.shared.datamodel.execution.remote.ConsolidationLogger:98) - Error while handling remote logging
java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
at java.io.ObjectInputStream.readObject0(Unknown Source)
at java.io.ObjectInputStream.readObject(Unknown Source)
at com.nolio.platform.shared.datamodel.execution.remote.ConsolidationLogger$ClientLoggerHolder.run(ConsolidationLogger.java:87)
at java.lang.Thread.run(Unknown Source)
The myImpersonateUserId_output.log looks fairly standard, with one exception: the "process of user myImpersonateUserId is up" message is seen only after the timeout has elapsed (in this case the timeout had already been increased to 60 seconds):
13:04:55,516 [main] INFO (ActionsRunner.java:137) - changing log configuration[NolioSocketAppender is not initialized yet] [null] changing log configuration
13:05:55,521 [Thread-0] INFO (ActionsRunner.java:394) - process of user myImpersonateUserId is up
13:05:58,521 [main] ERROR (ActionsRunner.java:195) - error occurred
java.net.ConnectException: Connection timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at java.net.Socket.<init>(Unknown Source)
at java.net.Socket.<init>(Unknown Source)
at com.nolio.platform.shared.datamodel.execution.remote.ActionsRunner.init(ActionsRunner.java:183)
at com.nolio.platform.shared.datamodel.execution.remote.ActionsRunner.main(ActionsRunner.java:107)
The problem was found to be caused by an invalid DNS entry for the server name. This was detected by running "netstat -aonp | grep <pid of sub-process>" and seeing output like the following:
tcp 0 1 . ::ffff:192.168.0.12:33258 ::ffff:192.168.2.12:36758 SYN_SENT <pid of sub process>/./jre/bin/Nol <etc.>
In this case the local address (192.168.0.12) is the correct address and the foreign address (192.168.2.12) is invalid. An nslookup of the server name returned the invalid address. This address is the one used for the parent process's end of the logger port connection.
Resolution:
- Update DNS. After DNS is updated, stop the agent, flush the DNS cache (and possibly the ARP cache), and then restart the agent; or
- Update /etc/hosts with an entry that maps the correct IP address to this machine's hostname (both the short and the fully qualified hostname).
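Before and after applying either fix, it can help to confirm that the machine's own hostname resolves to an address that is actually configured on the machine. The sketch below assumes a Linux host where getent and hostname -I are available. The function names addr_is_local and check_own_hostname are hypothetical.

```shell
#!/bin/sh
# Succeeds when address $1 appears in the whitespace-separated list $2.
addr_is_local() {
    addr=$1
    for candidate in $2; do
        [ "$candidate" = "$addr" ] && return 0
    done
    return 1
}

# Resolve this machine's hostname through the system resolver
# (/etc/hosts and DNS, in the order set by nsswitch.conf) and
# compare the first result with the addresses configured locally.
check_own_hostname() {
    name=$(hostname)
    resolved=$(getent hosts "$name" | awk 'NR==1 {print $1}')
    if addr_is_local "$resolved" "$(hostname -I)"; then
        echo "$name resolves to local address $resolved"
    else
        echo "$name resolves to '$resolved', which is not a local address"
        return 1
    fi
}
```

Running check_own_hostname on the agent machine reports a mismatch in a situation like the one above, where the name resolved to 192.168.2.12 while the machine's address was 192.168.0.12. Because hostname -I omits loopback addresses, a hostname mapped to a loopback address is also reported as a mismatch and needs to be judged separately.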
3. Noliouser does not have access to execute processes on behalf of myImpersonateUserId
If this is the cause of the standard user interface error then the following message is usually seen in the myImpersonateUserId_output.log:
sudo: no tty present and no askpass program specified
You can usually confirm this by logging into myAgentMachine as noliouser and running:
sudo -u myImpersonateUserId echo test
Good result output:
test
Bad result:
Sorry, user noliouser is not allowed to execute '/bin/echo test' as myImpersonateUserId on myAgentMachine.
Resolution:
Add one (only one) of the following lines to the /etc/sudoers file:
- noliouser ALL=(myImpersonateUserId) NOPASSWD:ALL
- noliouser ALL=(ALL) NOPASSWD:ALL
- noliouser ALL=(myImpersonateUserId) ALL
- noliouser ALL=(ALL) ALL
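After a sudoers entry is in place, the confirmation step can be repeated without an interactive prompt. The sketch below is illustrative only: the function name can_impersonate and the SUDO_BIN override (present purely so the function can be exercised with a stand-in command) are hypothetical. It relies on sudo's -n option, which makes sudo fail instead of prompting when a password would be required, so an entry without NOPASSWD is reported as failing here.

```shell
#!/bin/sh
# Hypothetical helper: check whether the current user can run a command
# as TARGET_USER through sudo without being prompted for a password.
# SUDO_BIN is a hypothetical override; it defaults to the real sudo.
can_impersonate() {
    target_user=$1
    # -n: never prompt; exit with an error if a password is needed.
    if "${SUDO_BIN:-sudo}" -n -u "$target_user" true 2>/dev/null; then
        echo "ok"
    else
        echo "denied-or-password-required"
        return 1
    fi
}
```

Run as noliouser on myAgentMachine, can_impersonate myImpersonateUserId prints "ok" when a passwordless entry applies.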
4. The myImpersonateUserId does not have r-x access to <NolioAgentInstallDir>/jre folder/files and sub folder/files
When myImpersonateUserId does not have r-x access to the <NolioAgentInstallDir>/jre folder, its files, and its subfolders, the following message may be logged to the myImpersonateUserId_output.log:
./ActionsRunner.sh: line 14: ./jre/bin/NolioAgent: Permission denied
Resolution:
chmod -R 755 <NolioAgentInstallDir>/jre
5. The myImpersonateUserId does not have access to any folders/files except <NolioAgentInstallDir>/jre
When myImpersonateUserId has access to <NolioAgentInstallDir>/jre but not to the other files and folders under <NolioAgentInstallDir>, the myImpersonateUserId_output.log is created but is empty.
Resolution:
chmod -R 755 <NolioAgentInstallDir>
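To see which entries are affected before (or after) changing permissions, the installation tree can be scanned with find. This is a sketch under a stated assumption: myImpersonateUserId is neither the owner of the files nor a member of their group, so the "other" permission bits are the ones that apply. The function name list_inaccessible is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list entries under DIR that the "other" permission
# class cannot use. An entry is listed when any of these holds:
#   - it is not readable by others
#   - it is a directory that others cannot search (no execute bit)
#   - it is a file its owner can execute but others cannot
list_inaccessible() {
    dir=$1
    find "$dir" \( \
        ! -perm -004 \
        -o \( -type d ! -perm -001 \) \
        -o \( -type f -perm -100 ! -perm -001 \) \
    \) -print
}
```

For example, list_inaccessible <NolioAgentInstallDir>/jre lists a ./jre/bin/NolioAgent binary whose mode is 744. Empty output means nothing under that folder is blocked by the "other" permission bits.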
6. The noliouser does not have "Defaults:noliouser !requiretty" in sudoers
The following message can be seen in the <NolioAgentInstallDir>/logs/myImpersonateUserId_output.log:
sudo: sorry, you must have a tty to run sudo
Resolution:
Add the following line to your /etc/sudoers file:
Defaults:noliouser !requiretty
7. The sudoers entry for noliouser allows impersonating myImpersonateUserId but is not configured appropriately
The following message can be seen in the <NolioAgentInstallDir>/logs/myImpersonateUserId_output.log:
sudo: no tty present and no askpass program specified
Resolution:
Add the following line to your /etc/sudoers file:
noliouser ALL = (myImpersonateUserId) /<NolioAgentInstallDir>/ActionsRunner.sh
or
noliouser ALL = (myImpersonateUserId) NOPASSWD:/<NolioAgentInstallDir>/ActionsRunner.sh
8. The myImpersonateUserId does not have appropriate resources
The following message can be seen in the <NolioAgentInstallDir>/logs/nolio_all.log:
No privileges.
And the following message can also be seen in the <NolioAgentInstallDir>/logs/myImpersonateUserId_output.log:
.../<NolioAgentInstallDir>/ActionsRunner.sh: fork: retry: Resource temporarily unavailable...
Resolution:
Compare the output of the ulimit command when run as noliouser and when run as myImpersonateUserId. Make sure that the limits for myImpersonateUserId are greater than or equal to the limits set for noliouser.
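The comparison can be scripted one limit at a time. The sketch below handles the "unlimited" value that ulimit may print. The function name limit_ge is hypothetical, and the commented usage assumes that bash is installed and that the fork error relates to the maximum number of user processes (ulimit -u in bash), which is an assumption rather than something the logs state.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when limit A is greater than or equal to
# limit B. Each value is a non-negative integer or the word "unlimited".
limit_ge() {
    a=$1
    b=$2
    [ "$a" = "unlimited" ] && return 0
    [ "$b" = "unlimited" ] && return 1
    [ "$a" -ge "$b" ]
}

# Example usage, run as noliouser (left commented out so that merely
# sourcing this file does not invoke sudo):
#   own=$(bash -c 'ulimit -u')
#   target=$(sudo -u myImpersonateUserId bash -c 'ulimit -u')
#   limit_ge "$target" "$own" || echo "max user processes is lower for myImpersonateUserId"
```

Limits obtained through sudo may differ from those applied to the agent's sub-process, so the result is a hint to investigate further and not a definitive answer.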