Bug 161096 - ServerNotAvailableException when launching tests on Linux.
Summary: ServerNotAvailableException when launching tests on Linux.
Status: CLOSED FIXED
Alias: None
Product: z_Archived
Classification: Eclipse Foundation
Component: TPTP
Version: unspecified
Hardware: PC Linux
Importance: P1 blocker
Target Milestone: ---
Assignee: Igor Alelekov CLA
QA Contact:
URL:
Whiteboard:
Keywords: plan
Depends on: 175264
Blocks: 125103
 
Reported: 2006-10-16 13:02 EDT by Paul Slauenwhite CLA
Modified: 2016-05-05 11:02 EDT (History)

See Also:


Attachments
Service log (DEBUG logging level) (23.11 KB, application/x-zip-compressed)
2006-10-16 13:03 EDT, Paul Slauenwhite CLA
ACStart.sh (1.69 KB, application/octet-stream)
2007-02-02 05:47 EST, Igor Alelekov CLA
ACStop.sh (981 bytes, application/octet-stream)
2007-02-02 05:49 EST, Igor Alelekov CLA

Description Paul Slauenwhite CLA 2006-10-16 13:02:39 EDT
ServerNotAvailableException when launching tests on Linux.

Using the TPTP-4.3.0-200610160100 build, launching a test (e.g. manual) on Linux (Linux paules 2.4.27-2-386 #1 Mon May 16 16:47:51 JST 2005 i686 GNU/Linux) throws the following ServerNotAvailableException on the client side:

java.lang.RuntimeException: org.eclipse.hyades.internal.execution.core.file.ServerNotAvailableException: java.net.ConnectException: Connection refused: connect
	at org.eclipse.hyades.execution.harness.JavaExecutionDeploymentAdapter.deployFile(JavaExecutionDeploymentAdapter.java:433)
	at org.eclipse.hyades.execution.harness.JavaExecutionDeploymentAdapter.deployToNode(JavaExecutionDeploymentAdapter.java:496)
	at org.eclipse.hyades.execution.harness.JavaExecutionDeploymentAdapter.deployTestAssets(JavaExecutionDeploymentAdapter.java:469)
	at org.eclipse.hyades.execution.harness.util.ExecutionAdapterUtilities.adaptExecutionDeployment(ExecutionAdapterUtilities.java:159)
	at org.eclipse.hyades.execution.harness.TestExecutionHarness.launchTestExecution(TestExecutionHarness.java:1973)
	at org.eclipse.hyades.execution.harness.TestExecutionHarness.access$3(TestExecutionHarness.java:1817)
	at org.eclipse.hyades.execution.harness.TestExecutionHarness$2.run(TestExecutionHarness.java:741)
	at org.eclipse.core.internal.jobs.Worker.run(Worker.java:58)
Caused by: org.eclipse.hyades.internal.execution.core.file.ServerNotAvailableException: java.net.ConnectException: Connection refused: connect
	at org.eclipse.hyades.internal.execution.core.file.dynamic.FileServerCommandFactory.connectSocketChannel(FileServerCommandFactory.java:311)
	at org.eclipse.hyades.internal.execution.core.file.dynamic.FileServerCommandFactory.connectSocketChannel(FileServerCommandFactory.java:269)
	at org.eclipse.hyades.internal.execution.core.file.dynamic.FileServerCommandFactory.createPutFileCommand(FileServerCommandFactory.java:535)
	at org.eclipse.hyades.execution.local.file.FileManagerExtendedImpl.putFile(FileManagerExtendedImpl.java:992)
	at org.eclipse.hyades.execution.local.file.FileManagerExtendedImpl.putFile(FileManagerExtendedImpl.java:1056)
	at org.eclipse.hyades.execution.local.file.FileManagerExtendedImpl.putFile(FileManagerExtendedImpl.java:1024)
	at org.eclipse.hyades.execution.local.file.FileManagerExtendedImpl.putFile(FileManagerExtendedImpl.java:1009)
	at org.eclipse.hyades.execution.harness.JavaExecutionDeploymentAdapter.deployFile(JavaExecutionDeploymentAdapter.java:427)
	... 7 more
Caused by: java.net.ConnectException: Connection refused: connect
	at sun.nio.ch.Net.connect(Native Method)
	at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:481)
	at org.eclipse.hyades.internal.execution.core.file.socket.SocketChannelFactory.create(SocketChannelFactory.java:78)
	at org.eclipse.hyades.internal.execution.core.file.socket.SocketChannelFactory.create(SocketChannelFactory.java:63)
	at org.eclipse.hyades.internal.execution.core.file.dynamic.FileServerCommandFactory.connectSocketChannel(FileServerCommandFactory.java:300)
	... 14 more
Comment 1 Paul Slauenwhite CLA 2006-10-16 13:03:19 EDT
Created attachment 52044 [details]
Service log (DEBUG logging level)
Comment 2 Paul Slauenwhite CLA 2006-10-16 13:04:26 EDT
This defect was reproduced using the new Agent Controller for Linux (LINUX-IA32)
Comment 3 Joe Toomey CLA 2006-10-16 13:11:25 EDT
I think this is a known issue where you need to add the libjvm.so path to the LD_LIBRARY_PATH.  I hope that requirement will go away before we ship.
Comment 4 Paul Slauenwhite CLA 2006-10-16 13:12:04 EDT
Could this be caused by not having the JVM libraries (e.g. jre/bin/classic/libjvm.so) added to the LD_LIBRARY_PATH?  There are no new install instructions in the getting_started.html.
Comment 5 Karla Callaghan CLA 2006-10-16 16:24:57 EDT
Yes, you need to have the "classic" directory in the path when using the IBM JVM.  I expect you would see the same issue when using the RAC.  If you launch the AC using the start script RAStart.sh (or ACStart.sh), the path is taken care of for you; otherwise, you need to add it to LD_LIBRARY_PATH.

We can add a note to the getting started about having to do this if the startup scripts are not used. It's not in there now because it tells the user to use RAStart so that they don't have to do this.

Can you confirm this fixes your problem and either close the bug or mark it as a Doc need?
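
As a rough illustration of the manual workaround described above, the following shell sketch prepends a JVM library directory to LD_LIBRARY_PATH only when it is not already listed. The function name prepend_ld_path and the JAVA_INSTALL_DIR variable are hypothetical and not part of the Agent Controller scripts; the jre/bin/classic location is the IBM JVM example from comment 4.

```shell
# Prepend a directory to LD_LIBRARY_PATH unless it is already listed.
# prepend_ld_path is a hypothetical helper, not part of RAStart.sh/ACStart.sh.
prepend_ld_path() {
    dir=$1
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$dir:"*) ;;  # already present, leave the variable unchanged
        *) LD_LIBRARY_PATH="$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export LD_LIBRARY_PATH
}

# Example for an IBM JRE (JAVA_INSTALL_DIR is an assumed variable
# pointing at the directory that contains jre/):
# prepend_ld_path "$JAVA_INSTALL_DIR/jre/bin/classic"
```

This is only needed when the Agent Controller is started without the provided start scripts, which set the path themselves.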
Comment 6 Thayaparan Shanmugaratnam CLA 2006-10-16 18:43:57 EDT
I am also seeing similar behavior, but I am not sure if this is a separate bug or whether it is the cause of this bug.

Build: TPTP-4.3.0-200610090100
Steps:
1. Install AC with security enabled (No problem to report when security is disabled)
2. Start the AC using RAStart.sh or ACStart.sh
3. From your workbench, try a TestConnection

Notice that either your workbench will hang or nothing will happen.  Cindy and I were able to see this problem on 2 separate Linux machines.
In the servicelog.log here is the error that appears:

java.lang.IllegalArgumentException: port out of range:134683320
	at java.net.InetSocketAddress.<init>(Unknown Source)
	at com.sun.net.ssl.internal.ssl.SSLSocketImpl.<init>(Unknown Source)
	at com.sun.net.ssl.internal.ssl.SSLSocketFactoryImpl.createSocket(Unknown Source)
	at org.eclipse.hyades.internal.execution.local.control.SecureConnectionImpl.connect(SecureConnectionImpl.java:95)
	at org.eclipse.hyades.internal.execution.local.control.NodeImpl.connect(NodeImpl.java:324)
	at org.eclipse.hyades.security.internal.util.BaseConnectUtil.secureConnect(BaseConnectUtil.java:430)
	at org.eclipse.hyades.security.internal.util.BaseConnectUtil.connect(BaseConnectUtil.java:221)
	at org.eclipse.hyades.security.internal.util.BaseConnectUtil.connect(BaseConnectUtil.java:538)
	at org.eclipse.hyades.security.internal.util.BaseConnectUtil.connect(BaseConnectUtil.java:199)
	at org.eclipse.hyades.trace.ui.HyadesUtil.testConnection(HyadesUtil.java:599)
	at org.eclipse.hyades.trace.ui.internal.core.TraceHostUI$1.run(TraceHostUI.java:564)
	at org.eclipse.swt.custom.BusyIndicator.showWhile(BusyIndicator.java:67)
	at org.eclipse.hyades.trace.ui.internal.core.TraceHostUI.testConnection(TraceHostUI.java:490)
	at org.eclipse.hyades.trace.ui.internal.core.TraceHostUI.widgetSelected(TraceHostUI.java:731)
	at org.eclipse.swt.widgets.TypedListener.handleEvent(TypedListener.java:90)
	at org.eclipse.swt.widgets.EventTable.sendEvent(EventTable.java:66)
	at org.eclipse.swt.widgets.Widget.sendEvent(Widget.java:928)
	at org.eclipse.swt.widgets.Display.runDeferredEvents(Display.java:3348)
	at org.eclipse.swt.widgets.Display.readAndDispatch(Display.java:2968)
	at org.eclipse.jface.window.Window.runEventLoop(Window.java:820)
	at org.eclipse.jface.window.Window.open(Window.java:796)
	at org.eclipse.debug.internal.ui.launchConfigurations.LaunchConfigurationsDialog.open(LaunchConfigurationsDialog.java:1086)
	at org.eclipse.debug.ui.DebugUITools$1.run(DebugUITools.java:383)
	at org.eclipse.swt.custom.BusyIndicator.showWhile(BusyIndicator.java:67)
	at org.eclipse.debug.ui.DebugUITools.openLaunchConfigurationDialogOnGroup(DebugUITools.java:387)
	at org.eclipse.debug.ui.DebugUITools.openLaunchConfigurationDialogOnGroup(DebugUITools.java:329)
	at org.eclipse.debug.ui.actions.OpenLaunchDialogAction.run(OpenLaunchDialogAction.java:80)
	at org.eclipse.jface.action.Action.runWithEvent(Action.java:499)
	at org.eclipse.jface.action.ActionContributionItem.handleWidgetSelection(ActionContributionItem.java:539)
	at org.eclipse.jface.action.ActionContributionItem.access$2(ActionContributionItem.java:488)
	at org.eclipse.jface.action.ActionContributionItem$5.handleEvent(ActionContributionItem.java:400)
	at org.eclipse.swt.widgets.EventTable.sendEvent(EventTable.java:66)
	at org.eclipse.swt.widgets.Widget.sendEvent(Widget.java:928)
	at org.eclipse.swt.widgets.Display.runDeferredEvents(Display.java:3348)
	at org.eclipse.swt.widgets.Display.readAndDispatch(Display.java:2968)
	at org.eclipse.ui.internal.Workbench.runEventLoop(Workbench.java:1914)
	at org.eclipse.ui.internal.Workbench.runUI(Workbench.java:1878)
	at org.eclipse.ui.internal.Workbench.createAndRunWorkbench(Workbench.java:419)
	at org.eclipse.ui.PlatformUI.createAndRunWorkbench(PlatformUI.java:149)
	at org.eclipse.ui.internal.ide.IDEApplication.run(IDEApplication.java:95)
	at org.eclipse.core.internal.runtime.PlatformActivator$1.run(PlatformActivator.java:78)
	at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:92)
	at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:68)
	at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:400)
	at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:177)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
	at java.lang.reflect.Method.invoke(Unknown Source)
	at org.eclipse.core.launcher.Main.invokeFramework(Main.java:336)
	at org.eclipse.core.launcher.Main.basicRun(Main.java:280)
	at org.eclipse.core.launcher.Main.run(Main.java:977)
	at org.eclipse.core.launcher.Main.main(Main.java:952)

On the server side, if you issue netstat -a |grep 1000, you will notice that connections to port 10002 are in the CLOSE_WAIT state for every test connection that you attempt.
If this is a separate bug, please let me know and I will create a new bug report.
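
The netstat check above can be narrowed to an exact count. The sketch below is a hypothetical helper (count_close_wait is not an existing tool) that reads netstat -an style output on stdin and counts sockets in CLOSE_WAIT on a given local port. It assumes the Linux net-tools column layout, with the local address in field 4 and the state in field 6.

```shell
# Count sockets in CLOSE_WAIT whose local address ends in the given port.
# Reads `netstat -an` style output on stdin. Assumed Linux column layout:
# field 4 = local address, field 6 = state.
count_close_wait() {
    awk -v port="$1" '
        $6 == "CLOSE_WAIT" && $4 ~ (":" port "$") { n++ }
        END { print n + 0 }
    '
}

# Usage on the Agent Controller host:
# netstat -an | count_close_wait 10002
```

If the count grows by one after each Test Connection attempt, that matches the behavior described in the steps above.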

Comment 7 Karla Callaghan CLA 2006-10-16 19:58:32 EDT
Thay - I don't see the relationship between your entry and Paul's original report.  He is reporting a different exception entirely and makes no mention of security being enabled.  I think his is a problem with the LD_LIBRARY_PATH.

Your exception is caused by an invalid port value used in trying to obtain the secure socket.  Please open a different bug.
Comment 8 Paul Slauenwhite CLA 2006-10-17 08:52:44 EDT
(In reply to comment #5)

Karla, I am using the RAStart.sh script which sets the path correctly.  Also, manually adding the JRE's classic directory to the LD_LIBRARY_PATH variable does not resolve the problem.
Comment 9 Paul Slauenwhite CLA 2006-10-17 09:59:16 EDT
This defect only affects local test execution on Linux (x86).
Comment 10 Karla Callaghan CLA 2006-10-17 13:16:56 EDT
Changed the "Version" field to 4.3 since this was seen using 4.3 driver pkg.

Added Target and Priority and assigning to Kevin.
Comment 11 Karla Callaghan CLA 2006-10-24 12:58:54 EDT
Retarget to 4.4.  This is a blocker to the adoption of the new AC's backwards compatibility on Linux (125103).
Comment 12 Karla Callaghan CLA 2007-01-25 13:11:30 EST
Assigning new AC Linux bugs to Igor, consider for 4.4.
Comment 13 Igor Alelekov CLA 2007-02-02 05:44:47 EST
Paul, could you repeat the test case with the new ACStart.sh and ACStop.sh scripts I attached? Please replace the original scripts in the $AC_HOME/bin folder with the two new ones and use them to launch and stop the AC.
Comment 14 Igor Alelekov CLA 2007-02-02 05:47:33 EST
Created attachment 58102 [details]
ACStart.sh
Comment 15 Igor Alelekov CLA 2007-02-02 05:49:31 EST
Created attachment 58103 [details]
ACStop.sh
Comment 16 Paul Slauenwhite CLA 2007-02-06 15:36:18 EST
(In reply to comment #13)
> Paul, could you repeat the test case with new ACStart.sh and ACStop.sh scripts
> I attached? Please replace original scripts in the $AC_HOME/bin folder by the
> new two and use them to launch and stop AC.
> 

After trying the attached scripts using the 4.3.1 TP1 driver on SLES 10, I get the following error when starting the Agent Controller:

Error starting transport layers, Agent controller exiting.
See servicelog.log for error report.
ACServer failed to start.

servicelog.log:
<CommonBaseEvent creationTime="2007-02-06T20:32:55.874455Z" globalInstanceId="A5C8E5F7000D57D07FDD553B746F2220" msg="Unable to create shared memory: acbuffer." severity="50" version="1.0.1">
	<sourceComponentId component="AgentController" componentIdType="TPTPComponent" executionEnvironment="SharedMemListener.c, line 222" instanceId="1003" location="paules.site" locationType="IPV4" processId="3898" subComponent="Shared Memory TL" threadId="3898" componentType="Eclipse_TPTP"/>
	<situation categoryName="ReportSituation">
		<situationType xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="ReportSituation" reasoningScope="INTERNAL" reportCategory="LOG"/>
	</situation>
</CommonBaseEvent>
<CommonBaseEvent creationTime="2007-02-06T20:32:55.876971Z" globalInstanceId="A5C8E5F7000D61A56C2F92B7065D58DC" msg="An error was returned from TransportLayer(1003)::startTransportLayer errNum = -3" severity="50" version="1.0.1">
	<sourceComponentId component="AgentController" componentIdType="TPTPComponent" executionEnvironment="ConnectionManager.c, line 263" instanceId="2" location="paules.site" locationType="IPV4" processId="3898" subComponent="Connection Manager" threadId="3898" componentType="Eclipse_TPTP"/>
	<situation categoryName="ReportSituation">
		<situationType xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="ReportSituation" reasoningScope="INTERNAL" reportCategory="LOG"/>
	</situation>
</CommonBaseEvent>

Environment:
LIBPATH=/home/ractest/rac/lib:$LIBPATH
#IBM JRE V1.5.0 SR3:
PATH=/home/ractest/rac/bin:/home/ractest/IBM_JRE_V1.5.0_SR_3/jre/bin:$PATH
MOZILLA_FIVE_HOME=/usr/lib/mozilla
LD_LIBRARY_PATH=$MOZILLA_FIVE_HOME:$LD_LIBRARY_PATH
Comment 17 Igor Alelekov CLA 2007-02-07 03:28:42 EST
It seems this is another defect, in shared memory creation on Linux.
After the AC is stopped, a shared memory block remains in the system, and another user cannot launch the AC because of permission restrictions (the previous user is marked as the owner of the memory block). A possible solution is to free (destroy) this shared memory block during AC termination.

Coming back to the initial bug, please try to launch the AC with the new ACStart.sh script as superuser (root), or try to delete the shared memory blocks manually (ipcrm -m <shmid>, where <shmid> can be found with ipcs -m).
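
The manual cleanup can be sketched as a small filter over the ipcs -m output. The helper name shm_ids_owned_by is hypothetical, and the sketch assumes the util-linux column order (key, shmid, owner, ...), in which data rows begin with a hexadecimal key. It only prints segment IDs; the removal itself is left as a commented-out usage line because it is destructive.

```shell
# Print the shmid of every shared memory segment owned by the given user.
# Reads `ipcs -m` output on stdin. Assumes util-linux column order:
# key, shmid, owner, perms, bytes, nattch, status.
shm_ids_owned_by() {
    awk -v owner="$1" '$1 ~ /^0x/ && $3 == owner { print $2 }'
}

# Usage (destructive; "olduser" is a placeholder for the previous AC user):
# for id in $(ipcs -m | shm_ids_owned_by olduser); do ipcrm -m "$id"; done
```

Filtering by owner alone may also match segments that belong to unrelated applications, so review the printed IDs before removing anything.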
Comment 18 Paul Slauenwhite CLA 2007-02-23 12:09:35 EST
(In reply to comment #17)

Sorry for the late reply.

Using the attached startup scripts with a 32-bit JRE/Agent Controller, I cannot even start the Agent Controller despite having root privileges:

ractest@paules:~/rac/bin> su
Password:
paules:/home/ractest/rac/bin # ./ACStart.sh
Starting Agent Controller.
Error starting transport layers, Agent controller exiting.
See servicelog.log for error report.
ACServer failed to start.

servicelog.log:
<CommonBaseEvent creationTime="2007-02-23T17:03:10.403270Z" globalInstanceId="A5DF1E4E0006273952298D8535E1CC39" msg="Unable to create shared memory: acbuffer." severity="50" version="1.0.1">
	<sourceComponentId component="AgentController" componentIdType="TPTPComponent" executionEnvironment="SharedMemListener.c, line 222" instanceId="1003" location="paules.site" locationType="IPV4" processId="7329" subComponent="Shared Memory TL" threadId="7329" componentType="Eclipse_TPTP"/>
	<situation categoryName="ReportSituation">
		<situationType xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="ReportSituation" reasoningScope="INTERNAL" reportCategory="LOG"/>
	</situation>
</CommonBaseEvent>
<CommonBaseEvent creationTime="2007-02-23T17:03:10.414401Z" globalInstanceId="A5DF1E4E000652BD4A255C1B395B47F8" msg="An error was returned from TransportLayer(1003)::startTransportLayer errNum = -3" severity="50" version="1.0.1">
	<sourceComponentId component="AgentController" componentIdType="TPTPComponent" executionEnvironment="ConnectionManager.c, line 263" instanceId="2" location="paules.site" locationType="IPV4" processId="7329" subComponent="Connection Manager" threadId="7329" componentType="Eclipse_TPTP"/>
	<situation categoryName="ReportSituation">
		<situationType xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="ReportSituation" reasoningScope="INTERNAL" reportCategory="LOG"/>
	</situation>
</CommonBaseEvent>

This appears to be the same as https://bugs.eclipse.org/bugs/show_bug.cgi?id=175264.
Comment 19 Paul Slauenwhite CLA 2007-02-23 12:10:43 EST
(In reply to comment #18)

Note, I was using the agntctrl.linux_ia32-TPTP-4.4.0-200702211545 driver.
Comment 20 Igor Alelekov CLA 2007-02-26 04:05:59 EST
I've successfully installed and launched the agntctrl.linux_ia32-TPTP-4.4.0-200702211545 AC build on RHEL 4 upd 3 with IBM and Sun JREs and on RHAS 2.1 with the Sun JRE.
(Note: this AC build doesn't require the corrected ACStart.sh script; only ACStop.sh should be substituted.)
Paul, what OS did you use?
Please ensure that no other AC instances are running and attach the service log at DEBUG level.
Comment 21 Paul Slauenwhite CLA 2007-02-26 06:54:29 EST
(In reply to comment #20)

Both SLES 9 and 10 showed the symptoms.  This is a clean test machine with no other Agent Controller installs/instances.
Comment 22 Igor Alelekov CLA 2007-02-27 02:48:31 EST
Paul,
I successfully launched the AC on SLES 9 and SLES 10 with IBM and Sun JVMs.
Is your installation special in any way?
Please submit the output of the "ipcs -m" command on your computer.
If you see allocated shared memory blocks, could you release them with the command "ipcrm -m <shmid>", where <shmid> is taken from the "ipcs -m" output, and then try to launch the AC again?
Comment 23 Igor Alelekov CLA 2007-02-27 02:51:55 EST
(In reply to comment #18)
> This appears to be the same as
> https://bugs.eclipse.org/bugs/show_bug.cgi?id=175264.

Paul, did this problem occur on an EM64T machine?

Comment 24 Paul Slauenwhite CLA 2007-02-27 06:14:59 EST
(In reply to comment #23)
> (In reply to comment #18)
> > This appears to be the same as
> > https://bugs.eclipse.org/bugs/show_bug.cgi?id=175264.
> Paul, Is this problems occurred on EM64T machine?

Yes, it is. However, we are running a 32-bit Eclipse/JRE.
Comment 25 Igor Alelekov CLA 2007-02-27 07:07:36 EST
(In reply to comment #24)
> (In reply to comment #23)
> > (In reply to comment #18)
> > > This appears to be the same as
> > > https://bugs.eclipse.org/bugs/show_bug.cgi?id=175264.
> > Paul, Is this problems occurred on EM64T machine?
> Yes, it is. However, we are running a 32-bit Eclipse/JRE.

Paul, several bugs are open for AC defects on EM64T: 125105, 175264, 175313. This bug is dedicated to the IA32 platform.
Please test the AC and backward compatibility on the IA32 platform with security disabled (this bug) and with security enabled (bug #161289).
Comment 26 Igor Alelekov CLA 2007-03-13 06:54:35 EDT
resolving as fixed
Comment 27 Paul Slauenwhite CLA 2007-06-27 14:58:21 EDT
Verified in TPTP-4.4.0-200706140100C.