Re: [ptp-dev] IRemoteProcess.isCompleted occaisionally fails to report process completion

Dave,

You'll also need to add the line:

<module>../../releng/org.eclipse.remote.proxy.server.linux.ppc64le</module>

to the <modules> section in the same file.

If you like, I can set it up to build ppc64le and commit it to the repo. That way I can make sure it works before you update.

Regards,
Greg

On Feb 14, 2018, at 7:33 AM, David Wootton <dwootton@xxxxxxxxxx> wrote:

Greg
I pulled in the update you made and modified the org.eclipse.remote.build pom.xml file as you asked. The section of pom.xml now looks like
<configuration>
<resolver>p2</resolver>
<pomDependencies>consider</pomDependencies>
<environments>
<environment>
<os>linux</os>
<ws>gtk</ws>
<arch>x86</arch>
</environment>
<environment>
<os>linux</os>
<ws>gtk</ws>
<arch>x86_64</arch>
</environment>
<environment>
<os>win32</os>
<ws>win32</ws>
<arch>x86</arch>
</environment>
<environment>
<os>win32</os>
<ws>win32</ws>
<arch>x86_64</arch>
</environment>
<environment>
<os>macosx</os>
<ws>cocoa</ws>
<arch>x86_64</arch>
</environment>
<environment>
<os>linux</os>
<ws>gtk</ws>
<arch>ppc64le</arch>
</environment>
</environments>
<dependency-resolution>
<extraRequirements>
<requirement>
<type>eclipse-plugin</type>
<id>org.eclipse.ui.ide</id>
<versionRange>0.0.0</versionRange>
</requirement>
</extraRequirements>

I initially had the new ppc64le environment first and had the same problem, so moved it, thinking order might matter.

I get one or two steps farther along now and this is what I think is the relevant section of the log

[INFO] Building org.eclipse.remote.proxy.server.linux.x86_64 1.0.0-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ org.eclipse.remote.proxy.server.linux.x86_64 ---
[INFO]
[INFO] --- tycho-packaging-plugin:0.26.0:build-qualifier (default-build-qualifier) @ org.eclipse.remote.proxy.server.linux.x86_64 ---
[INFO] The project's OSGi version is 1.0.0.201802141224
[INFO]
[INFO] --- tycho-packaging-plugin:0.26.0:validate-id (default-validate-id) @ org.eclipse.remote.proxy.server.linux.x86_64 ---
[INFO]
[INFO] --- tycho-packaging-plugin:0.26.0:validate-version (default-validate-version) @ org.eclipse.remote.proxy.server.linux.x86_64 ---
[INFO]
[INFO] --- tycho-versions-plugin:0.26.0:update-pom (versions) @ org.eclipse.remote.proxy.server.linux.x86_64 ---
[INFO] Making changes in C:\Users\IBM_ADMIN\git\org.eclipse.remote\releng\org.eclipse.remote.proxy.server.linux.x86_64
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ org.eclipse.remote.proxy.server.linux.x86_64 ---
[INFO] Using 'ISO-8859-1' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory C:\Users\IBM_ADMIN\git\org.eclipse.remote\releng\org.eclipse.remote.proxy.server.linux.x86_64\src\main\resources
[INFO]
[INFO] --- tycho-compiler-plugin:0.26.0:compile (default-compile) @ org.eclipse.remote.proxy.server.linux.x86_64 ---
[INFO]
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ org.eclipse.remote.proxy.server.linux.x86_64 ---
[INFO] Using 'ISO-8859-1' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory C:\Users\IBM_ADMIN\git\org.eclipse.remote\releng\org.eclipse.remote.proxy.server.linux.x86_64\src\test\resources
[INFO]
[INFO] --- target-platform-configuration:0.26.0:target-platform (default-target-platform) @ org.eclipse.remote.proxy.server.linux.x86_64 ---
[INFO]
[INFO] --- tycho-packaging-plugin:0.26.0:package-plugin (default-package-plugin) @ org.eclipse.remote.proxy.server.linux.x86_64 ---
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Remote Parent ...................................... SUCCESS [ 0.654 s]
[INFO] org.eclipse.remote.target .......................... SUCCESS [ 0.585 s]
[INFO] org.eclipse.remote.core ............................ SUCCESS [ 5.257 s]
[INFO] org.eclipse.remote.jsch.core ....................... SUCCESS [ 1.651 s]
[INFO] org.eclipse.remote.ui .............................. SUCCESS [ 1.859 s]
[INFO] org.eclipse.remote.jsch.ui ......................... SUCCESS [ 1.003 s]
[INFO] org.eclipse.remote.proxy.protocol.core ............. SUCCESS [ 0.734 s]
[INFO] org.eclipse.remote.proxy.core ...................... SUCCESS [ 0.902 s]
[INFO] org.eclipse.remote.proxy.ui ........................ SUCCESS [ 1.140 s]
[INFO] org.eclipse.remote.proxy.server.core ............... SUCCESS [ 0.708 s]
[INFO] org.eclipse.remote.proxy.server.product ............ SUCCESS [ 49.166 s]
[INFO] org.eclipse.remote.proxy.server.linux.x86_64 ....... FAILURE [ 0.065 s]
[INFO] org.eclipse.remote.proxy.server.macosx.x86_64 ...... SKIPPED
[INFO] org.eclipse.remote.console ......................... SKIPPED
[INFO] org.eclipse.remote.serial.core ..................... SKIPPED
[INFO] org.eclipse.remote.serial.ui ....................... SKIPPED
[INFO] org.eclipse.remote.telnet.core ..................... SKIPPED
[INFO] org.eclipse.remote.telnet.ui ....................... SKIPPED
[INFO] org.eclipse.remote.doc.isv ......................... SKIPPED
[INFO] org.eclipse.remote ................................. SKIPPED
[INFO] org.eclipse.remote.proxy ........................... SKIPPED
[INFO] org.eclipse.remote.console ......................... SKIPPED
[INFO] org.eclipse.remote.serial .......................... SKIPPED
[INFO] org.eclipse.remote.telnet .......................... SKIPPED
[INFO] org.eclipse.remote.repo ............................ SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:35 min
[INFO] Finished at: 2018-02-14T07:25:19-05:00
[INFO] Final Memory: 69M/317M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.eclipse.tycho:tycho-packaging-plugin:0.26.0:package-plugin (default-package-plugin) on project org.eclipse.remote.proxy.server.linux.x86_64: C:\Users\IBM_ADMIN\git\org.eclipse.remote\releng\org.eclipse.remote.proxy.server.linux.x86_64\build.properties: bin.includes value(s) [proxy.server.tar.gz] do not match any files. -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.

Dave


From: Greg Watson <g.watson@xxxxxxxxxxxx>
To: Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
Date: 02/13/2018 08:04 PM
Subject: Re: [ptp-dev] Fw: IRemoteProcess.isCompleted occaisionally fails to report process completion
Sent by: ptp-dev-bounces@xxxxxxxxxxx





Hi Dave,

That first one was strange. I only saw the org.eclipse.core.runtime.jobs.ISchedulingRule issue when I opened the file in the editor. The build worked fine for me otherwise. I've added the missing dependency to the repo if you want to update.

The second problem is because you need to add the lines:

<environment>
<os>linux</os>
<ws>gtk</ws>
<arch>ppc64le</arch>
</environment>

somewhere in the <environments> section of the target-platform-configuration in the pom.xml in o.e.remote.build.

Hopefully that will resolve the errors, and hopefully all this wasn't a waste of time :-).

Greg


      On Feb 13, 2018, at 5:31 PM, David Wootton <dwootton@xxxxxxxxxx> wrote:

      Greg
      I took another look at this and realized the message was complaining about line 54 in LocalResource.java. I opened that file in Eclipse, and it was flagged with an error, where I don't think it was before since I was able to launch my Eclipse runtime instance without problems.


      On a hunch, I added org.eclipse.core.runtime.jobs to the Imported Packages list in the Dependencies tab for org.eclipse.remote.core/plugin.xml, cleaned my workspace, rebuilt and the error was gone. I re-ran the mvn command and it looks like it got past that point.


      However now I get another error


      [ERROR] Cannot resolve dependencies of product proxy.server.product:
      [ERROR] eclipse-plugin artifact with ID "org.eclipse.cdt.core.linux.ppc64le" and version matching "0.0.0" was not found in the target platform
      [INFO] ------------------------------------------------------------------------
      [INFO] Reactor Summary:
      [INFO]
      [INFO] Remote Parent ...................................... SUCCESS [ 0.623 s]
      [INFO] org.eclipse.remote.target .......................... SUCCESS [ 0.648 s]
      [INFO] org.eclipse.remote.core ............................ SUCCESS [ 4.838 s]
      [INFO] org.eclipse.remote.jsch.core ....................... SUCCESS [ 1.596 s]
      [INFO] org.eclipse.remote.ui .............................. SUCCESS [ 3.073 s]
      [INFO] org.eclipse.remote.jsch.ui ......................... SUCCESS [ 1.238 s]
      [INFO] org.eclipse.remote.proxy.protocol.core ............. SUCCESS [ 1.070 s]
      [INFO] org.eclipse.remote.proxy.core ...................... SUCCESS [ 0.873 s]
      [INFO] org.eclipse.remote.proxy.ui ........................ SUCCESS [ 0.632 s]
      [INFO] org.eclipse.remote.proxy.server.core ............... SUCCESS [ 0.787 s]
      [INFO] org.eclipse.remote.proxy.server.product ............ FAILURE [ 0.301 s]
      [INFO] org.eclipse.remote.proxy.server.linux.x86_64 ....... SKIPPED
      [INFO] org.eclipse.remote.proxy.server.macosx.x86_64 ...... SKIPPED


      It looks like I don't have org.eclipse.cdt.core.linux.ppc64le in my environment and I'm not sure how I get it.


      Dave

      ----- Forwarded by David Wootton/Poughkeepsie/Contr/IBM on 02/13/2018 05:18 PM -----


      From:
      David Wootton/Poughkeepsie/Contr/IBM
      To:
      Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
      Date:
      02/13/2018 04:23 PM
      Subject:
      Re: [ptp-dev] IRemoteProcess.isCompleted occaisionally fails to report process completion





      Greg
      I think I did what you asked. I edited the proxy.server.product file in a text editor since I wasn't sure what to do in the custom Eclipse editor for that file, duplicating the line for org.eclipse.cdt.core.linux_x86_64 and changing x86_64 to ppc64le.


      The mvn command failed as follows:
      [INFO] ------------------------------------------------------------------------
      [INFO] BUILD FAILURE
      [INFO] ------------------------------------------------------------------------
      [INFO] Total time: 02:58 min
      [INFO] Finished at: 2018-02-13T16:12:58-05:00
      [INFO] Final Memory: 57M/91M
      [INFO] ------------------------------------------------------------------------
      [ERROR] Failed to execute goal org.eclipse.tycho:tycho-compiler-plugin:0.26.0:compile (default-compile) on project org.eclipse.remote.core: Compilation failure: Compilation failure:
      [ERROR] C:\Users\IBM_ADMIN\git\org.eclipse.remote\bundles\org.eclipse.remote.core\src\org\eclipse\remote\internal\core\services\local\LocalResource.java:[54]
      [ERROR] fResource.refreshLocal(IResource.DEPTH_INFINITE, monitor);
      [ERROR] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      [ERROR] The type org.eclipse.core.runtime.jobs.ISchedulingRule cannot be resolved. It is indirectly referenced from required .class files
      [ERROR] 1 problem (1 error)
      [ERROR] -> [Help 1]
      [ERROR]
      [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
      [ERROR] Re-run Maven using the -X switch to enable full debug logging.
      [ERROR]
      [ERROR] For more information about the errors and possible solutions, please read the following articles:
      [ERROR] [Help 1]
      http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
      [ERROR]
      [ERROR] After correcting the problems, you can resume the build with the command
      [ERROR] mvn <goals> -rf :org.eclipse.remote.core


      I don't know what caused that since this looks like a standard Eclipse plugin. When my workspace completely rebuilds in Eclipse on my Windows system, I sometimes get build errors because Windows messes up file permissions and prevents plugins from building. If I rebuild the errors go away. I tried the same mvn command again and it ran much quicker, ending with the same error. I don't know how to force a complete rebuild or if that would help.


      Dave



      From:
      Greg Watson <g.watson@xxxxxxxxxxxx>
      To:
      Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
      Date:
      02/13/2018 03:10 PM
      Subject:
      Re: [ptp-dev] IRemoteProcess.isCompleted occaisionally fails to report process completion
      Sent by:
      ptp-dev-bounces@xxxxxxxxxxx




      Hi Dave,


      I didn't realize your target machine was ppc64le, so yes I can see why you'd have these issues. You'll need to do a couple of other things:


      1. Add the plugin org.eclipse.cdt.core.linux.ppc64le to proxy.server.product in org.eclipse.remote.proxy.server.product (you'll need this for the CDT spawner)
      2. Add the lines:


      <copy file="${project.build.directory}/products/proxy.server-linux.gtk.ppc64le.tar.gz"
      tofile="${basedir}/../org.eclipse.remote.proxy.server.linux.ppc64le/proxy.server.tar.gz"/>


      to the pom.xml in org.eclipse.remote.proxy.server.product.


      You'll need to manually build the server by changing into the org.eclipse.remote.build directory and running "mvn clean package". Make sure that a proxy.server.tar.gz file is copied into the o.e.remote.proxy.server.linux.ppc64le directory. When the proxy is deployed, it should now have a plugin that corresponds to your architecture.
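The "make sure the file is copied" step can be checked with a small script. This is only a sketch: `verify_proxy_archive` is a hypothetical helper name, and the directory passed to it is whatever path the ppc64le plugin has in your checkout.

```shell
#!/bin/sh
# Hypothetical helper, not part of org.eclipse.remote.
# Usage, after running "mvn clean package" in the org.eclipse.remote.build directory:
#   verify_proxy_archive /path/to/org.eclipse.remote.proxy.server.linux.ppc64le

verify_proxy_archive() {
    archive="$1/proxy.server.tar.gz"
    # -s is true only if the file exists and is not empty.
    if [ -s "$archive" ]; then
        echo "found $archive"
    else
        echo "missing or empty: $archive" >&2
        return 1
    fi
}
```

If the archive is missing, packaging that plugin is likely to fail, because its build.properties lists proxy.server.tar.gz in bin.includes.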


      Regards,
      Greg

              On Feb 13, 2018, at 2:50 PM, David Wootton <dwootton@xxxxxxxxxx> wrote:

              Greg

              I found out that I had somehow imported code from two different repos for org.eclipse.remote. I deleted all the org.eclipse.remote.* plugins and re-imported them from the master branch and resolved those problems.

              I did discover several new problems:
              1) The machine I am running on has OpenJDK version 1.8.0_102 installed, and java -version reports

              -bash-4.2$ java -version
              openjdk version "1.8.0_102"
              OpenJDK Runtime Environment (build 1.8.0_102-b14)
              OpenJDK 64-Bit Server VM (build 25.102-b14, mixed mode)


              The result is that bootstrap.sh fails with message
              fail:invalid java version $major.$minor; must be >= 1.8

              I modified bootstrap.sh as follows

              do_check() {
              java_vers=`java -version 2>&1`
              major=`expr "$java_vers" : "openjdk version \"\([0-9]*\)\.[0-9]*.*\""`
              minor=`expr "$java_vers" : "openjdk version \"[0-9]*\.\([0-9]*\).*\""`
              if test "$major" -ge 2 -o "$minor" -ge 8; then

              as a quick hack to get around that problem.
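A slightly more general form of that check might look like the following. This is only a sketch, not the code in the real bootstrap.sh: `parse_java_version` and `java_version_ok` are hypothetical names, and it assumes the first line of the `java -version` output has the form `<name> version "<number>"`, as in the OpenJDK output shown above.

```shell
#!/bin/sh
# Hypothetical helpers, not taken from the real bootstrap.sh.

# Print "major minor" parsed from the text that `java -version 2>&1` produced.
parse_java_version() {
    # Only the first line carries the quoted version string.
    first_line=$(printf '%s\n' "$1" | head -n 1)
    version=$(printf '%s\n' "$first_line" | sed -n 's/^[a-z]* version "\([^"]*\)".*$/\1/p')
    major=$(printf '%s\n' "$version" | sed 's/^\([0-9]*\).*$/\1/')
    minor=$(printf '%s\n' "$version" | sed -n 's/^[0-9]*\.\([0-9]*\).*$/\1/p')
    # Fall back to 0 when a component could not be parsed.
    printf '%s %s\n' "${major:-0}" "${minor:-0}"
}

# Succeed when the version is 1.8 or later (this includes 9, 10, 11, ...).
java_version_ok() {
    set -- $(parse_java_version "$1")
    test "$1" -ge 2 || { test "$1" -eq 1 && test "$2" -ge 8; }
}
```

It would be called as `java_version_ok "$(java -version 2>&1)"`. Unlike the quick hack, it does not depend on the output starting with the word "openjdk".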

              2) Then I got a message complaining that plugin org.eclipse.remote.proxy.server.linux.ppc64le could not be found, since my remote node is Power 64 bit little endian. I cloned org.eclipse.remote.proxy.server.linux.x86_64 and changed all the x86_64 strings in all the files in that plugin to ppc64le.

              3) Then I got messages

              !ENTRY org.eclipse.ptp.launch 4 4 2018-02-13 13:56:40.085
              !MESSAGE Unable to start server: null


              The only place I can find that message is line 271 of ProxyConnectionBootstrap.java.

              That's apparently due to the bootstrap.sh sending back the string
              ok:not_found/linux/ppc64le

              Then there's some state machine logic near line 139 of ProxyConnectionBootstrap transitioning to either States.START or States.DOWNLOAD depending on the value of the first token. Since it's not_found, the state goes to States.DOWNLOAD and that causes a failure.

              not_found is set at line 55 of bootstrap.sh, and then the code following it fails to find org.eclipse.remote.proxy.server.core_$1.jar, where I guess $1 is ppc64le.
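The decision described above can be sketched in shell to make the token handling concrete. This is an illustration only: the real logic lives in ProxyConnectionBootstrap.java, `next_state` is a hypothetical name, and the assumption that any first token other than not_found leads to START is a guess based on the description in this message.

```shell
#!/bin/sh
# Hypothetical illustration of the reply handling; not the actual implementation.
next_state() {
    reply=$1
    case $reply in
        ok:*) ;;                        # e.g. ok:not_found/linux/ppc64le
        *) echo "ERROR"; return 1 ;;    # e.g. fail:invalid java version ...
    esac
    payload=${reply#ok:}                # strip the leading "ok:"
    first=${payload%%/*}                # keep the text before the first "/"
    if [ "$first" = "not_found" ]; then
        echo "DOWNLOAD"
    else
        echo "START"
    fi
}
```

With the reply seen here, `next_state 'ok:not_found/linux/ppc64le'` prints DOWNLOAD, which matches the path that fails.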

              I'm not sure where this jar file comes from. It's not anywhere on my remote system, including .eclipsesettings and I can't find where it comes from in the workspace. I'm guessing I could clone and rename the x86_64 jar but I can't find any files with x86_64 in their name anywhere in org.eclipse.remote.* either

              Dave


              From:
              Greg Watson <g.watson@xxxxxxxxxxxx>
              To:
              Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
              Date:
              02/13/2018 09:59 AM
              Subject:
              Re: [ptp-dev] IRemoteProcess.isCompleted occaisionally fails to report process completion
              Sent by:
              ptp-dev-bounces@xxxxxxxxxxx






              Hi Dave,

              Have you pulled all the latest changes into your org.eclipse.remote repository?

              Regards,
              Greg
                              On Feb 12, 2018, at 5:35 PM, David Wootton <dwootton@xxxxxxxxxx> wrote:

                              Greg
                              I switched PTP to the ptp_9_1 branch and imported what I think are the proxy plugins from
                              https://drwootton@xxxxxxxxxxxxxxx/r/p/ptp/org.eclipse.remote.git
                              I get the following errors when building:
                              <1D549623.gif>
                              I tried clicking 'Quick fix' to see if it could find the required imports, with no luck. They are not in org.eclipse.remote.core. I tried replacing org.eclipse.remote.core in my workspace with what's in the remote master branch, thinking it didn't get refreshed when I imported the git projects. I tried both the remote origin/master branch and the remote origin/proxy branch, with no luck. Do I need to delete all the org.eclipse.remote plugins and re-import them, and if so, from which git branch?

                              This is what I currently have in org.eclipse.remote.core at the HEAD level.
                              <1D105929.gif>

                              Dave



                              From:
                              Greg Watson <g.watson@xxxxxxxxxxxx>
                              To:
                              Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
                              Date:
                              02/12/2018 04:36 PM
                              Subject:
                              Re: [ptp-dev] IRemoteProcess.isCompleted occaisionally fails to report process completion
                              Sent by:
                              ptp-dev-bounces@xxxxxxxxxxx







                              Hi Dave,

                              You'll need to be on the ptp_9_1 branch for PTP and master branch for o.e.remote. You probably don't have the o.e.remote.proxy.* projects in your workspace, so you'll need to import these from your git repository.

                              Let me know if you need help.

                              Regards,
                              Greg
                                                              On Feb 12, 2018, at 4:17 PM, David Wootton <dwootton@xxxxxxxxxx> wrote:

                                                              Greg
                                                              The only options I have in that dropdown are SSH, Telnet and Serial Port. I'm not sure what I am missing. I issued a pull for both org.eclipse.ptp on the PTP master branch and the org.eclipse.remote PTP-REMOTE branch, so I think I have up-to-date code. I also cleared the workspace for my runtime instance to make sure I didn't have something in my workspace causing problems.

                                                              I also used Eclipse File Search to look for the string PROXY in the org.eclipse.remote source and, even ignoring case, didn't find any string that looked like it could be used as a widget text element.

                                                              Dave


                                                              From:
                                                              Greg Watson <g.watson@xxxxxxxxxxxx>
                                                              To:
                                                              Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
                                                              Date:
                                                              02/12/2018 11:40 AM
                                                              Subject:
                                                              Re: [ptp-dev] IRemoteProcess.isCompleted occaisionally fails to report process completion
                                                              Sent by:
                                                              ptp-dev-bounces@xxxxxxxxxxx







                                                              Hi Dave,

                                                              What you're describing is a two-hop ssh connection. To use a proxy, you first need to click on "Remote Development" in the preferences and choose PROXY for the "Default connection type". Then open the "Remote Development" preference and click on "Remote Connections". Select PROXY from the "Remote Services" box. Now create a new connection using the "Add" button. You can either open the connection in this view or it will automatically open when you select it in a launch configuration. In either case, you need to select the new connection from the launch configuration. The first time the connection opens it will download a java agent which will handle the remote end of the connection.

                                                              Regards,
                                                              Greg
                                                                                                                              On Feb 12, 2018, at 8:08 AM, David Wootton <dwootton@xxxxxxxxxx> wrote:

                                                                                                                              Greg
                                                                                                                              I always got a response from bqueues when I ran it as an ssh command, either a queue list or a timeout message.

                                                                                                                              I tried to create a proxy connection, but I'm not sure I know how to do that. I specified a host and username as for a regular ssh connection, then in the advanced settings I clicked the 'remote' button under SSH proxy settings and picked an existing ssh connection. However, I'm thinking I'm still using the original ssh session in this case. I did get the same problem, with no notification of command completion, once with this connection though.

                                                                                                                              Dave


                                                                                                                              From:
                                                                                                                              Greg Watson <g.watson@xxxxxxxxxxxx>
                                                                                                                              To:
                                                                                                                              Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
                                                                                                                              Date:
                                                                                                                              02/09/2018 09:12 AM
                                                                                                                              Subject:
                                                                                                                              Re: [ptp-dev] IRemoteProcess.isCompleted occaisionally fails to report process completion
                                                                                                                              Sent by:
                                                                                                                              ptp-dev-bounces@xxxxxxxxxxx







                                                                                                                              Hi Dave,

                                                                                                                              I'm not sure what else to suggest. Have you tried submitting the script multiple times with a regular ssh command (i.e. ssh host your_script)? Does this work perfectly every time? You could also try using the PROXY connection type rather than SSH. This uses ssh only for the initial connection, but after that downloads a small agent on the remote machine to handle running the remote commands. If you see different behavior, at least that will tell you that it is the JSch implementation that is causing the problem. If not, then it must be somewhere higher in the stack. If you want to try this, you'll need to update from the latest Oxygen build [1] as I fixed a number of bugs recently.
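Running the script repeatedly over plain ssh can be automated with a small loop. This is a sketch under assumptions: `count_failures` is a hypothetical helper, and `host` and `your_script` are the same placeholders used above.

```shell
#!/bin/sh
# Hypothetical helper: run a command N times and print how many runs failed.
count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Discard the output; only the exit status matters here.
        "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}
```

For example, `count_failures 20 ssh host your_script` prints the number of runs that exited with a non-zero status. It only detects failures by exit status, so a run that hangs would stall the loop rather than be counted.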

                                                                                                                              We can get your changes into Oxygen.3 which is early March. When you submit them let me know and I'll run another build and promote that for Oxygen.3.

                                                                                                                              Regards,
                                                                                                                              Greg

                                                                                                                              [1]
                                                                                                                              http://download.eclipse.org/tools/ptp/builds/oxygen/milestones
On Feb 8, 2018, at 2:59 PM, David Wootton <dwootton@xxxxxxxxxx> wrote:

Greg
Any other ideas about what's not working when the bqueues command hangs? I thought there might be something going on if the bqueues command was killed by 'kill -9', but when I tried that I still got notification back to my Eclipse code most times. It did fail once where there was no notification whatsoever.

I'm not surprised, since even in the 'kill -9' case, whatever invoked bqueues should get a return code back, in that case indicating bqueues was terminated.

At this point, I'm thinking I should commit the changes I have, since they do improve the LSF target system configuration behavior. Previously, the bqueues command would consistently block because the command and the stdout/stderr stream readers were all running on the same thread; now it blocks only when the bqueues command fails to return status.

Also, any possibility of getting a rebuild of PTP once my changes are merged in? We need this to ship with our plugins since we depend on LSF target system configurations.

Thanks

Dave

----- Forwarded by David Wootton/Poughkeepsie/Contr/IBM on 02/08/2018 02:50 PM -----


From:
David Wootton/Poughkeepsie/Contr/IBM
To:
Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
Date:
02/05/2018 09:49 AM
Subject:
Re: [ptp-dev] IRemoteProcess.isCompleted occaisionally fails to report process completion






Greg
I added an echo to my hack wrapper script, following the bqueues command, that writes the bqueues return code to a file in the local filesystem. In the case where everything hung, the echo reported that the bqueues return code was zero, so the commands were definitely running.

Usually the hang lasted for something like 30 seconds. I logged into a second console session on the node where the bqueues command was running and repeatedly issued 'ps -u dwootton' commands. I saw the bqueues command and my wrapper until they eventually terminated, with no notification back to my Eclipse session.

Dave


From:
Greg Watson <g.watson@xxxxxxxxxxxx>
To:
Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
Date:
02/02/2018 03:42 PM
Subject:
Re: [ptp-dev] IRemoteProcess.isCompleted occaisionally fails to report process completion
Sent by:
ptp-dev-bounces@xxxxxxxxxxx




Dave,

It's quite an involved path from the thread you have reading from the input stream to the stdout of the command on the remote machine. It's possible that the command could complete on the remote machine before the thread even starts running, though I would have thought that isCompleted() would be false if that happened. Can you add something at the end of the script to check that it ran successfully (e.g., 'echo "bqueue finished with status $?" > /tmp/script.out')?

There's not really anything that can "lose track", so I want to establish that the command is actually being run each time.
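A wrapper along these lines would record both the start and the exit status of each run. It is a minimal sketch: `run_and_log` is a hypothetical name, and the log path is whatever file you choose on the remote machine.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command, log its exit status, and pass that status on.
run_and_log() {
    log_file=$1
    shift
    echo "starting: $*" >> "$log_file"
    "$@"
    status=$?
    echo "finished with status $status: $*" >> "$log_file"
    # Propagate the original exit status to the caller.
    return "$status"
}
```

Calling `run_and_log /tmp/script.out bqueues` appends two lines per invocation, so a missing "finished" line would show that the command never returned.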

Regards,
Greg
      On Jan 30, 2018, at 7:36 AM, David Wootton <dwootton@xxxxxxxxxx> wrote:

      Greg
      I added a sleep just before the exit in the script and that makes no difference. I didn't expect any difference since this execution path should be all non-asynchrouous code. I expect sshd is issuing a fork, exec, and wait to invoke the hack script and then bash does the same when invoking the bqueues command.

      The only inconsistent behavior I'm seeing is that sometimes the bqueues command itself times out because LSF daemons apparently aren't responding. But that's all internal to the bqueues command and I do get completion status reported all the way back to my Eclipse code where the return status says the bqueues command exited with rc=255.

      I realize the bqueues command could be exiting with some odd return code, so I added an echo statement to my hack script to write the return code to a file on the remote system, and the return code was always zero.

      Dave

      From: Greg Watson <g.watson@xxxxxxxxxxxx>
      To: Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
      Date: 01/29/2018 03:53 PM
      Subject: Re: [ptp-dev] IRemoteProcess.isCompleted occaisionally fails to report process completion
      Sent by: ptp-dev-bounces@xxxxxxxxxxx

      Maybe it's a timing issue. What happens if you add 'sleep 5' to the end of the script?

      Greg
              On Jan 29, 2018, at 2:57 PM, David Wootton <dwootton@xxxxxxxxxx> wrote:

              Greg
              I have a console session open on the login node where the bqueues command runs. Once I click the List button in the run configuration dialog, I periodically issue a 'ps -u dwootton' command to see what processes are running for me on the login node. I see the bqueues command and my hack script running for a while, which is what I expect. But then I issue the ps command again and see that both the bqueues command and the hack script have terminated, but no output from stdout or stderr is displayed in my Eclipse console view. When the bqueues command works correctly, I see stdout or stderr, sometimes both, getting text back from the remote command. That's why I'm thinking something is losing track of the command invocation, since I should see at least the messages from my hack script, which are issued unconditionally before and after the bqueues command runs.

              Dave

              From: Greg Watson <g.watson@xxxxxxxxxxxx>
              To: Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
              Date: 01/29/2018 12:17 PM
              Subject: Re: [ptp-dev] IRemoteProcess.isCompleted occaisionally fails to report process completion
              Sent by: ptp-dev-bounces@xxxxxxxxxxx

              Dave,

              What do you mean "when the bqueues command disappears"?

              Greg
                              On Jan 29, 2018, at 9:30 AM, David Wootton <dwootton@xxxxxxxxxx> wrote:

                              Greg
                              That doesn't work. The result is that the name of the remote command is the complete string 'bqueues -l; echo EOF:$?'

                              I thought I could make this work by running a wrapper script, for instance /home/dwootton/hack on the remote node, where the script is
                              #!/bin/sh
                              echo "Execute: " $*
                              $*
                              echo "EOF:$?"

                              And then changing the invocation command in my Eclipse code to 'private static final String bqueuesCommand[] = {"/home/dwootton/hack", "bqueues", "-l"};'

                              The idea is that the hack script just executes exactly what it is passed.

                              This works correctly most of the time. However, when the bqueues command disappears, I still get absolutely no output to stdout, not even the text from my hack script.

                              It looks like something is just completely losing track of the remote command request in this case.
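On the Eclipse side, the "EOF:$?" line written by the wrapper script could be used to detect completion without relying on isCompleted(). The following is a minimal sketch under stated assumptions: SentinelScanner and scan are hypothetical names, the marker format is the "EOF:<status>" line from the script above, and the Reader would in practice wrap the remote process's stdout stream.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.OptionalInt;

public class SentinelScanner {
    // Marker written by the wrapper script's final echo (assumed format "EOF:<status>").
    private static final String MARKER = "EOF:";

    /**
     * Copies lines from 'source' into 'output' until the marker line is seen.
     * Returns the status parsed from the marker, or empty if the stream ended
     * without one (the command never reported completion).
     */
    public static OptionalInt scan(Reader source, StringBuilder output) throws IOException {
        BufferedReader reader = new BufferedReader(source);
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.startsWith(MARKER)) {
                try {
                    String status = line.substring(MARKER.length()).trim();
                    return OptionalInt.of(Integer.parseInt(status));
                } catch (NumberFormatException e) {
                    // Not a status line after all; fall through and keep it as output.
                }
            }
            output.append(line).append('\n');
        }
        return OptionalInt.empty();
    }
}
```

An empty result distinguishes "the stream closed without the marker" from "the command finished with a status", but it cannot help if no output arrives at all, which is the failure described above.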

                              Dave



                              From: Greg Watson <g.watson@xxxxxxxxxxxx>
                              To: Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
                              Date: 01/26/2018 11:39 AM
                              Subject: Re: [ptp-dev] IRemoteProcess.isCompleted occaisionally fails to report process completion
                              Sent by: ptp-dev-bounces@xxxxxxxxxxx

                              What happens if you try a single quoted argument, e.g., 'bqueues -l; echo EOF:$?'

                              Greg
                                                              On Jan 25, 2018, at 5:05 PM, David Wootton <dwootton@xxxxxxxxxx> wrote:

                                                              Greg
                                                              I tried adding an echo command to the bqueues command and I am not having any success. My original bqueues command that I was passing to the IRemoteProcessBuilder was a String array {"bqueues", "-l"}.

                                                              I changed that to {"bqueues", "-l", ";", "echo", "\"EOF:$?\""} and that failed with an LSF error message that there was no such queue as ";", where the semicolon is being passed as a command parameter to the bqueues command instead of as a command separator for bash.

                                                              I tried changing ';' to "\\;" to escape the semicolon and it was still passed as a bqueues command parameter, this time '\;'.

                                                              I was able to get the pid of the bash process started to run the bqueues command one time when my original bqueues command was hanging, and it looks like the command being passed across is actually /bin/bash -l -c cd /autofs/home/dwootton && bqueues -l, where "cd /autofs/home/dwootton && bqueues -l" is probably a string parameter to the bash -c option (which tells bash to use the string as the command to run).

                                                              So I'm not sure how I can get this hack to work. I think I have a way to deal with the return status in my Java code, but I'm stuck at getting a working command to pass across to the remote system.

                                                              Dave

                                                              From: Greg Watson <g.watson@xxxxxxxxxxxx>
                                                              To: Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
                                                              Date: 01/24/2018 12:10 PM
                                                              Subject: Re: [ptp-dev] IRemoteProcess.isCompleted occaisionally fails to report process completion
                                                              Sent by: ptp-dev-bounces@xxxxxxxxxxx

                                                              Dave,

                                                              Is there anything still running on the remote end? e.g. is there a shell process? You could try killing it to see if that terminates the session.

                                                              Another thought. Do you know if the remote process is using a PTY or not?

                                                              You might ultimately need to do something hackish, like adding 'echo FOO' to the command and checking to see when FOO comes back.

                                                              Greg
                                                                                                                              On Jan 24, 2018, at 7:24 AM, David Wootton <dwootton@xxxxxxxxxx> wrote:

                                                                                                                              Greg
                                                                                                                              I suspended each thread in the Eclipse debugger once I had a hung run configuration dialog

                                                                                                                              Both my reader threads are waiting
                                                                                                                              <17443150.gif>
                                                                                                                              I expected these threads to have exited at this point, since the remote process was gone and the associated write-side file descriptors should have been closed, causing the pending read to end, at least on Linux. I'm running Eclipse on Windows, so maybe file descriptor behavior there is different.

                                                                                                                              The thread that looks like it might be a connection thread seems to be looping in PipedInputStream.awaitSpace, since I can single-step through it. There is a wait there, with a 1 second timeout.
                                                                                                                              <17931618.gif>
                                                                                                                              The Session class is com.jcraft.jsch.Session

                                                                                                                              I suspended a few other threads and did not see anything that looked like Jsch. I avoided classes that had labels/names that looked like internal Eclipse threads or other unrelated plugins.

                                                                                                                              Dave



                                                                                                                              From: Greg Watson <g.watson@xxxxxxxxxxxx>
                                                                                                                              To: Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
                                                                                                                              Date: 01/23/2018 10:57 PM
                                                                                                                              Subject: Re: [ptp-dev] IRemoteProcess.isCompleted occaisionally fails to report process completion
                                                                                                                              Sent by: ptp-dev-bounces@xxxxxxxxxxx

                                                                                                                              Hi Dave,

                                                                                                                              Off the top of my head I don't know, but Jsch is a nasty piece of work. Can you see if it's stuck in the Jsch code somewhere?

                                                                                                                              Regards,
                                                                                                                              Greg
On Jan 23, 2018, at 3:00 PM, David Wootton <dwootton@xxxxxxxxxx> wrote:

I'm fixing the hangs using the LSF target configuration and have them mostly fixed. One problem I'm running into is that occasionally the remote process (bqueues -w) exits but the IRemoteProcess.isCompleted() method still returns false. As a result, my code loops forever waiting for process completion and the run configuration dialog is locked. I can clear the locked state by clicking the red cancel button at the bottom of the dialog.

The loop I have to wait for process completion is

for (;;) {
    if (process.isCompleted()) {
        break;
    }
    if (monitor.isCanceled()) {
        process.destroy();
        return new Status(IStatus.CANCEL, Activator.PLUGIN_ID, CANCELED, Messages.CommandCancelMessage, null);
    }
    try {
        Thread.sleep(1000);
    } catch (InterruptedException e) {
        // Do nothing, sleep just ends early
    }
}

I see comments in the IRemoteProcess source warning that isCompleted() and waitFor() may not work correctly if the calling thread does not read the stderr or stdout streams and the JSch process implementation is used (which appears to be my case, since I see that the process builder is a JSchProcessBuilder). However, in my case I have reads pending on both the stderr and stdout streams for at least one byte, but I am issuing those reads on different threads from the one where the remote process was created. (I'm reading on separate threads so that my code does not block if the remote process writes so much data to either stream that the stream buffers fill and the process blocks until something reads from these streams to empty the buffer, and that fixes most of the hangs.)
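The arrangement described above (one reader thread per stream, plus a polling loop that honors cancellation) can be sketched as follows. This is a stand-in, not the PTP code: it uses a local java.lang.Process instead of IRemoteProcess, a BooleanSupplier instead of the progress monitor, and the names StreamDrainDemo, Result, drain and run are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

public class StreamDrainDemo {
    /** Captured result of one command run; exitCode is -1 if the run was canceled. */
    public static final class Result {
        public final int exitCode;
        public final String stdout;
        public final String stderr;

        Result(int exitCode, String stdout, String stderr) {
            this.exitCode = exitCode;
            this.stdout = stdout;
            this.stderr = stderr;
        }
    }

    // Starts a daemon thread that copies 'in' into 'sink' until end of stream.
    private static Thread drain(InputStream in, ByteArrayOutputStream sink) {
        Thread reader = new Thread(() -> {
            byte[] buffer = new byte[4096];
            try {
                int count;
                while ((count = in.read(buffer)) != -1) {
                    sink.write(buffer, 0, count);
                }
            } catch (IOException e) {
                // Stream was closed underneath us (for example after destroy()); stop reading.
            }
        });
        reader.setDaemon(true);
        reader.start();
        return reader;
    }

    public static Result run(String[] command, BooleanSupplier canceled)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).start();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ByteArrayOutputStream err = new ByteArrayOutputStream();
        Thread outReader = drain(process.getInputStream(), out);
        Thread errReader = drain(process.getErrorStream(), err);

        // Poll for completion; check for cancellation between polls.
        while (!process.waitFor(200, TimeUnit.MILLISECONDS)) {
            if (canceled.getAsBoolean()) {
                process.destroy();
                return new Result(-1, "", "");
            }
        }
        // Wait for both readers to reach end of stream so no trailing output is lost.
        outReader.join();
        errReader.join();
        return new Result(process.exitValue(),
                new String(out.toByteArray(), StandardCharsets.UTF_8),
                new String(err.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

For example, run(new String[] {"sh", "-c", "echo out; echo err 1>&2"}, () -> false) collects both streams without either pipe filling up. The join() calls assume each stream reaches end of file once the process exits. With the remote implementation, that is the assumption in doubt.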

I'm not sure what's going on here to cause the hang. I'm wondering if my InputStream objects need a synchronized attribute because they're being used on a different thread, but that also makes no sense since my InputStream variable is not visible to anything other than my code reading the stream.

Any thoughts or suggestions about what might be going on?

Thanks

Dave




_______________________________________________
ptp-dev mailing list

ptp-dev@xxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe from this list, visit

https://dev.eclipse.org/mailman/listinfo/ptp-dev



