Re: [ptp-dev] Remote tools now escapes all non-alphanumeric characters which causes problems for JAXB resource manager commands

Greg,
Some months ago I approached the mailing list with the problem below, in relation to escaping characters in the JAXB RMs.  The conclusion was to leave this issue for the Juno release.  With planning for Juno now underway, I wanted to make sure this issue was put back on the radar and addressed in Juno.  I'm happy to work with other developers on the best solution; some options are included in the email chain below.

Regards,
Simon Wail, Ph.D
HPC Specialist
IBM Research Collaboratory for Life Sciences - Melbourne


phone:
+61 3 9035-4341  fax: +61 3 8344-9130
address:
VLSCI, Gnd Floor, 187 Grattan St
Carlton   VIC   3010   Australia
email:
simon.wail@xxxxxxxxxxx







From:        Simon Wail/Australia/IBM@IBMAU
To:        Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
Date:        07/02/2012 04:56 PM
Subject:        Re: [ptp-dev] Remote tools now escapes all non-alphanumeric characters which causes problems for JAXB resource manager commands
Sent by:        ptp-dev-bounces@xxxxxxxxxxx




Greg,
I think what you've proposed would work if implemented with the "env" command on the remote system.  "env" is part of the POSIX.2 standard, so it should be available on any Unix-based remote system (I'm not sure about Windows).


So with your definition in the JAXB file, if "replaceEnvironment" is set to true, then "env -" (deprecated?) or "env -i" would be used as the command on the remote system, otherwise just "env" would be used.  The <environment> tags would then just be used to set additional environment variables as part of the "env" command.  I don't think you'd need to retrieve the values of the remote environment variables as this would be unnecessary overhead.  You could just add PATH=$PATH to the env command to achieve the same, but this would then need to bypass the escaping of non-alphanumeric characters.  This should still be reliable as it is the system doing the bypassing and not a user specified option - no need for the "resolve" attribute.  If "replaceEnvironment" and the <environment> tags are not used, then the command would revert back to what it is at the moment and the "env" command would not be used on the remote system.
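As a rough illustration of the behaviour proposed above, here is a minimal Python sketch (PTP itself is Java; the function name `build_remote_command` and its parameters are hypothetical and not part of PTP). It prefixes the command with "env -i" when the environment is to be replaced, with plain "env" when only extra variables are set, and leaves the command untouched when neither option is used.

```python
from typing import Dict, List, Optional


def build_remote_command(
    command: List[str],
    environment: Optional[Dict[str, str]] = None,
    replace_environment: bool = False,
) -> List[str]:
    """Return the argument list to run on the remote system.

    Hypothetical helper: it mirrors the proposal in this email, not PTP code.
    """
    environment = environment or {}
    if not replace_environment and not environment:
        # Neither option is used: keep the command exactly as it is today.
        return list(command)
    prefix = ["env", "-i"] if replace_environment else ["env"]
    # Each <environment> entry becomes a NAME=value argument of "env".
    assignments = [f"{name}={value}" for name, value in environment.items()]
    return prefix + assignments + list(command)


argv = build_remote_command(
    ["sbatch", "job.sh"],
    {"PATH": "$PATH", "MMCS_SERVER_IP": "$MMCS_SERVER_IP"},
    replace_environment=True,
)
print(" ".join(argv))  # env -i PATH=$PATH MMCS_SERVER_IP=$MMCS_SERVER_IP sbatch job.sh
```

Note that the "$PATH" value only helps if it reaches the remote shell unescaped, which is exactly the part that would need to bypass the escaping.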


I hope all this makes sense and agree that it's too much work for the SR2 release.  I'd like to see this targeted for Juno though.


Regards,
Simon Wail, Ph.D

From:        Greg Watson <g.watson@xxxxxxxxxxxx>
To:        Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
Date:        07/02/2012 03:54 AM
Subject:        Re: [ptp-dev] Remote tools now escapes all non-alphanumeric characters which causes problems for JAXB resource manager commands
Sent by:        ptp-dev-bounces@xxxxxxxxxxx




Simon,

Would the resolve attribute even work? Assuming replaceEnvironment worked, if the environment is cleared then there will no longer be a PATH or MMCS_SERVER_IP to resolve. It seems like you need to be able to clear the environment except for some selected vars, which would be difficult to implement. I think you really need a combination of replaceEnvironment and being able to look up environment vars from the remote target prior to the environment being cleared. Then you could do something like:

<submit-batch …. replaceEnvironment="true">
...
<environment>PATH=${remote_get_env:PATH}</environment>
<environment>MMCS_SERVER_IP=${remote_get_env:MMCS_SERVER_IP}</environment>
</submit-batch>

To get this to work we'd need to 1) check the implementation of replaceEnvironment to ensure that it is working as documented, and 2) implement a variable resolver called "remote_get_env" (or similar) that would allow lookup of remote environment variables.
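A resolver of this kind could look roughly like the following Python sketch. Only the name "remote_get_env" comes from this email; the function `resolve_remote_env`, the idea of passing in a snapshot of the remote environment taken before it is cleared, and the sample values are all illustrative assumptions, not PTP code.

```python
import re
from typing import Mapping

# Matches ${remote_get_env:NAME}. The resolver name comes from the email;
# the pattern and everything else here is an illustrative assumption.
_REMOTE_ENV_REF = re.compile(r"\$\{remote_get_env:([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_remote_env(text: str, remote_env: Mapping[str, str]) -> str:
    """Replace each ${remote_get_env:NAME} with the value captured remotely.

    `remote_env` stands for a snapshot of the remote environment taken
    before it is cleared; unknown names resolve to the empty string.
    """
    return _REMOTE_ENV_REF.sub(lambda m: remote_env.get(m.group(1), ""), text)


# Made-up snapshot values, for illustration only.
snapshot = {"PATH": "/usr/bin:/bin", "MMCS_SERVER_IP": "10.0.0.1"}
print(resolve_remote_env("PATH=${remote_get_env:PATH}", snapshot))  # PATH=/usr/bin:/bin
```

The resolved values would still have to be escaped before being passed to the remote shell, since they may contain special characters.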

We could target this for Juno as I don't think we have time to implement and test this for SR2.

How does this sound?

Greg

On Feb 5, 2012, at 9:04 PM, Simon Wail wrote:

Roland,

I don't think the "replaceEnvironment" option will work as I need it to.  It doesn't seem to work as described in the documentation for remote shell commands.  The documentation says:


"If the environment set on the command should entirely replace the shell environment (the default behavior is to append the command environment), set replaceEnvironment to true."

This seems to describe what I want, but doesn't work.  I'm not even sure whether it works for local commands (I haven't tested that).  Also, completely replacing the remote shell environment would mean the remote PATH variable would have to be defined in the launch configuration, and the necessary system path values might not be known to the user.  MMCS_SERVER_IP would definitely not be a user-set value either.  Therefore this is not really a viable option, unless there was some way to retain selected remote shell environment variables, like what Greg originally suggested.


In terms of reliability, I think using the "resolve" attribute for each command line argument should be OK.  At some point you have to trust that the JAXB author knows what they're doing.  It is not as if general users will be defining their own resource managers (at least I wouldn't expect so).  I think this solution is the most viable and provides the JAXB author with the flexibility they need for the nuances of different resource managers and systems.


Regardless of any consensus on this issue, I assume the necessary code changes will not make the current SR2 build schedule :-(


Regards,
Simon Wail, Ph.D

From:        Roland Schulz <roland@xxxxxxx>
To:        Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
Date:        01/02/2012 05:16 PM
Subject:        Re: [ptp-dev] Remote tools now escapes all non-alphanumeric characters which causes problems for JAXB resource manager commands
Sent by:        ptp-dev-bounces@xxxxxxxxxxx






On Tue, Jan 31, 2012 at 10:55 PM, Simon Wail <simon.wail@xxxxxxxxxxx> wrote:
Unfortunately the only way to fix this is to allow the unescaping of arguments to the remote commands.  The problem with Greg's solution, even if it did work with remote environment variables, is that I need to use the "env -" command to entirely clear the remote shell environment before invoking the "sbatch" command.

Why wouldn't the replaceEnvironment option be enough?

There are, however, several environment variables that I do need to pass to "sbatch" (PATH and MMCS_SERVER_IP), and once the "env -" command is executed, these will be empty as well.  The only way to do what I want is to set PATH and MMCS_SERVER_IP as part of the "env -" command, which can use the shell environment values before they are cleared.  The reason I need to clear the entire shell environment is that the BG/P has a 2K limit on the size of the environment passed to it, and often a standard shell environment can be bigger than this.
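To check whether a given environment would exceed such a limit, its size can be estimated. The Python sketch below assumes the limit counts the bytes of every NAME=value string plus a trailing NUL, which this email does not specify; the 2048 figure is the "2K" mentioned above, and the function names are hypothetical.

```python
import os
from typing import Mapping

LIMIT_BYTES = 2048  # "2K" from the email; the exact accounting on BG/P is assumed


def environment_size(env: Mapping[str, str]) -> int:
    """Bytes needed for all NAME=value strings, each with a trailing NUL."""
    return sum(len(os.fsencode(f"{name}={value}")) + 1 for name, value in env.items())


def fits_limit(env: Mapping[str, str], limit: int = LIMIT_BYTES) -> bool:
    """True if the estimated environment size does not exceed the limit."""
    return environment_size(env) <= limit


# Report the size of the current process environment and whether it fits.
print(environment_size(os.environ), fits_limit(os.environ))
```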


In the JAXB specification, there is already the "resolve" attribute to enable/disable PTP variable substitution of command/script arguments (<arg type>).  Could this same attribute be used to allow the unescaping of the arguments in the remote command?  Then by default all special characters would be escaped, and only those specified by the JAXB author (who should know what they're doing) would not be escaped.  This hopefully solves the reliability issues raised by Roland.

Yes. If it would be per argument (and not per whole command) it could be made reliable. But I would still think it would be better to do it correctly. Because as soon as the next person needs:

<arg>PATH=$PATH:${ptp_rm:some_path#value}</arg>
 
it wouldn't be reliable, because it wouldn't be guaranteed that the user provided string doesn't contain special characters.
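The concern can be illustrated with Python's `shlex` module standing in for the remote shell's word splitting (an approximation only; the sample path is invented). If the substituted value is pasted in unescaped, a space splits the argument in two, whereas quoting only the substituted part protects the value while leaving "$PATH" for the shell to expand.

```python
import shlex

# Invented value standing in for ${ptp_rm:some_path#value}.
user_path = "/opt/my tools"

# Escaping bypassed for the whole argument: the value is pasted in as-is.
unescaped = "PATH=$PATH:" + user_path
# Only the substituted value is quoted; "$PATH" is left for the shell to expand.
partly_quoted = "PATH=$PATH:" + shlex.quote(user_path)

print(shlex.split(unescaped))      # ['PATH=$PATH:/opt/my', 'tools']
print(shlex.split(partly_quoted))  # ['PATH=$PATH:/opt/my tools']
```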


Roland


In terms of delaying my contribution, this problem only affects the Blue Gene/P RM.  I can still contribute the LML code and Blue Gene/Q RM.  Hopefully I can do this soon after some further testing.


Regards,
Simon Wail, Ph.D

From:        Roland Schulz <roland@xxxxxxx>
To:        Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
Date:        01/02/2012 05:49 AM
Subject:        Re: [ptp-dev] Remote tools now escapes all non-alphanumeric characters which causes problems for JAXB resource manager commands
Sent by:        ptp-dev-bounces@xxxxxxxxxxx







On Tue, Jan 31, 2012 at 12:43 PM, Greg Watson <g.watson@xxxxxxxxxxxx> wrote:
I don't think Simon can use IRemoteProcessBuilder#environment() directly, since this is being specified in the JAXB RM XML file.

Ideally, he should be able to do something like:


      <submit-batch name="submit-batch" directory="${ptp_rm:directory#value}" waitForId="true" replaceEnvironment="true">
          <arg>sbatch</arg>
          <arg>${ptp_rm:managed_file_for_script#value}</arg>
          <environment name="PATH" value="${env_var:PATH}" />
          <environment name="MMCS_SERVER_IP" value="${env_var:MMCS_SERVER_IP}" />

However, there are a couple of problems with this. The first one is I found a bug in the JAXB environment handling code. Hopefully this is fixed now. The second is that even if ${env_var:xxx} worked (I haven't tried it), it would only look at the local environment. There would probably need to be a special variable handler, say "remote_env_var", to look up the remote environment.

I think this would be the "correct" way to do what Simon wants, but it would take some effort to implement. Other than this, I don't see any other way of fixing Simon's problem, apart from either providing some way to override the shell escaping, or reversing the change.

I'm strongly in favor of doing it the correct way, even if that causes a delay for Simon's contribution. Relying on unescaped shell characters is unreliable. It can never be ruled out that some string or argument contains special characters. Thus even with the suggestion of overriding the escaping, code which used this option would be unreliable. And I think it is much more important to be reliable than to add new features.

Roland


Greg


On Jan 31, 2012, at 1:46 AM, Roland Schulz wrote:

Simon,

We made that change on purpose to avoid unintended consequences if special shell characters are part of a directory or command argument.
E.g. the original approach wouldn't work if the <generated batch file> contained a dollar sign somewhere in its path.


Why don't you use org.eclipse.ptp.remote.core.IRemoteProcessBuilder.environment() to set the environment variables?
 

Roland


On Tue, Jan 31, 2012 at 1:10 AM, Simon Wail <simon.wail@xxxxxxxxxxx> wrote:
Dev Team,

I've been testing my SLURM - Blue Gene JAXB resource manager in preparation for code submission and hit a major problem.  My original development of the resource manager was done using PTP 5.0.1 and it was working fine.  Now I've been testing it under 5.0.4 and it doesn't work anymore.  I've investigated the problem and it seems to be the way remote commands are passed to ssh using the remotetools code.  The command I want to execute is "env - PATH=$PATH MMCS_SERVER_IP=$MMCS_SERVER_IP sbatch <generated batch file>" and this is defined in the JAXB resource manager file as:


      <submit-batch name="submit-batch" directory="${ptp_rm:directory#value}" waitForId="true">
          <arg>env</arg>
          <arg>-</arg>
          <arg>PATH=$PATH</arg>
          <arg>MMCS_SERVER_IP=$MMCS_SERVER_IP</arg>
          <arg>sbatch</arg>
          <arg>${ptp_rm:managed_file_for_script#value}</arg>
      ...


This all works fine under 5.0.1 and the remote shell command becomes:


tcsh -c /bin/sh -c 'echo "PID=$$ PIID=24" > /dev/pts/11; export "MESG2=World"; export "MESG=Hello World"; cd /vlsci/IBM/swail/; env - PATH=$PATH MMCS_SERVER_IP=$MMCS_SERVER_IP sbatch /vlsci/IBM/swail//3648df43-aae1-4226-ab56-2263458cb11cmanaged_file_for_script; '


Now when moving to 5.0.4 the remote shell command becomes:


tcsh -c /bin/sh -c 'echo "PID=$$ PIID=27" > /dev/pts/17; export "MESG2=World"; export "MESG=Hello World"; cd /vlsci/IBM/swail/&& env - PATH\=\$PATH MMCS_SERVER_IP\=\$MMCS_SERVER_IP sbatch /vlsci/IBM/swail//5252c4b2-de35-47e6-ade9-de855f30904dmanaged_file_for_script'


Notice the escape "\" before the equals and dollar signs for setting PATH and MMCS_SERVER_IP.  This is breaking the "env" command - the "env -" command allows you to execute a subsequent command with a cleared environment.  By using "env - PATH=$PATH sbatch ..." it allows me to clear the existing shell environment, set the PATH environment variable to the previous value (before the environment is cleared) and then execute the sbatch command.  This then provides a limited environment to the sbatch command which is required for the Blue Gene/P system.
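The effect of a cleared environment can be checked locally with a small Python sketch, assuming a POSIX "env" is on the PATH. It uses "-i", the equivalent spelling of "-", and passes the arguments directly without a shell, so no escaping is involved; the MMCS_SERVER_IP value is a placeholder.

```python
import os
import subprocess

# Run "env -i NAME=value ... env": the outer env clears the environment,
# sets only the listed variables, then runs the inner env to print the result.
result = subprocess.run(
    [
        "env", "-i",
        "PATH=" + os.environ.get("PATH", "/usr/bin:/bin"),
        "MMCS_SERVER_IP=10.0.0.1",  # placeholder value, not from the email
        "env",
    ],
    capture_output=True, text=True, check=True,
)

# Collect just the variable names that the inner env reported.
names = sorted(line.split("=", 1)[0] for line in result.stdout.splitlines())
print(names)  # ['MMCS_SERVER_IP', 'PATH']
```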


With the escapes now in the command, the PATH environment variable is not set correctly and therefore the sbatch command is not found.  This is causing the job submission to fail.


I've looked at the recent differences in the PTP code between 5.0.1 and 5.0.4 and found the following in org.eclipse.ptp.remote.remotetools.core.RemoteToolsProcessBuilder:


1) In the class constructor a set of "trusted" characters is created - this includes all alphanumeric characters plus / . _ -

2) In the "start" method the remote command is built and escapes any character NOT in the trusted set - see line 130


I believe this is how the "\"s are now appearing in the remote command and this is causing my job submission to fail.
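The escaping described in the two points above can be approximated with this minimal Python sketch. It is not the RemoteToolsProcessBuilder code; it assumes "alphanumeric" means ASCII letters and digits, and the function name is hypothetical.

```python
import string

# Trusted set as described in the email: alphanumerics plus / . _ -
# (ASCII letters and digits are an assumption).
TRUSTED = set(string.ascii_letters + string.digits + "/._-")


def escape_untrusted(arg: str) -> str:
    """Prefix every character outside the trusted set with a backslash."""
    return "".join(c if c in TRUSTED else "\\" + c for c in arg)


print(escape_untrusted("PATH=$PATH"))  # PATH\=\$PATH
```

Applied to "PATH=$PATH", this yields the same escaped form seen in the 5.0.4 remote shell command.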


I've also tried to set the "resolve=false" attribute for each of the "arg" lines in the JAXB file but this makes no difference.  I believe it might not be correct to escape all non-alphanumeric characters in the remote commands, but if this is felt to be necessary, then maybe the "resolve" attribute in the XML could override this to enable the remote command to be sent as desired.


Regards,
Simon Wail, Ph.D

--
ORNL/UT Center for Molecular Biophysics
cmb.ornl.gov
865-241-1537, ORNL PO BOX 2008 MS6309
_______________________________________________
ptp-dev mailing list

ptp-dev@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/ptp-dev



