Re: [ptp-user] Problem with PE proxy


Brett

In advanced mode, the text field is supposed to accept the pathname of a file that contains a list of PE environment variable settings, one per line, for instance:
MP_PROCS=2
MP_HOSTFILE=/tmp/hostfile
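
For illustration only, here is a minimal Python sketch of how a file in that format could be read into a set of environment variable settings. It is not the PTP proxy code; the function name `parse_pe_env_file` is hypothetical, and the only assumption about the format is one NAME=VALUE pair per line, as in the two lines above.

```python
def parse_pe_env_file(text):
    """Parse NAME=VALUE lines into a dict, skipping blank lines.

    Hypothetical helper; assumes one setting per line, as in the example.
    """
    settings = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue
        # Split on the first '=' only, so values may themselves contain '='.
        name, sep, value = line.partition("=")
        if not sep or not name.strip():
            raise ValueError(f"line {lineno}: expected NAME=VALUE, got {raw!r}")
        settings[name.strip()] = value.strip()
    return settings


example = "MP_PROCS=2\nMP_HOSTFILE=/tmp/hostfile\n"
settings = parse_pe_env_file(example)
print(settings)
```

The resulting dict could then be merged into the environment passed to poe.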

At the moment this field is broken: the browse button navigates to the file on the remote system, but the Eclipse code then tries to find it on the local system. Please file a Bugzilla bug for this.

In basic mode, the PE proxy is supposed to do what you want: the 'Nodes' tab of the resources pane has a field where you can enter the pathname of a LoadLeveler command file. If you use it, make sure that all other PE option settings related to node selection or resource allocation are cleared, so you don't end up with conflicting PE environment variable settings. Alternatively, leave the command file field blank and use the other node allocation parameters to specify your setup. The idea is that you set up the run configuration as if you were invoking poe interactively after setting the appropriate MP_* environment variables.
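
As a rough way to check a configuration for this kind of conflict before launching, something like the Python sketch below could be used. The set `NODE_SELECTION_VARS` is an assumption chosen for illustration, not an authoritative list of what PE treats as conflicting, and `conflicting_settings` is a hypothetical helper, not part of PTP.

```python
# Assumption: an illustrative set of PE variables that select nodes or
# resources. Consult the PE documentation for the real rules.
NODE_SELECTION_VARS = {
    "MP_HOSTFILE",
    "MP_PROCS",
    "MP_NODES",
    "MP_TASKS_PER_NODE",
    "MP_RMPOOL",
}


def conflicting_settings(env):
    """Return node-selection variables set alongside MP_LLFILE.

    If MP_LLFILE is unset or empty there is nothing to conflict with,
    so an empty list is returned.
    """
    if not env.get("MP_LLFILE"):
        return []
    return sorted(name for name in NODE_SELECTION_VARS if env.get(name))


env = {"MP_LLFILE": "llfile", "MP_HOSTFILE": "/tmp/hostfile", "MP_PROCS": "2"}
print(conflicting_settings(env))
```

An empty result means the command file is the only source of node allocation settings in that environment.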

Unfortunately, my test system has a problem with its LoadLeveler setup, so I can't run any LoadLeveler/PE jobs. If what I suggest above doesn't help, I will need to get someone from our LoadLeveler team to help me sort this out.

Dave


From: Brett Bode <bbode@xxxxxxxxxxxxx>
To: PTP User list <ptp-user@xxxxxxxxxxx>
Date: 08/27/2010 09:56 AM
Subject: [ptp-user] Problem with PE proxy
Sent by: ptp-user-bounces@xxxxxxxxxxx





Hello,
   I am attempting to use the PE proxy to run tasks remotely on a LOP system running PE and LL. This system is set up to discourage interactive jobs and thus appears to disallow poe invocations that specify a hostlist. I can run poe interactively from the command line, using the -llfile keyword to specify an LL script file with a few LL commands, as follows:
#@ job_type = parallel
#@ node_usage = not_shared
#@ environment = COPY_ALL
#@ tasks_per_node = 2
#@ node = 1
#@ wall_clock_limit = 0:15:00
#@ queue

The problem is I can't seem to figure out how to make this work via the PE proxy. I have tried various setups using the basic mode as well as using advanced mode. When I use advanced mode I have a simple file (on the remote system) containing a single PE keyword:
MP_LLFILE=llfile

This fails as well. Here is a debug trace from the Eclipse application that seems to indicate that it can't find the script file. However, the path specified is correct, so I am not sure why it can't locate the file. Alternatively, is there an "advanced" mode that simply uses an LL script file? By the way, I need to run interactively for other reasons, so I don't think the LL proxy will work for my situation.

PE Environment setup file /home/bbode/WorkSpace/Test-Aug/script not found.
SEND:[0000013d 0005:00000002:00000008 00000009:queueId=2 00000027:execPath=/home/bbode/WorkSpace/Test-Aug 00000014:debugStopInMain=true 00000024:env=LD_LIBRARY_PATH=/opt/ibmcmp/lib/ 0000001b:jobSubId=JOB_12829167182584 00000012:launchedByPTP=true 00000018:execName=mpi-comm-test.x 00000029:workingDir=/home/bbode/WorkSpace/Test-Aug] -> Worker-27
RECEIVE:[000000aa ->  00dc:00000001:00000007 00000001:2 00000001:1 00000001:4 00000001:3 0000001b:jobSubId=JOB_12829167182584 0000001d:name=bbode.JOB_12829167182584 00000011:jobState=STARTING] -> Proxy Client Event Thread
RECEIVE:[00000017 ->  0000:00000002:00000000] -> Proxy Client Event Thread
RECEIVE:[000000be ->  00df:00000001:00000009 00000001:4 00000001:1 00000001:0 00000001:5 00000008:name=poe 00000014:processState=RUNNING 0000000f:processNodeId=0 0000000e:processIndex=0 00000010:processPID=25673] -> Proxy Client Event Thread
RECEIVE:[000000b4 ->  00e9:00000001:00000005 00000001:4 00000001:1 00000001:0 00000001:1 00000067:processStderr=ERROR: 0031-121  Invalid combination of settings for MP_EUILIB, MP_HOSTFILE, and MP_RESD
] -> Proxy Client Event Thread
RECEIVE:[00000054 ->  00e6:00000001:00000004 00000001:1 00000001:4 00000001:1 00000012:jobState=COMPLETED] -> Proxy Client Event Thread
RECEIVE:[00000063 ->  00e9:00000001:00000005 00000001:4 00000001:1 00000001:0 00000001:1 00000016:processState=COMPLETED] -> Proxy Client Event Thread
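
To make the trace easier to read: judging only from the lines above, each field after the header appears to be an 8-digit hexadecimal length, a colon, and then that many characters of payload (for example, 00000009 precedes the 9-character string queueId=2). The Python sketch below decodes fields under that assumption. It is inferred from this trace rather than taken from the PTP proxy protocol definition, the name `decode_fields` is hypothetical, and it does not attempt to interpret the header values.

```python
def decode_fields(body):
    """Decode space-separated 'HHHHHHHH:payload' fields.

    Assumption (inferred from the trace): HHHHHHHH is the payload length
    in hexadecimal. Using the length rather than splitting on spaces
    matters because a payload such as processStderr may contain spaces.
    """
    fields = []
    pos = 0
    while pos < len(body):
        if body[pos] == " ":  # separator between fields
            pos += 1
            continue
        if len(body) < pos + 9 or body[pos + 8] != ":":
            raise ValueError(f"malformed field at offset {pos}")
        length = int(body[pos:pos + 8], 16)
        start = pos + 9
        payload = body[start:start + length]
        if len(payload) != length:
            raise ValueError(f"truncated field at offset {pos}")
        fields.append(payload)
        pos = start + length
    return fields


# Three fields copied from the SEND line above (header omitted).
body = (
    "00000009:queueId=2 "
    "00000024:env=LD_LIBRARY_PATH=/opt/ibmcmp/lib/ "
    "00000012:launchedByPTP=true"
)
fields = decode_fields(body)
# Split each payload on its first '=' to get a name and a value.
attrs = dict(f.split("=", 1) for f in fields)
print(attrs)
```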


Thanks,
Brett
_______________________________________________
ptp-user mailing list
ptp-user@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/ptp-user


