Re: [ptp-user] Problem with PE proxy


Brett
I solved the problems that were preventing me from running interactive PE jobs using LoadLeveler. I can now run a simple two-task MPI application interactively as follows:
1) Identify the LoadLeveler job class defined for interactive jobs. On my setup it's inter_class.
2) Make sure the PE resource manager in PTP is set up to use LoadLeveler by checking the 'Use LoadLeveler' checkbox in the resource manager options dialog of the resource manager wizard.
3) Start the resource manager.
4) Create a run configuration and change the following fields in the tabbed widget in the resources pane of the run configuration:
Tasks Tab: Number of Tasks: number of application tasks. I specified 2.
Tasks Tab: Number of Nodes: number of nodes to use. I specified 1.
Tasks Tab: Tasks per Node: number of application tasks divided by number of nodes. I used 2.
Nodes Tab: Resource Pool: LoadLeveler interactive job class. I used inter_class.

This should work, and it is equivalent to setting the following poe environment variables and then running the application manually:
MP_RESD=yes
MP_PROCS=2
MP_RMPOOL=inter_class
MP_TASKS_PER_NODE=2
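
The mapping from the run configuration fields to the poe environment variables above can be sketched in Python as follows. This is only an illustration: the function names pe_environment and run_poe are made up for this example, and the plain "poe <executable>" invocation is an assumption about how the application would be started by hand.

```python
import os
import subprocess


def pe_environment(tasks: int, nodes: int, pool: str) -> dict:
    """Build the MP_* settings listed above from run configuration values.

    Hypothetical helper for illustration; it is not part of PTP or PE.
    """
    if nodes <= 0 or tasks % nodes != 0:
        raise ValueError("tasks must be a positive multiple of nodes")
    return {
        "MP_RESD": "yes",                          # use the resource manager (LoadLeveler)
        "MP_PROCS": str(tasks),                    # Number of Tasks
        "MP_RMPOOL": pool,                         # Resource Pool
        "MP_TASKS_PER_NODE": str(tasks // nodes),  # Tasks per Node
    }


def run_poe(executable: str, settings: dict) -> int:
    """Run the executable under poe with the given settings (assumed invocation)."""
    env = dict(os.environ)
    env.update(settings)
    return subprocess.run(["poe", executable], env=env).returncode


if __name__ == "__main__":
    # Values from the steps above: 2 tasks on 1 node in pool inter_class.
    for name, value in pe_environment(2, 1, "inter_class").items():
        print(f"{name}={value}")
```

Calling run_poe only makes sense on a system where poe and LoadLeveler are installed; the script above just prints the settings.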

Let me know if that works for you as a starting point.
   
Dave


Re: [ptp-user] Problem with PE proxy

Dave Wootton to: PTP User list
08/30/2010 03:52 PM

Sent by: ptp-user-bounces@xxxxxxxxxxx

Please respond to PTP User list

Brett


The text field for advanced mode is supposed to accept the pathname of a file that contains a list of PE environment variable settings, for instance:

MP_PROCS=2

MP_HOSTFILE=/tmp/hostfile


At the moment this is broken: the browse button navigates to the file on the remote system, but the Eclipse code tries to find it on the local system. Please file a Bugzilla bug for this.
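
For reference, a file in the format described above (one NAME=value setting per line) could be read roughly like this. It is a minimal sketch, not the code PTP uses; the function name parse_pe_env_file is hypothetical, and skipping blank lines is an assumption.

```python
def parse_pe_env_file(text: str) -> dict:
    """Parse NAME=value lines into a dict, skipping blank lines.

    Hypothetical helper for illustration; the real PTP parsing may differ.
    """
    settings = {}
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        name, sep, value = line.partition("=")
        if not sep or not name:
            raise ValueError(f"line {number}: expected NAME=value, got {raw!r}")
        settings[name] = value
    return settings


if __name__ == "__main__":
    example = "MP_PROCS=2\n\nMP_HOSTFILE=/tmp/hostfile\n"
    print(parse_pe_env_file(example))
```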


If you use basic mode, the PE proxy is supposed to do what you want: the 'Nodes' tab of the resources pane has a field where you can enter the pathname of a LoadLeveler command file. If you use this field, make sure that all other PE option settings related to node selection or resource allocation are cleared so that you don't end up with conflicting PE environment variable settings. Alternatively, leave the command file field blank and use the other node allocation parameters to specify your setup. The idea is that you set up the run configuration as if you were invoking poe interactively after setting the correct MP_* environment variables.


Unfortunately, my test system has a problem with its LoadLeveler setup that prevents me from running any LoadLeveler/PE jobs. If what I suggest above doesn't help, then I will need to get someone from our LoadLeveler team to help me sort this out.


Dave


From: Brett Bode <bbode@xxxxxxxxxxxxx>
To: PTP User list <ptp-user@xxxxxxxxxxx>
Date: 08/27/2010 09:56 AM
Subject: [ptp-user] Problem with PE proxy
Sent by: ptp-user-bounces@xxxxxxxxxxx
Hello,
  I am attempting to use the PE proxy to run tasks remotely on a LOP system running PE and LL. This system is set up to discourage interactive jobs and thus appears to disallow poe invocations that specify a hostlist. I can run poe interactively from the command line using the -llfile keyword to specify an LL script file containing a few LL commands, as follows:
#@ job_type = parallel
#@ node_usage = not_shared
#@ environment = COPY_ALL
#@ tasks_per_node = 2
#@ node = 1
#@ wall_clock_limit = 0:15:00
#@ queue
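
A command file like the one above, and the matching poe command line, could be generated with a short Python sketch such as the following. The helper names ll_command_file and poe_command are hypothetical, and the argument order "poe <executable> -llfile <file>" is an assumption based on the -llfile keyword mentioned above.

```python
def ll_command_file(nodes: int, tasks_per_node: int, wall_clock_limit: str) -> str:
    """Return the text of an LL command file with the keywords shown above."""
    lines = [
        "#@ job_type = parallel",
        "#@ node_usage = not_shared",
        "#@ environment = COPY_ALL",
        f"#@ tasks_per_node = {tasks_per_node}",
        f"#@ node = {nodes}",
        f"#@ wall_clock_limit = {wall_clock_limit}",
        "#@ queue",
    ]
    return "\n".join(lines) + "\n"


def poe_command(executable: str, llfile: str) -> list:
    """Build the argument list for an interactive poe run (assumed form)."""
    return ["poe", executable, "-llfile", llfile]


if __name__ == "__main__":
    # Same values as the file above: 1 node, 2 tasks per node, 15 minutes.
    print(ll_command_file(1, 2, "0:15:00"), end="")
    print(" ".join(poe_command("./mpi-comm-test.x", "llfile")))
```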

The problem is that I can't figure out how to make this work via the PE proxy. I have tried various setups using basic mode as well as advanced mode. When I use advanced mode, I have a simple file (on the remote system) containing a single PE keyword:
MP_LLFILE=llfile

This fails as well. Here is a debug trace from the Eclipse application that seems to indicate that it can't find the script file. However, the path specified is correct, so I am not sure why it can't locate the file. Alternatively, is there an "advanced" mode that simply uses an LL script file? By the way, I need to run interactively for other reasons, so I don't think the LL proxy will work for my situation.

PE Environment setup file /home/bbode/WorkSpace/Test-Aug/script not found.
SEND:[0000013d 0005:00000002:00000008 00000009:queueId=2 00000027:execPath=/home/bbode/WorkSpace/Test-Aug 00000014:debugStopInMain=true 00000024:env=LD_LIBRARY_PATH=/opt/ibmcmp/lib/ 0000001b:jobSubId=JOB_12829167182584 00000012:launchedByPTP=true 00000018:execName=mpi-comm-test.x 00000029:workingDir=/home/bbode/WorkSpace/Test-Aug] -> Worker-27
RECEIVE:[000000aa ->  00dc:00000001:00000007 00000001:2 00000001:1 00000001:4 00000001:3 0000001b:jobSubId=JOB_12829167182584 0000001d:name=bbode.JOB_12829167182584 00000011:jobState=STARTING] -> Proxy Client Event Thread
RECEIVE:[00000017 ->  0000:00000002:00000000] -> Proxy Client Event Thread
RECEIVE:[000000be ->  00df:00000001:00000009 00000001:4 00000001:1 00000001:0 00000001:5 00000008:name=poe 00000014:processState=RUNNING 0000000f:processNodeId=0 0000000e:processIndex=0 00000010:processPID=25673] -> Proxy Client Event Thread
RECEIVE:[000000b4 ->  00e9:00000001:00000005 00000001:4 00000001:1 00000001:0 00000001:1 00000067:processStderr=ERROR: 0031-121  Invalid combination of settings for MP_EUILIB, MP_HOSTFILE, and MP_RESD
] -> Proxy Client Event Thread
RECEIVE:[00000054 ->  00e6:00000001:00000004 00000001:1 00000001:4 00000001:1 00000012:jobState=COMPLETED] -> Proxy Client Event Thread
RECEIVE:[00000063 ->  00e9:00000001:00000005 00000001:4 00000001:1 00000001:0 00000001:1 00000016:processState=COMPLETED] -> Proxy Client Event Thread
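
Judging only from the trace above, each argument appears to be written as an 8-digit hexadecimal length, a colon, and then that many characters, with a single space between arguments (for example, 00000009:queueId=2 carries the 9-character value queueId=2). Under that assumption, which is inferred from these lines and not taken from any protocol documentation, the arguments can be pulled apart with a small Python sketch. The function name decode_fields is made up for this example.

```python
from typing import List


def decode_fields(payload: str) -> List[str]:
    """Split 'LLLLLLLL:value' fields, where LLLLLLLL is the hex length of value.

    The length prefix matters because a value may itself contain spaces,
    as the processStderr message in the trace does.
    """
    fields = []
    pos = 0
    while pos < len(payload):
        prefix = payload[pos:pos + 8]
        if len(prefix) != 8 or payload[pos + 8:pos + 9] != ":":
            raise ValueError(f"malformed field header at offset {pos}")
        length = int(prefix, 16)
        start = pos + 9
        end = start + length
        if end > len(payload):
            raise ValueError(f"field at offset {pos} runs past end of payload")
        fields.append(payload[start:end])
        pos = end + 1  # skip the single separating space
    return fields


if __name__ == "__main__":
    sample = ("00000009:queueId=2 00000014:debugStopInMain=true "
              "00000012:launchedByPTP=true")
    for field in decode_fields(sample):
        print(field)
```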


Thanks,
Brett
_______________________________________________
ptp-user mailing list
ptp-user@xxxxxxxxxxx

https://dev.eclipse.org/mailman/listinfo/ptp-user
