Re: [ptp-dev] on moab integration


From: Dave Wootton <dwootton@xxxxxxxxxx>
Date: Wed, 25 Jul 2007 08:38:15 -0400
Delivered-to: ptp-dev@xxxxxxxxxxx

Greg
There is no support for a callback facility within the LoadLeveler API. 
The only mechanism supported is direct query for the needed information, 
where it is up to the API user to determine an appropriate query/polling 
mechanism to get the data.
Dave

Greg Watson <g.watson@xxxxxxxxxxxx> 
Sent by: ptp-dev-bounces@xxxxxxxxxxx
07/24/2007 06:13 PM
Please respond to
Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>

To
Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
cc

Subject
Re: [ptp-dev] on moab integration

Dave,

Does that mean that the LoadLeveler API only supports polling, not 
callback? I guess this is going to be an issue with other job schedulers 
too.

Greg

On Jul 24, 2007, at 3:55 PM, Dave Wootton wrote:

Also, since the process state model for PE is so simple, we don't maintain 
any process state within the proxy to try to optimize the events sent to 
the PTP GUI.

We're also working on an implementation to support LoadLeveler (IBM batch 
job scheduler). In that model, LoadLeveler provides a C programming API 
which allows us to query many attributes of the cluster LoadLeveler is 
running on, including node state, job state, etc., as well as to submit 
jobs. Our implementation will use that API rather than using LoadLeveler 
commands to retrieve the data. For the LoadLeveler implementation, since 
the run environment and status tracking are more complex, we will be 
maintaining state within the proxy so we only send events for changes 
rather than sending complete state each time we want to update the GUI. 
We also need to be concerned with the polling interval in that case, both 
due to concerns about CPU load on the proxy node and to avoid overloading 
the LoadLeveler daemons with too-frequent status requests.
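
To illustrate the "only send events for changes" idea, here is a minimal sketch in Python (not the proxy's actual C code). It assumes each poll produces a plain dictionary mapping a node or job name to its state; the function name `diff_state` and the event tuples are hypothetical, not part of the LoadLeveler API or the PTP proxy protocol.

```python
def diff_state(old, new):
    """Return (kind, key, value) events that turn snapshot `old` into `new`."""
    events = []
    # Entries that appeared or whose state changed since the last poll.
    for key, value in new.items():
        if key not in old:
            events.append(("added", key, value))
        elif old[key] != value:
            events.append(("changed", key, value))
    # Entries that were present last time but are gone now.
    for key in old:
        if key not in new:
            events.append(("removed", key, None))
    return events
```

A poller would keep the previous snapshot, call `diff_state(previous, current)` after each query, forward only the returned events, and then store `current` as the new baseline. An unchanged cluster produces an empty list, so nothing is sent.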
Dave
Dave Wootton/Poughkeepsie/IBM@IBMUS 
Sent by: ptp-dev-bounces@xxxxxxxxxxx 
07/24/2007 05:45 PM 

Please respond to
Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>

To
Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
cc
Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>, 
ptp-dev-bounces@xxxxxxxxxxx, "Canon, Richard Shane" <canonrs@xxxxxxxx>
Subject
Re: [ptp-dev] on moab integration

The Parallel Environment (PE) case is pretty simple since there is no 
programming interface for invoking PE applications and no methods to query 
status. When a user invokes a PE application from PTP, the proxy is sent a 
run command with PE invocation parameters as arguments to the run command. 
The proxy issues a fork and sets up the application's environment 
variables using the arguments from the run command and then invokes poe 
via exec(). (poe is the master process which is responsible for setting up 
and invoking all the tasks of the parallel application). 
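
The fork / set environment / exec sequence described above could look roughly like the following. This is a Python sketch of the same POSIX calls, not the proxy's C source; the function name `launch` and the `extra_env` parameter are hypothetical, and no real PE environment variable names are assumed.

```python
import os

def launch(argv, extra_env):
    """Fork, apply `extra_env` in the child, then exec argv[0]. Returns the child's pid."""
    pid = os.fork()
    if pid == 0:
        # Child: build the environment from the run-command arguments, then exec.
        env = dict(os.environ)
        env.update(extra_env)
        try:
            os.execvpe(argv[0], argv, env)
        finally:
            # Only reached if exec failed; never fall back into the parent's code.
            os._exit(127)
    # Parent: at this point the pid is all that is known about the job.
    return pid
```

For example, `launch(["/bin/sh", "-c", "exit $DEMO_RC"], {"DEMO_RC": "7"})` starts a child that sees `DEMO_RC` in its environment; the caller still has to reap the child with `os.waitpid()`.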

At the point the proxy invokes the poe executable, it does not know 
anything about the job other than the pid of the poe process. So the proxy 
starts a thread whose only purpose is to watch for a file generated by the 
poe process that has the mapping of application tasks to hostname/pid 
pairs. Once this thread has the mapping file, it sends events to the PTP 
gui notifying it of the existence of the processes (tasks) and also that 
the job is now running.
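
A minimal Python sketch of the file-watching loop follows. It only waits for the file to appear and be non-empty; the format of the task map file is not described here, so no parsing is attempted. The name `wait_for_file` and the `interval` and `timeout` parameters are hypothetical.

```python
import os
import time

def wait_for_file(path, interval=1.0, timeout=30.0):
    """Poll until `path` exists and is non-empty. Return True, or False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            if os.path.getsize(path) > 0:
                return True
        except OSError:
            pass  # file has not been created yet
        time.sleep(interval)
    return False
```

In a proxy this would run in its own thread (for example via `threading.Thread`) so the main loop is not blocked. A non-empty file is not necessarily a complete one, so a real implementation would likely need an additional completeness check.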

The proxy has a thread whose only purpose is to watch for process 
termination of any child process of the proxy by issuing waitpid() with 
the WNOHANG flag set. The only processes detected by this polling loop 
are the poe processes started by the proxy. When a poe process terminates, 
the proxy sends a job terminated event to the PTP GUI.
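
The non-blocking waitpid() loop can be sketched in Python as below. One deliberate difference, as an assumption for the sake of a self-contained example: the sketch polls an explicit set of pids instead of waiting on any child, so it cannot accidentally reap unrelated children. The name `reap_children` and the `on_exit` callback are hypothetical.

```python
import os
import time

def reap_children(pids, interval=2.0, on_exit=print):
    """Poll with WNOHANG until every pid in `pids` has terminated."""
    remaining = set(pids)
    while remaining:
        for pid in list(remaining):
            done, status = os.waitpid(pid, os.WNOHANG)
            if done == 0:
                continue  # child is still running
            remaining.discard(pid)
            # Exit code for a normal exit, negative signal number otherwise.
            if os.WIFEXITED(status):
                code = os.WEXITSTATUS(status)
            else:
                code = -os.WTERMSIG(status)
            on_exit(pid, code)  # e.g. send a "job terminated" event
        if remaining:
            time.sleep(interval)  # sleep before iterating again
```

The `interval` default mirrors the "sleep a few seconds" approach; the loop body is a handful of system calls per child, which is why the CPU cost stays low.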

In the PE execution model, any application output written to stdout or 
stderr is captured by the PE runtime and sent to the poe process. The poe 
process simply echoes that output to its stdout and stderr file 
descriptors. For PTP, since the poe process is fork/exec'ed by the proxy, 
at the point where the fork is issued, I set up pipe pairs for stdout and 
stderr and capture poe stdio that way. I register the proxy's pipe handles 
with the select() listening loop set up at proxy startup, and as stdio data 
becomes available, the proxy generates the events to send that data to the 
PTP GUI. I also have an option to redirect stdout/stderr to files, which 
avoids sending data to the PTP GUI.
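
The pipe-plus-select() capture can be sketched as follows, again in Python rather than the proxy's C. The sketch runs its own small select() loop instead of registering with an existing one, and the names `run_and_capture` and `on_data` are hypothetical.

```python
import os
import select

def run_and_capture(argv, on_data):
    """Fork/exec argv with stdout and stderr on pipes; forward chunks via on_data."""
    out_r, out_w = os.pipe()
    err_r, err_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: route fds 1 and 2 into the pipe write ends, then exec.
        try:
            os.dup2(out_w, 1)
            os.dup2(err_w, 2)
            for fd in (out_r, out_w, err_r, err_w):
                os.close(fd)
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed
    # Parent: keep only the read ends.
    os.close(out_w)
    os.close(err_w)
    names = {out_r: "stdout", err_r: "stderr"}
    while names:
        ready, _, _ = select.select(list(names), [], [])
        for fd in ready:
            chunk = os.read(fd, 4096)
            if chunk:
                on_data(names[fd], chunk)  # e.g. send an output event
            else:
                os.close(fd)  # EOF: the child closed its end
                del names[fd]
    _, status = os.waitpid(pid, 0)
    return status
```

Calling `run_and_capture(["/bin/sh", "-c", "echo out; echo err 1>&2"], handler)` invokes `handler` once per available chunk with the stream name and the raw bytes. Redirecting to files instead would amount to opening the files and passing their descriptors to `dup2()` in the child.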

In this model I have two polling loops. The first is within the thread 
watching for the poe process to generate the task map file, and is 
normally a short-lived polling loop. The second polling loop is in the 
thread watching for poe process termination. In both cases, I sleep a few 
seconds before iterating the loop. I don't consider either of these 
polling loops to be heavy CPU users since the processing within these 
loops is fairly simple.

That's the basic concept. Let me know if you have questions.

Dave
Greg Watson <g.watson@xxxxxxxxxxxx> 
Sent by: ptp-dev-bounces@xxxxxxxxxxx 
07/24/2007 02:35 PM 

Please respond to
Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>

To
Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
cc
"Canon, Richard Shane" <canonrs@xxxxxxxx>
Subject
Re: [ptp-dev] on moab integration

Feiyi,

On Jul 24, 2007, at 11:54 AM, Feiyi Wang wrote:

> hi, folks -
>
> The Moab documentation says it can interface with both Java and C, but
> looking into the API, I have two concerns:
>
> - It doesn't have a callback function to update node and job status,
> meaning that if I want to update the Eclipse front end, I have to probe
> periodically.
> Is ORTE actively updating the front end? How does it handle this?

ORTE gets its information via callbacks that are generated only when  
state changes occur, which is much more efficient than polling. I'm  
not sure what Dave is doing for the PE case, but he might want to  
comment also.

Polling would probably be ok as long as you keep the frequency fairly  
low. This is going to be a tradeoff with the responsiveness you want  
to deliver to the user. You should also keep a snapshot of the state  
internally in your proxy so that you only need to send differences to  
Eclipse. That way you'll be able to minimize the load on Eclipse.

Does Moab have a tool to monitor status? How does that work?

>
> - It is strange that I couldn't find an API/structure to query system
> resources. Most of the APIs documented there are job related. One Moab
> developer suggested that I use their so-called "XML API": some C
> functions take an XML string and return an XML result, the same as
> you would get from their command-line tool.
>
> The issue with getting an XML string and *not* a C structure is that I
> need to re-parse the result and set up the correct structure again. As a
> side note, these returned XML strings can be very, very large.
>
> For example, on the ORNL Jaguar system with over 11000 nodes, to
> implement get_node_attributes(), a query to the system yields a 6.8M XML
> string and, worst of all, it is *a single string*. So far it renders
> Eclipse, Emacs, and Gedit unresponsive when trying to load
> it. Even if I parse it right eventually, it feels so ugly. Do you
> folks see this as a long-term solution?

It sounds like this will generate a lot of overhead just parsing the  
string every time. Is there any way to only generate differences or  
do you just get a full dump every time?
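
If a full dump is all that is available, one way to limit the parsing cost is to stream through the XML rather than build a complete tree. The sketch below uses Python's standard `xml.etree.ElementTree.iterparse`. The element name `node` and its attributes are purely hypothetical, since the actual Moab XML schema is not shown in this thread.

```python
import io
import xml.etree.ElementTree as ET

def iter_nodes(xml_bytes, tag="node"):
    """Yield the attribute dict of each <tag> element as soon as it is parsed."""
    for _event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("end",)):
        if elem.tag == tag:
            yield dict(elem.attrib)
            elem.clear()  # release the element's attributes and children
```

Each yielded dictionary could then be compared against the previous snapshot so that only differences are sent on to Eclipse. Note that this reduces the memory used for the parsed tree, not the cost of receiving the full string in the first place.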

You're going to be restricted by whatever API Moab provides. If it  
proves unworkable, then getting Moab to add a better interface might  
be one option.

Greg

>

_______________________________________________
ptp-dev mailing list
ptp-dev@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/ptp-dev