Re: [ptp-dev] on moab integration

Dave -

Thanks for describing the PE case. One question: after you get the poe process id, if one or more tasks initiated by poe (I assume poe, being the master process, is responsible for distributing and running the parallel tasks) die for whatever reason, or the node a task is running on crashes (but not poe itself), will an event be sent to the PTP UI?

Thanks

Feiyi

Dave Wootton wrote:
The Parallel Environment (PE) case is pretty simple since there is no programming interface for invoking PE applications and no methods to query status. When a user invokes a PE application from PTP, the proxy is sent a run command with PE invocation parameters as arguments to the run command. The proxy issues a fork and sets up the application's environment variables using the arguments from the run command and then invokes poe via exec(). (poe is the master process which is responsible for setting up and invoking all the tasks of the parallel application).
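
The fork / environment setup / exec sequence described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the PE proxy's actual code: launch_with_env, struct env_setting, and the DEMO_PROCS variable used in the usage note are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>   /* for callers that later waitpid() on the result */
#include <unistd.h>

/* One NAME=VALUE setting to apply in the child before exec. */
struct env_setting {
    const char *name;
    const char *value;
};

/*
 * Hypothetical helper (not PTP code): fork, export the given settings in the
 * child only, then replace the child with argv[0].
 * Returns the child's pid in the parent, or -1 if fork() failed.
 */
pid_t launch_with_env(char *const argv[], const struct env_setting *env, size_t n_env)
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid != 0) {
        return pid;             /* parent: child's pid, or -1 on failure */
    }

    /* Child: the parent's own environment is left untouched. */
    for (size_t i = 0; i < n_env; i++) {
        if (setenv(env[i].name, env[i].value, 1) != 0) {
            _exit(126);
        }
    }
    execvp(argv[0], argv);      /* only returns on failure */
    _exit(127);
}
```

A caller would pass something like {"sh", "-c", "...", NULL} as argv and {{"DEMO_PROCS", "4"}} as env, then keep the returned pid, which at this point is the only thing the parent knows about the job.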

At the point the proxy invokes the poe executable, it does not know anything about the job other than the pid of the poe process. So the proxy starts a thread whose only purpose is to watch for a file generated by the poe process that has the mapping of application tasks to hostname/pid pairs. Once this thread has the mapping file, it sends events to the PTP GUI notifying it of the existence of the processes (tasks) and also that the job is now running.
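
A polling wait for such a mapping file might look like the sketch below. The message does not give the file's layout, so the "task host pid" line format assumed here is purely illustrative, and wait_for_file, parse_task_map, and struct task_entry are hypothetical names. A real watcher would also need some way to tell that poe has finished writing the file.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* One row of the (assumed) task map: task number, host name, pid. */
struct task_entry {
    int  task;
    char host[64];
    long pid;
};

/*
 * Try to open `path` up to max_tries times, sleeping interval_sec after each
 * failed attempt. Returns an open stream, or NULL if the file never appeared.
 */
FILE *wait_for_file(const char *path, unsigned interval_sec, int max_tries)
{
    assert(path != NULL);
    for (int attempt = 0; attempt < max_tries; attempt++) {
        FILE *fp = fopen(path, "r");
        if (fp != NULL) {
            return fp;
        }
        sleep(interval_sec);
    }
    return NULL;
}

/*
 * Read "task host pid" lines (assumed format) into out[0..max-1].
 * Stops at the first line that does not match. Returns the entry count.
 */
int parse_task_map(FILE *fp, struct task_entry *out, int max)
{
    char line[256];
    int n = 0;
    while (n < max && fgets(line, sizeof line, fp) != NULL) {
        if (sscanf(line, "%d %63s %ld",
                   &out[n].task, out[n].host, &out[n].pid) != 3) {
            break;
        }
        n++;
    }
    return n;
}
```

Each parsed entry corresponds to one "new process" notification, followed by a single "job running" notification once the whole file has been read.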

The proxy has a thread whose only purpose is to watch for termination of any child process of the proxy by issuing waitpid() with the WNOHANG flag set. The only processes detected by this polling loop are the poe processes started by the proxy. When a poe process terminates, the proxy sends a job terminated event to the PTP GUI.
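
A non-blocking termination check of this kind can be sketched with the POSIX waitpid() call and its WNOHANG option. The function names below are hypothetical and the event delivery itself is left out. As in the description, only direct children of the calling process (here, the poe processes) are visible to this loop.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Check once, without blocking, whether any child has terminated.
 * Returns the pid of a reaped child (its wait status is stored in *status),
 * 0 if children exist but none has exited yet, or -1 if there are no
 * children left to wait for (or waitpid() failed).
 */
pid_t poll_child_exit(int *status)
{
    assert(status != NULL);
    return waitpid(-1, status, WNOHANG);
}

/*
 * Polling loop: sleep interval_sec between checks until one child exits.
 * Returns the same values as poll_child_exit(), except never 0.
 */
pid_t wait_for_child_exit(int *status, unsigned interval_sec)
{
    pid_t pid;
    while ((pid = poll_child_exit(status)) == 0) {
        sleep(interval_sec);
    }
    return pid;
}
```

When wait_for_child_exit() returns a positive pid, the caller can inspect the status with WIFEXITED / WEXITSTATUS and report the job as terminated.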

In the PE execution model, any application output written to stdout or stderr is captured by the PE runtime and sent to the poe process. The poe process simply echoes that output to its stdout and stderr file descriptors. For PTP, since the poe process is fork/execed by the proxy, at the point where the fork is issued, I set up pipe pairs for stdout and stderr and capture poe stdio that way. I register the proxy's pipe handles with the select() listening loop set up at proxy startup, and as stdio data becomes available, the proxy generates the events to send that data to the PTP GUI. I also have an option to redirect stdout/stderr to files, which avoids sending data to the PTP GUI.
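
The pipe-pair and select() arrangement can be illustrated with the sketch below. It is a simplified stand-in, not the proxy's code: spawn_captured and drain are hypothetical names, and the output is collected into buffers with a standalone select() loop rather than being registered with an existing listening loop and forwarded as events.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Fork argv[0] with its stdout and stderr redirected into two pipes.
 * On success returns the child pid and stores the read ends in
 * fds[0] (stdout) and fds[1] (stderr). Returns -1 on failure.
 */
pid_t spawn_captured(char *const argv[], int fds[2])
{
    int out[2], err[2];
    assert(argv != NULL && argv[0] != NULL);
    if (pipe(out) != 0) return -1;
    if (pipe(err) != 0) { close(out[0]); close(out[1]); return -1; }

    pid_t pid = fork();
    if (pid == 0) {                       /* child: write ends become stdio */
        dup2(out[1], STDOUT_FILENO);
        dup2(err[1], STDERR_FILENO);
        close(out[0]); close(out[1]); close(err[0]); close(err[1]);
        execvp(argv[0], argv);
        _exit(127);
    }
    close(out[1]); close(err[1]);         /* parent keeps only the read ends */
    if (pid < 0) { close(out[0]); close(err[0]); return -1; }
    fds[0] = out[0];
    fds[1] = err[0];
    return pid;
}

/*
 * Read both pipes via select() until both reach EOF. Data is appended to
 * bufs[0] / bufs[1] (each `cap` bytes, cap >= 1), NUL terminated and
 * truncated if full. Closes the descriptors it finishes with.
 */
void drain(int fds[2], char *bufs[2], size_t cap)
{
    size_t len[2] = {0, 0};
    int open_fds = 2;
    bufs[0][0] = bufs[1][0] = '\0';
    while (open_fds > 0) {
        fd_set rd;
        int maxfd = -1;
        FD_ZERO(&rd);
        for (int i = 0; i < 2; i++) {
            if (fds[i] >= 0) {
                FD_SET(fds[i], &rd);
                if (fds[i] > maxfd) maxfd = fds[i];
            }
        }
        if (select(maxfd + 1, &rd, NULL, NULL, NULL) < 0) {
            if (errno == EINTR) continue;
            break;
        }
        for (int i = 0; i < 2; i++) {
            char chunk[256];
            if (fds[i] < 0 || !FD_ISSET(fds[i], &rd)) continue;
            ssize_t n = read(fds[i], chunk, sizeof chunk);
            if (n <= 0) { close(fds[i]); fds[i] = -1; open_fds--; continue; }
            size_t room = cap - 1 - len[i];
            size_t take = (size_t)n < room ? (size_t)n : room;
            memcpy(bufs[i] + len[i], chunk, take);
            len[i] += take;
            bufs[i][len[i]] = '\0';
        }
    }
}
```

Keeping stdout and stderr on separate pipes lets the receiver label each chunk with its origin, at the cost of losing the exact interleaving between the two streams.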

In this model I have two polling loops. The first is within the thread watching for the poe process to generate the task map file, and is normally a short-lived polling loop. The second polling loop is in the thread watching for poe process termination. In both cases, I sleep a few seconds before iterating the loop. I don't consider either of these polling loops to be heavy CPU users since the processing within these loops is fairly simple.

That's the basic concept. Let me know if you have questions.

Dave
Greg Watson <g.watson@xxxxxxxxxxxx>
Sent by: ptp-dev-bounces@xxxxxxxxxxx
07/24/2007 02:35 PM
Please respond to: Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
To: Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
cc: "Canon, Richard Shane" <canonrs@xxxxxxxx>
Subject: Re: [ptp-dev] on moab integration


Feiyi,

On Jul 24, 2007, at 11:54 AM, Feiyi Wang wrote:

 > hi, folks -
 >
> The Moab documentation says it can interface with both Java and C, but looking into the API, I have two concerns:
 >
> - It doesn't have a callback function to update node and job status, meaning if I want to update the Eclipse front end, I have to probe periodically.
> Is ORTE actively updating the front end? How does it handle this?

ORTE gets its information via callbacks that are generated only when state changes occur, which is much more efficient than polling. I'm not sure what Dave is doing for the PE case, but he might want to comment also.

Polling would probably be ok as long as you keep the frequency fairly low. This is going to be a tradeoff with the responsiveness you want to deliver to the user. You should also keep a snapshot of the state internally in your proxy so that you only need to send differences to Eclipse. That way you'll be able to minimize the load on Eclipse.
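
The snapshot idea can be sketched as a simple comparison between the previous state and the freshly polled one, reporting only the entries that differ. The enum values, struct node_change, and diff_and_update below are hypothetical; a real proxy would track whatever attributes its resource manager exposes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node states; a real resource manager defines its own. */
enum node_state { NODE_UNKNOWN, NODE_UP, NODE_DOWN, NODE_BUSY };

/* One difference to report: which node changed and its new state. */
struct node_change {
    size_t          index;
    enum node_state state;
};

/*
 * Compare a freshly polled state array against the stored snapshot.
 * Every differing entry is recorded in `changes` (which must have room for
 * n entries) and copied into the snapshot. Returns the number of changes.
 */
size_t diff_and_update(enum node_state *snapshot, const enum node_state *fresh,
                       size_t n, struct node_change *changes)
{
    assert(snapshot != NULL && fresh != NULL && changes != NULL);
    size_t n_changes = 0;
    for (size_t i = 0; i < n; i++) {
        if (snapshot[i] != fresh[i]) {
            changes[n_changes].index = i;
            changes[n_changes].state = fresh[i];
            n_changes++;
            snapshot[i] = fresh[i];
        }
    }
    return n_changes;
}
```

After each poll, only the returned changes need to be sent on to Eclipse; a poll that finds nothing new returns 0 and generates no traffic at all.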

Does Moab have a tool to monitor status? How does that work?

 >
> - It is strange that I couldn't find an API/structure to query system resources. Most of the APIs documented there are job related. One Moab developer suggested that I use their so-called "XML api": some C function takes an XML string and returns an XML result, the same as you would get from their command line tool.
 >
> The issue with getting an XML string and *not* a C structure is that I need to re-parse the result and set up the correct structure again. As a side note, these returned XML strings can be very, very large:
 >
> For example, on the ORNL jaguar system with over 11000 nodes, to implement get_node_attributes(), a query to the system yields a 6.8M XML string, and the worst of it is that it is *a single string*. So far it has rendered Eclipse, Emacs, and Gedit unresponsive when trying to load it. Even if I parse it right eventually, it feels so ugly. Do you folks see this as a long term solution?

It sounds like this will generate a lot of overhead just parsing the string every time. Is there any way to only generate differences or do you just get a full dump every time?

You're going to be restricted by whatever API Moab provides. If it proves unworkable, then getting Moab to add a better interface might be one option.

Greg

 >

_______________________________________________
ptp-dev mailing list
ptp-dev@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/ptp-dev

