[ptp-dev] Fwd: [O-MPI devel] Re: Identifying MPI programming problems

I believe the following message was rejected by the PTP list ....

Begin forwarded message:

From: Jeffrey Squyres <jsquyres@xxxxxxxxxxxx>
Date: June 1, 2005 3:31:16 PM MDT
To: Open MPI Developers <devel@xxxxxxxxxxxx>
Cc: Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>, Craig Rasmussen <crasmussen@xxxxxxxx>, Beth Tibbitts <tibbitts@xxxxxxxxxx>, Justin Xue <xue@xxxxxxxxxx>
Subject: Re: [O-MPI devel] Re: Identifying MPI programming problems

I guess it depends on the exact definition of "MPI flow diagrams"...? (I'm not familiar with the term)

Note that there are tools that do message tracing for MPI applications (as I understand the problem) -- they generate tracefiles of all MPI message activity. Some generate tracefiles that can be viewed in real time, others generate tracefiles that can be viewed post-mortem. When viewed in a nice GUI and/or fed to analysis tools, these kinds of tracefiles can show things like deadlock, livelock, tag mismatches, etc. The nice thing about these tools is that many of them are implemented at the MPI profiling layer, which means that they can be used with any number of MPI implementations -- they're not tied to any specific implementation.
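
To make the post-mortem analysis idea concrete, here is a minimal sketch of a tag-mismatch check over a tracefile. The record layout `(op, src, dst, tag)` and the function name `unmatched_messages` are assumptions made up for illustration; they are not the format of any real tracing tool, and the sketch ignores wildcards such as MPI_ANY_SOURCE and MPI_ANY_TAG.

```python
from collections import Counter

def unmatched_messages(trace):
    """Pair up sends and receives by (src, dst, tag).

    Each record is a hypothetical tuple (op, src, dst, tag), where op is
    "send" or "recv". Returns two Counters: sends that never found a
    matching receive, and receives that never found a matching send.
    """
    sends = Counter()
    recvs = Counter()
    for op, src, dst, tag in trace:
        key = (src, dst, tag)
        if op == "send":
            sends[key] += 1
        elif op == "recv":
            recvs[key] += 1
        else:
            raise ValueError(f"unknown op: {op!r}")
    # Counter subtraction keeps only entries with a positive count.
    return sends - recvs, recvs - sends

# Hypothetical trace: rank 1 sends with tag 3, but rank 0 posts a receive for tag 4.
trace = [
    ("send", 0, 1, 7),
    ("recv", 0, 1, 7),
    ("send", 1, 0, 3),
    ("recv", 1, 0, 4),
]
orphan_sends, orphan_recvs = unmatched_messages(trace)
```

A send and a receive left over for the same pair of ranks with different tags, as in the example trace, is the kind of pattern such a tool could flag as a likely tag mismatch.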

Is this what you're talking about?

That being said, it would certainly be nice if the MPI implementation (or a tool) could print out at run-time "Hey, you just entered a deadlock situation and I'm going to hang until you hit ctrl-C". In many cases, as Nathan mentioned, this is quite difficult to determine (some entity would need to maintain a global state of all message passing). Needless to say, such runtime analysis would incur a performance cost, but would probably be acceptable for debugging scenarios. Hence, this run-time notification of at least some common types of MPI programming errors is "difficult but not impossible" for single-threaded MPI application scenarios. It could even be done at the same level as the tools described above -- at the MPI profiling layer, enabling the tool to work with any MPI implementation.
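
As a rough illustration of what "maintaining a global state" could mean in the single-threaded case, the sketch below keeps a wait-for map (which rank each blocked rank is waiting on) and looks for a cycle. The map and the name `find_deadlock_cycle` are hypothetical, and the sketch assumes each blocked rank waits on exactly one specific peer, which real MPI programs do not guarantee.

```python
def find_deadlock_cycle(waiting_on):
    """Return a list of ranks forming a wait-for cycle, or None.

    waiting_on maps a rank to the single rank it is blocked on.
    A rank that is absent, or mapped to None, is not blocked.
    """
    for start in waiting_on:
        path = []
        seen = set()
        rank = start
        # Follow the chain of "is waiting on" edges until it ends or repeats.
        while rank is not None and rank not in seen:
            seen.add(rank)
            path.append(rank)
            rank = waiting_on.get(rank)
        if rank is not None:
            # rank was revisited, so the cycle is the tail of path from rank.
            return path[path.index(rank):]
    return None

# Hypothetical global state: 0 waits on 1, 1 waits on 2, 2 waits on 0.
cycle = find_deadlock_cycle({0: 1, 1: 2, 2: 0})
```

Gathering this map at run time is the expensive part, since some entity has to observe every blocking call on every process.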

But this kind of strategy becomes much more problematic in multi-threaded application scenarios -- if the tool determines that it's in a deadlock situation (I'm waving my hands a bit here), it's impossible for it to know that another thread won't come along and break the deadlock. Hence, it's impossible for the tool to know when it's *really* in a deadlock situation (for example). You might be able to come up with some reasonable heuristics (e.g., all threads in all processes are blocking in MPI calls and no one is making any progress), but I can't think of any ways to do this conclusively off the top of my head (who knows if a signal handler won't create a new thread and break the deadlock, what's a reasonable timeout for "no progress", etc.).
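
The heuristic mentioned above (all threads blocked in MPI calls, nobody making progress) could be sketched roughly as follows. `ThreadState`, `suspect_deadlock`, and the 30-second default timeout are all hypothetical choices for illustration, and the result is only a suspicion: a thread created later could still break the apparent deadlock.

```python
from dataclasses import dataclass

@dataclass
class ThreadState:
    blocked_in_mpi: bool   # True if the thread is currently inside a blocking MPI call
    last_progress: float   # time of last observed progress, in seconds

def suspect_deadlock(threads, now, timeout=30.0):
    """Heuristic only: True if every known thread is blocked in MPI and
    none has made progress for at least `timeout` seconds."""
    if not threads:
        return False
    return all(
        t.blocked_in_mpi and (now - t.last_progress) >= timeout
        for t in threads
    )

# Hypothetical snapshot across all processes, taken at time now=100.0.
snapshot = [
    ThreadState(blocked_in_mpi=True, last_progress=10.0),
    ThreadState(blocked_in_mpi=True, last_progress=42.0),
]
suspected = suspect_deadlock(snapshot, now=100.0)
```

The timeout value is exactly the open question raised above; any fixed number trades false alarms against slow detection.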


On Jun 1, 2005, at 3:52 PM, Donald P Pazel wrote:


On Jun 1, 2005, at 4:31 PM, Craig Rasmussen wrote:
>
> I think Nathan has hit on a great idea (MPI flow diagrams).  Do you
> Open MPI guys think this would be possible?

I'd like to mention that what would be most interesting is to see how MPI flow diagrams are represented from the practitioner viewpoint, as opposed to high-level design diagrams.  I find that the kind of "white board" diagrams that engineers draw daily (e.g. blocks and arrows) and use to capture the essence of code problems are extremely interesting and helpful, and derive from extended experience.  (Then of course we usually erase those drawings, or leave them until they dry hard to the board.)

In any case, I think seeing these paradigmatic drawings, and the problems they address, would be very helpful as input when thinking about tool features.

Thanks,

Don Pazel,






Craig Rasmussen <crasmussen@xxxxxxxx>

06/01/2005 04:31 PM
       
        To:        Nathan DeBardeleben <ndebard@xxxxxxxx>
        cc:        Greg Watson <gwatson@xxxxxxxx>, Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>, Donald P Pazel/Watson/IBM@IBMUS, Justin Xue/Watson/Contr/IBM@IBMUS, Beth Tibbitts/Watson/IBM@IBMUS, Open MPI Developers <devel@xxxxxxxxxxxx>
        Subject:        Re: Identifying MPI programming problems




 On Jun 1, 2005, at 11:47 AM, Nathan DeBardeleben wrote:
 >
 > There are definitely things that can be done, and there are definitely
 > real codes out there that could take advantage of it.  But like
 > anything else it can get exceedingly complicated.  I personally think
 > any steps that can be made towards making MPI flow diagrams (even
 > partially accurate ones) would be huge steps in the right direction.

 I think Nathan has hit on a great idea (MPI flow diagrams).  Do you
 Open MPI guys think this would be possible?

 Cheers,
 Craig


_______________________________________________
devel mailing list
devel@xxxxxxxxxxxx
http://www.open-mpi.org/mailman/listinfo.cgi/devel

--
{+} Jeff Squyres
{+} The Open MPI Project
{+} http://www.open-mpi.org/



