Re: [ptp-user] PTP debugging with SLURM (was mpich support in luna?)

hi Greg,

My sort-of-working setup is indeed a merge of parts of the slurm batch and openmpi interactive configurations.
I use salloc to set up my resources and launch my job with mpirun using openmpi. In this way I can recycle the start_job.pl script from the OPENMPI directory.
This results in a command like this:

salloc --time=01:00:00 --partition=gpu_short --ntasks=2 --ntasks-per-node=1 mpirun -np 2 -mca orte_show_resolved_nodenames 1 -display-map /nfs/home1/thomasge/.eclipsesettings/sdm --port=34822 --host=localhost --debugger=gdb-mi --debug=13 --routing_file=/nfs/home1/thomasge/source/testmpitype/routing_file
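
For illustration, the three parts of the command above (the salloc allocation, the mpirun launch, and the sdm arguments) can be assembled separately. This is a minimal Python sketch, not what start_job.pl does; the function name build_debug_command and its parameters are hypothetical, and every flag is copied verbatim from the command above.

```python
import shlex


def build_debug_command(ntasks, sdm_path, port, routing_file,
                        partition="gpu_short", time_limit="01:00:00"):
    """Assemble an salloc + mpirun + sdm command line as a single string.

    Hypothetical helper: the flags mirror the command quoted in this
    message; nothing here is taken from the PTP sources.
    """
    # Resource allocation handled by Slurm.
    salloc = [
        "salloc",
        f"--time={time_limit}",
        f"--partition={partition}",
        f"--ntasks={ntasks}",
        "--ntasks-per-node=1",
    ]
    # Open MPI launcher that runs inside the allocation.
    mpirun = [
        "mpirun", "-np", str(ntasks),
        "-mca", "orte_show_resolved_nodenames", "1",
        "-display-map",
    ]
    # The sdm debugger binary and its arguments.
    sdm = [
        sdm_path,
        f"--port={port}",
        "--host=localhost",
        "--debugger=gdb-mi",
        "--debug=13",
        f"--routing_file={routing_file}",
    ]
    # shlex.join quotes any argument that needs it for a POSIX shell.
    return shlex.join(salloc + mpirun + sdm)
```

Calling it with ntasks=2, port=34822 and the sdm and routing_file paths shown above reproduces that command line.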

This does allow me to debug a parallel run using multiple nodes, but there is an issue.
When my allocation ends, the termination of the job is not "seen" by eclipse and it remains in an undefined state (that I cannot terminate).
I basically have to restart eclipse.

thanks
Thomas


On Fri, Jan 30, 2015 at 3:34 PM, Greg Watson <g.watson@xxxxxxxxxxxx> wrote:
The sdm supports slurm using the SLURM_PROCID environment variable. The tricky bit is getting a debug job launched. Schedulers like Torque, LSF, etc., provide an interactive mode that is used to launch the job using the appropriate mpirun command for the MPI runtime (e.g. Open MPI or MPICH2), but it tends to be very system specific. For slurm, you would need to copy the slurm-generic.xml target system configuration and add a submit-interactive-debug command (or submit-batch-debug if no interactive mode is possible). Take a look at the edu.sdsc.trestles.torque.interactive.openmpi.xml configuration for an example.
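
As background on that variable: Slurm sets SLURM_PROCID in the environment of each task it launches, holding the task's relative process ID. A minimal Python sketch of reading it follows. The function name slurm_task_id is hypothetical, and this is only an illustration of the idea, not the sdm's actual code.

```python
import os


def slurm_task_id(env=None):
    """Return the Slurm task ID from SLURM_PROCID, or None if unset or invalid.

    Hypothetical helper; `env` defaults to the real process environment
    and can be replaced by a plain dict for testing.
    """
    env = os.environ if env is None else env
    raw = env.get("SLURM_PROCID")
    if raw is None:
        # Not launched as a Slurm task (or the variable was not exported).
        return None
    try:
        return int(raw)
    except ValueError:
        # Variable present but not an integer; treat it as unusable.
        return None
```

A process that gets None back would have to obtain its ID some other way, for example from the MPI runtime's own environment.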

Greg


On Jan 30, 2015, at 8:50 AM, Thomas Geenen <geenen@xxxxxxxxx> wrote:

hi Beth,

could you give me an update on the slurm support for debugging?
I have hacked something together for my local setup that sort of works, but I am not really happy with it.

best
Thomas

On Mon, Aug 11, 2014 at 8:49 PM, Beth Tibbitts <beth@xxxxxxxxxx> wrote:
Well, you seem enthusiastic, and assuming you can do some development with guidance, have access to a SLURM system, and can test on SLURM... that's a great start.
We will discuss with other PTP developers at this week's PTP hackathon.

>how to go about testing/building a developer version. 
May seem intimidating I know, but we have instructions for that and we can help.
https://wiki.eclipse.org/PTP  under Developer Resources > Environment Setup.

I would recommend a different Eclipse installation and workspace for PTP plugin work from
your normal PTP user install+workspace.


...Beth

Beth Tibbitts


On Mon, Aug 11, 2014 at 2:43 PM, Biddiscombe, John A. <biddisco@xxxxxxx> wrote:
Beth

If there is any way I can help, then I happily volunteer. Unfortunately, I have never found the time to look into the eclipse/ptp/plugin source and so have absolutely no idea how to go about testing/building a developer version. Many times I vowed to myself that I’d try to get it working, but never did.

If you want a volunteer, then I’m in. But unless you can make use of a clueless loser and tell me what to do, then I probably won’t be any help.

JB

From: Beth Tibbitts <beth@xxxxxxxxxx>
Reply-To: "ptp-user@xxxxxxxxxxx" <ptp-user@xxxxxxxxxxx>
Date: Monday 11 August 2014 17:44
To: "ptp-user@xxxxxxxxxxx" <ptp-user@xxxxxxxxxxx>
Subject: [ptp-user] PTP debugging with SLURM (was mpich support in luna?)

John,
We lost our committer who did the support and testing for SLURM.
If you are willing to work on it and test it, I'm sure other committers can give you some direction.
We are having a PTP hackathon this week in Baton Rouge; this would be a good topic for discussion.


...Beth

Beth Tibbitts
beth@xxxxxxxxxx


On Mon, Aug 11, 2014 at 9:53 AM, Biddiscombe, John A. <biddisco@xxxxxxx> wrote:
Dear list

I had a look at http://wiki.eclipse.org/images/8/80/PTP-user-debug-20140129.pdf which gives info on mpich debugging in ptp luna.

Question : Would this work with slurm? I’ve not had success with debugging using slurm previously, so I wonder if it is supported now.

Thanks

JB

_______________________________________________
ptp-user mailing list
ptp-user@xxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.eclipse.org/mailman/listinfo/ptp-user
