Re: [ptp-user] PTP debugger JAXB/LML interactive mode

Hi Jean-Christophe,

For 5.0.4, you need an attribute called 'mpiCores' in the JAXB XML configuration that contains the number of processes that will be launched. Normally this would be displayed as a field in the launch configuration UI that allows the user to enter the number of processes. For the forge.pbs.interactive.openmpi RM, this is the "MPI Number of Cores" field. The value of this attribute is used to populate the Parallel Debug view with icons, one per process.

For the upcoming 5.0.5 release, I renamed this attribute to mpiNumberOfProcesses to be more descriptive. If you have a 5.0.4 RM, you'll need to remove and re-add the resource manager to pick up the change. I've also added a check that will display an error if this attribute is not set. 
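As a quick way to check a custom RM definition for this, here is a minimal Python sketch that reports whether either attribute name appears anywhere in a configuration file. It is only an illustration: the function name and the sample XML layout are hypothetical, and the only assumption about the real schema is that the attribute is declared on some element via name="...".

```python
import xml.etree.ElementTree as ET

# Names taken from this message: 'mpiCores' (5.0.4) and
# 'mpiNumberOfProcesses' (5.0.5).
PROCESS_COUNT_NAMES = ("mpiCores", "mpiNumberOfProcesses")


def find_process_count_names(xml_text):
    """Return the process-count names declared anywhere in the XML text."""
    root = ET.fromstring(xml_text)
    found = []
    # Walk every element, whatever its tag or namespace, and look at its
    # 'name' attribute (assumption: that is how the attribute is declared).
    for element in root.iter():
        name = element.get("name")
        if name in PROCESS_COUNT_NAMES and name not in found:
            found.append(name)
    return found


# Hypothetical fragment for illustration only, not copied from a real
# PTP resource manager configuration.
SAMPLE = """
<resource-manager>
  <control-data>
    <attribute name="mpiCores" type="integer"/>
  </control-data>
</resource-manager>
"""

if __name__ == "__main__":
    names = find_process_count_names(SAMPLE)
    if names:
        print("process-count attribute(s) found:", ", ".join(names))
    else:
        print("no process-count attribute found")
```

An empty result for your own RM file would point to the missing attribute as the likely reason the Parallel Debug view stays empty.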

Regards,
Greg

On Feb 8, 2012, at 6:02 AM, <Jean-Christophe.WEILL@xxxxxx> <Jean-Christophe.WEILL@xxxxxx> wrote:

Hello,
 
For our system here, I am in the process of configuring Eclipse Indigo SR1 + PTP 5.0.4.
The system runs a customized version of SLURM, packaged in our own RM.
 
I managed to get the basics working: running both interactive and batch jobs, and analysis with TAU.
So far so good. Two steps remain: debugging and remote use (I can run directly on the parallel computer, but this is not what should be allowed for all users :) )
 
For the debugging step, I successfully compiled sdm and ran it with a modified version of the SLURM proxy, and it works. We have up to 128 cores per node, so I chose not to show all the cores and to show only the nodes in the PTP system monitoring; it was not scaling so well on 100000-core computers!
 
 
But that is not what I want: I want to run with JAXB/LML. So I used ….forge.pbs.interactive.openmpi.xml as a reference to see what I should do.
 
At the moment it is nearly working: I wrote a script that generates the routing_file, then immediately signals the right job ID and launches the process interactively.
 
 
The debugger is launched and the process is stopped (netstat shows me the connections between the sdm master and the sdm slaves…), BUT there is a catch:
nothing appears in the Parallel Debug view. I can only see Process 0 in the Debug view, so I cannot get past mpi_init, since all the other processes are not shown in the interface and I cannot control them.
Do you have any idea what is going on? What triggers the Parallel Debug view? How can I debug it?
(The process is correctly displayed as running in the PTP system monitoring…)
 
Regards,
 
Jean-Christophe.
 
 
 
 
 

