Re: [ptp-user] Debug in PTP

Hi Greg,

Thanks for the fast response. Please find my answers below.

Greg Watson wrote:
> Hi Robert,
> 
> PTP 1.1 does not provide remote debugging. This is something we're
> hoping to add for the 2.0 release later this year.
> 
> However, there are enough hooks in 1.1 to run the debugger remotely,
> albeit with some manual steps.
> 
> 1. Open Preferences, then go to PTP->Debug->SDM. Change 'SDM client host'
> to the hostname of the machine running Eclipse (your local machine).
I have that set to "localhost", which is what it was set to before, so I
guess that should work.

> 2. Change 'Path to backend debugger' to the location of gdb on the
> remote nodes. If gdb is in your PATH on the nodes then skip this step.
The backend debugger is set to "gdb-mi" and the field is not editable. I
cannot find an executable by that name on my system, yet a local debug
session works, so I wonder how it is being located...
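
Since "gdb-mi" may refer to gdb's machine interface (MI) rather than to a
separate executable (an assumption, not confirmed in this thread), the
check that matters for step 2 is probably whether gdb itself is in the
PATH on each node. A minimal sketch of that check, assuming Python 3 is
available on the nodes; the helper name find_debugger is purely
illustrative:

```python
import shutil


def find_debugger(name="gdb"):
    """Return the full path of `name` if it is found on PATH, else None."""
    # shutil.which searches the directories listed in PATH, much like the
    # shell does when it resolves a command name.
    return shutil.which(name)


if __name__ == "__main__":
    path = find_debugger()
    print(path if path else "gdb not found in PATH")
```

Running this on the local machine and on each remote node would show
whether step 2 can really be skipped.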

> 3. Manually copy the application executable onto the remote nodes. If
> you're mounting your local directory on the nodes then skip this step.
Due to some problems with "locking" on our file system, I am running
everything out of /tmp for the moment. I have synchronized the local
/tmp with the /tmp directory on the nodes in the cluster, so all
executables are available. All paths are the same on all systems.

> 4. Create a debug launch configuration. If the location of the
> executable in step (3) is different from the location on the local
> machine then go to the 'Debugger' tab and enter the information in
> 'Remote executable path'. The 'Remote working directory' is the working
> directory on the remote machine. If you entered an absolute path for the
> executable then this shouldn't matter (I think).
I did that. But the path is the same on all machines anyway.

> 
> You should hopefully now be able to launch a debug session.
No, I am not. :-)

But since this is not supported today, I am not sure if we should
continue to work on the workaround?! I can always run 10 processes on my
local machine and debug it that way.

For your information, I have attached the Eclipse console log; I think
it can provide a clue about the failure.

BTW, when I click "Cancel" in the dialog window that displays
"Operation in progress...  Connecting to proxy server....", I get a
dialog saying that the "Session is not connected".

Regards,

Robert



> 
> 
> On Apr 23, 2007, at 5:27 AM, Robert Henschel wrote:
> 
>> Hi,
>>
>> I am new to PTP, so please let me know if this question has already
>> been answered and where to find the information.
>>
>> I have installed Eclipse and PTP on my notebook and on my workstation
>> in the office. By modifying "/usr/local/etc/openmpi-default-hostfile",
>> I am able to see both "nodes" in the PTP runtime perspective in
>> Eclipse. I can launch an application with 4 MPI processes, two per
>> node, and they run. I can see them in PTP Runtime perspective. Very
>> good. :-)
>>
>> Now I wanted to debug them. When I start them with the debugger, all I
>> get is a dialog saying "Operation in progress...  Connecting to proxy
>> server...."
>> Debugging works when I run the processes only on the local machine.
>> But as soon as I include a remote machine (by modifying the
>> default-hostfile), debugging no longer works.
>>
>> Any help would be appreciated, I guess I can also supply more
>> information if you let me know what to look for.
>>
>> Regards,
>>
>> Robert Henschel
>> _______________________________________________
>> ptp-user mailing list
>> ptp-user@xxxxxxxxxxx
>> https://dev.eclipse.org/mailman/listinfo/ptp-user

henschel@deimos104:/tmp/Robert/eclipse> ./eclipse
In getResourceManagerFactories
retrieved factory: Simulation, org.eclipse.ptp.simulation.core.resourcemanager
leaving getResourceManagerFactories
Your Control System Choice: 'Open Runtime Environment (ORTE)'
Your Monitoring System Choice: 'Open Runtime Environment (ORTE)'
In retrieveConfigurationWizardPageFactories
wizard page factory: org.eclipse.ptp.simulation.ui.wizards.SimulationRMConfigurationWizardPageFactory@cd74b31 for class: class org.eclipse.ptp.rmsystem.SimulationResourceManagerFactory
leaving retrieveConfigurationWizardPageFactories
OS = 'linux', Architecture = 'x86_64', OS_ARCH combo = 'org.eclipse.ptp.linux.x86_64'
PTP Version = 1.1.0
All Found Fragments:
        update@plugins/org.eclipse.ptp.linux.ppc/ [141]
        update@plugins/org.eclipse.ptp.linux.x86/ [142]
        update@plugins/org.eclipse.ptp.linux.x86_64/ [143]
        update@plugins/org.eclipse.ptp.macosx.ppc/ [144]
        update@plugins/org.eclipse.ptp.macosx.x86/ [145]
Testing fragment 1 with this OS/arch - path: '/tmp/Robert/eclipse/plugins/org.eclipse.ptp.linux.ppc/'
Testing fragment 2 with this OS/arch - path: '/tmp/Robert/eclipse/plugins/org.eclipse.ptp.linux.x86/'
Testing fragment 3 with this OS/arch - path: '/tmp/Robert/eclipse/plugins/org.eclipse.ptp.linux.x86_64/'
        Correct fragment for our OS & arch
        Searching for file in '/tmp/Robert/eclipse/plugins/org.eclipse.ptp.linux.x86_64/bin/sdm'
                **** FOUND IT!
XXXXXXXXXXX refreshRuntimeSystems(false), isInitialized():false
XXXXXXXXXXX refreshRuntimeSystems calling initialize(), force:false, isInitialized():false
refreshRuntimeSystems
SHUTTING DOWN CONTROL/MONITORING/PROXY systems where appropriate
OMPIProxyRuntimeClient - firing up proxy, waiting for connecting.  Please wait!  This can take a minute . . .
ORTE_SERVER path = '/tmp/Robert/eclipse/plugins/org.eclipse.ptp.linux.x86_64/bin/ptp_orte_proxy'
sessionCreate(0,0)
bind(0.0.0.0/0.0.0.0:0)
port=38979
accept thread starting...
Waiting on accept.
OMPIProxyRuntimeClient waiting on {201, 210}
RUNNING PROXY SERVER COMMAND: '/tmp/Robert/eclipse/plugins/org.eclipse.ptp.linux.x86_64/bin/ptp_orte_proxy --port=38979'
OMPIProxyRuntimeClient got event: EVENT_RUNTIME_CONNECTED
OMPIProxyRuntimeClient notifying...
OMPIProxyRuntimeClient awoke!
event thread starting...
accept thread exiting...
<0000000b STARTDAEMON>
OMPIProxyRuntimeClient waiting on {200, 201, 211}
++++++++++ ptp_orte_proxy: proxy_svr_connect returned.
++++++++++ ptp_orte_proxy: PARENT: orted_pid = 32026
++++++++++ ptp_orte_proxy: proxy_svr_connect returned.
++++++++++ ptp_orte_proxy: StartDaemon(orted orted --scope public --seed --persistent --no-daemonize --universe PTP-ORTE-32026 --report-uri 5)
++++++++++ ptp_orte_proxy: CHILD: Starting execvp now!
++++++++++ ptp_orte_proxy: PARENT: URI = 0.0.0;tcp://192.168.93.242:57749;tcp://141.30.63.215:57749;tcp://192.168.97.242:57749;tcp://192.168.117.242:57749
++++++++++ ptp_orte_proxy: ORTEInit (PTP-ORTE-32026)
OMPIProxyRuntimeClient got event: EVENT_RUNTIME_OK
OMPIProxyRuntimeClient notifying...
OMPIProxyRuntimeClient awoke!
OMPIMonitoringSystem startup()
OMPIMonitoringSystem: initiateDiscovery phase
<00000008 DISCOVER>
++++++++++ ptp_orte_proxy: Start daemon returning OK.
++++++++++ ptp_orte_proxy: DISCOVERY PHASE: end
OMPIProxyRuntimeClient got event: EVENT_RUNTIME_NODEATTR [Ljava.lang.String;@6908af2a : [Ljava.lang.String;@39242445
ModelManager.runtimeNodeGeneralName - #keys = 14, #values = 14
                Unknown machine ID (0), adding to the model.
                Unknown node number (0), adding to the model.
                Unknown node number (1), adding to the model.
sessionCreate(0,0)
bind(0.0.0.0/0.0.0.0:0)
port=43599
accept thread starting...
XXXXXXXXXXXX   Waiting for Universe to Populate
MODEL MANAGER: newJob(1)
ModelManager.run() - new JobID = 1
JAVA OMPI: run() with args:
name:           Test1
path:           /tmp/Robert/workspace_ptp/Test1/Debug
cwd:            /tmp/Robert/workspace_ptp/Test1/Debug
machineName:    Machine0
#procs:         4
#proc/node:     1
firstNode#:     0
isDebug?                true
<000002d6 RUN 6:6a6f62494400 2:3100 9:657865634e616d6500 6:546573743100 b:70617468546f4578656300 26:2f746d702f526f626572742f776f726b73706163655f7074702f54657374312f446562756700 b:6e756d4f6650726f637300 2:3400 d:70726f63735065724e6f646500 2:3100 d:66697273744e6f64654e756d00 2:3000 b:776f726b696e6744697200 26:2f746d702f526f626572742f776f726b73706163655f7074702f54657374312f446562756700 d:64656275676765725061746800 41:2f746d702f526f626572742f65636c697073652f706c7567696e732f6f72672e65636c697073652e7074702e6c696e75782e7838365f36342f62696e2f73646d00 c:646562756767657241726700 11:2d2d686f73743d6c6f63616c686f737400 c:646562756767657241726700 12:2d2d64656275676765723d6764622d6d6900 c:646562756767657241726700 d:2d2d706f72743d343335393900>
++++++++++ ptp_orte_proxy: (debug ? 1) Spawning 4 processes of job '/tmp/Robert/workspace_ptp/Test1/Debug/Test1'
++++++++++ ptp_orte_proxy:      program name '/tmp/Robert/workspace_ptp/Test1/Debug/Test1'
++++++++++ ptp_orte_proxy: SPAWNED [error code 0 = 'Success'], now unlocking
++++++++++ ptp_orte_proxy: NEW JOBID = 2
++++++++++ ptp_orte_proxy: registering IO forwarding - name = ''
OMPIProxyRuntimeClient got event: EVENT_RUNTIME_JOBSTATE (jobid=1) state=1
*********** JOB STATE CHANGE: starting (job = job1)
++++++++++ ptp_orte_proxy: Returning from ORTERun
OMPIProxyRuntimeClient got event: EVENT_RUNTIME_PROCATTR job=1 {0}:<>  [0]:<ATTRIB_PROCESS_NODE_NAME=p2s224>
*********** PROC ATTRIBUTE CHANGE: (job = job1)
setting node[job1_process0]=p2s224(0)
OMPIProxyRuntimeClient got event: EVENT_RUNTIME_PROCATTR job=1 {0}:<>  [2]:<ATTRIB_PROCESS_NODE_NAME=p2s224>
*********** PROC ATTRIBUTE CHANGE: (job = job1)
setting node[job1_process2]=p2s224(0)
OMPIProxyRuntimeClient got event: EVENT_RUNTIME_PROCATTR job=1 {0}:<>  [1]:<ATTRIB_PROCESS_NODE_NAME=p2s207>
*********** PROC ATTRIBUTE CHANGE: (job = job1)
setting node[job1_process1]=p2s207(1)
OMPIProxyRuntimeClient got event: EVENT_RUNTIME_PROCATTR job=1 {0}:<>  [3]:<ATTRIB_PROCESS_NODE_NAME=p2s207>
*********** PROC ATTRIBUTE CHANGE: (job = job1)
setting node[job1_process3]=p2s207(1)
OMPIProxyRuntimeClient got event: EVENT_RUNTIME_JOBSTATE (jobid=1) state=3
*********** JOB STATE CHANGE: exited (job = job1)
accept failed... :(
accept thread exiting...
OMPIMonitoringSystem shutdown()
OMPIControlSystem: shutdown() called
OMPIProxyRuntimeClient shutting down server...
<00000003 QUI>
OMPIProxyRuntimeClient waiting on {200, 201}
++++++++++ ptp_orte_proxy: ORTEQuit called!
++++++++++ ptp_orte_proxy: ORTEShutdown() called.  Telling daemon to turn off.
++++++++++ ptp_orte_proxy: ORTEShutdown() - told ORTEd to exit.
OMPIProxyRuntimeClient got event: EVENT_RUNTIME_OK
OMPIProxyRuntimeClient notifying...
OMPIProxyRuntimeClient awoke!
OMPIProxyRuntimeClient shut down.
event thread exiting...
++++++++++ ptp_orte_proxy: proxy_svr_finish returned.
Exception in thread "Proxy Client Event Thread" java.lang.NoClassDefFoundError: org/eclipse/ptp/core/proxy/event/ProxyDisconnectedEvent
        at org.eclipse.ptp.core.proxy.AbstractProxyClient$2.run(AbstractProxyClient.java:240)
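
The "<000002d6 RUN ...>" line in the log above appears to carry its
arguments as hex-encoded, NUL-terminated strings, each prefixed with a
hex byte count and a colon. That framing is inferred from the log alone,
not from any PTP documentation. A minimal Python sketch to make the
arguments readable; the function name decode_proxy_args is hypothetical:

```python
def decode_proxy_args(text):
    """Decode space-separated 'hexlen:hexbytes' tokens into strings.

    Assumed format (inferred from the log only): each token is a hex byte
    count, a colon, and that many hex-encoded bytes ending in a NUL byte.
    Tokens without a colon (the leading length field, the command name)
    are skipped.
    """
    decoded = []
    for token in text.strip().strip("<>").split():
        length_hex, sep, payload_hex = token.partition(":")
        if not sep:
            continue  # not an argument token
        payload = bytes.fromhex(payload_hex)
        if len(payload) != int(length_hex, 16):
            raise ValueError("length mismatch in token %r" % token)
        # Drop the trailing NUL terminator before decoding.
        decoded.append(payload.rstrip(b"\x00").decode("ascii"))
    return decoded


# The last six tokens of the RUN line, copied verbatim from the log.
sample = (
    "c:646562756767657241726700 11:2d2d686f73743d6c6f63616c686f737400 "
    "c:646562756767657241726700 12:2d2d64656275676765723d6764622d6d6900 "
    "c:646562756767657241726700 d:2d2d706f72743d343335393900"
)
print(decode_proxy_args(sample))
```

For those tokens this prints ['debuggerArg', '--host=localhost',
'debuggerArg', '--debugger=gdb-mi', 'debuggerArg', '--port=43599']. So the
debugger is apparently started with --host=localhost, which may be
relevant to step 1 above: a process on a remote node would resolve
"localhost" to itself rather than to the machine running Eclipse.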
