Re: [ptp-user] Eclipse & Remote Tools

Ok, I've generated an SSH key and connected to the cluster.
Now there's another issue.
Locally, my hello-world code works fine. When I add the nodes, it connects
(at least it doesn't give connection errors as before).
The paths where I've installed Open MPI are the same ( /usr/local/openmpi-1.3.3/lib/... ).
I've uploaded the same code to the same directory, with the same executable.
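For reference, the key setup I mean above is roughly the standard OpenSSH recipe (the username and node names below are just placeholders from my setup; exact options may differ):

# on the desktop: generate a key pair (defaults, empty passphrase)
ssh-keygen -t rsa

# copy the public key to each node so ssh stops asking for a password
ssh-copy-id hadoop@os221
ssh-copy-id hadoop@os222

# this should now log in without any prompt
ssh hadoop@os221 hostname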

When I run mpirun -np 2 MPI_Final (MPI_Final is my executable, which works fine in local mode),
it gives the following error:

hadoop@sertan-desktop:~/MPI_Deneme$ mpirun -np 2 MPI_Final
[sertan-desktop:04745] *** Process received signal ***
[sertan-desktop:04745] Signal: Segmentation fault (11)
[sertan-desktop:04745] Signal code: Address not mapped (1)
[sertan-desktop:04745] Failing at address: 0x6c
[sertan-desktop:04745] [ 0] /lib/libpthread.so.0 [0x7fb2bb3d5190]
[sertan-desktop:04745] [ 1] /usr/local/openmpi-1.3.3/lib/libopen-rte.so.0(orte_plm_base_app_report_launch+0x230) [0x7fb2bc12ec70]
[sertan-desktop:04745] [ 2] /usr/local/openmpi-1.3.3/lib/libopen-pal.so.0 [0x7fb2bbea4f28]
[sertan-desktop:04745] [ 3] /usr/local/openmpi-1.3.3/lib/libopen-pal.so.0(opal_progress+0x99) [0x7fb2bbe990d9]
[sertan-desktop:04745] [ 4] /usr/local/openmpi-1.3.3/lib/libopen-rte.so.0(orte_trigger_event+0x42) [0x7fb2bc111522]
[sertan-desktop:04745] [ 5] /usr/local/openmpi-1.3.3/lib/libopen-rte.so.0(orte_plm_base_app_report_launch+0x22d) [0x7fb2bc12ec6d]
[sertan-desktop:04745] [ 6] /usr/local/openmpi-1.3.3/lib/libopen-pal.so.0 [0x7fb2bbea4f28]
[sertan-desktop:04745] [ 7] /usr/local/openmpi-1.3.3/lib/libopen-pal.so.0(opal_progress+0x99) [0x7fb2bbe990d9]
[sertan-desktop:04745] [ 8] /usr/local/openmpi-1.3.3/lib/libopen-rte.so.0(orte_plm_base_launch_apps+0x23d) [0x7fb2bc12fa3d]
[sertan-desktop:04745] [ 9] /usr/local/openmpi-1.3.3/lib/openmpi/mca_plm_rsh.so [0x7fb2ba642a41]
[sertan-desktop:04745] [10] mpirun [0x403a79]
[sertan-desktop:04745] [11] mpirun [0x402ee4]
[sertan-desktop:04745] [12] /lib/libc.so.6(__libc_start_main+0xfd) [0x7fb2bb075abd]
[sertan-desktop:04745] [13] mpirun [0x402e09]
[sertan-desktop:04745] *** End of error message ***
Segmentation fault
hadoop@sertan-desktop:~/MPI_Deneme$

But the libraries at the locations listed above all exist there... I couldn't solve the problem. I'm still trying to get the "HelloMPI" code to work :/ I haven't managed it yet, after a week.
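I guess the next thing for me to check is whether every node really picks up the same 1.3.3 install. Roughly something like this (just a sketch; the node names are from my cluster and I'm not sure yet that this is what causes the segfault):

# check that each node resolves mpirun and the libraries from the same prefix
ssh os221 'which mpirun; ls /usr/local/openmpi-1.3.3/lib | head'
ssh os222 'which mpirun; ls /usr/local/openmpi-1.3.3/lib | head'

# make sure non-interactive shells on the nodes export the library path
export LD_LIBRARY_PATH=/usr/local/openmpi-1.3.3/lib:$LD_LIBRARY_PATH

# or let mpirun forward the install location itself
mpirun --prefix /usr/local/openmpi-1.3.3 -np 2 MPI_Final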

Saygin

On Mon, Apr 19, 2010 at 5:14 PM, Saygin Arkan <saygenius@xxxxxxxxx> wrote:
I've copied the plugins, source code, etc.
I've added the nodes and made the necessary changes to the following files (rough example entries are below the list):
/usr/local/etc/openmpi-default-hostfile
/etc/hosts
/usr/local/etc/openmpi-mca-params.conf
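
The kind of entries I mean, roughly (the IP addresses and slot counts are placeholders, not my real values):

# /etc/hosts - name-to-address mapping for each node
192.168.1.221   os221
192.168.1.222   os222

# /usr/local/etc/openmpi-default-hostfile - nodes Open MPI is allowed to launch on
os221 slots=2
os222 slots=2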

But when I launch my application from the command line (no Eclipse, no Parallel Tools Platform), it asks for passwords.
On the cluster, yes, there are 4 nodes, and I usually connect over SSH with a username/password.
But here I don't give any username; it still asks for a password and (as expected) doesn't accept any.
If I give the node names in the host list as username@os221, it still isn't accepted.
I also removed the modifications to the host configuration files and wrote my own hostfile; it gave the same error, as follows:

sertan@sertan-desktop:~/workspace/MPI_Final/src$ mpirun --hostfile my_hostfile -np 4 MPI_Final
Password: Password:
Password:
Password:
Password:
Password:
Permission denied (publickey,keyboard-interactive).
--------------------------------------------------------------------------
A daemon (pid 8587) died unexpectedly with status 255 while attempting
to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun was unable to cleanly terminate the daemons on the nodes shown
below. Additional manual cleanup may be required - please refer to
the "orte-clean" tool for assistance.
--------------------------------------------------------------------------
    os221.(my server name)- daemon did not report back when launched
    os222.(my server name) - daemon did not report back when launched
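
I guess I could check whether it's really the SSH login that fails by running a non-interactive command on each node, something like:

# each of these should print the node's hostname without asking for a password;
# if a password prompt appears, mpirun's launch daemon will hit the same prompt and fail
ssh os221 hostname
ssh os222 hostname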



If I launch the project from Eclipse, it's the same thing:
a dialog box appears and asks for a password, I enter it, it gets rejected, and so on. It goes on like this every time.

I asked somebody about it, and they told me about generating a public key on the server. Is that related to this problem? I have no idea how to do it :/

Thanks a lot for your help

Saygin



On Mon, Apr 12, 2010 at 5:54 PM, Steven R. Brandt <sbrandt@xxxxxxxxxxx> wrote:
I think RemoteTools with port forwarding is the better way to go. You don't have to worry about firewalls that way.

Also, make sure the unix command "stat" is in your path on the remote machine.
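For example, something like this should print its location on each node (substitute the real username and host):

ssh username@os221 'which stat'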

Cheers,
Steve


On 04/12/2010 08:09 AM, Greg Watson wrote:
Saygin,

I'm assuming that you have Open MPI installed on the cluster. It shouldn't matter if the versions on your computer and cluster are different, except that you will have to compile the program on the cluster in order to run it there.

Because remote support is not fully implemented in PTP 3.0, I'd suggest you set up your project as follows:

1. Once you've installed Eclipse and PTP locally, copy the "plugins" directory to your cluster. Build the SDM by running "sh BUILD" in the appropriate directory on the cluster.
2. Create a CVS repository on the cluster containing your source code.
3. Check out a copy of the source on the cluster. This will be your build copy.
4. Check out a copy of the source into your Eclipse workspace (using the extssh service).
5. When you modify your source locally, check the changes back into CVS and build the program manually by running "cvs update; make all" on the cluster.
6. Create an Open MPI resource manager using a "Remote Tools" connection to your cluster. Make sure it starts correctly.
7. Create a launch configuration. Choose the RM you just created. Set the application program to the executable on the cluster. Set the SDM path to the location of the "sdm" executable in step #1.

You should now be ready to go.
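
Roughly, the cluster-side commands for steps 1, 3 and 5 would look something like this (the plugin directory, repository path and module name are placeholders for whatever you actually set up):

# step 1: build the SDM inside the copied plugins directory
cd ~/eclipse/plugins/<ptp sdm plugin directory>
sh BUILD

# step 3: check out a build copy of the source on the cluster
cvs -d /path/to/cvsroot checkout MPI_Final

# step 5: after committing local changes, update and rebuild on the cluster
cd MPI_Final
cvs update
make all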

Regards,
Greg

On Apr 5, 2010, at 10:17 AM, Saygin Arkan wrote:

 
Hi,
I'd like to run a job on my cluster.
I can't decide which options to select and which modifications to make (ok, at least not local :) ).
I don't know whether I should choose RSE or Remote Tools when I'm creating a Resource Manager.
If I create one or the other, which part will be local and which will be accessed remotely?
I have 4 computers: os221, os222, os223 and os224.
I've modified /etc/hosts and made the necessary additions to the files in /usr/local/etc.

But I made those modifications only on my localhost. I don't know how to run a job on my cluster. Should I copy the executable files by hand,
or should I add the SDM debugger to each of the nodes? The nodes don't have Eclipse, and the versions on my computer & cluster differ (1.4.1 & 1.3.4); does that cause problems?
I'm just trying to make a simple Hello World application; the code is correct.

Thanks for the help

Saygin
_______________________________________________
ptp-user mailing list
ptp-user@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/ptp-user
