Re: [ptp-user] Eclipse & Remote Tools

OK, the problem is solved now.
I've installed the same versions, and now it works on the command line.
Greg, I know this is not the right mailing list for this, but without solving that problem I couldn't move on to PTP.
If I run into any other issue related to Open MPI, I assure you that I won't post it here from now on.
Thanks a lot for your recommendations.
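
Since installing the same versions is what fixed the command-line run, a check along these lines might catch a mismatch earlier. It is only a minimal sketch: it assumes passwordless ssh to each node and that `mpirun --version` prints a line such as `mpirun (Open MPI) 1.3.3`. The function names and the sample outputs are invented for illustration.

```python
import re
import subprocess
from collections import defaultdict

# Matches the version in a line such as "mpirun (Open MPI) 1.3.3".
VERSION_RE = re.compile(r"\(Open MPI\)\s+(\S+)")


def parse_version(text):
    """Return the Open MPI version found in text, or None if there is none."""
    match = VERSION_RE.search(text)
    return match.group(1) if match else None


def group_by_version(outputs):
    """Map each reported version to the sorted list of hosts reporting it."""
    groups = defaultdict(list)
    for host, text in outputs.items():
        groups[parse_version(text)].append(host)
    return {version: sorted(hosts) for version, hosts in groups.items()}


def collect_outputs(hosts):
    """Run 'mpirun --version' on each host over ssh (assumes key-based login)."""
    outputs = {}
    for host in hosts:
        result = subprocess.run(
            ["ssh", host, "mpirun", "--version"],
            capture_output=True,
            text=True,
        )
        # Keep both streams, since the version line may go to either one.
        outputs[host] = result.stdout + result.stderr
    return outputs


# Hypothetical outputs, NOT taken from the real machines.
SAMPLE_OUTPUTS = {
    "os221": "mpirun (Open MPI) 1.3.3",
    "os222": "mpirun (Open MPI) 1.4.1",
}
SAMPLE_GROUPS = group_by_version(SAMPLE_OUTPUTS)
```

On a real cluster one would call `group_by_version(collect_outputs([...]))` with the node names; more than one key in the result means the nodes do not all run the same Open MPI version.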

Now a problem with PTP

I tried to get my HelloWorldMPI working in PTP. I have 4 cluster nodes: os221, os222, os223 and os224.
I've tried several ways to make it work. The first resource manager configuration:
Remote Service provider: Local
Remote Location: Local
Local address: localhost
I've added the nodes to configuration files in openmpi/etc/... & etc/hosts

Now when I start the resource manager, I see these 4 nodes, no problem. I created the run configuration, set the number of processes to 4, chose the application file, etc.
(It doesn't copy the executable from the local file system, so I copied it to the nodes manually over ssh.)
When I run it, the output below appears ONLY under job0:0. It doesn't seem to reach the other 3 nodes; it looks like it only runs on os221, doesn't it? When I double-click the first node in the Machines view and select job0:0 from the process info, I get the result below. But job0:1, job0:2 and job0:3 show nothing at all:
<map>
    <host name="os221.uzay.tubitak.gov.tr" slots="1" max_slots="0">
        <process rank="0"/>
    </host>
    <host name="os222.uzay.tubitak.gov.tr" slots="1" max_slots="0">
        <process rank="1"/>
    </host>
    <host name="os223.uzay.tubitak.gov.tr" slots="1" max_slots="0">
        <process rank="2"/>
    </host>
    <host name="os224.uzay.tubitak.gov.tr" slots="1" max_slots="0">
        <process rank="3"/>
    </host>
</map>
<stdout rank="2">my rank : 2</stdout>
<stdout rank="1">my rank : 1</stdout>
<stdout rank="3">my rank : 3</stdout>
<stdout rank="0">my rank : 0</stdout>
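
The output above can also be read mechanically to see which host each rank was mapped to. The following is a minimal sketch, assuming the text is XML-tagged launcher output with a `<map>` element followed by `<stdout>` elements, as pasted. The function names are invented for illustration, and the sample is a shortened copy of the output.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the pasted output (two of the four hosts).
SAMPLE = """
<map>
    <host name="os221.uzay.tubitak.gov.tr" slots="1" max_slots="0">
        <process rank="0"/>
    </host>
    <host name="os222.uzay.tubitak.gov.tr" slots="1" max_slots="0">
        <process rank="1"/>
    </host>
</map>
<stdout rank="1">my rank : 1</stdout>
<stdout rank="0">my rank : 0</stdout>
"""


def _parse(xml_text):
    # The output has several top-level elements, so wrap it in one root
    # element to make it a well-formed XML document.
    return ET.fromstring("<output>" + xml_text + "</output>")


def rank_to_host(xml_text):
    """Return {rank: host name} taken from the <map> section."""
    mapping = {}
    for host in _parse(xml_text).iter("host"):
        for process in host.iter("process"):
            mapping[int(process.get("rank"))] = host.get("name")
    return mapping


def stdout_by_rank(xml_text):
    """Return {rank: text} for every <stdout> element."""
    return {
        int(element.get("rank")): element.text
        for element in _parse(xml_text).iter("stdout")
    }


for rank, host in sorted(rank_to_host(SAMPLE).items()):
    print(rank, host, repr(stdout_by_rank(SAMPLE).get(rank)))
```

If each rank in the map points to a different host and every rank has a `<stdout>` line, the processes were at least mapped to all nodes, even though everything is displayed under job0:0. The sketch would fail on program output containing characters that are not valid XML.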

Another approach:
Remote Service provider: RSE
Remote Location: ssh_connection
Local address: localhost
I've added the nodes to configuration files in openmpi/etc/... & etc/hosts

Everything seems fine. I added another run configuration, and again the same thing happens: only os221 produces output.
<stdout rank="1">my rank : 1</stdout>
<stdout rank="3">my rank : 3</stdout>
<stdout rank="2">my rank : 2</stdout>
<stdout rank="0">my rank : 0</stdout>
logout

And in the machines view, I see a "!" error sign on the nodes.


Another approach:
Remote Service provider: RSE
Remote Location: local
Local address: localhost
I've added the nodes to configuration files in openmpi/etc/... & etc/hosts
Couldn't start the resource manager.

Another approach:
Remote Service provider: Remote
Remote Location: local
Local address: localhost
I've added the nodes to configuration files in openmpi/etc/... & etc/hosts
Couldn't start the resource manager. It gives an internal error, null argument, and nothing else.


Saygin

On Tue, Apr 20, 2010 at 3:36 PM, Greg Watson <g.watson@xxxxxxxxxxxx> wrote:
Saygin,

I'd recommend trying a later version (1.4.1 or 1.3.4) as older versions have known bugs. Also, we're not Open MPI experts, so there may not be anyone on this list who can help. I'd suggest for these types of Open MPI installation/configuration questions you use one of the mailing lists on http://www.open-mpi.org/community/lists/ompi.php. We're happy to help out with problems specific to PTP.

Regards,
Greg

On Apr 20, 2010, at 8:03 AM, Saygin Arkan wrote:

OK, I've generated an ssh key and connected to the cluster.
Now there's another issue.
Locally my helloworld code works fine. When I added the nodes, it connects
(at least it doesn't give connection errors as before).
The paths where I've installed openmpi are the same. ( /usr/local/openmpi-1.3.3/lib/... )
I've uploaded the same code, to the same directory, with the same executable.

When I run mpirun -np 2 MPI_Final (MPI_Final is my executable, which works fine in local mode),
it gives the following error:

hadoop@sertan-desktop:~/MPI_Deneme$ mpirun -np 2 MPI_Final
[sertan-desktop:04745] *** Process received signal ***
[sertan-desktop:04745] Signal: Segmentation fault (11)
[sertan-desktop:04745] Signal code: Address not mapped (1)
[sertan-desktop:04745] Failing at address: 0x6c
[sertan-desktop:04745] [ 0] /lib/libpthread.so.0 [0x7fb2bb3d5190]
[sertan-desktop:04745] [ 1] /usr/local/openmpi-1.3.3/lib/libopen-rte.so.0(orte_plm_base_app_report_launch+0x230) [0x7fb2bc12ec70]
[sertan-desktop:04745] [ 2] /usr/local/openmpi-1.3.3/lib/libopen-pal.so.0 [0x7fb2bbea4f28]
[sertan-desktop:04745] [ 3] /usr/local/openmpi-1.3.3/lib/libopen-pal.so.0(opal_progress+0x99) [0x7fb2bbe990d9]
[sertan-desktop:04745] [ 4] /usr/local/openmpi-1.3.3/lib/libopen-rte.so.0(orte_trigger_event+0x42) [0x7fb2bc111522]
[sertan-desktop:04745] [ 5] /usr/local/openmpi-1.3.3/lib/libopen-rte.so.0(orte_plm_base_app_report_launch+0x22d) [0x7fb2bc12ec6d]
[sertan-desktop:04745] [ 6] /usr/local/openmpi-1.3.3/lib/libopen-pal.so.0 [0x7fb2bbea4f28]
[sertan-desktop:04745] [ 7] /usr/local/openmpi-1.3.3/lib/libopen-pal.so.0(opal_progress+0x99) [0x7fb2bbe990d9]
[sertan-desktop:04745] [ 8] /usr/local/openmpi-1.3.3/lib/libopen-rte.so.0(orte_plm_base_launch_apps+0x23d) [0x7fb2bc12fa3d]
[sertan-desktop:04745] [ 9] /usr/local/openmpi-1.3.3/lib/openmpi/mca_plm_rsh.so [0x7fb2ba642a41]
[sertan-desktop:04745] [10] mpirun [0x403a79]
[sertan-desktop:04745] [11] mpirun [0x402ee4]
[sertan-desktop:04745] [12] /lib/libc.so.6(__libc_start_main+0xfd) [0x7fb2bb075abd]
[sertan-desktop:04745] [13] mpirun [0x402e09]
[sertan-desktop:04745] *** End of error message ***
Segmentation fault
hadoop@sertan-desktop:~/MPI_Deneme$

But the libraries at the locations listed above all exist... I couldn't solve the problem. I'm still trying to get the "HelloMPI" code to work :/ and haven't managed it yet, after a week.

Saygin