Re: [ptp-user] Problem with the PTP Target System Configuration on Mac OS X

From: "Dr. Martin Steinhauser" <martin.steinhauser@xxxxxxxxxxxxxxxxx>
Date: Mon, 24 Jun 2013 15:58:59 +0200

Dear Beth,

THANK YOU VERY MUCH !

You actually gave the decisive clue to solve the problem. In fact, everything was indeed as simple as ensuring that the PATH is also correct for a non-interactive shell.

I only had the correct path in my .profile, which apparently is not read by a non-interactive shell. Now it works, since I've also copied the path into .bashrc.
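
For readers hitting the same problem, the fix described above might look like the following sketch for ~/.bashrc. The directory is an assumption taken from the shell prompt quoted further down in this thread (~/Installations/openmpi-install); path_prepend is a hypothetical helper name, not something PTP or Open MPI provides. Substitute your own install location.

```shell
# Assumed Open MPI install location (from the prompt quoted later in
# this thread); adjust to wherever ompi_info and mpirun actually live.
OMPI_BIN="$HOME/Installations/openmpi-install/bin"

# Hypothetical helper: prepend a directory to PATH only if it is not
# already listed, so sourcing this file repeatedly does not grow PATH.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;                        # already present, do nothing
        *) PATH="$1${PATH:+:$PATH}" ;;      # otherwise put it in front
    esac
}

path_prepend "$OMPI_BIN"
export PATH
```

If your .bashrc returns early for non-interactive shells (some default files do), these lines probably need to come before that check. Afterwards, running ssh localhost 'echo $PATH' should list the directory.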

Problem solved and you made my day.

Thanks to all.

Martin.

On Jun 24, 2013, at 2:52 PM, Beth Tibbitts <beth@xxxxxxxxxx> wrote:

Martin, see this:
http://wiki.eclipse.org/PTP/FAQ#Q:_My_remote_or_synchronized_project_doesn.27t_find_the_remote_environment_variables_correctly_.28Interactive_vs._non-interactive_shell.29

Maybe it's as simple as making sure the PATH is correct as seen from non-interactive shell.

Also, the new release of PTP and Eclipse Kepler will be out on Wednesday 6/26... the connection to the local machine no longer requires ssh and is more direct in this release.

On Mon, Jun 24, 2013 at 6:14 AM, Dr. Martin Steinhauser <martin.steinhauser@xxxxxxxxxxxxxxxxx> wrote:

Dear Greg,

I think I found out something more:

When I type on the command shell:
"ssh localhost opmi_info"

then I get the following output:

Password:<I type in my password here>

Hi, welcome to this Mac shell

bash: opmi_info: command not found

Does that mean that, for some weird reason, the command opmi_info is NOT in the PATH when I connect to my local MacBook Pro via "ssh localhost"?

As this is exactly how Eclipse seems to connect locally to my laptop when trying to run the MPI program, it would explain the error. But then, how can I resolve this, i.e., how can I make sure that, after typing ssh localhost, the program ompi_info is known to the system?
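
One way to investigate this is to compare the PATH of the login shell with the PATH that a command started through ssh actually sees. The sketch below is only meaningful for localhost, where both shells share one file system; found_in_path and check_ssh_path are hypothetical helper names. Note also that the transcript above spells the command opmi_info, whereas the Eclipse error message names ompi_info; a misspelled name produces the same "command not found" message regardless of PATH.

```shell
# Succeed if command $1 can be resolved when PATH is set to $2.
# Runs in a subshell so the caller's PATH is left untouched.
# (Shell builtins and functions resolve regardless of PATH.)
found_in_path() {
    ( PATH="$2"; command -v "$1" >/dev/null 2>&1 )
}

# Fetch the PATH seen by a non-interactive ssh command on host $1 and
# check whether command $2 resolves with it.
# Assumes the host shares this machine's file system (e.g. localhost).
check_ssh_path() {
    # tail keeps only the last line in case a startup file prints a greeting
    remote_path=$(ssh "$1" 'echo "$PATH"' | tail -n 1)
    [ -n "$remote_path" ] || return 2
    if found_in_path "$2" "$remote_path"; then
        echo "$2 found via ssh PATH"
    else
        echo "$2 NOT found; ssh PATH is: $remote_path"
        return 1
    fi
}
```

A call such as check_ssh_path localhost ompi_info will prompt for the password just like the ssh command above. If it reports the command as missing while the same command works in a terminal, the PATH is most likely set in a file that only login or interactive shells read.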

With best regards,
Martin

On Jun 24, 2013, at 10:52 AM, Dr. Martin Steinhauser <martin.steinhauser@xxxxxxxxxxxxxxxxx> wrote:

Dear Greg,

Following your advice, I picked "Open MPI-Generic-Interactive" as the Target System Config, but then the following error message pops up (see attached screenshot).

"Failed to execute command: ompi_info -a --parseable
Cannot run program "ompi_info": Unknown reason"

What does that mean and why is some command "ompi_info -a --parseable" called? This really beats me.

Unfortunately, the error message does not help me solve this problem, as it gives no real hint as to what to do.
Do you understand this, and do you know what I could do here?

When I execute the above command on a shell, I get a couple of hundred lines of output, starting with:
ompi:version:full:1.6.4
ompi:version:svn:r28081
ompi:version:release_date:Feb 19, 2013
orte:version:full:1.6.4
orte:version:svn:r28081
orte:version:release_date:Feb 19, 2013
opal:version:full:1.6.4
opal:version:svn:r28081
opal:version:release_date:Feb 19, 2013
mpi-api:version:full:2.1
ident:1.6.4
mca:backtrace:execinfo:version:mca:2.0
mca:backtrace:execinfo:version:api:2.0
mca:backtrace:execinfo:version:component:1.6.4
mca:paffinity:hwloc:version:mca:2.0
mca:paffinity:hwloc:version:api:2.0
mca:paffinity:hwloc:version:component:1.6.4
mca:carto:auto_detect:version:mca:2.0
mca:carto:auto_detect:version:api:2.0
mca:carto:auto_detect:version:component:1.6.4
mca:carto:file:version:mca:2.0
mca:carto:file:version:api:2.0
mca:carto:file:version:component:1.6.4
mca:shmem:mmap:version:mca:2.0
mca:shmem:mmap:version:api:2.0
mca:shmem:mmap:version:component:1.6.4
mca:shmem:posix:version:mca:2.0
mca:shmem:posix:version:api:2.0

Does this mean anything to you? Thus, the command ompi_info seems to be just a parser, but why it is called and why it produces an error instead of running the MPI code, I don't understand.
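
The --parseable output shown above is a list of colon-separated key paths with the value at the end, which makes it easy for a tool to query the installation. Why PTP calls it is not stated in this message; the following is only a sketch of how such output could be consumed, and ompi_full_version is a hypothetical helper name, not PTP's code.

```shell
# Read `ompi_info --parseable` output on stdin and print the value of
# the ompi:version:full line (the Open MPI release number).
ompi_full_version() {
    awk -F: '$1 == "ompi" && $2 == "version" && $3 == "full" { print $4 }'
}

# Example with two lines copied from the output quoted above.
printf '%s\n' 'ompi:version:full:1.6.4' 'ompi:version:svn:r28081' | ompi_full_version
# prints 1.6.4
```

Against a real installation this would be ompi_info -a --parseable | ompi_full_version, which only works if ompi_info can be found on the PATH of the shell that runs it.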

When I grep the output for "error" I get:

[105] <10:47:50> martin@SteinhauserMac: ~/Installations/openmpi-install$>bin/ompi_info -a --parseable |grep "error"
378:options:mpi-max-error-string:256
409:mca:mca:base:param:mca_component_show_load_errors:value:1
410:mca:mca:base:param:mca_component_show_load_errors:data_source:default value
411:mca:mca:base:param:mca_component_show_load_errors:status:writable
412:mca:mca:base:param:mca_component_show_load_errors:help:Whether to show errors for components that failed to load or not
413:mca:mca:base:param:mca_component_show_load_errors:deprecated:no
422:mca:mca:base:param:mca_verbose:help:Specifies where the default error output stream goes (this is separate from distinct help messages). Accepts a comma-delimited list of: stderr, stdout, syslog, syslogpri:<notice|info|debug>, syslogid:<str> (where str is the prefix string for all syslog notices), file[:filename] (if filename is not specified, a default filename is used), fileappend (if not specified, the file is opened for truncation), level[:N] (if specified, integer verbose level; otherwise, 0 is implied)
473:mca:mpi:base:param:mpi_keep_peer_hostnames:help:If nonzero, save the string hostnames of all MPI peer processes (mostly for error / debugging output messages). This can add quite a bit of memory usage to each MPI process.
510:mca:mpi:base:param:mpi_warn_on_fork:help:If nonzero, issue a warning if program forks under conditions that could cause system errors
979:mca:hwloc:base:param:hwloc_base_mem_bind_failure_action:value:error
982:mca:hwloc:base:param:hwloc_base_mem_bind_failure_action:help:What Open MPI will do if it explicitly tries to bind memory to a specific NUMA location, and fails. Note that this is a different case than the general allocation policy described by hwloc_base_alloc_policy. A value of "warn" means that Open MPI will warn the first time this happens, but allow the job to continue (possibly with degraded performance). A value of "error" means that Open MPI will abort the job if this happens.
1478:mca:bml:r2:param:bml_r2_show_unreach_errors:value:1
1479:mca:bml:r2:param:bml_r2_show_unreach_errors:data_source:default value
1480:mca:bml:r2:param:bml_r2_show_unreach_errors:status:writable
1481:mca:bml:r2:param:bml_r2_show_unreach_errors:help:Show error message when procs are unreachable
1482:mca:bml:r2:param:bml_r2_show_unreach_errors:deprecated:no
2112:mca:rmaps:base:param:rmaps_base_no_oversubscribe:help:If true, then do not allow oversubscription of nodes - mpirun will return an error if there aren't enough nodes to launch all processes without oversubscribing
2426:mca:notifier:command:param:notifier_command_cmd:value:/sbin/initlog -f $s -n "Open MPI" -s "$S: $m (errorcode: $e)"
2429:mca:notifier:command:param:notifier_command_cmd:help:Command to execute, with substitution. $s = integer severity; $S = string severity; $e = integer error code; $m = string message

Does this mean anything to you?
Any ideas what I can do?

Best regards,

Martin.

On Jun 21, 2013, at 6:07 PM, Greg Watson <g.watson@xxxxxxxxxxxx> wrote:

Martin,

Your choice of target configuration determines which scripts get run. As you guessed, the scripts are used to formulate the command that gets issued to run the application, so the target configuration must match the type of installation you're trying to run the application on. So if you choose IBM Parallel Environment, then the system needs to have IBM Parallel Environment installed on it. If you have a system with Torque installed on it, then you would use the Torque-Generic-Batch configuration, etc.

We provide two target configurations that are most suited to running locally: Open MPI-Generic-Interactive and MPICH2-Generic-Interactive. Use the former if you have Open MPI installed on your Mac; use the latter if you have MPICH2. I would recommend using Open MPI as it has been more thoroughly tested.

Regards,
Greg

On Jun 21, 2013, at 10:28 AM, Dr. Martin Steinhauser <martin.steinhauser@xxxxxxxxxxxxxxxxx> wrote:

Dear all,

I'm having trouble running the provided Eclipse MPI template code "MPI PI C Project" on my brand-new MacBook Pro (8 cores) with OS X (latest version).
I wanted to check this simple sample code to see whether it works in Eclipse on my Mac before I start parallelising my own programs using MPI.

The point is: On the shell, the code compiles just fine. All libraries are there, include files are found and I can run a command such as
"mpirun -n 5 ./Debug/TEST". The code then runs smoothly.

The code also compiles fine in Eclipse, and I can run the binary (here: TEST) from the shell. However, when I try to run the binary in Eclipse, it does not run, and I'm not sure what the problem is.

1. First of all, I am not quite sure which Target System Configuration I should actually pick. The tutorial is outdated and does not help here at all.

To get something running, I tried out every possible choice and found that there seems to be no error when I pick IBM PARALLEL ENVIRONMENT or IBM PLATFORM MPI.
So I arbitrarily picked IBM PARALLEL ENVIRONMENT.

When I then run the code, an ERROR message appears:

f5109c56-8fcb-4f1c-b7a5-ce5aee21dfb6: FAILED
Can't exec "mpirun": No such file or directory at /Users/martin/.eclipsesettings/rms/PLATFORM_MPI/start_job.pl line 155.
#PTP job_id=5502
running command mpirun -np 3 -hostlist i3 /Users/martin/Projects/TEST/Debug/TEST

I don't understand at all what this cryptic message is trying to tell me. Of course I've looked into this Perl script, start_job.pl, but I don't know why it is important, why it is used at all, and why it fails. It seems to me that the PTP system works in such a way that, depending on the choice of Target System Config, one of these scripts in the directory .eclipsesettings is called, which is then apparently supposed to generate either a batch file or something that finally runs the command line "mpirun …..".

Apparently, I don't really understand the logic behind this PTP system, so could anybody explain it to me in plain words? Why are these odd scripts called, and which target system should I actually pick? Should I create my own target system?
My plan is simply to use the default MPI installation on my Mac (or I could also use MPICH) to test my own MPI codes with Eclipse. But unfortunately, not even the simple sample code runs.

Can anybody help me to get this running?

Best regards,
Martin.

_______________________________________________
ptp-user mailing list
ptp-user@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/ptp-user


--
...Beth

Beth Tibbitts
beth@xxxxxxxxxx
