Re: [science-iwg] project proposals

Erwin, 

Sorry for taking so long to get back to you. PTP is a workbench for developing C/C++ and Fortran parallel (MPI and OpenMP) applications. In addition to the normal development activities, such as coding, debugging, etc., we also provide a module to launch and monitor applications on a remote target system from within the workbench. 

The launching capability uses a framework that supports arbitrary resource managers/job schedulers. Support for resource managers can be added by customizing an XML description file. Users submit their jobs via the normal Eclipse launch configuration mechanism that has been tailored to suit the target system resource manager. Support is currently available for PBS/Torque, LSF, LoadL, GE, SLURM, interactive Open MPI, interactive MPICH2, and generic remote launch.
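As a rough illustration of the description-driven idea above, here is a minimal Java sketch in which each resource manager is reduced to a small record of its submit and queue-listing commands. The class and method names are hypothetical and this is not PTP's actual API or XML schema; only the scheduler command names (qsub, qstat, sbatch, squeue, bsub, bjobs, llsubmit, llq) are the standard tools of those systems.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: these are NOT PTP's real classes or its XML schema.
class SchedulerRegistrySketch {

    // Minimal description of one resource manager: its name plus the
    // command-line tools used to submit a job and to list the queue.
    static final class Description {
        final String name;
        final String submitCommand;
        final String queueCommand;

        Description(String name, String submitCommand, String queueCommand) {
            this.name = name;
            this.submitCommand = submitCommand;
            this.queueCommand = queueCommand;
        }
    }

    private final Map<String, Description> byName = new LinkedHashMap<>();

    // Adding support for a new system means registering one more description.
    void register(Description description) {
        byName.put(description.name, description);
    }

    Description lookup(String name) {
        Description description = byName.get(name);
        if (description == null) {
            throw new IllegalArgumentException("No description registered for " + name);
        }
        return description;
    }

    // A few of the systems named in the message, with their standard tools.
    static SchedulerRegistrySketch withDefaults() {
        SchedulerRegistrySketch registry = new SchedulerRegistrySketch();
        registry.register(new Description("PBS/Torque", "qsub", "qstat"));
        registry.register(new Description("GE", "qsub", "qstat"));
        registry.register(new Description("SLURM", "sbatch", "squeue"));
        registry.register(new Description("LSF", "bsub", "bjobs"));
        registry.register(new Description("LoadL", "llsubmit", "llq"));
        return registry;
    }

    public static void main(String[] args) {
        Description slurm = withDefaults().lookup("SLURM");
        System.out.println(slurm.name + ": " + slurm.submitCommand + " / " + slurm.queueCommand);
    }
}
```

In a real framework the descriptions would be loaded from files rather than hard-coded, so that a new scheduler needs no code change.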

The monitoring framework provides the ability to monitor job queue and system information from within the workbench. It provides customizable views that show system and job information, and supports a wide variety of target resource managers. It is easily customizable to support other systems.

We did look at using DRMAA a few years ago, but decided to “roll our own” as support was very limited at the time. In addition, by using a simple framework we are able to provide a very easy out-of-the-box experience for users, as no additional installation or configuration is necessary to begin submitting and monitoring jobs.

Currently both the launching and monitoring frameworks are loosely integrated with PTP. It would be fairly easy to extract them as separate libraries that other tools could use if that would be of interest. I think it would also be fairly easy to add DRMAA support, but I’m not sure what the advantage of doing that would be.

I’d be happy to discuss this more and explore options for combining the two projects.

Regards,
Greg



On Apr 21, 2015, at 12:12 PM, Erwin de Ley <erwin.de.ley@xxxxxxxxxx> wrote:

Hi Greg,

With Diamond LS we've built a Java-based API to SGE via DRMAA, packaged as OSGi bundles.
DRMAA seemed to be the cleanest approach to have a standardized API on different resource management systems.
(but most implementations seem to be C-based :-( + a JNI binding for some of them)

Besides being able to submit jobs etc. from an application running on a grid node (which is the default for a DRMAA/SGE implementation), we've added a JMX-based remote-submission "gateway" (provider+consumer) and a REST-based service-wrapper on it.

The goal is to be able to submit jobs from automated processes/workflows. A typical use case would be to slice scientific data sets in chunks, submit them all as separate jobs where each job runs a data analysis workflow, wait for the results and then merge them etc.
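The slice / submit / wait / merge pattern described above can be sketched in plain Java. This is only an illustration under stated assumptions: a local thread pool stands in for the grid, the "analysis workflow" is a simple sum, and all class and method names are hypothetical. It does not use DRMAA or the dawn-hpc code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: a thread pool stands in for the grid scheduler.
class ScatterGatherSketch {

    // Slice the data set into consecutive chunks of at most chunkSize elements.
    static List<List<Double>> slice(List<Double> data, int chunkSize) {
        List<List<Double>> chunks = new ArrayList<>();
        for (int start = 0; start < data.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, data.size());
            chunks.add(new ArrayList<>(data.subList(start, end)));
        }
        return chunks;
    }

    // Stand-in for the per-chunk data analysis workflow: sums the chunk.
    static double analyse(List<Double> chunk) {
        double sum = 0.0;
        for (double value : chunk) {
            sum += value;
        }
        return sum;
    }

    // Submit one job per chunk, wait for every result, then merge them.
    static double run(List<Double> data, int chunkSize, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Double>> pending = new ArrayList<>();
            for (List<Double> chunk : slice(data, chunkSize)) {
                Callable<Double> job = () -> analyse(chunk);
                pending.add(pool.submit(job));
            }
            double merged = 0.0;
            for (Future<Double> job : pending) {
                merged += job.get(); // blocks until this job has finished
            }
            return merged;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for jobs", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A job failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Double> data = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            data.add((double) i);
        }
        System.out.println(run(data, 3, 4)); // prints 55.0
    }
}
```

With a real grid, the submit step would hand each chunk to the resource manager and the wait step would poll or block on job completion, but the overall shape stays the same.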

If you want to dig a bit deeper, you can review the available code at https://github.com/DawnScience/dawn-hpc

SLURM integration could become relevant at another site. I would negotiate with them to see whether they are willing to release the code as open source, and then we would develop it as a DRMAA implementation and offer it here. As far as I know there is no Java-based SLURM DRMAA "binding" available yet. I did find a C-based one in Poland...

If I understand the PTP docs correctly, PTP is oriented towards C- and Fortran-based jobs and the configuration of resource management systems in a workbench. Is that correct?
It would indeed be great if PTP already offers resource management and job submission services and we could find a way to integrate them behind a Java/DRMAA facade!

Would the above be of interest?

regards
erwin



Greg Watson wrote on 21/04/2015 at 16:57:
What sort of APIs are you proposing for accessing computing grids, e.g. Java APIs or Eclipse APIs? PTP already has a target system configuration framework for controlling, launching, and monitoring resource management systems like job schedulers. Are you thinking of something along those lines? PTP already supports SLURM and SGE (as well as PBS, Torque, LSF, etc.). If you wanted to open up the APIs or make the implementation more portable, I’m sure we’d be interested in that.

Regards,
Greg

On Apr 21, 2015, at 8:54 AM, Erwin de Ley <erwin.de.ley@xxxxxxxxxx> wrote:

Dear all,

I would like to propose 2 new projects :
- a set of APIs and impls in the domain of HPC, computing grids etc
- a move of our Passerelle process engine from eclipselabs to a formal eclipse project

Would this be of interest for science IWG? If so, read on ;-)...
Any feedback is of course welcome!

kind regards,
erwin

More info :
=========

1. in the HPC domain :
- APIs and impls for accessing computing grids (cfr DRMAA, SGE, SLURM, ...)
- other clustering-related tools
- memory grids
- etc

This would contain an initial code drop from the DAWN repos at github with just the DRMAA and SGE grid access.
Matt Gerring (DAWN lead) supports this move.

In the near future we would be extending this to a DRMAA-implementation for SLURM.
Another next task is upgrading from DRMAA v1 to v2.
Other topics are less/not concrete right now, and would depend on requests from science IWG or from additional committers.

Open questions :
- What would be a good name? Can we claim a generic name like "science HPC" or so? (and then hope that there's sufficient participation to enlarge the scope to other HPC-related tools)
- Where to put this? Is this a sufficient scope for a new eclipse project? Or should it be a component of a parent project?
- I guess this would become a technology project?


2. a move of our Passerelle workflow engine&workbench from the current eclipselabs@Google hosting to a formal eclipse project
The initial code drop would include a minimized Passerelle core, built on top of a new OSGi-ified version of Ptolemy (the underlying actor-based hybrid modeling software of UC Berkeley)
(Ptolemy is and would remain hosted by UC Berkeley, so their sources would not move to eclipse)

The current GEF-based Passerelle model editor would no longer be maintained, and would be replaced by an EMF&Graphiti-based one that would become the future Ptolemy model editor.
(i.e. the editor would no longer be specific for Passerelle)

For the longer term we would be extending Passerelle towards the needs for "reproducible science" (cfr http://www.reproduciblescience.org).

Open questions/issues :
- as Passerelle is the basis for our production software, with frequent releases, we can't be blocked for too long in an incubation phase (which seems to prevent formal releases).
- would this be a technology or a tools project?


--

Kind regards

Erwin De Ley


Tel. +32 9 335 22 10
fax: +32 9 335 22 19
erwin.de.ley@xxxxxxxxxx
iSencia Belgium
Voorhavenlaan 31 bus 11

B-9000 Gent

www.isencia.be



_______________________________________________
science-iwg mailing list
science-iwg@xxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.eclipse.org/mailman/listinfo/science-iwg


