Re: [science-iwg] Data Structures - part deux

Actually, a NumPy (plus many aspects of the SciPy extension) for Java already exists; it's called JScience ;-)

While UOMo does not try to reinvent the special number types that JScience provides, the Science/Units parts (pretty much everything relevant in SciPy.constants) are defined in Eclipse UOMo, too. JScience 5 (currently non-final) and UOMo 0.6 (in Incubation, but a stable Milestone implementing all of Unit-API 0.6) both implement the same API.

JScience 4 did so with the rejected JSR 275. Pending the CQ that we stopped earlier over license issues, UOMo.next will do so for JSR 363, the Java standard that passed the Public Review stage at which JSR 275 failed.

Where UOMo, either by itself or with other modules, can help January the same way SciPy complements NumPy, that's exactly what we're interested in offering.

Werner 


On Wed, Jan 27, 2016 at 4:03 PM, <science-iwg-request@xxxxxxxxxxx> wrote:


Today's Topics:

   1. Re: Data Structures - part deux (Jay Jay Billings)
   2. Re: Data Structures - part deux (Jay Jay Billings)
   3. Re: Data Structures - part deux (Greg Watson)


----------------------------------------------------------------------

Message: 1
Date: Wed, 27 Jan 2016 09:56:31 -0500
From: Jay Jay Billings <jayjaybillings@xxxxxxxxx>
To: Science Industry Working Group <science-iwg@xxxxxxxxxxx>
Subject: Re: [science-iwg] Data Structures - part deux
Message-ID:
        <CAE3ybv4vER4d3qHtYBfBtDaoOxZiGgvo1PumSsHkhe71YXZ+yw@xxxxxxxxxxxxxx>
Content-Type: text/plain; charset="utf-8"

Tracy,

Thanks for getting this going. First, let me say that if we want January to
be just about 'numpy for Java,' that is completely OK with me. We should
just make that clear in the scope. In that case, we would be looking more
at ICE and EAVP using January instead of the data structures from ICE and
EAVP being moved into January.

I just shared a description of our data structures with Matt on the other
thread. I have expanded it and share it below.

Jay

-----

Here's the code:

https://github.com/eclipse/ice/tree/master/org.eclipse.ice.datastructures

The goal of this package is to create general-purpose data classes,
structures and pattern realizations that can be mapped to a wide range of
scientific problems while also maintaining metadata about that information.
They are also all bound with JAXB so that they can be persisted to XML.
Their design is verbose so that developers can almost immediately know how
to pack their data into the classes.

They are, in a sense, the exact opposite of IDataSet because they are
designed to store "higher-level" quantities meant for direct consumption by
users (as opposed to reduction into a plot, etc.). We store all raw,
n-dimensional data in files and link to those files through our
ResourceComponent.

Our long-term goals with this are to switch this to an EMF model, optimize
the way metadata is stored, use IDataSet to back structures like
MatrixComponent and ResourceComponent (ILazyDataSet in this case), and
allow developers to create their own Component implementations simply
through annotations.

Consider, for example, a battery. If the state of that battery would be
represented on disk by five quantities - say a string, two integers and two
floats - and each of those quantities has associated metadata such as
descriptions, ids, names, etc., then we could map them as follows:

Battery --> 1 instance of DataComponent
Quantities 1-5 --> 5 instances of Entry
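
To make that mapping concrete, here is a minimal Java sketch. EntrySketch,
DataComponentSketch and the five quantity names are hypothetical stand-ins
invented for illustration; they are not the actual ICE classes or their
signatures, and the JAXB bindings are left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for an Entry: one value plus its metadata.
final class EntrySketch {
    final int id;
    final String name;
    final String description;
    final Object value;

    EntrySketch(int id, String name, String description, Object value) {
        this.id = id;
        this.name = name;
        this.description = description;
        this.value = value;
    }
}

// Hypothetical stand-in for a DataComponent: a named group of entries.
final class DataComponentSketch {
    final String name;
    private final List<EntrySketch> entries = new ArrayList<>();

    DataComponentSketch(String name) {
        this.name = name;
    }

    // Returns this so that entries can be added in a chain.
    DataComponentSketch add(EntrySketch entry) {
        entries.add(entry);
        return this;
    }

    List<EntrySketch> entries() {
        return Collections.unmodifiableList(entries);
    }
}

final class BatterySketch {
    // One component for the battery, five entries for its quantities:
    // a string, two integers and two floats (all names are made up).
    static DataComponentSketch build() {
        return new DataComponentSketch("Battery")
            .add(new EntrySketch(1, "chemistry", "Cell chemistry", "Li-ion"))
            .add(new EntrySketch(2, "cells", "Number of cells", 6))
            .add(new EntrySketch(3, "cycles", "Charge cycles so far", 120))
            .add(new EntrySketch(4, "voltage", "Terminal voltage in volts", 3.7f))
            .add(new EntrySketch(5, "capacity", "Capacity in amp-hours", 2.5f));
    }

    public static void main(String[] args) {
        for (EntrySketch e : build().entries()) {
            System.out.println(e.id + " " + e.name + " = " + e.value
                + " (" + e.description + ")");
        }
    }
}
```

The point of the sketch is only that the metadata travels with each value,
so a consumer can present the quantities without a separate schema.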

Let's consider another example: a 3D geometry. In this case, the developer
would use a GeometryComponent and the associated CSG tree (which is moving
to EAVP) to create a 3D geometry constructed from shapes and boolean
operations on those shapes. Alternatively, they could construct that
geometry purely from a mesh using a MeshComponent and Edges, Vertices, etc.
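
As a rough illustration of the CSG idea (shapes combined by boolean
operations), here is a small Java sketch. Every type name in it is
hypothetical and chosen for this example; it does not reflect the
GeometryComponent API in ICE or EAVP, and it only answers point-containment
queries rather than producing renderable geometry.

```java
// Hypothetical solid: anything that can say whether a point lies inside it.
interface SolidSketch {
    boolean contains(double x, double y, double z);
}

// Leaf of the CSG tree: a sphere given by its center and radius.
final class SphereSketch implements SolidSketch {
    private final double cx, cy, cz, r;

    SphereSketch(double cx, double cy, double cz, double r) {
        this.cx = cx;
        this.cy = cy;
        this.cz = cz;
        this.r = r;
    }

    public boolean contains(double x, double y, double z) {
        double dx = x - cx, dy = y - cy, dz = z - cz;
        return dx * dx + dy * dy + dz * dz <= r * r;
    }
}

// Interior nodes of the CSG tree: boolean operations on two child solids.
final class UnionSketch implements SolidSketch {
    private final SolidSketch a, b;

    UnionSketch(SolidSketch a, SolidSketch b) {
        this.a = a;
        this.b = b;
    }

    public boolean contains(double x, double y, double z) {
        return a.contains(x, y, z) || b.contains(x, y, z);
    }
}

final class IntersectionSketch implements SolidSketch {
    private final SolidSketch a, b;

    IntersectionSketch(SolidSketch a, SolidSketch b) {
        this.a = a;
        this.b = b;
    }

    public boolean contains(double x, double y, double z) {
        return a.contains(x, y, z) && b.contains(x, y, z);
    }
}

final class DifferenceSketch implements SolidSketch {
    private final SolidSketch a, b;

    DifferenceSketch(SolidSketch a, SolidSketch b) {
        this.a = a;
        this.b = b;
    }

    public boolean contains(double x, double y, double z) {
        return a.contains(x, y, z) && !b.contains(x, y, z);
    }
}

final class CsgSketch {
    // A lens: the overlap of two unit spheres whose centers are 1 apart.
    static SolidSketch lens() {
        return new IntersectionSketch(
            new SphereSketch(0, 0, 0, 1),
            new SphereSketch(1, 0, 0, 1));
    }

    public static void main(String[] args) {
        System.out.println(lens().contains(0.5, 0, 0));
        System.out.println(lens().contains(-0.5, 0, 0));
    }
}
```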

Other classes, such as ListComponent, offer generic solutions for storing
whatever data structure a user can come up with, so long as they provide
JAXB bindings on that class so that it can be written to disk.

After that, any collection of Components, etc. is stored in a root class
called Form that is processed by the workflow engine and the UI. All of
this creates a single gigantic tree structure that can be walked in O(N)
time by smartly implementing the IComponentVisitor interface.
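
The O(N) walk can be sketched with a plain visitor over a tree, as below.
This is an assumption-laden illustration: VisitorSketch, ComponentSketch
and the other names are hypothetical, and the real IComponentVisitor
interface has its own set of visit methods that are not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical visitor with one visit method per node type.
interface VisitorSketch {
    void visit(LeafSketch leaf);
    void visit(GroupSketch group);
}

// Hypothetical component: it only knows how to dispatch to a visitor.
interface ComponentSketch {
    void accept(VisitorSketch visitor);
}

final class LeafSketch implements ComponentSketch {
    final String name;

    LeafSketch(String name) {
        this.name = name;
    }

    public void accept(VisitorSketch visitor) {
        visitor.visit(this);
    }
}

final class GroupSketch implements ComponentSketch {
    final String name;
    final List<ComponentSketch> children = new ArrayList<>();

    GroupSketch(String name, ComponentSketch... kids) {
        this.name = name;
        for (ComponentSketch kid : kids) {
            children.add(kid);
        }
    }

    public void accept(VisitorSketch visitor) {
        visitor.visit(this);
    }
}

// Visits every node exactly once, so the walk is linear in the node count.
final class CountingVisitor implements VisitorSketch {
    int visited = 0;

    public void visit(LeafSketch leaf) {
        visited++;
    }

    public void visit(GroupSketch group) {
        visited++;
        for (ComponentSketch child : group.children) {
            child.accept(this);
        }
    }
}

final class WalkSketch {
    static int countNodes(ComponentSketch root) {
        CountingVisitor visitor = new CountingVisitor();
        root.accept(visitor);
        return visitor.visited;
    }

    // A tiny tree: a root group holding one nested group and one leaf.
    static ComponentSketch sampleForm() {
        return new GroupSketch("Form",
            new GroupSketch("Battery",
                new LeafSketch("voltage"),
                new LeafSketch("capacity")),
            new LeafSketch("notes"));
    }

    public static void main(String[] args) {
        System.out.println(countNodes(sampleForm()));
    }
}
```

Double dispatch through accept() keeps type checks out of the traversal,
which is presumably what makes a single pass over the whole tree practical.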


On Wed, Jan 27, 2016 at 8:34 AM, Tracy Miranda <tracy@xxxxxxxxxxxxxxxx>
wrote:

> Hi all,
>
> Following on from feedback for the January project proposal
> <https://projects.eclipse.org/proposals/january> this is a thread for
> clarifying the scope and what the project should encompass.
>
> As a sort-of self-appointed product manager I'm looking at it from the
> user perspective trying to answer these questions:
> *- What is it all about?*
> *- What problems does it solve?*
> *- Who really gives a damn?*
>
> For the initial proposal, touted as a 'numpy for Java' I have good answers
> for all those questions (mainly from the proposal itself, and work on
> python integration with Java).
>
> When it comes to expanding the scope, I'm guilty of getting excited about
> integrating all the tools and not necessarily understanding what the
> structures are or are good for and how they all fit together.
>
> I am certainly aware of specific use-cases beyond the current nd-array
> implementation, especially for the Triquetrum project, but it's pretty
> limited.
>
> So maybe best to start with both the ICE and EAVP data structures first -
> do some good knowledge transfer from Jay on the types of structures we are
> talking about and the use cases for these...
>
> Tracy
>
>
>
> _______________________________________________
> science-iwg mailing list
> science-iwg@xxxxxxxxxxx
> To change your delivery options, retrieve your password, or unsubscribe
> from this list, visit
> https://dev.eclipse.org/mailman/listinfo/science-iwg
>



--
Jay Jay Billings
Oak Ridge National Laboratory
Twitter Handle: @jayjaybillings

------------------------------

Message: 2
Date: Wed, 27 Jan 2016 10:01:36 -0500
From: Jay Jay Billings <jayjaybillings@xxxxxxxxx>
To: Science Industry Working Group <science-iwg@xxxxxxxxxxx>
Subject: Re: [science-iwg] Data Structures - part deux
Message-ID:
        <CAE3ybv4dz5t7=Joihk6Tv8PuTX8fdNA7sZ=iCV2N8xyfA7ob=g@xxxxxxxxxxxxxx>
Content-Type: text/plain; charset="utf-8"

Tracy,

Also, if you are the "self-appointed project manager" and will be
contributing to the work, then you should be listed as a committer on the
proposal.

Jay




--
Jay Jay Billings
Oak Ridge National Laboratory
Twitter Handle: @jayjaybillings

------------------------------

Message: 3
Date: Wed, 27 Jan 2016 10:03:05 -0500
From: Greg Watson <g.watson@xxxxxxxxxxxx>
To: Science Industry Working Group <science-iwg@xxxxxxxxxxx>
Subject: Re: [science-iwg] Data Structures - part deux
Message-ID: <4B90F415-E7CE-44AD-8828-337F125BA2B9@xxxxxxxxxxxx>
Content-Type: text/plain; charset="utf-8"

There seems to be enough commonality between this proposal and the ICE work to warrant it being in the same project. It is very easy to structure a project so that components are separate, and allow people to obtain only the functionality they are interested in. This would also not preclude using the "NumPy for Java" tag, which is a good way of enhancing interest in the project.

You really want to avoid having a project with a small group of developers who have too much "ownership". Projects like this tend to alienate other contributors, and have a lifespan that depends on the initial developers' ability to continue contributing. It is much more advantageous to have a broader community at the beginning, because you will be able to leverage the enthusiasm and expertise of a larger group in order to create an ecosystem around the project.

Notwithstanding this, clarifying the goals and objectives of the project is essential. Having a clearly articulated scope will help both contributors and users understand how they can get value out of it.

My 2 cents worth.

Greg


------------------------------


End of science-iwg Digest, Vol 36, Issue 22
*******************************************

