Jay, Philip, all,
Permit me to add my two cents as well, based on what I've seen at
the (mainly synchrotron) institutions that we've been working with.
I'm sure Matt and others have a much deeper knowledge of this, and
they can correct me, add more info etc. I just want to trigger some
thoughts here ;-)
At the synchrotrons I've seen ideas and efforts that aim to share
datasets not only between different users, but also between
different tools (interactive and automated).
Another topic is that individual datasets keep getting bigger, and
so does their number. So much so that engineering has to provide
storage and access services for the (visiting) scientists, and the
moment is getting near when the raw data is no longer easily
transportable, e.g. to a scientist's home institution.
Some aspects of this:
1. common / standard storage formats, i.e. the technical format /
   filetype
2. within those technical "containers", trying to standardize
   nomenclature & functional structures, and providing abstraction
   tools that can work with different storage variations in a
   common way
3. large storage infrastructures and corresponding tools/APIs to
   allow storing huge volumes of datasets and make them accessible
   in a controlled manner
4. ...
E.g.:
1. is about HDF5, NeXus, ...
2. DAWN's loaders, Soleil's CDMA; I remember that Philip also
   described ChemClipse's related features for working with many
   different file formats ...
3. e.g. ICAT, http://pan-data.eu/about, ...
The Science IWG would be a great place to collect cross-domain
requirements and to extract know-how, APIs, libraries, ... to
integrate some of these aspects!
cheers
erwin
On 15/03/2016 at 13:14, Jay Jay Billings wrote:
Philip,
That's an interesting idea. Will you write a little
more about what you have in mind?
And a definite +1 for Tobias' and Philip's suggestions.
cheers
erwin
On 15/03/2016 at 08:38, Philip Wenig wrote:
Jay, everyone,
it would be great if we could also include under section "Scope":
* Storing scientific data and enabling easy exchange between
  researchers.
Best,
Philip
On 15.03.2016 at 08:33, Tobias Verbeke wrote:
Hi Jay,
Thanks for the revisions.
From: "Jay Jay Billings" <jayjaybillings@xxxxxxxxx>
To: "Science Industry Working Group" <science-iwg@xxxxxxxxxxx>
Sent: Tuesday, March 15, 2016 4:27:02 AM
Subject: Re: [science-iwg] Science Top Level Project draft
Tobias, everyone,
Here is an updated version with most of
your changes worked in one way or the
other. Please let me know what you think.
Two things:
* BSD is already mentioned next to EDL.
* Exporting cryptographic algorithms is a violation of US export
  control laws. So we need to explicitly say that we will not work
  on cryptography or its applications.
OK... Adding a sentence that provides the background (along the
lines of 'In order to comply with ..., ') may be useful.
Best,
Tobias
Compression and anonymization are separate subjects, regardless of
the origins of the techniques.
- modeling and simulation is only one way to collect (in this case
  generate) and analyze scientific data (be it in the physical or
  social sciences); this could be broadened to collection and
  analysis of sample survey data and experimental data
- maybe add economics to the social sciences?
- tools and libraries for statistics, machine learning, artificial
  intelligence, data mining, text mining
- data structures should not be limited to 3D; we want to live in
  more dimensions
- visualization: a detail, but 4D is not uncommon (including e.g.
  time dimension)
- I don't understand why we should explicitly exclude (the
  mathematics of) cryptography; certain anonymization or
  privacy-protecting procedures as applied to scientific data can
  use cryptographic techniques
- I would use Revised BSD or 3-clause BSD next to (or rather than)
  EDL; I understand the inclination to use EDL, but the name is not
  widely known (e.g. Wikipedia knows EPL but not EDL) and, if it
  does not make any practical difference, the other names are much
  more recognizable and therefore reassuring to people.
Just my two eurocents.
Best,
Tobias
From: "Jay Jay Billings" <jayjaybillings@xxxxxxxxx>
To: "Science Industry Working Group" <science-iwg@xxxxxxxxxxx>
Sent: Monday, March 14, 2016 7:09:44 PM
Subject: Re: [science-iwg] Science Top Level Project draft
Everyone,
This draft is under review by the Steering Committee too and we are
going to review it one final time on Wednesday. We will share our
thoughts with you then. Ideally we would have as much community
feedback addressed in the document as possible before we submit it
to the Foundation. So please speak up if you have ideas!
A draft of the Science Top Level Project can be viewed and
commented on here. If you're interested in edit access, just ask
and I'll grant it to you.
A few folks asked at our annual meeting what a Top Level Project
(TLP) is. It's a code-less project that provides vetting for
important practices of the projects hosted beneath it. The members
of the TLP are called the Project Management Committee (PMC).
Here's a page listing the kinds of things the PMC does:
https://wiki.eclipse.org/PMC
A couple of key ones are:
* reviewing, discussing, and approving/rejecting CQ requests before
  they go to the intellectual property (IP) team.
* some checks & balances related to committer elections
--
~~~~~~~~~~~~~~~~~~~~~~~~
OpenChrom - the open source alternative for chromatography / mass spectrometry
Dr. Philip Wenig » Founder » philip.wenig@xxxxxxxxxxxxx » http://www.openchrom.net
~~~~~~~~~~~~~~~~~~~~~~~~