[stem-dev] Re: Changes to the earthscience data format

Hi Jan,

point 1): very good idea
point 2): if the data represent an aggregation, then I would suggest
keeping fields 6 and 7, as only with these values can you judge
whether an average makes sense at all, i.e. whether the values follow a
Gaussian distribution.
point 3): I would suggest including the fields "median" and "N" (sample
size).
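
As a rough illustration of point 3 (a sketch only, not STEM code): the
function name `central_values` and the sample numbers below are made up
for this example. It shows how the median and N can be reported next to
the average, and how far the two can drift apart when the values are
skewed.

```python
import statistics


def central_values(values):
    """Return sample size, average and median for one aggregated cell.

    Hypothetical helper for illustration; not part of the STEM plugin.
    """
    return {
        "n": len(values),
        "average": statistics.mean(values),
        "median": statistics.median(values),
    }


# Made-up, strongly skewed sample (e.g. mostly dry days, one very wet day).
skewed = [0.0, 0.0, 0.0, 1.0, 49.0]
summary = central_values(skewed)
print(summary)
```

With a sample like this, the average is pulled up by the single large
value while the median stays at the typical value, which is the case where
the average alone can mislead.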

Best wishes,

Matthias

Matthias Filter
 
Bundesinstitut für Risikobewertung 
Fachgruppe Epidemiologie und Zoonosen
Abteilung 4  - Biologische Sicherheit
 
Federal Institute for Risk Assessment
Unit Epidemiology and Zoonoses
Department 4  - Biological Safety

Thielallee 88-92, 14195 Berlin, Germany
Tel. +49 30 18412-2209
Fax +49 30 18412-2952
matthias.filter@xxxxxxxxxxx


>>> Jan H Pieper <jhpieper@xxxxxxxxxxxxxxx> 13.04.2011 02:07 >>>


Hi everyone,

I am working on adding ten years of earthscience data into STEM (the
org.eclipse.stem.internal.data.geography.earthscience plugin).
I'd like to propose a few changes to the data format as follows:

1) Rounding: The current data files contain very precise numbers with many
digits in the fractional part. I suggest rounding most numbers to two
fractional digits, which should be sufficient for temperature (Celsius),
rainfall (millimeters), and elevation (meters). For vegetation, I suggest
we use three fractional digits because the values range from -0.1 to 0.9.
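
The rounding rule above could be sketched roughly as follows. This is
only an illustration: the function name `format_value` and the variable
keys are assumptions made for this example, not names from the STEM data
files or plugin.

```python
# Fractional digits per variable, as proposed: three for vegetation,
# two for everything else. The keys are hypothetical labels.
FRACTION_DIGITS = {
    "temperature": 2,  # Celsius
    "rainfall": 2,     # millimeters
    "elevation": 2,    # meters
    "vegetation": 3,   # values range from -0.1 to 0.9
}


def format_value(value, variable):
    """Format one number with the proposed number of fractional digits."""
    digits = FRACTION_DIGITS.get(variable, 2)
    return f"{value:.{digits}f}"


print(format_value(23.456789, "temperature"))
print(format_value(0.12367, "vegetation"))
```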

2) Data Fields: The current data files contain eight fields for each point
in time:
# Field 1: Average
# Field 2: Standard Deviation
# Field 3: Maximum
# Field 4: Minimum
# Field 5: Range
# Field 6: Skewness
# Field 7: Kurtosis
# Field 8: Root Mean Square

I propose removing skewness and kurtosis (fields 6 and 7). They are
measures of statistical distributions and cannot really be applied to
these data sets.
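
For reference, the eight fields can be computed along these lines. This
is a minimal sketch under stated assumptions, not the code that produced
the current files: it assumes population moments (dividing by n) and
excess kurtosis (a normal distribution gives 0), and the function name
`summarize` and the dictionary keys are made up for this example.

```python
import math


def summarize(values):
    """Compute the eight per-time-point fields for a list of numbers.

    Assumes population moments and excess kurtosis; the existing data
    files may use different conventions.
    """
    n = len(values)
    mean = sum(values) / n
    # Central moments, population form (divide by n).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    std = math.sqrt(m2)
    return {
        "average": mean,
        "std_dev": std,
        "maximum": max(values),
        "minimum": min(values),
        "range": max(values) - min(values),
        # Skewness and kurtosis are undefined for constant data; use 0.0.
        "skewness": m3 / std ** 3 if std > 0 else 0.0,
        "kurtosis": m4 / m2 ** 2 - 3.0 if std > 0 else 0.0,
        "rms": math.sqrt(sum(v * v for v in values) / n),
    }


fields = summarize([1.0, 2.0, 3.0, 4.0, 5.0])
print(fields)
```

Dropping fields 6 and 7 would simply mean omitting the two moment-based
entries; the other six do not depend on them.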

3) Lastly, I was wondering if there are any other derived values that
should be added to the list above.

Thanks,

  Jan :-)

Jan H. Pieper
IBM Almaden Research Center
mail: jhpieper@xxxxxxxxxxxxxxx

