Hi folks,
This mail intends to
a) share a subtle encoding issue, and
b) start a discussion on how we want to treat the matter in SMILA.
Here is a description of the encoding issue I ran into.
The scenario is writing a test case for a converter pipelet, but that is just
the setting where it happened to me; it might happen again elsewhere to
someone else.
The expected result for the extracted item
is “Microsoft® Office PowerPoint® 2007” (note the ® character!).
As is common in tests, I hard-coded this value in the source code, since it is
sufficiently short. As soon as the converter worked, the unit test (UT) was
green, but only in the IDE!
When I built from the command line, the JUnit test failed, complaining that
the expected and actual values weren’t the same.
After some time of debugging without getting anywhere, I switched the default
encoding of my IDE to my system’s encoding (cp1252; it works similarly when
setting the project’s encoding for the test bundle).
Having done this, Eclipse recompiled the (whole) workspace and, et voilà, the
UT failed in the same way as it did on the console.
Vice versa, I was also able to get it green on the console by setting this
environment variable:
set JAVA_TOOL_OPTIONS=-Dfile.encoding=UTF-8
(note that you need UTF-8 and not UTF8, which I have seen on a web page)
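To check which default encoding a JVM actually picked up (for example after
setting JAVA_TOOL_OPTIONS as above), a small program like the following could
be run from both the IDE and the console. This is only a sketch; the class
name ShowDefaultEncoding is made up for illustration.

```java
import java.nio.charset.Charset;

// Hypothetical helper class, not part of SMILA.
public class ShowDefaultEncoding {

    // Reports the JVM's default charset and the file.encoding system property.
    static String describe() {
        return "default charset = " + Charset.defaultCharset().name()
                + ", file.encoding = " + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If the IDE and the console report different values, the two environments may be decoding the same source files differently.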
Reason:
The IDE writes the source file in the encoding that is configured there.
However, javac reads source files using the encoding determined by its
environment. In the IDE this is the same encoding that was used for writing
the files; on the console it might be different, since javac doesn’t know
which encoding I have set in the IDE. Usually this isn’t a problem, as we
seldom use special / non-ASCII characters in our Java code, but in this case
it happened for a good reason, and as a consequence it mattered with which
encoding the compiler reads the source files.
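The effect can be reproduced without javac by encoding the expected string in
one charset and decoding the bytes in another. The following is a minimal
sketch of that idea, not the actual test: the class name EncodingMismatchDemo
and the method decodeAs are made up, and it assumes the JVM provides the
windows-1252 charset (the one cp1252 refers to). The literal uses the \u00AE
escape so that the example itself does not depend on the source encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of SMILA.
public class EncodingMismatchDemo {

    // Simulates a file written in one encoding and read back in another.
    static String decodeAs(String text, Charset writtenIn, Charset readAs) {
        return new String(text.getBytes(writtenIn), readAs);
    }

    public static void main(String[] args) {
        // \u00AE is the registered sign (R); the escape keeps this file pure ASCII.
        String expected = "Microsoft\u00AE Office PowerPoint\u00AE 2007";
        // Assumption: windows-1252 is available in this JVM.
        Charset cp1252 = Charset.forName("windows-1252");

        // Written as UTF-8 (IDE setting), read as cp1252 (console default).
        String misread = decodeAs(expected, StandardCharsets.UTF_8, cp1252);

        System.out.println(expected.equals(misread)); // prints false
        System.out.println(misread);
    }
}
```

In UTF-8 the ® character takes two bytes, and a cp1252 reader turns each of those bytes into a separate character, so the string constant the compiler sees no longer equals the extracted value. Besides changing the environment's default, javac also accepts an -encoding option that states the source encoding explicitly.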
In the light of this and our recommendation to use UTF-8 as the default
encoding in our IDE, I suggest that we also do our builds in UTF-8.
Any thoughts and comments on your end?
If we agree on this, where will we write this down for fellow developers?
Thomas Menzel
brox IT-Solutions GmbH
An der Breiten Wiese 9
30625 HANNOVER (Germany)
Mobil: +49 (173) 369 86 76
Tel: +49 (5 11) 33 65 28 – 76
eFax: +49 (5 11) 33 65 28 – 98 76
Fax: +49 (5 11) 33 65 28 – 29
Mail: tmenzel@xxxxxxx
Web: www.brox.de
==================================
According to Section 80 of the German Corporation Act brox IT-Solutions GmbH
must indicate the following information.
Address: An der Breiten Wiese 9, 30625 Hannover Germany
General Manager: Hans-Chr. Brockmann
Registered Office: Hannover, Commercial Register Hannover HRB 59240