Re: [equinox-dev] Equinox and UTF-8


On Thu, 10 Jul 2008 03:01:40 -0400, BJ Hargrave <hargrave@xxxxxxxxxx> wrote:
> Well you should not be getting bytes from a String. A String is a set of 
> Characters. Some characters may fit into bytes, but some are wider.

That is correct.

> Also, remember that the length of  a String is the number of characters 
> not the number of bytes into which those characters may be encoded.

I agree with you. The output

=== cut ===
 length() = 1
 cast to byte = -89 
 getBytes() = -62 -89
=== cut ===

is correct. But I get wrong output when running the same code
outside Eclipse, as a standalone Equinox application.

=== cut ===
+Ã-Â length() = 2
+Ã-Â cast to byte = -62 -89
+Ã-Â getBytes() = -61 -126 -62 -89
=== cut ===

The result of length() is wrong, and so is the result of getBytes().
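
The two outputs can be reproduced outside Equinox. The bytes -62 -89 are
the UTF-8 encoding of U+00A7 (the section sign), and -61 -126 -62 -89 is
what you get when those two bytes are decoded as ISO-8859-1 and then
encoded as UTF-8 again. The sketch below simulates that. It is only an
assumption about the cause, not a confirmed diagnosis of the Equinox
launch: the class name EncodingDemo and the helper describe() are
hypothetical, and getBytes() is called with an explicit UTF-8 charset
here, whereas the original code presumably used the platform default.

```java
import java.nio.charset.StandardCharsets;

final class EncodingDemo {
    // Formats the same three values that appear in the outputs above.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        sb.append("length() = ").append(s.length()).append('\n');
        sb.append("cast to byte =");
        for (char c : s.toCharArray()) {
            sb.append(' ').append((byte) c); // keeps only the low 8 bits of each char
        }
        sb.append('\n');
        sb.append("getBytes() =");
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(' ').append(b);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Written as a Unicode escape so the encoding of this source file cannot matter.
        String intended = "\u00A7";

        // Assumed failure mode: the UTF-8 bytes of the character are decoded as ISO-8859-1.
        String misread = new String(
                intended.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);

        System.out.println(describe(intended)); // length() = 1, cast -89, bytes -62 -89
        System.out.println(describe(misread));  // length() = 2, cast -62 -89, bytes -61 -126 -62 -89
    }
}
```

If this is what happens, the String is already wrong before length() or
getBytes() is called, so the place to look is wherever the character
enters the program (source file encoding at compile time, or the
file.encoding default of the standalone launch).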

I discovered the behaviour while encoding Strings as Base64. The Base64
class in Apache Commons Codec takes a byte[] as input, so the resulting
Base64 String differs between the two execution environments because
getBytes() returns different bytes in each.
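
To show how the byte difference carries through to Base64, here is a
minimal sketch. It uses java.util.Base64 (Java 8 and later) instead of
the Commons Codec class so that it is self-contained, and it names the
charset explicitly rather than relying on the platform default. The
class name Base64Demo and the method encode() are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class Base64Demo {
    // Encodes the UTF-8 bytes of s as Base64; the charset is explicit, not the platform default.
    static String encode(String s) {
        return Base64.getEncoder().encodeToString(s.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        System.out.println(encode("\u00A7"));       // wqc=      (bytes -62 -89)
        System.out.println(encode("\u00C2\u00A7")); // w4LCpw==  (bytes -61 -126 -62 -89)
    }
}
```

An explicit charset makes the encoding step deterministic, but it does
not repair a String that was already decoded wrongly: the second call
above still encodes two characters.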

Holger Mense