Re: [equinox-dev] Equinox and UTF-8

Hello,

On Thu, 10 Jul 2008 03:01:40 -0400, BJ Hargrave <hargrave@xxxxxxxxxx>
wrote:
> Well you should not be getting bytes from a String. A String is a set of 
> Characters. Some characters may fit into bytes, but some are wider.

That is correct.
 
> Also, remember that the length of  a String is the number of characters 
> not the number of bytes into which those characters may be encoded.

I agree with you. The output

=== cut ===
§ length() = 1
§ cast to byte = -89 
§ getBytes() = -62 -89
=== cut ===

is correct. However, I get wrong output when running the same code
outside Eclipse, as a standalone Equinox application.

=== cut ===
+é-º length() = 2
+é-º cast to byte = -62 -89
+é-º getBytes() = -61 -126 -62 -89
=== cut ===

Both the result of length() and the result of getBytes() are wrong.
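One possible explanation, offered only as an assumption, is that the UTF-8
bytes of "§" are being decoded as a single-byte charset such as ISO-8859-1
somewhere along the way (for example when the source file is compiled or
read with a different default encoding). The sketch below reproduces the
numbers above under that assumption. The class name MisdecodeDemo and the
method misdecode are hypothetical and exist only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class MisdecodeDemo {
    // Simulates a wrong-charset read: takes the UTF-8 bytes of s and
    // decodes them as ISO-8859-1, one char per byte.
    static String misdecode(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Unicode escape keeps the literal independent of the source file encoding.
        String section = "\u00a7";
        String broken = misdecode(section);

        System.out.println("length() = " + broken.length());
        // Casting each char to byte, as in the output quoted above.
        for (char c : broken.toCharArray()) {
            System.out.print((byte) c + " ");
        }
        System.out.println();
        System.out.println(Arrays.toString(broken.getBytes(StandardCharsets.UTF_8)));
    }
}
```

If this is what happens, the String is already damaged before getBytes() is
called, so the place to look would be where the text enters the program
(compiler encoding or file.encoding), not the byte conversion itself.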

I discovered this behaviour while encoding Strings as Base64. The Base64
class in Apache Commons Codec takes a byte[] as input, so the resulting
Base64 String differs between the two execution environments because
getBytes() returns different results in each.
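A minimal sketch of how the byte conversion can be made independent of the
platform default charset, by naming the charset explicitly. Note that it
uses java.util.Base64 from the JDK (Java 8 or later) as a stand-in for the
Apache Commons Codec class mentioned above, and that the class name
Base64Utf8 and the method encode are hypothetical. It only helps when the
String itself holds the intended characters.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64Utf8 {
    // Encodes text as Base64 over its UTF-8 bytes, never the platform default.
    static String encode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return Base64.getEncoder().encodeToString(utf8);
    }

    public static void main(String[] args) {
        // Unicode escape keeps the literal independent of the source file encoding.
        System.out.println(encode("\u00a7")); // prints "wqc="
    }
}
```

With an explicit charset, the same String yields the same bytes and the same
Base64 text regardless of how the JVM was started.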

-- 
Holger Mense


