I also don’t think we need to match
Java’s internal representation of strings. The only things we need
to worry about are:
1. If the data is received via a Java socket,
how easily can it be translated into a Java string?
2. If the data is sent via a Java socket,
how easily can it be gotten from a Java string?
3. If the data is received via a C socket,
how easily can it be translated into a standard C format?
4. If the data is sent via a C socket, how
easily can it be gotten from a standard C format?
5. If the data is sent to an agent which
is primarily implemented in Java, how easily can it make the JNI transition
between the C stub and the Java agent?
6. If the data is sent from an agent which
is primarily implemented in Java, how easily can it make the JNI transition
between the Java agent and the C stub?
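For the first two questions, a rough sketch of what the Java side might look like for each candidate wire form. The class and method names here are hypothetical, byte arrays stand in for the socket streams, and StandardCharsets assumes Java 7 or later:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HCE or the RAC.
final class WireStringRoundTrip {
    // Standard UTF-8: no length prefix, U+0000 is the single byte 0x00.
    public static byte[] toStandardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static String fromStandardUtf8(byte[] wire) {
        return new String(wire, StandardCharsets.UTF_8);
    }

    // Java's writeUTF form: 2-byte length, then modified UTF-8.
    public static byte[] toJavaUtf8(String s) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(buf)) {
            dos.writeUTF(s);
        }
        return buf.toByteArray();
    }

    public static String fromJavaUtf8(byte[] wire) throws IOException {
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(wire))) {
            return dis.readUTF();
        }
    }

    public static void main(String[] args) throws IOException {
        String s = "A\u0000B\u0080";
        System.out.println(toStandardUtf8(s).length); // prints 5
        System.out.println(toJavaUtf8(s).length);     // prints 8
        System.out.println(fromStandardUtf8(toStandardUtf8(s)).equals(s)); // prints true
        System.out.println(fromJavaUtf8(toJavaUtf8(s)).equals(s));         // prints true
    }
}
```

Either form is a single library call in each direction on the Java side, so the harder questions are probably the C ones.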
In the case of the third and fourth
questions, a lot depends on how we actually plan to represent character strings
in the HCE code itself. It seems likely that we’ll be translating
out of the wire format and into some “local” format early on in the
process. So we need a strategy for making the HCE code
internationalizable.
Does the current RAC worry about that?
-Andy
-----Original Message-----
From: hyades-dev-admin@xxxxxxxxxxx [mailto:hyades-dev-admin@xxxxxxxxxxx] On Behalf Of Nguyen, Hoang M
Sent: Wednesday, August 18, 2004 3:47 PM
To: hyades-dev@xxxxxxxxxxx
Subject: [hyades-dev] More info on Java UTF-8
Hello all,
If we want to adopt the Java UTF-8
form, we may want to consider adopting its data structure as well.
Here is the spec of the UTF-8 data
structure in the Java Virtual Machine (JVM):
http://java.sun.com/docs/books/vmspec/2nd-edition/html/ClassFile.doc.html#7963
· 2-byte length
· followed by the UTF-8 byte stream
· the string is not null-terminated, so the length counts only the encoded bytes
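As a rough illustration of that layout, the sketch below encodes a short string in memory with DataOutputStream.writeUTF (which uses the same 2-byte length plus modified UTF-8 form) and dumps the bytes in hex. The class and helper names are made up for this example:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical helper for inspecting the byte layout.
final class ModifiedUtf8Layout {
    // Returns the bytes writeUTF produces: 2-byte big-endian length, then the encoded string.
    public static byte[] encode(String s) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(buf)) {
            dos.writeUTF(s);
        }
        return buf.toByteArray();
    }

    // Formats bytes as space-separated, upper-case hex pairs.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) throws IOException {
        // 'A', U+0000, 'B' -> length 4, then 41, C0 80 (the null), 42
        System.out.println(hex(encode("A\u0000B"))); // prints 00 04 41 C0 80 42
    }
}
```

The length field is 4 even though the string has 3 characters, because it counts encoded bytes and the null character takes two of them.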
In addition, I have verified that
Java can also translate a null character into a single null byte (standard UTF-8).
Please see attached programs.
We can discuss more and get some
resolution on this issue in our weekly meeting.
Regards,
/*
This demo program shows
that Java can write a standard UTF-8 file
in which a null character is translated into one byte (0x00).
*/
import java.io.FileOutputStream ;
import java.io.IOException ;
import java.io.OutputStreamWriter ;
import java.io.Writer ;
public class MyUTF8Output
{
    public static void main(String[] args)
    {
        char[] msg = {'A', '\u0000', 'B', '\u0080', 'C', '\u0000'} ;
        String s = new String(msg) ;
        // try-with-resources flushes and closes the writer and the underlying file
        try (Writer osw = new OutputStreamWriter(new FileOutputStream("myoutput.txt"), "UTF-8"))
        {
            osw.write(s) ;
        }
        catch (IOException e)
        {
            // Report the failure instead of silently ignoring it
            e.printStackTrace() ;
            return ;
        }
        System.out.println("See \"myoutput.txt\" file.") ;
    }
}
/*
This demo program shows
that the Java UTF-8 format written by DataOutputStream.writeUTF is:
- 2-byte length of the UTF-8 buffer
- null character is mapped into two bytes (0xC0 0x80)
*/
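import java.io.DataOutputStream ;
import java.io.FileOutputStream ;
import java.io.IOException ;
public class MyUTF8Conversion
{
    public static void main(String[] args)
    {
        char[] msg = {'A', '\u0000', 'B', '\u0080', 'C', '\u0000'} ;
        String s = new String(msg) ;
        // try-with-resources flushes and closes the stream and the underlying file
        try (DataOutputStream dos = new DataOutputStream(new FileOutputStream("myoutput2.txt")))
        {
            // writeUTF emits a 2-byte length followed by modified UTF-8
            dos.writeUTF(s) ;
        }
        catch (IOException e)
        {
            // Report the failure instead of silently ignoring it
            e.printStackTrace() ;
            return ;
        }
        System.out.println("See \"myoutput2.txt\" file.") ;
    }
}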