Bug 92039 - DBCS3.1: Hello World plug-in fails for some DBCS strings
Summary: DBCS3.1: Hello World plug-in fails for some DBCS strings
Status: RESOLVED INVALID
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Core
Version: 3.1
Hardware: PC Windows XP
Importance: P3 normal
Target Milestone: 3.1 RC1
Assignee: Olivier Thomann
QA Contact:
URL:
Whiteboard:
Keywords: vm
Depends on:
Blocks:
 
Reported: 2005-04-20 03:30 EDT by Henry Huang
Modified: 2005-08-24 12:58 EDT
CC List: 7 users

See Also:


Attachments
setting the Action Class Name (116.07 KB, image/jpeg)
2005-04-20 03:32 EDT, Henry Huang
Running the Eclipse Application (127.40 KB, image/jpeg)
2005-04-20 03:35 EDT, Henry Huang
A zip archive of the project (6.51 KB, application/octet-stream)
2005-04-20 23:18 EDT, Henry Huang
An updated .zip archive of the project, using Export (5.77 KB, application/octet-stream)
2005-04-21 04:55 EDT, Henry Huang
DBCS .java file for Action (1.74 KB, application/octet-stream)
2005-04-22 01:35 EDT, Henry Huang
DBCS .class file for Action (1.29 KB, application/octet-stream)
2005-04-22 01:36 EDT, Henry Huang
A screenshot of a similar error using the same Chinese string (79.57 KB, image/jpeg)
2005-04-26 01:17 EDT, Henry Huang

Description Henry Huang 2005-04-20 03:30:56 EDT
OS: Windows XP
Language: Traditional Chinese
Build level: Apr 18
JDK version: IBM JDK 1.4.2 SP1a
Test case #: 4.3

Summary: DBCS3.1: Hello World plug-in fails for some DBCS strings

Steps to recreate problem:
1-Use the Plug-In Template to create a Hello World plug-in.
2-For Action Class Name, enter the DBCS string "我是黃欣立" (in Traditional 
Chinese).

3-Run As an Eclipse Application.

 ....

Error:  The Hello message is inaccessible, and an error message appears in the 
console saying that 我是黃欣立 cannot be found.


Expected Result:  The Hello message should be accessible.
Comment 1 Henry Huang 2005-04-20 03:32:19 EDT
Created attachment 20105
setting the Action Class Name
Comment 2 Henry Huang 2005-04-20 03:35:42 EDT
Created attachment 20106
Running the Eclipse Application
Comment 3 Wassim Melhem 2005-04-20 07:01:15 EDT
The problem is that you entered illegal (i.e. Chinese) characters as the Java 
class name.
Comment 4 Steven Wasleski 2005-04-20 09:11:53 EDT
Wassim, Chinese characters are valid in Java identifiers 
(http://java.sun.com/docs/books/jls/second_edition/html/lexical.doc.html#40625),
as are most characters.
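
A quick way to check this claim is to ask the Java runtime itself. The sketch below is illustrative only: the class name IdentifierCheck and the helper isValidIdentifier are made up for this example, it tests the character U+662F (是) that is singled out later in this report, and it assumes the name contains only Basic Multilingual Plane characters. It does not reject reserved keywords.

```java
public class IdentifierCheck {
    /**
     * Returns true if every character of name is legal at its position in a
     * Java identifier. Hypothetical helper; keywords are not rejected.
     */
    static boolean isValidIdentifier(String name) {
        if (name.isEmpty() || !Character.isJavaIdentifierStart(name.charAt(0))) {
            return false;
        }
        for (int i = 1; i < name.length(); i++) {
            if (!Character.isJavaIdentifierPart(name.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // U+662F is written as an escape so the source file encoding does not matter.
        System.out.println(isValidIdentifier("\u662f")); // a CJK ideograph
        System.out.println(isValidIdentifier("1abc"));   // starts with a digit
    }
}
```

If the first call reports true, the character is acceptable to the language, and any failure has to come from somewhere else in the toolchain.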
Comment 5 Wassim Melhem 2005-04-20 09:15:08 EDT
Thanks.  That was news to me.
So is the encoding of the workspace set correctly?
Comment 6 Henry Huang 2005-04-20 21:59:58 EDT
I have tested with both UTF-8 and MS950, under Preferences, Editors, and also 
under Project Properties, and there seems to be no difference in the results.

What does seem to make a difference, however, is the length of the entire 
identifier, including the Project Name, the word "actions", and the actual 
class name.

It seems that for certain lengths of this particular identifier, the Hello 
World plug-in fails, and for certain other lengths, the plug-in succeeds.  I 
will try to attach an archive of the actual workspace I used to test this.

Comment 7 Henry Huang 2005-04-20 23:18:45 EDT
Created attachment 20167
A zip archive of the project

I have attached an archive of the project.
Comment 8 Henry Huang 2005-04-21 04:55:20 EDT
Created attachment 20175
An updated .zip archive of the project, using Export

This .zip file contains the Project related to the test case.

It seems that not only DBCS length, but also choice of DBCS character helps
determine the outcome of running the Hello World plug-in.  Project name might
be unimportant.
Comment 9 Wassim Melhem 2005-04-21 22:51:01 EDT
I imported the project into my workspace and the Java compiler complains about 
the name of the Java class.

Does the project actually compile in your workspace? If so, do you have any 
special settings?
Comment 10 Henry Huang 2005-04-22 01:26:17 EDT
I believe you don't need special settings for this project.

The .zip archive I added contains .class and .java files in the actions 
subdirectory.  Both of these files have corrupted filenames after the Export is 
performed in Eclipse.  On the other hand, in the Workspace copy of the project,
both the .class and .java files have valid filenames, in both the working and 
non-working versions of the Hello World plug-in.  This may be an independent 
problem, by the way, because the .java and .class files were corrupted 
immediately after performing the Export.

I will try to attach files from the Workspace copy of the project.

Comment 11 Henry Huang 2005-04-22 01:35:18 EDT
Created attachment 20221
DBCS .java file for Action
Comment 12 Henry Huang 2005-04-22 01:36:51 EDT
Created attachment 20222
DBCS .class file for Action
Comment 13 Wassim Melhem 2005-04-22 02:34:51 EDT
Correct.  The files got corrupted during the export due to bug 81918.
Comment 14 Wassim Melhem 2005-04-22 02:43:55 EDT
Since I can't read Chinese characters, could you please verify for me that the 
content of the plugin.xml is correct?  Thanks.
Comment 15 Henry Huang 2005-04-22 04:00:45 EDT
The Chinese characters are uncorrupted in the plugin.xml file.

I believe this problem has nothing to do with being a Plug-In project.
I have verified that the .class file generated by the successful compile of 
the .java file is rejected by the JRE, regardless of whether running from 
Eclipse or the Windows command line.

My JRE version is 1.4.2, build cn142sr1a-20050209.

It seems that for some reason, the JRE considers the .class file invalid, even 
though certain other .class files with Traditional Chinese names run with no 
problem.

Comment 16 Wassim Melhem 2005-04-22 09:45:36 EDT
As long as the plugin.xml content and the generated *.java file are correct, 
everything is OK from a PDE standpoint.

Moving to JDT to comment on the *.class generation and JRE issue.
Comment 17 Henry Huang 2005-04-26 01:17:07 EDT
Created attachment 20343
A screenshot of a similar error using the same Chinese string

This latest attachment shows the error occurring in a normal Java Project with
a generated Java Class.  As in the "Hello World" Plug-In, the Chinese string is
wrongly interpreted by the JRE to mean:

^Q/A#E

after finding the .class file.  This causes the JRE to reject the .class file.

The problem, as it stands now, is that the Chinese class name used in the .java
file, after being successfully compiled, results in a .class file that is
rejected by the JRE.

It seems to me that the JRE, by mistake, decides to interpret the first
character as a "^", rather than the first character in the Chinese string.  And
everything that follows is also wrong.
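
For background, a class name is stored inside a .class file as a constant pool string in "modified UTF-8", the same byte format that java.io.DataOutputStream.writeUTF produces (a two-byte length followed by the encoded characters). The sketch below only shows what a correctly written name looks like for a single character, U+662F (是); it is not a diagnosis of this failure, and the class name ClassNameBytes and the helper modifiedUtf8 are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class ClassNameBytes {
    /** Encodes s as a two-byte length followed by modified UTF-8 bytes. */
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked for simplicity.
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        StringBuilder hex = new StringBuilder();
        for (byte b : modifiedUtf8("\u662f")) {
            hex.append(String.format("%02x ", b & 0xff));
        }
        // Length prefix 00 03, then the three encoded bytes e6 98 af.
        System.out.println(hex.toString().trim());
    }
}
```

Comparing these bytes with the name bytes found in the attached .class file would show whether the compiler wrote the name correctly, which in turn would point at the JRE's reading of it.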
Comment 18 Tod Creasey 2005-04-26 07:48:19 EDT
There are two issues to look at first here:

1) 1.4.2 does not support DBCS class names. Have you tried this with a 1.5 VM?
2) If you are using any GB18030-2 characters, we don't support them yet.
Comment 19 Steven Wasleski 2005-04-26 08:31:22 EDT
Tod, where did you hear or read about your point 1 in comment 18?  Please see 
the link in my comment 4.  That link is into the second edition (2000) of the 
JLS.  I believe these identifiers should work.
Comment 20 Henry Huang 2005-04-26 23:22:53 EDT
We are not using any GB18030-2 characters in our Traditional Chinese 
environment.  Karen Peng is now looking into the 1.5 VM issue.

As of now, we have already seen the problem on multiple Windows machines.
We have narrowed it down to a single Traditional Chinese Character, whose 
value, in different encodings, is:

Unicode 
-1,-2,47,102

UTF-8
-26,-104,-81

Big5
-84,79

These are all decimal values.   When running native2ascii on the same character,
I get the value:
\u662f

Hexadecimal 66 is 102 in decimal, and 2f is 47 in decimal, so this matches the 
Unicode sequence above with the two bytes in reverse (little-endian) order.  
The leading -1,-2 (FF FE) is the byte order mark.

I am suspicious that this is a bug in our current JRE, which is an IBM build.  
I will try to test the same case with one offered by Sun.
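
The byte values above can be reproduced with a few lines of Java. This is a minimal sketch, not the procedure actually used for this report: the class name EncodingDump and its helper methods are made up, it uses java.nio.charset.StandardCharsets (which needs Java 7 or later, newer than the JREs discussed here), and it assumes that a charset named "Big5" may or may not be installed, so that part is guarded.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodingDump {
    // The single failing character, written as an escape so the source
    // file encoding does not affect the result.
    static final String CHAR = "\u662f";

    static byte[] utf8() {
        return CHAR.getBytes(StandardCharsets.UTF_8);
    }

    static byte[] utf16le() {
        // Little-endian UTF-16 without a byte order mark.
        return CHAR.getBytes(StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        // Java bytes are signed, so values above 127 print as negative numbers.
        System.out.println("UTF-8:    " + Arrays.toString(utf8()));
        System.out.println("UTF-16LE: " + Arrays.toString(utf16le()));
        if (Charset.isSupported("Big5")) {
            byte[] big5 = CHAR.getBytes(Charset.forName("Big5"));
            System.out.println("Big5:     " + Arrays.toString(big5));
        }
    }
}
```

The UTF-16LE output omits the byte order mark, so it corresponds to the last two values of the "Unicode" sequence listed above.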



Comment 21 Henry Huang 2005-04-27 03:24:00 EDT
We now have 3 new versions of the JRE:

Sun 1.5.0_02
IBM 1.4.2 sr2a
IBM 1.5.0

I tested on all 3 versions using a .java file with a class name consisting of a 
single Chinese character (the one that causes failure).

My test was broken into 2 parts:
1) compile the .java file using the appropriate javac compiler
2) run the .class file using the appropriate java JRE

The IBM 1.5.0 test was the only one that resulted in success.

Both the IBM 1.4.2 sr2a and Sun 1.5.0_02 tests resulted in errors.

This seems to suggest that the 1.5.0 IBM version fixed a bug not fixed in the 
Sun version of the JRE.


Comment 22 Tod Creasey 2005-04-27 08:02:54 EDT
Adding Philippe. I know these issues were part of a readme at some point.
Philippe can comment more.
Comment 23 Philipe Mulet 2005-04-27 08:28:51 EDT
I suspect it had to do with the 1.3 runtime back then. Olivier is the right 
person to find some info in bug databases.
Comment 24 Olivier Thomann 2005-05-17 16:55:04 EDT
I could not retrieve a bug from the Sun bug database, but it would be worth
reporting the problem to them with steps to reproduce.
Closing as VM bug.
Comment 25 Tod Creasey 2005-08-24 12:58:08 EDT
Steve, here is something about GB18030-2000.

http://developers.sun.com/dev/gadc/technicalpublications/articles/gb18030.html