Bug 14435

Summary: GB18030: Text Editor cannot read files saved in unicode or UTF8 with GB18030 characters
Product: [Eclipse Project] Platform
Reporter: Tod Creasey <Tod_Creasey>
Component: UI
Assignee: Kai-Uwe Maetzel <kai-uwe_maetzel>
Status: RESOLVED DUPLICATE
QA Contact:
Severity: normal
Priority: P3
Keywords: nl
Version: 1.0   
Target Milestone: ---   
Hardware: PC   
OS: Windows XP   
Whiteboard:
Bug Depends on:    
Bug Blocks: 13591    

Description Tod Creasey CLA 2002-04-23 13:01:01 EDT
The Text editor cannot read files with GB18030 characters even if they are 
saved as unicode or UTF8.

STEPS
1) Create a text file in Notepad on a GB18030 enabled machine
2) Save the file as unicode, big endian unicode and UTF8.
3) Import all 3 files into Eclipse
4) Set your text font to a GB18030 font
5) Open all 3 - none will have the correct contents
6) Exit eclipse and open the files using Notepad - all will be correct
7) Start Eclipse again
8) Create a new text file with GB18030 characters
9) Save the file
10) Reopen it in Eclipse - content is ??????
11) Exit Eclipse and open the file in Notepad - content is ???????

The text editor cannot read or save unicode characters in a GB18030 system
Comment 1 Tod Creasey CLA 2002-04-23 13:11:32 EDT
Not just a GB18030 issue - the same thing happens in Japanese
Comment 2 Nick Edgar CLA 2002-04-24 15:42:41 EDT
These are two separate problems:
1. lack of UTF8 or UTF16 support
2. characters lost even when sticking to the filesystem's encoding
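For the first problem, one common approach is to look for a byte order mark (BOM) at the start of the file, since Notepad writes one for each of the three formats in the steps above ("unicode" is UTF-16 little endian). The following is only a minimal sketch under that assumption: BomSniffer and detect are hypothetical names, not Eclipse API, and StandardCharsets requires a newer JDK than the ones current when this bug was filed.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Eclipse.
class BomSniffer {

    // Returns the charset implied by a leading byte order mark, or the
    // supplied fallback when the bytes do not start with a known BOM.
    static Charset detect(byte[] head, Charset fallback) {
        if (head.length >= 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF) {
            return StandardCharsets.UTF_8;      // EF BB BF
        }
        if (head.length >= 2
                && (head[0] & 0xFF) == 0xFF
                && (head[1] & 0xFF) == 0xFE) {
            return StandardCharsets.UTF_16LE;   // FF FE
        }
        if (head.length >= 2
                && (head[0] & 0xFF) == 0xFE
                && (head[1] & 0xFF) == 0xFF) {
            return StandardCharsets.UTF_16BE;   // FE FF
        }
        return fallback;
    }

    public static void main(String[] args) {
        byte[] utf8 = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 0x61};
        byte[] utf16le = {(byte) 0xFF, (byte) 0xFE, 0x61, 0x00};
        byte[] utf16be = {(byte) 0xFE, (byte) 0xFF, 0x00, 0x61};
        byte[] plain = {0x61, 0x62};

        Charset fallback = Charset.defaultCharset();
        System.out.println(detect(utf8, fallback));
        System.out.println(detect(utf16le, fallback));
        System.out.println(detect(utf16be, fallback));
        System.out.println(detect(plain, fallback));
    }
}
```

A caller would still have to skip the BOM bytes before decoding, and FF FE is also the start of a UTF-32 little endian BOM, which this sketch does not distinguish.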

What is the filesystem encoding in this case?  Is it GB18030?
Can you reconstruct the problem using:
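String str = "{Some string with GB18030 chars}";
byte[] bytes = str.getBytes();           // encode with the platform default charset
String newStr = new String(bytes);       // decode with the same default charset
System.out.println(str.equals(newStr));  // true only if no characters were lost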
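
To separate the two problems, the same round trip can be run against explicitly named charsets as well as the platform default. This is only a sketch: EncodingRoundTrip and survives are hypothetical names, the sample text is two arbitrary CJK characters written as escapes so the source file's own encoding does not matter, and StandardCharsets assumes a newer JDK than the ones current when this bug was filed.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Eclipse.
class EncodingRoundTrip {

    // Encodes the text with the given charset, decodes the bytes with the
    // same charset, and reports whether the original text came back intact.
    static boolean survives(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return text.equals(new String(bytes, charset));
    }

    public static void main(String[] args) {
        String sample = "\u4E2D\u6587";

        // The default charset depends on the machine, so its result varies.
        Charset platform = Charset.defaultCharset();
        System.out.println("default (" + platform + "): " + survives(sample, platform));

        // Unicode encodings can represent every character in the sample.
        System.out.println("UTF-8:    " + survives(sample, StandardCharsets.UTF_8));
        System.out.println("UTF-16BE: " + survives(sample, StandardCharsets.UTF_16BE));

        // US-ASCII cannot, so getBytes substitutes '?' for each character.
        System.out.println("US-ASCII: " + survives(sample, StandardCharsets.US_ASCII));
    }
}
```

If the default-charset line reports false, characters are being replaced during encoding, which would be consistent with the ?????? content seen in steps 10 and 11.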

Comment 3 Kevin Haaland CLA 2002-09-03 14:10:02 EDT

*** This bug has been marked as a duplicate of 5399 ***