Bug 50292 - [Welcome] NLS: Kannada text + Punjabi text do not get displayed in Welcome.xml
Summary: [Welcome] NLS: Kannada text + Punjabi text do not get displayed in Welcome.xml
Status: RESOLVED INVALID
Alias: None
Product: Platform
Classification: Eclipse Project
Component: UI
Version: 2.1.2
Hardware: PC Windows XP
Importance: P2 major
Target Milestone: ---
Assignee: Tod Creasey CLA
QA Contact:
URL:
Whiteboard:
Keywords: nl, vm
Depends on:
Blocks:
 
Reported: 2004-01-20 14:26 EST by Cam-Thu Le CLA
Modified: 2004-03-22 12:06 EST

See Also:


Attachments
welcome.xml with Kannada text (55.69 KB, application/x-zip-compressed)
2004-01-20 16:10 EST, Cam-Thu Le CLA
RHEL screenshot of welcome.xml kannada text (75.21 KB, image/jpeg)
2004-03-19 16:22 EST, Jonathan Simpson CLA
Print Screen of latest WSWB 3.0 build on Windows XP (65.55 KB, image/jpeg)
2004-03-22 12:02 EST, David W Hare CLA
Print Screen of latest WSWB 3.0 build on Windows 2003 (88.87 KB, image/jpeg)
2004-03-22 12:06 EST, David W Hare CLA

Description Cam-Thu Le CLA 2004-01-20 14:26:13 EST
When testing in the Kannada and Punjabi locales, Kannada and Punjabi text cannot
be displayed in welcome.xml, but it seems to work fine everywhere else.

Operating system= Windows XP
eclipse 2.1.2
Sun JRE 1.4.2
Arial Unicode font is used.

Steps to reproduce:
1- Install the Arial Unicode font as a system font
2- Add Kannada text to welcome.xml
3- Launch Eclipse and view welcome.xml
=> Expected to see Kannada text in welcome.xml, but a blank is displayed.

I will attach the Arial Unicode font, the Kannada text and the updated
welcome.xml with Kannada text in it to this bug.
Comment 1 Cam-Thu Le CLA 2004-01-20 16:10:55 EST
Created attachment 7494 [details]
welcome.xml with Kannada text

Arial Unicode MS font is too large to be attached here. To recreate the Kannada
welcome problem, you do not need the Arial Unicode font. The zip file contains
screen captures of welcome.xml displayed through eclipse and welcome.xml
displayed through Mozilla. A Kannada text file is also attached.
Comment 2 Nick Edgar CLA 2004-01-21 09:50:15 EST
Tod, can you comment on the NL issues here?
Is the welcome page not defaulting to the correct font?
Comment 3 Tod Creasey CLA 2004-01-21 10:09:31 EST
This file is not shown because it is not valid XML. When I replace the
welcome.xml in the platform with it I get the following error.

Taking a quick look at the file, it looks to me like the non-English text is
not UTF-8 encoded even though the file states that as its encoding.

org.xml.sax.SAXParseException: Document root element is missing.
	at org.apache.crimson.parser.Parser2.fatal(Parser2.java:3339)
	at org.apache.crimson.parser.Parser2.fatal(Parser2.java:3327)
	at org.apache.crimson.parser.Parser2.parseInternal(Parser2.java:635)
	at org.apache.crimson.parser.Parser2.parse(Parser2.java:333)
	at org.apache.crimson.parser.XMLReaderImpl.parse(XMLReaderImpl.java:448)
	at javax.xml.parsers.SAXParser.parse(SAXParser.java:345)
	at org.eclipse.ui.internal.ide.dialogs.WelcomeParser.parse(WelcomeParser.java:278)
	at org.eclipse.ui.internal.ide.dialogs.WelcomeEditor.read(WelcomeEditor.java:903)
	at org.eclipse.ui.internal.ide.dialogs.WelcomeEditor.readFile(WelcomeEditor.java:918)
	at org.eclipse.ui.internal.ide.dialogs.WelcomeEditor.createPartControl(WelcomeEditor.java:647)
	at org.eclipse.ui.internal.PartPane$4.run(PartPane.java:166)
	at org.eclipse.core.internal.runtime.InternalPlatform.run(InternalPlatform.java:842)
	at org.eclipse.core.runtime.Platform.run(Platform.java:458)
	at org.eclipse.ui.internal.PartPane.createChildControl(PartPane.java:162)
	at org.eclipse.ui.internal.PartPane.createControl(PartPane.java:211)
	at org.eclipse.ui.internal.EditorWorkbook.createPage(EditorWorkbook.java:157)
	at org.eclipse.ui.internal.EditorWorkbook.add(EditorWorkbook.java:98)
	at org.eclipse.ui.internal.EditorArea.addEditor(EditorArea.java:57)
	at org.eclipse.ui.internal.EditorPresentation.openEditor(EditorPresentation.java:351)
	at org.eclipse.ui.internal.EditorManager$2.run(EditorManager.java:549)
	at org.eclipse.swt.custom.BusyIndicator.showWhile(BusyIndicator.java:84)
	at org.eclipse.ui.internal.EditorManager.createEditorTab(EditorManager.java:538)
	at org.eclipse.ui.internal.EditorManager.openInternalEditor(EditorManager.java:634)
	at org.eclipse.ui.internal.EditorManager.openEditorFromDescriptor(EditorManager.java:437)
	at org.eclipse.ui.internal.EditorManager.openEditor(EditorManager.java:425)
	at org.eclipse.ui.internal.WorkbenchPage.busyOpenEditor(WorkbenchPage.java:2052)
	at org.eclipse.ui.internal.WorkbenchPage.access$6(WorkbenchPage.java:1995)
	at org.eclipse.ui.internal.WorkbenchPage$9.run(WorkbenchPage.java:1982)
	at org.eclipse.swt.custom.BusyIndicator.showWhile(BusyIndicator.java:84)
	at org.eclipse.ui.internal.WorkbenchPage.openEditor(WorkbenchPage.java:1977)
	at org.eclipse.ui.internal.WorkbenchPage.openEditor(WorkbenchPage.java:1960)
	at org.eclipse.ui.internal.ide.IDEWorkbenchAdvisor.openWelcomeEditor(IDEWorkbenchAdvisor.java:915)
	at org.eclipse.ui.internal.ide.IDEWorkbenchAdvisor.openWelcomeEditors(IDEWorkbenchAdvisor.java:711)
	at org.eclipse.ui.internal.ide.IDEWorkbenchAdvisor.postStartup(IDEWorkbenchAdvisor.java:259)
	at org.eclipse.ui.internal.Workbench.runUI(Workbench.java:1490)
	at org.eclipse.ui.internal.Workbench.createAndRunWorkbench(Workbench.java:265)
	at org.eclipse.ui.PlatformUI.createAndRunWorkbench(PlatformUI.java:139)
	at org.eclipse.ui.internal.ide.IDEApplication.run(IDEApplication.java:47)
	at org.eclipse.core.internal.runtime.PlatformActivator$1.run(PlatformActivator.java:248)
	at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:85)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:324)
	at org.eclipse.core.launcher.Main.basicRun(Main.java:279)
	at org.eclipse.core.launcher.Main.run(Main.java:742)
	at org.eclipse.core.launcher.Main.main(Main.java:581)
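
To separate the parser's behaviour from the welcome editor, the same bytes can be handed directly to whatever SAX parser the running VM supplies. The following is only a minimal sketch under stated assumptions: `WelcomeXmlCheck`, `parseOrDescribe` and the `<doc>` element are hypothetical names chosen for illustration, not Eclipse code, and the sample string is the word "Kannada" written in Kannada script.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Eclipse: feeds raw bytes to the JVM's default SAX parser.
class WelcomeXmlCheck {

    /** Returns the collected character data, or a description of the parse error. */
    static String parseOrDescribe(byte[] xmlBytes) {
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(xmlBytes), handler);
            return text.toString();
        } catch (SAXParseException e) {
            // Same exception type as in the log above; report where the parser gave up.
            return "parse error at line " + e.getLineNumber() + ": " + e.getMessage();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder document; a real check would read the attached welcome.xml instead.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<doc>\u0C95\u0CA8\u0CCD\u0CA8\u0CA1</doc>";
        System.out.println(parseOrDescribe(xml.getBytes(StandardCharsets.UTF_8)));
        System.out.println(parseOrDescribe(new byte[0]));
    }
}
```

If the text comes back intact on one VM and a parse error is reported on another for the same bytes, that points at the parser implementation rather than at the file.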
Comment 4 Cam-Thu Le CLA 2004-01-21 15:20:14 EST
Welcome.xml was saved as UTF-8 with Notepad. In welcome.xml, Russian,
Hungarian and Vietnamese text all come up correctly. Only Kannada text
does not get displayed at all. The problem is reproducible in Eclipse 3.0 M6.
Comment 5 Tod Creasey CLA 2004-01-22 16:08:32 EST
Here are the steps I followed

1) Took your example
2) Downloaded 20030123
3) Replaced welcome.xml with the welcome.xml you provided in 
plugins/org.eclipse.platform
4) Started Eclipse
5) The welcome editor opened empty and the following was in the log.

Once we get a valid welcome.xml file we can see if this is really an issue.

org.xml.sax.SAXParseException: Document root element is missing.
	at org.apache.crimson.parser.Parser2.fatal(Parser2.java:3339)
	at org.apache.crimson.parser.Parser2.fatal(Parser2.java:3327)
	at org.apache.crimson.parser.Parser2.parseInternal(Parser2.java:635)
	at org.apache.crimson.parser.Parser2.parse(Parser2.java:333)
	at org.apache.crimson.parser.XMLReaderImpl.parse(XMLReaderImpl.java:448)
	at javax.xml.parsers.SAXParser.parse(SAXParser.java:345)
	at org.eclipse.ui.internal.ide.dialogs.WelcomeParser.parse(WelcomeParser.java:278)
	at org.eclipse.ui.internal.ide.dialogs.WelcomeEditor.read(WelcomeEditor.java:903)
	at org.eclipse.ui.internal.ide.dialogs.WelcomeEditor.readFile(WelcomeEditor.java:918)
	at org.eclipse.ui.internal.ide.dialogs.WelcomeEditor.createPartControl(WelcomeEditor.java:647)
	at org.eclipse.ui.internal.PartPane$4.run(PartPane.java:166)
	at org.eclipse.core.internal.runtime.InternalPlatform.run(InternalPlatform.java:842)
	at org.eclipse.core.runtime.Platform.run(Platform.java:458)
	at org.eclipse.ui.internal.PartPane.createChildControl(PartPane.java:162)
	at org.eclipse.ui.internal.PartPane.createControl(PartPane.java:211)
	at org.eclipse.ui.internal.EditorWorkbook.createPage(EditorWorkbook.java:157)
	at org.eclipse.ui.internal.EditorWorkbook.add(EditorWorkbook.java:98)
	at org.eclipse.ui.internal.EditorArea.addEditor(EditorArea.java:57)
	at org.eclipse.ui.internal.EditorPresentation.openEditor(EditorPresentation.java:351)
	at org.eclipse.ui.internal.EditorManager$2.run(EditorManager.java:549)
	at org.eclipse.swt.custom.BusyIndicator.showWhile(BusyIndicator.java:84)
	at org.eclipse.ui.internal.EditorManager.createEditorTab(EditorManager.java:538)
	at org.eclipse.ui.internal.EditorManager.openInternalEditor(EditorManager.java:634)
	at org.eclipse.ui.internal.EditorManager.openEditorFromDescriptor(EditorManager.java:437)
	at org.eclipse.ui.internal.EditorManager.openEditor(EditorManager.java:425)
	at org.eclipse.ui.internal.WorkbenchPage.busyOpenEditor(WorkbenchPage.java:2052)
	at org.eclipse.ui.internal.WorkbenchPage.access$6(WorkbenchPage.java:1995)
	at org.eclipse.ui.internal.WorkbenchPage$9.run(WorkbenchPage.java:1982)
	at org.eclipse.swt.custom.BusyIndicator.showWhile(BusyIndicator.java:84)
	at org.eclipse.ui.internal.WorkbenchPage.openEditor(WorkbenchPage.java:1977)
	at org.eclipse.ui.internal.WorkbenchPage.openEditor(WorkbenchPage.java:1960)
	at org.eclipse.ui.internal.ide.IDEWorkbenchAdvisor.openWelcomeEditor(IDEWorkbenchAdvisor.java:915)
	at org.eclipse.ui.internal.ide.IDEWorkbenchAdvisor.openWelcomeEditors(IDEWorkbenchAdvisor.java:711)
	at org.eclipse.ui.internal.ide.IDEWorkbenchAdvisor.postStartup(IDEWorkbenchAdvisor.java:259)
	at org.eclipse.ui.internal.Workbench.runUI(Workbench.java:1490)
	at org.eclipse.ui.internal.Workbench.createAndRunWorkbench(Workbench.java:265)
	at org.eclipse.ui.PlatformUI.createAndRunWorkbench(PlatformUI.java:139)
	at org.eclipse.ui.internal.ide.IDEApplication.run(IDEApplication.java:47)
	at org.eclipse.core.internal.runtime.PlatformActivator$1.run(PlatformActivator.java:248)
	at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:85)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:324)
	at org.eclipse.core.launcher.Main.basicRun(Main.java:279)
	at org.eclipse.core.launcher.Main.run(Main.java:742)
	at org.eclipse.core.launcher.Main.main(Main.java:581)
Comment 6 Cam-Thu Le CLA 2004-01-23 16:03:41 EST
I tried to reproduce with Eclipse N20040123, and it is true that the platform
welcome.xml came up blank with the updated welcome.xml I gave. But this is also
true when one takes any English welcome.xml and saves it in UTF-8; Eclipse
also displays a blank screen with this build.

For the purpose of reproducing this problem and understanding what needs to be
done, I suggest we use Eclipse 3.0 M6. With this build I can see the
welcome.xml I gave displayed. I just don't understand why Kannada text does not
get displayed while Vietnamese does. Both of these languages are complex scripts
and must be saved in UTF-8.

The welcome.xml I gave should have valid UTF-8 encoding. The file was saved
using the encoding option of Notepad from the OS. To verify the file's encoding,
open welcome.xml with Notepad, then select File -> Save As, and notice that the
encoding previously saved is UTF-8.
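
One detail that may be worth ruling out: Notepad on Windows XP writes a byte order mark (the bytes EF BB BF) at the start of files it saves as UTF-8. Whether that plays any part in this bug is not established in this report; the sketch below only shows how the first bytes of the file could be inspected. `BomCheck` and its methods are hypothetical names, and the file path is taken from the command line as a placeholder.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper for inspecting the first bytes of a file such as welcome.xml.
class BomCheck {

    /** True if the data starts with the UTF-8 byte order mark EF BB BF. */
    static boolean hasUtf8Bom(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    /** Returns a copy without the leading BOM, or the same array if there is none. */
    static byte[] stripUtf8Bom(byte[] data) {
        return hasUtf8Bom(data) ? Arrays.copyOfRange(data, 3, data.length) : data;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is a placeholder for the path to the file under test.
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println("UTF-8 BOM present: " + hasUtf8Bom(data));
    }
}
```

If a BOM is present, parsing the output of `stripUtf8Bom` and comparing the result with parsing the original bytes would show whether the mark makes any difference to a given parser.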

Here are the steps I took to recreate the problem.
- Download Eclipse 3.0 M6
- Copy the welcome.xml I gave to eclipse\plugins\org.eclipse.platform
- Launch Eclipse
- From the Resource perspective, create a simple project "A"
- Import eclipse\plugins\org.eclipse.platform\welcome.xml into project "A"
(from project "A": File->Import->File System->browse My Computer for
eclipse\plugins\org.eclipse.platform->select welcome.xml and import it).
- From project "A", display welcome.xml in Eclipse using both 1) the Eclipse
text editor and 2) the default editor:
1) Open with text editor: from project "A", highlight welcome.xml, right-click
->Open With->Text Editor => the welcome text is displayed with corrupted
foreign text. The text editor should check the existing file encoding before
opening the file.
2) Open with default editor: highlight welcome.xml, right-click ->Open With->
Default Editor => the welcome text displays correct Kannada and Vietnamese text
in the browser. When viewing the page source, the text is also correct.
- Fix the welcome.xml file encoding with Eclipse: click on the welcome text page
to give it focus. From the menu bar, select File->Encoding, then select UTF-8.
The welcome text displays properly converted foreign text, except for Kannada
text. Why can Eclipse successfully convert Vietnamese text and not Kannada text,
when both are in UTF-8?
Comment 7 Tod Creasey CLA 2004-01-23 16:10:38 EST
Correct me if I am wrong, but Kannada is a double-byte text, which means that
you would need to use UTF-16, correct? Vietnamese is single byte if I am not
mistaken.

Either way there is an issue with the XML parsing of this file. I will add DJ
to the list to see if he has any insights.
Comment 8 Cam-Thu Le CLA 2004-01-25 14:50:59 EST
Both Kannada and Vietnamese are categorized as complex scripts. Kannada is not
a DBCS language; it is a language of India. Vietnamese has been reclassified
from SBCS to complex script.
Comment 9 Tod Creasey CLA 2004-01-26 08:05:23 EST
Right - so this means that UTF-8 should be fine for it then.
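
The point can be checked directly: UTF-8 encodes every code point, and the SBCS/DBCS classification of legacy code pages does not determine how many bytes a character takes. The sketch below prints the UTF-8 bytes for one sample character from several of the scripts mentioned in this bug. The class name `Utf8Lengths` and the choice of sample characters are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows how many bytes UTF-8 needs for a given code point.
class Utf8Lengths {

    /** Number of bytes UTF-8 uses for a single code point. */
    static int utf8Length(int codePoint) {
        return new String(Character.toChars(codePoint))
                .getBytes(StandardCharsets.UTF_8).length;
    }

    /** The UTF-8 bytes of a code point as space-separated upper-case hex. */
    static String utf8Hex(int codePoint) {
        byte[] bytes = new String(Character.toChars(codePoint))
                .getBytes(StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample characters: Cyrillic ZHE, Vietnamese E with circumflex and dot below,
        // Kannada KA, Gurmukhi (Punjabi) PA.
        int[] samples = {0x0436, 0x1EC7, 0x0C95, 0x0A2A};
        for (int cp : samples) {
            System.out.println(String.format("U+%04X -> %s (%d bytes)",
                    cp, utf8Hex(cp), utf8Length(cp)));
        }
    }
}
```

Both the Vietnamese and the Kannada sample take three bytes in UTF-8, so byte width alone does not separate the script that displays from the one that does not.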
Comment 10 David W Hare CLA 2004-02-04 14:33:09 EST
The problem occurred in Windows XP, but not in Windows 2003. The welcome.xml
with Punjabi and Kannada text displays properly in Windows 2003. However, in
XP, the text shows up as blank.
Comment 11 Tod Creasey CLA 2004-03-19 11:49:47 EST
I have checked the encoding of the file and it is correct using an action with 
the following code snippet:

System.out.println(String.valueOf(selected.getEncoding()));

So it appears that the Apache parser just cannot handle the text. The parser is
supplied by the virtual machine.

I am not sure that there is anything we can do about this if the SAX parser
provided by the virtual machine cannot handle this file.

John, any comments or further suggestions?
Comment 12 John Arthorne CLA 2004-03-19 13:41:56 EST
I did a hexdump on the file and it looks like a valid UTF-8 encoding.  The
Kannada text is in the U+0800 to U+0FFF range, so it is encoded as three-byte
sequences. It renders fine in IE and Netscape, but Mozilla is not able to render
it.  This looks like a bug in the Crimson XML parser... I have seen forums
discussing exactly the same symptoms:

http://forum.java.sun.com/thread.jsp?forum=34&thread=499493&start=0&range=15&tstart=0&trange=15

I suggest trying with an IBM VM, which uses a different XML parser
implementation to see if that solves the problem.
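
The manual hexdump check can also be done programmatically. The sketch below decodes the bytes with a strict UTF-8 decoder, which rejects malformed sequences instead of silently replacing them, and then counts how many characters fall in a given code point range such as U+0800 to U+0FFF. `StrictUtf8` and its method names are hypothetical, chosen only for this illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: strict UTF-8 validation plus a code point range count.
class StrictUtf8 {

    /** Decodes strictly; returns null if the bytes are not well-formed UTF-8. */
    static String decodeOrNull(byte[] data) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data))
                    .toString();
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    /** Counts the code points of s that lie in the inclusive range [lo, hi]. */
    static int countInRange(String s, int lo, int hi) {
        int count = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (cp >= lo && cp <= hi) {
                count++;
            }
            i += Character.charCount(cp);
        }
        return count;
    }

    public static void main(String[] args) {
        // E0 B2 95 is the UTF-8 form of U+0C95 (Kannada letter KA).
        byte[] sample = {(byte) 0xE0, (byte) 0xB2, (byte) 0x95};
        String decoded = decodeOrNull(sample);
        System.out.println("well-formed: " + (decoded != null));
        if (decoded != null) {
            System.out.println("in U+0800..U+0FFF: " + countInRange(decoded, 0x0800, 0x0FFF));
        }
    }
}
```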

Comment 13 Tod Creasey CLA 2004-03-19 15:53:20 EST
Sure enough it is a virtual machine problem with the Sun 1.4.2 VM. If you use 
the IBM 1.4.1 jre then this works without a problem.
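
Since the result depends on which VM, and therefore which XML parser, is actually in use, it can help to print both from inside the running VM rather than infer them from the install. This is a minimal sketch; `ParserInfo` is a hypothetical class name, and the calls used are the standard JAXP factory and `System.getProperty`.

```java
import javax.xml.parsers.SAXParserFactory;

// Hypothetical helper: reports the running VM and the SAX parser implementation it supplies.
class ParserInfo {

    /** Class name of the SAX parser that the default JAXP factory hands out. */
    static String saxParserClassName() {
        try {
            return SAXParserFactory.newInstance().newSAXParser().getClass().getName();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Vendor and version of the running Java VM, from the standard system properties. */
    static String vmDescription() {
        return System.getProperty("java.vendor") + " " + System.getProperty("java.version");
    }

    public static void main(String[] args) {
        System.out.println("VM:         " + vmDescription());
        System.out.println("SAX parser: " + saxParserClassName());
    }
}
```

On the Sun 1.4.2 VM from this report, the stack traces above suggest the parser class would sit under org.apache.crimson; other VMs report whatever implementation they bundle.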
Comment 14 Jonathan Simpson CLA 2004-03-19 16:19:57 EST
I tried it on RHEL 3.0 and the Kannada text displays.  I'm not sure if the text
is 100% correct, but it does display.  I believe the RHEL build uses the IBM
VM.  I'll attach a JPEG so that someone else can take a look at it to verify
its correctness.
Comment 15 Jonathan Simpson CLA 2004-03-19 16:22:03 EST
Created attachment 8711 [details]
RHEL screenshot of welcome.xml kannada text
Comment 16 Jonathan Simpson CLA 2004-03-22 10:12:33 EST
I tried it on Windows 2000 with WSWB build I20040318 and the Kannada text 
doesn't display.  Doesn't WSWB use the IBM JRE?  I'm using the Arial Unicode MS 
font.

Comment 17 Tod Creasey CLA 2004-03-22 10:34:47 EST
3.0 requires JDK 1.4.2 or higher and currently there is only a Sun VM for 
that. I wouldn't expect that WSWB has a 1.4.2 VM to work with
Comment 18 Jonathan Simpson CLA 2004-03-22 11:24:10 EST
The java.exe included with the I20040318 build of WSWB 3.0 appears to be an IBM
executable (checked in Properties->Version).

It also doesn't work with 2.1.3-RC3 on my Windows 2000 system.

It has only worked for me on RHEL with 3.0.
Comment 19 David W Hare CLA 2004-03-22 11:50:38 EST
WSWB 3.0 is being shipped with IBM JRE 1.4.2
Comment 20 David W Hare CLA 2004-03-22 12:02:18 EST
Created attachment 8748 [details]
Print Screen of latest WSWB 3.0 build on Windows XP

It appears the Kannada text is displaying correctly now.
Comment 21 David W Hare CLA 2004-03-22 12:06:36 EST
Created attachment 8749 [details]
Print Screen of latest WSWB 3.0 build on Windows 2003

Still works on Windows 2003.