Bug 20641 - [NL] Editors preferences - 'Default' Text file detected incorrectly on W2K and XP
Status: RESOLVED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Text
Version: 2.0
Hardware: PC Windows 2000
Importance: P3 normal
Target Milestone: 3.0
Assignee: Platform-Text-Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords: nl
Depends on: 22016
Blocks:
Reported: 2002-06-19 09:34 EDT by Christophe Cornu CLA
Modified: 2006-05-12 07:34 EDT (History)

See Also:


Attachments
Readme for PR20641 (1.67 KB, text/plain)
2002-06-20 18:44 EDT, Christophe Cornu CLA

Description Christophe Cornu CLA 2002-06-19 09:34:56 EDT
Build F3
In Window > Preferences > Editors > Text file encoding > Default
On W2K and XP, the value can be incorrect.

The problem appears to be due to the JRE not returning the correct value. So 
the bug does not reside directly inside Eclipse, but this impacts Eclipse and 
may need to be worked around. I used IBM 1.3.1 jre.

Apparent JRE bug:
// on W2K/XP, returns user's locale, not the system's locale
String fileEncoding = System.getProperty("file.encoding");
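
The property read above can also be checked outside Eclipse. The following is only a minimal sketch: the class name EncodingProbe and its helper method are made up for illustration, and Charset.defaultCharset() requires Java 5 or later, so it would not compile on the 1.3.1 JRE used for this report. Running it before and after changing the user's locale should show which setting the JRE follows on a given machine.

```java
import java.nio.charset.Charset;
import java.util.Locale;

// Hypothetical helper class, not part of Eclipse.
public class EncodingProbe {

    /** Returns the value the JVM reports for file.encoding, or "(unset)" if it is missing. */
    static String fileEncoding() {
        return System.getProperty("file.encoding", "(unset)");
    }

    public static void main(String[] args) {
        // The system property that this report is about.
        System.out.println("file.encoding   = " + fileEncoding());
        // The charset the JVM uses when no explicit charset is given (Java 5+).
        System.out.println("default charset = " + Charset.defaultCharset().name());
        // The JVM's default locale, printed for comparison with the Windows settings.
        System.out.println("default locale  = " + Locale.getDefault());
    }
}
```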

W2K/XP define multiple locales.
Microsoft explains what user's locale and system's locale mean at:
http://www.microsoft.com/globaldev/faqs/locales.asp

How to reproduce:
.Use a W2K or XP box with multiple locales installed. For example Hebrew and 
English.
.Set the system locale to Hebrew (code page 1255)
.Set the user's locale to English (code page for latin languages: 1252)
(See PR20373 for explanations on how to set the system and user's locales on 
W2K and XP)
Eclipse will display 1252 as the default codepage for file encoding. This is 
wrong; it should be 1255.
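
To illustrate why the difference between 1252 and 1255 matters, here is a small sketch that decodes the same byte with each code page named explicitly. The class and method names are hypothetical, and the sketch assumes the running JRE ships both the windows-1252 and windows-1255 charsets. Byte 0xE0 maps to a Latin "a" with grave accent (U+00E0) in 1252 and to the Hebrew letter alef (U+05D0) in 1255, so a file saved under one default and read under the other would show the wrong characters.

```java
import java.nio.charset.Charset;

// Hypothetical demo class, not part of Eclipse.
public class CodePageDemo {

    /** Decodes raw bytes with an explicitly named charset instead of the platform default. */
    static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        byte[] raw = { (byte) 0xE0 };
        // Assumption: both charsets are available in this JRE.
        String latin = decode(raw, "windows-1252");
        String hebrew = decode(raw, "windows-1255");
        // Print code points rather than glyphs so the result does not depend on the console encoding.
        System.out.println("windows-1252 -> U+" + Integer.toHexString(latin.charAt(0)).toUpperCase());
        System.out.println("windows-1255 -> U+" + Integer.toHexString(hebrew.charAt(0)).toUpperCase());
    }
}
```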

A simple workaround for the user is to ensure that the user's locale 
matches the system locale. Eclipse will then display the correct default file 
encoding.

PR20373 contains more info related to bidi issues/confusions.
Comment 1 Nick Edgar CLA 2002-06-19 09:58:03 EDT
Should investigate whether there is a more appropriate system property we can 
use.
Comment 2 Tod Creasey CLA 2002-06-19 10:45:07 EDT
I think we actually have it right. The purpose of setting the user locale in 
Win2000 is to allow for switching of input and save modes.

If I want to edit a Chinese file on my English machine I would switch my user 
Locale to Chinese and then edit the file (for instance to get the Gb18030  
characters). When I was ready to switch back I would change the user locale 
back to English.

Regardless, Java does not provide another way to request the encoding other than 
System.getProperty("file.encoding"). Our current encoding support is not 
designed for people who change locales on the fly.
Comment 3 Lynne Kues CLA 2002-06-19 11:59:29 EDT
I have Win2k with Hebrew installed.  I have Hebrew as my System locale and 
English as my user locale.  I start Eclipse, type some Hebrew - it is not 
displayed correctly.  The workaround for this is to change the user locale to 
Hebrew.  I exit Eclipse and change this and restart my workspace (according to 
Tod I guess I shouldn't even have to do this).  Note that on reentry Hebrew is 
still NOT displayed correctly.  I must create a new workspace after I change 
the user locale.  This is something that Eclipse is doing.
Comment 4 Nick Edgar CLA 2002-06-19 12:08:13 EDT
What does file.encoding indicate before and after switching?
Unless you have changed the text file encoding preference to something other 
than Default, we should just be using file.encoding.
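
The fallback described here could be sketched roughly as follows. This is not the Eclipse implementation: the class name EncodingResolver, the method resolve, and the convention that a null or blank preference means "Default" are all assumptions made for illustration.

```java
// Hypothetical sketch, not Eclipse code.
public class EncodingResolver {

    /**
     * Returns the explicit preference if one is set. A null or blank value is
     * treated as "Default" (an assumption) and falls back to file.encoding.
     */
    static String resolve(String preference) {
        if (preference != null && !preference.trim().isEmpty()) {
            return preference.trim();
        }
        return System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        // An explicit preference takes priority over the JVM default.
        System.out.println("explicit: " + resolve("Cp1255"));
        // With no preference set, the JVM's file.encoding value is used.
        System.out.println("default:  " + resolve(null));
    }
}
```

With this shape, a wrong file.encoding value only affects users who leave the preference at Default, which matches the workaround of setting the encoding explicitly.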
Comment 5 Tod Creasey CLA 2002-06-19 12:14:00 EDT
The issue is a virtual machine issue. The IBM VM returns the system rather than 
user encoding. Christophe is going to assist Kevin_Haaland in logging a PR to 
IBM about this.

System.getProperty("file.encoding") is returning the wrong value.
Comment 6 Lynne Kues CLA 2002-06-19 12:17:03 EDT
1252, as indicated by Chris, in the instance where your user locale is English.
1255 when you restart Eclipse after you change your user locale to Hebrew, but 
things still do not work until you create a new workspace, which indicates that 
something is being cached somewhere that shouldn't be or am I missing something?

Tod, I don't understand your remarks above unless you meant to say it in the 
reverse (IBM returns user locale not system locale).
Comment 7 Tod Creasey CLA 2002-06-19 12:24:07 EDT
I likely have it backwards. It should be the locale you get from Regional 
Options-> General -> Settings for Current User.
Comment 8 Nick Edgar CLA 2002-06-19 12:34:36 EDT
Are you typing in an editor?  If so, do you close and reopen it after restart 
(shouldn't have to, just curious if the editor is hanging onto the locale it 
used to read the file the first time).

I also wonder if it's a problem with the wrong fonts being used rather than 
the wrong encoding.

Comment 9 Lynne Kues CLA 2002-06-19 12:52:31 EDT
You don't need to leave the editor open.  If you close all editors, change user 
locale, restart Eclipse, open an editor on a new file things are still screwed 
up.  You have to start with a new workspace.
Comment 10 Lynne Kues CLA 2002-06-19 12:58:45 EDT
Tod, I think you still have it backwards.  The user locale ** IS ** being 
returned.  System locale is what needs to be returned (Regional Options --> 
General --> Language settings for the system --> Set default).
Comment 11 Nick Edgar CLA 2002-06-19 13:03:20 EDT
If you go to Help / About / Configuration Details, what does it show for 
file.encoding before and after?
Comment 12 Tod Creasey CLA 2002-06-19 13:05:24 EDT
So the question here is why do we think that the system locale is the correct 
answer? It is harder to find in the UI - I know that I switch usually via the 
user locale.

Given that we do not have full encoding mode switching support anyways I think 
perhaps our expectations are too high for this simple support.

How does this work using the Sun VM? Is the IBM VM consistent?
Comment 13 Tod Creasey CLA 2002-06-19 14:31:08 EDT
I have just checked this with a Sun VM and Sun also uses the user rather than 
system settings.
Comment 14 Christophe Cornu CLA 2002-06-19 17:06:23 EDT
To reply to Tod about why "file.encoding" should return the default system 
locale and should NOT return the user locale:

The link I gave above (http://www.microsoft.com/globaldev/faqs/locales.asp) 
states:
"Although available user locales are often listed as a language (sometimes in 
combination with a country), a user locale is NOT a language setting, and has 
nothing to do with input languages, keyboard layouts, codepages or user 
interface languages. The Hebrew user locale, for example, only contains data 
related to the standard regional settings of Israel, not to the Hebrew 
language."

The link also contains the definition of system locale, which matches 
what "file.encoding" is expected to return.
"The system locale (sometimes referred to as the system default locale), 
determines which ANSI, OEM and MAC codepages and associated bitmap font files 
are used as defaults for the system."
(see the complete definition)
Comment 15 Lynne Kues CLA 2002-06-19 17:21:22 EDT
The behavior I am seeing is font related as Nick indicates.  The font for the 
StyledText widget must be set to the correct script (i.e., Hebrew) and this 
will not be the case if you use a workspace that was created when the system 
was not in a Hebrew state.
Comment 16 Steve Northover CLA 2002-06-20 12:02:16 EDT
Talked to KH.  We agreed that this particular problem is not SWT (although 
there are related SWT problems that have separate PR numbers).  The resolution 
is that Chris will provide documentation for the read.me describing the 
problem, how it appears in the Eclipse UI and what the user can do to work 
around it.
Comment 17 Christophe Cornu CLA 2002-06-20 18:44:52 EDT
Created attachment 1538
Readme for PR20641
Comment 18 Nick Edgar CLA 2002-06-21 10:00:41 EDT
Thanks Chris.
Comment 19 Tod Creasey CLA 2002-08-06 15:15:11 EDT
Added to documentation for the new encoding support. This Bug can be closed 
when Bug 22016 is closed.
Comment 20 Dani Megert CLA 2005-05-09 06:12:06 EDT
This has been fixed some time ago.