34421 – [Encodings] Encodings inconsistent between pref and editor

Status:	VERIFIED FIXED

Alias:	None

Product:	Platform
Classification:	Eclipse Project
Component:	UI
Version:	2.1
Hardware:	PC Windows XP

Importance:	P2 normal
Target Milestone:	3.1
Assignee:	Tod Creasey
QA Contact:

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:	22016

Reported:	2003-03-10 14:44 EST by Nick Edgar
Modified:	2022-01-28 10:49 EST
CC List:	8 users

See Also:	578332

Attachments
list of commonly used charset names (1.93 KB, text/plain) 2004-07-15 06:08 EDT, David Williams

Description Nick Edgar

2003-03-10 14:44:10 EST

RC1 build I20030307

The list of encodings in the Workbench / Editors pref page is:
Cp1252 
ISO-8859-1
US-ASCII
UTF-16
UTF-16BE
UTF-16LE
UTF-8

The Edit/Encoding menu has:
Cp1252 (Default)
ASCII
Latin 1
UTF-8
UTF-16 (big endian)
UTF-16 (little endian)
UTF-16

The pref uses the machine-readable encoding names.  It would be better to use
human-readable names, as the Edit/Encoding menu does.

Cp1252 is the default here, and may differ in other locales.
The combo in the pref does not indicate (Default) like the Edit menu does.

We should also add more default encodings.

Comment 1 Nick Edgar

2003-03-10 14:44:32 EST

Defer to 2.2.

Comment 2 Nick Edgar

2003-03-10 14:47:24 EST

Should contribute the available encodings (and their human-readable names) via 
XML.  That way, translation packs could add extra entries.
UI and Text would then always be consistent.

Comment 3 Michael Van Meekeren

2004-05-25 11:53:50 EDT

Kai, where do you get your list of encodings from?
Tod, has anyone complained about this?

Comment 4 Tod Creasey

2004-05-25 12:19:12 EDT

Yes - there is a lot of buzz around encodings. Andre has restored some of this 
of late so we should recheck.

Comment 5 Nick Edgar

2004-05-26 14:20:19 EDT

There really should be an extension point for the supported encodings, used in
both the Editors pref page and the text editors' encoding menu.
Should consider this for post-3.0.

Comment 6 Tod Creasey

2004-05-27 08:26:14 EDT

Recheck and then mark later

Comment 7 Tod Creasey

2004-05-27 13:34:21 EDT

These are the same now except that the editor uses Latin-1 as the title for 
8859-1.

Comment 8 Nick Edgar

2004-05-27 16:41:39 EDT

Reopening to address the extension point and naming consistency problem for
post-3.0.

Comment 9 Nick Edgar

2004-05-27 16:42:29 EDT

Reassigning to Text component owner since they have owned the encoding problem
of late.  Will be happy to discuss a solution.

Comment 10 Dani Megert

2004-05-28 06:31:48 EDT

Someone needs to define the default set of charsetNames. The NLSed display
string can then be obtained via Charset.forName(String).displayName().

Since a plug-in writer can create a text editor without using our Platform/Text
framework this list should be provided by Platform/UI component as did the
encoding preference UI in 2.1. Platform/Text simply copied the list because
there was no API.
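
As a rough illustration of the approach described in this comment, the sketch below derives display strings from a fixed set of charset names via Charset.forName(String).displayName(Locale) and flags the platform default the way the Edit/Encoding menu does. The class name EncodingLabels and the contents of DEFAULT_NAMES are assumptions made for the example, not Eclipse API. Note that JDK built-in charsets generally return their canonical name from displayName, so the labels may not actually be localized.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Eclipse: one shared source of labels
// that both a preference page and an encoding menu could use.
class EncodingLabels {
    // Assumed default set, taken from the preference page list in the
    // description (Cp1252 omitted because it is platform dependent).
    static final String[] DEFAULT_NAMES = {
        "ISO-8859-1", "US-ASCII", "UTF-16", "UTF-16BE", "UTF-16LE", "UTF-8"
    };

    // Returns the display name for a charset, with " (Default)" appended
    // when the charset is the JVM's default charset.
    static String label(String charsetName, Locale locale) {
        Charset charset = Charset.forName(charsetName);
        String text = charset.displayName(locale);
        return charset.equals(Charset.defaultCharset()) ? text + " (Default)" : text;
    }

    // Builds the labels for the whole default set, in declaration order.
    static List<String> labels(Locale locale) {
        List<String> result = new ArrayList<>();
        for (String name : DEFAULT_NAMES) {
            result.add(label(name, locale));
        }
        return result;
    }

    public static void main(String[] args) {
        for (String label : labels(Locale.getDefault())) {
            System.out.println(label);
        }
    }
}
```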

Comment 11 Tod Creasey

2004-05-28 08:42:46 EDT

We should add this API in 3.1. The list is currently the same (except for the
label in the text editor list).

Comment 12 Tod Creasey

2004-05-28 08:43:11 EDT

Marking LATER as this is an API request

Comment 13 Nick Edgar

2004-05-28 14:07:32 EDT

Dani is referring to the Charset type in java.nio.charset (new in 1.4), not
CharSet in java.text.

Comment 14 Tod Creasey

2004-06-28 11:28:23 EDT

Reopening now that 3.0 has shipped

Comment 15 Tod Creasey

2004-07-14 14:37:24 EDT

Dani, is the workbench low enough for you, i.e. do you need this list in
jface.text? If not we will have to put it in Core.

Comment 16 Dani Megert

2004-07-15 04:33:40 EDT

It's OK to have it in a UI layer since this is really just an incomplete list of
the most important encodings to be presented to the user and it makes no sense to
have it in a non-UI layer since it is not the complete list of valid encodings.

There should be an extension point which enables clients to add encodings to
that list.

Comment 17 Andre Weinand

2004-07-15 05:05:27 EDT

I'm not really sure why we need an extension point for contributing charset names:

- there seems to be API for getting the complete list of charset names of a Java
   implementation: java.nio.charset.Charset.availableCharsets() and as Dani has pointed out
   Charset.forName(String).displayName() would return the UI name.

- it only makes sense to contribute more charset names, if it is also possible to contribute
  the implementation of a charset too. Without this we would always get
  UnsupportedEncodingExceptions.

Or am I missing something?
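
A minimal sketch of the two JDK calls this argument rests on: Charset.availableCharsets() to enumerate what the running JVM supports, and Charset.isSupported(String) to test a single name before offering it to the user. The class name AvailableCharsets and the helper isUsable are made up for the example. The count printed varies by JVM, so no expected output is shown.

```java
import java.nio.charset.Charset;
import java.util.SortedMap;

// Hypothetical helper, not part of Eclipse.
class AvailableCharsets {
    // True if this JVM can resolve the name. Syntactically illegal names
    // make Charset.isSupported throw IllegalCharsetNameException (a subclass
    // of IllegalArgumentException); they are treated as unusable here.
    static boolean isUsable(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Keys are canonical charset names, sorted case-insensitively.
        SortedMap<String, Charset> all = Charset.availableCharsets();
        System.out.println(all.size() + " charsets available on this JVM");
        System.out.println("UTF-8 usable: " + isUsable("UTF-8"));
        System.out.println("x-no-such-charset usable: " + isUsable("x-no-such-charset"));
    }
}
```

A contributed name that fails this check could not be used to read or write a file, which is the point made above about contributing names without implementations.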

Comment 18 Dani Megert

2004-07-15 05:23:52 EDT

The list you get is too long in my opinion. As for the extension point: assume
there's a plug-in for some programming language or tool or editor that
needs/uses one or several specific encodings heavily (e.g. the encoding
specified by Java for *.properties files) but they are not in our list. The
extension point enables them to add those encodings.

Comment 19 Andre Weinand

2004-07-15 05:38:27 EDT

If the encoding is not in the list returned by availableCharsets, then we cannot use it.

Comment 20 David Williams

2004-07-15 06:07:08 EDT

I would agree the list from nio.charsets is *way* too long ... several hundred
for some VMs! Though an extension seems like overkill.

Would it help to have a list of "common charset names"? I'll attach a list we
use, as a property file. Seems this list covers 99.9% of needs. (No one's
complained). Plus, I've found VMs don't usually provide translated versions
(though it's spec'd that way) so we allow translations of the property file.
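
One way such a property file could be consumed is sketched below: load name=label pairs with java.util.Properties and keep only the names the running JVM supports. The sample entries in SAMPLE and the class name CommonCharsets are invented for the example; they are not the contents of the attached file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper, not part of Eclipse.
class CommonCharsets {
    // Invented sample data in "charsetName=display label" form.
    static final String SAMPLE =
        "ISO-8859-1=ISO Latin-1\n"
        + "UTF-8=ISO 10646/Unicode (UTF-8)\n"
        + "x-no-such-charset=Not installed on this VM\n";

    // Parses the property text and returns name -> label for every
    // charset name that this JVM supports, sorted by name.
    static Map<String, String> load(String propertyText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertyText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, String> usable = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            if (Charset.isSupported(name)) {
                usable.put(name, props.getProperty(name));
            }
        }
        return usable;
    }

    public static void main(String[] args) {
        load(SAMPLE).forEach((name, label) -> System.out.println(name + " -> " + label));
    }
}
```

Because the labels live in a property file rather than in the VM, a translated copy of the file can supply localized labels without depending on what displayName returns.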

Comment 21 David Williams

2004-07-15 06:08:32 EDT

Created attachment 13287
list of commonly used charset names

Comment 22 Tod Creasey

2004-07-15 08:34:44 EDT

Here is a use case Andre.

Some users in the Far East frequently run with several code pages - usually a
simplified and complex form of their spoken language plus a European one
(usually English) as they may have code, a document written by someone else
etc. all within their workbench.

Defining a fragment that adds a couple of popular code pages to the list for a 
particular locale would be very useful to them - especially as many code pages 
are just a number and don't say what characters they are for.

Adding on to that a meaningful label for code pages that are just a number 
would be pretty useful too. 8859-1 is much less meaningful to most people than 
Latin.

Thanks for your input everyone. I am leaning towards Dani's suggestion myself.
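
The use case above amounts to merging contributed charset names into a base list. The sketch below is a plain-Java stand-in for that idea, assuming a hypothetical EncodingRegistry class; it is not the Eclipse extension-point mechanism, only an illustration of the merge behaviour (insertion order kept, duplicates and unsupported names dropped).

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical registry, not part of Eclipse.
class EncodingRegistry {
    // LinkedHashSet keeps contribution order and ignores duplicates.
    private final Set<String> names = new LinkedHashSet<>();

    // Adds each supported charset under its canonical name, so aliases of
    // an already contributed charset do not create a second entry.
    // Names the JVM cannot resolve are skipped silently.
    void contribute(String... charsetNames) {
        for (String name : charsetNames) {
            if (Charset.isSupported(name)) {
                names.add(Charset.forName(name).name());
            }
        }
    }

    // Returns a copy of the merged list in contribution order.
    List<String> names() {
        return new ArrayList<>(names);
    }

    public static void main(String[] args) {
        EncodingRegistry registry = new EncodingRegistry();
        registry.contribute("ISO-8859-1", "US-ASCII", "UTF-8"); // base list
        registry.contribute("UTF-8", "UTF-16");                 // a locale-specific contribution
        System.out.println(registry.names());
    }
}
```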

Comment 23 Tod Creasey

2004-07-21 13:59:44 EDT

I am going to add this to the Workbench as this seems to be the consensus.

Comment 24 Tod Creasey

2004-07-22 16:17:13 EDT

It is actually going to go in IDE as the Core encoding support is in 
core.resources.

Comment 25 Tod Creasey

2004-08-18 10:06:04 EDT

Released to HEAD. I have created both a WorkbenchEncoding and an IDEEncoding
class for the encoding support at both levels.

Comment 26 Dani Megert

2004-08-27 09:35:32 EDT

I looked at this today and to me it looks as if the list and display strings 
are hard-coded and it's not possible to supply a different list via
extension-point e.g. when installing Eclipse in China or Switzerland. Is the
intended way to configure this by overriding WorkbenchEncoding and IDEEncoding
via fragment or did I miss something?

Comment 27 Tod Creasey

2004-08-27 09:51:06 EDT

We have not added any support for adding via extension point - the only locale 
specific ones you will get is for your current encoding setting - basically 
all we have right now is the 3.0 support in API.

If you think we need the extension point as well then please log a pr to that 
effect.

Comment 28 Dani Megert

2004-08-27 10:22:54 EDT

And how is it NLSed?
There are already bugs targeting in that direction (e.g. bug 21195)

Comment 29 Tod Creasey

2004-11-02 15:06:36 EST

Marking verified as this now has an extension point in M3