Bug 68270 - [Preferences] Encoding of property files
Status: VERIFIED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: UI
Version: 3.0
Hardware: PC Windows 2000
Importance: P3 normal
Target Milestone: 3.1 M6
Assignee: Kim Horne CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Duplicates: 93491
Depends on:
Blocks: 82986
Reported: 2004-06-23 03:16 EDT by Radim CLA
Modified: 2006-12-28 04:11 EST
CC List: 4 users

See Also:


Attachments
org.eclipse.core.runtime preferences (66 bytes, text/plain)
2005-02-17 18:05 EST, Rafael Chaves CLA

Description Radim CLA 2004-06-23 03:16:55 EDT
If I change the text encoding for the whole folder, the encoding for the
property files (with extension .properties) still remains "determined from
content: ISO-8859-1". This is a problem because we use the property files that
have different encoding (Cp1250). We have made our own subclass of
java.util.Properties, which can work with files in different encodings than
ISO-8859-1. 

The only workaround for this is to set the encoding for each of our property
files separately, which is really annoying. We didn't have to do this in
previous versions of Eclipse.

Radim
Comment 1 Rafael Chaves CLA 2004-06-23 11:43:03 EDT
Java property files are ISO-8859-1. If you use a different encoding, they are
not Java property files. The problem is that there is a clash between two
different types of files (your custom property file type and Java properties as
defined by JDT/Core) that seem to share the same file name specification.

The ideal solution would be to prevent the clash by using a different file
extension, or a more specific file name (mycustom.properties), and declare a
content type (sub-type of org.eclipse.core.runtime.text) for that file
extension/file name that does not define a default encoding.

A more extreme one (and likely to cause incompatibilities with other tools that
assume Java properties are ISO-8859-1) would be to provide your custom content
type that inherits from the existing one
(org.eclipse.jdt.core.javaProperties, or if you don't require JDT,
org.eclipse.core.runtime.properties):

<extension point="org.eclipse.core.runtime.contentTypes">
	<content-type id="customProperties" name="MyCustomProperties" 
		base-type="org.eclipse.core.runtime.properties"
		default-charset=""/>
</extension>   

If you had some kind of signature in your property files that allowed you to
programmatically distinguish between your custom properties and standard Java
properties, it would be better: you could provide a content describer in your
content type that would have an opportunity to evaluate whether the file
contents corresponded to your custom content type. This would then be a good
solution as well.
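
To make the signature idea concrete, here is a minimal sketch of the detection
logic only, in plain Java. It assumes a hypothetical convention in which a
custom property file starts with a comment line such as "# charset=Cp1250";
the marker, the class name and the method name are all made up for
illustration. Wiring this into an actual Eclipse content describer is not
shown.

```java
import java.nio.charset.StandardCharsets;

final class SignatureSketch {
    // Hypothetical marker; nothing in Eclipse or the JDK defines this convention.
    static final String MARKER = "# charset=";

    /** Returns the charset name declared on the first line, or null if there is no marker. */
    static String declaredCharset(byte[] fileBytes) {
        // Find the end of the first line (LF or CR terminates it).
        int end = 0;
        while (end < fileBytes.length && fileBytes[end] != '\n' && fileBytes[end] != '\r') {
            end++;
        }
        // The marker itself is pure ASCII, so decoding the first line as ASCII is enough.
        String firstLine = new String(fileBytes, 0, end, StandardCharsets.US_ASCII);
        if (!firstLine.startsWith(MARKER)) {
            return null;
        }
        String name = firstLine.substring(MARKER.length()).trim();
        return name.isEmpty() ? null : name;
    }
}
```

A file without the marker would fall through to the regular Java properties
handling, which is what keeps the two file types apart.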
Comment 2 Rafael Chaves CLA 2004-06-23 16:53:53 EDT
Marking as invalid. Please reopen if the workaround doesn't work for you.
Comment 3 Radim CLA 2004-06-28 01:55:18 EDT
No, Java property files don't have to be ISO-8859-1 if you override the 
java.util.Properties class to accept different encodings. I agree that the 
DEFAULT encoding for the property files is ISO-8859-1, but the setting of 
encoding for the whole folder exists to be able to override the default 
encoding of the files in the whole directory, so the user should not need to 
set the encoding for each file individually. But this doesn't work, I am 
still forced to set the encoding for every one of the files individually. 
Comment 4 Rafael Chaves CLA 2004-06-28 10:50:30 EDT
The workspace/project/folder settings are only used if an encoding can not be
determined from the file contents/file type (not the other way around). So this
is working as intended.
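
The precedence described above can be sketched as a simple fallback chain.
This is a simplified model based only on what this report describes (an
explicit per-file setting, then the content/file type, then the folder
setting, then the workspace default), not Eclipse's actual implementation;
the class, method and parameter names are hypothetical.

```java
final class EncodingPrecedenceSketch {
    /**
     * Picks the first encoding that is known, in order of precedence.
     * A null argument means "not set / could not be determined".
     */
    static String effectiveEncoding(String fileSetting,
                                    String fromContentType,
                                    String folderSetting,
                                    String workspaceDefault) {
        if (fileSetting != null) {
            return fileSetting;       // explicit per-file setting
        }
        if (fromContentType != null) {
            return fromContentType;   // determined from file contents / file type
        }
        if (folderSetting != null) {
            return folderSetting;     // only consulted when the type gives no answer
        }
        return workspaceDefault;
    }
}
```

With a .properties file the content type already answers ISO-8859-1, so a
folder setting of Cp1250 is never reached; only the per-file setting wins.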

For Java property files (those managed by java.util.Properties), the encoding is
always ISO-8859-1. See spec for that class:

http://java.sun.com/j2se/1.4.2/docs/api/java/util/Properties.html

Of course, you can have a subclass of Java properties that breaks the spec and
does whatever you want, even changing the format of the file (e.g. using XML).
But the files being manipulated are *not* Java property files (ask yourself: can
the regular Properties class read files written by your custom class without
errors/loss of data due to wrong encoding?).
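
A small sketch of that question, assuming a file that contains the single
entry greeting=<c with caron> saved as Cp1250, where that letter (U+010D) is
the byte 0xE8. The class and method names are made up for illustration. The
regular Properties class decodes the byte stream as ISO-8859-1, so it reads
0xE8 as U+00E8 (e with grave accent) and the original letter is lost.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Properties;

final class Latin1LoadDemo {
    /** Bytes of "greeting=" followed by 0xE8, which is U+010D in Cp1250. */
    static byte[] cp1250Sample() {
        byte[] prefix = "greeting=".getBytes(StandardCharsets.US_ASCII);
        byte[] bytes = Arrays.copyOf(prefix, prefix.length + 1);
        bytes[prefix.length] = (byte) 0xE8;
        return bytes;
    }

    /** Loads the bytes the way a plain java.util.Properties does: as ISO-8859-1. */
    static String loadGreeting(byte[] fileBytes) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(fileBytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("greeting");
    }
}
```

No exception is raised; the value is silently wrong, which is the loss of data
referred to above.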

Have you tried any of the workarounds? Didn't they work?
Comment 5 Radim CLA 2004-06-29 09:12:27 EDT
I am sorry but I don't understand the workaround, because I don't use PDE. Could
you please describe in more detail where I should put:

<extension point="org.eclipse.core.runtime.contentTypes">
	<content-type id="customProperties" name="MyCustomProperties" 
		base-type="org.eclipse.core.runtime.properties"
		default-charset=""/>
</extension>   

I know that officially the Java property files SHOULD be ISO-8859-1. But in
practice, it is not as easy as it sounds. For example, if you send a property
file to a customer for internationalization, the customer may not know the
native2ascii utility, so he usually sends the file back in the encoding that he
uses. The same applies when you write your own language texts - you don't
usually convert them using native2ascii after each save. Another example is
what I was talking about before - the developer might want to develop a custom
implementation of java.util.Properties. 
Anyway, files with the .properties extension don't have to be Java property
files, and I think that Eclipse is developed to be a universal IDE, not just a
Java IDE. 
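
For readers who have not used native2ascii: the conversion step it performs
can be sketched roughly as below. This is an illustration of the idea only,
not the tool's actual code, and the class and method names are hypothetical.
Every character outside ASCII is replaced by a \uXXXX escape, so the file
becomes valid ISO-8859-1 and the regular Properties class restores the
original characters when loading.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class AsciiEscapeSketch {
    /** Replaces every non-ASCII character with its backslash-u escape. */
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04X", (int) c));
            }
        }
        return out.toString();
    }

    /** Writes key=value in escaped form and reads it back with a plain Properties. */
    static String roundTrip(String key, String value) {
        String line = key + "=" + escapeNonAscii(value) + "\n";
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(line.getBytes(StandardCharsets.ISO_8859_1)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }
}
```

The sketch ignores values that already contain backslashes or other
characters that are special in property files.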

Radim
Comment 6 Rafael Chaves CLA 2004-07-08 14:02:36 EDT
Agreed. 

One workaround would be to redefine the default encoding for the Java properties
content type to be the one you want. But due to bug 68894, that does not work.

Another workaround (which I already mentioned) would be to use a different file
name pattern (such as *.myproperties or my.properties), and provide a custom
content type associated with this file pattern. I understand that this might not
be something you will be willing to do, but thought I would mention it. On the
other hand, it makes sense to use a special or different file name than the one
conventionally associated with Java properties, since they are not properly
related (your properties implementation does not qualify as a Java properties
sub-type, since the superclass contract has been broken).
Comment 7 Rafael Chaves CLA 2004-12-16 13:01:27 EST
This looks like a use case for project-specific content type preferences. Users
(like Radim) might want to redefine the default encoding for Java properties,
but they probably won't want to do that for the whole workspace.

The case of .properties files completely unrelated to Java properties is a
different issue, and could be handled by providing another text-based content
type also associated to the .properties extension and make it the default for a
given project nature (see bug 69640). 
Comment 8 Rafael Chaves CLA 2005-02-17 17:57:31 EST
Could not make it happen for M5. Upgrading milestone.

One of the possible solutions for Radim's problem is to change the default
charset for the Java properties content type, now that bug 68894 is fixed.
However, we don't have UI for that yet. As a workaround only, Radim (if this is
still a problem for you), you could manually change the preferences in your
workspace, and that should work. I will explain how to do that shortly.
Comment 9 Rafael Chaves CLA 2005-02-17 18:05:11 EST
Created attachment 18072
org.eclipse.core.runtime preferences

Just save this file at:

<workspace>\.metadata\.plugins\org.eclipse.core.runtime\.settings\

And start Eclipse. The default encoding for your properties files (and
project preferences) should now be Cp1250. Note this does not work for Eclipse
3.0 (but should work in 3.1/3.0.1).
Comment 10 Rafael Chaves CLA 2005-03-23 14:40:25 EST
As of today's i-build, Eclipse has a new preference page that allows end users
to change the default encoding for a content type (General > Editors > Content
types). You can then change the default encoding for properties files.
Comment 11 Rafael Chaves CLA 2005-03-23 14:41:17 EST
Hurrah!
Comment 12 Kim Horne CLA 2005-03-30 11:10:33 EST
Verified existence of preference page in I20050330-0500
Comment 13 Rafael Chaves CLA 2005-05-04 11:39:57 EDT
*** Bug 93491 has been marked as a duplicate of this bug. ***
Comment 14 Shachar Iphraimov CLA 2006-12-27 14:00:51 EST
I have the same problem on 3.2. All my files are in UTF-8, but when the IDE
restarts I can't write Hebrew: it says the text cannot be mapped to the cp1251
encoding, even though every encoding is UTF-8. I have checked all the settings
and they are all UTF-8.
What should I do?
Comment 15 Dani Megert CLA 2006-12-28 04:11:52 EST
Either the file, project or workspace encoding is wrong (set to cp1251 instead of UTF-8).