Bug 21161

Summary: [Encoding] Character encoding preferences per file type.
Product: [Eclipse Project] Platform
Reporter: Andreas Krüger <andreas.krueger>
Component: UI
Assignee: Nick Edgar <n.a.edgar>
Status: RESOLVED DUPLICATE
QA Contact:
Severity: enhancement
Priority: P3
CC: kai-uwe_maetzel, Mike_Wilson
Version: 2.0   
Target Milestone: ---   
Hardware: PC   
OS: Windows 2000   
Whiteboard:

Description Andreas Krüger CLA 2002-07-01 06:32:16 EDT
I would like to use ISO-8859-1 for my .java files and UTF-8 for my .xml files.

I would like to enter this preference of mine through
Window / Preferences / Workbench / File Associations.

Workaround: Open an editor on a particular file and use Edit / Encoding.

This is as good as such workarounds sometimes are, namely, "not very":

 * I have to do this again after every new "open" on the file.

 * Every other team member has to do it again and again, too,
   as nothing is forwarded through the repository, nor do any
   team standards help.

 * There's also bug 21160.
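
To make the request above concrete, here is a rough sketch of a per-file-type encoding lookup, the kind of thing a File Associations preference page could store. The class FileTypeEncodings and its methods are hypothetical names made up for illustration; they are not part of the Eclipse API, and the sketch uses a modern JDK rather than the one Eclipse 2.0 runs on.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper (not an Eclipse class): maps file extensions to encodings. */
final class FileTypeEncodings {
    private final Map<String, Charset> byExtension = new HashMap<>();
    private final Charset fallback;

    /** @param fallback encoding used when no per-type preference is set */
    FileTypeEncodings(Charset fallback) {
        this.fallback = fallback;
    }

    /** Registers an encoding for an extension given without the dot, e.g. "java". */
    void set(String extension, Charset charset) {
        byExtension.put(extension.toLowerCase(Locale.ROOT), charset);
    }

    /** Returns the encoding for a file name, or the fallback if none is registered. */
    Charset encodingFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return fallback; // no extension at all
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return byExtension.getOrDefault(ext, fallback);
    }
}
```

With set("java", StandardCharsets.ISO_8859_1) and set("xml", StandardCharsets.UTF_8), every open of a .java or .xml file would pick up the right encoding without a trip through Edit / Encoding. Sharing the table through the repository is a separate problem that this sketch does not address.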
Comment 1 Nick Edgar CLA 2002-07-04 14:22:22 EDT
Need to consider improved encoding support post 2.0.
Comment 2 Mike Wilson CLA 2002-07-04 15:58:32 EDT
Agree that the current story is um... minimal. To do this properly, we should 
associate an encoding with each resource. There are potentially significant 
performance issues with doing this however, so we may have to fall back to a 
simpler strategy (like the one described in this PR).

We should begin the discussion early this time.
Comment 3 Bob Foster CLA 2002-08-09 23:02:17 EDT
The current encoding support (the parts of it that work) is completely
inadequate for XML files. It must be possible to set the encoding on a
per-resource basis and to dynamically detect the encoding from file contents.
This behavior is required by the standard and certainly every user has the right
to expect it of their editor. Worrying that it might cost extra to detect
encoding from file contents is simply not productive.

However, I can suggest a strategy that will reduce the cost. Implement a
rewindable input stream (that buffers the initial contents read up to some
limit). Provide that for encoding detection (or other types of file content
sniffing) and as the IEditorInput's input stream. This will minimize file and
potentially network I/O at the cost of a small buffer which, since InputStreams
don't hang around that long, is negligible.
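
A minimal sketch of the rewindable stream idea, assuming the JDK's BufferedInputStream with mark/reset is an acceptable stand-in for a purpose-built class. RewindableInput, SNIFF_LIMIT and peek are hypothetical names, and the 4096-byte limit is an arbitrary choice for illustration.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

/** Hypothetical helper: lets a sniffer read the head of a stream, then rewinds it. */
final class RewindableInput {
    /** Assumed upper bound on how many bytes a sniffer may look at. */
    static final int SNIFF_LIMIT = 4096;

    /** Wraps the raw stream and marks its start so it can be rewound later. */
    static InputStream rewindable(InputStream raw) {
        InputStream in = new BufferedInputStream(raw, SNIFF_LIMIT);
        in.mark(SNIFF_LIMIT);
        return in;
    }

    /**
     * Reads up to max bytes (capped at SNIFF_LIMIT) and then resets the stream
     * to the mark. Expects a stream from rewindable() that has not been read yet.
     */
    static byte[] peek(InputStream in, int max) throws IOException {
        int want = Math.min(max, SNIFF_LIMIT);
        byte[] buf = new byte[want];
        int total = 0;
        while (total < want) {
            int n = in.read(buf, total, want - total);
            if (n < 0) {
                break; // end of stream before the limit
            }
            total += n;
        }
        in.reset(); // the editor later reads from the beginning again
        return Arrays.copyOf(buf, total);
    }
}
```

The underlying file or network stream is read only once. The bytes handed to the sniffer are served again from the buffer when the same stream is passed on as the editor input's contents.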

For my own XML editor, I allow the user to decide, on a workbench or resource
basis, whether to use the default workbench encoding or specify an alternate
default encoding for XML files, and orthogonally whether to auto-detect. (The
correct default encodings for text and XML files are often different, as XML
requires the default encoding to be UTF-8.)
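
For illustration, auto-detection along these lines might look roughly like the sketch below. It is a simplification, not the full procedure from the XML specification: it checks only the UTF-8 and UTF-16 byte order marks and an ASCII-readable XML declaration, then falls back to UTF-8. XmlEncodingSniffer and detect are hypothetical names, not code from any actual editor.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical, simplified XML encoding sniffer (BOM, then declaration, then UTF-8). */
final class XmlEncodingSniffer {
    // Matches encoding="..." inside an XML declaration at the very start of the data.
    private static final Pattern DECL = Pattern.compile(
            "^<\\?xml[^>]*?\\bencoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    /** @param head the first bytes of the file, e.g. from a buffered, rewindable stream */
    static Charset detect(byte[] head) {
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) return StandardCharsets.UTF_8;
        if (startsWith(head, 0xFE, 0xFF)) return StandardCharsets.UTF_16BE;
        if (startsWith(head, 0xFF, 0xFE)) return StandardCharsets.UTF_16LE; // ignores UTF-32LE

        // ISO-8859-1 maps every byte to one char, so an ASCII declaration survives intact.
        String text = new String(head, StandardCharsets.ISO_8859_1);
        Matcher m = DECL.matcher(text);
        if (m.find() && Charset.isSupported(m.group(1))) {
            return Charset.forName(m.group(1));
        }
        return StandardCharsets.UTF_8; // XML's default when nothing else is declared
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }
}
```

A user-specified alternate default would replace the final UTF-8 fallback, and switching auto-detect off would skip detect() entirely. Those two preferences stay independent of each other.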
Comment 4 Kevin Haaland CLA 2002-09-03 14:14:15 EDT

*** This bug has been marked as a duplicate of 5399 ***