Bug 66898 - refactor-rename: encoding is not preserved
Summary: refactor-rename: encoding is not preserved
Status: RESOLVED FIXED
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Core (show other bugs)
Version: 3.0   Edit
Hardware: PC All
: P3 normal (vote)
Target Milestone: 3.0 RC3   Edit
Assignee: Frederic Fusier CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-06-14 06:13 EDT by Tom Hofmann CLA
Modified: 2005-09-28 06:49 EDT (History)
2 users (show)

See Also:


Attachments
Fix for both move and rename issue (1.97 KB, patch)
2004-06-15 16:52 EDT, Frederic Fusier CLA
no flags Details | Diff
Complete workaround for this problem (4.17 KB, patch)
2004-06-18 11:45 EDT, Frederic Fusier CLA
no flags Details | Diff
Additional test added to EncodingTests (8.36 KB, patch)
2004-06-18 11:49 EDT, Frederic Fusier CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Tom Hofmann CLA 2004-06-14 06:13:16 EDT
RC2

1. have this folder structure:

src // src folder
|- encoding // package - encoding set to UTF-8
    |- LATIN1Class.java // java source file, encoding: ISO-8859-1 (Latin1)

Containing this code:

====================

package encoding;

public class LATIN1Class {
    /* (non-Javadoc)
     * @see java.lang.Object#toString()
     */
    public String toString ( )
    {
        // TODO Auto-generated method stub
        return "הצ�: " + הצ�();
    }

    /**
     * @return
     */
    private String הצ� ( )
    {
        // TODO Auto-generated method stub
        return "הצ�";
    }
}
===========================

2. Refactor -> Rename the class to LATIN1Class2

Expected: class is renamed, keeps encoding ISO-8859-1, file content the same but
for the class name
Actual: class is renamed, keeps encoding ISO-8859-1, file content changed to
UTF-8 encoding.

Note:
- When Refactor->Copying a class the file gets properly copied and saved in the
folder's encoding (UTF-8).
- Rename is probably a special case of 'Move': a new file is created, with the
inherited encoding. When renaming, the encoding should be kept.
Comment 1 Dirk Baeumer CLA 2004-06-15 06:06:55 EDT
JDT/UI calls ICompilationUnit.rename to do the actual rename. Moving to 
JDT/Core.

Refactor->Copy uses a different change. So this isn't related.
Comment 2 Frederic Fusier CLA 2004-06-15 09:45:31 EDT
I cannot reproduce your problem... My experience is that reftacor->rename is ok
but problem occurs while copying the class...

Here's the fully described scenario for Linux-GTK:
1- Create project b66898
2- Create package encoding
3- Add following line to org.eclipse.core.resources.prefs file in .settings
   directory using Resource perspective: "encoding//encoding=ISO-8859-1"
4- Outside eclipse, copy existing LATIN1Class.java to <wksp dir>/b66898/encoding
5- Refresh project => class is displayed in package explorer
6- Edit file => I can see הצ�'s and class has no error
   (encoding for this file: "Default (inherited from container: ISO-8859-1)")
7- Refactor->Rename class to "Test" => class is renamed and still have no error!
8- Copy class and paste to "CopyOfTest" => I cannot see הצ�'s in CopyOfTest and
class has 3 compile errors (one on each הצ�...)

Back to JDT/UI as problem is located while writing file and JDT/Core does not
write source file (only .class)...

Note that similar problem occurs also on WinXP, you just have to change specific
and default encodings...
Comment 3 Tom Hofmann CLA 2004-06-15 10:13:49 EDT
The steps in comment 2 describe a different setting: a Latin1 file within a
Latin1 folder.

Comment 0 says that the folder has UTF-8 encoding, but the file's encoding set
to Latin1. Renaming then picks up the container's encoding instead of keeping
the original one.
Comment 4 Dirk Baeumer CLA 2004-06-15 10:31:51 EDT
Frederic,

the copy case is covered by the bug 66479. JDT/Core does write source files. 
The method ICompilationUnit#rename changes the class name inside the CU as 
well (e.g. changes "class LATIN1Class" two "class LATIN2Class") and updates 
constructor names if there are any. I double checked and JDT/UI isn't touching 
the content of the file in this scenario.

May be bug 66480 can give some more insights. The problem seems to be 
different on Sun and IBM VM. I tested this using IBM VM 

java version "1.4.1"
J9 - VM for the Java(TM) platform (build 2.1)
IBM J9SE VM (build 2.1, J2RE 1.4.1 IBM J9 build 20040510 (JIT enabled))

Moving back to JDT/Core to invetigate if the class name & constructor update 
can caused the problem.

Comment 5 Frederic Fusier CLA 2004-06-15 13:39:24 EDT
Dirk,
You're right. For rename and move the problem comes from JDT/Core...
We're losing encoding while executing RenameResourceElementsOperation...
Comment 6 Frederic Fusier CLA 2004-06-15 16:52:59 EDT
Created attachment 12191 [details]
Fix for both move and rename issue
Comment 7 Philipe Mulet CLA 2004-06-16 05:40:15 EDT
I see we were using the target encoding instead of the source one.
Comment 8 Philipe Mulet CLA 2004-06-16 05:50:41 EDT
Dirk approved fixing it for RC3.
Change reviewed, pls release it for integration.
Comment 9 Frederic Fusier CLA 2004-06-16 08:03:26 EDT
Fixed.

Now rename and move operation keep file encoding.

[jdt-core-internal]
Change done in method processCompilationUnitResource(ICompilationUnit,
IPackageFragment) of CopyResourceElementsOperation.
Test case added to EncodingTests
Comment 10 Philipe Mulet CLA 2004-06-17 06:54:11 EDT
We should undo this fix once bug 67606 is addressed.
Comment 11 Frederic Fusier CLA 2004-06-17 07:26:48 EDT
Reopen as we still need to do something here for 3.0:
1) Undo fix if bug 67606 is fixed
2) explicitely set encoding for destination file if bug is not fixed
Comment 12 Frederic Fusier CLA 2004-06-17 14:06:07 EDT
Warning, in case of 2), we had to use getCharset(false) in order to know whether
source file has a specific encoding (returned value would be not null) and then
set charset only in this case...
 
Comment 13 Philipe Mulet CLA 2004-06-18 06:19:40 EDT
We should also check that the charset is available then immediately for 
further queries in subsequent actions of batched operation. Imagine that 
someone has wrapped in a runnable a copy operation (now made consistent) and a 
file read, would the copied resource read ok ? i.e. would the charset we 
hammered be taken into account at this stage, or is its update also queued 
until notification has occurred ?
Comment 14 Philipe Mulet CLA 2004-06-18 10:31:14 EDT
We should release the full workaround, and may reconsider it for a subsequent 
service release, when platform issue is solved.

Please enter a separate defect for remembering to reconsider it, and make this 
new one dependent on platform bug.
Comment 15 Frederic Fusier CLA 2004-06-18 11:02:48 EDT
I've opened bug 67823 to address bug 67606 dependency...
Comment 16 Frederic Fusier CLA 2004-06-18 11:45:46 EDT
Created attachment 12488 [details]
Complete workaround for this problem

1) Get encoding from source file using getCharset(false) => is null if no
specific encoding was set on file.
2) Set encoding on destination file only if there's one specific on source file

3) Back to initial implementation: get encoding from destination file and write
contents using this encoding
As 1)+2) is done before 3), we're sure that destination file encoding will be
ok. Furthermore, it will be easier to remove it (ie. fix bug 67823) when bug
67606 will be fixed...
Comment 17 Frederic Fusier CLA 2004-06-18 11:49:03 EDT
Created attachment 12490 [details]
Additional test added to EncodingTests

This additional test verify that setCharset is applied immediately by executing
copy/move/rename operations in a runnable...
Comment 18 Philipe Mulet CLA 2004-06-18 11:52:42 EDT
Reviewed change. Pls release.
Comment 19 Frederic Fusier CLA 2004-06-18 12:44:18 EDT
Fixed and released in HEAD.
Comment 20 Frederic Fusier CLA 2005-09-28 06:49:16 EDT
*** Bug 110576 has been marked as a duplicate of this bug. ***