Bug 198621 - CVS uses the wrong encoding when reading/writing to "Entries" (dual-boot system)
Status: ASSIGNED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: CVS
Version: 3.2.2
Hardware: PC Linux
Importance: P3 major with 1 vote
Target Milestone: ---
Assignee: platform-cvs-inbox CLA
QA Contact: Szymon Brandys CLA
URL:
Whiteboard:
Keywords: helpwanted
Depends on:
Blocks:
 
Reported: 2007-08-02 03:22 EDT by Mads Stavang CLA
Modified: 2019-09-06 16:03 EDT
CC List: 3 users

See Also:


Attachments
SynchFileWriter patch (using project-encoding in read/write) (1.58 KB, patch)
2007-08-06 08:01 EDT, Mads Stavang CLA
Patch (2.28 KB, patch)
2008-04-16 09:26 EDT, Tomasz Zarna CLA
mylyn/context/zip (1.24 KB, application/octet-stream)
2008-04-16 09:26 EDT, Tomasz Zarna CLA

Description Mads Stavang CLA 2007-08-02 03:22:58 EDT
This problem occurs when the CVS metadata files (Entries, Repository, Root) are written under Windows using cp-1252 and later read under Linux using UTF-8. If Entries contains a filename with non-ASCII characters (for example testäöü.txt) and the file was committed under Windows, the item testäöü.txt is added to "Entries" encoded as cp-1252. Under Linux the same CVS file is read and written as UTF-8, so synchronizing the project under Linux produces a conflict with the repository, because test?.txt (? = garbage character) does not exist.
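
The mismatch described above can be reproduced outside Eclipse. The following is a minimal sketch in plain Java, not code from the CVS plugin; the class name EntriesEncodingDemo and the helper writeThenRead are hypothetical names chosen for illustration. It encodes the example filename as cp-1252 (Java charset name windows-1252) and decodes the resulting bytes as UTF-8.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it only uses the JDK, not any Eclipse or CVS API.
final class EntriesEncodingDemo {

    /** Encodes text with one charset and decodes the resulting bytes with another. */
    static String writeThenRead(String text, Charset writtenWith, Charset readWith) {
        byte[] onDisk = text.getBytes(writtenWith);
        return new String(onDisk, readWith);
    }

    public static void main(String[] args) {
        Charset cp1252 = Charset.forName("windows-1252");
        // The example filename from the report, spelled with Unicode escapes
        // so that the encoding of this source file does not matter.
        String name = "test\u00e4\u00f6\u00fc.txt";

        String sameCharset = writeThenRead(name, cp1252, cp1252);
        String mismatched = writeThenRead(name, cp1252, StandardCharsets.UTF_8);

        System.out.println("same charset preserves name: " + name.equals(sameCharset));
        System.out.println("cp-1252 read as UTF-8 preserves name: " + name.equals(mismatched));
    }
}
```

The bytes for the three umlauts are not valid UTF-8, so the decoder substitutes replacement characters and the decoded name no longer matches the file on disk, which is consistent with the conflict reported here.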

Setting the text file encoding = ISO-8859-1/CP-1252 in Preferences->General->Workspace does not affect which encoding the CVS plugin uses when handling the CVS files.

Changing the Locale of Linux (from UTF-8 to cp-1252) does not work, because then the filenames are not correctly interpreted.

System description:

OS 1:
Windows XP Home
Eclipse 3.2.2
Volume: NTFS

OS 2:
Linux Suse 10.2
Eclipse 3.2.2
Volume: NTFS, mounted with ntfs-3g, UTF-8
Comment 1 Michael Valenta CLA 2007-08-02 13:29:04 EDT
This should be a simple fix in SyncFileWriter, so we'll try to get to it for 3.4. However, it would be great if someone who already has a good test case set up could provide a patch. If you're interested, I can tell you where you would need to make the change and how to get the proper encoding.
Comment 2 Mads Stavang CLA 2007-08-06 03:18:08 EDT
I've just checked out org.eclipse.team.cvs.core, and if I'm correct, only writeLinesToStreamAndClose(OutputStream os, String[] contents) and readLines(IFile file) need to be changed. If you can show me how to get the encoding, it should be easy to make the appropriate changes.
Comment 3 Mads Stavang CLA 2007-08-06 08:01:44 EDT
Created attachment 75421
SynchFileWriter patch (using project-encoding in read/write)

Altered the following methods to include Workspace-Encoding (ResourcesPlugin.getEncoding()):
private static String[] readLines(IFile file) throws CVSException
private static void writeLinesToStreamAndClose(OutputStream os, String[] contents) throws CVSException
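
The attached patch itself is not reproduced in this report. As a rough illustration of the idea only, the sketch below shows what reading and writing lines with an explicit charset can look like in plain Java. It is not the patch: the class name SyncFileIoSketch and the extra Charset parameter are assumptions, readLines takes an InputStream instead of an IFile, and UncheckedIOException stands in for CVSException so that the example needs no Eclipse classes.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the two methods named above; JDK only.
final class SyncFileIoSketch {

    /** Writes each line followed by a newline using the given charset, then closes the stream. */
    static void writeLinesToStreamAndClose(OutputStream os, String[] contents, Charset charset) {
        try (BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(os, charset))) {
            for (String line : contents) {
                writer.write(line);
                writer.write('\n');
            }
        } catch (IOException e) {
            // The real methods throw CVSException; this sketch uses an unchecked wrapper.
            throw new UncheckedIOException(e);
        }
    }

    /** Reads all lines from the stream, decoding with the given charset, then closes it. */
    static String[] readLines(InputStream is, Charset charset) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(is, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines.toArray(new String[0]);
    }
}
```

The point of passing the charset explicitly is that reader and writer no longer depend on the platform default, which is what differs between the two operating systems in this setup.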
Comment 4 Mads Stavang CLA 2007-08-06 08:31:53 EDT
Comment to Comment #3 (Bugzilla needs some time to get used to ;-) )

The patched SyncFileWriter uses the encoding returned by ResourcesPlugin.getEncoding(), which I believe is the same as the one defined in Preferences->General->Workspace->Text file encoding (correct me if I'm wrong). If this value is altered, the CVS plugin is not properly refreshed when using "Synchronize with repository". I had to restart Eclipse to see the change take effect.

A side effect of this change is that some users may experience the merge conflict from the bug description after an Eclipse upgrade if they use an encoding different from the OS locale, for example if a Windows user has configured Eclipse to use UTF-8. The solution is to commit everything before the Eclipse upgrade, and to replace with latest from head/version/branch when merge errors occur after the upgrade.

If the user changes the encoding at some point, a merge conflict will also occur. The solution is the same as above: commit before changing the encoding, and replace all after the change.

Perhaps a constant encoding should be used for CVS, instead of one that is project- or OS-dependent? Only the CVS plugin reads the affected files anyway, and this kind of conflict would then occur at most once in the lifetime of a workspace/project.
Comment 5 Michael Valenta CLA 2007-08-08 16:40:41 EDT
If you are using the encoding returned from ResourcesPlugin.getEncoding(), you would need to add a listener to ResourcesPlugin.getPlugin().getPluginPreferences() that flushes all the cached syncInfo when the preference changes. This could be done using the EclipseSynchronizer.flush(IContainer, ...) method with the workspace root as the resource (ResourcesPlugin.getWorkspace().getRoot()). This should fix the refresh problem.
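
The listener idea can be modeled without Eclipse. The following is only a sketch of the pattern (flush a cache when an encoding preference changes), not the plugin's code: EncodingPreference and SyncInfoCache are hypothetical stand-ins for the plugin preferences and for the cached syncInfo, and none of their methods correspond to real Eclipse API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical stand-in for a preference store that notifies listeners on change.
final class EncodingPreference {
    private String encoding;
    private final List<Consumer<String>> listeners = new ArrayList<>();

    EncodingPreference(String initial) {
        this.encoding = initial;
    }

    String get() {
        return encoding;
    }

    void addListener(Consumer<String> listener) {
        listeners.add(listener);
    }

    /** Stores the new value and notifies listeners only if it actually changed. */
    void set(String newEncoding) {
        if (newEncoding.equals(encoding)) {
            return;
        }
        encoding = newEncoding;
        for (Consumer<String> listener : listeners) {
            listener.accept(newEncoding);
        }
    }
}

// Hypothetical stand-in for cached sync info, keyed by folder path.
final class SyncInfoCache {
    private final Map<String, String[]> entriesByFolder = new HashMap<>();

    SyncInfoCache(EncodingPreference preference) {
        // Anything decoded with the old encoding is stale, so drop the whole cache.
        preference.addListener(newEncoding -> flushAll());
    }

    void put(String folder, String[] entries) {
        entriesByFolder.put(folder, entries);
    }

    boolean isCached(String folder) {
        return entriesByFolder.containsKey(folder);
    }

    void flushAll() {
        entriesByFolder.clear();
    }
}
```

With this shape, a changed preference forces the next synchronize to re-read the metadata files with the new encoding, so no restart is needed.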

Also, you may want to use the encoding that is specified on the project or even on the file itself. Have a look at the IContainer#getDefaultCharset(boolean) method and the IFile#getCharset(boolean) method. This would allow you to use the encodings that are associated with the project, folder or file involved. If you change the encoding of a resource (e.g. using the resource Properties page), a delta is generated, so you don't need to worry about refreshes. However, you would still need to react to the global change.
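
To illustrate the lookup order suggested here (file first, then enclosing folders and project, then the global setting), the sketch below models it in plain Java. It is a simplified model under stated assumptions, not the Eclipse implementation: CharsetLookupSketch, setCharset and charsetFor are hypothetical names, and resources are represented by slash-separated path strings.

```java
import java.nio.charset.Charset;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of per-resource encodings with inheritance from parents.
final class CharsetLookupSketch {
    private final Map<String, Charset> explicit = new HashMap<>();
    private final Charset workspaceDefault;

    CharsetLookupSketch(Charset workspaceDefault) {
        this.workspaceDefault = workspaceDefault;
    }

    /** Associates an explicit charset with a file, folder or project path. */
    void setCharset(String path, Charset charset) {
        explicit.put(path, charset);
    }

    /**
     * Returns the charset set on the path itself, otherwise the one set on the
     * nearest enclosing path, otherwise the workspace default.
     */
    Charset charsetFor(String path) {
        String current = path;
        while (true) {
            Charset charset = explicit.get(current);
            if (charset != null) {
                return charset;
            }
            int slash = current.lastIndexOf('/');
            if (slash <= 0) {
                // No parent left to inspect: fall back to the global setting.
                return workspaceDefault;
            }
            current = current.substring(0, slash);
        }
    }
}
```

For example, with UTF-8 as the workspace default and ISO-8859-1 set on /project, a lookup for /project/src/b.txt yields ISO-8859-1, while a path under a different project falls back to UTF-8.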

Comment 6 Tomasz Zarna CLA 2008-04-16 09:24:56 EDT
I can take a look at the patch during 3.5. Mads, are you still willing to work on it? If not, I can give it a try and apply Michael's suggestions. But please let me know what your status is, as I don't want to get in your way.
Comment 7 Tomasz Zarna CLA 2008-04-16 09:26:03 EDT
Created attachment 96252
Patch

Some of Michael's suggestions applied.
Comment 8 Tomasz Zarna CLA 2008-04-16 09:26:07 EDT
Created attachment 96253
mylyn/context/zip
Comment 9 Szymon Brandys CLA 2009-05-13 04:21:17 EDT
Removing 3.5 target milestone. We are in the end-game now. Please let me know if you think this should be targeted at 3.6.
Comment 10 Eclipse Webmaster CLA 2019-09-06 16:03:54 EDT
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet.

If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.