Bug 285922

Summary: [encoding] Unable to save almost unmodified file if it contained invalid characters when loaded
Product: [Eclipse Project] Platform
Reporter: Michael Schierl <schierlm>
Component: Text
Assignee: Platform-Text-Inbox <platform-text-inbox>
Status: RESOLVED DUPLICATE
QA Contact:
Severity: normal
Priority: P3
CC: daniel_megert
Version: 3.5   
Target Milestone: ---   
Hardware: PC   
OS: Windows XP   
Whiteboard:

Description Michael Schierl CLA 2009-08-06 14:13:01 EDT
Build ID: I20090611-1540

Steps To Reproduce:
1. Take any Eclipse project that includes at least one text file. For example, create a new Java project and add a Java class to it.
2. Start Eclipse on a Russian version of Windows - or alternatively set the default charset of the project to Cp1251.
3. Open a file from your project.
4. Add a comment (at a hidden position, if the file is large) that includes the Cyrillic Ќ character (Unicode U+040C). Add some more Cyrillic characters if you like.
5. Close the file (save it), and close the project.
6. Start Eclipse on a Western (European or American) Windows - or set the project default charset to Cp1252.
7. Open the file to which you just added that comment (the Cyrillic characters look like garbage, but hey, they are comments anyway).
8. Edit the file: add some methods or anything else, but don't touch the Cyrillic stuff.
9. Try to save the file, which will fail with the following error:

java.nio.charset.UnmappableCharacterException: Input length = 1
at java.nio.charset.CoderResult.throwException(Unknown Source)
at java.nio.charset.CharsetEncoder.encode(Unknown Source)
at org.eclipse.core.internal.filebuffers.ResourceTextFileBuffer.commitFileBufferContent(ResourceTextFileBuffer.java:365)
at org.eclipse.core.internal.filebuffers.ResourceFileBuffer.commit(ResourceFileBuffer.java:325)
at org.eclipse.jdt.internal.corext.refactoring.changes.AbstractDeleteChange.saveFileIfNeeded(AbstractDeleteChange.java:47)
at org.eclipse.jdt.internal.corext.refactoring.changes.DeleteSourceManipulationChange.saveCUnitIfNeeded(DeleteSourceManipulationChange.java:133)
at org.eclipse.jdt.internal.corext.refactoring.changes.DeleteSourceManipulationChange.doDelete(DeleteSourceManipulationChange.java:99)
at org.eclipse.jdt.internal.corext.refactoring.changes.AbstractDeleteChange.perform(AbstractDeleteChange.java:36)
at org.eclipse.ltk.core.refactoring.CompositeChange.perform(CompositeChange.java:278)
at org.eclipse.jdt.internal.corext.refactoring.changes.DynamicValidationStateChange.access$0(DynamicValidationStateChange.java:1)
at org.eclipse.jdt.internal.corext.refactoring.changes.DynamicValidationStateChange$1.run(DynamicValidationStateChange.java:98)
at org.eclipse.jdt.internal.core.BatchOperation.executeOperation(BatchOperation.java:39)
at org.eclipse.jdt.internal.core.JavaModelOperation.run(JavaModelOperation.java:728)
at org.eclipse.core.internal.resources.Workspace.run(Workspace.java:1800)
at org.eclipse.jdt.core.JavaCore.run(JavaCore.java:4694)
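
The same exception can probably be reproduced outside Eclipse with plain java.nio. The following is only a minimal sketch, not the Eclipse file buffer code: the class EncodingRoundTrip and its load/save helpers are hypothetical names, and it assumes the JRE ships the windows-1251 and windows-1252 charsets. It writes U+040C as Cp1251, reads the bytes back leniently as Cp1252 (so the unmapped byte becomes the replacement character U+FFFD), and then fails when encoding strictly on save.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.UnmappableCharacterException;

class EncodingRoundTrip {

    /** Lenient load (assumption: editors decode like this); bad bytes become U+FFFD. */
    static String load(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    /** Strict save: unmappable characters are reported by throwing instead of being replaced. */
    static byte[] save(String text, Charset charset) throws CharacterCodingException {
        ByteBuffer out = charset.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .encode(CharBuffer.wrap(text));
        byte[] result = new byte[out.remaining()];
        out.get(result);
        return result;
    }

    public static void main(String[] args) throws Exception {
        Charset cp1251 = Charset.forName("windows-1251");
        Charset cp1252 = Charset.forName("windows-1252");

        // Steps 2-5: the comment is saved with the Cyrillic charset.
        byte[] onDisk = save("// \u040C", cp1251);

        // Steps 6-7: the file is opened with the Western charset.
        String loaded = load(onDisk, cp1252);

        // Steps 8-9: saving the (otherwise untouched) text fails.
        try {
            save(loaded + "\nvoid foo() {}", cp1252);
            System.out.println("saved without error");
        } catch (UnmappableCharacterException e) {
            System.out.println("save failed: " + e);
        }
    }
}
```

If this sketch matches what the editor does, the problem is not the Cyrillic letter itself but the replacement character that was substituted for it at load time, which Cp1252 cannot encode.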

Workaround: Copy the whole file content to the clipboard, close the file, change its file encoding to UTF-8, open it again and paste the content. That way you can at least save your changes, but the Cyrillic comments are lost anyway.

Expected behaviour: It should tell me *where* in that large file the unencodable characters are (so that I can remove the Cyrillic comment). In addition, it would be nice if I received a warning when loading the file that saving it again will result in information loss (so that I can change the file encoding at that point and reopen the file with Cp1251 - or ISO-8859-1).
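
To illustrate the first request, locating the offending characters looks cheap with CharsetEncoder.canEncode. This is a rough sketch under my own assumptions, not a proposed patch: UnencodableFinder and find are hypothetical names, and columns are counted in code points, which may differ from what the editor's ruler shows.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.ArrayList;
import java.util.List;

class UnencodableFinder {

    /** Returns "line:column" (both 1-based) for every code point the charset cannot encode. */
    static List<String> find(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        List<String> positions = new ArrayList<>();
        int line = 1;
        int column = 1;
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            int width = Character.charCount(codePoint);
            // Check one code point at a time so surrogate pairs are tested as a unit.
            if (!encoder.canEncode(text.subSequence(i, i + width))) {
                positions.add(line + ":" + column);
            }
            if (codePoint == '\n') {
                line++;
                column = 1;
            } else {
                column++;
            }
            i += width;
        }
        return positions;
    }

    public static void main(String[] args) {
        // U+FFFD stands in for a character that was replaced when the file was loaded.
        String text = "class A {\n  // \uFFFD comment\n}\n";
        System.out.println(find(text, Charset.forName("windows-1252"))); // prints [2:6]
    }
}
```

Running the same check right after loading would also cover the second request: a non-empty result means that saving with the current encoding is going to fail.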

Comment 1 Dani Megert CLA 2009-08-07 03:43:41 EDT
I assume you get a dialog and not a .log entry.

*** This bug has been marked as a duplicate of bug 144422 ***

Comment 2 Michael Schierl CLA 2009-08-10 13:45:08 EDT
Yes, I get a dialog and an entry in the "Log" window with that stack trace inside.

(As you pointed out in bug 285922, this is more a duplicate of 145754, so I'll mark it as such).

*** This bug has been marked as a duplicate of bug 145754 ***