Eclipse 3.2. I had a file whose encoding was set to ISO-8859-1, into which I pasted some text from Word that contained curly double quotes like this one: ”. When I tried to save the file, I got a dialog saying that the file could not be saved because it contained invalid characters. However, it did not tell me which characters were invalid or where they were.
Simply compare it with the previous element in the local history. There are currently no plans to provide more support for this.
This is a big pain. The dialog is not useful: it provides absolutely no insight into how I can fix the problem. Can you highlight the offending characters? Or provide a button such as "Remove characters from different encoding"? This would solve the problem in those cases where the text looks just fine but there's a character somewhere that Eclipse chokes on (and it will not tell you where it is!). The current state is very bad: I had to use another editor to paste the text into my file. The other editor had no complaint whatsoever.
I just checked. The other editor (Emacs) converted the offending characters to Unicode escapes like \u219c etc. I think it would be a much more appealing solution to show a warning along the lines of "All characters from a different encoding will be converted to Unicode escapes" than to have the editor refuse to save my file at all.
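To illustrate the conversion described above, here is a minimal sketch in plain Java. It is not Eclipse code and not what Emacs does internally; the class and method names (`UnicodeEscaper`, `escapeUnencodable`) are made up for this example. It only relies on the standard `CharsetEncoder.canEncode` check.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Eclipse.
class UnicodeEscaper {

    /**
     * Returns a copy of text in which every char that the given charset
     * cannot encode is replaced by a Java-style Unicode escape.
     */
    static String escapeUnencodable(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (encoder.canEncode(c)) {
                out.append(c);
            } else {
                // A lone surrogate is never encodable, so a supplementary
                // character ends up as two consecutive escapes.
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Curly quotes as pasted from a word processor (U+201C and U+201D).
        String pasted = "say \u201chello\u201d";
        System.out.println(escapeUnencodable(pasted, StandardCharsets.ISO_8859_1));
    }
}
```

As noted later in this thread, escapes of this form are meaningful in Java source but not necessarily in other file types such as HTML.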
Get rid of deprecated state.
Any plans to fix this? I still have to use editors other than Eclipse simply to paste text into files.
No plans. Feel free to provide a patch.
*** Bug 217560 has been marked as a duplicate of this bug. ***
The dialog is also not helpful if you want to *keep* the unsaveable characters by changing the file's encoding, because
- Edit > Set Encoding... is disabled, and
- after setting the encoding in Properties > Resource on the file, Save still complains that the editor content is not valid w.r.t. the old encoding.
The only way I found to hammer the content into the file was to select all, cut, save, change the encoding, paste the content back, save.
*** Bug 276987 has been marked as a duplicate of this bug. ***
In addition to offering a button that just replaces bad characters with e.g. '?', we could also add a button "Show in Compare Editor", which opens a compare editor with the original content on one side and the proposed simplifications on the other side. That would allow the user to verify every single change and accept it or fix it manually. Saving the compare editor should reconcile the compare viewer and leave it unsaved (maybe with another error dialog), such that the user can fix the remaining issues.
Yep, bug 261716 discusses using compare for another feature. We "only" have to solve the chicken-and-egg problem: currently compare depends on text.
*** Bug 193769 has been marked as a duplicate of this bug. ***
*** Bug 284069 has been marked as a duplicate of this bug. ***
*** Bug 285922 has been marked as a duplicate of this bug. ***
The compare editor proposed in comment 10 is probably overkill. An easier solution would be a dialog that:
1. solves comment 8, i.e. allows me to save the file with a different encoding (change the encoding of the file or of the whole project)
2. allows me to select the first offending character in the file, so that I can just go back to the editor and fix it in place.
It would also be good if the dialog could tell me the total count of problematic characters, to help me decide how best to fix the problem.
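The "first offending character" and "total count" requested above could be computed roughly as follows. This is only a sketch in plain Java under the assumption that the editor content is available as a `String`; `UnencodableScanner` and `findUnencodable` are hypothetical names, not Eclipse API.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Eclipse.
class UnencodableScanner {

    /** Returns the char offsets of all code points the charset cannot encode. */
    static List<Integer> findUnencodable(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        List<Integer> offsets = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            // Step by code point so a surrogate pair counts as one character.
            int codePoint = text.codePointAt(i);
            int length = Character.charCount(codePoint);
            if (!encoder.canEncode(text.subSequence(i, i + length))) {
                offsets.add(i);
            }
            i += length;
        }
        return offsets;
    }

    public static void main(String[] args) {
        String content = "caf\u00e9 \u201cquoted\u201d";
        List<Integer> bad = findUnencodable(content, StandardCharsets.ISO_8859_1);
        System.out.println("Problematic characters: " + bad.size());
        if (!bad.isEmpty()) {
            System.out.println("First one at offset: " + bad.get(0));
        }
    }
}
```

The first offset is what a dialog could use to select the character in the editor, and the list size gives the total count.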
Changing the encoding of the entire project doesn't make sense unless all existing files in the project get transcoded; otherwise there is a risk of data loss. A "transcode project" wizard would be nice, but I think that's a separate problem.
I also think that most users don't know or understand encodings. Rather than offering arbitrary encodings to choose from, it may make sense for the dialog to offer a fixed "Save as UTF-8" button, since that's known to be a safe choice. But I think that before saving in a different encoding, most users will want to review what's going wrong.
What about this idea, which might solve the compare viewer dependency issues? When "Save" runs into an encoding problem, then...
1.) Editor contents are copied into a temporary buffer
2.) A special "Save" operation replaces all offending characters with a "?" or "\u0123", so the current encoding is not violated in the file on disk
3.) The temporary buffer is copied back into the editor (which becomes dirty)
4.) A dialog is opened: "Some characters could not be saved in the current encoding. Do you want to (a) Review changes, (b) Save as UTF-8"
5.) A compare editor against the saved version (from local history) is opened. Since this is existing functionality, it should be possible by sending a Command, so the dependency problem should be solved
Users can now review and edit the changes. If they just click "Save" again while not all issues are resolved, they can save as UTF-8, which is guaranteed not to lose any data. Once they have successfully saved as UTF-8, they should be able to use Edit > Set Encoding... if they want something other than UTF-8 (this should transcode the file). Am I missing anything?
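The core of steps 2.) and 4.) above (a strict save that fails, a lossy save with '?', and a UTF-8 fallback) can be sketched with the standard `java.nio.charset` API. This is an illustration only, not the Eclipse implementation; `SaveFallback`, `encodeStrict` and `encodeLossy` are invented names, and the buffer, dialog and compare editor steps are left out.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Eclipse.
class SaveFallback {

    /** Encodes text, failing if any character cannot be represented. */
    static byte[] encodeStrict(String text, Charset charset) throws CharacterCodingException {
        ByteBuffer buffer = charset.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .encode(CharBuffer.wrap(text));
        byte[] bytes = new byte[buffer.remaining()];
        buffer.get(bytes);
        return bytes;
    }

    /**
     * Encodes text, substituting the charset's default replacement bytes
     * ('?' for ISO-8859-1) for characters that cannot be represented.
     */
    static byte[] encodeLossy(String text, Charset charset) {
        return text.getBytes(charset);
    }

    public static void main(String[] args) {
        String content = "a\u201db";
        Charset current = StandardCharsets.ISO_8859_1;
        try {
            encodeStrict(content, current);
            System.out.println("Saved without changes");
        } catch (CharacterCodingException e) {
            // Step 2: lossy bytes that are valid in the current encoding.
            byte[] lossy = encodeLossy(content, current);
            // Option (b) of step 4: UTF-8 can represent every character.
            byte[] utf8 = content.getBytes(StandardCharsets.UTF_8);
            System.out.println("Lossy version: " + new String(lossy, current));
            System.out.println("UTF-8 round trip intact: "
                    + new String(utf8, StandardCharsets.UTF_8).equals(content));
        }
    }
}
```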
> Changing encoding of the entire project doesn't make sense, unless all existing files in the project get transcoded, or there is risk of data loss.
There's no immediate data loss when changing the encoding property on the file or project, but a file might no longer be read correctly afterwards, and saving it then might cause damage.
> I also think that most users don't know or understand encodings.
Exactly, and hence they don't run into this issue too often, so writing too much code or feature work around this is overkill. What we need is:
1. a way to go to the problematic characters
2. a way to save the file in a different encoding (UTF-8 being the suggested default)
Bug 285922 was marked a "duplicate" of this bug, but please keep in mind that that bug is not about incorrectly pasted text but about files that, once decoded, cannot be encoded again (because different developers used different default encodings but shared the same files via source control). That means it can happen that you open a file, add a space, and cannot save again, because somewhere else there is an offending comment... A "save as UTF-8" feature is fine (at least it will let me save the file; preferably with an attached UTF-8 BOM so that the file will in all cases be read as UTF-8 later), but a message that tells me when I open the file that it cannot be saved again in this encoding because the content is invalid would still be nice ;-) Saving as \u1234 is fine for Java files, but might be suboptimal for HTML files...
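One way such a "cannot be saved again" check at open time could look is sketched below. It assumes the problem shows up as bytes that are invalid in the file's declared encoding and get decoded to the replacement character U+FFFD, which that encoding then cannot write back; other failure modes may exist. `RoundTripCheck` and `canSaveAgain` are hypothetical names, not Eclipse API.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Eclipse.
class RoundTripCheck {

    /**
     * Returns true if the bytes, once decoded with the given charset,
     * can be encoded back with that same charset.
     */
    static boolean canSaveAgain(byte[] fileBytes, Charset charset) {
        // new String(...) replaces invalid byte sequences with U+FFFD.
        String decoded = new String(fileBytes, charset);
        return charset.newEncoder().canEncode(decoded);
    }

    public static void main(String[] args) {
        // A file written as ISO-8859-1 by one developer...
        byte[] onDisk = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);
        // ...and opened with a different encoding by another.
        System.out.println(canSaveAgain(onDisk, StandardCharsets.ISO_8859_1));
        System.out.println(canSaveAgain(onDisk, StandardCharsets.US_ASCII));
    }
}
```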
> but a message that tells me when I open the file that it cannot be saved again in this encoding
That's bug 145754.
Fixed in HEAD. Available in builds > N20091018-2000.
Verified for 3.6 M3 with I20091026-1442
It has been mentioned in bug 261716 #c23 that a fix for that bug may allow a better fix for this one. I think the idea is to allow the compare plugin to contribute a 'compareOpener' to text, so that text can use it in situations like encoding problems (this bug) or out-of-syncs (that bug). However, I have a suggestion: how about opening a search result that highlights the offending characters? This way the user can fix the wrong chars in the editor itself without opening a new view for that. I think this can be more natural. Eclipse says there are offending chars, you ask which chars, and Eclipse simply highlights them for you. Then you can delete or replace them in the text editor itself, or you can even use the opened search view to do a batch replace.
Search would be another approach. The problem there is that you don't see the diff with what's currently on disk. We could even combine the two.
(In reply to comment #24)
> Search would be another approach. The problem there is that you don't see the diff with what's currently on disk. We could even combine the two.
The whole point is that there's nothing to compare to. You just want to highlight the offending chars. Your changes may include a considerable number of valid chars, which would obscure the invalid ones, so you would still feel lost searching for them. For example, imagine you're refactoring a class and you change many lines, but you accidentally insert an invalid char. If you use compare, the user will see the diff between the original file and all the refactoring that was done without saving, and will have to find the char in this diff. However, doesn't it make much more sense not to search for the char at all? That is, you just tell the user exactly which chars these are. Note: I'm not sure, though, whether one can open a search view for unsaved files. However, as I have explained, I think a search highlight makes much more sense.
I can see the status of this is "Fixed", but what exactly was the fix?
> I can see the status of this is "Fixed", but what exactly was the fix?
The dialog now
- provides a way to go to the problematic characters
- allows saving the file in a different encoding (UTF-8 being the suggested default)