Bug 312249

Summary: Refactorings do not handle non-ASCII characters correctly
Product: [Tools] PTP
Reporter: Michel DEVEL <michel.devel>
Component: Photran.Refactoring Engine
Assignee: Jeffrey Overbey <com-eclipse-dot-org>
Status: RESOLVED FIXED
QA Contact:
Severity: major
Priority: P1
CC: com-eclipse-dot-org
Version: 5.0   
Target Milestone: 6.0   
Hardware: PC   
OS: Linux   
Whiteboard:
Attachments:
  Description                                                               Flags
  .log file of the workspace with traces of cases when refactoring fails    none
  Unicode support patch                                                     g.watson: iplog+

Description Michel DEVEL CLA 2010-05-10 09:09:09 EDT
Created attachment 167699
.log file of the workspace with traces of cases when refactoring fails

Consider the following little program:

program nonASCII_bug

! comment with a non ASCII character: é
do i=1,2
   print *,i
enddo

end

If the non-ASCII character "é" at the end of the comment line is removed, the 'introduce implicit none' refactoring works. If the "é" is present, it fails.
I suspect the same is true of any refactoring (I also tested the "Change Keyword case..." refactoring, which fails in the same way).

The workspace/.metadata/.log file is attached.
Comment 1 Jeffrey Overbey CLA 2010-05-11 21:12:31 EDT
Just reproduced this.

The example works as expected when the file is MacRoman encoded (default on OS X).

It gives "This refactoring does not change any source code" when the file is UTF-16 encoded.

When the file is UTF-8 encoded, it fails with the error recorded in the attached log file (org.eclipse.text.edits.MalformedTreeException: End position lies outside document range).
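
This report does not state the root cause. One plausible explanation for the UTF-8 case, offered here only as an assumption, is that source positions were counted in bytes while Eclipse documents are indexed in characters. The minimal Java sketch below shows how the two counts diverge for the comment line from the test program; the class name OffsetDemo and its methods are hypothetical and are not part of Photran.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not Photran code.
final class OffsetDemo {

    // Length in Java chars (UTF-16 code units), the unit Eclipse documents use for offsets.
    static int charLength(String text) {
        return text.length();
    }

    // Length in bytes when the same text is stored as UTF-8 on disk.
    static int utf8ByteLength(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // How far a byte-based end offset would overshoot the character-based document length.
    static int overrun(String text) {
        return utf8ByteLength(text) - charLength(text);
    }

    public static void main(String[] args) {
        // The comment line from the test program; \u00e9 is the character "é".
        String line = "! comment with a non ASCII character: \u00e9";
        System.out.println("chars:   " + charLength(line));
        System.out.println("bytes:   " + utf8ByteLength(line));
        System.out.println("overrun: " + overrun(line));
    }
}
```

Under this assumption, an edit whose end position is computed from byte counts would point past the end of the document as soon as the file contains a character that UTF-8 encodes in more than one byte, which would be consistent with the "End position lies outside document range" message. For pure ASCII text the two counts agree, which would be consistent with the refactoring working once the "é" is removed.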
Comment 2 Jeffrey Overbey CLA 2010-05-11 22:37:40 EDT
Created attachment 168076
Unicode support patch

The changes to support this were pervasive but not particularly complicated.  Attaching a patch for reference.
Comment 3 Jeffrey Overbey CLA 2010-05-11 22:39:12 EDT
Patch committed to CVS.