Bug 106403

Summary: PublicScanner returns EOF late
Product: [Eclipse Project] JDT
Reporter: Stoney <hjackson>
Component: Core
Assignee: Olivier Thomann <Olivier_Thomann>
Status: VERIFIED FIXED
QA Contact:
Severity: normal
Priority: P3
CC: philippe_mulet
Version: 3.1   
Target Milestone: 3.1.1   
Hardware: PC   
OS: Windows XP   
Whiteboard:
Attachments:
   Description: Proposed fix
   Flags: none

Description Stoney 2005-08-08 16:10:54 EDT
PROBLEM

A whitespace-tokenizing PublicScanner returns a whitespace token when the
whitespace starts one position after the end position specified by resetTo(). It
should return EOF. Below is some test code that tickles the bug, along with its
output. I ran it inside a HelloWorld plug-in. I also give a workaround for
clients and a possible, untested fix.

CODE

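import org.eclipse.jdt.core.ToolFactory;
import org.eclipse.jdt.core.compiler.IScanner;
import org.eclipse.jdt.core.compiler.ITerminalSymbols;
import org.eclipse.jdt.core.compiler.InvalidInputException;

public void testScanner() throws InvalidInputException
{
   // Arguments: tokenizeComments, tokenizeWhiteSpaces, recordLineSeparator,
   // source level, compliance level.
   IScanner s = ToolFactory.createScanner (true, true, true, "1.5", "1.5");
   System.err.println("IScanner is " + s.getClass());
   char[] source = {';', ' '};
   s.setSource (source);
   // Restrict scanning to position 0 only (the end position is inclusive).
   s.resetTo (0, 0);
   s.getNextToken ();
   System.err.println("token is ;? "
      + ";".equals (new String (s.getRawTokenSource ())));
   // The space at position 1 lies past the end position, so EOF is expected.
   int t = s.getNextToken ();
   System.err.println("token is EOF? "
      + (ITerminalSymbols.TokenNameEOF == t));
}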

OUTPUT

IScanner is class org.eclipse.jdt.internal.core.util.PublicScanner
token is ;? true
token is EOF? false

WORKAROUND

Clients need to stop scanning either at EOF or at the first token whose start
position is after end. For example:

   int token;
   IScanner scanner =
      ToolFactory.createScanner (true, true, true, "1.5", "1.5");
   scanner.setSource (source);
   scanner.resetTo (start, end);
   token = scanner.getNextToken ();
   while (token != ITerminalSymbols.TokenNameEOF
          && scanner.getCurrentTokenStartPosition () <= end)
   {
      // ...
      token = scanner.getNextToken ();
   }

POSSIBLE, UNTRIED FIX

In PublicScanner.getNextToken(), just before consuming whitespace, check for EOF.

   ...
   int whiteStart = 0;
   try {
      // FIX?
      if (this.currentPosition >= this.eofPosition) {
         return TokenNameEOF;
      }
      // END FIX?
      while (true) { //loop for jumping over comments
         this.withoutUnicodePtr = 0;
         ...
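
To make the effect of the early EOF check easier to see without an Eclipse
setup, here is a small self-contained toy model. It is not the JDT scanner:
the class ToyScanner, its token constants, and the checkEofFirst flag are all
made up for this sketch, and only the resetTo()/getNextToken() shape and the
currentPosition/eofPosition names mirror the report. Without the guard, the toy
stops only at the physical end of the array, so it hands back the whitespace
that starts one position after end. With the guard, it returns EOF instead.

```java
/** Toy model of the reported behavior; NOT the JDT scanner implementation. */
final class ToyScanner {
    // Hypothetical token kinds, for illustration only.
    static final int EOF = 0, WHITESPACE = 1, OTHER = 2;

    private final char[] source;
    private final boolean checkEofFirst; // true = apply the guard suggested in the report
    private int currentPosition;
    private int eofPosition;             // first position outside the requested range
    private int startPosition;

    ToyScanner(char[] source, boolean checkEofFirst) {
        this.source = source;
        this.checkEofFirst = checkEofFirst;
        resetTo(0, source.length - 1);
    }

    /** Restricts scanning to [begin, end]; end is inclusive. */
    void resetTo(int begin, int end) {
        currentPosition = begin;
        startPosition = begin;
        eofPosition = Math.min(end + 1, source.length);
    }

    int getNextToken() {
        // The guard: report EOF before looking at any character past the range.
        if (checkEofFirst && currentPosition >= eofPosition) {
            return EOF;
        }
        startPosition = currentPosition;
        // Without the guard, only the physical end of the input stops the scan.
        if (currentPosition >= source.length) {
            return EOF;
        }
        char c = source[currentPosition++];
        if (!Character.isWhitespace(c)) {
            return OTHER; // every non-whitespace character is its own token here
        }
        // Consume the rest of the whitespace run.
        int limit = checkEofFirst ? eofPosition : source.length;
        while (currentPosition < limit && Character.isWhitespace(source[currentPosition])) {
            currentPosition++;
        }
        return WHITESPACE;
    }

    int getCurrentTokenStartPosition() {
        return startPosition;
    }

    public static void main(String[] args) {
        for (boolean guard : new boolean[] {false, true}) {
            ToyScanner s = new ToyScanner(new char[] {';', ' '}, guard);
            s.resetTo(0, 0);
            s.getNextToken(); // the ';' at position 0
            int t = s.getNextToken();
            System.err.println("guard=" + guard + ", token is EOF? " + (t == EOF));
        }
    }
}
```

Running main prints "token is EOF? false" for the unguarded toy and "token is
EOF? true" for the guarded one, which matches the OUTPUT section above and the
behavior the report asks for.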
Comment 1 Olivier Thomann 2005-08-08 16:13:03 EDT
This could be a candidate for 3.1.1

Comment 2 Philipe Mulet 2005-08-09 08:52:41 EDT
If not causing harm to any client tests (UI, refactoring, etc...) then +1 for 3.1.1.

Comment 3 Olivier Thomann 2005-08-09 14:56:56 EDT
Created attachment 25922
Proposed fix

Comment 4 Olivier Thomann 2005-08-10 11:07:47 EDT
This fix passed all existing tests including jdt/ui and refactoring.

Comment 5 Philipe Mulet 2005-08-10 12:41:06 EDT
+1 for 3.1.1

Comment 6 Philipe Mulet 2005-08-10 12:41:29 EDT
Remember to update both the master and public scanners.

Comment 7 Olivier Thomann 2005-08-10 14:40:01 EDT
Fixed and released in 3.1 maintenance stream.
Regression tests added in
org.eclipse.jdt.core.tests.compiler.regression.ScannerTest.test036/test041.

Will be released in HEAD post 3.2M1.

Comment 8 Stoney 2005-08-11 09:30:47 EDT
You guys are great!

Comment 9 Olivier Thomann 2005-08-11 11:52:38 EDT
Fixed and released in HEAD.
Same regression tests as 3.1.1.

Comment 10 David Audel 2005-09-21 07:21:50 EDT
Verified in I20050920-0010 for 3.2M2

Comment 11 David Audel 2005-09-26 12:12:43 EDT
Verified using M20050923-1430 for 3.1.1