Bug 88849

Summary: Infinite loop in scanner when using eof=Integer.MAX_VALUE
Product: [Eclipse Project] JDT
Reporter: Philipe Mulet <philippe_mulet>
Component: Core
Assignee: Olivier Thomann <Olivier_Thomann>
Status: VERIFIED FIXED
QA Contact:
Severity: normal    
Priority: P3    
Version: 3.1   
Target Milestone: 3.1 M6   
Hardware: PC   
OS: Windows XP   
Whiteboard:

Description Philipe Mulet CLA 2005-03-23 09:11:54 EST
Build 20050322

The following snippet does not complete. The problem is in Scanner, which loops
forever because it never gets beyond EOF_POSITION (Integer.MAX_VALUE).


import java.io.*;
import org.eclipse.jdt.core.compiler.InvalidInputException;
import org.eclipse.jdt.internal.compiler.parser.Scanner;
import org.eclipse.jdt.internal.compiler.parser.TerminalTokens;
import org.eclipse.jdt.internal.compiler.util.Util;

public class Scan { 

    public static void main(String[] args) {
        
        try{
            char[] content = Util.getFileCharContent(
                new File("d:/eclipse/workspaces/dev3.1/plugins/org.eclipse.jdt.core/compiler/org/eclipse/jdt/internal/compiler/parser/Parser.java"),
                null);
            Scanner scanner = new Scanner();
            scanner.setSource(content);
            
            long start = System.currentTimeMillis();
            int tokenCount = 0;
            for (int i = 0; i < 800; i++ ) {
//                scanner.resetTo(0, content.length);
                scanner.resetTo(0, Integer.MAX_VALUE);
                tokenize: while (true) {
                    int token = scanner.getNextToken();
                    switch (token) {
                        case TerminalTokens.TokenNameEOF :
                            break tokenize;
                    }
                    tokenCount++;
                }
            }
            long duration = System.currentTimeMillis() - start;
            System.out.print(tokenCount + " tokens read in " + duration + " ms");
            System.out.println(", " + ((tokenCount / duration) / 1000.00) + " Mtokens/s");
            
        } catch(InvalidInputException e) {
            e.printStackTrace();
        } catch(IOException e) {
            e.printStackTrace();
        }
    }
}
Comment 1 Olivier Thomann CLA 2005-03-23 11:13:43 EST
I suggest that internally the eof position is set to source.length if the
specified end is bigger than source.length.
Comment 2 Olivier Thomann CLA 2005-03-23 11:57:09 EST
I propose the following patch.

Index: Scanner.java
===================================================================
RCS file: /home/eclipse/org.eclipse.jdt.core/compiler/org/eclipse/jdt/internal/compiler/parser/Scanner.java,v
retrieving revision 1.140
diff -u -r1.140 Scanner.java
--- Scanner.java	23 Mar 2005 14:36:51 -0000	1.140
+++ Scanner.java	23 Mar 2005 16:55:03 -0000
@@ -2301,6 +2301,9 @@
 	this.diet = false;
 	this.initialPosition = this.startPosition = this.currentPosition = begin;
 	this.eofPosition = end < Integer.MAX_VALUE ? end + 1 : end;
+	if (this.source != null && this.source.length < this.eofPosition) {
+		this.eofPosition = this.source.length;
+	}
 	this.commentPtr = -1; // reset comment stack
 	this.foundTaskCount = 0;
 	
Comment 3 Olivier Thomann CLA 2005-03-23 12:01:19 EST
Even better:
Index: Scanner.java
===================================================================
RCS file: /home/eclipse/org.eclipse.jdt.core/compiler/org/eclipse/jdt/internal/compiler/parser/Scanner.java,v
retrieving revision 1.140
diff -u -r1.140 Scanner.java
--- Scanner.java	23 Mar 2005 14:36:51 -0000	1.140
+++ Scanner.java	23 Mar 2005 16:59:12 -0000
@@ -2300,7 +2300,11 @@
 
 	this.diet = false;
 	this.initialPosition = this.startPosition = this.currentPosition = begin;
-	this.eofPosition = end < Integer.MAX_VALUE ? end + 1 : end;
+	if (this.source != null && this.source.length < end) {
+		this.eofPosition = this.source.length;
+	} else {
+		this.eofPosition = end < Integer.MAX_VALUE ? end + 1 : end;
+	}
 	this.commentPtr = -1; // reset comment stack
 	this.foundTaskCount = 0;
 	
Comment 4 Olivier Thomann CLA 2005-03-23 14:16:16 EST
I released this patch.
This fixes this issue. When no source is set, the eof position will be left as
is. When the source is set, the eof position is updated accordingly.
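
For readers who want to check the clamping rule outside the compiler, here is a
minimal standalone sketch of the eof computation from the patch in comment 3.
The class EofClamp and the method computeEofPosition are hypothetical names
used only for illustration; they are not part of Scanner or any JDT API, and
the real resetTo also resets other scanner state not shown here.

```java
public class EofClamp {

    // Hypothetical helper mirroring the patched eofPosition assignment:
    // clamp to source.length when a source is set and 'end' lies beyond it,
    // otherwise keep the previous "end + 1, unless that would overflow" rule.
    static int computeEofPosition(char[] source, int end) {
        if (source != null && source.length < end) {
            return source.length;
        }
        return end < Integer.MAX_VALUE ? end + 1 : end;
    }

    public static void main(String[] args) {
        char[] source = "int x;".toCharArray(); // 6 characters

        // 'end' far beyond the source: clamped to the source length.
        System.out.println(computeEofPosition(source, Integer.MAX_VALUE)); // 6

        // 'end' inside the source: unchanged behavior, end + 1.
        System.out.println(computeEofPosition(source, 3)); // 4

        // No source set: the eof position is left as requested.
        System.out.println(computeEofPosition(null, Integer.MAX_VALUE)); // 2147483647
    }
}
```

With this rule, the reproducer's resetTo(0, Integer.MAX_VALUE) call would end
up with an eof position equal to the source length, since setSource is called
before resetTo.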
Comment 5 Olivier Thomann CLA 2005-03-30 23:39:02 EST
Verified in 20050330-0500