Bug 294529 - The Scanner sometimes ignores the given offset if larger than the EOF.
Status: VERIFIED FIXED
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Core
Version: 3.6
Hardware: PC Windows XP
Importance: P3 normal
Target Milestone: 3.6 M4
Assignee: Olivier Thomann
QA Contact:
URL:
Whiteboard:
Keywords:
Duplicates: 300162
Depends on:
Blocks:
 
Reported: 2009-11-07 03:43 EST by Dieter Kleinrath
Modified: 2010-02-08 06:47 EST

See Also:


Attachments
a simple fix for the public API (986 bytes, text/plain)
2009-11-07 03:55 EST, Dieter Kleinrath
Proposed fix + regression tests (16.68 KB, patch)
2009-11-12 14:51 EST, Olivier Thomann

Description Dieter Kleinrath 2009-11-07 03:43:19 EST
User-Agent:       Opera/9.26 (Windows NT 5.1; U; de)
Build Identifier: 20090619-0625

The class org.eclipse.jdt.internal.compiler.parser.Scanner sometimes ignores the end offset passed to scanner.resetTo(start, offset). This happens because the scanner in places relies on catching IndexOutOfBoundsException instead of testing whether the EOF position has actually been reached.

Reproducible: Always

Steps to Reproduce:
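import org.eclipse.jdt.core.compiler.InvalidInputException;
import org.eclipse.jdt.internal.compiler.classfmt.ClassFileConstants;
import org.eclipse.jdt.internal.compiler.parser.Scanner;

Scanner javaScanner = new Scanner(true, true, false,
		ClassFileConstants.JDK1_4,
		ClassFileConstants.JDK1_4,
		null, null, true);
javaScanner.setSource("// a comment, longer than the offset".toCharArray());
// Restrict scanning to the first few characters of the comment.
javaScanner.resetTo(0, 5);
try {
	javaScanner.getNextToken();
	// The printed source should not extend past the end offset given above.
	System.out.println(javaScanner.getCurrentIdentifierSource());
} catch (InvalidInputException e) {
	// ignored for this reproduction
}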
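
The cause named in the description can be illustrated with a toy model. The sketch below is not the JDT Scanner; the class EofCheckSketch and both method names are hypothetical, and eofPosition is assumed here to be an exclusive limit. It only contrasts a loop that stops on IndexOutOfBoundsException with one that tests the limit explicitly.

```java
// Toy model only: this is NOT the JDT Scanner, just an illustration of the pattern.
final class EofCheckSketch {

    /** Exception-driven loop: stops only when the array itself runs out. */
    static int scanLineCommentCatching(char[] source, int start, int eofPosition) {
        int pos = start;
        try {
            // eofPosition is deliberately never consulted here,
            // so the loop runs until a newline or the end of the array.
            while (source[pos] != '\n') {
                pos++;
            }
        } catch (IndexOutOfBoundsException e) {
            // reached the physical end of the source array
        }
        return pos - 1; // index of the last character consumed
    }

    /** Bounds-checked loop: stops at the logical limit set by the caller. */
    static int scanLineCommentChecking(char[] source, int start, int eofPosition) {
        // Assumption: eofPosition is exclusive, and never larger than the array.
        int limit = Math.min(eofPosition, source.length);
        int pos = start;
        while (pos < limit && source[pos] != '\n') {
            pos++;
        }
        return pos - 1; // index of the last character consumed
    }

    public static void main(String[] args) {
        char[] source = "// a comment, longer than the offset".toCharArray();
        int eofPosition = 6; // hypothetical exclusive limit
        // Ignores the limit and ends at the last index of the array.
        System.out.println(scanLineCommentCatching(source, 0, eofPosition));
        // Honors the limit and ends at index 5.
        System.out.println(scanLineCommentChecking(source, 0, eofPosition));
    }
}
```

In the first variant the requested limit has no effect whenever the source array is longer than the limit, which matches the symptom reported here.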
Comment 1 Dieter Kleinrath 2009-11-07 03:55:28 EST
Created attachment 151630
a simple fix for the public API
Comment 2 Olivier Thomann 2009-11-07 12:15:31 EST
What do you expect from your code snippet?
Comment 3 Dieter Kleinrath 2009-11-07 13:33:51 EST
(In reply to comment #2)
> What do you expect from your code snippet?

Well, if the scanner has scanned past the EOF position because no IndexOutOfBoundsException was thrown, the attached snippet uses (eofPosition - 1) instead of (currentPosition - 1) as the end position of the current token. This way getCurrentTokenSource() never returns a character array that exceeds the given end offset when using resetTo(start, offset).

When getNextToken() is called again and the current token points past the eofPosition, the scanner returns the expected EOF token.

The snippet also makes sure that getCurrentTokenEndPosition() never returns a token end position past the eofPosition, which I think should be the expected behaviour.

Of course this doesn't really solve the problem, but it makes the methods getCurrentTokenEndPosition() and getCurrentTokenSource() work correctly. At least I think so...
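
The clamping idea described in this comment can be sketched roughly as follows. This is not the attached patch; ClampSketch, clampedTokenEnd and tokenSource are hypothetical names, and eofPosition is again assumed to be an exclusive limit.

```java
// Illustration of the workaround only; it hides the overrun rather than preventing it.
final class ClampSketch {

    /** Last index the token may report: (eofPosition - 1) if the scanner overran, else (currentPosition - 1). */
    static int clampedTokenEnd(int currentPosition, int eofPosition) {
        return Math.min(currentPosition, eofPosition) - 1;
    }

    /** Copies the token text from startPosition up to the clamped end, inclusive. */
    static char[] tokenSource(char[] source, int startPosition, int currentPosition, int eofPosition) {
        int end = clampedTokenEnd(currentPosition, eofPosition);
        int length = Math.max(0, end - startPosition + 1);
        char[] result = new char[length];
        System.arraycopy(source, startPosition, result, 0, length);
        return result;
    }

    public static void main(String[] args) {
        char[] source = "// a comment, longer than the offset".toCharArray();
        // The scanner is assumed to have overrun to the end of the array.
        System.out.println(new String(tokenSource(source, 0, source.length, 6)));
    }
}
```

The reported token text and end position stay within the limit, but the scanner has still consumed characters beyond it, so the token kind can remain wrong.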

Of course, to really solve the problem one would have to always test for the eofPosition instead of catching IndexOutOfBoundsException. But I'm not sure this is worth it, because so far this seems to have caused no problems and I only found it by luck.

Cheers,
Dieter
Comment 4 Dieter Kleinrath 2009-11-07 14:25:56 EST
I just had another quick look at the Scanner, and while my snippet might solve some problems, it will certainly not solve all of them.

Here is another example where my snippet doesn't help at all:

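//CODE START
import org.eclipse.jdt.core.compiler.InvalidInputException;
import org.eclipse.jdt.internal.compiler.classfmt.ClassFileConstants;
import org.eclipse.jdt.internal.compiler.parser.Scanner;
import org.eclipse.jdt.internal.compiler.parser.TerminalTokens;

Scanner javaScanner = new Scanner(
		true, true, false,
		ClassFileConstants.JDK1_4,
		ClassFileConstants.JDK1_4,
		null, null, true);

javaScanner.setSource("/*a comment, longer\n than the\n offset*/".toCharArray());
// The comment's closing delimiter lies far beyond the end offset given here.
javaScanner.resetTo(0, 5);

try {
	int token = javaScanner.getNextToken();
	System.out.println(javaScanner.getCurrentIdentifierSource());
	if (token == TerminalTokens.TokenNameCOMMENT_BLOCK) {
		System.out.println("A comment block");
	}
} catch (InvalidInputException e) {
	// expected here, but not thrown
}
//CODE END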

This code should throw an InvalidInputException because the source does not contain a valid Java token between index 0 and 5. Instead, the scanner returns the token TokenNameCOMMENT_BLOCK because it ignored the eofPosition.
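
The expected behaviour for this case can be modelled in a small standalone sketch. It is not the JDT implementation; BlockCommentSketch, scanBlockComment and UnterminatedCommentException are hypothetical names (the exception stands in for InvalidInputException), and eofPosition is assumed to be an exclusive limit.

```java
// Toy model: a block comment only counts if its closing delimiter lies before the limit.
final class BlockCommentSketch {

    /** Stand-in for InvalidInputException so the sketch needs no JDT classes. */
    static final class UnterminatedCommentException extends Exception {
        UnterminatedCommentException(String message) {
            super(message);
        }
    }

    /**
     * Scans a block comment starting at start and returns the index just past
     * its closing delimiter. Throws if the comment is not closed before the limit.
     */
    static int scanBlockComment(char[] source, int start, int eofPosition)
            throws UnterminatedCommentException {
        int limit = Math.min(eofPosition, source.length);
        if (start + 1 >= limit || source[start] != '/' || source[start + 1] != '*') {
            throw new IllegalArgumentException("no block comment at " + start);
        }
        int pos = start + 2;
        // Both characters of the closing delimiter must lie below the limit.
        while (pos + 1 < limit) {
            if (source[pos] == '*' && source[pos + 1] == '/') {
                return pos + 2;
            }
            pos++;
        }
        throw new UnterminatedCommentException("comment not closed before position " + limit);
    }

    public static void main(String[] args) {
        char[] source = "/*a comment, longer\n than the\n offset*/".toCharArray();
        try {
            scanBlockComment(source, 0, 6);
            System.out.println("A comment block");
        } catch (UnterminatedCommentException e) {
            System.out.println("Invalid input: " + e.getMessage());
        }
    }
}
```

With the limit at 6 the sketch reports invalid input; with the limit at the full source length it accepts the comment.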
Comment 5 Olivier Thomann 2009-11-09 14:38:13 EST
In this case the fix will be to check the EOF position while scanning comments. Your test case in comment 4 clearly shows a bug.
Will be fixed soon.
Comment 6 Olivier Thomann 2009-11-11 14:23:42 EST
This is a bug, but the attached patch only hides it. The scanner should not scan characters past the EOF position.
I'll take a look.
Comment 7 Olivier Thomann 2009-11-12 14:51:51 EST
Created attachment 152098
Proposed fix + regression tests

This should fix the issue.
Comment 8 Olivier Thomann 2009-11-12 15:25:55 EST
Released for 3.6M4.
Regression tests added in ScannerTest:
org.eclipse.jdt.core.tests.compiler.regression.ScannerTest#test047
org.eclipse.jdt.core.tests.compiler.regression.ScannerTest#test048
org.eclipse.jdt.core.tests.compiler.regression.ScannerTest#test049
org.eclipse.jdt.core.tests.compiler.regression.ScannerTest#test050
org.eclipse.jdt.core.tests.compiler.regression.ScannerTest#test051
org.eclipse.jdt.core.tests.compiler.regression.ScannerTest#test052
Comment 9 Srikanth Sankaran 2009-12-08 04:52:46 EST
Verified for 3.6M4 using Build id: I20091207-1800
Comment 10 Srikanth Sankaran 2009-12-08 04:54:27 EST
Verified for 3.6M4 using Build id: I20091207-1800
Comment 11 Frederic Fusier 2010-02-08 06:47:15 EST
*** Bug 300162 has been marked as a duplicate of this bug. ***