294529 – The Scanner sometimes ignores the given offset if larger than the EOF.


Status:	VERIFIED FIXED

Alias:	None

Product:	JDT
Classification:	Eclipse Project
Component:	Core
Version:	3.6
Hardware:	PC Windows XP

Importance:	P3 normal
Target Milestone:	3.6 M4
Assignee:	Olivier Thomann

Duplicates (1):	300162

Reported:	2009-11-07 03:43 EST by Dieter Kleinrath
Modified:	2010-02-08 06:47 EST
CC List:	3 users

Attachments
a simple fix for the public API (986 bytes, text/plain) 2009-11-07 03:55 EST, Dieter Kleinrath
Proposed fix + regression tests (16.68 KB, patch) 2009-11-12 14:51 EST, Olivier Thomann

Description Dieter Kleinrath

2009-11-07 03:43:19 EST

User-Agent:       Opera/9.26 (Windows NT 5.1; U; de)
Build Identifier: 20090619-0625

The class org.eclipse.jdt.internal.compiler.parser.Scanner sometimes ignores the end offset passed to scanner.resetTo(start, offset). This happens because the scanner, in some places, relies on catching an IndexOutOfBoundsException instead of testing whether the EOF position has actually been reached.
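
To illustrate the mechanism, here is a minimal standalone sketch. It is a toy model and not the JDT code: the class and method names are hypothetical, and it assumes that an inclusive end offset of 5 corresponds to an exclusive limit of 6. It only contrasts a loop that ends when the array runs out with a loop that checks the limit explicitly.

```java
// Toy model of the failure mode; none of these names come from JDT.
class EofCheckSketch {

    // Exception-driven variant: the loop ends only when the array runs out,
    // so the logical end (eofPosition) is never consulted.
    static int lastIndexByException(char[] source, int start, int eofPosition) {
        int pos = start;
        try {
            while (source[pos] != '\n') {
                pos++;
            }
        } catch (IndexOutOfBoundsException e) {
            // Physical end of the array reached.
        }
        return pos - 1; // index of the last character consumed
    }

    // Bounds-checked variant: stops at eofPosition (exclusive) or at the
    // end of the array, whichever comes first.
    static int lastIndexByBoundsCheck(char[] source, int start, int eofPosition) {
        int limit = Math.min(eofPosition, source.length);
        int pos = start;
        while (pos < limit && source[pos] != '\n') {
            pos++;
        }
        return pos - 1;
    }

    public static void main(String[] args) {
        char[] source = "// a comment, longer than the offset".toCharArray();
        int eofPosition = 6; // assumption: inclusive end offset 5 maps to exclusive limit 6
        System.out.println(lastIndexByException(source, 0, eofPosition));
        System.out.println(lastIndexByBoundsCheck(source, 0, eofPosition));
    }
}
```

In this sketch the exception-driven variant consumes the whole array and reports its last index, while the bounds-checked variant stops at index 5.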

Reproducible: Always

Steps to Reproduce:
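import org.eclipse.jdt.core.compiler.InvalidInputException;
import org.eclipse.jdt.internal.compiler.classfmt.ClassFileConstants;
import org.eclipse.jdt.internal.compiler.parser.Scanner;

// Tokenize comments and whitespace; source and compliance level 1.4; no task tags.
Scanner javaScanner = new Scanner(true, true, false,
		ClassFileConstants.JDK1_4,
		ClassFileConstants.JDK1_4,
		null, null, true);
javaScanner.setSource("// a comment, longer than the offset".toCharArray());
// Restrict scanning to the range from position 0 to position 5.
javaScanner.resetTo(0, 5);
try {
	javaScanner.getNextToken();
	// With the bug, the printed source can extend beyond position 5.
	System.out.println(javaScanner.getCurrentIdentifierSource());
} catch (InvalidInputException e) {}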

Comment 1 Dieter Kleinrath

2009-11-07 03:55:28 EST

Created attachment 151630
a simple fix for the public API

Comment 2 Olivier Thomann

2009-11-07 12:15:31 EST

What do you expect with your code snippet?

Comment 3 Dieter Kleinrath

2009-11-07 13:33:51 EST

(In reply to comment #2)
> What do you expect with your code snippet?

Well, if the scanner scans past the EOF position because no IndexOutOfBoundsException was thrown, the code snippet will use (eofPosition - 1) instead of (currentPosition - 1) as the end position of the current token. This way getCurrentTokenSource() will never return a character array that exceeds the given end offset when using resetTo(start, offset).

When getNextToken() is called again and the current position points past the eofPosition, the scanner will return the expected EOF token.

The snippet will also make sure that getCurrentTokenEndPosition() never returns a token end position past the eofPosition, which I think should be the expected behaviour.

Of course this doesn't really solve the problem, but it will make the methods getCurrentTokenEndPosition() and getCurrentTokenSource() work correctly. At least I think so...

Of course, to really solve the problem one would have to always test for the eofPosition instead of catching IndexOutOfBoundsExceptions. But I'm not sure this is worth it, because so far this issue seems to have caused no problems and I only found it by luck.
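
The clamping idea described above could look roughly like the following standalone sketch. This is not the attached patch: the helper names are hypothetical, and eofPosition is assumed to be an exclusive limit. It only shows how a reported token end and token source can be capped, without changing how far the scanner actually reads.

```java
// Hypothetical helpers illustrating the clamping workaround; not JDT code.
class TokenEndClampSketch {

    // Reported end of the current token, never past eofPosition - 1.
    static int clampedTokenEnd(int currentPosition, int eofPosition) {
        return Math.min(currentPosition, eofPosition) - 1;
    }

    // Token text from startPosition up to the clamped end (inclusive).
    static char[] clampedTokenSource(char[] source, int startPosition,
            int currentPosition, int eofPosition) {
        int end = Math.min(currentPosition, eofPosition);
        int length = Math.max(0, end - startPosition);
        char[] result = new char[length];
        System.arraycopy(source, startPosition, result, 0, length);
        return result;
    }

    public static void main(String[] args) {
        char[] source = "// a comment, longer than the offset".toCharArray();
        // Assume the scanner overran the limit and stopped at the array end.
        int currentPosition = source.length;
        int eofPosition = 6;
        System.out.println(clampedTokenEnd(currentPosition, eofPosition));
        System.out.println(new String(
                clampedTokenSource(source, 0, currentPosition, eofPosition)));
    }
}
```

The clamp hides the overrun from callers of the two accessors, but the token kind is still decided from characters beyond the limit, which is why it cannot be a complete fix.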

Cheers,
Dieter

Comment 4 Dieter Kleinrath

2009-11-07 14:25:56 EST

I just had another quick look at the Scanner, and while my snippet might solve some problems, it will certainly not solve all of them.

Here is another example where my snippet doesn't help at all:

//CODE START
Scanner javaScanner = new Scanner(
			true, true, false, 
			ClassFileConstants.JDK1_4, 
			ClassFileConstants.JDK1_4,
			null, null, true);

javaScanner.setSource("/*a comment, longer\n than the\n offset*/".toCharArray());
javaScanner.resetTo(0,5);

try {
	int token = javaScanner.getNextToken();
	System.out.println(javaScanner.getCurrentIdentifierSource());
	if (token == TerminalTokens.TokenNameCOMMENT_BLOCK) {
		System.out.println("A comment block");
	}
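} catch (InvalidInputException e) {
	// The outcome argued for below: no valid token between index 0 and index 5.
	System.out.println("InvalidInputException: " + e.getMessage());
}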
//CODE END

This code should throw an InvalidInputException because the source does not contain a valid Java token between index 0 and index 5. Instead, the scanner returns the token TokenNameCOMMENT_BLOCK because it ignored the eofPosition.
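
As a rough standalone sketch of the behaviour argued for here, the following toy method scans a block comment but refuses to look at characters at or beyond an exclusive limit. All names are hypothetical, the exception is an unchecked stand-in for JDT's checked InvalidInputException, and this is not how the Scanner itself is implemented.

```java
// Toy model; the names below are hypothetical and not taken from JDT.
class BlockCommentSketch {

    // Stand-in for JDT's checked InvalidInputException; unchecked here
    // only to keep the sketch short.
    static class UnterminatedCommentException extends RuntimeException {
        UnterminatedCommentException(String message) {
            super(message);
        }
    }

    // Returns the index just past the closing delimiter of the block comment
    // that starts at 'start'. Only characters below eofPosition (exclusive)
    // are examined.
    static int scanBlockComment(char[] source, int start, int eofPosition) {
        int limit = Math.min(eofPosition, source.length);
        if (start + 1 >= limit || source[start] != '/' || source[start + 1] != '*') {
            throw new IllegalArgumentException("no block comment at " + start);
        }
        int pos = start + 2;
        while (pos + 1 < limit) {
            if (source[pos] == '*' && source[pos + 1] == '/') {
                return pos + 2;
            }
            pos++;
        }
        throw new UnterminatedCommentException(
                "comment not closed before position " + limit);
    }

    public static void main(String[] args) {
        char[] source = "/*a comment, longer\n than the\n offset*/".toCharArray();
        // Whole source visible: the comment is accepted.
        System.out.println(scanBlockComment(source, 0, source.length));
        // Only positions 0..5 visible: the comment must be rejected.
        try {
            scanBlockComment(source, 0, 6);
        } catch (UnterminatedCommentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```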

Comment 5 Olivier Thomann

2009-11-09 14:38:13 EST

In this case the fix will be to check the EOF position while scanning comments. Your test case in comment 4 clearly shows a bug.
Will be fixed soon.

Comment 6 Olivier Thomann

2009-11-11 14:23:42 EST

This is a bug, but the attached patch is only hiding it. The scanner should not scan characters past the EOF position.
I'll take a look.

Comment 7 Olivier Thomann

2009-11-12 14:51:51 EST

Created attachment 152098
Proposed fix + regression tests

This should fix the issue.

Comment 8 Olivier Thomann

2009-11-12 15:25:55 EST

Released for 3.6M4.
Regression tests added in ScannerTest:
org.eclipse.jdt.core.tests.compiler.regression.ScannerTest#test047
org.eclipse.jdt.core.tests.compiler.regression.ScannerTest#test048
org.eclipse.jdt.core.tests.compiler.regression.ScannerTest#test049
org.eclipse.jdt.core.tests.compiler.regression.ScannerTest#test050
org.eclipse.jdt.core.tests.compiler.regression.ScannerTest#test051
org.eclipse.jdt.core.tests.compiler.regression.ScannerTest#test052

Comment 9 Srikanth Sankaran

2009-12-08 04:52:46 EST

Verified for 3.6M4 using Build id: I20091207-1800

Comment 10 Srikanth Sankaran

2009-12-08 04:54:27 EST

Verified for 3.6M4 using Build id: I20091207-1800

Comment 11 Frederic Fusier

2010-02-08 06:47:15 EST

*** Bug 300162 has been marked as a duplicate of this bug. ***