Bug 52363 - Unicode character sequence '\u000a'; causes JDT parsing error
Summary: Unicode character sequence '\u000a'; causes JDT parsing error
Status: RESOLVED INVALID
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Core
Version: 2.1.2
Hardware: PC Windows 2000
Importance: P3 minor
Target Milestone: 3.0 M8
Assignee: JDT-Core-Inbox
Reported: 2004-02-18 10:12 EST by Johan Persson
Modified: 2004-02-18 12:35 EST

Description Johan Persson 2004-02-18 10:12:31 EST
The following statement produces a compilation error in JDT, even though it 
looks to me like a syntactically correct statement:

char a = '\u000a';

The same result is obtained for

char a = '\u000A';

Trying to comment the line out by prepending // to it still produces a 
compile-time error, which clearly indicates that something is wrong.


----

Other Unicode character sequences such as

char a = '\u000b';

are accepted as correct.
Comment 1 Olivier Thomann 2004-02-18 12:35:08 EST
\u000a is a line break and this is illegal inside a character constant.
We report:
1. ERROR in C:\tests_sources\X.java (at line 2)
	char a = '\u000a';
	         ^^^^^^^
Invalid character constant

javac 1.4.2 reports:
C:\tests_sources>javac X.java
X.java:2: illegal line end in character literal
        char a = '\u000a';
                 ^
1 error

jikes 1.18 reports:
C:\tests_sources>jikes -classpath c:\jdks\jdk1.4.1_05\jre\lib\rt.jar X.java

Found 2 lexical errors in "C:/tests_sources/X.java":

     2.         char a = '
                         ^
*** Lexical Error: Character constant not properly terminated.


     3. ';
        ^^
*** Lexical Error: Character constant not properly terminated.

Found 1 syntax error in "C:/tests_sources/X.java":

     3. ';
        ^^
*** Syntax: ; expected instead of this token

See http://java.sun.com/docs/books/jls/second_edition/html/lexical.doc.html#100964:


Because Unicode escapes are processed very early, it is not correct to write
'\u000a' for a character literal whose value is linefeed (LF); the Unicode
escape \u000a is transformed into an actual linefeed in translation step 1
(§3.3) and the linefeed becomes a LineTerminator in step 2 (§3.4), and so the
character literal is not valid in step 3. Instead, one should use the escape
sequence '\n' (§3.10.6). Similarly, it is not correct to write '\u000d' for a
character literal whose value is carriage return (CR). Instead, use '\r'.
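
The effect of translation step 1 can be illustrated with a small sketch. This
is not how any of the compilers above are implemented; the class
UnicodeEscapeDemo and the method translateUnicodeEscapes are hypothetical
names, and the regular expression is a simplification that ignores the JLS
rule about backslashes preceded by another backslash. It only shows that,
once the escape is replaced, the statement (commented out or not) spans two
source lines, which matches the two-line jikes report above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodeEscapeDemo {
    // A backslash, one or more 'u' characters, then exactly four hex digits.
    // Simplified: does not check whether the backslash is itself escaped.
    private static final Pattern ESCAPE =
            Pattern.compile("\\\\u+([0-9a-fA-F]{4})");

    // Rough stand-in for translation step 1: replace every Unicode escape
    // in the raw source text with the character it denotes.
    public static String translateUnicodeEscapes(String source) {
        Matcher m = ESCAPE.matcher(source);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char c = (char) Integer.parseInt(m.group(1), 16);
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Number of source lines after translation, splitting on LF only.
    public static int lineCount(String source) {
        return translateUnicodeEscapes(source).split("\n", -1).length;
    }

    public static void main(String[] args) {
        // The doubled backslash keeps the escape as plain text in this string.
        String statement = "char a = '\\u000a';";
        String commentedOut = "// " + statement;
        String verticalTab = "char a = '\\u000b';";

        System.out.println(lineCount(statement));    // 2
        System.out.println(lineCount(commentedOut)); // 2
        System.out.println(lineCount(verticalTab));  // 1

        // The remainder of the commented-out statement lands on a new line,
        // outside the comment, so the compiler still sees it.
        String[] lines = translateUnicodeEscapes(commentedOut).split("\n", -1);
        System.out.println(lines[1]);                // ';
    }
}
```

The second line count is why prepending // does not help: the comment ends at
the linefeed produced by the escape, and the trailing '; is compiled as code.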

So this is invalid code that is properly rejected by the Eclipse compiler.
Close as INVALID.
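
For reference, a minimal sketch of the forms the JLS passage recommends. The
class name LineTerminatorLiterals and its field names are only illustrative.
The escape sequences for LF and CR are accepted because they are left alone
in step 1, and the vertical tab escape from the description is accepted
because U+000B is not a line terminator.

```java
public class LineTerminatorLiterals {
    // Escape sequences are interpreted when the literal itself is read,
    // long after line terminators have been recognized.
    public static final char LF = '\n';
    public static final char CR = '\r';

    // U+000B (vertical tab) does not end a line, so writing it as a
    // Unicode escape inside a character literal is legal.
    public static final char VT = '\u000b';

    public static void main(String[] args) {
        System.out.println((int) LF); // 10
        System.out.println((int) CR); // 13
        System.out.println((int) VT); // 11
    }
}
```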