Bug 126744 - Eclipse Java Compiler accepts string constants with UTF-8 representation larger 64kByte
Status: RESOLVED INVALID
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Core
Version: 3.0.2
Hardware: PC Windows XP
Importance: P3 normal
Target Milestone: 3.2 M5
Assignee: Olivier Thomann CLA
 
Reported: 2006-02-07 11:41 EST by Martin Purkert CLA
Modified: 2006-02-07 21:22 EST (History)

Description Martin Purkert CLA 2006-02-07 11:41:06 EST
Dear Bugteam!

The Eclipse Java Compiler (Eclipse 3.0.2) accepts string constants whose UTF-8 representation exceeds 64 KB.

Since Sun's class file specification (see http://java.sun.com/docs/books/vmspec/html/ClassFile.doc.html) defines the CONSTANT_Utf8_info structure with an unsigned 2-byte length field, string constants whose UTF-8 representation is larger than 65535 bytes should not be possible.
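
To make the limit concrete, here is a minimal sketch that counts the bytes a string would need in the modified UTF-8 encoding used by class files and compares the result with 65535. The class and method names (Utf8LimitCheck, modifiedUtf8Length, fitsInConstantUtf8) are made up for illustration and do not come from either compiler. As a cross-check it also calls java.io.DataOutputStream.writeUTF, which uses the same 2-byte length prefix and throws UTFDataFormatException for longer strings.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;

public class Utf8LimitCheck {
    // Largest value an unsigned 2-byte length field can hold.
    static final int MAX_UTF8_BYTES = 65535;

    // Number of bytes the string needs in modified UTF-8
    // (hypothetical helper, written for this illustration).
    static int modifiedUtf8Length(String s) {
        int length = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 0x0001 && c <= 0x007F) {
                length += 1;            // plain ASCII except NUL
            } else if (c <= 0x07FF) {
                length += 2;            // NUL and U+0080..U+07FF
            } else {
                length += 3;            // everything else, including surrogates
            }
        }
        return length;
    }

    static boolean fitsInConstantUtf8(String s) {
        return modifiedUtf8Length(s) <= MAX_UTF8_BYTES;
    }

    static String repeat(char c, int count) {
        StringBuilder sb = new StringBuilder(count);
        for (int i = 0; i < count; i++) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String atLimit = repeat('a', 65535);
        String overLimit = repeat('a', 65536);
        System.out.println(fitsInConstantUtf8(atLimit));   // true
        System.out.println(fitsInConstantUtf8(overLimit)); // false

        // Cross-check with the JDK: writeUTF has the same 65535-byte limit.
        DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream());
        try {
            out.writeUTF(overLimit);
        } catch (UTFDataFormatException e) {
            System.out.println("writeUTF rejected the string");
        }
    }
}
```

Note that the limit is in encoded bytes, not characters, so a string with many non-ASCII characters can exceed it with far fewer than 65535 characters.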

Sun's Java compiler (JDK 1.4.2_07 in my case) reports a compilation error; the Eclipse compiler does not.

Although the class compiled by Eclipse seems to work correctly, it is unclear whether this behaviour is always correct.

How to reproduce: Simply define a main class and assign a very large string constant (> 64 KB) to a String variable.
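
Typing such a literal by hand is impractical, so one possible way to produce a test source is to generate it. The sketch below is only an assumption about how a reproduction could be built; the names LargeLiteralGenerator and LargeString and the length 70000 are arbitrary choices, not part of the original report.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class LargeLiteralGenerator {
    // Builds the source text of a class whose main method assigns a string
    // literal of literalLength ASCII characters to a local variable.
    static String generateSource(String className, int literalLength) {
        StringBuilder sb = new StringBuilder(literalLength + 256);
        sb.append("public class ").append(className).append(" {\n");
        sb.append("    public static void main(String[] args) {\n");
        sb.append("        String s = \"");
        for (int i = 0; i < literalLength; i++) {
            sb.append('a');
        }
        sb.append("\";\n");
        sb.append("        System.out.println(s.length());\n");
        sb.append("    }\n");
        sb.append("}\n");
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // 70000 ASCII characters need 70000 bytes in UTF-8, more than 65535.
        String source = generateSource("LargeString", 70000);
        Files.write(Paths.get("LargeString.java"),
                source.getBytes(StandardCharsets.US_ASCII));
    }
}
```

Compiling the generated LargeString.java with each compiler should then show the difference described above.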

regards
Martin Purkert
Martin.Purkert@bawagpsk.com
Comment 1 Olivier Thomann CLA 2006-02-07 11:44:56 EST
To be sure that we do reproduce the problem you noticed, could you please provide a test case that fails with javac, but works with Eclipse's compiler?
As soon as I get it, I will investigate.
Thanks.
Comment 2 Olivier Thomann CLA 2006-02-07 19:49:45 EST
Reproduced.
I am investigating.
Comment 3 Olivier Thomann CLA 2006-02-07 21:04:23 EST
The Eclipse compiler is splitting the long string into smaller pieces.
Then we concatenate all the pieces and at the end we call the intern() method.

I don't know whether it is illegal to do that, as long as we generate valid bytecodes and we respect the fact that CONSTANT == CONSTANT returns true, where CONSTANT is the long string.
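
The idea can be sketched at the source level as follows. This is only an illustration of the split, concatenate and intern() approach described above, not the bytecode the Eclipse compiler actually emits; the class name SplitAndIntern, the helper rebuild and the piece length of 8 are hypothetical, and a short literal stands in for the long one.

```java
public class SplitAndIntern {
    // Short stand-in for a literal that would be too long for one pool entry.
    static final String CONSTANT = "short stand-in for a very long literal";

    // Rebuilds the value from pieces of at most pieceLength characters and
    // interns the result, mimicking the approach described in this comment.
    static String rebuild(String value, int pieceLength) {
        StringBuffer buffer = new StringBuffer(value.length());
        for (int start = 0; start < value.length(); start += pieceLength) {
            int end = Math.min(start + pieceLength, value.length());
            buffer.append(value.substring(start, end));
        }
        return buffer.toString().intern();
    }

    public static void main(String[] args) {
        String rebuilt = rebuild(CONSTANT, 8);
        // intern() returns the canonical instance, so identity holds.
        System.out.println(rebuilt == CONSTANT);              // true
        // Without intern(), an equal copy is a different object.
        System.out.println(new String(CONSTANT) == CONSTANT); // false
    }
}
```

Because every use of the long string goes through intern(), two occurrences of the same constant end up referring to the same String object, which is the property that CONSTANT == CONSTANT relies on.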

Do you have an example in the JLS that states this is wrong?
Let me know. I would be very interested in clarifying this issue.

For now I am closing as INVALID since I didn't find anything against the way we treat long strings. Don't hesitate to reopen if you believe this is wrong.

Thanks for your bug report.
Comment 4 Olivier Thomann CLA 2006-02-07 21:22:59 EST
Added regression test org.eclipse.jdt.core.tests.compiler.regression.XLargeTest.test009