Bug 109636 - Comment formatter doesn't support "*/"
Summary: Comment formatter doesn't support "*/"
Status: VERIFIED FIXED
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Core (show other bugs)
Version: 3.2   Edit
Hardware: PC Windows XP
: P3 major (vote)
Target Milestone: 3.3.1   Edit
Assignee: Olivier Thomann CLA
QA Contact:
URL:
Whiteboard:
Keywords: contributed
: 173130 (view as bug list)
Depends on:
Blocks:
 
Reported: 2005-09-15 12:22 EDT by Olivier Thomann CLA
Modified: 2008-09-16 09:46 EDT (History)
4 users (show)

See Also:


Attachments
Proposed fix (2.63 KB, patch)
2007-07-30 22:25 EDT, Olivier Thomann CLA
no flags Details | Diff
Regression test (2.95 KB, patch)
2007-07-30 22:26 EDT, Olivier Thomann CLA
no flags Details | Diff
Second fix (4.74 KB, patch)
2007-08-01 12:50 EDT, Matt McCutchen CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Olivier Thomann CLA 2005-09-15 12:22:33 EDT
Using latest, when */ is reached, it is converted to */ and this is closing
the block comment.
Therefore the remaining part of the comment is considered to be outside the
comment and the code doesn't compile anymore.
A test case is the org.eclipse.jdt.core.dom.Statement class.
Comment 1 Olivier Thomann CLA 2005-09-15 12:37:41 EDT
The bug resides in the HTML2JavaReader. It converts "/*" into "/ *" and
"*/" into " */". This is wrong. Inside a string literal, these characters
should not be converted. And outside, I would suggest to convert them to a
unicode. It should also be remembered that they were converted to convert them
back once the code is formatted. The code formatter should never switch
characters  (like *)  for their corresponding ascii form (*).
Comment 2 Dani Megert CLA 2005-09-15 12:59:03 EDT
Can you clarify:
- afaik HTML2JavaReader has nothing to do with code formatter
- afaik HTML2JavaReader has nothing to do with compiler

Please clarify what exactly the bug is that you report.
Comment 3 Olivier Thomann CLA 2005-09-15 13:18:30 EDT
Let's take an example. Try to format the following javadoc comment inside a cu.
/**
 * <code>
 * <pre>
 * setLeadingComment("/&#42; traditional comment &#42;/");  // correct
 * setLeadingComment("missing comment delimiters");  // wrong
 * setLeadingComment("/&#42; unterminated traditional comment ");  // wrong
 * setLeadingComment("/&#42; broken\n traditional comment &#42;/");  // correct
 * setLeadingComment("// end-of-line comment\n");  // correct
 * setLeadingComment("// end-of-line comment without line terminator");  // correct
 * setLeadingComment("// broken\n end-of-line comment\n");  // wrong
 * </pre>
 * </code>
 */
When the code inside the <pre></pre> is formatted, the &#42; is converted to
'*'. So when the &#42;/ is reached, it is converted to */ and once it is
replaced back into the javadoc comment, the comment ends there instead of its
actual end.
The resulting code doesn't compile anymore.

Try to format the class org.eclipse.jdt.core.dom.Statement. The code is
extracted from there. Also the &#42; is gone in the formatted string. The code
formatter should not replace characters.
Comment 4 Dani Megert CLA 2005-09-16 02:40:51 EDT
I guess your assessment of the bug was made too quick or during a coffee
shortage ;-)

HTML2JavaReader does not exits. I'd say the cause for this is in
HTMLEntity2JavaReader which resides in JDT Core.
Comment 5 Olivier Thomann CLA 2005-09-16 12:56:01 EDT
yes, the bug is in HTMLEntity2JavaReader.
This is part of the code we inherited from JDT/Text.
Comment 6 Olivier Thomann CLA 2007-02-06 13:40:52 EST
*** Bug 173130 has been marked as a duplicate of this bug. ***
Comment 7 Olivier Thomann CLA 2007-02-08 13:48:04 EST
The only way I can see to fix this is to add an extra scanning at the end of the code snippet formatting to replace all ending block/javadoc comment */ back with &#42;/.
Comment 8 Olivier Thomann CLA 2007-07-30 22:25:53 EDT
Created attachment 74987 [details]
Proposed fix
Comment 9 Olivier Thomann CLA 2007-07-30 22:26:17 EDT
Created attachment 74988 [details]
Regression test
Comment 10 Olivier Thomann CLA 2007-07-30 22:27:48 EDT
Jérôme,

I would candidate it for 3.3.1. With this, the JDT/Core source could be successfully formatted.

Released for 3.4M1.
Regression tests added in org.eclipse.jdt.core.tests.formatter.comment.JavaDocTestCase#test109636
Comment 11 Matt McCutchen CLA 2007-08-01 11:11:49 EDT
The fix does not consider that the character after a star may need to be substituted in its own right and thus fails in the two cases shown in the comment below:

/**
 * <pre>
 * /* Comment ending in multiple stars *&#42;/
 * /* Entity-needing character after a star *&lt; &#42;/
 * </pre>
 */

One easy solution is to add a loop to handle a sequence of stars.  When a non-star is encountered, is stored in c, the loop ends, and computeSubstitution proceeds as normal but prefixes its return value with the appropriate number of stars.  A more elegant solution is to add a "peeking" or "unreading" capability to SubstitutionTextReader so that, when computeSubstitution finds a * followed by a character other than /, it can leave the SubstitutionTextReader in a state so that the following character will be read again on the next call to nextChar.  Such a capability could be implemented in SubstitutionTextReader or could use the mark/reset of the underlying reader.
Comment 12 Matt McCutchen CLA 2007-08-01 11:14:53 EDT
I will be happy to write the patch if you tell me which solution you prefer.
Comment 13 Olivier Thomann CLA 2007-08-01 11:45:12 EDT
I would take the easy one for now.
If we need something more elegant in the future, we would change it.

If you can provide a patch today, feel free to attach a patch. Otherwise, I'll write one.

I added your test case as a disabled test in org.eclipse.jdt.core.tests.formatter.comment.JavaDocTestCase#_test109636_2.

Could you please confirm if you provide a patch?

Thanks.
Comment 14 Matt McCutchen CLA 2007-08-01 12:50:46 EDT
Created attachment 75133 [details]
Second fix

This patch does the following:

- Fixes Java2HTMLEntityReader to handle the new cases
- Fixes a bug in SubstitutionTextReader's handling of EOF that was revealed by the first change
- Enables test109636_2
- Fixes a mistake in the expected output of test109636_2

Changing the convention to represent */ as *&#47; instead of &#42;/ would make the code slightly nicer because it wouldn't have to overwrite the star it has already processed.  It also might be nice to use &#042; or &#047; for consistency with &#064; .  I didn't do either of these in this patch.
Comment 15 Olivier Thomann CLA 2007-08-01 14:23:39 EDT
Released for 3.4M1.
Added new regression tests org.eclipse.jdt.core.tests.formatter.comment.JavaDocTestCase#test109636_2
org.eclipse.jdt.core.tests.formatter.comment.JavaDocTestCase#test109636_3
Comment 16 Olivier Thomann CLA 2007-08-01 14:23:59 EDT
Thank you for the patch.
Comment 17 Frederic Fusier CLA 2007-08-07 05:16:18 EDT
Verified for 3.4M1 using build I20070806-1800.
Comment 18 Jerome Lanneluc CLA 2007-08-09 05:14:23 EDT
(In reply to comment #10)
> Jérôme,
> 
> I would candidate it for 3.3.1. With this, the JDT/Core source could be
> successfully formatted.

+1 for backporting to 3.3.1
Comment 19 Olivier Thomann CLA 2007-08-14 11:05:59 EDT
Reopen for 3.3.1.
Comment 20 Olivier Thomann CLA 2007-08-14 11:08:18 EDT
Released for 3.3.1.
Regression tests added in
org.eclipse.jdt.core.tests.formatter.comment.JavaDocTestCase#test109636
org.eclipse.jdt.core.tests.formatter.comment.JavaDocTestCase#test109636_2
org.eclipse.jdt.core.tests.formatter.comment.JavaDocTestCase#test109636_3
Comment 21 Eric Jodet CLA 2007-09-03 06:55:02 EDT
Verified for 3.3.1 using build M20070831-2000