Bug 165525 - [comments] ASTParser excludes trailing line comments from extended range of fields in enums
Summary: [comments] ASTParser excludes trailing line comments from extended range of f...
Status: VERIFIED FIXED
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Core (show other bugs)
Version: 3.3   Edit
Hardware: PC Windows XP
: P3 normal (vote)
Target Milestone: 3.3 M4   Edit
Assignee: Frederic Fusier CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-11-22 16:14 EST by Dmitry Denisov CLA
Modified: 2006-12-12 07:10 EST (History)
2 users (show)

See Also:


Attachments
Test case illustrating the problem (4.44 KB, text/plain)
2006-11-22 16:18 EST, Dmitry Denisov CLA
no flags Details
Test case shows handling comments in enum constants and fields (6.99 KB, text/plain)
2006-11-23 13:47 EST, Dmitry Denisov CLA
no flags Details
Proposed patch (5.45 KB, patch)
2006-11-24 06:27 EST, Frederic Fusier CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Dmitry Denisov CLA 2006-11-22 16:14:31 EST
ASTParser does not include line comment of the last field declaration of enum in the extended range of the declaration.

In the following class, both fields have line comments included in their extended range:

class X
{	
	String a; //$NON-NLS-1$
	String b; //$NON-NLS-2$
}

As expected, field a has extended range "String a; //$NON-NLS-1$", and field b has range "String b; //$NON-NLS-2$".

On the other hand, for enum last field declaration does not include its line comment:

enum X
{	ENUM_CONST_1; // line comment after ENUM_CONST_1
	String a; //$NON-NLS-1$
	String b; //$NON-NLS-2$
}

Field a has extended range "String a; //$NON-NLS-1$", field b only "String b;";


The above difference causes very undesired behaviour of ASTRewrite:

When a new field is added to enum after the last field, the trailing line comment becomes the line comment of the added field:

enum X
{	ENUM_CONST_1; // line comment after ENUM_CONST_1
	String a; //$NON-NLS-1$
	String b;
	int MISSING; //$NON-NLS-2$
}

On the other hand, for class declarations everything works as expected:

class X
{	String a; //$NON-NLS-1$
	String b; //$NON-NLS-2$
	int MISSING;
}

Therefore, if we are using NON-NLS markers (similarly to above), they are being moved to a different declaration when another declaration is added. The original declaration looses its marker.

It is important for EMF Merge tool to preserve the original NON-NLS markers when new declarations are added.

Potential temporary workaround would be to extend the range of such nodes by extending TargetSourceRangeComputer and looking through the source after the end of the node. This workaround is lengthy to implement :) and will create performance overhead as well.
Comment 1 Dmitry Denisov CLA 2006-11-22 16:18:48 EST
Created attachment 54367 [details]
Test case illustrating the problem

ASTParser: read extended ranges for class and enum declarations:

testReadClassFieldDeclarations
testReadEnumFieldDeclarations - fails

ASTRewrite: add new field declaration to class and enum

testAddClassField
testAddEnumField - fails
Comment 2 Olivier Thomann CLA 2006-11-22 20:53:29 EST
I am not really sure that the enum case is similar to the class declaration case.
In an enum declaration the semi-colon is a separator between enum constants and the enum body declarations. It doesn't represent the "end" of the enum declaration.
Frederic, Martin,
any thoughts?
Comment 3 Frederic Fusier CLA 2006-11-23 05:23:57 EST
Sounds to be a bug in DefaultCommentMapper, I'll investigate
Comment 4 Martin Aeschlimann CLA 2006-11-23 05:50:25 EST
I agree with Olivier's comment 2. The semicolon is not part of the enum declaration, but marks the end of all declarations. Mapping the comment to the variable is not possible.
Comment 5 Frederic Fusier CLA 2006-11-23 06:18:26 EST
(In reply to comment #4)
> I agree with Olivier's comment 2. The semicolon is not part of the enum
> declaration, but marks the end of all declarations. Mapping the comment to the
> variable is not possible.
> 
Even if the semicolon has a different meaning for enum constants declaration, I think we should be able to attach comment to each enum constant:
public enum E2 {
	ENUM_CONST_1, // line comment after ENUM_CONST_1
	ENUM_CONST_2; // line comment after ENUM_CONST_2
}
I'll see if this is feasible in DefaultCommentMapper

Also, note that DefaultCommentMapper fails to attach non-nls comment to last field declaration => there's a real problem here...
Comment 6 Dmitry Denisov CLA 2006-11-23 09:49:34 EST
Agreeing with comment 5 regarding comment 4,

the main real problem is failing to attach the line comment of the last field of enum to the field itself (which causes the non-nls marker to move to a new declaration). The semicolon at the end of enum constants should not affect the field declarations.


Thinking about a different problem mentioned in comment 5, 

it would be very desired to have a possibility to keep the non-nls markers of enum constants attached to them. Is it possible to add a constant with non-nls marker and keep existing constants with attached markers? For example, if we have something like this:

public enum E2 {
        ENUM_CONST_1("String constant 1"), //$NON-NLS-1$
        ENUM_CONST_2("String constant 2"); //$NON-NLS-1$
...
}

would be possible to add another constant with its own marker, and get something like this after rewrite?

public enum E2 {
        ENUM_CONST_1("String constant 1"), //$NON-NLS-1$
        ENUM_CONST_1_1("String constant 1.1"), //$NON-NLS-1$
        ENUM_CONST_2("String constant 2"); //$NON-NLS-1$
...
}
Comment 7 Martin Aeschlimann CLA 2006-11-23 10:02:38 EST
Changing this as suggested would break the ast rewriter which expects to find a semicolon after the extended range of the last enum declaration.
A similar example would be to assing the comment to 'arg' in the following example:
  foo(arg0,
      arg1, // comment
      arg2);
Here it would be ok:
  foo(arg0,
      arg1 /* comment */,
      arg2);
Comment 8 Frederic Fusier CLA 2006-11-23 12:25:00 EST
There are 2 problems here:

1) The enum constant declaration node range does not include the semi-colon or the colon at the end of the node. As Martin, said, it's similar to method arguments. DefaultCommentMapper algorithm does expect white space token after the node end to be able to attach trailing comments... As this is not the case (due to the semi-colon or colon), it cannot attach these comments to the constant declaration.

The only workaround to keep non-nls comments to these constants would be to change the colon and semi-colon position:
public enum E2 {
        ENUM_CONST_1("String constant 1") //$NON-NLS-1$
        , ENUM_CONST_2("String constant 2") //$NON-NLS-1$
        ;
...
}
as usage of block comments would lose non-nls comments meaning...

Martin is it possible to modify ASTRewriter to accept node extension including trailing comments? If not, I doubt we can fix this problem...


2) The other issue is about following example:
enum X
{       ENUM_CONST_1; // line comment after ENUM_CONST_1
        String a; //$NON-NLS-1$
        String b; //$NON-NLS-2$
}
and field declaration of 'b' does not include trailing comment.
This one is a different problem of DefaultCommentMapper which could be easily fixed.
Comment 9 Martin Aeschlimann CLA 2006-11-23 12:35:18 EST
The AST rewriter can't be changed in this direction. You can see this on the example of a moved name: A move contains the name and it's comments. This would mean the comma would be moved as well, or the code gets really complicated.
Comment 10 Dmitry Denisov CLA 2006-11-23 13:43:34 EST
Regarding comment 9, I can easily see that it would be very difficult to associate comments after the comma or semicolon with the node.

Using the workaround suggested in comment 8, problem 1, there is the same problem with enum constants as with the last field. Enum constants do not include non-nls markers in their extended range. I.e. in example from comment 8, problem 1,

public enum E2 {
        ENUM_CONST_1("String constant 1") //$NON-NLS-1$
        , ENUM_CONST_2("String constant 2") //$NON-NLS-1$
        ;
...
}

(Note the placement of commas and semicolon.)

The extended range of both constants do not include non-nls markers. Therefore, when a new constants are added, non-nls markers are moved:

public enum E2 {
        ENUM_CONST_1("String constant 1") //$NON-NLS-1$
        , ENUM_CONST_2("String constant 2"), MISSING //$NON-NLS-1$
        ;
...
}

While the expected output is

public enum E2 {
        ENUM_CONST_1("String constant 1") //$NON-NLS-1$
        , ENUM_CONST_2("String constant 2") //$NON-NLS-1$
        MISSING;
...
}

I will attach the updated test case that illustrates this problem.

Would it possible to fix the range for this constants as well, similarly to the last field declaration?
Comment 11 Dmitry Denisov CLA 2006-11-23 13:47:17 EST
Created attachment 54427 [details]
Test case shows handling comments in enum constants and fields

Test case illustrates original problem with the last field and problem with enum constants described in comment 10.
Comment 12 Martin Aeschlimann CLA 2006-11-24 03:25:49 EST
Note that NON-NLS comments are bad example for comment mapping. NON-NLS always belong to a line, associating them with a element isn't the right thing to do.

String x= "Hello", y= "World"; //$NON-NLS-1$ //$NON-NLS-2$

It would have been better if non-nls comments would have been designed like this:
String x= "Hello"/*NON_NLS*/, y= "World"/*NON_NLS*/;


Comment 13 Frederic Fusier CLA 2006-11-24 06:27:08 EST
Created attachment 54468 [details]
Proposed patch

This patch fixes problem 2) described in comment 8. Provided test case works fine now if you modify the expected output while adding constants as follows:
  protected String expectedEnumSourceAfterAddingEnumConst =
    "enum E\n" +
    "{" +
    "\tENUM_CONST_1(\"String const 1\") //$NON-NLS-1$\n" +
    "\t, MISSING, ENUM_CONST_2(\"String const 2\") //$NON-NLS-2$\n" +
    ", MISSING\n" +
    "\t;\n" +
    "\tString a; //$NON-NLS-1$\n" +
    "\tString b; //$NON-NLS-2$\n" +    
    "}\n";
Comment 14 Frederic Fusier CLA 2006-11-24 08:33:55 EST
Released for 3.3 M4 in HEAD stream.
Comment 15 David Audel CLA 2006-12-12 07:10:32 EST
Verified for 3.3M4 with I20061212-0010.