Re: [jdt-core-dev] leading and trailing comments and whitespace for AST/DOM nodes

A parallel tree for comments would be fine.
 
Regards
 
Jonathan Gossage
----- Original Message -----
Sent: Tuesday, December 17, 2002 11:43 AM
Subject: RE: [jdt-core-dev] leading and trailing comments and whitespace for AST/DOM nodes


Here is the refactoring view:

Currently we don't handle comments in a special way. This means that in some cases we lose comments or
leave comments at the original location when we move or otherwise modify source code. So we would
benefit greatly from a general comment story provided by the AST. Whitespace is less important for us since
we have a solution to preserve formatting information (e.g. whitespace and comments) when modifying and
rewriting an AST.

The following items are important for us:

- The rules source positions adhere to (e.g. parent.start < children[0].start, .....) should not be broken or modified
   when introducing support for comments.

- Simply adding the comments to the source range of an ASTNode will break our current refactorings. They assume
  that [getStartPosition(), getStartPosition() + getLength() - 1] covers only the statement-relevant characters and no
  preceding or trailing comments. IMO changing this would break the spec of the ASTNodes.
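
The two items above can be illustrated with a minimal sketch. RangeNode is a hypothetical stand-in, not the real ASTNode class; it models only a start offset and a length, and checks that children stay inside their parent and do not overlap. The sketch assumes a child may start at the same offset as its parent.

```java
import java.util.List;

/** Hypothetical stand-in for an AST node; only the source range is modeled. */
final class RangeNode {
    final int start;   // offset of the first character of the first real token
    final int length;  // number of characters covered, comments excluded
    final List<RangeNode> children;

    RangeNode(int start, int length, List<RangeNode> children) {
        this.start = start;
        this.length = length;
        this.children = children;
    }

    /** Inclusive offset of the last covered character: start + length - 1. */
    int end() {
        return start + length - 1;
    }

    /** True if every child lies inside its parent and siblings are ordered without overlap. */
    boolean rangesAreConsistent() {
        int previousEnd = start - 1;
        for (RangeNode child : children) {
            if (child.start <= previousEnd || child.end() > end()) {
                return false;
            }
            if (!child.rangesAreConsistent()) {
                return false;
            }
            previousEnd = child.end();
        }
        return true;
    }
}
```

Widening a node's range to swallow a preceding comment would move its start before the offset that clients of such a check expect, which is the breakage described above.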

Introducing composite nodes as suggested by Jonathan would IMO lead to problems where nodes assume a special
kind of child node. Consider a VariableDeclarationStatement, which contains a list of VariableDeclarationFragment nodes.
If we introduce a special node, what are the subnodes of the declaration statement if we have preceding or trailing
comments for declaration fragments?

So instead of merging comments into the existing AST we could build a "comment tree" and provide methods to connect
ASTNodes to comments and vice versa. We could build the special comment tree on request, and that tree should handle
all the cases described in Jonathan's mail.
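
A rough sketch of what such a side structure could look like, kept separate from the AST so that node ranges and child lists stay untouched. CommentInfo and CommentMap are hypothetical names invented for illustration; nothing here is existing JDT API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical comment record covering [start, start + length - 1]. */
final class CommentInfo {
    final int start;
    final int length;
    final String text;

    CommentInfo(int start, int length, String text) {
        this.start = start;
        this.length = length;
        this.text = text;
    }
}

/** Hypothetical side table linking unchanged AST nodes to comments and back. */
final class CommentMap<N> {
    // Identity maps: two distinct nodes are never merged, whatever equals() says.
    private final Map<N, List<CommentInfo>> leading = new IdentityHashMap<>();
    private final Map<N, List<CommentInfo>> trailing = new IdentityHashMap<>();
    private final Map<CommentInfo, N> owners = new IdentityHashMap<>();

    void attachLeading(N node, CommentInfo comment) {
        leading.computeIfAbsent(node, k -> new ArrayList<>()).add(comment);
        owners.put(comment, node);
    }

    void attachTrailing(N node, CommentInfo comment) {
        trailing.computeIfAbsent(node, k -> new ArrayList<>()).add(comment);
        owners.put(comment, node);
    }

    List<CommentInfo> leadingComments(N node) {
        return Collections.unmodifiableList(leading.getOrDefault(node, Collections.emptyList()));
    }

    List<CommentInfo> trailingComments(N node) {
        return Collections.unmodifiableList(trailing.getOrDefault(node, Collections.emptyList()));
    }

    /** Returns the node a comment is attached to, or null if it is unattached. */
    N ownerOf(CommentInfo comment) {
        return owners.get(comment);
    }
}
```

Because the table is built separately, clients that never ask for it pay nothing, and a VariableDeclarationStatement still sees only fragments as children.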

Dirk



"Jonathan Gossage" <jgossage@xxxxxxxx>
Sent by: jdt-core-dev-admin@xxxxxxxxxxx

12/17/2002 10:47 AM
Please respond to jdt-core-dev

       
        To:        <jdt-core-dev@xxxxxxxxxxx>
        cc:        
        Subject:        RE: [jdt-core-dev] leading and trailing comments and whitespace for AST/DOM nodes




> >-----Original Message-----
> >From: jdt-core-dev-admin@xxxxxxxxxxx
> >[mailto:jdt-core-dev-admin@xxxxxxxxxxx]On Behalf Of
> >Jim_des_Rivieres@xxxxxxxxxx
> >Sent: December 16, 2002 5:36 PM
> >To: jdt-ui-dev@xxxxxxxxxxx; jdt-core-dev@xxxxxxxxxxx
> >Subject: [jdt-core-dev] leading and trailing comments and whitespace for
> >AST/DOM nodes
> >
> >
> >We're trying to decide how to deal with leading and trailing whitespace
> >and comments for AST/DOM nodes, and we need your input.
> >
> >ref: http://bugs.eclipse.org/bugs/show_bug.cgi?id=28268
> >
> >Summary of where we are right now:
> >- source range extends from 1st character of 1st real token through
> >last character of last real token matched by grammar rule for node type;
> >leading whitespace and comments, and trailing whitespace, comments are
> >NOT INCLUDED in source range with the exception of Javadoc comments
> >- for BodyDeclarations, the Javadoc comment is treated like a token and
> >is represented by a Javadoc node
> >- Statement.get/setLeadingComment allows for a single comment before
> >the statement; however, AST.parseCompilationUnit has never associated
> >leading comments with any statement nodes it creates
> >
> >So where do we go from here? At the very least, we should
> >(1) clarify this contract in the API spec
> >(2) delete (deprecate) Statement.get/setLeadingComment
> >
> >This would at least give us a minimal, consistent approach for
> >leading and trailing comments. The question is: is it worthwhile doing
> >more?
> >The general approach to date has been that AST/DOM clients interested in
> >finer-grained lexical issues should rescan the source in the vicinity of
> >the construct to find what they're interested in. This is reasonably
> >straightforward, given org.eclipse.jdt.core.compiler.IScanner and accurate
> >and consistent source ranges for all nodes and ancestors.
> >
> >We could add a second, "extended" source range to certain node types
> >like statements, body declarations, import declarations, and package
> >declarations that would "round up" to the "natural" source line boundary
> >to better align with what a human author would consider the source range
> >for a construct.
> >
> >Q: If the API had this, would clients use it (given that many of them
> >already have their own scanners and have to do this sort of thing other
> >places too)? If yes, what are the "natural" boundaries that this API
> >should recognize?
> >
> >Your input would be appreciated.
> >
> >Thanks,
> >jeem
> >
Ideally, I would like to see all comments and other whitespace accounted for
in the AST. This becomes important if you want to provide a renderer that
can take an AST and produce human readable source code. Specifically I would
like to see the following kinds of nodes for dealing with whitespace.

1. A node that describes a consecutive run of whitespace characters
including runs of a single character. This would always be a leaf node. This
node would be ignored by compilers and tools that are only interested in the
Java content.
2. A node that describes a multi-line comment (i.e. /* */). This would
also be a leaf node. Again this node would be ignored by tools that are not
interested.
3. A node describing a single-line comment (i.e. // ...). This would be a
leaf node and would be ignored by tools that are not interested.
4. A composite node that would be a parent to any node with associated
comments. The children would be the associated comment and white space nodes
and the Java node. To me the following rules for associating comments and
white space with Java constructs make sense.

a) Recognize and preserve a file comment block at the start of a compilation
unit. This could take the form of a single multi-line comment or it could be
a consecutive range of single line comments and blank lines. This should
result in a composite node with one or more comment and whitespace nodes
under it.
b) Recognize any consecutive run of blank lines, multi-line comments or
single line comments as a comment block to be attached to the following java
node type using the list of types you specified with the addition of field
declarations. Here there would be a composite node with the
whitespace/comment nodes and the Java node underneath.
c) If a single or multi-line comment is found immediately following, on the
same line, any of the nodes defined above, collect it and all single line
comments and whitespace that immediately follow. The intent is to deal with
constructs such as the following:

   {...} // comment
         // comment2

or statement; // comment
             // comment2

or statement; /* comment1
                comment2 */

d) Multiline or single line comments embedded within a statement should
simply generate a comment node without any composite node. For example
  invoke( a, // comment 1
          b /* comment2 */ );
would simply generate two comment nodes plus the associated whitespace
nodes.
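
Rule c) can be sketched as a small scan over the characters that follow a node. This is only an illustration under stated assumptions: TrailingComments and collect are hypothetical names, the input uses "\n" line endings, and the scan starts at the first offset after the node's last real token.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper illustrating rule c): gather the comments that trail a node. */
final class TrailingComments {

    /** Collects a comment on the node's own line plus the single-line comments directly below it. */
    static List<String> collect(String src, int nodeEndExclusive) {
        List<String> found = new ArrayList<>();
        int i = skipBlanks(src, nodeEndExclusive);
        if (src.startsWith("/*", i)) {
            int close = src.indexOf("*/", i + 2);
            if (close < 0) {
                return found;                 // unterminated comment: give up
            }
            found.add(src.substring(i, close + 2));
            i = skipBlanks(src, close + 2);
        } else if (src.startsWith("//", i)) {
            int eol = lineEnd(src, i);
            found.add(src.substring(i, eol));
            i = eol;
        } else {
            return found;                     // no comment on the node's line
        }
        // Follow-up lines that consist only of blanks and a single-line comment.
        while (i < src.length() && src.charAt(i) == '\n') {
            int next = skipBlanks(src, i + 1);
            if (!src.startsWith("//", next)) {
                break;
            }
            int eol = lineEnd(src, next);
            found.add(src.substring(next, eol));
            i = eol;
        }
        return found;
    }

    /** Skips spaces and tabs, but never a line break. */
    private static int skipBlanks(String s, int i) {
        while (i < s.length() && (s.charAt(i) == ' ' || s.charAt(i) == '\t')) {
            i++;
        }
        return i;
    }

    /** Offset of the next line break, or the end of the text. */
    private static int lineEnd(String s, int i) {
        int eol = s.indexOf('\n', i);
        return eol < 0 ? s.length() : eol;
    }
}
```

For the "statement; // comment" example above, the scan would start at offset 10 and pick up both comment lines; a blank line or a Java token ends the run.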

Since there could be a substantial space penalty in generating these nodes,
consideration should be given to making the generation optional. This would
allow tools that need this kind of information to have it without penalizing
conventional tool users.
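
To make the proposal concrete, here is a minimal sketch of the four node kinds and of a renderer that reproduces the source text. All type names (SourceNode, WhitespaceNode, CommentNode, JavaNode, CompositeNode) are made up for illustration, and JavaNode merely stands in for a real Java construct.

```java
import java.util.List;

/** Hypothetical common interface: every node can reproduce its source text. */
interface SourceNode {
    String render();

    /** Whitespace and comments are trivia that Java-only tools skip. */
    default boolean isTrivia() {
        return false;
    }
}

/** Kind 1: a run of one or more whitespace characters (leaf). */
final class WhitespaceNode implements SourceNode {
    private final String text;
    WhitespaceNode(String text) { this.text = text; }
    public String render() { return text; }
    public boolean isTrivia() { return true; }
}

/** Kinds 2 and 3: a multi-line or single-line comment (leaf). */
final class CommentNode implements SourceNode {
    private final String text;
    CommentNode(String text) { this.text = text; }
    public String render() { return text; }
    public boolean isTrivia() { return true; }
    boolean isSingleLine() { return text.startsWith("//"); }
}

/** Stand-in for an ordinary Java construct. */
final class JavaNode implements SourceNode {
    private final String text;
    JavaNode(String text) { this.text = text; }
    public String render() { return text; }
}

/** Kind 4: groups a Java node with its associated comments and whitespace. */
final class CompositeNode implements SourceNode {
    private final List<SourceNode> children;
    CompositeNode(List<SourceNode> children) { this.children = children; }

    /** Full text, trivia included. */
    public String render() {
        StringBuilder out = new StringBuilder();
        for (SourceNode child : children) {
            out.append(child.render());
        }
        return out.toString();
    }

    /** Text as seen by a tool that ignores whitespace and comment nodes. */
    String renderJavaOnly() {
        StringBuilder out = new StringBuilder();
        for (SourceNode child : children) {
            if (child instanceof CompositeNode) {
                out.append(((CompositeNode) child).renderJavaOnly());
            } else if (!child.isTrivia()) {
                out.append(child.render());
            }
        }
        return out.toString();
    }
}
```

Making generation optional would then amount to a parser setting that decides whether trivia nodes and their composites are built at all.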

The composite node I mentioned above is a specialized instance of a more
general capability that I would like to see in the AST. I would like to
introduce the concept of an intermediate node that could be used by tools
that generate source code fragments. Such a node would have two sub-trees
under it, one consisting of nodes that only have meaning to a specific tool,
and the other that contains the generated code as an AST fragment. This type
of node would allow tools to present constructs at a high conceptual level
to the developer while also giving compilers etc. direct access to the
generated source code.

Regards

Jonathan Gossage



