Community
Participate
Working Groups
20060224 The following cases of statement recovery are interesting for 'assing to local' public void foo(Point p, int[] a) { p.x; a[0]; } There is statement recovery, but a strange conbination of nodes is created: ExpressionStatement (recovered) EXPRESSION Assignment (recovered) LEFT_HAND_SIDE ArrayAccess OPERATOR: '=' RIGHT_HAND_SIDE SimpleName (recovered) The assigment node seems unnecessary and needs extra processing on our side to be detected. It would be more natural to simply get: ExpressionStatement (recovered) EXPRESSION ArrayAccess [349, 6] If this is difficult to fix in the compiler AST, maybe you can detect the case when converting to the DOM AST.
I haven't the same AST as you with your test case. ExpressionStatement (recovered) EXPRESSION Assignment LEFT_HAND_SIDE QualifiedName OPERATOR: '=' RIGHT_HAND_SIDE ArrayAccess I have the same AST as you with the following test case public void foo(Point p, int[] a) { a[0]; } Statements recovery try to rebuild 'syntactically' correct AST from code with syntax errors and ExpressionStatement (recovered) EXPRESSION ArrayAccess isn't a 'syntactically' correct AST. Build AST as you suggest would be doable when converting to the DOM AST. But i believe there is less chance to break a client by produce only 'syntactically' correct AST. Are DOM AST with fake assignment expression really very problematic for you ?
I think if the bit 'recovered' is set, a user must assume that the AST has some strangeness, and that makes it ok to have an ArrayAccess in a expression statement. If this is a 'syntax' error or not really depends on the compiler implementation, e.g. how the EBNF rules are defined. Also note that these rules are not defined in the spec and it was always possible to create such an AST by hand (using ast.new...) But I guess this is a topic worth discussing. A similar case is if (true) int i=9; It is also not allowed to have a variable declaration in a if block, but the recovered AST does it. The fake assignment complicates our code, as we have to detect that pattern (assignment with fake right hand side) at several places in quick fix. I would defintly prefer the simple ExpressionStatement - ArrayAccess or ExpressionStatement - qualified name which represents the code. I guess we have to decide: - define and follow some of the unwritten AST rules like: expression statement can only contain assigment, method invocation, ... - define that in an recovered AST only types must be matched (e.g. if a block defines to contains only statements)
I would also go for Martin's suggestion no. 2 (in an recovered AST, only types must be matched). I think the principle should be that the AST tries to match the given source as closely as possible by introducing as few "empty" nodes and inexistent tokens as possible. Syntactic correctness in terms of node types is less important - it just has to be correct enough to fit the child type requirements of the parent node. The example from comment 1 is problematic, since the operator '=' and the rightHandSide of the Assignment don't have a source counterpart. The source range for SimpleName of the rightHandSide is even wrong in this case, since it overlaps with the source range of the ArrayAccess (ASTParser#setKind(..) forbids overlapping siblings).
Just to make this complete: For the code from comment 0, the ideal would be: ExpressionStatement [100, 4] (recovered) EXPRESSION QualifiedName [100, 3] ExpressionStatement [111, 5] (recovered) EXPRESSION ArrayAccess [111, 4] And likewise for the code from comment 1: ExpressionStatement [100, 4] (recovered) EXPRESSION QualifiedName [100, 3]
David, Should I investigate a fix during the conversion ?
The fix should be in the conversion as only DOM would need of this kind of behavior. Olivier - I have a fix for this problem. Could you review it ? unless you already have your own fix.
Created attachment 37984 [details] Possible fix
public void foo(Point p, int[] a) { p.x; a[0]; } creates a tree like this: ASSIGNMENT: (Recovered) LEFT_HAND_SIDE: QUALIFIED_NAME (p.x) RIGHT_HAND_SIDE: ARRAY_ACCESS (a[0]). I don't see how we can handle this. But cases like: public void foo(Point p, int[] a) { p.x; } or: public void foo(Point p, int[] a) { a[0]; } or: public void foo() { int x = ; } can be handled the way you expected them.
Created attachment 37996 [details] Another proposal
Created attachment 38003 [details] Better patch
Created attachment 38004 [details] Last patch With this patch, we support: void foo(....) { a[0]; } or: void foo(....) { p.x; } or: void foo(....) { int x = ; } but not: void foo(....) { p.x; a[0]; } In order to support this case, we would need to iterate over the list of statements() for each method again to detect this case. We need to replace one node with two new nodes. This requires to do it inside the endVisit of a method declaration. So this would have an impact on performance. This is not a case that occurs enough to justify a specific handling.
Philippe, +1 for RC1?
+1 for 3.2RC1, fix looks good, and UI requested it specifically for RC1.
Fixed and released in HEAD. Regression tests added in org.eclipse.jdt.core.tests.dom.ASTConverterTestAST3_2.test0642/0643/0644
Verified for 3.2 RC1 using build I20060413-0010