Community
Participate
Working Groups
20030829 Altough the AST is designed to only guarantee completness on compilable source Quick fixes and other code generation functionality gain a lot when the AST is robust against some smaller errors, similar to code assist. The following cases would nice if handled better: private void foo(String c) { fo(null,); c.toString(); } The comma in the method invocation blocks the generation of all other statements. private void foo(String c) { fo(null) c.toString(); } The missing semicolon also prevents all other statements to be in the AST.
The parser jumps over methods that contain syntax error in order to be able to generate statements for other methods. We don't have any recovery at the statement level for now.
*** Bug 44983 has been marked as a duplicate of this bug. ***
Post 3.0
reopening
*** Bug 53243 has been marked as a duplicate of this bug. ***
This is an optional item for M4. We should investigate some statement recovery.
Deferring.
David - pls document where you ended up and attach patch.
Deferring post 3.1
*** Bug 93141 has been marked as a duplicate of this bug. ***
*** Bug 97820 has been marked as a duplicate of this bug. ***
Reopening for consideration within 3.2 cycle.
JDT/UI scenarios: - statement recovery foo() ==> foo(); - method invocation recovery foo(10, ==> foo(10) or foo(10, null, ...) where null is the default value for the missing parameters. Having this recovery allows selecting 10 and extract it to a local or method without completing the invocation. - anonymous class recovery Runnable runnable= new Runnable() { ru<Ctrl+Space> } should still return node and binding for anonymous class
*** Bug 110322 has been marked as a duplicate of this bug. ***
Other cases where I often miss an AST: - Expression that is not a valid ExpressionStatement, e.g.: array[1]; "Hello" + "World"; fField; Especially for the first two examples, it would be nice to have an AST in order to apply the "Assign to new local variable" quick fix. - missing conditions: if () { // -> recover to "if (false)" System.out.println("hello"); } for() { -> recover to "for(;;)" System.out.println(i + " is " + array[i]); }
A scenario that I often have: I started typing something, but realize that I have to do something else e.g. a line above. The code fragment (here 'file.get') is still around and prevents that errors are shown on the line above (getCanonicalFile would return a 'File') and and that I can use a quick fix that would fix the variable type. As a result I first have to fix the fragment (remove it or complete it) before I can use the quick fix. Here it would make sense to hava heuristic that treat 'file.get' as a complete statement, for example because it is on its own line. public void foo() { String file= new File("x.txt").getCanonicalFile(); file.get Map map= new HashMap(); return map; }
The scenario on comment #16 is also very common for me.
Created attachment 31884 [details] statements recovery I have an implementation of statements recovery. This implementation isn't complete and i must test it more deeply to be sure of his reliability. The current steps to parse a file are 1 - parse top level structure (diet parse) 2 - if there was a syntax error perform a recovery parse of the top level structure 3 - for each methods/initiliazers body (method parse): A - parse statements of the method/initiliazer B - if there was a syntax error in the method/initiliazer stop to parse this method The new steps to parse a file are 1 - parse top level structure (diet parse) 2 - if there is a syntax error perform a recovery parse of the top level structure 3 - for each methods/initiliazers body (method parse): A - parse statements of the method/initiliazer B - if there was a syntax error in the method/initiliazer perform a recovery parse similar to the top level structure recovery parse C - if there was a recovery parse perform a statments recovery parse The new steps are B and C. B recover only local types and members of local types. C recover only statements and not local types and members of local types. This two step are seperate to be able to keep the same type structure as the diet parse B use the same mechanism as our current recovery C use the result of the syntax diagnosis to add or remove suggested tokens. The suggested non terminals symbols are replaced by a predefined set of terminal tokens (eg Block is replaced by '{' '}').
I released the statement recovery algorythm in HEAD. Currently this new behavior is disabled, so the option JavaCore.COMPILER_STATEMENTS_RECOVERY must be set to ENABLED to use it (It's a temporary option). As said in comment 18, the statement recovery use the result of the syntax error diagnosis and apply the suggestion of the diagnosis. In the released code all the diagnosis are not applied because the mapping between suggested non-terminal symbols and a list of terminal symbols is not complete. I tested performances of the Parser. It is near 0% slower than before when there is no syntax error. And it is at worst 300% slower when there is a lot of errors and statements. it is 300ms to parse a type with one method which contain 20000 simple statements and 100 syntax errors instead of 100ms before.
Created attachment 33154 [details] Released code
I added a new API for DOM ASTNode flag to identify recovered nodes: ASTNode.RECOVERED It is a flag constant indicating that a node or a part of this node is recovered from source that contains a syntax error detected in the vicinity. class X extends Y{ void foo() { bar() } } In this example the following nodes will be flag as RECOVERED MethodInvocation:bar MethodDeclaration:foo TypeDeclaration:X CompilationUnit
In the ASTRewrite we need to know which nodes are recovered, as rewriting these nodes might be special. I would therefore prefer the flags only on the really recovered nodes. To ask a parent if it cotains some recovered node is interesting, but it should be possible to know if its the node that is recovered, or just one of the children. It also seems to me that we have to be careful not to break clients that assume that every node in the AST is complete. What about having a flag in the ASTParser to enable recovered nodes? That's also important performance wise. Some clients are maybe only interested in correct ASTs, and as they would never use a recovered AST, they don't want to pay any performance penalty.
reply to comment 22 The reason why i flag as RECOVERED a node and his enclosing nodes is because when a a part of source code is syntaxically incorrect all enclosing code parts are also syntaxically incorrect. So if a part of a node is recovered, the node itself is recovered. But i understand your need to recognize the deeper recovered part of a node. I investigate how to improve this flag. I could: - change nothing =) or - add a different flag for enclosing nodes and "really" recovered node. or - flag only the "really" recovered node and let the client visit the DOM AST to recognize enclosing recovered nodes. Client already have to manipulate 'incorrect' node because we give an ast even if there is a syntax error in a type declaration or a method declaration. By 'incorrect' i mean a node that does not really represent the content of the source file. Another point is that all nodes created by statement recovery are complete but they can have an empty string as name. And your are right, performance could be a problem. Is your suggestion to add ASTParser.setStatementRecovery(boolean enabled) and disable statement recovery by default? if i do that clients won't have recovered statements for free but i am agree that we will avoid to potentially harm a client.
I added the API method ASTParser#setStatementsRecovery(boolean enabled) to enable statements recovery in ASTParser. It's disabled by default.
I added the API method ICompilationUnit#reconcile(int astLevel, boolean forceProblemDetection, boolean enableStatementsRecovery, WorkingCopyOwner owner, IProgressMonitor monitor) to enable/disable statements recovery during reconcile.
see also bug 124662.
I changed the way to flag nodes as RECOVERED. Now only nodes which contains added/removed/replaced tokens are flagged. A parent of these kind of node isn't flagged. eg. class X extends Y{ void foo() { bar() // thw missing part is ";" } } Only the ExpressionStatement node is flagged. Sometimes our heuristic can't recognize the really recovered node, so in this case all potentially recovered nodes are flagged. eg. class X extends Y{ void foo() { for(int i // the missing part is ";;);" } } Only the last EmptyStatement and the ForStatement should be flagged but our heuristic fail to find them so Block, ForStatement, VariableDeclarationStatement, VariableDeclarationFragment and EmptyStatement are flagged. I will try to improve our heuristic. This behavior is caused by the fact that only our Scanner know what tokens are recovered not our Parser. Make our Parser aware of these recovered tokens would be too heavy (in term of changes and perfomance).
Created attachment 34338 [details] Released code
I think we can consider this bug as fixed. If problems are found, they should be write in their own bug report. Close as FIXED
*** Bug 120870 has been marked as a duplicate of this bug. ***
Verified for 3.2 M5 using build I20060215-0010. I've reassigned 3 JDT/UI bugs as they were wrongly set as duplicate of this one although they were depending on: - bug 44964 - bug 53243 - bug 93141 I've also noticed a sample where robust AST could be improved: - comment 15 - missing conditions: the ForStatement is empty although it should not