Community
Participate
Working Groups
When a method has multiple return statements, it cannot be inlined. The message recieved is "Can't inline call. Return statement in method declaration interrupts execution flow.", which is a big improvement over previous versions where it would just inline it and break your code, but I'd like to try and push it further. An example case: public String getString(boolean test) { if(test) { return "True Value"; } else { System.out.println("It was false"); return "False Value"; } } public String doSomeStuff() { boolean value = true; String string = getString(value); return string; } The getString method cannot be inlined. In this case, the manually refactored code would be: public String doSomeStuff() { boolean value = true; String string; if(value) { string = "True Value"; } else { System.out.println("It was false"); string = "False Value"; } return string; } This is done be replacing the "return" statement with an assignment to the external variable, in much the same way as a single return statement refactoring. However, a more complex method to inline would be: public String getString(boolean test) { if(test) { return "True Value"; } System.out.println("It was false"); return "False Value"; } Inlining this code in the same way would result in the code: public String doSomeStuff() { boolean value = true; String string; if(value) { string = "True Value"; } System.out.println("It was false"); string = "False Value"; return string; } Which would incorrectly be returning "False Value" every time. I'd like to find a way to determine the difference between the first getString() method and the second so that the first may be safely inlined. It would be nice to have a refactoring that could convert multiple returns to single returns, so that it could be combined with the inline refactoring, but I don't see a way to easily do so without using a GOTO statement.
The first case can be inlined if you first inline the temp string. However the second case can never be inlined since it will change behaviour (determinable using flow analysis).
The second example can be inlined using a simple, if ugly, way to convert to a one-return method: public String getString(boolean test) { String returnValue; methodBody: { if(test) { returnValue = "True Value"; break methodBody; } System.out.println("It was false"); returnValue = "False Value"; break methodBody; } return returnValue; } I wonder wether this is generic enough to work in every case. See also the discussion at http://groups.yahoo.com/group/refactoring/message/3086
What I meant with "can never be inlined" is without refactoring the method itself.
If the following conditions are met: - If there is a return statement in an if..else block, every node of the block is terminated by a return statement. - There are no return statements within for/while loops. It is safe to replace every instance of: return value; with result = value; and inline the method.
A Further Previso: - If any node of a try...catch block is terminated return statement, all nodes of the try...catch block must be terminated by a return statement. - Finally blocks cannot contain return statements? Unsure on this.
For 3.0 we should handle the case public String getString(boolean test) { if(test) { return "True Value"; } else { System.out.println("It was false"); return "False Value"; } } Time permitted, if we investigate in rewriting the control flow for the other mentioned cases.
Solution to this bug can be divided into two almost separate parts. First part is modification of flow analysis allowing certain conditions when a method with multiple returns can be inlined. Secondary, inlining functionality itself must be modified because currently it is based on String operations and hardly extendible to support methods with multiple returns. I'm going to use AST for code replacements and this must be done before modification of flow analysis.
Dmirty, the inline method code already uses AST rewriting to modify the code to be inlined depending on the caller context (see class SourceProvider). The reason why at the end strings are cut from the source (the CU defining the method) and pasted into the target is to preserve as much formatting as possible. If you copy the nodes from the source into the target (which is possible) the inlined code will be reformatted.
Dirk, I know that AST is used during rewriting but it is mostly used through ASTRewrite.createPlaceholder calls and method getCodeBlocks of SourceProvider returns array of Strings, so all the original AST information is lost for CallInliner. Method CallInliner.replaceCall implies that the last String object in the array of Strings contains a return statement. This will not work in case of mutliple return statements. I propose to change implementation and use AST functionality wherever it is possible: 1. copy original body declaration using ASTNode.copySubtree method 2. SourceProvider class does substitution of all parameters and implicit receivers 3. resulting AST structure is returned to CallInliner class 4. CallInliner processes every return statement and adds explicit casts, local variables where it is needed 5. Finally, ASTRewriter is used for actual text modifications Do you think that it will be a problem if original formatting is lost and inlined code is reformatted by standard Eclipse formatter?
Dmirty, the downside of this approach is that you loose formatting and all comments. Currently the AST doesn't have nodes for comments. So after rewriting the AST copy of the method to be inlined all comments will be lost. Formatting might not be a big issue in the future since the new formatter will support lots of options. Do you see an possibility to keep the current approach of rewriting the source and returning (a) string(s) ? I am reluctent to accept a solution where we loose comments and it is not clear when the AST will support comment nodes.
Created attachment 5813 [details] Demonstration of inline method refactoring implementation based on AST nodes Hi Dirk, Please take a look at the attached draft implementation of inline method refactoring based on AST transformations. The patch doesn't implement requested feature allowing inlining of methods with multiple returns, but it will be an easy next step requiring only changes in flow analysis. Using AST nodes for modifications of original method body allowed to remove limitation on inlining complex methods used in multi-fragment variable declarations and initializers of for statements. Planned improvements(https://bugs.eclipse.org/bugs/show_bug.cgi?id=37657) of code formatter must reduce impact of loosing original method formatting. In regards to comments, current implementation doesn't save all of them either. Statement class has a member optionalLeadingComment but it is always null, probably this is just another bug and after it gets fixed new inlining implementation will remain all leading comments. I think that further development of method inlining when strings are returned from SourceProvider and ASTRewriter is used everywhere in the code, will make it difficult to tackle many problems and will shift more and more funcionality from CallInliner to SourceProvider. The main problem I stumbled upon during the implementation was bindings resolving for copied AST nodes. Start positions and lengths of nodes are copied by ASTNode.copySubtree method, but node properties and bindings are lost. I haven't found any good way to replace binding resolver for copied nodes. The only solution I thought out was to support a map of start positions to bindings that is filled by passing through original method body and used later for copied nodes since start positions are the same. This is ugly :) Probably you can suggest something better than that. Fixed unit tests will be submitted in the next attachment.
Created attachment 5814 [details] Unit test update reflecting changes in formatting
Hi Dmirty, I looked at the code but unfortunatelly the patch can't be applied anymore. I fixed a bug in Inline method for M3 and had to change some signatures. Since you know your changes better, can you please try to create a patch on the latest. From what I found out looking and the "incomplete" patched code: I like the idea to do more by direct manipulation of the AST. But from the past I know that users are very picky regarding their formatting and there comments (we had lots of bug reports regarding these issues). So if possible we try to preserve comments and formatting whenever possible. Since we stumbled over the same problem then you did (need to do more in AST rewrite) we added some support to the AST rewriter. What we currently do is the follwing: we manipulate the code based on AST rewriting. If we are interested in portions of the code we mark the node as tracked using the API ASTRewrite.markAsTracked. After you have rewritten the code you can use the tracked position to find the node in the new source. We then create new node using ASTRewrite.createPlaceholder to use the code as a "normal" AST node. I would like to encourage you to have a look if this approach does work for you as well. From how I understand you code you split the statements to be inlined into three portions: before, replace and after. May be the three portions can be single statemens create via ASTRewrite.createPlaceholder since a statement place holder can old the source code of more than one statement. We also looked into attaching comments to AST nodes, but this isn't so easy either. After discussing several strategies we found out that this has to be done by some formatter related code and not the scanner/parser. Some comments regarding the implementation: As I understand the code you remove return statements from the code to be inlined. Doesn't this change the semantic. For example: void toBeInline() { if (true) { foo(); return; } bar(); return; } void calling() { toBeInlined(); removing the return statemens and inlining the code will change the sematic. Since you get: if (true) { foo(); } bar(); In the not inlined version bar() wasn't called. Am I missing something ? Regarding the loss of bindings when copying nodes: the reason is that a copy isn't related to a context anymore and can be added to a different context. Hence the original bindings may be wrong in the new context. Additionally some of the bindings are resolved lazily, which doesn't make to much sense in a different context. So there isn't a better solution than the one you implemented except manipulating the original AST ;-)). I highly appreciate your contributions! Looking forward reading your comments Dirk
Created attachment 5837 [details] Updated patch Hi Dirk, Here is an updated patch that can be applied to the latest version of source code. Method toBeInlined() from your example could be either denied for inlining by flow analysis or modified to have following form: void toBeInlined() { if (true) { foo(); return; } else { bar(); return; } } This modification would be done by some pre-processor method called prior to processReturnStatements. Why comments can't be supported as an AST node Comment derived from Expression class? That would allow processing of comments not only in refactoring but also use them for other purposes when ASTRewriter is not involved. Dmitry
Hi Dmirty, thanks again for the updated patch. Regarding AST and comments: treating comments as special expression doesn't work since comments can occur between all tokens, hence they can't be only expression. We investigated in this approach and we came to the conclusion that we then have to introduce for all nodes which can occur in list special comment nodes. So we dismissed that approach. Our current idea is to have a separate list of comments and some algorithm that maps the comments to nodes. This alogorithm is best located in the formatter/AST world since they know best what to do with comments. But given our current time constraints this might not be added for 3.0 (the team that is responsible for AST/formatting has the big item generics on their list). I discussed the loss of formatting and comments with my colleague here in Zurich and we agreed that "loss of formatting" might be acceptable with the new formatter but loss of comments isn't. So I would like to encourage you to think about the following alternative design: - we keep the three list of statments (replace, before, after) - currently all lists only contain one element created by a AST rewrite using place holders. This ensure that we keep comments but it makes it easy to convert the code easily if future releases of the AST have better support for comments. - I still prefer computing the three list in the source provide class. I get the feeling that call inliner may get to compilcated in the future (after adding flow analysis, ....) - we also change the rest of the signatures to ASTNode instead of strings (e.g. computeRealArguments) but do the same tricks as with the list to keep comments. I know that this doesn't produce the best maintainable code but given all the constraints we have I think it is the best solution. Let me know what you think about it?
Hi Dirk, Sorry for the long silence. I was on vacation … I tried to follow your proposal and came across several issues: 1. GroupDescriptions of tracked nodes get updated only upon a call to ASTRewriter.rewriteNode, right? That means we will have to do this operation twice, first time to substitute parameters, make names unique, update types etc. and the second time for return statements (or vice versa). 2. If tracked node gets replaced then there is no possibility to get new bounds of the replacing node. Consider following example: class Test { public int inlineMe(int p) { return p; } public void main() { int i = inlineMe(55); } } If return expressions are tracked then during parameter substitution p gets replaced with 55 and there would be no way to get new position of return expression because the original node was deleted. The same outcome would be if parameters were tracked and return statements replaced with assignments. 3. Placeholders can’t be used as target nodes in calls to ASTRewrite methods. If whole return statement is tracked then after all parameter substitutions I do following to get new bounds of the return statement: private ASTNode createReturnStatementPlaceholder(ReturnStatement rs) { TextRange range= fRewriter.getTrackedNodeData(rs).getTextRange(); String content= fBuffer.getContent(range.getOffset(), range.getLength()); return fRewriter.createPlaceholder(content, ASTRewrite.EXPRESSION); } The resulting node will be used on the left side in a call to ASTRewrite.markAsReplaced or in a call to ASTRewrite.markAsRemoved. I get assertion failure: "Node that is changed is not located inside of ASTRewrite root". The first issue is mostly about performance and code clearness. But the second and third items prevent me from moving forward. It looks to me that ASTRewrite must be extended prior to implementing functionality requested in this bug report. Am I missing something? What are your suggestions?
Hi Dmitry, good to "hear" you again. I hope you had nice vacation. Regarding your questions: 1.) May be I first give you some background: GroupDescription are used to collect edits generated by the AST rewrite. When executing edits they adjust there regions so that the reflect the position of the "original" edit in the modified source. Group descriptions are filled when calling ASTRewrite.rewriteNode. A node can be put into different group descriptions. So I don't really see why we have to rewrite two times. Why can't we have two different group description ? I guess I don't fully understand the problem. 2.) This should be doable, although not obvious. You have to mark the replacing node as tracked. You might want to have a look at the test case ASTRewritingTrackingTest.testNamesWithReplace(). 3.) This is correct. The AST doesn't support "recursive" modification. The current behavior is: - last modification wins, except for - inserted edits can't be manipulated again (e.g. marked as removed). To get rid of them you have to remove them from the list again. What exactly do you want to achieve ? Doesn't the suggestion in 2.) solve the porblem. Let me know if this helps. Dirk
Created attachment 6876 [details] New patch based on ASTRewrite class Hi Dirk, Back to the bug ... Please, take a look at the attached patch. It is totally based on ASTRewrite instead of AST manipulations as old patches were. That should keep some original comments and formatting. I'm a little worried about changes in ASTRewrite API. Old good methods like ASTRewrite.markAsInserted are deprecated now and new methods in NewASTRewrite class require type of inserted node (childProperty parameter) to be specified. The type can't be a generic STATEMENT but must be more specific, like BODY, ELSE_STATEMENT, CATCH_CLAUSES etc. That will complicate inlining functionality because source method can contain many statements of different types and the types will have to be saved along with nodes' placeholders. If you have any ideas how to overcome this problem then let me know and I will update the patch. Dmitry
Hi Dmitry, thanks again for the patch!! As always I appreciate your contributions. I looked at it and I have several comments: - couldn't the flow analyzer be used to check if all control pathes end in a return. It seems that your ReturnAnalyzer duplicates some of the code. You might want to have a look at ExtractMethodAnalyzer#analyzeSelection. The flow analyzer returns a flow info object which gives you information about the return statements in the analyzed code - There are two unnecessary methods in ASTNodes which aren't referenced. We should remove isChild and move the code to the refactoring. The method could then use ASTNode.isParent which avoid iterating over the AST tree. - Trying to inline inlineMe in the following example ends in an exception. public class InlineTest { public int inlineMe() { if (true) return 10; else return 20; } public void foo() { for (int i= inlineMe(); i < 10; i++) { } } } Regaring your question: we discovered that as well and currently the method isn't deprecated anymore until we can come up with a better solution.
Created attachment 7066 [details] Updated patch Dirk, I followed your recommendations and updated the patch. Also I have fixed the bug. It only occured when inlining of all occurences was chosen in the dialog box and reason was a confusion of source and destination ASTs. In regards to merging ReturnAnalyzer into flow analyzer, I don't quite agree with you. First of all, analysis of return statements is not accomplished for every invocation. In current implementation there is one situation that omits the analysis - it is when direct parent of method invocation is a return statement. Flow analysis is performed every time. Another difference between ReturnAnalyzer and InputFlowAnalyzer classes is a number of instances created during method body traversing. ReturnAnalyzer is recurrent and creates multiple instances of itself. This is not efficient but allowed to keep ReturnAnalyzer code very simple and easy to maintain. On the other hand, there is only one instance of InputFlowAnalyzer class created per method body. Incorporating ReturnAnalyzer into InputFlowAnalyzer will require major code changes of ReturnAnalyzer or InputFlowAnalyzer class. What do you think? Dmitry
Dmitry, I didn't want to suggest that we merge them. I though about something like this: InOutFlowAnalyzer flowAnalyzer= new InOutFlowAnalyzer(fInputFlowContext); FlowInfo info= flowAnalyzer.perform(getBodyStatements()); if (info.isPartialReturn()) // we can't inline.... if (info.isValueReturn()) // method has return statemens like return value; in all control pathes // find all return statements using a special analyzer to be able to re- // write them if (info.isVoidReturn()) // method has return statement return; in all control pathes. Do you think this is possible ?
This is definitely possible, but I don't understand why we should do it this way. CallInliner uses InputFlowAnalyzer which is also used by ExtractMethodAnalyzer class. Return statement analysis is only required for CallInliner and not even for all situations. That means InputFlowAnalyzer.perform method must not analyze return statements. InputFlowAnalyzer methods isPartialReturn, isValueReturn and isVoidReturn will have to perform return statement analysis on first call to any of them. Also if info.isValueReturn() returns true we will need to iterate over AST tree again to collect all return statements. This is not efficient and must be avoided where possible. Could you clarify what are your reasons for the proposed implementation ? Dmitry
The reason why I am asking to reuse the existing code is to minimize the amount of code that has a "certain" complexity. I agree that in the case isValueReturn is true we have to itereate the modthod body again, but this visitor will be very simple. We can even optimize the flow analysis by inly considering return values. This can be done by setting the RETURN_VALUEs mode to the FlowContext.setComputeMode. While looking at the code again a new questions came up: - why is it necessary to analyze the return statements for every call to be inlined. Wouldn't it be sufficient to do it once before we start inlining. I couldn't come up with a scenario where the result of the return analyzer depends on the calling context. And a little bug I found: public String bar() { return null; } public void foo(int i) { bar(); bar(); } Inline method bar at all invocation sides. The refactoring stops with a fatal error complaining about overlapping text edits. Let me know what you think. Dirk
Dirk, I understand your point and will take a closer look at FlowInfo classes. Regarding dependency of return analyzer on invocation context ... If method invocation is a return expression then in this case we can inline any method without analyzing return statements. This case was handled separately to skip iterating method body, now I think that we should analyze return statements always and only once before method inlining. Then results of the analysis will be reused for every inlined invocation. I have investigated the bug you mentioned in last comment. Both invocations are inlined, as a result there are produced two edits deleting both 'bar();' statements. The problem is that the delete edits intersect. They delete not only statements themselves but also spaces and CRLFs before or after statement being deleted. Therefore spaces after first invocation and before second are deleted two times. What is the best way to merge these two edits and adjust their positons ?
Dirk, I looked at FlowInfo classes and have several comments. First of all, there is no support for return statements in loops. What I'd like to see is a method FlowInfo.isLoopReturn returning true if there is at least one return statement inside a loop body. This would allow us to show more clear explanation to user why the method can't be inlined. public int foo() { if (true) { return 1; } return 2; } Flow analysis of the above method produces PARTIAL_RETURN for 'if' statement and VALUE_RETURN for last 'return' statement. Then both these return kinds are merged by FlowInfo.mergeExecutionFlowSequential. Result is VALUE_RETURN return kind for method body. This is not acceptable for method inlining. What would you suggest in this situation? I can either extend FlowInfo classes with needed functionality or leave ReturnAnalyzer as it is now.
Dmitry, I am on vaction unitl Januar, 5th and I will very likely not find the time to look at this.
Created attachment 7311 [details] The updated patch reuses FlowInfo classes to analyse return statements
Dirk, Did you have a chance to look at the patch ? Dmitry
Dmitry, I apologize for the log delay. I am activly working to get the refactoring API out of the door and the target is to have something for EclipseCon which is next week (btw are you participating at EclipseCon ?). And this basically absorbs all my programming time. I will do my best to look at the patch until the end this week. Is this still fine with you ? Dirk
This is absolutely fine with me. I just want to continue fixing bugs in inline method refactoring functionality. Unfortunatelly, I will not participate at EclipseCon this year. Probably, next time if my work load allows me that. Dmitry
Created attachment 7891 [details] Patch containing a modified version of the flow analysis changes Hi Dmitry, I finally found the time to look at our patch. Unfortunatelly some of the refactoring code changed so I couldn't fully apply the patch. I think the most interesting part currently are the modification done to the flow analysis. I reviewed them and did some minor changes. Mostly wording and I enhanced the method perform(ASTNode) you added. Due to some limitation of the flow analyzer it can only accept body declarations except types and node that are inside a body declaration that is accepted. Please let me know if the changes I did are OK for you. If so I am going to put the changes into head so that you can continue working on the bug.
Created attachment 7943 [details] Updated patch Dirk, I agree with your changes. I have merged FlowInfo changes into the patch and updated it, so it can be applied to latest source code now. Dmitry
*** Bug 51966 has been marked as a duplicate of this bug. ***
Created attachment 8737 [details] Updated patch
Hi Dmitry, I looked at the code and it seems OK to me, except of the following issues: Code structuring ================ I moved some methods back from CallInliner to SourceProvider (the ones dealing with creating the source lists). I also changed the order of some parameters. A "result" passed as a parameter should always be passed as the first argument. Handling of rewriters/asts: ============================ there seems to a bogus usage of the two asts regarding which one is the owner of a newly created node. For example the method inlineInVariableDeclaration calls the method createDefaultInitializer with the source ast. The method creates new AST nodes which are then used as a replacemenet inside of the target AST/rewriter. This may cause grief and should be avoided since it can cause illegal argument exceptions. Try inlining the following example: public class ZZZ { public String tobeInlined() { if (true) return "dirk"; else return "ralf"; } public void bar() { String s= tobeInlined(); } public void baz() { } } Can you please make a pass over the code and make sure that the AST nodes are created for the right target rewriter/ast. Place that I already dicovered during code review: - inlineVoidMethod: should use the AST from the source provider. - replaceReturnWithAssignment: I already changed this Comment handling: ================= Don't we "loose" comments in the way inlining works now. The method getSourceContent (which I moved back to SourceProvider) gives back a separate String for all the statements in the passed list. When we have code like public void foo() { bar(); // comment bar(); } and inline foo we get public void baz() { bar(); bar(); } So the method getSourceContent should be clever and not remove the comment. It should generate one String containing bar();//comment;bar();
Created attachment 8753 [details] Patch containing the reviewed code.
Dmitry. let me know waht you think about my comments and the code changes.
Didn't make it into 3.0.
Comment on attachment 8753 [details] Patch containing the reviewed code. Patch can no longer be applied (outdated).