Bug 326950 - [compiler][null]Do not optimize code generation based on static analysis (dead code)
Summary: [compiler][null]Do not optimize code generation based on static analysis (dea...
Status: VERIFIED FIXED
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Core (show other bugs)
Version: 3.7   Edit
Hardware: All All
: P3 normal (vote)
Target Milestone: 3.7 M6   Edit
Assignee: Ayushman Jain CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-10-04 13:31 EDT by Ayushman Jain CLA
Modified: 2011-03-08 17:30 EST (History)
6 users (show)

See Also:


Attachments
proposed fix (4.60 KB, patch)
2011-02-17 09:01 EST, Ayushman Jain CLA
no flags Details | Diff
another proposed solution (4.03 KB, patch)
2011-02-25 10:14 EST, Ayushman Jain CLA
no flags Details | Diff
proposed fix v2.0 (34.45 KB, patch)
2011-03-02 14:01 EST, Ayushman Jain CLA
no flags Details | Diff
updated fix v2.0 (2.68 KB, patch)
2011-03-04 08:55 EST, Ayushman Jain CLA
no flags Details | Diff
proposed fix v2.1 + tests (53.30 KB, patch)
2011-03-04 10:21 EST, Ayushman Jain CLA
no flags Details | Diff
proposed fix v2.2 + tests (69.55 KB, patch)
2011-03-04 14:53 EST, Ayushman Jain CLA
no flags Details | Diff
released patch (72.45 KB, patch)
2011-03-05 12:18 EST, Ayushman Jain CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Ayushman Jain CLA 2010-10-04 13:31:10 EDT
We had introduced optimization for the generated code for if-else statements in 3.6, such that when the compiler can determine either the if or the else branch will never be executed, it optimizes out that code. 

This optimization was basically dependent on the compiler's static analysis. Because of this, we think there's a risk involved in making changes to static analysis code, since any change may directly affect the code gen, and if there's a regression introduced by the change, it will manifest itself in the code generated. Because of the associated risk as mentioned above, I suggest we keep the improvements done to static analysis, but do not optimize if-else code generation based on static analysis.
Comment 1 Stephan Herrmann CLA 2010-10-04 13:52:08 EDT
In a similar vein I was thinking whether we should encourage developers
to set dead code warnings to "error" so they'd definitely see what the
compiler deems as dead.

But maybe we can combine both considerations: make the optimization an 
option that is off by default, and when switching to "on" pop up a
dialog asking whether dead code should be reported as error (recommended).

Generally speaking, if it is an either-or, I see more benefit in static
analysis than in byte code optimization.
Comment 2 Ayushman Jain CLA 2010-10-05 04:23:37 EDT
(In reply to comment #1)
> In a similar vein I was thinking whether we should encourage developers
> to set dead code warnings to "error" so they'd definitely see what the
> compiler deems as dead.
> 
> But maybe we can combine both considerations: make the optimization an 
> option that is off by default, and when switching to "on" pop up a
> dialog asking whether dead code should be reported as error (recommended).

There's already an option available for setting dead code to 'error'. We could consider having an enable/disable option for optimizing code gen which is grayed out unless dead code is set "error". But even then, I'm not too inclined to have it because in the future if there's a bad change in the static analysis code, and the code gen gets affected, its still a major bug, even though it can be "hidden" by enabling the option. I still dont want to go ahead and take the next step. We can discuss on the best approach. Olivier, any thoughts?
Comment 3 Ayushman Jain CLA 2010-10-08 05:23:28 EDT
While investigating this, I realised i was under a misconception thinking that we had introduced optimization of dead if/else branch for 3.6. Even before 3.6 we used to optimize not just dead if-else code, but other statements too, such as returns, switch, local declaration, etc. using the IsReachable bit set on the ASTNode during static analysis. 
For 3.6, we had started to optimize out the generation of if statement's condition itself if it was determined that only one branch is going to be executed. This was done with the fix for bug 293917.
So for the following code
String s = null;
if (s == null) {
   System.out.println("s is null");
} else {
   System.out.println("s is non null");
}

before 3.6 we do generate the code for if(s==null), but not for the else statement - System.out.println("s is non null");
Post 3.6 we *dont* generate the code for if(s== null) too, since its basically redundant.

In light of this observation, specifically plucking out code gen optimization for if-else will be messy and also doesnt make much sense since we're optimizing other kinds of statements. Sorry for the false alarm, I would want to leave this as it is.
Comment 4 Ayushman Jain CLA 2010-10-19 02:14:36 EDT
Closing as WONTFIX since this behaviour has always been there and is too interwoven into the static analysis. See comment 3.
Comment 5 Srikanth Sankaran CLA 2010-10-26 00:43:29 EDT
Verified for 3.7 M3
Comment 6 Ayushman Jain CLA 2011-02-14 10:20:48 EST
Reopening to explore removing deadcode optimization for generated code.
Comment 7 Ayushman Jain CLA 2011-02-17 09:01:18 EST
Created attachment 189182 [details]
proposed fix

This patch comments out the places where the ASTNode.IsReachable bit is reset, so that dead code is not optimized out during code gen. It also disables optimizing out of the if condition statement when one branch of the if statement is dead code.

Some tests fail because more code is generated now. They mostly look ok. However, I dont understand the cause of failure for org.eclipse.jdt.tests.compiler.regression.FromJacksTest.test066(). Looks like some issue in the class file generation because of which the class loader is not able to load the class. I get a 
junit.framework.AssertionFailedError: Unexpected error running resulting class file for X.java:
--[START]--
java.lang.VerifyError: JVMCFRE016 local variable index out of range in increment; class=X, method=main([Ljava/lang/String;)V, pc=1
	at java.lang.ClassLoader.defineClass(ClassLoader.java:265)
	.........

I tried investigating but couldn't really understand why this happens. Olivier, does anything strike looking at the error message on what can be the cause?
Comment 8 Stephan Herrmann CLA 2011-02-17 09:11:02 EST
(In reply to comment #7)
> However, I dont understand the cause of failure for
> org.eclipse.jdt.tests.compiler.regression.FromJacksTest.test066().

Can I find this test somewhere in CVS?
Comment 9 Olivier Thomann CLA 2011-02-17 09:12:44 EST
(In reply to comment #8)
> Can I find this test somewhere in CVS?
No, it is an internal test.

Ayushman, I'll take a look later today.
Comment 10 Olivier Thomann CLA 2011-02-23 16:36:42 EST
With the current patch, we end up generating with preserve all locals.

  public static void main(java.lang.String[] args);
    0  return
    1  iinc 1 1

which is clearly wrong.
I'll take a look.
Comment 11 Olivier Thomann CLA 2011-02-23 18:45:04 EST
I think the problem with this patch is that we don't propagate the unreachable status to the statement bits the way it should be done.

In this case, the test case looks like this:

public class X {
	public static void main(String[] args) {
		int i;
		label : {
			if (false)
				break label;
			return;
		}
		i++;
	}
}

The bug was introduced by the line commented out in:
org.eclipse.jdt.internal.compiler.ast.Statement.complainIfUnreachable(FlowInfo, BlockScope, int)

By doing that we don't actually tag the corresponding code as unreachable and therefore we end up generating it.

The optimization that we would like to consider removing is the one that tags as dead code statements as the result of the null analysis. Let me know if I misunderstood it.

The current patch goes too far as trying to remove the tagging of unreachable code as unreachable.
	if ((flowInfo.reachMode() & FlowInfo.UNREACHABLE) != 0) {
		this.bits &= ~ASTNode.IsReachable;
		...

That method makes sure that the statement is tagged as unreachable when the corresponding flow info is tainted as unreachable. If the line:
this.bits &= ~ASTNode.IsReachable;
is commented out, we don't properly propagate the "unreachableness" when needed leading to the VerifyError you got.

Let me know if this comment is not clear enough.
Comment 12 Ayushman Jain CLA 2011-02-24 06:12:05 EST
Ok, so basically this test shows us one scenario in which a dead code HAS to be optimized out, come what may. Then our approach of removing the optimization will not work out, and this bug should be closed unresolved again.
I think this failing tests actually shows one case where we can clearly infer that i++ is unreachable, but because of the spec, cant raise that warning. So we have to make do with a deadcode warning, yet at the same time have to optimize it out just like an unreachable code.
Comment 13 Stephan Herrmann CLA 2011-02-24 07:27:09 EST
(In reply to comment #12)
> Ok, so basically this test shows us one scenario in which a dead code HAS to be
> optimized out, come what may. Then our approach of removing the optimization
> will not work out, and this bug should be closed unresolved again.

That would be a pitty.

> I think this failing tests actually shows one case where we can clearly infer
> that i++ is unreachable, but because of the spec, cant raise that warning. So
> we have to make do with a deadcode warning, yet at the same time have to
> optimize it out just like an unreachable code.

Would it help to add one more constant to FlowInfo.{REACHABLE,UNREACHABLE}
that would signal unreachability based on null-analysis?
Only for those NULLNESS_UNREACHABLE would we turn of optimization by default.

From a quick glance it could work if we just let
FlowContext.recordUsingNullReference use the new constant.
I haven't checked about the sites that *query* the reachability, though.
Comment 14 Ayushman Jain CLA 2011-02-24 07:33:58 EST
(In reply to comment #13)
> Would it help to add one more constant to FlowInfo.{REACHABLE,UNREACHABLE}
> that would signal unreachability based on null-analysis?
> Only for those NULLNESS_UNREACHABLE would we turn of optimization by default.

I'm not sure if we will be able to cleanly differentiate between these two constants. Though I can try and see if it works.
Comment 15 Ayushman Jain CLA 2011-02-24 08:12:02 EST
(In reply to comment #14)
> I'm not sure if we will be able to cleanly differentiate between these two
> constants. Though I can try and see if it works.

Actually this is very confusing. In places like org.eclipse.jdt.internal.compiler.flow.FlowInfo.mergedOptimizedBranchesIfElse(FlowInfo, boolean, FlowInfo, boolean, boolean, FlowInfo, IfStatement), there's no clear distinction about why a branch is being marked unreachable.
Also, I can only imagine the plight of those who are going to work on this area in the future. Null analysis code is as it is so complex! ;)
Comment 16 Stephan Herrmann CLA 2011-02-24 08:36:30 EST
(In reply to comment #15)
> (In reply to comment #14)
> > I'm not sure if we will be able to cleanly differentiate between these two
> > constants. Though I can try and see if it works.
> 
> Actually this is very confusing. In places like
> org.eclipse.jdt.internal.compiler.flow.FlowInfo.mergedOptimizedBranchesIfElse(FlowInfo,
> boolean, FlowInfo, boolean, boolean, FlowInfo, IfStatement), there's no clear
> distinction about why a branch is being marked unreachable.

Which branch is causing grief?
I think FlowInfo.DEAD_END always means: "really dead", right?
flowInfo.tagBits & FlowInfo.UNREACHABLE can easily make the new difference.
Is it (ifStatement.bits & ASTNode.IsElseStatementUnreachable) which is unclear?
Maybe that flag would have to be duplicated, too? <sigh/>

> Also, I can only imagine the plight of those who are going to work on this area
> in the future. Null analysis code is as it is so complex! ;)

;)
Comment 17 Srikanth Sankaran CLA 2011-02-25 00:16:54 EST
(In reply to comment #15)
> (In reply to comment #14)
> > I'm not sure if we will be able to cleanly differentiate between these two
> > constants. Though I can try and see if it works.
> 
> Actually this is very confusing. In places like
> org.eclipse.jdt.internal.compiler.flow.FlowInfo.mergedOptimizedBranchesIfElse(FlowInfo,
> boolean, FlowInfo, boolean, boolean, FlowInfo, IfStatement), there's no clear
> distinction about why a branch is being marked unreachable.
> Also, I can only imagine the plight of those who are going to work on this area
> in the future. Null analysis code is as it is so complex! ;)

I share this worry, my intention in pushing for this to build an escape hatch
not further traps (via more complicated code). If this cannot be cleanly
implemented, it is better to drop this IMO.
Comment 18 Stephan Herrmann CLA 2011-02-25 08:36:36 EST
Aysush, I'd like to offer my help here.
In turn you might do me the favor of double-checking bug 336428 :)
(It's a small patch).
Comment 19 Ayushman Jain CLA 2011-02-25 10:14:27 EST
Created attachment 189809 [details]
another proposed solution

Stephan, before starting to devote more time on this, you may like to consider my alternative solution: 
Let us only stop optimization of dead code in the context of if-else statements, because thats where all the regressions happened because of the improved dead code analysis.
The above patch achieves that by utilizing the return value of complainIfUnreachable(..) and if it is determined that we warned for dead code for a statement, just mark it as reachable. 
So, for 
 if (false)
     i++;
we mark i++ as deadcode but dont optimize it away.

But for,
public static void main(String[] args) {
		int i;
		l : {
			if (false)
				break;    
			return;
		}
		i++;      
	}

we still optimize away the i++
Comment 20 Ayushman Jain CLA 2011-02-25 10:29:52 EST
(In reply to comment #19)
> Created attachment 189809 [details] [diff]
> another proposed solution

This does not mean that I'm opposed to your solution. I'm just trying to find an easy way out. Its still possible though that your approach may offer a more comprehensive solution. So, maybe you can also attach a patch when you have one! :)
Comment 21 Olivier Thomann CLA 2011-02-25 10:45:16 EST
I thought the purpose of this bug report was to remove "null" analysis result from the optimized code generation of the if statement.
All recent "regressions" over the last year or so were all related to the if statement that was wrongly optimized due to changes made in the null analysis.

Did I miss something ?
Comment 22 Stephan Herrmann CLA 2011-02-25 10:49:10 EST
(In reply to comment #19)
> Created attachment 189809 [details]
> another proposed solution
> 
> Stephan, before starting to devote more time on this, you may like to consider
> my alternative solution: 
> Let us only stop optimization of dead code in the context of if-else
> statements, because thats where all the regressions happened because of the
> improved dead code analysis.
> The above patch achieves that by utilizing the return value of
> complainIfUnreachable(..) and if it is determined that we warned for dead code
> for a statement, just mark it as reachable. 
> So, for 
>  if (false)
>      i++;
> we mark i++ as deadcode but dont optimize it away.
> 
> But for,
> public static void main(String[] args) {
>         int i;
>         l : {
>             if (false)
>                 break;    
>             return;
>         }
>         i++;      
>     }
> 
> we still optimize away the i++

This sounds good as an escape hatch to avoid wrong optimizations.

It also turns off optimization of development time code like
  if (DEBUG) ...
etc.
People won't want to remove these, despite the warning, but still expect
this to be excluded from released code.

So the extra bonus would be, if for those IfStatements in question we keep
a bit signaling *why* analysis believes this is dead code.

I'll give it a try, and if it turns out to significantly complicate the
code your alternative might be good enough for now.
Comment 23 Stephan Herrmann CLA 2011-02-25 10:51:20 EST
(In reply to comment #21)
> I thought the purpose of this bug report was to remove "null" analysis result
> from the optimized code generation of the if statement.

In the luxurious solution, "null analysis" would be one (the prominent?) reason
to be signaled.
Comment 24 Ayushman Jain CLA 2011-02-25 11:13:49 EST
(In reply to comment #21)
> I thought the purpose of this bug report was to remove "null" analysis result
> from the optimized code generation of the if statement.
> All recent "regressions" over the last year or so were all related to the if
> statement that was wrongly optimized due to changes made in the null analysis.
> 
> Did I miss something ?

Yeah that was the original intension, but in last week's call, we'd deliberated upon why we need to optimize code gen at all. So i'd reopened this bug, changed the summary and was trying to see what will be the consequence if we disable any kind of code optimization based on null analysis. Hence, the mess. :)
Comment 25 Stephan Herrmann CLA 2011-02-27 16:51:28 EST
Could the possible solution sketched in bug 338234 comment 8 also resolve
this issue?
Comment 26 Ayushman Jain CLA 2011-02-28 05:14:41 EST
(In reply to comment #25)
> Could the possible solution sketched in bug 338234 comment 8 also resolve
> this issue?

Yes, i think it can
Comment 27 Ayushman Jain CLA 2011-03-02 14:01:43 EST
Created attachment 190188 [details]
proposed fix v2.0

This patch leverages the new bit introduced for bug 338234. This is pretty straightforward and marks a statement unreachable only when the unreachability is decided by reasons other than null analysis.
It also makes sure we dont optimize away the "if (condition)" itself, incase one branch of the if-else statement is analysed to be dead code.

Olivier/Srikanth, do you think we need a new option here? I guess we can just do away with the optimization of dead code since this patch decouples it from null analysis.
Comment 28 Ayushman Jain CLA 2011-03-03 02:06:04 EST
(In reply to comment #27)
> Created attachment 190188 [details] [diff]
> proposed fix v2.0
 
Forgot to mention, this patch should only be applied after the patch for bug 338234 has been applied.
Comment 29 Olivier Thomann CLA 2011-03-03 13:46:24 EST
I believe test org.eclipse.jdt.core.tests.compiler.regression.NullReferenceTest#testBug304416 needs to be updated.
Comment 30 Ayushman Jain CLA 2011-03-03 14:06:53 EST
(In reply to comment #29)
> I believe test
> org.eclipse.jdt.core.tests.compiler.regression.NullReferenceTest#testBug304416
> needs to be updated.

Yes, the tests can be ignored for now. I will update the tests once the patch for bug 338234 is released.
Comment 31 Ayushman Jain CLA 2011-03-04 08:55:55 EST
Created attachment 190384 [details]
updated fix v2.0

While trying to test this patch, I found out that not all is well, yet. I noticed that the compiler fails to generate a return statement at the end of the method in cases such as:

void foo(){
   String s = "";
   if (s!= null)
      return;
   System.out.println();    // dead code
}

Here even though the compiler correctly does *not* optimize away the sysout statement, it does not generate a "return" following it, resulting in an invalid classfile. Code generated for every method should end with a return statement.
Comment 32 Ayushman Jain CLA 2011-03-04 09:06:52 EST
(In reply to comment #31)
This happened because of the following line in MethodDeclaration.analyseCode():
if ((flowInfo.tagBits & FlowInfo.UNREACHABLE) == 0) {
	this.bits |= ASTNode.NeedFreeReturn;
}

changing the bit to UNREACHABLE_OR_DEAD fixes this
Comment 33 Olivier Thomann CLA 2011-03-04 09:37:20 EST
Tagging as M6 as this is ready to be released.
Comment 34 Ayushman Jain CLA 2011-03-04 10:21:39 EST
Created attachment 190394 [details]
proposed fix v2.1 + tests

The above free return issue opened my eyes a bit, and now I've made sure that all the places where UNREACHABLE bit is used to set or reset any bit for code gen, now use UNREACHABLE_OR_DEAD instead (since we dont want code gen to be affected by UNREACHABLE_BY_NULLANALYSIS). This has made the patch a bit bigger but the changes are all same- changing the bit.
I've also added tests to make sure free return is generated for methods and constructors as well.
Tests currently running.
Comment 35 Ayushman Jain CLA 2011-03-04 12:05:33 EST
(In reply to comment #34)
> Created attachment 190394 [details] [diff]
> proposed fix v2.1 + tests
> Tests currently running.
All tests pass.
Olivier, can you take a quick look at the patch and see if you find anything amiss. I've tested the patch and I'm fairly convinced we're good here. Will release once you give the go ahead. Thanks!
(btw, the changes in IfStatement.generateCode(..) are done to make sure we don't optimize away the generation of the if condition expression itself if one branch is dead code, since now we're not optimizing out the dead branch.)
Comment 36 Olivier Thomann CLA 2011-03-04 12:20:23 EST
For the synthetic variable/method management, we have:
if ((flowInfo.tagBits & FlowInfo.UNREACHABLE) != 0)	return;

I believe this should test UNREACHABLE_OR_DEAD since otherwise we might optimize out some synthetic members and want to refer them later in the code.

Did I miss something ?

Beside this, it looks good.
Comment 37 Ayushman Jain CLA 2011-03-04 13:26:14 EST
(In reply to comment #36)
> For the synthetic variable/method management, we have:

Hmm. Yeah this too affects code gen.
Made changes to all manageSyntheticAccess.. and manageEnclosingInstanceAccess..
methods. Tests running
Comment 38 Ayushman Jain CLA 2011-03-04 14:53:00 EST
Created attachment 190438 [details]
proposed fix v2.2 + tests

Hopefully this should be the final patch.
Comment 39 Olivier Thomann CLA 2011-03-04 15:32:41 EST
There is some indentation that can be fixed inside the org.eclipse.jdt.internal.compiler.flow.LoopingFlowContext#recordContinueFrom(FlowContext, FlowInfo).

Patch looks good.
Comment 40 Ayushman Jain CLA 2011-03-04 15:49:13 EST
(In reply to comment #39)
> There is some indentation that can be fixed inside the
> org.eclipse.jdt.internal.compiler.flow.LoopingFlowContext#recordContinueFrom(FlowContext,
> FlowInfo).

Ok, will update this. Will anyway have to update after Stephan releases bug 324178. Thanks!
Comment 41 Ayushman Jain CLA 2011-03-05 12:18:08 EST
Created attachment 190473 [details]
released patch

Same patch updated for HEAD.
Comment 42 Ayushman Jain CLA 2011-03-05 12:18:58 EST
Released in HEAD for 3.7M6.
Comment 43 Satyam Kandula CLA 2011-03-08 04:45:00 EST
Verified for 3.7M6 using build I20110307-0800
Comment 44 Karim Fatehi CLA 2011-03-08 11:57:01 EST
public static void main(String[] args) {
String test = null;
test += "... some text";
if (test != null) {
System.out.println("It worked!");
}
}

generates a warning that is not correct.
also does not print "It worked!" as expected in 3.6.2
Comment 45 Ayushman Jain CLA 2011-03-08 11:59:40 EST
(In reply to comment #44)
> generates a warning that is not correct.
Warning is not correct, agreed.

> also does not print "It worked!" as expected in 3.6.2

With the above fix, this is correctly printed. This can't be fixed in 3.6.2 unfortunately, since its already shipped.
Comment 46 Ayushman Jain CLA 2011-03-08 12:02:39 EST
(In reply to comment #45)
> (In reply to comment #44)
> > generates a warning that is not correct.
> Warning is not correct, agreed.
Opened bug 339250