Bug 325755 - [compiler] wrong initialization state after conditional expression
Summary: [compiler] wrong initialization state after conditional expression
Status: VERIFIED FIXED
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Core (show other bugs)
Version: 3.7   Edit
Hardware: PC Windows XP
: P3 critical (vote)
Target Milestone: 3.7 M2   Edit
Assignee: Ayushman Jain CLA
QA Contact:
URL:
Whiteboard:
Keywords:
: 325749 (view as bug list)
Depends on:
Blocks:
 
Reported: 2010-09-20 10:03 EDT by Dani Megert CLA
Modified: 2011-05-03 01:56 EDT (History)
10 users (show)

See Also:


Attachments
Regression test (1.67 KB, patch)
2010-09-20 13:24 EDT, Olivier Thomann CLA
no flags Details | Diff
proposed fix (1.24 KB, patch)
2010-09-20 15:38 EDT, Stephan Herrmann CLA
no flags Details | Diff
fix v2 plus tests (4.93 KB, patch)
2010-09-20 16:35 EDT, Stephan Herrmann CLA
amj87.iitr: iplog+
amj87.iitr: review+
Details | Diff
additional test (2.31 KB, patch)
2010-09-21 09:16 EDT, Stephan Herrmann CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Dani Megert CLA 2010-09-20 10:03:36 EDT
N20100919-2000 - works in 3.7 M2.

Not sure whether this is caused by recent changes in PDE build or Equinox.

1. check out a bundle from CVS, e.g. org.eclipse.core.filebuffers
2. File > Export... > Deployable plug-ins and fragments
==> NPE:
!ENTRY org.eclipse.core.jobs 4 2 2010-09-20 15:52:26.567
!MESSAGE An internal error occurred during: "Export Plug-ins".
!STACK 0
java.lang.NullPointerException
	at org.eclipse.osgi.framework.internal.core.FilterImpl$Parser.<init>(FilterImpl.java:1373)
	at org.eclipse.osgi.framework.internal.core.FilterImpl.newInstance(FilterImpl.java:142)
	at org.eclipse.osgi.framework.internal.core.BundleContextImpl.createFilter(BundleContextImpl.java:915)
	at org.eclipse.pde.internal.build.BundleHelper.createFilter(BundleHelper.java:123)
	at org.eclipse.pde.internal.build.BundleHelper.getFilter(BundleHelper.java:165)
	at org.eclipse.pde.internal.build.builder.ClasspathComputer3_0.matchFilter(ClasspathComputer3_0.java:574)
	at org.eclipse.pde.internal.build.builder.ClasspathComputer3_0.addPluginAndPrerequisites(ClasspathComputer3_0.java:566)
	at org.eclipse.pde.internal.build.builder.ClasspathComputer3_0.addPrerequisites(ClasspathComputer3_0.java:528)
	at org.eclipse.pde.internal.build.builder.ClasspathComputer3_0.getClasspath(ClasspathComputer3_0.java:120)
	at org.eclipse.pde.internal.build.builder.ModelBuildScriptGenerator.generateBuildJarsTarget(ModelBuildScriptGenerator.java:1260)
	at org.eclipse.pde.internal.build.builder.ModelBuildScriptGenerator.generateBuildScript(ModelBuildScriptGenerator.java:307)
	at org.eclipse.pde.internal.build.builder.ModelBuildScriptGenerator.generate(ModelBuildScriptGenerator.java:161)
	at org.eclipse.pde.internal.build.builder.BuildDirector.generateModels(BuildDirector.java:520)
	at org.eclipse.pde.internal.build.builder.BuildDirector.generateChildrenScripts(BuildDirector.java:265)
	at org.eclipse.pde.internal.build.builder.BuildDirector.generate(BuildDirector.java:199)
	at org.eclipse.pde.internal.build.builder.BuildDirector.generate(BuildDirector.java:192)
	at org.eclipse.pde.internal.build.BuildScriptGenerator.generateFeatures(BuildScriptGenerator.java:238)
	at org.eclipse.pde.internal.build.BuildScriptGenerator.generate(BuildScriptGenerator.java:112)
	at org.eclipse.pde.internal.core.exports.FeatureExportOperation.doExport(FeatureExportOperation.java:258)
	at org.eclipse.pde.internal.core.exports.FeatureBasedExportOperation.run(FeatureBasedExportOperation.java:50)
	at org.eclipse.core.internal.jobs.Worker.run(Worker.java:54)
Comment 1 Thomas Watson CLA 2010-09-20 10:24:27 EDT
I wonder if this is actually a compiler issue.  In org.eclipse.pde.internal.build.BundleHelper.getFilter(BundleDescription) I see a warning about the following code at the end of the method for deadcode:

		if (filterString != null)
			return createFilter(filterString);
		return null;

The compiler seems to think filterString can never be null.  I think the compiler is removing the if statement and always calling createFilter even when filterString is null.  That would explain the strange NPE we are seeing.  This would be similar to bug324762.
Comment 2 Thomas Watson CLA 2010-09-20 10:30:42 EDT
Has something changed in the compiler that causes it to automatically remove deadcode?  I really think deadcode removal should be an option that is disabled by default.  I know of other cases that have code that is expected to be called by reflection.  I fear this could cause such code to be removed if it is determined to be dead code.
Comment 3 John Arthorne CLA 2010-09-20 10:33:52 EDT
I agree this looks like a possible compiler bug.
Comment 4 Dani Megert CLA 2010-09-20 10:37:21 EDT
In HEAD the fix for bug 325567 could be related but I AFAIK the build does not use the HEAD version and it works with M2. Kim, which version of the compiler was used for the N-build?
Comment 5 Thomas Watson CLA 2010-09-20 10:42:39 EDT
Either way I don't see how org.eclipse.pde.internal.build.BundleHelper.createFilter is getting called with a null from line 165 of org.eclipse.pde.internal.build.BundleHelper.getFilter.  I admit that I have not tried to reproduce on the n-build.

I am using M2 and I do see the warning about deadcode in BundleHelper.getFilter.  Perhaps Kim moved up to M2 for the N-build and the bug is in the M2 compiler.
Comment 6 DJ Houghton CLA 2010-09-20 10:44:36 EDT
This might explain bug 325749?
Comment 7 Olivier Thomann CLA 2010-09-20 10:47:38 EDT
Move to JDT/Core
Comment 8 Olivier Thomann CLA 2010-09-20 10:47:56 EDT
Ayushman, please investigate.
Comment 9 Olivier Thomann CLA 2010-09-20 11:24:00 EDT
(In reply to comment #4)
> In HEAD the fix for bug 325567 could be related but I AFAIK the build does not
> use the HEAD version and it works with M2. Kim, which version of the compiler
> was used for the N-build?
The change for bug 325567 is unrelated.
Comment 10 John Arthorne CLA 2010-09-20 12:02:49 EDT
(In reply to comment #4)
> In HEAD the fix for bug 325567 could be related but I AFAIK the build does not
> use the HEAD version and it works with M2. Kim, which version of the compiler
> was used for the N-build?

The builder is currently using 3.7 M1 version of JDT compiler. So maybe this is the same code gen bug that was already fixed in M2?
Comment 11 Olivier Thomann CLA 2010-09-20 12:22:36 EDT
In this reduced test case:
public class X {
	public static Object foo(String s1, String s2) {
		String local1 = s1;
		String local2 = s2;
		
		String local3 = null;
		if (local1 != null && local2 != null)
			local3 = ""; //$NON-NLS-1$
		else
			local3 = local1 != null ? local1 : local2;

		if (local3 != null)
			return new Integer(local3.length());
		return null;
	}
	
	public static void main(String[] args) {
		System.out.println(foo(null, null));
		System.out.println(foo("p1", null));
		System.out.println(foo(null, "p2"));
		System.out.println(foo("p1", "p2"));
	}
}

		if (local3 != null)
			return new Integer(local3.length());
		return null;
in the if condition the local3 is considered as definitely non null which is wrong.
Comment 12 Olivier Thomann CLA 2010-09-20 13:02:01 EDT
The null info of:
local1 != null ? local1 : local2 in local3 = local1 != null ? local1 : local2;

is tagged as POTENTIALLY_NULL, POTENTIALLY_NON_NULL and POTENTIALLY_UNKNOWN.
But when later asked if the local1 is definitely non null, the answer is true.

What is surprising is that inside the assignment the null status seems to be right. Then the following bits are set for the local:
markPotentiallyUnknownBit(local);
markPotentiallyNullBit(local);
markPotentiallyNonNullBit(local);

Ayushman, please investigate this asap. We must fix it for the next I-build.
Comment 13 Olivier Thomann CLA 2010-09-20 13:11:13 EDT
If the conditional expression is replaced with an if statement, everything works as expected.
Comment 14 Olivier Thomann CLA 2010-09-20 13:24:55 EDT
Created attachment 179259 [details]
Regression test
Comment 15 Ayushman Jain CLA 2010-09-20 13:47:10 EDT
(In reply to comment #12)
> The null info of:
> local1 != null ? local1 : local2 in local3 = local1 != null ? local1 : local2;
> 
> is tagged as POTENTIALLY_NULL, POTENTIALLY_NON_NULL and POTENTIALLY_UNKNOWN.
> But when later asked if the local1 is definitely non null, the answer is true.
> 
> What is surprising is that inside the assignment the null status seems to be
> right. Then the following bits are set for the local:
> markPotentiallyUnknownBit(local);
> markPotentiallyNullBit(local);
> markPotentiallyNonNullBit(local);

Thats strange. I guess it may have something to do with the fix for bug 133125, but I'm not sure. I'll investigate.
Comment 16 Ayushman Jain CLA 2010-09-20 13:50:19 EDT
(In reply to comment #0)
> N20100919-2000 - works in 3.7 M2.

Thats another strange thing. We havent released anything around static analysis since 3.7M2 last week!
Comment 17 Olivier Thomann CLA 2010-09-20 14:45:55 EDT
(In reply to comment #16)
> (In reply to comment #0)
> > N20100919-2000 - works in 3.7 M2.
> Thats another strange thing. We havent released anything around static analysis
> since 3.7M2 last week!
I reproduced the wrong behavior with HEAD contents. So I don't see how this can work with M2.
Comment 18 Stephan Herrmann CLA 2010-09-20 14:50:05 EDT
(In reply to comment #16)
> Thats strange. I guess it may have something to do with the fix for bug 133125,
> but I'm not sure. I'll investigate.

Any findings yet? I can join you investigating ...
Comment 19 Ayushman Jain CLA 2010-09-20 15:02:50 EDT
(In reply to comment #18)
> (In reply to comment #16)
> > Thats strange. I guess it may have something to do with the fix for bug 133125,
> > but I'm not sure. I'll investigate.
> 
> Any findings yet? I can join you investigating ...

Unfortunately this scenario seems to be occuring because of our fix for bug 324762 (the combining of null status in nullStatus(..)) . That, combined with the shortcoming in case of conditional expressions given in bug 324178 is causing this to happen. If we can somehow analyse the expression local1!=null before collecting and combining the null status from the 2 branches of the conditional expression, this problem wont occur. 

Right now, in Assignment.java: line 45 we call ConditionalExpression.nullStatus(), and on line 52 we call analyse assignment. In case of conditional expression, putting line 45 after line 52 seems viable, but i'm not sure if it'll be ok to do so for other kinds of expressions. This will need some more investigations, but its getting late here and some parts of my brain are going into switch off mode, so I'm not sure if i missed something in my analysis! :)
I'll try to come up with a patch if possible tonight itself, or else will continue investigation tomorrow.
Comment 20 Stephan Herrmann CLA 2010-09-20 15:27:00 EDT
I think I have a fix coming up very soon:

(In reply to comment #15)

> is tagged as POTENTIALLY_NULL, POTENTIALLY_NON_NULL and POTENTIALLY_UNKNOWN.
> But when later asked if the local1 is definitely non null, the answer is true.

First quick smell is: the bit code 0111 was not originally documented
by Maxime. It was first observed in bug 320170.


It appears that UnconditionalFlowInfo.mergedWith(UnconditionalFlowInfo)
needs a similar fix to what we did in bug 320170 to
NullInfoRegistry.mitigateNullInfoOf().

First quick&dirty testing shows that Oliviers test case passes.
More testing needed before I'll post a patch here.
Comment 21 Ayushman Jain CLA 2010-09-20 15:36:24 EDT
(In reply to comment #20)
> I think I have a fix coming up very soon:
> 
> (In reply to comment #15)
> 
> > is tagged as POTENTIALLY_NULL, POTENTIALLY_NON_NULL and POTENTIALLY_UNKNOWN.
> > But when later asked if the local1 is definitely non null, the answer is true.
> 
> First quick smell is: the bit code 0111 was not originally documented
> by Maxime. It was first observed in bug 320170.
> 
> 
> It appears that UnconditionalFlowInfo.mergedWith(UnconditionalFlowInfo)
> needs a similar fix to what we did in bug 320170 to
> NullInfoRegistry.mitigateNullInfoOf().

Yes. This is also the conclusion i had reached, after having commented and reconsidering.
Comment 22 Andrew Niefer CLA 2010-09-20 15:37:44 EDT
*** Bug 325749 has been marked as a duplicate of this bug. ***
Comment 23 Stephan Herrmann CLA 2010-09-20 15:38:38 EDT
Created attachment 179274 [details]
proposed fix

This indeed seems to do the trick.
All of NullReferenceTests pass including Olivier's test case
from comment 14.
Comment 24 Thomas Watson CLA 2010-09-20 15:42:47 EDT
(In reply to comment #17)
> (In reply to comment #16)
> > (In reply to comment #0)
> > > N20100919-2000 - works in 3.7 M2.
> > Thats another strange thing. We havent released anything around static analysis
> > since 3.7M2 last week!
> I reproduced the wrong behavior with HEAD contents. So I don't see how this can
> work with M2.

I think Kim updated to M2 sometime last week for the builder to run some test builds.  I think we are still using M2 in the current nightly builds.  This explains why the pde.build bundle works in the M2 build, because pde.build in M2 was built with an older jdt core which does not have this bug.  The nightly builds ever since moving up to M2 have a broken pde.build because they were built with M2 jdt.core.

We need to either:

1) get a private jdt.core fix to use for the I-Build tomorrow.
2) backout the use of jdt.core M2 in the builder for the I-Build tomorrow.
3) delay the I-Build tomorrow.

Using the current builder as configured will result in a broken I-Build tomorrow.
Comment 25 Kim Moir CLA 2010-09-20 15:48:58 EDT
Yes, when I was talking to John earlier I had forgotten that I changed the builder on Friday to use the 3.7M2 test candidate bundles when I tagged the builder for another bug fix.  I'm running a test build now with the 3.7M1 jdt core bundle with the rest of the bundles from 3.7M2.
Comment 26 Olivier Thomann CLA 2010-09-20 15:50:57 EDT
(In reply to comment #24)
> We need to either:
> 
> 1) get a private jdt.core fix to use for the I-Build tomorrow.
> 2) backout the use of jdt.core M2 in the builder for the I-Build tomorrow.
> 3) delay the I-Build tomorrow.
> 
> Using the current builder as configured will result in a broken I-Build
> tomorrow.
Here is the way I see it. Stephan's patch fixes the issue. Thanks a lot Stephan for your help on this issue!
I will let Ayushman play with it tomorrow and make sure the fix goes into the next I-build.

Meanwhile I would recommend that the previous JDT/Core build is used instead of M2 for the I-build so that the build can proceed successfully.

Once the I-build is done with the fix in, then we can verify that the case is fixed. If successful, we should respin a new build for M2 that contains this fix. This is unfortunate, but I don't see how we can live with a problem like that till M3. I prefer to respin a build than getting people having all kinds of runtime failures.

Any comment?
Comment 27 Ayushman Jain CLA 2010-09-20 15:54:18 EDT
(In reply to comment #26)
[..]
> Any comment?

Sounds fine to me. I'll try and get this released before tomorrow's build tagging.
Comment 28 Stephan Herrmann CLA 2010-09-20 15:56:43 EDT
On the one hand I'm confident the patch gives us a compiler that
works for the issue at hand. I'm currently running all compiler
regression tests (on my poor little notebook), results are looking
good and by construction the patch has effect only if we encounter
a 0111 state, a state that previously was believed to be impossible.

On the long run we are a bit in an awkward situation, because we are 
adding tweaks around complex logical formulae which are not understood
by today's developers (please! correct me, if anybody has more 
insights than I have).

We should also invest some efforts on similar functions that might
suffer from the same shortcoming.

I'm sure the connection to our recent changes are only that those make
a previously *really* rare situation a bit more likely to happen.
Comment 29 Ayushman Jain CLA 2010-09-20 16:06:22 EDT
(In reply to comment #23)
> Created an attachment (id=179274) [details] [diff]
> proposed fix
> 
> This indeed seems to do the trick.
> All of NullReferenceTests pass including Olivier's test case
> from comment 14.

Stephen, perhaps the same fix has to be done in case we have extra storage (i.e. line 1600 onwards). Other than that I think the fix is ok. Thanks a lot for looking into it.
Comment 30 Stephan Herrmann CLA 2010-09-20 16:35:24 EDT
Created attachment 179283 [details]
fix v2 plus tests

(In reply to comment #29)
> (In reply to comment #23)
> > Created an attachment (id=179274) [details] [details] [diff]
> > proposed fix
> > 
> > This indeed seems to do the trick.
> > All of NullReferenceTests pass including Olivier's test case
> > from comment 14.
> 
> Stephen, perhaps the same fix has to be done in case we have extra storage
> (i.e. line 1600 onwards).

I fully agree. Patch version 2 includes a test that demonstrate that the
previous patch fails if we define more than 64 local variables in this 
method prior to local3. I've extended the fix to also cover this case.
Thanks for pointing this out.

> Other than that I think the fix is ok. Thanks a lot
> for looking into it.

welcome :)
Comment 31 Olivier Thomann CLA 2010-09-20 17:12:05 EDT
(In reply to comment #28)
> On the long run we are a bit in an awkward situation, because we are 
> adding tweaks around complex logical formulae which are not understood
> by today's developers (please! correct me, if anybody has more 
> insights than I have).
Yes, this is exactly the situation and this is rather unfortunate. The biggest issue was that the null analysis was left as an incomplete task and too many false positive warnings or missing warnings were found. So this area had to be improved. We could not simply remove everything.
Ayushman worked hard on improving this diagnosis and he made a really good job.

Now by fixing some issues, some other cases surfaced. I think this is expected as this area is completely undocumented. We are running thousands of tests for each build and none of them found this issue. What is unfortunate is that this ended up in a milestone build.

> We should also invest some efforts on similar functions that might
> suffer from the same shortcoming.
Do you have something specific in mind ?
 
> I'm sure the connection to our recent changes are only that those make
> a previously *really* rare situation a bit more likely to happen.
Yes, but the problem was already there. Recent fixes just made it more visible.

Thank you again and Ayushman too for looking into this.
Comment 32 Stephan Herrmann CLA 2010-09-20 18:00:37 EDT
(In reply to comment #31)
> What is unfortunate is that this ended up in a milestone build.

Unfortunate, indeed.
 
> > We should also invest some efforts on similar functions that might
> > suffer from the same shortcoming.
> Do you have something specific in mind ?

I'm currently using your test case as a starting point trying to
break UnconditionalFlowInfo.addInitializationsFrom(FlowInfo).
This and addPotentialNullInfoFrom should be the main suspects
from UnconditionalFlowInfo.

Knowing the kind of pattern that triggers wrong computations I see
good chances we can solve the 0111 problem completely.
 
> Yes, but the problem was already there. Recent fixes just made it more visible.

And I do hope that after those false alarms the analysis will also give
many more valid warnings in the future.
Comment 33 Olivier Thomann CLA 2010-09-20 20:06:08 EDT
I confirm that the org.eclipse.pde.internal.build.BundleHelper problem is fixed with this patch.
Comment 34 Ayushman Jain CLA 2010-09-21 02:34:14 EDT
Released in HEAD for 3.7M3
Updated copyrights
Comment 35 Dani Megert CLA 2010-09-21 02:50:23 EDT
>Once the I-build is done with the fix in, then we can verify that the case is
>fixed. If successful, we should respin a new build for M2 that contains this
>fix.
I agree. The bug is severe enough to justify a respin.
Comment 36 Stephan Herrmann CLA 2010-09-21 09:16:12 EDT
Created attachment 179310 [details]
additional test

Here's another test I suggest adding to the suite because it demonstrates
two interesting things:

- This bug could also be triggered without a conditional expression

- The test shows directly how bug 320170 and this bug interact:
  1.disable both patches: 
    no compiler warning but NPE at runtime
    (given appropriate infrastructure for running the class)
  2.enable patch from bug 320170 only: 
    correct warning re x and
    incorrect warning about redundant check for y 
  3.enable both patches: 
    correctly only warn regarding x
  The connection is by the variable x having state 0111 on entry
  of the catch block, which requires both patches for correct handling.


I thoroughly challenged UnconditionalFlowInfo.addInitializationsFrom(FlowInfo)
but the only unexpected observation was: when combining two  0111-states
the "pot.unknown" information is dropped, but I wouldn't see how this could
break downstream analysis.

I'll finally give addPotentialNullInfoFrom a try soon.
Comment 37 Olivier Thomann CLA 2010-09-21 09:51:46 EDT
I also incremented the build state version to force the rebuild of all projects as the current code generation can be wrong.
Comment 38 Olivier Thomann CLA 2010-09-21 17:53:46 EDT
Verified using I20100921-1024
Comment 39 Srikanth Sankaran CLA 2011-05-03 01:56:36 EDT
Verified again using Build id: I20110428-0848 for RC0.