324178 – [null] ConditionalExpression.nullStatus(..) doesn't take into account the analysis of condition itself

Bug 324178 - [null] ConditionalExpression.nullStatus(..) doesn't take into account the analysis of condition itself

Summary: [null] ConditionalExpression.nullStatus(..) doesn't take into account the ana...

Status:	VERIFIED FIXED

Alias:	None

Product:	JDT
Classification:	Eclipse Project
Component:	Core (show other bugs)
Version:	3.7
Hardware:	PC Windows XP

Importance:	P3 normal (vote)
Target Milestone:	3.7 M6
Assignee:	Stephan Herrmann
QA Contact:

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2010-09-01 07:22 EDT by Ayushman Jain
Modified:	2011-03-08 12:27 EST (History)
CC List:	3 users (show)

See Also:

Flags:	amj87.iitr: review+

Attachments
proposed fix (4.42 KB, patch) 2011-02-07 19:03 EST, Stephan Herrmann	no flags	Details \| Diff
better fix and 2 more tests (6.04 KB, patch) 2011-02-27 19:37 EST, Stephan Herrmann	no flags	Details \| Diff
even better fix (11.07 KB, patch) 2011-03-03 18:12 EST, Stephan Herrmann	no flags	Details \| Diff
simplified patch (14.98 KB, patch) 2011-03-04 14:10 EST, Stephan Herrmann	no flags	Details \| Diff
final patch (14.74 KB, patch) 2011-03-04 16:28 EST, Stephan Herrmann	no flags	Details \| Diff
Show Obsolete (4) View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Ayushman Jain

2010-09-01 07:22:53 EDT

HEAD

In cases such as these:

public class X {
	boolean b;
	void foo(Object u) {
	if (u == null) {}
		Object o = (u == null) ? new Object() : u;
		o.toString();   // Incorrect potential NPE
	}
	}
}

we warn a potential NPE for o because one branch of the condition expression can initialize it to u, which may be null. But its clear that o can never be initialized to u when it is null because of the condition expression. However, currently we dont take the analysed info from the condition in ConditionExpression.nullStatus(..).

Expected behaviour: no warning

Comment 1 Stephan Herrmann

2011-02-07 19:03:58 EST

Created attachment 188486 [details]
proposed fix

From reading the recent ideas in bug 336544 I received some inspiration for
tackling this issue, too.

It seems the complicated block in ConditionalExpression.analyzeCode() is 
actually wrong: the merged flow should allow to distinguish the original 
true and false branches. If that merged flow is then fed back into 
ConditionalExpression.nullStatus() we're able to use the correct branch infos.

NullReferenceTest and FlowAnalysisTest pass.
I will give it some proper testing after bug 336544 is fixed, as my patch
depends on the fix proposed in bug 336544 comment 8 (contained in this patch).

If all goes well all changes in LocalDeclaration should also be applied to
Assignment.

Comment 2 Ayushman Jain

2011-02-08 02:28:57 EST

> It seems the complicated block in ConditionalExpression.analyzeCode() is 
> actually wrong: the merged flow should allow to distinguish the original 
> true and false branches. 

I didn't quite understand the logic behind the fix. Why is the current merge wrong?

Comment 3 Stephan Herrmann

2011-02-08 02:38:31 EST

(In reply to comment #2)
> > It seems the complicated block in ConditionalExpression.analyzeCode() is 
> > actually wrong: the merged flow should allow to distinguish the original 
> > true and false branches. 
> 
> I didn't quite understand the logic behind the fix. Why is the current merge
> wrong?

I read the current logic as handling source like this:

  c1 ? (c2 ? t1 : f1) : (c3 ? t2 : f2)

then it merges the flows from (t1 and t2) and also the flows from (f1 and f2).
I have no idea what those combinations would signify. Why should the outer
conditional look into the nested branches??

Instead I propose to leave the flows of the toplevel true and false branches
as is and only with those combined to a conditional flow can we compute the
right null status.

Comment 4 Ayushman Jain

2011-02-08 05:11:01 EST

(In reply to comment #3)
> I read the current logic as handling source like this:
> 
>   c1 ? (c2 ? t1 : f1) : (c3 ? t2 : f2)

Hmm, maybe. I wish the block was atleast a bit documented. :(

Comment 5 Stephan Herrmann

2011-02-25 09:03:45 EST

Ayush: I'm ready to validate my theory by writing more tests etc. 
Do you want to assign this over to me?

Comment 6 Ayushman Jain

2011-02-26 05:16:31 EST

(In reply to comment #5)
> Ayush: I'm ready to validate my theory by writing more tests etc. 
Thats ok. I also agree with your theory. I was just wondering if that weird merge was done to take care of a corner case in the JLS. I found section 15.26 about the conditional operator, but that still doesnt justify that sort of a merge. So i guess its justified to do away with it.

> Do you want to assign this over to me?
Sure.

Comment 7 Stephan Herrmann

2011-02-27 19:37:42 EST

Created attachment 189907 [details]
better fix and 2 more tests

Tales of a hike through logicland and the glorious JLS:

On the one hand the code block in question is broken wrt to how null info
is processed. OTOH I finally found a test case that breaks if I disable
the block we didn't understand (something the example given in the code 
failed to provide):

boolean foo(boolean b) {
    boolean v;
    if (b ? false : (true && (v = false))) 
        return v;
    return false;
}

Using, e.g., the patch from comment 1 I get:
   "The local variable v may not have been initialized"

Wew, this warning is bogus because on all paths where the whole condition 
yields "true" the variable v will actually be assigned.
Next I found that JLS 16.1.5 actually has a shape similar to this code.

It turns out we have two conflicting(?) requirements:
 - for conditional expressions of boolean type we need to sort flowinfos
   according to the outcome (whole condition yield true / false?)
 - for null analysis we need to sort flowinfos according to the condition.

Perhaps we should actually have both strategies in the code and switch
depending on the type. Luckily expressions of type boolean require no
null-analysis.

The updated patch follows this path, and if no further obstacles are found
on the road, I will apply some finish (more code comments, update
Assignment as well, maybe more tests) and finally release.

Comment 8 Ayushman Jain

2011-03-02 07:42:50 EST

(In reply to comment #7)
> Created attachment 189907 [details] [diff]
> better fix and 2 more tests
> Luckily expressions of type boolean require no
> null-analysis.

Why not? Maybe the conditions are being created out of comparisons with null?
I did try to construct such cases as this (after extending the fix to LocalDeclaration)
if( ((o=u) == null) ? u == null : u2 != null) 

but somehow the analysis seems ok. Will test some more. But i'm not really convinced that expressions of type boolean dont require null analysis.

Comment 9 Stephan Herrmann

2011-03-03 08:41:42 EST

(In reply to comment #8)
> > Luckily expressions of type boolean require no
> > null-analysis.
> 
> Why not? Maybe the conditions are being created out of comparisons with null?
> I did try to construct such cases as this (after extending the fix to
> LocalDeclaration)
> if( ((o=u) == null) ? u == null : u2 != null) 
> 
> but somehow the analysis seems ok. Will test some more. But i'm not really
> convinced that expressions of type boolean dont require null analysis.

You're right to question my statement. As phrased it is not correct.
We should distinguish:
- analysis of side effects wrt null => always necessary (and always happens)
- analyse whether the conditional expression as a whole can evaluate to null
  => only applicable for reference types

Only the latter is subject of this bug report.

In order to create a conflict between both analysis-modes we'd need a case
involving auto(un)boxing, s.t. like
    boolean foo(boolean b1) {
	Boolean b2;
	if (b1 ? (b2 = Boolean.TRUE) : null)
	    return b2;
	return false;
    }
And indeed(!): using such test we can trigger bogus codegen:
  java.lang.VerifyError: (class: Bug324178, method: foo signature: (Z)Z) Accessing value from uninitialized register 2

Whereas, without the patch we correctly report:
  The local variable b2 may not have been initialized

I'll see if the patch can be improved to correctly handle this case, too.

Comment 10 Stephan Herrmann

2011-03-03 18:12:34 EST

Created attachment 190328 [details]
even better fix

Based on the previous analysis here's a new strategy:
- do not interfere with the existing construction of flow infos
  because this is needed for definite assignment analysis
- only for the purpose of computing ConditionalExpression.nullStatus(..)
  remember two intermediate flow infos

This solution no longer depends on claims about independence of 
def-ass-analysis and null-analysis.

Additionally it can do without any changes in LocalDeclaration
(and Assignment - which I promised but never implemented).

Could remembering two FlowInfos cause memory issues? If anybody is
concerned about these new references we might actually compute the
nullStatus more eagerly and only remember the status (an int).

Comment 11 Stephan Herrmann

2011-03-03 18:23:10 EST

Ayush, this is my last proposal for M6, if you have the time for a
look into the new (small :) ) patch.

It should be safer than the previous versions, because it only ever
affects the method nullStatus(..), the actual flowInfos are constructed 
just as before.

Comment 12 Ayushman Jain

2011-03-04 05:08:30 EST

(In reply to comment #11)
> Ayush, this is my last proposal for M6, if you have the time for a
> look into the new (small :) ) patch.
> 
> It should be safer than the previous versions, because it only ever
> affects the method nullStatus(..), the actual flowInfos are constructed 
> just as before.

Yes this approach is definitely better and feels safer too. Thanks a lot for documenting the whole thing in the code! :)
One small thing: IMO, the check inside org.eclipse.jdt.internal.compiler.ast.ConditionalExpression.nullStatus(FlowInfo) on line 344 i.e. 
this.trueBranchInfo != null ? this.trueBranchInfo : flowInfo.initsWhenTrue()

is not really required. This is because trueBranchInfo can only be null if the optimized boolean constant is either true or false (cases like (true)? a:b). And in such a case nullStatus will anyway return from either line 339 or 341. 
Also, why do we need to check initsWhenTrue() or initsWhenFalse() on flowInfo? This would be only relevant if the flowInfo is a ConditionalFlowInfo, but I don't think we'll ever end up with that inside nullStatus(). Querying just flowInfo for the nullStatus should work.
Also, tests b, c, and d should be added in org.eclipse.jdt.core.tests.compiler.regression.InitializationTests

Let me know what you think.

Comment 13 Stephan Herrmann

2011-03-04 11:23:56 EST

(In reply to comment #12)
> One small thing: IMO, the check inside
> org.eclipse.jdt.internal.compiler.ast.ConditionalExpression.nullStatus(FlowInfo)
> on line 344 i.e. 
> this.trueBranchInfo != null ? this.trueBranchInfo : flowInfo.initsWhenTrue()
> 
> is not really required. This is because trueBranchInfo can only be null if the
> optimized boolean constant is either true or false (cases like (true)? a:b).
> And in such a case nullStatus will anyway return from either line 339 or 341. 
> Also, why do we need to check initsWhenTrue() or initsWhenFalse() on flowInfo?
> This would be only relevant if the flowInfo is a ConditionalFlowInfo, but I
> don't think we'll ever end up with that inside nullStatus(). Querying just
> flowInfo for the nullStatus should work.

I was suspecting being overly careful, but I figured it would be safest if
we just fall back to the exact previous behavior in cases where the new
fields are null. I agree I could do some cleanup, but for which items are
we actually certain? I mean, really certain :)

I will try to find counter examples for your suggestions later today.
Once I'm convinced that I won't find any, I will do away with the unused
stuff.

> Also, tests b, c, and d should be added in
> org.eclipse.jdt.core.tests.compiler.regression.InitializationTests

Will do, no problem.

Comment 14 Stephan Herrmann

2011-03-04 14:10:12 EST

Created attachment 190428 [details]
simplified patch

Thanks for raising the question. For one reason I felt uneasy about the
proposed simplification: they would create a strong dependency between
different branches in two separate methods (like: in nullStatus(..) we
*assume* that we have taken a specific branch inside analyseCode(..)).
That doesn't feel very comfortable..

Then I combined your observations with my proposal in comment 10:
pre-compute and store the actual nullStatus instead of remembering the
two flowInfos. Now each branch in analyseCode(..) computes the nullStatus
in its specific way. The method nullStatus(..) degenerates to a getter,
that only relies on one assumption: analyseCode has been called before.
And even if that would ever break we have a usable default: UNKNOWN.
I like that.

I'm running the tests again.

Comment 15 Stephan Herrmann

2011-03-04 14:16:34 EST

Forgot to adjust one comment inside analyseCode (line 105).
Will fix this after the review.

Comment 16 Ayushman Jain

2011-03-04 15:44:43 EST

This patch looks good.

Comment 17 Stephan Herrmann

2011-03-04 16:24:38 EST

Released for 3.7M6.

Comment 18 Stephan Herrmann

2011-03-04 16:28:16 EST

Created attachment 190447 [details]
final patch

For the records, this is the version that I committed
(updated to HEAD and fixed cosmetics).

Comment 19 Olivier Thomann

2011-03-08 12:27:57 EST

Verified for 3.7M6.