Bug 127570 - [compiler][null] lazy initialization coding pattern within loops
Summary: [compiler][null] lazy initialization coding pattern within loops
Status: VERIFIED FIXED
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Core (show other bugs)
Version: 3.2   Edit
Hardware: PC Windows 2000
: P3 normal (vote)
Target Milestone: 3.2 RC2   Edit
Assignee: Maxime Daniel CLA
QA Contact:
URL:
Whiteboard:
Keywords:
: 133131 (view as bug list)
Depends on:
Blocks: 133131
  Show dependency tree
 
Reported: 2006-02-13 17:01 EST by John Arthorne CLA
Modified: 2006-09-26 08:58 EDT (History)
2 users (show)

See Also:


Attachments
Tentative fix + tests (378.17 KB, patch)
2006-04-21 10:34 EDT, Maxime Daniel CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description John Arthorne CLA 2006-02-13 17:01:05 EST
Build: I20060213-0800

I tried turning on the new null reference analysis feature, but it stumbles on this common coding pattern to lazy-initialize a variable:

if (o == null)
  o = new Object();

To give a simple example, it gives a null reference warning in this snippet:

public class A {
  public boolean foo() {
    Boolean b = null;
    for (int i = 0; i < 1; i++) {
      if (b == null)
        b = Boolean.TRUE;
      if (b.booleanValue())//reports null reference, but it's impossible
        return b.booleanValue();
    }
    return false;
  }
}
Comment 1 Philipe Mulet CLA 2006-02-13 17:18:49 EST
Maxime - the nullStatus of Boolean.TRUE could be determined by noticing the constant is set, and null cannot be a constant value.
Comment 2 John Arthorne CLA 2006-02-13 17:38:00 EST
Maybe my example was too simple.  Here is a case that can't be determined by analyzing a final field:

public class A {
  public boolean foo() {
    Boolean b = null;
    for (int i = 0; i < 1; i++) {
      if (b == null)
        b = createBoolean();
      if (b.booleanValue())//null reference warning here
        return b.booleanValue();
    }
    return false;
  }
  private Boolean createBoolean() {
    return new Boolean(true);
  }
}
Comment 3 Maxime Daniel CLA 2006-02-14 02:55:20 EST
(In reply to comment #0)
> Build: I20060213-0800
> 
> I tried turning on the new null reference analysis feature, but it stumbles on
> this common coding pattern to lazy-initialize a variable:
> 
> if (o == null)
>   o = new Object();

This one is covered. See NullReferenceTest#test0311.
I would then contend that lazy initialization is covered, as long as we know for sure the null status of the right hand side of the assignment.

> 
> To give a simple example, it gives a null reference warning in this snippet:
> 
> public class A {
>   public boolean foo() {
>     Boolean b = null;
>     for (int i = 0; i < 1; i++) {
>       if (b == null)
>         b = Boolean.TRUE;
>       if (b.booleanValue())//reports null reference, but it's impossible
>         return b.booleanValue();
>     }
>     return false;
>   }
> }
> 

Here, our current implementation doesn't know about Boolean.TRUE null status. The test shades doubts about b being potentially null, which the assignment does not contradict. Hence the message.
The same for the example of Comment #2.

Agree with Philippe that the Boolean.TRUE case could be handled. I will do this in the scope of this bug.

The Comment #2 case is an example of inter-procedural analysis, which is not covered yet. Bug 126551 tracks specifically the latest point.

Having said this, were you suggesting that as long as we get the following pattern, the variable o should be considered as non null?

if (o == null) {
  o = anything;
}
Comment 4 John Arthorne CLA 2006-02-14 09:31:36 EST
I guess I'm confused about how you handle the inter-procedural cases.  It seems like if you don't know anything about the state of the variable, it shouldn't create a warning.  Otherwise the warnings will produce many false positives, making the feature less useful.

I'll give an example that seems to be contradict the outcome in comment #2.  In this case, no warning is produced:

  public boolean bar() {
    Boolean b = createBoolean();
    return b.booleanValue();
  }

Without inter-procedural analysis you don't know if b is null or not. In this case it behaves as I would expect: it doesn't flag a warning because you have no idea about the state of the variable.  Compare this snippet to the one in comment #2.  In both cases we know the variable is only null if createBoolean() returns null.  Without knowledge of createBoolean(), I think neither case should create a warning.
Comment 5 John Arthorne CLA 2006-02-14 09:52:44 EST
Playing around with my snippet some more, it seems to only produce a warning when the code is in a loop.  In this lazy initialization case, it doesn't produce a warning:

  public boolean baz() {
    Boolean b = null;
    if (createBoolean().booleanValue())
      b = new Boolean(true);
    if (b == null)
      b = createBoolean();
    return b.booleanValue();
  }

This seems very similar to the snippet in comment #2: b can only be null if createBoolean() returns null, but this case does not produce a warning.
Comment 6 Maxime Daniel CLA 2006-02-14 11:38:12 EST
(In reply to comment #4)
> I guess I'm confused about how you handle the inter-procedural cases.  It seems
> like if you don't know anything about the state of the variable, it shouldn't
> create a warning.  Otherwise the warnings will produce many false positives,
> making the feature less useful.
> I'll give an example that seems to be contradict the outcome in comment #2.  In
> this case, no warning is produced:
>   public boolean bar() {
>     Boolean b = createBoolean();
>     return b.booleanValue();
>   }
> Without inter-procedural analysis you don't know if b is null or not. In this
> case it behaves as I would expect: it doesn't flag a warning because you have
> no idea about the state of the variable.  Compare this snippet to the one in
> comment #2.  In both cases we know the variable is only null if createBoolean()
> returns null.  Without knowledge of createBoolean(), I think neither case
> should create a warning.

What makes the example of comment #2 complain is the test of b against null.

In:
if (o == null) {
  o = new Object();
}
the object allocation is known to yield a non null object, hence it clears any doubts about o.
In:
if (o == null) { // a
  o = bar();     // b
}
line a taints o, but because we have no information about bar, line b does not clear any doubt.

There might be some pattern we could try to chase here, but it would have to be fairly restrictive. What I mean is that we would need a very strong indication that the developer wants to do a lazy initialization. Moreover, the following pattern would then not be handled with very nicely:

if (o == null) {
  o = bar(); // bar *can* return null
}
if (o == null) {
  // do something
}

The current thinking was that we would add inter procedural support later, which would enable the developer to tell that bar is supposed to return a non null value.
Comment 7 Maxime Daniel CLA 2006-02-14 11:41:17 EST
(In reply to comment #5)
I believe this one is a bug. I'll investigate further.
Comment 8 John Arthorne CLA 2006-02-14 11:58:19 EST
I'm just suggesting the philosophy of "innocent until proven guilty".  In the case where you have assignment "b = createBoolean();" we no longer know anything about the state of b without inter-procedural analysis, so I think it should remove the "taint" from b.  I like to keep my code free of compiler warnings, so I would usually leave a warning turned off if it produced false warnings...

Because I might sound negative, I wanted to mention that this feature in general is very useful.  I have found several potential problems in my code already by turning it on. With the exception of a few cases, it's an excellent new feature!
Comment 9 Maxime Daniel CLA 2006-02-15 03:05:35 EST
(In reply to comment #8)
> I'm just suggesting the philosophy of "innocent until proven guilty".  In the
> case where you have assignment "b = createBoolean();" we no longer know
> anything about the state of b without inter-procedural analysis, so I think it
> should remove the "taint" from b.

I reached similar conclusions this morning around 4am... My previous answers were probably quite defensive, and not as accurate as I wish they had been.

Same player shoot again...
The example in comment #5 is what we expect, and this is how the feature behaves in the current code anyway. 
In your first post, this is the loop that changes the behavior, and I don't know why yet. I'll revisit this after M5.
Comment 10 Maxime Daniel CLA 2006-04-21 10:27:50 EDT
Attaching tentative patch, which is undergoing tests. (Perf tests scheduled for next Monday.)
Will review changes early next week.
Comment 11 Maxime Daniel CLA 2006-04-21 10:34:09 EDT
Created attachment 39168 [details]
Tentative fix + tests

This patch:
- changes the states definitions for the null analysis (adds three);
- recodes transitions accordingly;
- augments functional tests for the null reference analysis;
- reorganizes (radically) implementation tests for the null reference analysis.
Comment 12 Maxime Daniel CLA 2006-04-26 05:34:16 EDT
Fixed and released in HEAD.
(Tagged version v_658a before).
Verifier may track 127570 marker in NullReferenceTest.
Comment 13 Olivier Thomann CLA 2006-04-28 11:58:30 EDT
Verified with I20060427-1600 for 3.2RC2
Comment 14 Maxime Daniel CLA 2006-09-26 08:58:40 EDT
*** Bug 133131 has been marked as a duplicate of this bug. ***