Bug 371642 - Path To GC Roots gives incomplete information
Summary: Path To GC Roots gives incomplete information
Status: CLOSED MOVED
Alias: None
Product: MAT
Classification: Tools
Component: Core (show other bugs)
Version: unspecified   Edit
Hardware: All All
: P3 normal (vote)
Target Milestone: ---   Edit
Assignee: Project Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-02-15 11:20 EST by Jonathan Lawrence CLA
Modified: 2024-05-08 14:44 EDT (History)
0 users

See Also:


Attachments
Testcase causes OoM using WeakHashMap. Use to generate heap dump. (980 bytes, application/octet-stream)
2012-02-15 11:29 EST, Jonathan Lawrence CLA
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Jonathan Lawrence CLA 2012-02-15 11:20:30 EST
Build Identifier: Eclipse Memory Analyzer Version 1.2.0

In some cases, the Path To GC Roots query does not return all active paths to an object.
Sometimes the missing paths can be the ones which are actually causing the object to be retained in the heap (eg sole strong reference, or sole path).

Reproducible: Always

Steps to Reproduce:
1. Run the attached testcase (comprising WeakHashLeak, Key and Value java classes)with a small heap size, and configured to take a heap dump on OoM.  java -Xmx2m WeakHashLeak .  This could be with an Oracle or IBM JVM - both seem to give incorrect results although the behaviour is different.
2. Open the heap dump in Memory Analyzer 1.2.0 or the IBM Memory Analyzer from IBM Support Assistant
3. From the Histogram view list the Key objects, choose a Key object and run Path To GC Roots with all references.
4. All Key objects have a path to GC root through a strong reference from a Value object.  This path is not shown.
5. For an Oracle VM hprof dump, excluding weak references then does show this path, for an IBM .phd heap dump, excluding weak references does not show this path so for some Key objects the query shows 0 paths to GC roots, even though there is a strong reference path.
6. Using "List objects with incoming references" allows the true paths to be explored and for the same Key object, the reference from Value can be seen for both VMs.
Comment 1 Jonathan Lawrence CLA 2012-02-15 11:29:22 EST
Created attachment 211050 [details]
Testcase causes OoM using WeakHashMap.  Use to generate heap dump.

Running the attached testcase (java -Xmx2m WeakHashLeak) with appropriate JVM options can produce a heap dump.
The heap dump thus produced, when loaded into MAT, gives misleading results for Paths To GC Roots.
Comment 2 Andrew Johnson CLA 2012-05-09 10:52:48 EDT
I don't know the paths code that well, but it might be deliberate. The algorithm seems to be to maintain a list of paths from the object by looking at inbound references, and if an inbound reference is found which has not been visited then add it to the list of paths, which is processed in FIFO order until a root is found. This means longer paths will be excluded in favour of shorter paths.
root->weakref->value->key
and 
weakref-(weak)>key

so reversing paths
step 1
key->value
key->weakref
step 2
key->value->weakref, but already visited weakref so ignore??
key->weakref->root - path, so show this path

We need a better explanation of what the paths queries do.

The PHD problem is probably because PHD files do not have field names, so MAT doesn't know which references are weak refs.
Currently MAT presumes that if there is not a strong named field reference to an object then it is a weak ref, so both the key and the value references are considered weak when excluding fields named for example "referent". Is there any way we do better?
Comment 3 Jonathan Lawrence CLA 2012-07-27 08:59:25 EDT
Here is a patch which changes the behaviour when field names are unavailable (eg for .phd files).  It changes the presumption that a valid named strong reference must be present for there to be a strong path, to assume that if no named reference can be found at all, there might be a strong one.

I think this is an improvement because it is less misleading to include paths to a GC root which might be excludible, than to exclude valid strong paths - which is what happens at the moment.

However it's still not ideal because really we would want the "exclude weak references" and similar option(s) to be unavailable (greyed out?) if they cannot be correctly computed for a given dump - and also with a message or explanation as to why they are unavailable.  Can anyone suggest how to do this?

Index: SnapshotImpl.java
===================================================================
--- SnapshotImpl.java	(revision 1358)
+++ SnapshotImpl.java	(working copy)
@@ -1741,11 +1741,15 @@
             long referentAddr = mapIdToAddress(referentId);
 
             List<NamedReference> refs = referrerObject.getOutboundReferences();
+            boolean ref_found = false; // set iff we find a valid reference to the referent in refs. 
             for (NamedReference reference : refs)
             {
-                if (referentAddr == reference.getObjectAddress() && !excludeFields.contains(reference.getName())) { return false; }
+                if (referentAddr == reference.getObjectAddress() ) {
+                	ref_found = true; // We have found a valid reference to the referent
+                	if (!excludeFields.contains(reference.getName())) { return false; } // non-excluded reference
+                }
             }
-            return true;
+            return ref_found; // return true only if we found at least one reference and *all* are excluded. 
         }
 
         public int[] getNextShortestPath() throws SnapshotException
Comment 4 Andrew Johnson CLA 2012-08-30 04:46:52 EDT
Please add the patch as an attachment - it makes it easier for the IP log, and Bugzilla will show the effect of the patch.

There is similar code in ObjectMarker too.
Comment 5 Andrew Johnson CLA 2012-08-30 10:16:33 EDT
We should examine all uses of IObject:
public List<NamedReference> getOutboundReferences();

For PHD files not all outbound refs are listed (as there is not field data, just the refs).
Also for DTFJ dumps the references from class loaders to classes are not named.

Is this API meant to list all references, including unnamed ones? That would be hard for PHD files which would require the creation of dummy fields.

Instead we should check if getOutboundReferences is just used to get descriptions of outbounds from the index.
Comment 6 Andrew Johnson CLA 2012-09-05 09:57:31 EDT
Making those changes might cause a problem with the Weak References Statistics etc. queries.
I ran a component report on a PHD file and got a section in the report saying:

A total of 848 java.lang.ref.WeakReference objects have been found, which weakly reference 226 objects.
No objects totalling 0 B are retained (kept alive) only via weak references.
Possible Memory Leak One object totalling 112 B are weakly referenced and also strongly retained (kept alive) via weak references.

We may need to check the logic for ReferenceQuery to see how it finds weak refs and strong refs. See defect 266231.
Comment 7 Eclipse Webmaster CLA 2024-05-08 14:44:35 EDT
This issue has been migrated to https://github.com/eclipse-mat/org.eclipse.mat/issues/18.