Bug 473113 - org.eclipse.mat.parser.index.IndexWriter$Identifier.add(IndexWriter.java 91) run out of memory
Status: CLOSED MOVED
Alias: None
Product: MAT
Classification: Tools
Component: Core
Version: 1.5
Hardware: PC Windows 7
Importance: P3 normal
Target Milestone: ---
Assignee: Project Inbox CLA
Depends on: 563960 573598
Reported: 2015-07-20 16:40 EDT by G Xu CLA
Modified: 2024-05-08 14:51 EDT

Attachments
Result of Find object by address (76.32 KB, image/png)
2021-05-24 02:28 EDT, Andrew Johnson CLA

Description G Xu CLA 2015-07-20 16:40:44 EDT
When processing a customer Java system dump, we run into the following error.

java.lang.OutOfMemoryError: Requested length of new long[2,147,483,640] exceeds limit of 2,147,483,639
   at org.eclipse.mat.parser.index.IndexWriter$Identifier.add(IndexWriter.java:91)
   at org.eclipse.mat.dtfj.DTFJIndexBuilder.rememberObject(DTFJIndexBuilder.java:3179)
   at org.eclipse.mat.dtfj.DTFJIndexBuilder.fill(DTFJIndexBuilder.java:1162)
   at org.eclipse.mat.parser.internal.SnapshotFactoryImpl.parse(SnapshotFactoryImpl.java:237)
   at org.eclipse.mat.parser.internal.SnapshotFactoryImpl.openSnapshot(SnapshotFactoryImpl.java:141)
   at org.eclipse.mat.snapshot.SnapshotFactory.openSnapshot(SnapshotFactory.java:145)
   at org.eclipse.mat.internal.apps.ParseSnapshotApp.parse(ParseSnapshotApp.java:134)
   at org.eclipse.mat.internal.apps.ParseSnapshotApp.start(ParseSnapshotApp.java:106)
   at org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:196)
   at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:110)
   at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:79)
   at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:369)
   at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:179)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:94)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
   at java.lang.reflect.Method.invoke(Method.java:619)
   at org.eclipse.equinox.launcher.Main.invokeFramework(Main.java:620)
   at org.eclipse.equinox.launcher.Main.basicRun(Main.java:575)
   at org.eclipse.equinox.launcher.Main.run(Main.java:1408)
   at org.apache.tools.ant.taskdefs.Exit.execute(Exit.java:164)
   at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:292)
   at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
   at java.lang.reflect.Method.invoke(Method.java:619)
   at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
   at org.apache.tools.ant.Task.perform(Task.java:348)
   at org.apache.tools.ant.taskdefs.Sequential.execute(Sequential.java:68)
   at net.sf.antcontrib.logic.IfTask.execute(IfTask.java:197)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:94)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
   at java.lang.reflect.Method.invoke(Method.java:619)
   at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
   at org.apache.tools.ant.TaskAdapter.execute(TaskAdapter.java:154)
   at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:292)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:94)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
   at java.lang.reflect.Method.invoke(Method.java:619)
   at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
   at org.apache.tools.ant.Task.perform(Task.java:348)
   at org.apache.tools.ant.Target.execute(Target.java:435)
   at org.apache.tools.ant.Target.performTasks(Target.java:456)
   at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1393)
   at org.apache.tools.ant.Project.executeTarget(Project.java:1364)
   at org.apache.tools.ant.helper.DefaultExecutor.executeTargets(DefaultExecutor.java:41)
   at org.apache.tools.ant.Project.executeTargets(Project.java:1248)
   at org.apache.tools.ant.Main.runBuild(Main.java:851)
   at org.apache.tools.ant.Main.startAnt(Main.java:235)
   at org.apache.tools.ant.launch.Launcher.run(Launcher.java:280)
   at org.apache.tools.ant.launch.Launcher.main(Launcher.java:109)
*Behaviour differences: na
*Impact on product plans: Can't proceed and we don't know we run out of memory.
Comment 1 Andrew Johnson CLA 2015-11-14 12:24:56 EST
Are there any messages in the error log?
Is it possible that the dump is huge and has more than 2,147,483,639 objects?
The help says:
"Memory Analyzer has an architectural limit of 2^31 - 3 objects, a current limit of 2^31 - 8 = 2,147,483,640 objects, but has not been tested with that many objects. The current record is a heap dump file of 48Gbytes containing 948,000,000 objects, which was opened with Memory Analyzer running with a 58Gbyte heap."
See also https://dev.eclipse.org/mhonarc/lists//mat-dev/msg00324.html
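
The limit quoted in the error message can be illustrated with a minimal sketch of a growable long[] whose capacity is capped. This is not MAT's actual IndexWriter code: the class name, the growth factor of roughly 1.5 and the cap constant are assumptions chosen only to reproduce the message reported above.

```java
import java.util.Locale;

// Hypothetical illustration, not the real org.eclipse.mat.parser.index.IndexWriter.
final class CappedGrowth {
    // Assumed cap, matching the "limit of 2,147,483,639" in the error message.
    static final int MAX_LENGTH = Integer.MAX_VALUE - 8;

    // Returns the next capacity for a backing long[] that currently has `current` slots.
    static int grownCapacity(int current) {
        long minimum = (long) current + 1; // at least one more slot is needed
        if (minimum > MAX_LENGTH) {
            throw new OutOfMemoryError(String.format(Locale.US,
                    "Requested length of new long[%,d] exceeds limit of %,d",
                    minimum, MAX_LENGTH));
        }
        long preferred = (long) current + (current >> 1) + 1; // grow by roughly 1.5x
        return (int) Math.min(preferred, MAX_LENGTH);
    }
}
```

Under these assumptions the array grows normally until it reaches the cap, and the next add() after that point fails regardless of how much heap is available to the JVM.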
Comment 2 G Xu CLA 2015-11-14 15:40:49 EST
No additional information. We know that we have run out of index range a few times. The workaround was to take a dump early to avoid the issue. However, we still have an issue to address here.
Comment 3 Andrew Johnson CLA 2020-06-06 10:03:07 EDT
I've written a workaround for this problem in bug 563960, but that is just for HPROF.
If it looks useful for DTFJ core dumps then we could do the same for the DTFJ parser.
What do you think?
Comment 4 G Xu CLA 2020-06-08 09:16:45 EDT
It is worth proceeding. MAT is still a tool we use on a daily basis.
Comment 5 Andrew Johnson CLA 2020-06-10 15:43:14 EDT
I've made the changes for DTFJ including adding some help in bug 563960.
When you next have a large dump please could you test a snapshot build from https://www.eclipse.org/mat/snapshotBuilds.php
and see if discarding objects helps you analyze your dump.
Thanks.
Comment 6 G Xu CLA 2020-06-11 00:18:50 EDT
Thanks, will do!
Comment 7 Tomas Hamal CLA 2020-11-02 05:09:44 EST
Tried to read a 149GB HPROF file without success using the snapshot version:

java.lang.OutOfMemoryError: Requested length of new long[2,147,483,640] exceeds limit of 2,147,483,639
	at org.eclipse.mat.parser.index.IndexWriter$Identifier.add(IndexWriter.java:103)
	at org.eclipse.mat.hprof.HprofParserHandlerImpl.reportInstance(HprofParserHandlerImpl.java:1001)
	at org.eclipse.mat.hprof.HprofParserHandlerImpl.reportInstanceWithClass(HprofParserHandlerImpl.java:1013)
	at org.eclipse.mat.hprof.Pass1Parser.readInstanceDump(Pass1Parser.java:762)
	at org.eclipse.mat.hprof.Pass1Parser.readDumpSegments(Pass1Parser.java:440)
	at org.eclipse.mat.hprof.Pass1Parser.read(Pass1Parser.java:211)
	at org.eclipse.mat.hprof.HprofIndexBuilder.fill(HprofIndexBuilder.java:82)
	at org.eclipse.mat.parser.internal.SnapshotFactoryImpl.parse(SnapshotFactoryImpl.java:273)
	at org.eclipse.mat.parser.internal.SnapshotFactoryImpl.openSnapshot(SnapshotFactoryImpl.java:167)
	at org.eclipse.mat.snapshot.SnapshotFactory.openSnapshot(SnapshotFactory.java:147)
	at org.eclipse.mat.ui.snapshot.ParseHeapDumpJob.run(ParseHeapDumpJob.java:95)
	at org.eclipse.core.internal.jobs.Worker.run(Worker.java:63)
Comment 8 Andrew Johnson CLA 2020-11-02 05:38:03 EST
Thank you for trying the snapshot version.

There is still a limit of approximately 2^31 objects in the snapshot, but there is a new facility to discard some objects
in the initial parse to reduce the number of objects below the limit. See 'Memory Analyzer Configuration' in the help.

Please could you test that to see if the discard option is helpful and whether the help makes sense.

[The help should be updated to have the full text of that error message (it is missing java.lang. before OutOfMemoryError) to aid searches. Also, perhaps the exception should suggest trying Window > Preferences > Memory Analyzer > Enable discard ]
Comment 9 G Xu CLA 2020-11-02 09:19:48 EST
Perhaps discard could be on by default, and when this kind of thing happens, a warning message could be generated with suggestions, if available?
Comment 10 Andrew Johnson CLA 2020-11-07 03:21:22 EST
The current discard settings can't always be on, because they discard a percentage of objects of specified types, which would change the results for every dump. It might be possible to discard all objects after a certain number, but that is more likely to change the object graph than discarding objects with no outbound references (other than the class), such as char[], and objects only referencing those types of object, like String. Someone could write a query generating a list of candidate classes for discard, based on those characteristics, ready for a re-parse, but does this problem occur often enough to make that worthwhile?
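
The idea of discarding a percentage of objects of specified types can be sketched as follows. This is only an illustration under assumptions: the class, its constructor parameters and the use of a seeded java.util.Random are hypothetical and are not MAT's implementation or preference names.

```java
import java.util.Random;
import java.util.regex.Pattern;

// Hypothetical sketch of percentage-based discard; names are illustrative, not MAT's API.
final class DiscardFilter {
    private final Pattern classPattern;
    private final int percent;
    private final Random random;

    DiscardFilter(String classRegex, int percent, long seed) {
        if (percent < 0 || percent > 100) {
            throw new IllegalArgumentException("percent must be 0..100");
        }
        this.classPattern = Pattern.compile(classRegex);
        this.percent = percent;
        // A fixed seed repeats the same decisions if objects are visited in the same order.
        this.random = new Random(seed);
    }

    // Decides, for one object seen during the initial parse, whether to leave it out of the index.
    boolean shouldDiscard(String className) {
        return classPattern.matcher(className).matches() && random.nextInt(100) < percent;
    }
}
```

For example, new DiscardFilter("char\\[\\]|java\\.lang\\.String", 50, 1L) would drop about half of the char[] and String instances and keep everything else, which also shows why such a filter changes the results for every dump it is applied to.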
Comment 11 Eclipse Genie CLA 2020-11-07 03:25:21 EST
New Gerrit change created: https://git.eclipse.org/r/c/mat/org.eclipse.mat/+/171933
Comment 13 Tomas Hamal CLA 2020-11-12 09:57:26 EST
I have to say, I found nothing in the documentation (Memory Analyzer Configuration) about the discard option.
But after playing with it for a while I was able to run it successfully and also found a memory leak in our app.

So thank you very much for pointing it out.
Comment 14 Andrew Johnson CLA 2020-11-12 10:59:44 EST
I'm glad it helped.
There should be documentation from within MAT from Help > Help contents > Tasks > Memory Analyzer Configuration
or by pressing F1 from Window > Preferences > Memory Analyzer.

The information is not yet in the online documentation at help.eclipse.org
Comment 15 Jason Koch CLA 2021-04-30 13:49:49 EDT
I am coming across more heaps that fit this case. Our interim solution is likely to pre-parse the heap and count how many objects, and then implement discard if it is over the limit.

Is there a longer term plan to get MAT onto 64-bit identifiers? I think this is feasible, and would actually simplify some of the codebase, at the obvious cost of increasing heap footprint. OTOH, I think this only affects parsing stage, since most of the "use" of indexes at read time doesn't retain significantly more footprint and due to SoftRefs is very manageable heap usage.

I haven't looked at how much work this would be. I suspect the main technical challenges would be (1) backwards compatibility for already-indexed heaps, where I think discard and reindex is likely an appropriate solution, and (2) increased memory footprint from using long[] during parsing. I think the view/read phase is unlikely to be impacted significantly. Thoughts?
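
The pre-count idea and the footprint concern come down to simple arithmetic, sketched here. The sketch assumes the limit from the error message and, unrealistically, a discard applied uniformly to all objects (the actual discard option applies to selected types); the class and method names are hypothetical.

```java
// Hypothetical helper for sizing a discard pass after a pre-parse object count.
final class DiscardEstimate {
    // Assumed object limit, taken from the error message in this report.
    static final long OBJECT_LIMIT = 2_147_483_639L;

    // Smallest whole percentage to discard so the remaining objects fit under the limit.
    static int requiredDiscardPercent(long objectCount) {
        if (objectCount <= OBJECT_LIMIT) {
            return 0;
        }
        long excess = objectCount - OBJECT_LIMIT;
        return (int) ((excess * 100 + objectCount - 1) / objectCount); // ceiling division
    }

    // Extra bytes per identifier array if each entry widened from int to long.
    static long extraBytesForLongIds(long objectCount) {
        return objectCount * (Long.BYTES - Integer.BYTES);
    }
}
```

With 2,500,000,000 objects this gives a 15% discard, and each per-object identifier array would need about 10GB more if its entries were widened from int to long.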
Comment 16 Jason Koch CLA 2021-05-05 17:30:46 EDT
For what it's worth, I see some issues with the discard solution, and I expect we will need to handle this in a couple of places. For example, I cannot open the System Properties box as some strings have been removed. This is not unexpected, but might require some explanation for the user.
Comment 17 Andrew Johnson CLA 2021-05-06 05:00:01 EDT
The discard objects feature was a simple workaround without a major rewrite and I'm pleased it worked at all. Some minor enhancement ideas:
1. A query which suggests suitable objects to discard (from a randomly discarded heap, processed with keep unreachable objects). Objects could be simple (no object refs other than its class), basic (just refs to simple objects) and possibly one more level. These can be discarded without breaking the whole of the object tree.
2. Accumulate memory from discarded objects in the used heap of a parent (either ignore multiple parents, choose one, or average out?). This will be harder, but MAT does cope with variable-sized simple objects as well as arrays.
3. Possibly have some kind of parser enhancer which can operate on fields pointing to discarded objects and retrieve information directly from the dump. Not sure how this would work. Possibly via https://help.eclipse.org/2021-03/topic/org.eclipse.mat.ui.help/doc/org/eclipse/mat/snapshot/ISnapshot.html#getSnapshotAddons-java.lang.Class-

Moving to 64-bit is going to be hard. One idea to assess the scope and possibly ease the migration would be to have an annotation @ObjectID which we could use to mark up the source. A global replace of '@ObjectID int' with '@ObjectID long' won't work, but might show where the next problems are. A Java compiler annotation processor for @ObjectID might help the migration.
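
A minimal sketch of what such a marker annotation might look like follows. @ObjectID does not exist in MAT; the SampleIndex interface and the scanner are hypothetical, and runtime reflection is used here only as a stand-in for the compile-time annotation processor mentioned above.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Arrays;

// Hypothetical marker for values that are object IDs; it does not exist in MAT.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE_USE)
@interface ObjectID {}

// Stand-in for an API that mixes object IDs (int) and addresses (long).
interface SampleIndex {
    @ObjectID int mapAddressToId(long address);

    long mapIdToAddress(@ObjectID int objectId);
}

final class ObjectIdScanner {
    // Counts the declared methods whose return type is marked with @ObjectID.
    static long countMethodsReturningObjectId(Class<?> type) {
        return Arrays.stream(type.getDeclaredMethods())
                .filter(m -> m.getAnnotatedReturnType().isAnnotationPresent(ObjectID.class))
                .count();
    }
}
```

A scan like this only locates the marked places; each one would still need a manual decision about whether the int can safely become a long.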
Comment 18 Jason Koch CLA 2021-05-07 11:32:09 EDT
Yes, there is the direct int->long swap, and on top of this, there is also likely to be the absence of primitive arrays of appropriate length, which will drive the use of some custom array layer.
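
One possible shape for such a custom array layer is a chunked array addressed by a long index, sketched here. The class is hypothetical and not part of MAT; the chunk size is a parameter so that the behaviour can be tried with small values.

```java
// Hypothetical chunked array addressed by a long index; not part of MAT.
final class ChunkedLongArray {
    private final int chunkBits;
    private final long length;
    private final long[][] chunks;

    ChunkedLongArray(long length, int chunkBits) {
        if (length < 0 || chunkBits < 1 || chunkBits > 30) {
            throw new IllegalArgumentException("bad length or chunkBits");
        }
        this.chunkBits = chunkBits;
        this.length = length;
        long chunkSize = 1L << chunkBits;
        int chunkCount = Math.toIntExact((length + chunkSize - 1) >>> chunkBits);
        chunks = new long[chunkCount][];
        for (int i = 0; i < chunkCount; i++) {
            long remaining = length - ((long) i << chunkBits);
            chunks[i] = new long[(int) Math.min(chunkSize, remaining)]; // last chunk may be short
        }
    }

    long size() {
        return length;
    }

    long get(long index) {
        checkIndex(index);
        return chunks[(int) (index >>> chunkBits)][(int) (index & ((1L << chunkBits) - 1))];
    }

    void set(long index, long value) {
        checkIndex(index);
        chunks[(int) (index >>> chunkBits)][(int) (index & ((1L << chunkBits) - 1))] = value;
    }

    private void checkIndex(long index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index + ", length " + length);
        }
    }
}
```

The high bits of the index select a chunk and the low bits select the slot within it, so no single Java array has to exceed the per-array length limit.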
Comment 19 Andrew Johnson CLA 2021-05-17 09:10:01 EDT
For comment 16, yes there are problems with discarding objects. I'm considering some improvements which could be made without really changing the API.

System properties - yes the query could return null for those invalid key or value fields, either always, or when some objects have been discarded in the snapshot.

I think there is scope to have the inspector view display unindexed objects.

Some of the MAT API uses int object IDs, some does not.
For example an IObject could exist without an object ID, though you would need to be careful.
Then IObject.getReferences() returns NamedReference and IInstance has getFields returning ObjectReference.
ObjectReference is constructed using a snapshot and address and has
getObjectAddress()
getObjectId()
getObject()

In the Inspector view unindexed fields appear as:

Type|Name|Value
--------------------
ref |[0] |0x8011b0c0
--------------------

Context query operations do not work on this as it does not have an object ID. We could fudge it to return an IContextObjectSet
IContextObjectSet
int getObjectId();
int[] getObjectIds();
String getObjectOQL();
The object IDs still can't be used, but OQL could return:
SELECT * FROM OBJECTS 0x8011b0c0
which still wouldn't work, but would return something containing the object address, and then the 'copy address' query could extract that.
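
As a small illustration, building that OQL text from an address could look like the following. The helper class is hypothetical; only the query text follows the example above.

```java
// Hypothetical helper that builds the OQL text for an object known only by address.
final class AddressOql {
    static String oqlForAddress(long address) {
        return "SELECT * FROM OBJECTS 0x" + Long.toHexString(address);
    }
}
```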

I've also found it is possible to modify a parser to return unindexed objects.

IObjectReader has:

Modifier and Type|Method|Description
--------------------
void|close()|tidy up when snapshot no longer required
<A> A|getAddon(Class<A> addon)|Get additional information about the snapshot
void|open(ISnapshot snapshot)|Open the dump file associated with the snapshot
IObject|read(int objectId, ISnapshot snapshot)|Get detailed information about an object
long[]|readObjectArrayContent(ObjectArrayImpl array, int offset, int length)|Get detailed information about a object array
Object|readPrimitiveArrayContent(PrimitiveArrayImpl array, int offset, int length)|Get detailed information about a primitive array
--------------------

Now, these don't directly look up an object by address, but read() returns the fields, each containing an ObjectReference. If this is an HPROF subclass of ObjectReference, it could do something special when the ObjectReference doesn't have a valid object ID. With getObject(), it could find the object IDs with addresses before and after this object. We can then use the existing O2HPROF index to parse the HPROF file starting at the object before and finishing at the object after, looking for the required address. That relies on the HPROF dump being written in order of heap address, but avoids having to create a whole new index. E.g. discarding 2,500,000,000 objects might take 40GB for the index of object address to HPROF location, plus we don't have an index which could cope with 2,500,000,000 indices. We would need to make sure we did not discard the last object in a heap dump segment, or handle the case of ending one segment and starting the next.
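
The step of finding the indexed objects just before and after an unindexed address can be sketched with a binary search over the sorted addresses. The class is hypothetical, the subsequent scan of the HPROF file between the two neighbours is not shown, and addresses are assumed to sort correctly as signed long values.

```java
import java.util.Arrays;

// Hypothetical lookup of the indexed neighbours of an unindexed address.
final class NeighbourLookup {
    // Returns {indexBefore, indexAfter} into sortedAddresses; -1 means no neighbour on that side.
    // If the address is itself indexed, both entries are its own index.
    static int[] neighbours(long[] sortedAddresses, long target) {
        int pos = Arrays.binarySearch(sortedAddresses, target);
        if (pos >= 0) {
            return new int[] {pos, pos};
        }
        int insertion = -pos - 1; // index of the first address greater than target
        int before = insertion - 1; // becomes -1 when target is below the first address
        int after = insertion < sortedAddresses.length ? insertion : -1;
        return new int[] {before, after};
    }
}
```

In this sketch the two returned positions would play the role of the object IDs whose file locations bound the region to re-parse.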

So that handles the case of unindexed fields from a regular object.
readObjectArrayContent
readPrimitiveArrayContent
could possibly handle the array itself being unindexed. The parser shouldn't generate an IArray with getOutboundReferences() fixed to return these special references as IArray is noextend. The unindexed object addresses returned by readObjectArrayContent couldn't be converted to objects though.

IObjectReader could be extended to add
IObject 	read(long objectAddress, ISnapshot snapshot)
but it would need a default method (Java 8?) or a new interface IObjectReader2. I'm not ready for that yet.
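
The default-method option can be sketched with simplified stand-in types. Snapshot, HeapObject, ObjectReader and LegacyReader below are hypothetical reductions, not the real MAT interfaces; the point is only that an added overload with a default body leaves existing implementations compiling.

```java
// Stand-ins for the real MAT types, reduced to what the sketch needs (assumptions).
interface Snapshot {}

interface HeapObject {
    long address();
}

interface ObjectReader {
    HeapObject read(int objectId, Snapshot snapshot);

    // New overload with a default body, so implementations written earlier still compile.
    default HeapObject read(long objectAddress, Snapshot snapshot) {
        throw new UnsupportedOperationException("lookup by address is not supported");
    }
}

// An implementation written before the overload existed; it needs no changes.
final class LegacyReader implements ObjectReader {
    @Override
    public HeapObject read(int objectId, Snapshot snapshot) {
        return () -> 0x1000L + objectId; // fake address derived from the ID, for illustration
    }
}
```

One thing to watch with this shape is that read(5, s) and read(5L, s) resolve to different overloads, so an accidental widening from int to long at a call site would silently change which method runs.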

It's ugly, but another way is snapshot.getAddOns(ObjectReference.class) for the object instance code to get the parser to create a special ObjectReference, which could then have its address changed by the object instance code in the org.eclipse.mat.snapshot.model package (by adding a package-private setAddress() method), and getObject() would call back into the parser object reader code. This would only work for references constructed in the org.eclipse.mat.snapshot.model package though.
Comment 20 G Xu CLA 2021-05-17 09:14:16 EDT
Nice investigation. As I see more and more objects not having a valid objectId, this can be useful. Also, I am wondering if there is some underlying reason why we have been seeing this more often in the last couple of years?
Comment 21 Jason Koch CLA 2021-05-17 15:27:43 EDT
Hardware is getting cheaper and bigger, and users are using larger application instances:

For example:
https://aws.amazon.com/about-aws/whats-new/2021/05/four-ec2-high-memory-instances-with-up-to-12tb-memory-available-with-on-demand-and-savings-plan-purchase-options/
Comment 22 Andrew Johnson CLA 2021-05-24 02:28:22 EDT
Created attachment 286443
Result of Find object by address

Unindexed object
Comment 23 Andrew Johnson CLA 2021-05-24 02:44:26 EDT
I've made the changes discussed in comment 19 in bug 573598.
https://www.eclipse.org/lists/mat-dev/msg00647.html

Now, after discarding objects, they remain accessible via ObjectReference but not via object ID.
Net result:
As now, only indexed objects appear in the list_objects query, and as children in the tree.
The resolved name involving unindexed (discarded) objects is found.
In the inspector view you can go into (and now go back from) unindexed objects.
In the inspector view, unindexed objects have a warning triangle instead of the GC roots icon.
The Find object by address query finds unindexed objects.
OQL works (to a certain extent) with unindexed objects, and can now return a list of IObject
(instead of an int[] array) if not all the objects have an index. The OQL query then displays those objects as a table.
Unindexed objects have a size, but are not included in retained size calculations, so have zero retained size, and are not in the dominator tree or retained set.
The collection queries often ignore unindexed objects.
The component report works, but unindexed collections are not analyzed.
The leak suspects report will ignore the unindexed objects [but might use them in resolving names].

It won't solve a problem with objects having an outbound field pointing to an object which is not in the heap at all. 
See attachment 1

If this is useful we can look at fixing some of the other queries, so please give it a try.

This is not as good as a proper >32 bit index version. This change did show some of the difficulties which will arise.
Comment 24 Axel Uhl CLA 2022-06-01 14:52:37 EDT
I can only agree that larger applications need a remedy to this sort of problem. We just produced two heap dumps of an application that ended up in unpleasant and unexpected Full GCs, one ~270GB, the other 350GB in size, and there is no way we can get them analyzed with Memory Analyzer. Any progress here?
Comment 25 Andrew Johnson CLA 2022-06-06 09:39:18 EDT
(In reply to Axel Uhl from comment #24)
> I can only agree that larger applications need a remedy to this sort of
> problem. We just produced two heap dumps of an application that ended up in
> unpleasant and unexpected Full GCs, one ~270GB, the other 350GB in size, and
> there is no way we can get them analyzed with Memory Analyzer. Any progress
> here?
Not much progress - but some ideas in bug 572512. Breaking the 2G object limit
would be a big change and API incompatible so I don't know when anyone would
get the time to do it. MAT gets over 200,000 downloads a year - I don't know
how many users have huge heaps though.

The 270GB and 350GB heaps are huge, and might have between 3500 million and 5000 million objects, which would exceed the MAT limit. Did the object discard option https://help.eclipse.org/latest/topic/org.eclipse.mat.ui.help/tasks/configure_mat.html#task_configure_mat__discard help at all?
Comment 26 Tomas Hamal CLA 2022-06-06 09:44:59 EDT
For us the discard option worked quite well and we were able to diagnose heap dumps of over 100GB. Sure, it has the consequence that you can also discard useful information, as we needed to discard about 50% of all data; the basic discard options alone were not enough. However, we were able to locate the memory leaks.
Comment 27 Andrew Johnson CLA 2022-07-09 10:42:16 EDT
Bug 580107 fixes the problem in comment 16 of system properties not displaying values when strings are discarded.
Comment 28 Eclipse Webmaster CLA 2024-05-08 14:51:14 EDT
This issue has been migrated to https://github.com/eclipse-mat/org.eclipse.mat/issues/24.