Bug 580893 - Delays expanding large arrays in the outbound tree with a core dump
Summary: Delays expanding large arrays in the outbound tree with a core dump
Status: RESOLVED FIXED
Alias: None
Product: MAT
Classification: Tools
Component: GUI (show other bugs)
Version: 1.13   Edit
Hardware: All All
: P3 normal (vote)
Target Milestone: 1.14.0   Edit
Assignee: Andrew Johnson CLA
QA Contact:
URL:
Whiteboard:
Keywords: performance
Depends on:
Blocks:
 
Reported: 2022-10-11 15:53 EDT by Kevin Grigorenko CLA
Modified: 2023-02-03 12:28 EST (History)
1 user (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Kevin Grigorenko CLA 2022-10-11 15:53:17 EDT
Below is an example stack trace during the delay that shows the proximate driver of the slowdown being ObjectListResult$Tree.prefix(ObjectListResult.java:251). This code is adding the array index prefix(es) to each element (e.g. the "[70162]" in "[70162] java.lang.String @ 0x80024ff8  Some string"). Each child reference of the array is evaluated independently and searches up to Integer.MAX_VALUE array elements (or up to 150 duplicated element references), so even though MAT only shows up to the first 25 children of the array, that can be up to min($ARRAY_LENGTH, Integer.MAX_VALUE) * 25 searches.

The problem is reproducible consistently and shows similar stack traces. This work happens on the UI thread, so the MAT GUI locks up for dozens of seconds or more with very large arrays (e.g. tens of millions of array elements).

This problem does not happen with PHDs as the array index information is unavailable with PHDs.

4XESTACKTRACE                at com/ibm/j9ddr/corereaders/memory/AbstractMemory.getBytesAt(AbstractMemory.java:176(Compiled Code))
4XESTACKTRACE                at com/ibm/j9ddr/corereaders/memory/AbstractMemory.getIntAt(AbstractMemory.java:226(Compiled Code))
4XESTACKTRACE                at com/ibm/j9ddr/vm29/pointer/AbstractPointer.getIntAtOffset(AbstractPointer.java:360(Compiled Code))
4XESTACKTRACE                at com/ibm/j9ddr/vm29/pointer/generated/J9IndexableObjectContiguousCompressedPointer.size(Bytecode PC:8(Compiled Code))
4XESTACKTRACE                at com/ibm/j9ddr/vm29/pointer/helper/J9IndexableObjectHelper.rawSize(J9IndexableObjectHelper.java:82(Compiled Code))
4XESTACKTRACE                at com/ibm/j9ddr/vm29/j9/gc/GCArrayletObjectModelBase.getArrayLayout(GCArrayletObjectModelBase.java:201(Compiled Code))
4XESTACKTRACE                at com/ibm/j9ddr/vm29/j9/gc/GCArrayletObjectModelBase.isInlineContiguousArraylet(GCArrayletObjectModelBase.java:315(Compiled Code))
4XESTACKTRACE                at com/ibm/j9ddr/vm29/j9/gc/GCArrayletObjectModelBase.getElementAddress(GCArrayletObjectModelBase.java:365(Compiled Code))
4XESTACKTRACE                at com/ibm/j9ddr/vm29/j9/gc/GCObjectModel_V1.getElementAddress(GCObjectModel_V1.java:392(Compiled Code))
4XESTACKTRACE                at com/ibm/j9ddr/vm29/j9/ObjectModel.getElementAddress(ObjectModel.java:389(Compiled Code))
4XESTACKTRACE                at com/ibm/j9ddr/vm29/pointer/helper/J9IndexableObjectHelper.getObjectData(J9IndexableObjectHelper.java:398(Compiled Code))
4XESTACKTRACE                at com/ibm/j9ddr/vm29/pointer/helper/J9IndexableObjectHelper.getData(J9IndexableObjectHelper.java:302(Compiled Code))
4XESTACKTRACE                at com/ibm/j9ddr/vm29/view/dtfj/java/DTFJJavaObject.arraycopy(DTFJJavaObject.java:119(Compiled Code))
4XESTACKTRACE                at org/eclipse/mat/dtfj/DTFJHeapObjectReader.getObjectData(DTFJHeapObjectReader.java:783(Compiled Code))
4XESTACKTRACE                at org/eclipse/mat/dtfj/DTFJHeapObjectReader.readObjectArrayContent(DTFJHeapObjectReader.java:1050(Compiled Code))
5XESTACKTRACE                   (entered lock: com/ibm/j9ddr/view/dtfj/image/J9DDRImage@0x0000000081362E70, entry count: 1)
4XESTACKTRACE                at org/eclipse/mat/parser/model/ObjectArrayImpl.getReferenceArray(ObjectArrayImpl.java:110(Compiled Code))
4XESTACKTRACE                at org/eclipse/mat/snapshot/query/ObjectListResult$Tree.extractAttribute(ObjectListResult.java:305(Compiled Code))
4XESTACKTRACE                at org/eclipse/mat/snapshot/query/ObjectListResult$Outbound.fillInAttribute(ObjectListResult.java:112)
4XESTACKTRACE                at org/eclipse/mat/snapshot/query/ObjectListResult$Tree.prefix(ObjectListResult.java:251) [...]

Some ideas for solutions that come to mind:

1. Don't show prefixes if the array size is larger than some experimentally determined value (with a preference to override)
2. Start the prefix as blank and then use an async Job to add in the prefixes
3. Never show prefixes for array elements to avoid any delays showing something that is rarely needed. If the user wants to see them, they can select the element and the Inspector view shows the elements and indices. Add an option to re-enable this.

In my opinion, I suggest going with option 3. What do you think?
Comment 1 Andrew Johnson CLA 2022-10-12 11:02:23 EDT
There is similar code in:

CompareTablesQuery (maximum size 1024*1024)
DominatorQuery (maximum size 1024*1024)
MultiplePaths2GCRootsQuery (maximum size 20*1024*1024)
Paths2GCRootsQuery (maximum size 20*1024*1024)

Bug 566917
Bug 575364

On my laptop it takes about 13 seconds to expand a
14,000,000 element Object[] for a hprof.gz
If DTFJ takes much longer than that it looks like a DTFJ problem,
and it would be a shame to restrict the facility for HPROF as well.

I think I must have added the code in those other queries later, and
found it could take a while, so added the limit. ObjectListResult
didn't have a limit, so I didn't introduce one, but made the code
similar.

Perhaps it is time to introduce a limit - somewhere between 1 and 20 million.
I don't think it is worth adding a preference, few people would want to
change it and it is another thing to test, document, translate? etc.
Comment 2 Kevin Grigorenko CLA 2022-10-12 22:19:46 EDT
(In reply to Andrew Johnson from comment #1)
> If DTFJ takes much longer than that it looks like a DTFJ problem,
> and it would be a shame to restrict the facility for HPROF as well.

Another issue is that IBM Support uses MAT with dumps accessed over a networked filesystem.

> Perhaps it is time to introduce a limit - somewhere between 1 and 20 million.
> I don't think it is worth adding a preference, few people would want to
> change it and it is another thing to test, document, translate? etc.

A limit seems good to me. I was thinking of a much smaller limit (since this will pause the GUI thread) in the hundreds or thousands. What would be the use case for visualizing array element positions for an array with millions of elements? If the user wants to know the positions of the first N elements, they can always select the array and review the Inspector view.
Comment 3 Eclipse Genie CLA 2022-11-03 06:36:53 EDT
New Gerrit change created: https://git.eclipse.org/r/c/mat/org.eclipse.mat/+/196594
Comment 4 Andrew Johnson CLA 2022-11-03 10:02:38 EDT
This change implements options 1 and 2.
Comment 6 Eclipse Genie CLA 2022-11-03 17:32:41 EDT
New Gerrit change created: https://git.eclipse.org/r/c/mat/org.eclipse.mat/+/196602
Comment 8 Andrew Johnson CLA 2022-11-11 12:41:35 EST
I hope this solves the problem.
As we now use jobs to fill in the prefix then it should now be multithreaded to get this information which should also speed things up. There is already locking on getting the object information from the parser.