Bug 579336 - Max Size and time
Summary: Max Size and time
Status: RESOLVED FIXED
Alias: None
Product: MAT
Classification: Tools
Component: Core
Version: 1.12
Hardware: PC Linux
Importance: P3 normal
Target Milestone: 1.14.0
Assignee: Andrew Johnson CLA
QA Contact:
URL:
Whiteboard:
Keywords: performance
Depends on:
Blocks:
 
Reported: 2022-03-20 01:25 EDT by Svarnim Agarwal CLA
Modified: 2023-02-03 12:25 EST

See Also:


Description Svarnim Agarwal CLA 2022-03-20 01:25:40 EDT
Hey,

I wanted to know: is there a maximum file size that MAT can process?
How long should it take, approximately, for a 10GB hprof file? 
Is there a way to utilize multi-threading? 
What should the heap setting in MemoryAnalyzer.ini be for a 10GB HPROF, and is there any minimum amount of RAM and CPU required for that?
Comment 1 Andrew Johnson CLA 2022-03-21 14:15:23 EDT
Have you read https://help.eclipse.org/latest/topic/org.eclipse.mat.ui.help/tasks/configure_mat.html?cp=56_3_16 ?
MAT has handled 70GB HPROF files.
I don't know how long for a 10GB file, but it might be tens of minutes to possibly an hour or two.
The parsing does utilize multi-threading automatically.
The required JVM heap size depends more on the number of objects than on the HPROF size, and the help article gives some ways of
estimating how much heap is required. You don't want to be swapping, so you need a bit more RAM than the heap size that MAT is using
(not the heap size of the JVM that created the HPROF).
There is no minimum CPU, but obviously the slower the CPU, or the fewer the cores, the longer it will take. MAT doesn't have a timeout on parsing,
so it should eventually complete.
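For illustration, the MAT heap is set with -Xmx in the -vmargs section at the end of MemoryAnalyzer.ini; the value below is only an example, use the estimate from the help article:

-vmargs
-Xmx16g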
Comment 2 Svarnim Agarwal CLA 2022-03-22 12:13:46 EDT
So I have been running multiple test scenarios where I anonymize a huge 10GB HPROF. In this case, the code starts anonymizing (based on a log statement that I print), but I don't see it finish even after 3-4 hours. The MAT process is killed in the background after some time and I don't know why; I don't see any exceptions or errors. Is there a way to know whether the anonymization is actually in progress, instead of assuming it is?

One more question: I have an 8.5GB HPROF file, and when I open it using the MAT executable the overview shows that the size is 171.2 MB. I am not sure if that's correct and just wanted to understand what this size refers to.
Comment 3 Andrew Johnson CLA 2022-03-22 13:00:04 EDT
So, is the problem that the initial parse completes in a reasonable time, but the export snapshot does not?
The size in the overview is the size of the reachable objects in the heap - the unreachable objects histogram will show you the size of the objects which could be discarded at the next GC cycle.

The export snapshot query builds a new HPROF file in segments, so you should see that file grow in length. If you reduce the segsize then
you might be able to track progress more easily.
Comment 4 Svarnim Agarwal CLA 2022-03-25 04:03:01 EDT
I usually don't see any logs... are there any flags I can set to see more information, like which step it is on, and so on?

When I run it for a smaller hprof I do see:
Parsing, Scanning, Writing, Dominator Tree calculation, Prepare Classes, Prepare GC roots, etc. 
But for a big file, I don't see anything. 

I also realized that the MAT process exits abnormally for big files, as I don't see it in the running processes list. No idea why!

Can MAT anonymize a gzip file, or does it have to be an uncompressed HPROF?
Comment 5 Andrew Johnson CLA 2022-03-25 13:10:17 EDT
The problem seems to be in the initial parse, not the export HPROF stage.

Is this batch mode or GUI mode?

What is in the configuration/*.log file?

What is the MAT heap size?
Comment 6 Svarnim Agarwal CLA 2022-03-25 23:06:13 EDT
I have the MAT folder in my code and I am calling it using a ProcessBuilder... I'll run another test and let you know what is in the log file.

MAT heap size is 16GB.

Just one more update: So I ran the same hprof file locally, and it finished the whole process in just a few minutes with console messages such as:

Task: Parsing <input file>
[Task: Parsing <filename>
[
Subtask: Scanning <filename>
[
[INFO] Detected compressed references, because with uncompressed 64-bit references the array at 0x640800040 would overlap the array at 0x640800018
[....................
[INFO] Wrote threads call stacks to <filename>
[....................
[INFO] Heap <filename> contains 110,218,837 objects
[....................
Subtask: Extracting objects from <filename>
[................................................................................Task: Removing unreachable objects
[
Subtask: Searching for unreachable objects
[
Subtask: Marking reachable objects
[.
Subtask: Re-indexing objects
[.
[INFO] Removed 107,452,087 unreachable objects using 7,396,202,800 bytes
[..
Subtask: Re-indexing classes
[...
Subtask: Writing <filename>
[....
Subtask: Writing <filename>
[.....
Subtask: Writing <filename>
[......
Subtask: Re-indexing outbound index
[.......
Subtask: Writing <filename>
[........
Subtask: Writing <filename>
[.........
Subtask: Writing <filename>
[.........Task: Calculating Dominator Tree
[
Subtask: Dominator Tree calculation
[
Subtask: Depth-first search
[....................
Subtask: Computing dominators
[..............................................
Subtask: Calculate retained sizes
[............................................................
Subtask: Create dominators index file
[................................................................................]

Subtask: Test 'Export HPROF' of section 'Report Root'
[
Subtask: Export to hprof_output.hprof
[
Subtask: Prepare classes
[
Subtask: Prepare GC roots
[
Subtask: Prepare thread stacks
[
Subtask: Dump strings
[
Subtask: Dump thread stacks
[
Subtask: Dump classes into heap dump segment 1
[
Subtask: Dump GC roots into heap dump segment 1
[
Subtask: Prepare objects for heap dump segment 2
[
Subtask: Dump objects into heap dump segment 2

That ending seemed abrupt, but the output file is present. Are these the expected statements in the console? Also, is the message 'Detected compressed references, because with uncompressed 64-bit references the array at 0x640800040 would overlap the array at 0x640800018' something to be worried about?
Comment 7 Andrew Johnson CLA 2022-03-26 03:33:00 EDT
The 'Detected compressed references, because with uncompressed 64-bit references the array at 0x640800040 would overlap the array at 0x640800018' message is nothing to be worried about, and is normal. 64-bit JVMs have an optimization where references are stored as 32-bit values in the heap and are converted to addresses when accessed, by a shift and an add of the starting address of the heap. It saves heap space and cache space at a possible slight increase in run time, though the improved cache use might speed things up.
https://www.ibm.com/docs/en/sdk-java-technology/7.1?topic=a-compressed-references-2
https://wiki.openjdk.java.net/display/HotSpot/CompressedOops

The HPROF file format does not specify whether compressed references are used, or the actual size of objects in the heap.
MAT has to estimate this from the number and type of the fields, and also guess the size of the reference fields. If it looks like an object with uncompressed references would overlap the next object then MAT presumes compressed references were used.
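As a rough sketch of both points in code (the heap base, the stored value and the size estimate are made-up example values, not anything MAT reports):

public class CompressedRefsExample {
    public static void main(String[] args) {
        // Decoding a compressed reference: the stored 32-bit value is shifted
        // (8-byte alignment => shift of 3) and added to the heap base address.
        long heapBase = 0x600000000L;         // assumed heap start, example only
        int storedRef = 0x08100003;           // 32-bit value as read from the heap
        long address = heapBase + ((storedRef & 0xFFFFFFFFL) << 3);

        // The overlap heuristic, simplified: if an object's size estimated with
        // uncompressed 8-byte references would run past the next object's address,
        // the references must really have been compressed.
        long arrayAt = 0x640800018L;          // addresses from the log message above
        long nextArrayAt = 0x640800040L;
        long sizeIfUncompressed = 0x38;       // assumed size estimate, example only
        boolean compressedRefs = arrayAt + sizeIfUncompressed > nextArrayAt;

        System.out.printf("decoded address 0x%x, compressed references: %b%n",
                address, compressedRefs);
    }
}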

Those messages look normal - the subtasks are also displayed on the GUI, and the progress monitor completes when they are done, so there is no end message.

It looks like for your big file you are just running out of memory. Perhaps -Xmx is big enough, but the heap gets expanded, you hit the operating system limit for the process, and the process is killed.
Comment 8 Svarnim Agarwal CLA 2022-03-26 23:02:57 EDT
Thanks... I'll try out a few more variations and get back to you.
Comment 9 Svarnim Agarwal CLA 2022-03-29 12:58:28 EDT
So I tried running a few more scenarios:
Keeping RAM, CPU and code the same, I see that smaller files (less than 200MB) work perfectly. But if the file size is huge then I get this warning:
[WARNING] (Possibly) Invalid HPROF file: Expected to read another 1,073,741,983 bytes, but only 789,575,943 bytes are available for heap dump record 0x1c at 0x408ab103.
 [....................
 [WARNING] (Possibly) Invalid HPROF file: Expected to read another 1,073,741,994 bytes, but only -284,166,049 bytes are available for heap dump record 0x1c at 0x808ab1ab.

Is this something to worry about? Can it be bypassed or mitigated? And why does it appear?
Comment 10 Andrew Johnson CLA 2022-03-30 06:29:49 EDT
The HPROF file is divided into segments and each segment holds many object records etc.
Record 0x1c (HEAP DUMP SEGMENT) at 0x408ab103 - so there are 9 bytes for the header: 1 for the id, 4 for time stamp, 4 for length: 1,073,741,983 bytes
so the next record is expected to start at 0x408ab103 + 9 + 1,073,741,983 = 0x808AB1AB

That matches the address of the next segment.
It could be that the HPROF file has been truncated; but as the next segment seems to have been found, it could instead be that the HPROF file is gzipped: the GZIP file format only stores the uncompressed length modulo 2^32, so Memory Analyzer has to guess the uncompressed length. It then issues those warnings because it looks like the file is not big enough, but it carries on reading, and all might be okay.

See also bug 567303.
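The arithmetic in code form, just to show the check (the numbers are the ones from your warning):

public class HprofRecordOffset {
    public static void main(String[] args) {
        long recordStart = 0x408ab103L;        // HEAP DUMP SEGMENT record 0x1c
        int headerBytes = 1 + 4 + 4;           // tag byte, timestamp, length field
        long recordLength = 1_073_741_983L;    // length from the warning message
        long nextRecord = recordStart + headerBytes + recordLength;
        System.out.printf("next record expected at 0x%X%n", nextRecord);  // 0x808AB1AB
    }
}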
Comment 11 Svarnim Agarwal CLA 2022-03-30 07:54:18 EDT
Yes, you are right, it is a gzip.

Should I use -DhprofStrictnessWarning=true to prevent any exception, just in case?

And where should I add it in MemoryAnalyzer.ini?
Comment 12 Andrew Johnson CLA 2022-03-30 11:07:17 EDT
That looks like just a warning, not an error, so you don't need to set -DhprofStrictnessWarning=true.
If you did want to set it then you would put it in the -vmargs section of MemoryAnalyzer.ini, which comes at the end of the file as all the subsequent options are taken as JVM options.
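If you did set it, the end of MemoryAnalyzer.ini would look something like this (the -Xmx line is only an example of an existing JVM option):

-vmargs
-Xmx16g
-DhprofStrictnessWarning=true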
Comment 13 Svarnim Agarwal CLA 2022-03-31 01:25:32 EDT
I have made a new observation: 
If I anonymize an 8GB HPROF file it takes around 3-4 minutes.
If I anonymize the gzip version of the same HPROF file (~1GB) it takes around 40 minutes.
If I unzip it myself and then anonymize it (both steps through code), it takes around 6 minutes.

Is there any reason for this?
Comment 14 Andrew Johnson CLA 2022-03-31 02:05:59 EDT
Memory Analyzer does handle gzip HPROF files, but random access to a gzip file is hard as gzip has up to 32kB of state for any position. MAT keeps a cache of points in the gzip, then when there is a seek to a new location it finds the cached entry closest to, but before the new location, seeks to the cached location, sets the dictionary and unzips from the cached location to the new location.

Many queries can just use the indexes, but reading fields or array lengths or array entries requires MAT to go back to the HPROF and so through the gzip reader. The Export query is the worst case as it has to read every field and array entry to export it. The export should be mainly in sequential order which will help.

JDK >= 15 creates gzip HPROF files in chunks (1MB), so they can be read by seeking to the start of a chunk and unzipping from there, without having to restore the gzip dictionary, so that helps.

Giving MAT more heap can help as the MAT gzip reader uses a soft reference cache for the various restart points.
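To make the cost concrete, this is roughly what a 'seek' into a plain gzip stream amounts to without any cached restart points - everything before the target offset has to be decompressed again. (A sketch for illustration only, not the MAT reader, which shortens the re-inflation by restoring a cached dictionary near the target.)

import java.io.*;
import java.util.zip.GZIPInputStream;

public class NaiveGzipSeek {
    // Position a stream at an uncompressed offset by re-inflating from the start.
    static InputStream seek(File gz, long target) throws IOException {
        InputStream in = new GZIPInputStream(new FileInputStream(gz));
        long skipped = 0;
        while (skipped < target) {
            long n = in.skip(target - skipped);   // skip() still decompresses every byte
            if (n <= 0) throw new EOFException("offset beyond end of stream");
            skipped += n;
        }
        return in;
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = seek(new File(args[0]), Long.parseLong(args[1]))) {
            System.out.println("byte at offset: " + in.read());
        }
    }
}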
Comment 15 Svarnim Agarwal CLA 2022-04-05 12:53:13 EDT
Hey, so I am calling MAT using a code snippet and it is taking around 4 hours for some files and 1 hour for others. Is there anything I can do to reduce this time? I have tried increasing the heap in MemoryAnalyzer.ini and also the RAM and CPU of my system.
Any other suggestions? FYI: I am using JDK 11... Does going to a higher version help?
Comment 16 Svarnim Agarwal CLA 2022-04-05 23:12:18 EDT
(In reply to Svarnim Agarwal from comment #15)
> Hey, So using a code snippet I am calling MAT and it is taking around 4hrs
> for some files and 1hr for some. Is there anything I can d o to reduce this
> time... I have tried increases Heap in memoryAnalyzer and also the RAM and
> CPU of my system.
> Any other suggestions? FYI: I am using jdk11... Does going to a higher
> version help?

Also, I have the MAT project imported in my code and I'm calling ParseHeapDump from my code, and it takes around 4 hours... For the same file, if I do it locally, it takes me 20 minutes... Are there any configurations I should check? Could there be any reason behind this?
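A simplified sketch of the wrapper (the paths are placeholders, not my real setup, and the report arguments I pass for the export are omitted):

import java.io.File;
import java.io.IOException;

public class RunParseHeapDump {
    public static void main(String[] args) throws IOException, InterruptedException {
        File matDir = new File("/opt/mat");          // placeholder MAT install directory
        String dump = "/dumps/heapdump.hprof";       // placeholder dump path

        ProcessBuilder pb = new ProcessBuilder(
                new File(matDir, "ParseHeapDump.sh").getAbsolutePath(),
                dump);                               // (export report arguments omitted)
        pb.directory(matDir);
        pb.inheritIO();        // so the Task/Subtask progress appears on the console
        int exit = pb.start().waitFor();
        System.out.println("ParseHeapDump exited with " + exit);
    }
}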
Comment 17 Andrew Johnson CLA 2022-04-06 03:48:29 EDT
I don't know why there is the big difference in times.
Changing the Java version won't make any difference.
If you have the disk space, then use the uncompressed *.hprof file rather than the compressed *.hprof.gz file.
Comment 18 Svarnim Agarwal CLA 2022-04-11 01:08:43 EDT
After anonymizing, these files are produced:
<filename>1742430715.a2s.index
<filename>1742430715.domIn.index
<filename>1742430715.domOut.index
<filename>1742430715.hprof
<filename>1742430715.hprof.gz
<filename>1742430715.idx.index
<filename>1742430715.inbound.index
<filename>1742430715.index
<filename>1742430715.o2c.index
<filename>1742430715.o2hprof.index
<filename>1742430715.o2ret.index
<filename>1742430715.outbound.index
<filename>1742430715.threads
<filename>1742430715_Export_HPROF.zip

After a bit of digging I found that the files other than <filename>1742430715_Export_HPROF.zip are produced quickly, but the zip file is taking time.

Questions:

1) Is there any use of that zipped folder? I opened it and nothing seemed important. 
2) Can I avoid this generation?
3) Can I end the process once the other files are made, since I have what I need at that point?
Comment 19 Andrew Johnson CLA 2022-04-11 03:38:37 EDT
The zip file is just a report of running the export HPROF which I think you are using to hide some of the sensitive information in the dump. You can't stop the generation of this report, but the report is quick once the new HPROF file has been exported.

Do try running the export HPROF command on the uncompressed HPROF, not the hprof.gz file, as that should be faster.
Comment 20 Andrew Johnson CLA 2022-07-08 09:56:10 EDT
I have some ideas to improve the performance of the Export HPROF:

1. When all objects are exported, export by object ID (and so by address) and not by class. This
should have better locality reading the input HPROF, which is particularly important if compressed.
2. Don't read object fields / array elements when calculating the size of objects / segments, use 0 instead. 
This avoids unnecessary reads from the HPROF.
3. Buffer the gzip output stream (sketched below).
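For point 3, something along these lines (an illustration of the idea, not the exact MAT change): buffering on both sides of the deflater means the many small record writes are collected before being compressed, and the compressed output reaches the file in larger blocks.

import java.io.*;
import java.util.zip.GZIPOutputStream;

public class BufferedGzipOut {
    public static void main(String[] args) throws IOException {
        try (OutputStream out = new BufferedOutputStream(
                new GZIPOutputStream(
                        new BufferedOutputStream(new FileOutputStream(args[0]), 64 * 1024),
                        64 * 1024))) {
            for (int i = 0; i < 1_000_000; i++) {
                out.write(i);   // many tiny writes, absorbed by the buffers
            }
        }
    }
}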
Comment 21 Eclipse Genie CLA 2022-07-08 12:09:23 EDT
New Gerrit change created: https://git.eclipse.org/r/c/mat/org.eclipse.mat/+/194642
Comment 23 Andrew Johnson CLA 2022-07-08 12:36:54 EDT
I have made some changes to try to speed up Export HPROF.
Please test a snapshot build to see if this helps: https://www.eclipse.org/mat/snapshotBuilds.php
Comment 24 Eclipse Genie CLA 2022-07-13 11:01:35 EDT
New Gerrit change created: https://git.eclipse.org/r/c/mat/org.eclipse.mat/+/194698
Comment 26 Svarnim Agarwal CLA 2022-10-21 22:36:13 EDT
Thank you for all the help. We were able to make it work by creating a stand-alone program which performs the anonymizing and calling that using our main code.
Comment 27 Andrew Johnson CLA 2022-11-02 05:19:21 EDT
I'll mark this as resolved as the reporter has a work-around and I have finished the changes I made to improve performance.