Re: [mat-dev] Huge heaps

One other idea: if there are large enough mutually exclusive subsets of the object graph, MAT could try to break a dump up into multiple "virtual" dumps that are loaded as separate tabs and ISnapshots, but still refer to each other indirectly (e.g. for the class definitions and boot classloader classes). This would be interesting in its own right, too: I've always wanted a way to take a retained set and get the full power of MAT's ISnapshot analysis on it, instead of just a histogram and the limited investigation from there.
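Roughly what I have in mind, as a completely hypothetical sketch - nothing like this exists in MAT's API today, and all the names are made up:

/** Entirely hypothetical - no such API exists in MAT today.
 *  A "virtual" snapshot would cover a mutually exclusive subset of the
 *  object graph, but delegate shared metadata (class definitions, boot
 *  classloader classes) to the parent dump it was carved out of. */
interface IVirtualSnapshot /* would extend ISnapshot in a real design */ {

    /** The full dump this virtual snapshot belongs to. */
    IVirtualSnapshot getParentSnapshot();

    /** Object ids (in the parent's id space) that form this subset,
     *  e.g. the retained set of some root object. */
    int[] getSubsetObjectIds();
}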

Also, what about having a separate build of MAT where ints are converted to longs (with the alternative data structures, such as BitField, also supporting longs)? This could perhaps be done by annotating the ints that are safe to convert and writing a tool that generates a copy of the code base with longs. The int version would remain the recommended, primary version of MAT, but very advanced customers with large dumps, who have the extra memory and CPU to absorb the overhead of longs, could use the separate build. Of course, this might also break any extensions that use object ids, and those would need separate builds too.

This would actually be a good case where C's #define directives would be useful to more easily create separate builds :)
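To make the long-build idea concrete, a simple marker annotation might be all a code-generating tool needs to find the convertible ints. This is only a sketch under my assumptions - @WideningCandidate and the record class are made-up names, not MAT code:

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/** Made-up marker: flags ints holding object ids that a source-rewriting
 *  tool could safely widen to long for a separate "large heap" build. */
@Retention(RetentionPolicy.SOURCE)
@Target({ ElementType.FIELD, ElementType.METHOD, ElementType.PARAMETER, ElementType.LOCAL_VARIABLE })
@interface WideningCandidate {}

class HistogramRecordSketch {
    @WideningCandidate
    int numberOfObjects;   // would become long in the large-heap build

    @WideningCandidate
    int[] objectIds;       // would become long[] in the large-heap build
}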

--
Kevin Grigorenko
IBM WAS SWAT Team
kevin.grigorenko@xxxxxxxxxx
Blog: https://www.ibm.com/developerworks/mydeveloperworks/blogs/kevgrig/




From: "Tsvetkov, Krum" <krum.tsvetkov@xxxxxxx>
To: Memory Analyzer Dev list <mat-dev@xxxxxxxxxxx>
Date: 03/14/2012 09:10 AM
Subject: Re: [mat-dev] Huge heaps
Sent by: mat-dev-bounces@xxxxxxxxxxx

Andrew, the biggest dump I've worked with was about 8GB and contained about 190,000,000 objects. I don't know what the number of references was.

I don't know how often it happens that people need to analyze dumps with more than 2^31 refs.
Although it doesn't sound good that MAT has some limitations in this area, I think it will be pretty challenging to break this limit.

You already listed some of the problems which may occur if the id is defined as unsigned.
I believe there are even more, and some of them are probably difficult to spot. For example, as far as I remember, our own implementation of BitField only works properly with positive numbers.
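For illustration, something shaped like this breaks as soon as ids go negative - a simplified stand-in, not our actual BitField:

/** Simplified stand-in, not MAT's actual BitField: one bit per object id. */
class BitFieldSketch {
    private final int[] bits;

    BitFieldSketch(int size) {
        bits = new int[(size + 31) >>> 5];
    }

    void set(int id) {
        // For a negative id - i.e. an id above 2^31 - 1 reinterpreted
        // as unsigned - id >> 5 is negative, so this throws
        // ArrayIndexOutOfBoundsException.
        bits[id >> 5] |= 1 << (id & 31);
    }

    boolean get(int id) {
        return (bits[id >> 5] & (1 << (id & 31))) != 0;
    }
}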

My personal opinion is that for the moment we should accept the limitation and document it, rather than trying to fix it for Juno.
And in general, if we try to fix it in a later release, I would suggest that we keep an eye on the performance and our own memory usage, and try not to introduce serious degradation there (because of moving away from a simple int[] or using some bigger storage than an int). I believe that dumps with over 2^31 refs are still rather exceptional, and I would like to keep the rest (probably 99%) of the users happy :-)

What do you think?

What is the experience of others so far?

Krum

-----Original Message-----
From: mat-dev-bounces@xxxxxxxxxxx [mailto:mat-dev-bounces@xxxxxxxxxxx] On Behalf Of Andrew Johnson
Sent: Tuesday, 13 March 2012 11:48
To: Memory Analyzer Dev list
Subject: [mat-dev] Huge heaps

I'm looking at bug 372548: ArrayIndexOutOfBoundsException with huge dump
https://bugs.eclipse.org/bugs/show_bug.cgi?id=372548

I'm working on a fix to one problem - the 1-to-N indexes didn't cope with
more than 2^31 outbound references in the whole snapshot. They used an int
returned from the header index to index into the body index of all the
outbound references. I hope to be able to commit a fix shortly.
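For anyone not familiar with the layout, here is a simplified sketch of the problem and the shape of the fix - not the real index classes, and the real body index is paged on disk rather than held in one array:

/** Simplified 1-to-N index sketch, not the real MAT index classes.
 *  body holds all objects' outbound refs concatenated; headerEnd[i]
 *  is the end offset of object i's slice. Storing the offsets as int
 *  caps the whole snapshot at 2^31 - 1 refs; long offsets lift that
 *  cap, while a single object's ref count still fits in an int. */
class OneToNSketch {
    private final long[] headerEnd; // effectively int before the fix
    private final int[] body;       // the real body is paged on disk; a single
                                    // int[] is itself limited to 2^31 - 1 entries

    OneToNSketch(long[] headerEnd, int[] body) {
        this.headerEnd = headerEnd;
        this.body = body;
    }

    int[] outbounds(int objectId) {
        long start = objectId == 0 ? 0 : headerEnd[objectId - 1];
        long end = headerEnd[objectId];
        int[] refs = new int[(int) (end - start)];
        for (int i = 0; i < refs.length; i++)
            refs[i] = body[(int) (start + i)]; // a paged body would take the long directly
        return refs;
    }
}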

What are the biggest heaps we need to deal with, in terms of objects or
total outbound references?

What other restrictions are there for large dumps?
Do we need a LongIndex1N which can have more than 2^31 outbounds in total?
Do we need more than 2^31 objects? Currently object id < 2^31, i.e. a signed int.
We could define the object id as unsigned (see the sketch after this list). Possible problems include:
Identifier.reverse - a negative number is returned for not found
inbounds - where we temporarily encode some refs as negative
int SnapshotInfo.getNumberOfObjects()
int IClass.getNumberOfObjects()
int [] IClass.getObjectIds()
int [] Snapshot.getOutboundReferentIds()
SetInt can't hold enough ints
int [] Snapshot.getOutboundReferentIds(int[] objectIds, IProgressListener progressListener) - can't return more than 2^31 items
int [] Snapshot.getInboundReferentIds(int[] objectIds, IProgressListener progressListener) - can't return more than 2^31 items
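To illustrate two of those hazards in one place - illustrative only, not the real MAT code:

/** Illustrative only, not MAT code: two hazards of re-reading a
 *  signed int object id as unsigned. */
class UnsignedIdSketch {

    /** Every comparison or offset computation must widen first,
     *  or ids above 2^31 - 1 compare and sort wrongly. */
    static long widen(int id) {
        return id & 0xFFFFFFFFL;   // 0 .. 2^32 - 1
    }

    /** Identifier.reverse-style lookup, where a negative return means
     *  "not found". With unsigned ids the -1 sentinel is ambiguous:
     *  read as unsigned it is also the legal id 0xFFFFFFFF. */
    static int reverse(long[] sortedAddresses, long address) {
        int i = java.util.Arrays.binarySearch(sortedAddresses, address);
        return i >= 0 ? i : -1;    // sentinel collides with a valid unsigned id
    }
}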
Do we need to expose an IntIndexReader which can be indexed by unsigned ints / longs for > 2^31 entries?
Do we need to make the InboundWriter work with huge dumps? It splits the refs into separate log files, but can the contents of the log files get too big to sort as int arrays?
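If they can, one escape hatch would be sorting bounded chunks and merging, instead of one Arrays.sort over a single huge array - a rough sketch, not what the InboundWriter does today:

import java.util.Arrays;

/** Rough sketch, not the current InboundWriter. Once one log file holds
 *  more refs than fit in a single int[] (2^31 - 1 entries), sort it as
 *  bounded in-memory runs and k-way merge the runs back to disk. */
class ChunkedRefSortSketch {
    static final int RUN_SIZE = 1 << 24;   // 16M entries per in-memory run

    static void sortRuns(int[][] runs) {
        for (int[] run : runs)
            Arrays.sort(run);              // each run sorts comfortably in memory
        // a k-way merge of the sorted runs would then stream the
        // combined order back to the log file
    }
}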
Can we save memory on building indices, doing the GC, rebuilding indices,
calculating dominator tree?

Andrew Johnson






Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU






_______________________________________________
mat-dev mailing list
mat-dev@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/mat-dev