
Re: [mat-dev] Parallel parsing

Hey Jason,

Our use cases are all over the place as we support thousands of different customers, many on legacy systems that are still 32-bit constrained; however, I agree with you that the customer trend is clearly towards larger heaps and, on our side, larger analysis machines with a lot of RAM and CPU, both for handling those large heaps and for security and GDPR reasons.

I'm probably one of the few production support folks that still sometimes loads dumps on a laptop while on customer sites. So my knee-jerk reaction worrying about heap is probably a mindset I need to grow out of, and on reflection, I agree with you that it's probably not needed to worry too much about MAT's heap usage if it can improve parsing times (within reason, of course).

With that said, I think we should always keep the "laptop use case" in mind. There will always be developers using dumps to profile their applications during development. I can tell you that it's a rare customer that has a dedicated MAT analysis box with lots of RAM and CPU, so while IBM, Netflix, etc. may not be impacted much by MAT's memory usage, other customers will face more difficulty. When feasible, I think it's good to add an option to the preferences dialog that directs MAT to use RAM more aggressively (maybe we just need a single checkbox for this that various parts of MAT can query?). For this change, given how deep it is, I think it's fine to skip that; but as Andrew points out, a limit on the number of threads as an option would be nice. If you want to reduce the time investment, I think a simple -D JVM argument would be fine.
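A thread-limit -D argument along these lines could be sketched as follows. To be clear, this is a hypothetical sketch, not existing MAT code, and the property name "mat.parser.threads" is an assumption:

```java
// Hypothetical sketch of a -D JVM argument that caps parser threads,
// e.g. -Dmat.parser.threads=2. The property name is made up for
// illustration; it is not an existing MAT option.
public final class ParserThreads {
    private ParserThreads() {}

    /** Number of worker threads the parallel parser should use. */
    public static int count() {
        int available = Runtime.getRuntime().availableProcessors();
        String prop = System.getProperty("mat.parser.threads");
        if (prop == null) {
            return available; // default: use all cores
        }
        try {
            int requested = Integer.parseInt(prop);
            // Clamp to a sane range; 1 effectively disables parallelism.
            return Math.max(1, Math.min(requested, available));
        } catch (NumberFormatException e) {
            return available; // malformed value: fall back to the default
        }
    }
}
```

Setting the property to 1 would then act as the "disable" switch without a separate code path.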

--
Kevin Grigorenko
IBM WebSphere Application Server SWAT




From:        Jason Koch <jkoch@xxxxxxxxxxx>
To:        Memory Analyzer Dev list <mat-dev@xxxxxxxxxxx>
Date:        04/30/2019 09:20 AM
Subject:        Re: [mat-dev] Parallel parsing
Sent by:        mat-dev-bounces@xxxxxxxxxxx




Hi Kevin / Andrew / (& all),

It might help if you could share the sorts of scenarios in which MAT is used in your environments, and in environments where you have seen other users working. (All feedback is welcome!) I think it would be a valuable part of the discussion, so that I can understand how unique our requirements are and what considerations I should build into any potential future work.

The specific use case and goal I have may be unique to our circumstances. The bottom line for us is that CPU and RAM are relatively plentiful, whilst engineer time is much more valuable.

- It is no secret that Netflix runs on AWS and has a significant Java deployment for applications, including large Cassandra and Elasticsearch deployments.
- Many of our applications run on instance sizes and Java heaps that dwarf what could reasonably be downloaded and processed on developer machines. It is very common for application heaps to be in the 50-100GB range.
- This means that teams already have workflows that involve running Eclipse analysis of large heaps on cloud machines via remote desktops. In this scenario, memory is not as constrained.
- Large-memory machines come with lots of cores. Since we have these cores allocated anyway, and engineers waiting on results, we would like to find a way to use them.

Let me know your thoughts and observations.

Thanks
Jason


On Mon, Apr 29, 2019 at 10:03 AM Andrew Johnson <andrew_johnson@xxxxxxxxxx> wrote:
Kevin,

We do need to understand the memory usage issue, but it need not be a blocker.


Jason explained that, for large object arrays, the existing MAT code already accumulates references in an ArrayLong, so it has a similar problem, and we live with that. The allocation of two arrays (accumulating the references in a long array, then copying them across) might be a problem with big arrays, though.
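As an illustrative sketch (not MAT's actual ArrayLong implementation), the accumulate-then-copy pattern looks like this; it makes visible why two large arrays are briefly live at once:

```java
import java.util.Arrays;

// Illustrative sketch of the allocation pattern under discussion:
// outbound references are first accumulated in a growable long buffer,
// then copied into an exact-size array. During toArray(), both the
// buffer and the copy are live, roughly doubling transient memory for
// very large object arrays.
final class RefAccumulator {
    private long[] buf = new long[16];
    private int size;

    void add(long ref) {
        if (size == buf.length) {
            buf = Arrays.copyOf(buf, buf.length * 2); // grow by doubling
        }
        buf[size++] = ref;
    }

    /** Second allocation: an exact-size copy of the accumulated refs. */
    long[] toArray() {
        return Arrays.copyOf(buf, size);
    }
}
```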


Disabling all parallel parsing by selectively switching to the old code would be a maintenance problem - if we did need to disable it, limiting the number of threads to 1 might be an easier solution. Somehow throttling memory usage would be harder, and I'd like to see some GC statistics with large dumps to know how much of a problem we have.


Another tool, at least for IBM VMs, is IBM Monitoring and Diagnostic Tools - Garbage Collection and Memory Visualizer.


Krum might be back shortly - I am switching to a new machine, and it's taking longer than expected to move things across before I can properly try things out.


Andrew Johnson





From:        "Kevin Grigorenko" <kevin.grigorenko@xxxxxxxxxx>
To:        jkoch@xxxxxxxxxxx, Memory Analyzer Dev list <mat-dev@xxxxxxxxxxx>
Date:        23/04/2019 22:36
Subject:        Re: [mat-dev] Parallel parsing
Sent by:        mat-dev-bounces@xxxxxxxxxxx




> Memory Usage - we don't want to break the parsing of huge dumps on large but not huge machines.
> Large object arrays - previously the parser didn't have to allocate an array as big as the one in the original dump. With parallel parsing, the code allocates a long array with as many outbound refs as the object array has in the original dump, which is one more than the size of the array (as there is a class ref too).

https://git.eclipse.org/r/#/c/138031/
> Total memory usage - if several threads run at once, they can all allocate large buffers at the same time, so the total memory usage can exceed the limit.

This is a bit concerning as we already struggle with large dumps and the customer trend is larger heaps. Jason, can you please consider adding a preference to disable this new behavior (I guess that would mean disabling all of parallel parsing)?

--
Kevin Grigorenko
IBM WebSphere Application Server SWAT





From:        Andrew Johnson <andrew_johnson@xxxxxxxxxx>
To:        Memory Analyzer Dev list <mat-dev@xxxxxxxxxxx>
Date:        04/18/2019 03:19 AM
Subject:        Re: [mat-dev] Parallel parsing
Sent by:        mat-dev-bounces@xxxxxxxxxxx




Jason,

Yes, I've completed my initial reviews; I am positive about the changes, and the code is clean and looks good.

Meanwhile we did submit the changes for a Contribution Questionnaire (CQ 19463) as it was a largish contribution. As this is a legal process it is only viewable by Eclipse committers, but there were no issues raised and it was approved, so from an Eclipse legal process point of view we can go forward.


In terms of reviewing the code, my main issues were:

1.        API compatibility
This isn't flagged by the Eclipse refactor unless you have an API baseline set up using Preferences > Plug-in Development > API Baselines

https://wiki.eclipse.org/MemoryAnalyzer/Contributor_Reference#Configure_API_Tooling_Baseline
You then get errors on code such as "The member type org.eclipse.mat.parser.index.IndexWriter.IntArray1NWriter in org.eclipse.mat.parser_1.8.1 has been removed"
I think we can fix a lot of this by moving classes back to their original locations (or at least stubs) and increasing the version number to allow the addition of classes/methods.

2.        Memory Usage - we don't want to break the parsing of huge dumps on large but not huge machines.

1.        Large object arrays - previously the parser didn't have to allocate an array as big as the one in the original dump. With parallel parsing, the code allocates a long array with as many outbound refs as the object array has in the original dump, which is one more than the size of the array (as there is a class ref too).

https://git.eclipse.org/r/#/c/138031/
2.        Total memory usage - if several threads run at once, they can all allocate large buffers at the same time, so the total memory usage can exceed the limit.

3.        We may need to use some of the following tools to track memory usage. Ideally we could track usage from our Jenkins builds, but I don't know how to do that.

1.        -verbose:gc

2.        Jconsole - for Oracle / OpenJDK VMs

3.        IBM Healthcenter - for IBM VMs

4.        Reduce -Xmx until parsing fails, then analyze the resulting heap dump to find where MAT is using the most memory
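On the total-memory point, one possible mitigation could be a shared budget that parser threads draw from before allocating big buffers, so their combined allocations stay under a limit. A minimal sketch, assuming a megabyte-granularity budget (the class name and numbers here are hypothetical, not MAT code):

```java
import java.util.concurrent.Semaphore;

// Hypothetical sketch of bounding total buffer memory across parser
// threads: a shared Semaphore whose permits represent megabytes of
// budget. A thread blocks in acquireBuffer() until enough budget is
// free, so the sum of outstanding buffers never exceeds the limit.
final class BufferBudget {
    private final Semaphore permits;

    BufferBudget(int totalMb) {
        permits = new Semaphore(totalMb);
    }

    /** Blocks until 'mb' megabytes of budget are free, then allocates. */
    byte[] acquireBuffer(int mb) {
        permits.acquireUninterruptibly(mb);
        return new byte[mb * 1024 * 1024];
    }

    /** Returns 'mb' megabytes of budget once the buffer is done with. */
    void release(int mb) {
        permits.release(mb);
    }
}
```

The trade-off is that threads waiting on budget sit idle, so the limit would need to be generous enough that parallelism still pays off.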


I think Krum is working out a plan to merge the patches; I haven't used Gerrit with a complex set of patch sets like this, so I don't know how it works. As the current set of patches has been approved for the CQ, it might be clearer to merge those and have additional changes as patch sets on top. The additional changes shouldn't be as big, so we shouldn't pass the 1000-line limit for external contributions (and changes by committers don't count towards the limit).


Regards,


Andrew





From:        Jason Koch <jkoch@xxxxxxxxxxx>
To:        Memory Analyzer Dev list <mat-dev@xxxxxxxxxxx>
Date:        16/04/2019 18:06
Subject:        Re: [mat-dev] Parallel parsing
Sent by:        mat-dev-bounces@xxxxxxxxxxx




Hi Krum & Andrew

I noticed most of the patch set has comments now, and I think I have responded to the majority of them. My sense is that the overall direction is positive, with the bulk of the concern around the API compatibility changes.

Can you let me know how you'd like to proceed?

I can either bundle up a new set of changes over the top, or I can try to rework the commits in basically the order they were presented. (Or another order?).

Also, I understood from the docs and prior emails with Andrew that API compatibility issues should be enabled and flagged; however, when I used the Eclipse refactoring to extract classes, it did not seem to treat this as an issue.

Is there a concrete, straightforward way to verify API compatibility between two branches from an Eclipse perspective?

Thanks
Jason

On Thu, Mar 14, 2019 at 3:18 AM Tsvetkov, Krum <krum.tsvetkov@xxxxxxx> wrote:
Hi,

>> This is an exciting contribution, please take a look.
First of all - @Jason thanks a lot for the contribution! It is definitely an exciting one!

>> I estimate there are about 3000 lines of changes (though some are lines just moved to new files), so probably needs to go through the large contribution process.
I tried to clarify the process part, and I don't think we need a CQ. Formally, the process is required for contributions (single changes) of more than 1000 lines. Also, its purpose is to help in case of doubts that code may have been copied inappropriately from somewhere. Jason, you've done a really great job splitting the changes into small, easy-to-review commits! I went quickly through the changes and saw that the three bigger ones are actually moving MAT's existing code around. The smaller ones were also very MAT-specific, so I have no concerns about improperly copied code.
Therefore, I think from the process perspective we are good to go (without a CQ), and we may start reviewing/merging the pieces.

About the content - from what I've seen so far, the most critical part is indeed the moved index builders, as this is an API-breaking change. We'll have to figure out a reasonable way to deal with it.
I guess sooner or later we'll have to go to a 2.0 version of MAT and clean up the APIs, but I wouldn't mix such an activity with the current contribution.

I guess we’ll need some more time for the reviews, so please bear with us. I nevertheless wanted to thank Jason for the change and give an update on the process part.

Regards
Krum




Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU

_______________________________________________
mat-dev mailing list
mat-dev@xxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://www.eclipse.org/mailman/listinfo/mat-dev


