Re: [mat-dev] EclipseCon 2020 Memory Analyzer Zoom session

Re: [mat-dev] EclipseCon 2020 Memory Analyzer Zoom session - today, 20 October 2020

From: Andrew Johnson <andrew_johnson@xxxxxxxxxx>
Date: Thu, 29 Oct 2020 10:53:29 +0000
Delivered-to: mat-dev@xxxxxxxxxxx

Jason,

Unfortunately there is no recording. EclipseCon requires that BoF sessions are not recorded:
"IMPORTANT: Please do not record the session. Recording requires written permission from all attendees, and we are not able to obtain that."

I used Zoom to give the talk, generally presenting a Memory Analyzer window as I was talking because there were no slides.

It was mainly to show the latest features and to ask for feedback - we get about 200,000 downloads a year, and just a few questions on the forum
https://www.eclipse.org/forums/index.php/f/186/
a few open bugs (of which the majority were opened by me or other committers)
https://bugs.eclipse.org/bugs/buglist.cgi?bug_status=UNCONFIRMED&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&order=Importance&product=MAT&query_format=advanced
and some questions on StackOverflow
https://stackoverflow.com/questions/tagged/eclipse-memory-analyzer+or+eclipse-mat

We don't really know how people use the tool, how they can learn how to use it, whether they refer to the help or tutorials, and what new features are required.

In terms of the new features:

Eclipse Memory Analyzer version 1.10 was released earlier this year with the following features:

Parallel parsing of HPROF for improved performance
- Thank you, Jason, for this feature. I couldn't really show this one in the short time for the talk, but the parallel parsing happens after the initial scanning phase of HPROF, in pass 2, which generates outbound references and assembles them for writing to the indexes. It's a nice feature though. I've also tweaked the parallel object marking. A CPU monitor such as Intel's Extreme Tuning Utility lets you see what happens. I don't often see all 12 threads on my machine in use during a parse, but sometimes it hits the thermal limit. [Thinking about it, there might be a way to do the initial scan with a thread processing each 4GB segment, but perhaps a huge array could overflow its stated segment size and mess things up.]
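
To make the bracketed per-segment idea concrete, here is a minimal Python sketch. It is not MAT code. It assumes the usual HPROF layout (a NUL-terminated version string, a u4 identifier size, a u8 timestamp, then records of u1 tag, u4 time, u4 length, body) and the function names index_heap_segments, scan_segment and scan_in_parallel are invented for the example. It trusts each record's stated u4 length, which is exactly where a huge array overflowing its segment would break the walk.

```python
import struct
from concurrent.futures import ThreadPoolExecutor

HEAP_DUMP = 0x0C          # single heap dump record
HEAP_DUMP_SEGMENT = 0x1C  # one segment of a split heap dump


def index_heap_segments(buf):
    """Return (body_offset, body_length) for each top-level heap record."""
    pos = buf.index(b"\0") + 1  # skip the NUL-terminated version string
    pos += 4 + 8                # u4 identifier size, u8 timestamp
    segments = []
    while pos + 9 <= len(buf):
        tag, _time, length = struct.unpack_from(">BII", buf, pos)
        pos += 9
        if pos + length > len(buf):
            # The stated length cannot be trusted (truncated or overflowed).
            raise ValueError("record at offset %d overruns the file" % (pos - 9))
        if tag in (HEAP_DUMP, HEAP_DUMP_SEGMENT):
            segments.append((pos, length))
        pos += length
    return segments


def scan_segment(buf, offset, length):
    # Stand-in for real per-segment parsing: report the first sub-record tag.
    return (offset, buf[offset] if length else None)


def scan_in_parallel(buf, workers=4):
    """Index the segments once, then hand each one to a worker thread."""
    segments = index_heap_segments(buf)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda seg: scan_segment(buf, *seg), segments))
```

The threads here only show how the work could be split; the sequential walk over record headers is still needed first to find where each segment starts.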

Direct reading of Gzipped HPROF files to reduce disk space
- This works, and I showed it for a small file, but there is a trade-off between disk space, memory space and CPU. GZip has quite a lot of state (32kB), and for random access the code maintains cached Gzipped readers for different parts of a file. For a random seek, the reader at the current location is returned to the cache; the cached reader that most closely precedes the seek point is found and duplicated (using a modified version of Gzip, so that a copy is kept for the original location), and the duplicate is used to read forward to the required location.
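
As a rough illustration of the cached-reader approach described above, here is a minimal Python sketch. It is not MAT's Java code. It uses the standard zlib module, whose decompression objects have a copy() method, so no modified Gzip is needed here. The class name CheckpointedGzip and the interval and chunk parameters are invented for the example.

```python
import gzip
import zlib
from bisect import bisect_right


class CheckpointedGzip:
    """Random access into gzip data using saved decompressor snapshots."""

    def __init__(self, compressed, interval=64 * 1024, chunk=1024):
        self._compressed = compressed
        self._chunk = chunk
        self._starts = []       # uncompressed offset of each checkpoint
        self._checkpoints = []  # (compressed offset, decompressor snapshot)
        d = zlib.decompressobj(zlib.MAX_WBITS | 16)  # expect gzip framing
        in_pos = out_pos = next_mark = 0
        while in_pos < len(compressed):
            if out_pos >= next_mark:
                # Save the state; each snapshot carries the 32kB window.
                self._starts.append(out_pos)
                self._checkpoints.append((in_pos, d.copy()))
                next_mark = out_pos + interval
            out_pos += len(d.decompress(compressed[in_pos:in_pos + chunk]))
            in_pos += chunk
        self.length = out_pos   # total uncompressed size

    def read(self, offset, size):
        """Return up to size uncompressed bytes starting at offset."""
        i = bisect_right(self._starts, offset) - 1  # nearest preceding checkpoint
        in_pos, snapshot = self._checkpoints[i]
        d = snapshot.copy()     # duplicate, so the cached snapshot stays reusable
        out_pos = self._starts[i]
        end = min(offset + size, self.length)
        result = bytearray()
        while out_pos < end and in_pos < len(self._compressed):
            block = d.decompress(self._compressed[in_pos:in_pos + self._chunk])
            in_pos += self._chunk
            # Keep only the part of this block that overlaps [offset, end).
            result += block[max(offset - out_pos, 0):end - out_pos]
            out_pos += len(block)
        return bytes(result)


if __name__ == "__main__":
    data = b"".join(str(i * i).encode() for i in range(50000))
    reader = CheckpointedGzip(gzip.compress(data))
    print(reader.read(100000, 16) == data[100000:100016])
```

A smaller interval means faster seeks but more snapshots held in memory, which is the same disk, memory and CPU trade-off mentioned above.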

Improved OQL for more complex queries
- OQL always gave results of two types, an object list as a tree or a table with columns. You can now do more complex queries, including processing an OQL table result in another select. List flattening expands out arrays / lists in a column of a row to multiple rows.
See https://wiki.eclipse.org/MemoryAnalyzer/OQL
It gets complicated, though. If you know SQL, then the independent MAT/Calcite plugin extension might be useful.
https://github.com/vlsi/mat-calcite-plugin
"While MAT does have a query language, it does NOT allow to join, sort and group results. MAT Calcite plugin allows all the typical SQL operations (joins, filters, group by, order by, etc)"
It looks like there is some recent activity, and it's an open source project so if you like it then head over there and offer to help.

Eclipse Memory Analyzer version 1.11 is planned for December 2020, and we would appreciate comment on the following proposed features:
https://projects.eclipse.org/projects/tools.mat/releases/1.11.0/bugs

Improved comparison queries - including leak suspects report by comparing two snapshots
- This is a medium-sized enhancement. The compare tables query from the compare basket just operated on tables. I've added trees, and now dominator trees can be compared properly. There's also some quite complicated code using heuristics to try to match duplicate keys from the tables/trees. The main gap is not being able to use the context menu on table/tree entries from other snapshots (not the current snapshot). It's a bit tricky to see how to solve this one, as context menus assume they are running with the current snapshot. Perhaps ContextProviders (which is how some queries offer a choice before they run, such as Table1/Table2, or Objects/Dominated objects for Immediate dominators) could help. They could provide an ISnapshot or, perhaps better, an IQueryContext. I haven't worked this out though. But - who uses comparisons?
The leak suspects report by snapshot comparison could be useful - but does it work in practice to find slow leaks / small changes in memory etc.?

Handling of huge heap dumps (including >2^31 objects) by randomly discarding some objects
- Raising the 2^31 object limit would be a major incompatible change. I wondered whether just discarding some objects on the initial parse would work, so I implemented it for the HPROF and DTFJ parsers. Does this work? Can people still find memory leaks in huge (>200GB) dumps?
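
To show the general shape of the discard idea, here is a minimal Python sketch. It is not the HPROF or DTFJ parser code. It assumes the keep/discard decision is derived from the object address so that it is repeatable across parser passes, and that references to discarded objects are simply dropped; both are assumptions for illustration, and the names keep_object and sample_heap are invented.

```python
import random


def keep_object(address, keep_percent, seed=1):
    """Repeatable pseudo-random keep/discard decision for one object."""
    # Seeding from the address gives the same answer on every pass.
    rng = random.Random(address * 1000003 + seed)
    return rng.randrange(100) < keep_percent


def sample_heap(objects, keep_percent):
    """objects maps an address to the list of addresses it references.

    Returns the same structure with discarded objects removed and
    references to them dropped.
    """
    kept = {addr for addr in objects if keep_object(addr, keep_percent)}
    return {addr: [ref for ref in objects[addr] if ref in kept]
            for addr in kept}


if __name__ == "__main__":
    heap = {a: [a + 1, a + 2] for a in range(1000)}
    sampled = sample_heap(heap, 50)
    print(len(heap), "objects before,", len(sampled), "after")
```

Dropping references along with objects changes reachability and retained sizes, which is why the question of whether leaks can still be found in the reduced dump matters.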

Dark theme for reports
- Just for fun really, and some neat CSS to change the mode for reports, but unfortunately the Internet Explorer browser on Windows can't invert the colours of images automatically. It works better on Linux.

More links in HTML reports
- Just about anything I can think of has a link to continue the analysis but sometimes the set of objects is too complex to express as a small MAT URL.

MAT is internationalized, but with just a partial Japanese translation on Babel. https://babel.eclipse.org/babel/ Do people want to use MAT in their native languages? Would it reduce service calls if customers could use MAT in their own language? There are some Japanese and Simplified Chinese translations from IBM for MAT 1.4.

Other ideas - tens of thousands of people install MAT into Eclipse. Would better integration with JDT be worthwhile? What would that mean? At the moment there is just a link from classes in MAT to the source in JDT. I've added support for taking the list of JVMs from JDT and using it to configure the MAT acquire dump. MAT then uses that configuration to find running JVMs on the local machine, not just any running JDT debug processes.

One concern that came up was minimising memory use and that a 16GB dump couldn't be processed on a 16GB machine. Batch parsing on a big machine could help, but is building the dominator tree the huge part of the parse?

Give the snapshot builds a go, try out the new features, and let us know what you think.

Andrew Johnson

From: Jason Koch <jkoch@xxxxxxxxxxx>
To: Memory Analyzer Dev list <mat-dev@xxxxxxxxxxx>
Date: 28/10/2020 20:37
Subject: [EXTERNAL] Re: [mat-dev] EclipseCon 2020 Memory Analyzer Zoom session - today, 20 October 2020
Sent by: mat-dev-bounces@xxxxxxxxxxx

Hi Andrew

Unfortunately I missed this. Is there a recording available? I'd be interested to view it.

Thanks!
Jason

On Tue, Oct 20, 2020 at 7:14 AM Andrew Johnson <andrew_johnson@xxxxxxxxxx> wrote:
I have volunteered to host today a DIY birds-of-a-feather session about Eclipse Memory Analyzer at the virtual EclipseCon 2020.

Please come along if you are able.

https://www.eclipsecon.org/2020

You would need to register for the conference, but it is free.

Eclipse Memory Analyzer future developments
Tuesday, October 20, 2020

7:00 PM to 8:00 PM Europe/Berlin
6:00 PM to 7:00 PM Europe/London
1:00 PM to 2:00 PM America/New York

Room 3
DIY Session (BoF)
Information

Eclipse Memory Analyzer is a tool to solve Java OutOfMemoryErrors by finding the cause of memory leaks and showing how some of the heap is wasted. It operates off-line from the target process by using heap dump files.

We would appreciate comments on what you do like, what you don't, and on the proposed new features.

Eclipse Memory Analyzer version 1.10 was released earlier this year with the following features:
Parallel parsing of HPROF for improved performance
Direct reading of Gzipped HPROF files to reduce disk space
Improved OQL for more complex queries

Eclipse Memory Analyzer version 1.11 is planned for December 2020, and we would appreciate comment on the following proposed features:
Improved comparison queries - including leak suspects report by comparing two snapshots
Handling of huge heap dumps (including >2^31 objects) by randomly discarding some objects
Dark theme for reports
More links in HTML reports

Andrew Johnson
