[mat-dev] Parallel parsing

From: Andrew Johnson <andrew_johnson@xxxxxxxxxx>
Date: Fri, 19 Jul 2019 17:59:47 +0100

Now that we have released Memory Analyzer 1.9, it seems a good time to look at parallel parsing again.

I've applied the patches locally and then tried fixing the API compatibility errors.
The API baseline is set up using Preferences > Plug-in Development > API Baselines, and I used a Memory Analyzer 1.9 installation as the baseline.
https://wiki.eclipse.org/MemoryAnalyzer/Contributor_Reference#Configure_API_Tooling_Baseline

We have to fix the breaking changes.

IntIndexCollector
LongIndexCollector
move these back to org.eclipse.mat.parser.index.IndexWriter (or have a stub and change the new classes to package-private).
Also add back 'extends IntIndex<ArrayIntCompressed>' and 'extends LongIndex', and possibly merge properly and use the fields in the superclasses.

The type org.eclipse.mat.parser.io.PositionInputStream has been changed from a class to an interface

Put back PositionInputStream and add a new interface.
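
A minimal sketch of the "keep the class, add an interface" shape, under stated assumptions: PositionTracking, SketchPositionInputStream and PositionSketchDemo are hypothetical names for illustration, and the stand-in class is far simpler than the real org.eclipse.mat.parser.io.PositionInputStream. The point is only that the existing type stays a class (so the API baseline check no longer sees a class-to-interface change) while new code can depend on the interface.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical new interface (the name is an assumption): code that only
// needs the current offset can depend on this instead of the concrete class.
interface PositionTracking {
    long position();
}

// Simplified stand-in for the existing class. It remains a class, so
// existing subclasses and constructor calls keep compiling and linking.
class SketchPositionInputStream extends FilterInputStream implements PositionTracking {
    private long position;

    SketchPositionInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) position++;
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) position += n;
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(n);
        position += skipped;
        return skipped;
    }

    @Override
    public long position() {
        return position;
    }
}

final class PositionSketchDemo {
    // Reads up to `count` single bytes, then reports the offset via the interface.
    static long positionAfterReading(byte[] data, int count) {
        try (SketchPositionInputStream in =
                new SketchPositionInputStream(new ByteArrayInputStream(data))) {
            for (int i = 0; i < count && in.read() != -1; i++) {
                // the read in the loop condition advances the stream
            }
            PositionTracking tracker = in;
            return tracker.position();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Other stream implementations from the patches could then implement the same interface without touching the existing class.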

The other errors are:
Missing @since tag on org.eclipse.mat.parser.io.BufferingRafPositionInputStream
Missing @since tag on org.eclipse.mat.parser.io.ByteArrayPositionInputStream
Missing @since tag on removeInstanceBulk(int, long)
Missing @since tag on org.eclipse.mat.parser.io.DefaultPositionInputStream
The major version should be incremented in version 1.9.0, since API breakage occurred since version 1.9.0

We could fix these by increasing the version to 1.10. The possible problem with this is that we need to update the version early in the release cycle, and so head/master isn't suitable as a service stream. We could solve this using branches, but I would like to keep things simple as I don't have much time to spend on the project.

The other approach is to avoid all API changes. We move the PositionInputStream classes into o.e.mat.hprof, which is not an API project, so is easy to add to. We replace uses of removeInstanceBulk with a loop over removeInstance, and check it isn't a big performance problem. We then have the parallel parsing changes without any API changes. Later we can consider whether these classes are mature enough and useful enough to move to o.e.mat.parser and promote them to being APIs.
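
A rough sketch of the loop fallback, assuming removeInstanceBulk(int, long) takes an instance count and a per-instance heap size (the parameter meanings are not spelled out here, so that is a guess). InstanceCounter and BulkRemoval are hypothetical stand-ins, not MAT classes.

```java
// Hypothetical stand-in for the class that already exposes removeInstance.
class InstanceCounter {
    private int instanceCount;
    private long totalSize;

    InstanceCounter(int instanceCount, long totalSize) {
        this.instanceCount = instanceCount;
        this.totalSize = totalSize;
    }

    // Existing single-instance operation: one instance fewer, size reduced.
    void removeInstance(long heapSizePerInstance) {
        instanceCount--;
        totalSize -= heapSizePerInstance;
    }

    int getInstanceCount() {
        return instanceCount;
    }

    long getTotalSize() {
        return totalSize;
    }
}

final class BulkRemoval {
    private BulkRemoval() {
    }

    // Replacement for the bulk call that needs no new API method:
    // repeat the existing operation `count` times (assumed semantics).
    static void removeInstances(InstanceCounter counter, int count, long heapSizePerInstance) {
        for (int i = 0; i < count; i++) {
            counter.removeInstance(heapSizePerInstance);
        }
    }
}
```

The loop is linear in the number of removed instances, which is why the performance check is needed before settling on this approach.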

I have not applied multiple Gerrit patches before. I am not even sure of the right way of testing; I just merged the branches in order into my master. What is the best way - to modify existing patches with changes, which is more logical, but complicated (and I would need help), or to apply the existing changes, then create some more patches on top?

Any comments?

Andrew Johnson

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
