[
Date Prev][
Date Next][
Thread Prev][
Thread Next][
Date Index][
Thread Index]
[
List Home]
Re: [mat-dev] Parallel parsing
|
Jason,
Yes, I've completed
my initial reviews, and I am positive about the changes, and the code is
clean and looks good.
Meanwhile we did
submit the changes for a Contribution Questionnaire (CQ 19463) as it was
a largish contribution. As this is a legal process it is only viewable
by Eclipse committers, but there were no issues raised and it was approved,
so from an Eclipse legal process point of view we can go forward.
In terms of the
reviewing the code my main issues were:
- API compatibility
This isn't flagged by the Eclipse refactor unless you have an API baseline
set up using Preferences > Plug-in Development > API Baselines
https://wiki.eclipse.org/MemoryAnalyzer/Contributor_Reference#Configure_API_Tooling_Baseline
You then get errors on code such as "The member type org.eclipse.mat.parser.index.IndexWriter.IntArray1NWriter
in org.eclipse.mat.parser_1.8.1 has been removed"
I think we can fix at lot of this by moving classes back to the original
locations (or at least stubs) and increasing the version number to allow
the addition of classes/methods.
- Memory
Usage - we don't want to break the parsing of huge dumps on large but not
huge machines.
- Large
object arrays - currently the parser didn't have to allocate an array as
big as in the original dump. With the parallel parsing the code allocates
a long array with as many outbound refs as the object array has in the
original dump, which will be one more than the size of the array (as there
is a class ref too).
https://git.eclipse.org/r/#/c/138031/
- Total
memory usage - if several threads are running at once, can all the threads
each allocate large buffers so that the total memory usage is increased
beyond the limit.
- We may
need to use some of the following tools to track memory usage. Ideally
we could track usage from our Jenkins builds, but I don't know how to do
that.
- -verbose:gc
- Jconsole
- for Oracle / OpenJDK VMs
- IBM Healthcenter
- for IBM VMs
- Reduce
-Xmx until parsing fails, then analyze the resulting heap dump to find
where MAT is using the most memory
I think Krum is
working out a plan to merge the patches, I haven't used Gerrit with a complex
set of patch sets like this to know how it works. As the current set of
patches have been approved for the CQ it might be clearer to merge those,
and have additional changes as patch sets on top. The additional changes
shouldn't be as big, so we shouldn't pass the 1000 line limit for external
contributions (and changes by committers don't count towards the limit).
Regards,
Andrew
From:
Jason
Koch <jkoch@xxxxxxxxxxx>
To:
Memory
Analyzer Dev list <mat-dev@xxxxxxxxxxx>
Date:
16/04/2019
18:06
Subject:
Re:
[mat-dev] Parallel parsing
Sent
by: mat-dev-bounces@xxxxxxxxxxx
Hi Krum & Andrew
I noticed most of the patch set has some
comments now. I think I have responded to the majority of them now, and
my belief is the overall direction is positive with the bulk of the concern
around the API compatibility changes.
Can you let me know how you'd like to
proceed?
I can either bundle up a new set of changes
over the top, or I can try to rework the commits in basically the order
they were presented. (Or another order?).
Also, I did understand from the docs
and prior emails with Andrew that I should have API compat issues enabled
and flagged, however when I used Eclipse refactor to extract classes it
seems to have not treated this as an issue.
Is there a concrete / straightforward
way that I can verify the API compatibility from an eclipse perspective
between two branches?
Thanks
Jason
On Thu, Mar 14, 2019 at 3:18 AM Tsvetkov,
Krum <krum.tsvetkov@xxxxxxx>
wrote:
Hi,
>>
This is an exciting contribution, please take a look.
First of all - @Jason thanks a lot for
the contribution! It is definitely an exciting one!
>> I
estimate there are about 3000 lines of changes (though some are lines just
moved to new files), so probably needs to go through the large contribution
process.
I tried to clarify the process part and
I don’t think we need a CQ. Formally, the process is required for contributions
(single changes) of above 1000 lines. Also, its purpose is to help in case
of doubts that code may be copied inappropriately from somewhere. Jason,
you’ve done a really great job in splitting the changes in small, easy
to review commits! I went quickly through the changes and saw the three
bigger ones are actually moving MAT’s existing coding around. The smaller
ones were also very MAT specific, I have no concerns about unproperly copied
code.
Therefore, I think from the process perspective
we are good to go (without a CQ), and we may start reviewing/merging the
pieces.
About the content - from what I’ve seen
so far, the most critical part is indeed the moved Index builders, as this
is an API breaking change. We’ll have to figure out a reasonable way to
deal with it.
I guess sooner or later we’ll have to
go to a 2.0 version of MAT and clean up the APIs, but I wouldn’t mix such
activity with the current contribution.
I guess we’ll need some more time for
the reviews, so please bear with us. I nevertheless wanted to thank Jason
for the change and give an update on the process part.
Regards
Krum
_______________________________________________
mat-dev mailing list
mat-dev@xxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe
from this list, visit
https://www.eclipse.org/mailman/listinfo/mat-dev
_______________________________________________
mat-dev mailing list
mat-dev@xxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe
from this list, visit
https://www.eclipse.org/mailman/listinfo/mat-dev
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
3AU