Re: [platform-core-dev] [perf] org.eclipse.core.runtime.Path (computeSegments) : is cache possible ?

Hi,

thanks for your answer.

> My feeling is you are looking too far down the stack.
> If there are 110GB of Path strings being allocated, it likely means something stupid is being done much higher up the call chain and it is exploding outwards to these hotspots at the leaf level.
> We could possibly optimize Path further, although adding caching of course means a  speed/space tradeoff. But optimizing at the leaf level often just masks a deeper algorithmic problem higher up. 
> I suggest you dig into the details much higher up the chain. Which builder is taking all the time, what phase of the build, etc.

Yes, you are absolutely right. I analysed it yesterday, and the cause is org.eclipse.core.internal.resources.AliasManager$2.compare combined with the hierarchical layout of the project: I had forgotten one parent project in the workspace.
More details are in the original thread: http://dev.eclipse.org/mhonarc/lists/platform-core-dev/msg01707.html
But I think this problem affects many projects. It is very easy to reproduce:

1) Unpack the test project (attached)
2) Import parentProject/childProject into the workspace
3) Build -> clean all
4) Result: no invocations of AliasManager$2.compare ($2.compare is the anonymous comparator of org.eclipse.core.internal.resources.AliasManager.getComparator())
5) Import parentProject/ into the workspace
6) Now AliasManager$2.compare runs many times, comparing:
parentProject/childProject/bin
parentProject/childProject/bin/org/eclipse/test
parentProject/childProject
parentProject/childProject/bin/org/eclipse
parentProject
parentProject/childProject/bin/org/eclipse
...

I can reproduce this with any hierarchical project, including the Eclipse sources, for example with the project that contains AliasManager:
workspace/eclipse.platform.resources (root directory)
workspace/org.eclipse.core.resources (eclipse.platform.resources/bundles/org.eclipse.core.resources)
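
To make the condition in both examples explicit: the trigger is one project location nested inside another, so the same files are reachable through two projects. The following is only a minimal sketch of how such nested locations could be detected; it is not the AliasManager code, and the class and method names (OverlapDemo, findNested) are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of Eclipse: reports nested project locations. */
public class OverlapDemo {

    /** Returns "outer contains inner" for every location nested inside another one. */
    public static List<String> findNested(List<String> locations) {
        List<String> result = new ArrayList<>();
        for (String outer : locations) {
            for (String inner : locations) {
                Path o = Paths.get(outer).normalize();
                Path i = Paths.get(inner).normalize();
                // Path.startsWith compares whole name elements, so
                // "parentProject2" is not treated as nested in "parentProject".
                if (!o.equals(i) && i.startsWith(o)) {
                    result.add(outer + " contains " + inner);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Only the child imported: nothing overlaps.
        System.out.println(findNested(Arrays.asList("parentProject/childProject")));
        // Parent imported as well: the child's files are now reachable twice.
        System.out.println(findNested(Arrays.asList(
                "parentProject/childProject", "parentProject")));
    }
}
```

With only the child imported the list is empty; once the parent is imported too, the child location is reported as nested, which matches the point in the steps above where the comparator starts being invoked.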

Is this a known feature or a bug?


> Some common examples of problems that could cause excess path creation:
> - traversals of the resource tree. There is a heavily optimized IResourceProxyVisitor class that will only instantiate Path object lazily when requested, which can make a huge difference. Of course minimizing/avoiding deep traversals altogether is even better.
> - Resource changes occurring outside of builds or other batching operations, which triggers excess resource change events

Yes, this is the case when Maven runs from the command line or in the background, followed by a refresh: this is the performance problem with m2e/Maven builds.

> - Build cycles that cause builders to build and then have to throw away and build again (especially if you have a builder that generates Java source code)

Regarding the last two cases (changes occurring outside of builds, and a builder that generates Java source code): I still think a cache or some other improvement could be relevant here.
Builds are very often "file stable": they build or generate the same file structure almost every time, and the modifications happen inside those files (editing .java files, the normal work of a Java developer).
What do you think?

Regards,

Martin


2013/11/13 John Arthorne <John_Arthorne@xxxxxxxxxx>
My feeling is you are looking too far down the stack. If there are 110GB of Path strings being allocated, it likely means something stupid is being done much higher up the call chain and it is exploding outwards to these hotspots at the leaf level.

We could possibly optimize Path further, although adding caching of course means a speed/space tradeoff. But optimizing at the leaf level often just masks a deeper algorithmic problem higher up.

I suggest you dig into the details much higher up the chain. Which builder is taking all the time, what phase of the build, etc. Some common examples of problems that could cause excess path creation:

- traversals of the resource tree. There is a heavily optimized IResourceProxyVisitor class that will only instantiate Path object lazily when requested, which can make a huge difference. Of course minimizing/avoiding deep traversals altogether is even better.
- Resource changes occurring outside of builds or other batching operations, which triggers excess resource change events
- Build cycles that cause builders to build and then have to throw away and build again (especially if you have a builder that generates Java source code)
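
The lazy-path idea from the first item can be illustrated with a minimal, self-contained sketch. It deliberately does not use the Eclipse API: Node, Proxy, Visitor and accept below are hypothetical stand-ins, and the only point shown is that a full path string is built when a visitor asks for it rather than for every node visited.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of a proxy-style traversal; not the Eclipse implementation. */
public class LazyPathDemo {

    /** Simple tree node that stores only its own name and its children. */
    public static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        public Node(String name, Node... kids) {
            this.name = name;
            children.addAll(Arrays.asList(kids));
        }
    }

    /** One reusable proxy per traversal; it tracks the names from the root down. */
    public static final class Proxy {
        private final List<String> names = new ArrayList<>();
        private int pathsBuilt = 0;

        /** Cheap: no path object or string is created. */
        public String getName() { return names.get(names.size() - 1); }

        /** Expensive: the full path is assembled only on request. */
        public String requestFullPath() {
            pathsBuilt++;
            return "/" + String.join("/", names);
        }

        public int pathsBuilt() { return pathsBuilt; }
    }

    /** Return false from visit to skip the children of the current node. */
    public interface Visitor { boolean visit(Proxy proxy); }

    public static void accept(Node node, Proxy proxy, Visitor visitor) {
        proxy.names.add(node.name);
        if (visitor.visit(proxy)) {
            for (Node child : node.children) accept(child, proxy, visitor);
        }
        proxy.names.remove(proxy.names.size() - 1);
    }

    public static void main(String[] args) {
        Node root = new Node("parentProject",
                new Node("childProject",
                        new Node("bin", new Node("Test.class")),
                        new Node("src", new Node("Test.java"))));
        Proxy proxy = new Proxy();
        List<String> sources = new ArrayList<>();
        accept(root, proxy, p -> {
            if (p.getName().endsWith(".java")) sources.add(p.requestFullPath());
            return !p.getName().equals("bin"); // do not descend into output folders
        });
        System.out.println(sources + " pathsBuilt=" + proxy.pathsBuilt());
    }
}
```

In this sketch five nodes are visited but only one full path is assembled, and the bin subtree is skipped altogether.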

John




From:        Martin Kočí <martin.kocicak.koci@xxxxxxxxx>
To:        "Eclipse Platform Core component developers list." <platform-core-dev@xxxxxxxxxxx>,
Date:        11/12/2013 02:52 PM
Subject:        [platform-core-dev] [perf] org.eclipse.core.runtime.Path (computeSegments) : is cache possible ?
Sent by:        platform-core-dev-bounces@xxxxxxxxxxx




Hi,

here is some background for this performance problem: http://dev.eclipse.org/mhonarc/lists/platform-core-dev/msg01707.html (topic: [perf] AbstractDataTreeNode.simplifyWithParent creates 100 mil instances during one build)

I've eliminated m2e and Maven, and now I have a 15-minute build in Eclipse.

The use case is: Build -> Clean -> Clean all projects.

Most allocations come from org.eclipse.core.runtime.Path.computeSegments and from org.eclipse.core.runtime.Path instances themselves: a total of 110GB (!) was allocated from org.eclipse.core.runtime.Path classes.

My question is: is a cache for Path, for segments, or for both possible? Has this already been discussed?

A path is an immutable string, e.g. "/a/immutable/string".
As a quick proof of concept I've implemented a cache in org.eclipse.core.runtime.Path.computeSegments (the key is the String path) and for some invocations of new Path() (via a new method createPath(path)). With this patch the build time drops to 8 minutes, roughly 2x faster.
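
The patch itself is not included in this message. Purely as an illustration of the idea, a segment cache keyed by the path string might look like the sketch below. The class SegmentCache and both of its methods are hypothetical and self-contained; the splitting logic is a simplified stand-in, not the real org.eclipse.core.runtime.Path code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical sketch of a segment cache; not the actual patch. */
public class SegmentCache {

    // Key: the path string. Value: its segments, shared between all callers.
    private static final ConcurrentMap<String, String[]> CACHE = new ConcurrentHashMap<>();

    /** Simplified stand-in: splits a '/'-separated path into non-empty segments. */
    public static String[] computeSegments(String path) {
        List<String> segments = new ArrayList<>();
        int start = 0;
        for (int i = 0; i <= path.length(); i++) {
            if (i == path.length() || path.charAt(i) == '/') {
                if (i > start) segments.add(path.substring(start, i));
                start = i + 1;
            }
        }
        return segments.toArray(new String[0]);
    }

    /** Returns the cached array; callers must treat it as read-only. */
    public static String[] cachedSegments(String path) {
        return CACHE.computeIfAbsent(path, SegmentCache::computeSegments);
    }

    public static void main(String[] args) {
        String[] first = cachedSegments("/parentProject/childProject/bin");
        String[] second = cachedSegments("/parentProject/childProject/bin");
        System.out.println(String.join(",", first));
        System.out.println(first == second); // same array instance on the second call
    }
}
```

Note that this map is unbounded and hands out shared arrays, so it only makes sense if callers never modify the result and the set of distinct paths stays small enough to keep in memory.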

Thank you for your answers


Martin

_______________________________________________
platform-core-dev mailing list
platform-core-dev@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/platform-core-dev




Attachment: parentProject.tar.gz
Description: GNU Zip compressed data

