Community
Participate
Working Groups
Created attachment 265972 [details] projects to import to reproduce the problem - Import the attached projects (there is a Viewpoint Specification Projet and a Modeling project with a single empty Ecore) - Make sure the "Perfo Tree" viewpoint is enabled and create a new Tree representation on the EPackage - Right click in the tree on "Create Many EClasses" -> the code which is launched is in perfo.tree.design.Services.createMany(EPackage) and basically create 20K new EClasses each having 10 attributes, this should end up being a tree with 20K*10=>200K tree items. -> the UI thread is blocked and CPU goes crazy for a fairly long time, I gave up after 5 minutes No doubt there is n^n algorithm working its magic here. First hint: after a bit of digging, commenting out: willBeExpandedOnSelection(RefreshPlan) to always return false makes the whole tree to be created in about 10 seconds
I've started to analyze this further. Results so far: * CreatedTreeItem.willBeExpandedOnSelection() will invoke InternalTransaction.getChangeDescription().getObjectsToDetach() once for every CreatedTreeItem, even though the result is global to the whole transaction and its computation known to be relatively costly (see https://bugs.eclipse.org/bugs/show_bug.cgi?id=460206#c4). Its not a complex algorithm by itself, but it re-creates copies of the relevant collections on each invocation, so thousands of invocations which each will copy thousands of items start to accumulate. I have a patch to cache this result at the level of the refresh's GlobalContext so it is computed only once. * TreeUIUpdater.analyseExpansionStateOfCreatedChildren() will be invoked on every newly created element, and is recursive. I'm not sure *why* I saw it appear as much in the example, which is not very deep in terms of containment, but it looks like we can easily avoid some uneeded recursive calls on already handled elements by using the result of toExpands.add(dTreeItem). I'll post the patch too, but this one is more risky and would require more analysis to make sure this does not change the behavior. With the two patches above, and changing the action to create "only" 5000 classes with 5 attritubes, I was able to get a YourKit trace in reasonable time (around 1min I believe). The overall picture, focusing on the CreateToolItemAction.run() method: * 58% of the total time is spent in pre-commit. As expected ~99% of this is in DTreeRefresh.refresh(). The good news is that most of the time (~80%) is spent in expression evaluation, probably the "aql:self.eAllContents()->select(c | c.abstract)" semanticCandidateExpression. 1) it's out of scope here, and 2) using eAllContents() instead of eClassifiers creates a lot of churn building a collections with EClasses and their EAttributes, even though we don't care at all about EAttributes here. * 37% of the total time is spent in post-commit. Most of it is spent in SelectDRepresentationElementsListener.resourceSetChanged(ResourceSetChangeEvent). About 2/3 of that is spent in EcoreUtil.isAncestor() called from isViewWithNewSemanticTarget(). I'll have to analyze further if/how this can be optimized and report here my findings. Note that this part is not tree-specific, so any improvement here will benefit other dialects.
(In reply to Pierre-Charles David from comment #1) > * 58% of the total time is spent in pre-commit. As expected ~99% of this is > in DTreeRefresh.refresh(). The good news is that most of the time (~80%) is > spent in expression evaluation, probably the > "aql:self.eAllContents()->select(c | c.abstract)" > semanticCandidateExpression. 1) it's out of scope here, and 2) using > eAllContents() instead of eClassifiers creates a lot of churn building a > collections with EClasses and their EAttributes, even though we don't care > at all about EAttributes here. Just to confirm this hypothesis: on a scaled-down version of the example (5K EClasses × 5 EAttributes): * aql:self.eAllContents()->select(c | c.abstract): 2 943ms * aql:self.eClassifiers->select(c | c.abstract): 4ms I'll be using the optimized version for the rest of the analysis to avoid polluting the measures by unrelated overheads.
New Gerrit change created: https://git.eclipse.org/r/87865
New Gerrit change created: https://git.eclipse.org/r/87864
New Gerrit change created: https://git.eclipse.org/r/87866
I've pushed 3 separate patches on Gerrit. Here are the performance improvements of each. The baseline is current master (0448c14e4), using the attached sample project with 2 modifications: * Semantic Candidate Expression changed to: aql:self.eClassifiers->select(c | c.abstract) * Model size scaled down to 5k EClass with 5 EAttributes each. Using this, current master takes ~110s (1min 50s), measured manually from the time of the action invocation to the time where all the new elements are visible in the tree *and* the new root elements (the EClasses) are selected. * With https://git.eclipse.org/r/#/c/87864/ "Make sure getObjectsToDetach() is called only once": 38s * With https://git.eclipse.org/r/#/c/87865/ "Avoid recursive calls for already handled elements in TreeUIUpdater": 31s * With https://git.eclipse.org/r/#/c/87866/ "Invert costly test to improve performance in common cases": 26s So overall a speedup of 75% in this case. 26s in the last case is still a lot. I'll make a new YourKit analysis of where time goes once all patches are applied.
I'm not sure if this has any significant impact, but note that the measure above were made with "Link with Editor" disabled in the "Model Explorer" view.
With all the optimizations mentioned above enabled, the CreateToolImtemAction.run() time is spent like this: * 39% in the actual TreeDialectServices.refresh(), which happens in pre-commit. It looks like the way we handle refreshStyle() could be optimized for newly created elements: we first create and attach a blank style, and then refresh it. Because it's blank, the refresh will touch many of its features, and because it's already attached to the model, we pay the cost of the ChangeRecorder recording all of these changes. Not sure how much this would help overall. * 27% in SelectDRepresentationElementsListener.analyseNotifications(), most of which (18% of the total CreateToolImtemAction.run() time) is spent in EcoreUtil.isAncestor(). However CreateToolImtemAction.run() now only takes 26% of the total time of the operation (I'm ignoring the time spent by the UI in Thread.sleep()). * 24% is spent in AbstractTreeViewer.creatChildren(). This is JFace code called by us from TreeUIIpdaterRunnable.run() and there's probably not much we can do to make this faster. * 15% is spent in JFace code for the Model Content view (calls end up in SiriusCommonContentProvider). Calls to EcoreUtil.isAncestor() in SelectDRepresentationElementsListener.analyseNotifications() are now the single largest time wasters that we have some control on, but they only represent 6% of the total time of the operation. The picture may be different on other OSes where SWT/JFace are faster (I'm under Linux), but in the end creating dozens of thousands of SWT tree items will probably always take significant time. Going significantly faster would probably mean not creating/expanding/selecting the tree items corresponding to the newly created elements.
Gerrit change https://git.eclipse.org/r/87864 was merged to [master]. Commit: http://git.eclipse.org/c/sirius/org.eclipse.sirius.git/commit/?id=a78af68687923aa4f1f44966a25c571d41238038
Closing, even though not all candidate patches have been merged. The one that has been already significantly improves the situation.
The initial scenario is still KO with the 5.0.0.201705191401 Sirius version. After having performed the "Create Many EClasses" action, the UI Thread is blocked. I waited more than 5 minutes and I finally killed the process.
https://bugs.eclipse.org/bugs/show_bug.cgi?id=493241 has a patch that may be interesting for this. Or maybe not; I'm linking it for reference.
Created attachment 272613 [details] Updated sample project to reproduce the problem