456324 – Reduce the scope of the cross-references maintained by Sirius

Bug 456324 - Reduce the scope of the cross-references maintained by Sirius

Summary: Reduce the scope of the cross-references maintained by Sirius

Status:	NEW

Alias:	None

Product:	Sirius
Classification:	Modeling
Component:	Core (show other bugs)
Version:	1.0.0
Hardware:	All All

Importance:	P2 normal (vote)
Target Milestone:	---
Assignee:	Project inbox
QA Contact:

URL:
Whiteboard:
Keywords:	triaged

Depends on:
Blocks:	456350
	Show dependency tree

Reported:	2014-12-30 05:06 EST by Pierre-Charles David
Modified:	2015-12-15 04:11 EST (History)
CC List:	3 users (show)

See Also:	Gerrit Change Git Commit Gerrit Change Git Commit

Attachments
Traces of inverse reference requests during the Sirius automated tests (25.54 KB, text/x-log) 2015-02-23 11:59 EST, Pierre-Charles David	no flags	Details
Traces containing classes of keys added into the inverse reference map during the Sirius automated tests (9.34 KB, text/x-log) 2015-03-19 05:14 EDT, Mickael LANOE	no flags	Details
View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Pierre-Charles David

2014-12-30 05:06:13 EST

Currently Sirius installs an ECrossReferencesAdapter on all the models in the session (technically a LazyCrossReferenceAdapter but it inherits from ECrossReferencesAdapter). This is costly in several ways:
- ECrossReferencesAdapter installs itself on whole hierarchies of containment, which prevents partial loading even if the storage backend supports it.
- It maintains potentially huge map(s) of EObjects in memory.
- It is yet another adapter installed on each EObject, which is notified of every model change.

We can certainly do better by analyzing the actual requirements and providing an optimized implementation for these and only these. We may need to change the contract to allow for more efficient operation in the common cases, with explicit opt-in needed for more complex (and costly) services.

Comment 1 Pierre-Charles David

2014-12-30 11:39:29 EST

A first observation: at the highest level, what we actually need is: a fast (O(1)) way to answer the question: "give me all the elements in the session (or another smaller scope) who reference this specific EObject".

A second observation is that in a Sirius session we can distinguish 3 kinds of resources:
1. Representation resources (aird).
2. VSM resources referenced by the representations (odesign).
3. Semantic resources (user-controlled).

We mostly control the metamodels of the first two (with a few exception: feature extensions, GMF Notation models, contributed dialects), so we should be able to specify exactly where this kind of information is needed, and either:
* add EOpposites in the metamodels to make these inverse references navigable and maintained directly by EMF. This has often bad consequences in terms of modularity so it should be a last resort.
* customize the cross-reference adapter to only maintain inverse references for the cases which are actually needed.

In the case of VSMs, maybe a further optimisation could be done by noting that they are immutable when loaded in a session. They only "change" when a new version is re-loaded from disk, so maybe a post-load computation is possible without the cost of adding adapters to every element.

In the case of semantic models, we need full coverage by default as we know nothing about them and need the information to ensure consistency. However we should provide APIs/extensions points to allow client code to provide custom implementation optimized for their case.

Comment 2 Pierre-Charles David

2015-01-12 08:09:41 EST

See https://git.eclipse.org/r/39390 for a first draft of a debug action to analyze and report what kinds of inter-resource references exist in a session.

Comment 3 Pierre-Charles David

2015-02-23 03:29:34 EST

Running the full JUnit test suite (which may not cover all or our code) with an instrumented version of the session's ECrossReferenceAdater give these numbers:
* calls concern 85 distinct types (EClass) of arguments (i.e. the types of elements "x" when asking "who points to x"), only considering arguments from Sirius's own metamodels and ignoring calls on semantic elements.
* the results of these calls concern 103 distinct EReferences from the Sirius metamodels (again, ignoring the EReferences from the semantic metamodels).
* a quick grep on our ecores gives an estimation of a little more than 500 EReferences defined in our metamodels.

More analysis is required, but it looks like we could avoid maintaining inverse references for a large part of the references in the models from Sirius itself. We'll also need to do some memory usage measures to evaluate how much this would allow us to gain.

Comment 4 Pierre-Charles David

2015-02-23 11:59:10 EST

Created attachment 251030 [details]
Traces of inverse reference requests during the Sirius automated tests

The attached log was produced by instrumenting SessionLazyCrossReferencer.getInverseReferences(EObject object, boolean resolve) and then executing the complete Sirius automated test suites. The raw log show 949648 calls to the method, so the attached file is a reduced version produced by:

  cat *.log | grep "^### " | sort -u | grep -Ev '\((uml|ecore|component|migrationmodeler|interactions|docbook)'

The grep part ignores the call to getInverseReferences() where the "eobject" parameter is from one of the metamodels used as semantic models by our tests. Here we are just concerned with request for inverse references which point *to* elements from the VSMs and/or aird.

The resulting summary shows 246 distinct signatures, for example:

  ### getInverseReferences(description.TableCreationDescription): description.RepresentationElementMapping::detailDescriptions tool.ToolSection::ownedTools

Which means there was a request for references towards an instance of TableCreationDescription, which return references to it from two distinct EReferenced.

Further analysis:

  % cat xrefs_traces.log | cut -d':' -f1 | sort -u | wc -l
  69

So it looks like only 69 distinct types (from Sirius, in the context of our tests) are ever subject to a getInverseReferences() query.

Further,

  % cat xrefs_traces.log | cut -d':' -f2- | sed -e 's/ /\n/g' | sort -u | wc -l
  58

Shows that there are only 58 distinct kinds of references pointing to these in the results.

Obviously this is just a first guess based on what is executed in our tests, but in this particular scope it looks like we could get away with keeping track of only these ~60 references (in the VSMs and aird) and produce the exact same results.

Comment 5 Mickael LANOE

2015-03-19 05:14:03 EDT

Created attachment 251731 [details]
Traces containing classes of keys added into the inverse reference map during the Sirius automated tests

The file "inverse-map-key-classes.log" contains classes of keys added into the inverse reference map during the Sirius automated tests.

There are thow parts into the file:

- The first part contains unused classes.
- The second part contains used classes tagged by the call to methods getNonNavigableInverseReferences(EObject, boolean) and getInverseReferences(EObject, boolean).

Some classes from "description" packages of Sirius meta-models are not used and may not be added into the inverse reference map.

Comment 6 Eclipse Genie

2015-03-19 11:33:35 EDT

New Gerrit change created: https://git.eclipse.org/r/44185

Comment 7 Eclipse Genie

2015-03-20 04:31:19 EDT

Gerrit change https://git.eclipse.org/r/44185 was merged to [master].
Commit: http://git.eclipse.org/c/sirius/org.eclipse.sirius.git/commit/?id=96f3c8c7604968d4fcfdb82c27cd8d19c38bf838

Comment 8 Cedric Brun

2015-03-20 04:36:10 EDT

> * customize the cross-reference adapter to only maintain inverse references
> for the cases which are actually needed.
> 

What about a dedicated EAnnotation on the EReference to express it shoul not be cross-referenced ? That way it could be used for Semantic Models too without having to introduce an explicit coupling to Sirius. BTW I could tool that support directly in EcoreTools quite easily, all I need is a constant key.

Comment 9 Eclipse Genie

2015-03-20 04:59:33 EDT

New Gerrit change created: https://git.eclipse.org/r/44227

Comment 10 Pierre-Charles David

2015-03-20 05:10:16 EDT

(In reply to Cedric Brun from comment #8)
> > * customize the cross-reference adapter to only maintain inverse references
> > for the cases which are actually needed.
> > 
> 
> What about a dedicated EAnnotation on the EReference to express it shoul not
> be cross-referenced ? That way it could be used for Semantic Models too
> without having to introduce an explicit coupling to Sirius. BTW I could tool
> that support directly in EcoreTools quite easily, all I need is a constant
> key.

I'd rather have a runtime-configurable blacklist of EReferences, instead of a static hard-coded version. Probably relying on a specific session option (see bug #456326).

The current implicit contract of the semantic cross-referencer is that is cross-references everything, so even if we prove that Sirius itself does not need some EReferences to be maintained, there may be existing client code (e.g. Capella) which relies on this information to be available in their context, so we can not hard-code the fact that we remove information that existed before.

Now that is not incompatible with putting information in EAnnotations, which could be used to determine the default value of the blacklist, but it should be configurable so that clients which need inverse cross-references we don't do not have to install their own separate ECrossReferenceAdapter to get them.

So the idea would be:
* Determine which EReferences in our metamodels take significant space in the ECrossReferenceAdapter *and* are not needed by Sirius itself.
* Mark these with a new EAnnotation indicating they are not to be cross-referenced by default, and document the EAnnotation so that clients can also use it if they want.
* Provide an API to compute the default "blacklist" using this statically defined information.
* Make our ECrossReferenceAdapter configurable using such a Set<EReference> as a blacklist of EReferences to ignore.
* Add a session option so that clients can specify an explicit blacklist. If none is provided, the default value used would be computed using the static information from the API mentioned above. Otherwise client code can use that API to get what would be the default, and add/remove other stuff if they want.

Comment 11 Eclipse Genie

2015-03-20 05:42:05 EDT

Gerrit change https://git.eclipse.org/r/44227 was merged to [master].
Commit: http://git.eclipse.org/c/sirius/org.eclipse.sirius.git/commit/?id=5ab975e97b3a304c0216a825fbc921fc2a148f77

Comment 12 Mickael LANOE

2015-03-20 05:58:20 EDT

Here below are unused map entries (EClass) for which no code requested inverse references for this big modeling projects:

https://github.com/cbrun/org.eclipse.ecoretools.performance/tree/master/plugins/org.eclipse.emf.ecoretools.design.tests.perf/projects/reverse-sirius-10times

Unused map entries sorted by number of references:

org.eclipse.sirius.diagram.description.AdditionalLayer <- 21484 refs
org.eclipse.sirius.diagram.description.NodeMapping <- 13013 refs
org.eclipse.sirius.diagram.description.DiagramDescription <- 10743 refs
org.eclipse.sirius.diagram.description.Layer <- 10742 refs
org.eclipse.sirius.diagram.description.style.WorkspaceImageDescription <- 10739 refs
org.eclipse.sirius.diagram.description.style.BundledImageDescription <- 2262 refs
org.eclipse.sirius.diagram.description.ContainerMapping <- 696 refs
org.eclipse.sirius.diagram.description.style.FlatContainerStyleDescription <- 653 refs
org.eclipse.sirius.viewpoint.description.SemanticBasedDecoration <- 605 refs
org.eclipse.sirius.diagram.description.style.EdgeStyleDescription <- 471 refs
org.eclipse.sirius.diagram.description.EdgeMapping <- 467 refs
org.eclipse.sirius.viewpoint.description.SystemColor <- 90 refs
org.eclipse.sirius.viewpoint.description.UserFixedColor <- 24 refs
org.eclipse.sirius.diagram.description.tool.DirectEditLabel <- 15 refs
org.eclipse.sirius.diagram.description.tool.ContainerDropDescription <- 11 refs
org.eclipse.sirius.diagram.description.style.CenterLabelStyleDescription <- 6 refs
org.eclipse.sirius.diagram.description.tool.DeleteElementDescription <- 3 refs
org.eclipse.sirius.table.metamodel.table.description.ElementColumnMapping <- 3 refs
org.eclipse.sirius.table.metamodel.table.description.CrossTableDescription <- 2 refs
org.eclipse.sirius.table.metamodel.table.description.FeatureColumnMapping <- 2 refs
org.eclipse.sirius.viewpoint.description.tool.PasteDescription <- 2 refs
org.eclipse.sirius.viewpoint.description.style.LabelBorderStyleDescription <- 2 refs
org.eclipse.sirius.diagram.description.style.BeginLabelStyleDescription <- 2 refs
org.eclipse.sirius.viewpoint.description.Viewpoint <- 2 refs
org.eclipse.sirius.table.metamodel.table.description.EditionTableDescription <- 2 refs
org.eclipse.sirius.table.metamodel.table.description.IntersectionMapping <- 2 refs
org.eclipse.sirius.viewpoint.DRepresentationContainer <- 2 refs
org.eclipse.sirius.viewpoint.description.SytemColorsPalette <- 1 refs

28 map entries and 72046 references to these entries are stored and unused in the cross referencer for this big modeling project.

Warning: the trace available here https://bugs.eclipse.org/bugs/attachment.cgi?id=251731 was used to determine that an entry is used or not. 
This trace was generated from Sirius automated tests which offers a good code coverage, but does not guarantee that an entry will never be used.

Comment 13 Pierre-Charles David

2015-04-20 09:13:01 EDT

More measures and analyses would be required to evaluate precisely the effective gain we can expect. We would also need to add an "compatibility mode" for existing client code which may assume that the cross-referencer from Sirius indexes everything. Moving to 3.1.

Comment 14 Pierre-Charles David

2015-04-29 11:06:11 EDT

FWIW, the patch used to produce the traces attached in comment 4 is at https://git.eclipse.org/r/#/c/46679/

Comment 15 Pierre-Charles David

2015-06-23 10:07:01 EDT

From the initial work on this, the benefits look less important than expected, so moving defering to later with reduced priority.

Comment 16 Pierre-Charles David

2015-12-15 04:11:01 EST

Moving out of the 4.0 scope for now, along with all the other issues which were there "by default". This does not mean some of these will not be re-integrated at some point, but for now these issues are not part of the roadmap for 4.0.

If you feel strongly about this removal from 4.0 and/or are ready to sponsor the corresponding work, feel free to comment.