Re: [mdt-papyrus.dev] Cross-reference index in di-files?

Hi Christian,

thanks a lot for your reply! I added a few comments inline below.

On Mon, Nov 28, 2016 at 2:22 PM, Christian Damus <give.a.damus@xxxxxxxxx> wrote:
Hi, Philip,

We had, in early releases, editor window layout information in the *.di file that caused headaches of constant churn and conflicts in that file in team source control (note that it is still an option to do this, which is useful when sending models to others with diagrams already open to show them).  This was a problem because the editor layout is a manifestly personal thing, user-specific information that normally should not be shared.

Yes, I agree. That caused/causes some headaches. The editor window layout information still seems to be updated if it exists, which we had to take care of lately. Currently, our customization of EMFCompare/EGit has special handling that ignores conflicts and changes to the editor window layout information in the di-files and always keeps the workspace version.
 
What you propose is not like this, so that is good.  However, I think I would prefer not to see this index in the *.di file mostly because it is derived information.  It is like the compiled *.class files and other resources that the Eclipse workspace treats specially:  it is computed from the information already in source control and so keeping it there is redundant.  But also this *.di resource can contain non-derived user data, so it cannot be excluded from source control (some users may like to) and it cannot be somehow treated as partially derived. And what are the potential pitfalls for team scenarios?  I can imagine cases where merge accidents corrupt these indices, for example, if done without the EMF Compare tooling to manage it carefully.

Agreed, adding the index to the di-files will require specific handling in the team tools, such as EMFCompare/EGit. There are definitely some issues that could arise when blindly merging the index, but I believe that we could put specific handling in place that makes sure that the merged index is indeed correct after merging. Still, this remains a disadvantage, as it means extra work in the team components.

It also has the disadvantage that, for instance, adding a cross-reference from model A to B would change the di-file of B. So B.di would have to be included in the commit even though the user only changed A. This might be unexpected to users.
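
To illustrate the effect described above, here is a minimal sketch in Python. It assumes, hypothetically, that each di-file would persist both the outgoing and the incoming cross-references of its model. The function names and the dictionary layout are made up for illustration and are not the Papyrus index format.

```python
from collections import defaultdict


def per_file_index(references):
    """Build the index each di-file would store (hypothetical layout).

    `references` is an iterable of (source_model, target_model) pairs.
    """
    index = defaultdict(lambda: {"outgoing": set(), "incoming": set()})
    for source, target in references:
        index[source]["outgoing"].add(target)
        index[target]["incoming"].add(source)
    return dict(index)


def changed_di_files(before, after):
    """Return the models whose stored index differs between two states."""
    old, new = per_file_index(before), per_file_index(after)
    empty = {"outgoing": set(), "incoming": set()}
    return sorted(
        model
        for model in old.keys() | new.keys()
        if old.get(model, empty) != new.get(model, empty)
    )


# The user edits only model A, adding a reference from A to B.
print(changed_di_files(before=[], after=[("A", "B")]))
```

Under this assumption both A and B are reported as changed, although the user only touched A, because B's stored list of incoming references is no longer the same.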
 
I would prefer to keep the index in the workspace metadata.  I don’t think that the start-up cost for a new workspace of indexing the models is so onerous, but I must admit that I haven’t ever had the opportunity to observe it on real-world models of very large scale, only with naïvely generated models.

I also don't think that the start-up costs for indexing are too problematic, mostly because the indexing can run asynchronously in the background, right? For the diff/merge scenario, however, users will have to wait until the indexing of the remote branch is finished, which is why storing the index in the di-files would indeed pay off for that scenario.
 

Will your alternative solution require adding to the workspace metadata an index of all of the model files in remote branches that the user has fetched?  Would this be so different to discovering an index that is stored within those same files?

My plan would be to compute the index, based on an extension of the OnDemandCrossReferenceIndex, for a particular remote branch on demand when the user requests a comparison or merge with that remote branch. Building the index a priori for all remote branches (or even commits) seems to be very costly for little value. Of course, once we have computed the index for a particular commit, we could store it in the workspace metadata for subsequent comparisons/merges with the same commit. But for a comparison/merge with a so far unindexed commit, the user would have to wait, as the indexing blocks the comparison/merge.
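
The on-demand scheme described above could be sketched roughly as follows. This is Python rather than the Java of the actual tooling, and `CommitIndexCache`, `fake_build` and the sample index content are hypothetical names and data. The in-memory dictionary only stands in for the workspace metadata, and the real indexing work is replaced by a placeholder.

```python
class CommitIndexCache:
    """Builds a cross-reference index per commit on demand and remembers it."""

    def __init__(self, build_index):
        # build_index: callable taking a commit id and returning its index.
        # It stands in for the expensive scan of a commit's model files.
        self._build_index = build_index
        self._by_commit = {}

    def index_for(self, commit_id):
        # Only the first request for a commit pays the indexing cost.
        if commit_id not in self._by_commit:
            self._by_commit[commit_id] = self._build_index(commit_id)
        return self._by_commit[commit_id]


build_calls = []


def fake_build(commit_id):
    # Placeholder for the real indexing work; records each invocation.
    build_calls.append(commit_id)
    return {"A.uml": {"B.uml"}}


cache = CommitIndexCache(fake_build)
cache.index_for("abc123")  # blocks while the index is built
cache.index_for("abc123")  # served from the cache
cache.index_for("def456")  # a so far unindexed commit has to be built
print(build_calls)
```

Keying the cache by commit rather than by branch name means an entry never has to be invalidated, since the content behind a commit id does not change.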

The implementation that is used in EMF Compare for this purpose currently takes up more than half of the total comparison/merge time, and it has been observed that this time can go up to 10+ minutes. So there definitely is a lot of potential to reduce the waiting time with the indexing mechanism in Papyrus. I haven't compared the cross-reference index of Papyrus with the model resolver in EMF Compare, but I'd assume for now that they would take roughly the same time when they have to build up the index from scratch. So if we only use the workspace index and index the remote branch on demand, we would save roughly half of the indexing time (assuming that the workspace version is already indexed), and with that about 25 % of the overall comparison time. Overall, 25 % is good. But we may reach up to 50 % if the remote branch were also already indexed, and probably 40 % if the index can just be read from a few files in the remote branch. Those numbers are of course only gut feelings and have not been carefully evaluated. ;)
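
The back-of-the-envelope estimate above can be written out as a small calculation. This sketch rounds "more than half" down to exactly one half and assumes that the workspace side and the remote side cost the same to index from scratch. The 0.4 used for reading a stored index is not a measurement; it is simply the value that reproduces the 40 % guess.

```python
def overall_saving(resolution_share, workspace_left, remote_left):
    """Fraction of the total comparison time that is saved.

    resolution_share: share of the total time spent resolving the logical model.
    workspace_left / remote_left: fraction of the original indexing cost that
    still has to be paid on each side (0.0 = already indexed, 1.0 = from scratch).
    Assumes both sides cost the same when indexed from scratch.
    """
    saved_workspace = 0.5 * (1.0 - workspace_left)
    saved_remote = 0.5 * (1.0 - remote_left)
    return resolution_share * (saved_workspace + saved_remote)


share = 0.5  # "more than half" in the mail, rounded down to one half here

# Workspace indexed, remote branch indexed on demand (about 25 %).
print(overall_saving(share, workspace_left=0.0, remote_left=1.0))
# Both sides already indexed (about 50 %).
print(overall_saving(share, workspace_left=0.0, remote_left=0.0))
# Remote index read from a few files; 0.4 is an assumed leftover cost (about 40 %).
print(overall_saving(share, workspace_left=0.0, remote_left=0.4))
```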
 

I hope that mine is not the only opinion on this subject.

Me too!

Thanks again Christian for your input!

Best wishes,

Philip
 

Cheers,

Christian

On 25 November, 2016 at 12:57:11, Philip Langer (planger@xxxxxxxxxxxxxxxxx) wrote:

Hi,

in the context of diff/merge with EMF Compare and EGit, I'm currently analyzing the potential usage of the new cross-reference index [1] for resolving the "logical model" that may span across many file resources.

So far things are looking good in my prototype for the workspace side. This new API is very well done and easy to integrate! However, the real challenge for me will obviously be the indexing on the remote/origin side in the repository storages. Thus, my next step will be to check whether I can extend the OnDemandCrossReferenceIndex so it'll be able to resolve the cross-references on other branches of the repository (i.e., not the workspace, but not-checked-out branches).

Before I continue with that, however, I wanted to get your opinion on another option. What do you think about storing the cross-reference index in the di-files? Instead of storing the index in the workspace .metadata (as it is now, I suppose), we could store it on save in the respective di-files.

This would make it faster when someone opens a model the first time in his/her workspace, as the index could be initialized from the di-files, and it would make it very easy to determine the logical model on remote/origin sides in the context of diff/merge with repository providers, such as EGit. Instead of building up an index on the not-checked-out branches, we could just obtain the di-files and read the cross-reference index from there.

Please let me know what you think and whether this could be an option. Because if yes, it might then be unnecessary to work on an extended CrossReferenceIndex implementation for not-checked-out branches.

Thanks a lot for your opinions and best wishes,

Philip


[1] https://wiki.eclipse.org/Papyrus/Neon.1_Work_Description/Improvements/Control-Mode#Model_Cross-Reference_Index

--
Philip Langer

Senior Software Architect / General Manager
EclipseSource Services GmbH
_______________________________________________
mdt-papyrus.dev mailing list
mdt-papyrus.dev@xxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.eclipse.org/mailman/listinfo/mdt-papyrus.dev
