Bug 447963 - Random problem on project deletion
Summary: Random problem on project deletion
Status: CLOSED FIXED
Alias: None
Product: Sirius
Classification: Modeling
Component: Core (show other bugs)
Version: 1.0.1   Edit
Hardware: PC Linux
: P3 normal (vote)
Target Milestone: 2.0.0   Edit
Assignee: Project inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords: triaged
Depends on:
Blocks: 453037
  Show dependency tree
 
Reported: 2014-10-20 12:43 EDT by Laurent Redor CLA
Modified: 2014-11-24 09:08 EST (History)
2 users (show)

See Also:


Attachments
First possible stack on Sirius 1.0.1 (2.98 KB, text/plain)
2014-10-20 12:45 EDT, Laurent Redor CLA
no flags Details
Second possible stack on Sirius 1.0.1 (2.30 KB, text/plain)
2014-10-20 12:45 EDT, Laurent Redor CLA
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Laurent Redor CLA 2014-10-20 12:43:50 EDT
Steps to reproduce:

* Import project https://bugs.eclipse.org/bugs/attachment.cgi?id=247959 of bug 437528
* Open the session (double click on the file testReconnect.aird)
* Open the diagram "copyPaste"
* Delete the project (use "Delete" contextual menu on the project in the Model Explorer view).
Comment 1 Laurent Redor CLA 2014-10-20 12:45:01 EDT
Under Sirius 1.0.1, there is sometimes problems like these:
* stack1.txt
* stack2.txt

First analysis:
* The ViewpointRegistryImpl is notified of the project deletion (and of the VSM deletion). The stack is something like:
** ViewpointRegistryImpl.resourceChanged(IResourceChangeEvent)
** DAnalysisSessionImpl.modelerDesciptionFilesLoaded()
** DAnalysisSessionImpl.notifyListeners(int)
** DDiagramEditorImpl.notify(int)
** DDiagramEditorImpl.modelerDescriptionFilesLoaded()
** Launch a refresh job
* On an other side, the DAnalysisSessionImpl is also notified of the project deletion (and of the VSM deletion)
** EditingSessionWorkspaceListener.resourceChanged(IResourceChangeEvent)
** ResourceSyncClientNotifier.schedule()
** ResourceSetSync.statusChanged(Resource, ResourceStatus, ResourceStatus)
** ResourceSetSync.statusesChanged(Collection<ResourceStatusChange>)
** ResourceSetSync.notifyClientsInBatch(Collection<ResourceStatusChange>)
** DAnalysisSessionImpl.statusesChanged(Collection<ResourceStatusChange>)
** DAnalysisSessionImpl.processAction(Action, Resource, IProgressMonitor)
** DAnalysisSessionImpl.close()

If the session is closed before the refresh is launched, this can explain the problem.
Comment 2 Laurent Redor CLA 2014-10-20 12:45:31 EDT
Created attachment 248024 [details]
First possible stack on Sirius 1.0.1
Comment 3 Laurent Redor CLA 2014-10-20 12:45:58 EDT
Created attachment 248025 [details]
Second possible stack on Sirius 1.0.1
Comment 4 Laurent Redor CLA 2014-10-20 13:16:49 EDT
Under Sirius 2.0.0, the problem seems different (and more critical) with a freeze of the popup "Delete Resources".

1. The UI thread (where the Delete is called) launches the delete in other thread
2. This other thread notifies EditingSessionWorkspaceListener of the workspace changes
3. In this thread, the notifications are send to listener
4. This launches a Session.close() -> EditingSession.close() -> closeEditor()
5. closeEditor() calls DDiagramEditPart.deactivate(), which deactivates its children recursivly
6. In, at least DNodeContainerEditPart.deactivate(), a resolveSemantcElement is done, which launches a TED.runExclusive(), which try to acquire the UI lock for the transaction

At least one difference between Sirius 1.0.1 and Sirius 2.0.0 is in the step 2 where the job ResourceSyncClientNotifier is launched with run() method instead of schedule method(). This change has been made in commit [1] for bug 447042.

[1] http://git.eclipse.org/c/sirius/org.eclipse.sirius.git/commit/?id=b75beb3a39bbbca132ca3cbd71e3af4d2d79fe27
Comment 5 Laurent Redor CLA 2014-10-21 05:11:13 EDT
In the previous comment step 6, I said the "which try to acquire the UI lock for the transaction" but it is perhaps not exact. 

Corresponding stack:
Thread [ModalContext] (Suspended)	
	owns: TransactionImpl  (id=129)	
	waiting for: Semaphore  (id=128)	
	Object.wait(long) line: not available [native method]	
	Semaphore.acquire(long) line: 39	
	OrderedLock.doAcquire(Semaphore, long) line: 176	
	OrderedLock.acquire(long) line: 110	
	OrderedLock.acquire() line: 84	
	Lock.uiSafeAcquire(boolean) line: 357	
	TransactionalEditingDomainImpl.acquire(InternalTransaction) line: 580	
	TransactionalEditingDomainImpl.activate(InternalTransaction) line: 508	
	TransactionImpl.start() line: 204	
	TransactionalEditingDomainImpl.startTransaction(boolean, Map<?,?>) line: 424	
	TransactionalEditingDomainImpl.runExclusive(Runnable) line: 321	
	DNodeContainerEditPart(GraphicalEditPart).resolveSemanticElement() line: 740	
	DiagramElementEditPartOperation.resolveDiagramElement(IDiagramElementEditPart) line: 102	
	DNodeContainerEditPart(AbstractBorderedDiagramElementEditPart).resolveDiagramElement() line: 140	
	DiagramElementEditPartOperation.resolveAllSemanticElements(IDiagramElementEditPart) line: 354	
	DNodeContainerEditPart(AbstractBorderedDiagramElementEditPart).resolveAllSemanticElements() line: 131	
	DiagramElementEditPartOperation.deactivate(IDiagramElementEditPart) line: 476	
	DNodeContainerEditPart(AbstractBorderedDiagramElementEditPart).deactivate() line: 194	
	DDiagramEditPart(AbstractEditPart).deactivate() line: 293	
	DDiagramEditPart(AbstractGraphicalEditPart).deactivate() line: 354	
	DDiagramEditPart(GraphicalEditPart).deactivate() line: 315	
	DDiagramEditPart(AbstractDDiagramEditPart).deactivate() line: 401	
	DDiagramEditPart.deactivate() line: 95	
	DiagramDialectUIServices.closeEditor(IEditorPart, boolean) line: 291	
	DialectUIManagerImpl.closeEditor(IEditorPart, boolean) line: 200	
	EditingSession.closeEditor(DialectEditor, boolean) line: 215	
	EditingSession.closeEditors(boolean, Collection<DialectEditor>) line: 198	
	EditingSession.close(boolean) line: 272	
	EditingSession.close() line: 264	
	SessionUIManagerImpl.cleanUISessions() line: 152	
	SessionUIManagerImpl.notify(Session, int) line: 140	
	SessionManagerImpl.notifyUpdatedSession(Session, int) line: 157	
	SessionManagerImpl.access$1(SessionManagerImpl, Session, int) line: 155	
	SessionManagerImpl$1.notify(int) line: 135	
	DAnalysisSessionImpl.notifyListeners(int) line: 1461	
	DAnalysisSessionImpl.close(IProgressMonitor) line: 1906	
	DAnalysisSessionImpl.processAction(ReloadingPolicy$Action, Resource, IProgressMonitor) line: 1651	
	DAnalysisSessionImpl.statusesChanged(Collection<ResourceStatusChange>) line: 1559	
	ResourceSetSync.notifyClientsInBatch(Collection<ResourceStatusChange>) line: 382	
	ResourceSetSync.statusesChanged(Collection<ResourceStatusChange>) line: 431	
	ResourceSyncClientNotifier.run(IProgressMonitor) line: 76	
	EditingSessionWorkspaceListener.resourceChanged(IResourceChangeEvent) line: 60	
	NotificationManager$1.run() line: 291	
	SafeRunner.run(ISafeRunnable) line: 42	
	NotificationManager.notify(ResourceChangeListenerList$ListenerEntry[], IResourceChangeEvent, boolean) line: 285	
	NotificationManager.broadcastChanges(ElementTree, ResourceChangeEvent, boolean) line: 149	
	Workspace.broadcastPostChange() line: 395	
	Workspace.checkpoint(boolean) line: 557	
	PerformChangeOperation$1.run(IProgressMonitor) line: 263	
	Workspace.run(IWorkspaceRunnable, ISchedulingRule, int, IProgressMonitor) line: 2344	
	UIPerformChangeOperation(PerformChangeOperation).executeChange(IProgressMonitor) line: 306	
	UIPerformChangeOperation.executeChange(IProgressMonitor) line: 92	
	UIPerformChangeOperation(PerformChangeOperation).run(IProgressMonitor) line: 218	
	Workspace.run(IWorkspaceRunnable, ISchedulingRule, int, IProgressMonitor) line: 2344	
	WorkbenchRunnableAdapter.run(IProgressMonitor) line: 87	
	ModalContext$ModalContextThread.run() line: 121
Comment 6 Cedric Brun CLA 2014-10-21 05:44:03 EDT
Pushing the analysis further :

the "update" of the VSM  (actually a deletion here) triggers refreshes in the opened representations.
the "deletion of the session files" trigger the closing of the session and the editing session. It is not trigerred by listening to the Session events themselves but by checking that "we have an editing session which is opened whereas the corresponding session is closed". Both logics are being triggered by the same user action as the VSM is within the modeling project.

org.eclipse.sirius.diagram.ui.tools.internal.editor.DDiagramEditorImpl.modelerDescriptionFilesLoaded()
is being called from the ModalContext.
org.eclipse.sirius.business.internal.session.danalysis.DAnalysisSessionImpl.statusesChanged(Collection<ResourceStatusChange>)
is being called from the ModalContext too. 

Before the change made through commit [1] for bug 447042 it would have came from a job context, hence no deadlock but no control neither upon what we decide to do based on these events and random stack traces would pop up in the error log.

As the workspace-events (model is deleted, VSM is deleted) are coming to the session instance through different means (there are two workspace listeners) it makes it quite hard to "coordinate" what must be done based on these events and we did not even before as we can see with the 1.0.1 stack traces. In thise case at least we would rather know about the "session related events" before taking care of the "ViewpointRegistry" related events.

I can see at least two problems :
1°) VSM related events might come whereas we have "queued" Workspace events (which are actually coming from the same user operation.) which, if we had those, would mean we would know there is nothing to do besides closing the session. 
2°) The editor is trying to access model data even when *we know* the session is closed. We could reduce this by listening to  org.eclipse.sirius.business.api.session.SessionListener.CLOSING and make sure 
the editor is getting closed before the underlying session is completely closed under the rationale that closing an editor will likely trigger some model browsing to detach listeners (just like you saw with DNodeContainerEditPart.deactivate()). We can listen to org.eclipse.sirius.business.api.session.SessionListener.CLOSING to do that. 

Doing 2°) prevents the deadlock yet a "Refresh" job still might be there in zombie mode before it got triggered by the VSM reloading (which in this case is even a VSM deleting). I would argue that 2°) should be done anyway. Waiting for the session to be *completely closed* before closing the editors looks like asking for troubles.

Now for 1°) which is the root of the problem here, we can either figure out a way to use a single workspace listener and then have consistent batches of events, or  or have some kind of guarantee that VSM related events will *always* wait for workspace related events to be processed first, or make sure whatever happens the Refresh job will neither fail nor wait in limbo.


[1] http://git.eclipse.org/c/sirius/org.eclipse.sirius.git/commit/?id=b75beb3a39bbbca132ca3cbd71e3af4d2d79fe27
Comment 7 Cedric Brun CLA 2014-10-21 07:52:10 EDT
https://git.eclipse.org/r/35207 is a proposed change which makes the whole editor and EditingSession more robust to such situations. 

As soon as the session enters the CLOSING state, the editing session will close the editors. This might lead to a "wait" of a RefreshJob.join() which got started when the VSM event got processed. Once this job is finished, the editor is closed, and the session can get back to closing itself.
Comment 8 Laurent Redor CLA 2014-10-22 06:21:57 EDT
The fix proposed by Cédric has been merged on master: http://git.eclipse.org/c/sirius/org.eclipse.sirius.git/commit/?id=d3ba1579a38d99395c837166cdb024bf2f5ec19e
Comment 9 Laurent Redor CLA 2014-10-24 09:17:18 EDT
Verified with Sirius 2.0.0-S20141024-041312
Comment 10 Pierre-Charles David CLA 2014-10-27 06:52:45 EDT
Available in Sirius 2.0.0.
Comment 11 Pierre-Charles David CLA 2014-11-24 09:08:04 EST
Cloned as bug 453037 to backport for 1.0.2.