Community
Participate
Working Groups
Build ID: M20080221-1800 Steps To Reproduce: Two WorkspaceJobs in two worker threads: 1. Job (Thread 1) has the org.eclipse.core.internal.resources.WorkManager.lock and does a Project.delete() which calls Resource.delete().In that method the WorkManager.lock is temporarily released and deleteUnprotected() is executed w/o that lock. 2. The second Job (Thread 2) was waiting to acquire the Workmanager.lock and can obtain it now. In its Workspace.endOperation() it starts to broadcast changes and NotificationManager.notify() calls Workspace.setTreeLocked() (so its field Workspace.treeLocked contains now thread 2). 3. In the meantime in Thread 1 runs and does also a broadcast of the Project deletion, this is also done in the NotificationManager.notify() and it also calls setTreeLocked(), effectively setting the field Workspace.treeLocked to thread 1. 4. Now in Thread 2 all calls to Workspace.isTreeLocked() give the _wrong_ result false. The latter leads to Listeners trying to use Workspace.run(): -> see bug 212951. I believe this bug in the isTreeLocked/setTreeLocked implementation is the real cause for the deadlock reported in bug 212951. More information: Related bug: 29624 this is where the current behavior of isTreeLocked() is introduced.
Created attachment 95700 [details] Patch to org.eclipse.core.internal.resources.Workspace Hi, I made a patch that fixes the race on Workspace.treeLocked. It would allow only one thread at a time to own the treeLock. Sending notifications from the NotificationManager.notify() method will be serialized by the patch, if notify() is called with lockTree==true. The patch has been created with "diff -U5 Workspace.java.orig Workspace.java" against eclipse 3.3.2, buildId M20080221-1800, plug-in org.eclipse.core.resources (3.3.1.R33x_v20080205). Best Regards, Holger.
Created attachment 95701 [details] mylyn/context/zip
Thanks Holger. I will look at the patch. Is it only 3.3 issue? I'm just curious what is the performance impact when the patch is applied.
I think the bug is that we're broadcasting PRE_DELETE from the unprotected code. Szymon, I suggest pulling these events out of unprotectedDelete and unprotectedMove to run before we release the lock and call these methods. The above patch introduces a new kind of lock which creates a deadlock risk (thread A holding workManager.lock and waits on treeLockMonitor, thread B holding treeLockManager and waits for workManager.lock).
Hi John, I think there is actually no deadlock risk because a thread that holds the Workspace.treeLocked wont wait on the WorkManager.lock (because isTreeLocked() will return true, and WorkManager.checkIn() will fail before trying to acquire the lock.) But indeed I just wasnt sure if the Workspace.treeLocked is guarded by the WorkManager.lock. If that is the case, moving the notification out of the unprotected part should be the better solution, I agree. BTW: Maybe we could comment somehow the fact that the Workspace.treeLocked is guarded by the WorkManager.lock? Or even better: Could we make fail Workspace.setTreeLocked() if the WorkManager.lock is not held? Best, Holger.
I noticed that this problem concerns also the move operation. So the fix should touch both unprotectedDelete and unprotectedMove. (In reply to comment #5) > BTW: Maybe we could comment somehow the fact that the Workspace.treeLocked is > guarded by the WorkManager.lock? Or even better: Could we make fail > Workspace.setTreeLocked() if the WorkManager.lock is not held? Workspace.setTreeLocked() should not be called when the thread doesn't hold the WorkManager.lock. Such an exception would help us to find such problems in the future quicker.
Would it help if I tried to provide that fix as a patch?
Sure. I will appreciate it.
Created attachment 96625 [details] patch to move notification out of unprotectedDelete/Move The patch moves the broadcastEvent out of the unprotectedDelete() / unprotectedMove() methods in Resource. Also Workspace.setTreeLocked complains if someone wants to lock the tree without owning the WorkManager.lock. Initially I wanted to throw an exception here, but in Project.close() line 138 there is a call to broadcastEvents _before_ the prepareOperation (this seems to be intentionally, i.e. with a source comment). So I didnt change that one, but only logged the wrong access to setTreeLocked() instead of throwing an exception. I still believe that Project.close() is wrong, because in IResourceChangeEvent it is stated that the Workspace is closed during PRE_CLOSE events. And the ILifecycleEvent.PROJECT_PRE_CLOSE triggers an IResorceChangeEvent with type PRE_CLOSE in the NotificationManager. The patch itself is done in org\eclipse\core\internal with "diff -U5" and contains changes to the five files: events/NotificationManager.java resources/Resource.java resources/Workspace.java utils/Messages.java utils/messages.properties Best regards, Holger.
Created attachment 96626 [details] mylyn/context/zip
Hi Szymon, I am raising the priority of the problem since it might be the root cause of bug 212951 and it is important for us to have the deadlock in bug 212951 solved. Best Regards, Kiril
Ok. I'm on it.
(In reply to comment #9) > The patch itself is done in org\eclipse\core\internal with "diff -U5" and > contains changes to the five files: > events/NotificationManager.java > resources/Resource.java > resources/Workspace.java > utils/Messages.java > utils/messages.properties Why didn't you create the patch using Eclipse?
Because it cannot create patches (it is the team provider who could) the only thing available to me was "apply patch", so I had to do it this way. (It would be nice to have this, though... :-)
(In reply to comment #14) > Because it cannot create patches (it is the team provider who could) the only > thing available to me was "apply patch", so I had to do it this way. (It would > be nice to have this, though... :-) > I'm confused. You can't see "Create patch..." item? Is your org.eclipse.core.resources project shared with the Eclipse cvs?
Nope, I didnt see it. I imported my project from the target platform (In the plug-ins view "Import As" -> "Source project").
(In reply to comment #16) > Nope, I didnt see it. I imported my project from the target platform (In the > plug-ins view "Import As" -> "Source project"). > I see. You can create a patch using Eclipse. You should add Eclipse cvs to the repos view (:pserver:dev.eclipse.org:/cvsroot/eclipse anonymous), then check out org.eclipse.core.resources. Now when the project is shared and you do some changes in it, you can create a patch against the HEAD in the repository. It would help me, if you did it. And would be useful in the future, when you have more patches :-)
I am sorry, but I cannot access cvs from behind the firewall here.... :-(
Created attachment 98850 [details] Fix Holger's fix tidied up with tests. I think that a runtime exception (assertion) is better to warn about the workspace lock issue.
Adding John as RC1 committer/reviewer.
Created attachment 99083 [details] Fix v02 Two minor updates: - Added a sleep in the test case so that it consistently fails without the fix - There was an unnecessary arraycopy in Resource.broadcastPreDeleteEvent
+1 on updated patch. Also need to add Holger to the copyrights.
Adding the "greatbug" keyword... thank you Jan and Holger for the excellent analysis of the bug and the fix!
Holger, any special requirements about your copyrights entry? Is "Holger Oehm (holger.oehm@sap.com) - Bug 226264" ok?
Slightly modified Fix v02 (+ copyright, + comments tidied up) release to HEAD. Thanks all for working on this bug.
Hi Szymon, yes, I think thats ok. I dont think there are any special requirements. I am glad I could help. Thank you for mentioning it.
I hope to see you contributing to other core.resources stuff soon :-)
Verified by code inspection.