Bug 246136 - deadlock between build notification and setRawClasspath
Summary: deadlock between build notification and setRawClasspath
Status: RESOLVED NOT_ECLIPSE
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Resources (show other bugs)
Version: 3.4   Edit
Hardware: All All
: P3 normal (vote)
Target Milestone: ---   Edit
Assignee: Platform-Resources-Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-09-03 15:20 EDT by Eugene Kuleshov CLA
Modified: 2008-09-30 13:58 EDT (History)
4 users (show)

See Also:


Attachments
full tread dump (35.33 KB, text/plain)
2008-09-03 15:20 EDT, Eugene Kuleshov CLA
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Eugene Kuleshov CLA 2008-09-03 15:20:08 EDT
The following thread dump shows the deadlock issue between build notification and setRawClasspath call. As per JavaProject.setRawClasspat, the caller does not require to acquire the lock, so in my understanding this supposed to be legal circumstances.

"Worker-7" prio=10 tid=0x0000000054907800 nid=0x35bd waiting for monitor entry [0x0000000044d11000..0x0000000044d11b10]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at org.eclipse.jdt.internal.core.ClasspathValidation.validate(ClasspathValidation.java:54)
	- waiting to lock <0x00002aaae3f7cc78> (a org.eclipse.jdt.internal.core.JavaModelManager$PerProjectInfo)
	at org.eclipse.jdt.internal.core.DeltaProcessor.resourceChanged(DeltaProcessor.java:1969)
	at org.eclipse.jdt.internal.core.DeltaProcessingState.resourceChanged(DeltaProcessingState.java:410)
	at org.eclipse.core.internal.events.NotificationManager$2.run(NotificationManager.java:288)
	at org.eclipse.core.runtime.SafeRunner.run(SafeRunner.java:37)
	at org.eclipse.core.internal.events.NotificationManager.notify(NotificationManager.java:282)
	at org.eclipse.core.internal.events.NotificationManager.broadcastChanges(NotificationManager.java:148)
	at org.eclipse.core.internal.resources.Workspace.broadcastBuildEvent(Workspace.java:297)
	at org.eclipse.core.internal.events.AutoBuildJob.doBuild(AutoBuildJob.java:136)
	at org.eclipse.core.internal.events.AutoBuildJob.run(AutoBuildJob.java:238)
	at org.eclipse.core.internal.jobs.Worker.run(Worker.java:55)

"Worker-5" prio=10 tid=0x00000000548f4c00 nid=0x35bb in Object.wait() [0x0000000044b0e000..0x0000000044b0fc10]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x00002aaae91aa340> (a org.eclipse.core.internal.jobs.Semaphore)
	at org.eclipse.core.internal.jobs.Semaphore.acquire(Semaphore.java:38)
	- locked <0x00002aaae91aa340> (a org.eclipse.core.internal.jobs.Semaphore)
	at org.eclipse.core.internal.jobs.OrderedLock.doAcquire(OrderedLock.java:169)
	at org.eclipse.core.internal.jobs.OrderedLock.acquire(OrderedLock.java:105)
	at org.eclipse.core.internal.jobs.OrderedLock.acquire(OrderedLock.java:82)
	at org.eclipse.core.internal.resources.WorkManager.checkIn(WorkManager.java:118)
	at org.eclipse.core.internal.resources.Workspace.prepareOperation(Workspace.java:1747)
	at org.eclipse.core.internal.resources.File.setContents(File.java:364)
	at org.eclipse.jdt.internal.core.JavaProject.setSharedProperty(JavaProject.java:2885)
	at org.eclipse.jdt.internal.core.JavaProject.writeFileEntries(JavaProject.java:2651)
	at org.eclipse.jdt.internal.core.JavaModelManager$7.run(JavaModelManager.java:1188)
	- locked <0x00002aaae3f7cc78> (a org.eclipse.jdt.internal.core.JavaModelManager$PerProjectInfo)
	at org.eclipse.core.internal.resources.Workspace.run(Workspace.java:1800)
	at org.eclipse.jdt.internal.core.JavaModelManager$PerProjectInfo.writeAndCacheClasspath(JavaModelManager.java:1182)
	at org.eclipse.jdt.internal.core.SetClasspathOperation.executeOperation(SetClasspathOperation.java:61)
	at org.eclipse.jdt.internal.core.JavaModelOperation.run(JavaModelOperation.java:709)
	at org.eclipse.core.internal.resources.Workspace.run(Workspace.java:1800)
	at org.eclipse.jdt.internal.core.JavaModelOperation.runOperation(JavaModelOperation.java:775)
	at org.eclipse.jdt.internal.core.JavaProject.setRawClasspath(JavaProject.java:2815)
	at org.eclipse.jdt.internal.core.JavaProject.setRawClasspath(JavaProject.java:2846)
	at org.maven.ide.eclipse.jdt.BuildPathManager.sourcesDownloaded(BuildPathManager.java:722)
	at org.maven.ide.eclipse.internal.project.MavenProjectManagerImpl.notifyDownloadSourceListeners(MavenProjectManagerImpl.java:823)
	at org.maven.ide.eclipse.internal.project.MavenProjectManagerImpl.downloadSources(MavenProjectManagerImpl.java:810)
	at org.maven.ide.eclipse.internal.project.MavenProjectManagerRefreshJob.run(MavenProjectManagerRefreshJob.java:93)
	at org.eclipse.core.internal.jobs.Worker.run(Worker.java:55)
Comment 1 Eugene Kuleshov CLA 2008-09-03 15:20:31 EDT
Created attachment 111617 [details]
full tread dump
Comment 2 Jerome Lanneluc CLA 2008-09-04 12:00:48 EDT
What Eclipse build are you using?
Comment 3 Eugene Kuleshov CLA 2008-09-04 12:05:57 EDT
That is 3.4/Ganymede release from http://download.eclipse.org/eclipse/downloads/drops/R-3.4-200806172000/index.php
Comment 4 Jerome Lanneluc CLA 2008-09-04 12:17:47 EDT
Lock 0x00002aaae3f7cc78 was acquired inside an IWorkspaceRunnable with an IProject scheduling rule. This means that the buid notification should not be running at the same time.

Moving to Platform/Resources
Comment 5 Igor Fedorenko CLA 2008-09-04 12:33:32 EDT
(In reply to comment #4)
> Lock 0x00002aaae3f7cc78 was acquired inside an IWorkspaceRunnable with an
> IProject scheduling rule. This means that the buid notification should not be
> running at the same time.
> 
> Moving to Platform/Resources
> 

Unless I am missing something, platform does does not guarantee no resource change events are fired during execution of compound workspace modifications. See bug #227645.
Comment 6 Jerome Lanneluc CLA 2008-09-04 12:45:25 EDT
(In reply to comment #5)
> Unless I am missing something, platform does does not guarantee no resource
> change events are fired during execution of compound workspace modifications.
> See bug #227645.
I'm not sure that's the issue here. The problem is that no workspace modification are allowed during the build notification. To ensure that we can do a workspace modification, we surround our code in an IWorkspaceRunnable. In this case, it seems that this protection didn't work. So I'm at lost at what we can do. I believe this should be solved on Platform side. 

Comment 7 Szymon Brandys CLA 2008-09-30 06:17:20 EDT
Worker-7 causes the deadlock. The issue is that synchronized block in ClasspathValidation inside of the notification thread.

See bug 245573, comment 12 at John's suggestion:
"Again because this results in
insertion of locking behavior onto any client performing workspace changes,
acquiring locks within a resource change callback is potentially dangerous (unless you are certain you never call out to foreign code while holding that
lock)"

I think that, this is what happens in this particular case
- notification in Worker-7 locks the workspace
- job in Worker-5 acquires a lock (synchronized block) and can't proceed since the workspace is locked
- Worker-7 is blocked, since it tries to acquire a lock conflicting with the Worker-5 lock

I hope it's clear :-)



Comment 8 Jerome Lanneluc CLA 2008-09-30 07:08:16 EDT
Nothing in the spec of IResourceChangeListener says that a lock cannot be acquired. Even if it was specified, you cannot impose this on clients as this means that they cannot synchronize their state.

For me the problem is in Platform/Resources that allows a resource change notification while an Workspace.run is running.
Comment 9 John Arthorne CLA 2008-09-30 11:16:51 EDT
Jerome is correct that it should be impossible for these two threads to be in this state. Worker-5 owns an IProject scheduling rule, and Worker-7 owns IWorkspaceRoot, and these two method conflict with each other.

However, this bug is actually caused by the Maven scheduling rule held by MavenProjectManagerRefreshJob in Worker-5. It uses the following scheduling rule with refresh=true:

http://svn.sonatype.org/m2eclipse/trunk/org.maven.ide.eclipse/src/org/maven/ide/eclipse/internal/project/SchedulingRule.java

Note this method returns true in its contains method for any IResource, which allows it to take the IProject scheduling rule needed by JDT. However, its isConflicting method returns false, which allows it to run simultaneously with the build job, causing the deadlock.

So, I suggest entering a bug against the provider of the Maven plugin you are using. See also related discussion in bug 246840 and bug 246839.
Comment 10 Eugene Kuleshov CLA 2008-09-30 11:19:39 EDT
(In reply to comment #9)
> Jerome is correct that it should be impossible for these two threads to be in
> this state. Worker-5 owns an IProject scheduling rule, and Worker-7 owns
> IWorkspaceRoot, and these two method conflict with each other.
> 
> However, this bug is actually caused by the Maven scheduling rule held by
> MavenProjectManagerRefreshJob in Worker-5. It uses the following scheduling
> rule with refresh=true:
> 
> http://svn.sonatype.org/m2eclipse/trunk/org.maven.ide.eclipse/src/org/maven/ide/eclipse/internal/project/SchedulingRule.java
> 
> Note this method returns true in its contains method for any IResource, which
> allows it to take the IProject scheduling rule needed by JDT. However, its
> isConflicting method returns false, which allows it to run simultaneously with
> the build job, causing the deadlock.

Hmm, John, can you please advise what need to be changed in that custom SchedulingRule to eliminate this issue? Thanks.
Comment 11 John Arthorne CLA 2008-09-30 13:58:23 EDT
One problem is that contains is not consistent with isConflicting. If contains returns true that isConflicting should also return true.  

The other problem is that your isConflicting method is not symmetric as required by the contract. I.e., YourRule.isConflicting(IResource) returns true but IResource.isConflicting(YourRule) would return false. Thus you'll get random behaviour when the job scheduler is attempting to allocate waiting jobs to threads. 

You should be removing any reference to IResource in your rule, and instead use a MultiRule if you want to obtain both your rule and an IResource at the same time.