Bug 369251 - Starved worker threads
Status: VERIFIED FIXED
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Core
Version: 3.8
Hardware: PC Windows 7
Importance: P3 critical
Target Milestone: 3.8 M5
Assignee: Jay Arthanareeswaran CLA
 
Reported: 2012-01-20 11:14 EST by Markus Keller CLA
Modified: 2012-01-24 01:25 EST (History)



Attachments
Stacktraces (22.25 KB, text/plain)
2012-01-20 11:14 EST, Markus Keller CLA

Description Markus Keller CLA 2012-01-20 11:14:38 EST
Created attachment 209832
Stacktraces

I20120117-0845 + a lot of stuff from master (including o.e.core.jobs/-.runtime)

I started up my target workbench and then quickly tried to open a type. I just updated a few plug-ins in the host, so the target workbench had to rebuild, and the Java search indexes were not ready yet.

All worker threads seem to wait for the same blockingJob.jobStateLock object, but no thread holds that lock.
Comment 1 John Arthorne CLA 2012-01-20 13:31:23 EST
(In reply to comment #0)
> All worker threads seem to wait for the same blockingJob.jobStateLock object,
> but no thread holds that lock.

Those threads are in a wait loop for a scheduling rule, which I highly suspect is owned by Worker-7, which is doing a build. Since the build locks the entire workspace, anyone attempting to get a resource rule will block on that. However, I can't understand why Worker-7 is stuck. It is waiting on a monitor on a synchronized collection. The only thing I can think of is that it is not actually blocked here but somehow stuck in an infinite loop... or maybe just taking a long time?

"Worker-7" prio=6 tid=0x26147800 nid=0x1e80 waiting for monitor entry [0x2917f000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at java.util.Collections$SynchronizedCollection.remove(Collections.java:1580)
        - waiting to lock <0x0f5449d0> (a java.util.Collections$SynchronizedSet)
        at org.eclipse.jdt.internal.core.ExternalFoldersManager.removePendingFolder(ExternalFoldersManager.java:140)
        at org.eclipse.jdt.internal.core.ExternalFolderChange.updateExternalFoldersIfNecessary(ExternalFolderChange.java:46)
        at org.eclipse.jdt.internal.core.DeltaProcessor.resourceChanged(DeltaProcessor.java:2117)
        at org.eclipse.jdt.internal.core.DeltaProcessingState.resourceChanged(DeltaProcessingState.java:470)
        at org.eclipse.core.internal.events.NotificationManager$1.run(NotificationManager.java:291)
        at org.eclipse.core.runtime.SafeRunner.run(SafeRunner.java:42)
        at org.eclipse.core.internal.events.NotificationManager.notify(NotificationManager.java:285)
        at org.eclipse.core.internal.events.NotificationManager.broadcastChanges(NotificationManager.java:149)
        at org.eclipse.core.internal.resources.Workspace.broadcastBuildEvent(Workspace.java:381)
        at org.eclipse.core.internal.events.AutoBuildJob.doBuild(AutoBuildJob.java:139)
Comment 2 Markus Keller CLA 2012-01-20 13:52:46 EST
Thanks for the analysis, John. I forgot about the (invisible) scheduling rules. I didn't see any activity in YourKit.

The only two threads dealing with <0x0f5449d0> (the Collections$SynchronizedSet lock object) are Worker-6 and Worker-7. Worker-6 keeps that lock and then tries to take a scheduling rule, which in turn takes forever because Worker-7 has the workspace rule and waits for the SynchronizedSet => Deadlock.
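
A minimal sketch of the cycle described above, using plain java.util.concurrent locks as stand-ins. It is not the JDT or Jobs API code: the names workspaceRule, pendingFolders, holdThenTry and runInverted are hypothetical, and tryLock with a timeout replaces the indefinite wait so that the demo terminates instead of hanging. Each thread keeps its first lock until both have attempted the second one, so neither attempt can succeed.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

final class LockOrderDemo {

    /** Holds 'first', then tries 'second' with a timeout; reports whether 'second' was obtained. */
    static boolean holdThenTry(ReentrantLock first, ReentrantLock second,
            CountDownLatch bothHoldFirst, CountDownLatch bothTried) throws InterruptedException {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await();   // both threads now hold their first lock
            boolean got = second.tryLock(200, TimeUnit.MILLISECONDS);
            if (got) {
                second.unlock();
            }
            bothTried.countDown();
            bothTried.await();       // keep 'first' until the other thread has also tried
            return got;
        } finally {
            first.unlock();
        }
    }

    /** Runs the inverted acquisition order; returns {worker6GotRule, worker7GotSet}. */
    static boolean[] runInverted() {
        // Assumption: plain locks model the workspace scheduling rule and the SynchronizedSet monitor.
        ReentrantLock workspaceRule = new ReentrantLock();
        ReentrantLock pendingFolders = new ReentrantLock();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // "Worker-6": holds the set, then wants the rule.
            Future<Boolean> worker6 = pool.submit(
                    () -> holdThenTry(pendingFolders, workspaceRule, bothHoldFirst, bothTried));
            // "Worker-7": holds the rule, then wants the set.
            Future<Boolean> worker7 = pool.submit(
                    () -> holdThenTry(workspaceRule, pendingFolders, bothHoldFirst, bothTried));
            return new boolean[] { worker6.get(), worker7.get() };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        boolean[] r = runInverted();
        System.out.println("Worker-6 got rule: " + r[0] + ", Worker-7 got set: " + r[1]);
    }
}
```

With untimed waits, as in the attached stack traces, the same ordering would block both threads indefinitely. Taking the two locks in the same order on every path is the usual way to avoid the cycle.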

This very much looks like collateral damage from bug 368152.
Comment 3 Markus Keller CLA 2012-01-20 14:32:41 EST
Unfortunately, I can't reproduce this at will, but it sometimes happens again when I crash my target workspace while it is compiling, and then relaunch.

At least I can recover from the situation by canceling the "Initializing Java Tooling" job in the Progress view.

Still, this can't stay. I'll revert bug 368152 for Sunday's I-build, unless someone from the jdt.core team already took care of this by then.
Comment 4 Ayushman Jain CLA 2012-01-20 17:03:49 EST
Will see what can be done. If nothing can be done, then rolling back bug 368152 is the only option.
Comment 5 Jay Arthanareeswaran CLA 2012-01-20 18:32:02 EST
We have run into this problem of not being able to create the scheduling rule in InitializeAfterLoadJob before, bug 289560, comment #17 being one such case.
However, I think we can still use beginRule() and endRule() to achieve this - something like this:

			ISchedulingRule rule = ResourcesPlugin.getWorkspace().getRuleFactory().modifyRule(externalFoldersManager.getExternalFoldersProject());
			try {
				Job.getJobManager().beginRule(rule, monitor);
				externalFoldersManager.createPendingFolders(monitor);
			} finally {
				Job.getJobManager().endRule(rule);
			}

John, do you see any issues with this kind of scheduling rule creation?
Comment 6 Jay Arthanareeswaran CLA 2012-01-21 01:05:17 EST
Released the fix here: 

http://git.eclipse.org/c/jdt/eclipse.jdt.core.git/commit/?id=8c93d4e99b8a943865cb7391e781eba5bb83dfc9

Essentially, the patch backs out the earlier fix for bug 368152, comment #2.
Comment 7 Satyam Kandula CLA 2012-01-24 01:25:18 EST
Verified for 3.8 M5 using build I20120122-2000.