Bug 260968 - Deadlock in UserLibraryManager
Summary: Deadlock in UserLibraryManager
Status: VERIFIED FIXED
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Core (show other bugs)
Version: 3.5   Edit
Hardware: PC Windows Vista
: P3 critical (vote)
Target Milestone: 3.5.1   Edit
Assignee: Srikanth Sankaran CLA
QA Contact:
URL:
Whiteboard:
Keywords:
: 284673 (view as bug list)
Depends on:
Blocks: 316447
  Show dependency tree
 
Reported: 2009-01-14 00:16 EST by Sergey Vladimirov CLA
Modified: 2010-06-10 07:48 EDT (History)
4 users (show)

See Also:


Attachments
stack trace (25.25 KB, text/plain)
2009-01-14 00:16 EST, Sergey Vladimirov CLA
no flags Details
Thread dump (17.38 KB, text/plain)
2009-05-01 11:26 EDT, Neil Hauge CLA
no flags Details
Proposed patch (3.96 KB, patch)
2009-06-23 04:30 EDT, Srikanth Sankaran CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Sergey Vladimirov CLA 2009-01-14 00:16:54 EST
Created attachment 122500 [details]
stack trace

Build ID: I20081211-1908

Steps To Reproduce:
Hard to reproduce. User Libraries list manipulations. I added library with 1 JAR, 2nd with 3 JARS, pressed "ok" and... eclipse stuck

More information:
stack trace in attachment
Comment 1 Neil Hauge CLA 2009-05-01 11:26:58 EDT
Created attachment 134069 [details]
Thread dump
Comment 2 Neil Hauge CLA 2009-05-01 11:35:49 EDT
This is actually not very difficult to reproduce, as I have recently seen 2 developers on our team hit this deadlock while adding user libraries.  It might be related to adding multiple jars in one user library or adding multiple jars in multiple user libraries. Eclipse deadlocks when selecting "OK" to add the libraries.

I've attached an additional thread dump that shows the deadlock.

Comment 3 Olivier Thomann CLA 2009-06-12 11:24:09 EDT
Srikanth,

could you please investigate?

Neil, do you reproduce on 3.5RC4?
Comment 4 Neil Hauge CLA 2009-06-12 11:39:51 EDT
Yes, I was just now able to reproduce on RC4.  It took me about 4 tries.   I hit it with 2 existing user libraries.  I added several new jars to both libraries and clicked the OK button.
Comment 5 Olivier Thomann CLA 2009-06-12 13:19:37 EDT
Would you have steps to reproduce?
Targetting 3.5.1 as we should know by then what we can do with this one.
Comment 6 Neil Hauge CLA 2009-06-12 14:23:36 EDT
To Reproduce:

1. Create 2 user libraries (although one may be sufficient)
2. Add any jar file to each library
3. Select OK
4. Add 2 additional jars to both libraries (although adding to one may be sufficient)
5. Select OK
6. Repeat steps 4 and 5 until you hit deadlock.
Comment 7 Srikanth Sankaran CLA 2009-06-15 07:08:42 EDT
"ModelContext" thread is holding the object monitor 
for UserLibraryManager by virtue of having entered
the synchronized method UserLibraryManager.setUserLibrary
and is trying to acquire the workspace lock as a part
of running the SetContainerOperation.

Worker-10 OTOH is holding the workspace lock having
acquired it as a part of running the auto build
operation and is further trying to enter the synchronized
method UserLibraryManager.getUserLibrary leading to
nowhere.

Comment 8 Srikanth Sankaran CLA 2009-06-22 08:08:12 EDT
At the outset, it would appear that there are two separate
mutual exclusion concerns here that have become entangled:

    - Ensuring that the get and put methods on the HashMap UserLibraryManager#userLibraries don't happen concurrently
and put and put don't happen concurrently - OIOW, structural
modifications to the Map happen under exclusive access.
 
    - Ensuring that setUserLibrary "operation" is serialized
so no two threads attempt to step on each other in updating
eclipse preferences and the attendant flush to store.

    A plausible fix is to remove the synchronized keyword from
setUserLibrary method, wrap the whole of the body of the method
in a synchronized block using separate object monitor (rather than
'this' - the UserLibraryManager) and wrapping the change to 
UserLibraryManager#userLibraries in updateUserLibrary() in a
synchronized block as in 

    synchronized(this) {
        // update user libraries map
	if (userLibrary != null) {
	    this.userLibraries.put(libName, userLibrary);
	} else {
	    this.userLibraries.remove(libName);
	}
    }

This should ensure get vs put consistency as well as serialized
preference put & flush.

Jerome, Can you comment on this approach ? Thanks!
Comment 9 Srikanth Sankaran CLA 2009-06-23 04:30:07 EDT
Created attachment 139843 [details]
Proposed patch


    This patch uses separate locks for separate concerns:

    - That the UserLibraryManager#setUserLibrary and UserLibraryManager#removeUserLibrary methods should not step
on each other toes and should rather operate serially
is ensured by the object monitor 'this.userLibraries' (Map).

    - That the get/put/keySet/remove methods should return consistent
data is addressed by the object monitor 'this' (UserLibraryManager)
via synchronized methods and synchronized(this) {} blocks. 

    While holding this latter lock we will not attempt to acquire
any other locks thereby eliminating the possibility of a deadlock.
Comment 10 Srikanth Sankaran CLA 2009-06-24 06:13:45 EDT
(In reply to comment #4)
> Yes, I was just now able to reproduce on RC4.  It took me about 4 tries.   I
> hit it with 2 existing user libraries.  I added several new jars to both
> libraries and clicked the OK button.

I wasn't very successful with reproducing it - Neil, are you
in a position to apply to the source patch to HEAD and check
if the problem goes away ? TIA.


Comment 11 Neil Hauge CLA 2009-06-24 11:27:27 EDT
(In reply to comment #10)

> I wasn't very successful with reproducing it - Neil, are you
> in a position to apply to the source patch to HEAD and check
> if the problem goes away ? TIA.
> 

I'll give it a try.
Comment 12 Neil Hauge CLA 2009-06-24 16:22:35 EDT
The proposed patch appears to fix the deadlock.  I am unable to reproduce with the patch applied.  
Comment 13 Srikanth Sankaran CLA 2009-06-25 00:39:31 EDT
(In reply to comment #12)
> The proposed patch appears to fix the deadlock.  I am unable to reproduce with
> the patch applied. 

Appreciate your help. Thanks, I'll wait for review comments from Jerome and
expect to release the patch week of Jul 6th after accommodating any review
comments.
Comment 14 Jerome Lanneluc CLA 2009-06-30 11:38:47 EDT
(In reply to comment #8)
Sorry it took so long to respond. This approach looks fine to me and the patch from comment 9 looks good as well.
Comment 15 Srikanth Sankaran CLA 2009-07-06 02:54:16 EDT
Released in HEAD for 3.6M1.

Olivier, what are your thoughts on this fix for 3.5 maintenance ?
Comment 16 Olivier Thomann CLA 2009-07-13 09:30:49 EDT
Srikanth, +1 to backport to 3.5.1 as this is causing a deadlock.
Reopening to get it fixed for 3.5.1.
Comment 17 Srikanth Sankaran CLA 2009-07-14 07:16:58 EDT
Released in 3.5 maintenance branch
Comment 18 Srikanth Sankaran CLA 2009-07-27 02:07:44 EDT
*** Bug 284673 has been marked as a duplicate of this bug. ***
Comment 19 Frederic Fusier CLA 2009-08-04 07:18:23 EDT
Verified for 3.6M1 using build 20090802-2000.
Comment 20 Frederic Fusier CLA 2009-08-04 07:18:57 EDT
Ooops...
Comment 21 Frederic Fusier CLA 2009-08-04 07:19:33 EDT
...should stay as RESOLVED for the 3.5.1 verification process!
Comment 22 Kent Johnson CLA 2009-08-27 15:22:20 EDT
Verified for 3.5.1 using build M20090826-1100