Bug 32905 - Deadlock while updating after synchronization
Summary: Deadlock while updating after synchronization
Status: RESOLVED INVALID
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Resources
Version: 2.0
Hardware: PC Windows 2000
Importance: P1 critical
Target Milestone: 2.1 RC2
Assignee: Platform-Resources-Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2003-02-25 05:05 EST by Philipe Mulet CLA
Modified: 2003-03-11 12:45 EST

See Also:


Attachments
Stack dump during deadlock (8.66 KB, text/plain)
2003-02-25 05:07 EST, Philipe Mulet CLA
Another stack dump (12.01 KB, text/plain)
2003-02-26 07:19 EST, Philipe Mulet CLA

Description Philipe Mulet CLA 2003-02-25 05:05:13 EST
Build 2.1RC1

The session was left running overnight, and the next morning, when synchronizing
with the CVS repository, it hung while 'updating' (after I had checked out a
couple of changes). Autobuild is off.

See below a few stack dumps.
Comment 1 Philipe Mulet CLA 2003-02-25 05:07:04 EST
Created attachment 3690
Stack dump during deadlock
Comment 2 Philipe Mulet CLA 2003-02-25 13:26:19 EST
I suspect I had done a "replace with latest" instead of "synchronizing".
Comment 3 Nick Edgar CLA 2003-02-25 23:12:43 EST
Strange.  Two threads are waiting on the workspace lock (main and Snapshot), 
but I don't see any threads that actually hold the workspace lock.
May indicate a race condition in the workspace lock itself?
I'm also puzzled by the fact that the lock has different ids in the two threads:
waiting on <02DD8CE8> in main,
waiting on <02DF07C0> in Snapshot

DJ, any idea?
Comment 4 Philipe Mulet CLA 2003-02-26 05:43:02 EST
It actually occurred to me twice yesterday, performing a "replace with latest" 
action.
Comment 5 Philipe Mulet CLA 2003-02-26 07:19:42 EST
Created attachment 3710
Another stack dump
Comment 6 Philipe Mulet CLA 2003-02-26 07:21:59 EST
I suspect my last stack dump is unrelated, though steps were similar. In both 
cases I had CVS decorators enabled.
Comment 7 Philipe Mulet CLA 2003-02-26 07:48:42 EST
Filed a separate bug against VCM for the last deadlock (bug 33243).
Comment 8 Philipe Mulet CLA 2003-02-26 08:34:29 EST
The UIWorkspaceLock acquire/release code performs an unsafe modification of a
slot (#ui); couldn't this be causing some grief?
Comment 9 John Arthorne CLA 2003-02-26 10:18:01 EST
The fact that the two threads are waiting on different locks is expected.  Each
waiting thread gets its own Semaphore object to wait on.
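
As an illustration of why the two dumps show different object ids, here is a minimal sketch of a lock in which every blocked thread parks on a Semaphore of its own. The class QueuedLock and its methods are hypothetical names chosen for this example; this is not the actual workspace lock code, only the general technique.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.Semaphore;

/** Hypothetical lock (not Eclipse code): each waiter blocks on its own Semaphore. */
final class QueuedLock {
    private boolean held;                                         // guarded by 'this'
    private final Deque<Semaphore> waiters = new ArrayDeque<>();  // guarded by 'this'

    void acquire() {
        Semaphore mine;
        synchronized (this) {
            if (!held) {          // uncontended: take the lock immediately
                held = true;
                return;
            }
            mine = new Semaphore(0);  // a fresh object per waiting thread
            waiters.addLast(mine);
        }
        // The thread blocks on its private semaphore, so a stack dump reports
        // a different object for every waiter even though the lock is shared.
        mine.acquireUninterruptibly();
    }

    synchronized void release() {
        Semaphore next = waiters.pollFirst();
        if (next == null) {
            held = false;         // nobody waiting: the lock becomes free
        } else {
            next.release();       // hand ownership directly to the oldest waiter
        }
    }

    synchronized int waiterCount() {
        return waiters.size();
    }
}
```

With such a design, two threads queued behind the same lock legitimately appear in a dump as waiting on two distinct objects.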

What's missing from the picture in the first stack trace is the ModalContext
thread.  Main is waiting on the ModalContext, but the thread is missing.  I see
a couple of possibilities:

1. ModalContext somehow died without relinquishing the lock (I don't see how that
could happen, but just throwing it out there).
2. The stack trace is missing the thread.  Sometimes I find Java stack dumps are
incomplete.  In this case it could be a duplicate of the problem in the second
stack dump.

We can tell from the stack trace that the ModalContext operation was almost
finished, because the main thread is processing asyncExec messages from the
FileDocumentProvider's resource change listener, which are fired at the end of
the operation.
Comment 10 Philipe Mulet CLA 2003-02-26 10:39:16 EST
If an exception occurs during #checkOut, the lock will be kept forever.
Comment 11 Nick Edgar CLA 2003-02-27 10:56:37 EST
Moving to VCM to consider whether this is likely a dup of bug 33243 based on 
the info above.
See also Philippe's last comment.
Comment 12 Michael Valenta CLA 2003-02-27 11:22:36 EST
No, this is not a dup of bug 33243. From the original stacktrace, it looks like 
the workspace lock was not released by the ModalContext thread when it shut 
down. Bug 33243 involves deadlock caused by the CVS lock which is absent from 
the original stack trace. This is not to say that bug 33243 isn't responsible 
for surfacing this problem, just that it is not the same problem.
Comment 13 Nick Edgar CLA 2003-03-03 08:22:03 EST
I do not see how it's possible for the workspace lock not to be released by 
the ModalContext thread.  If the background operation does a workspace 
operation, the operation should complete before the thread terminates.
Comment 14 John Arthorne CLA 2003-03-05 17:31:22 EST
Likely a duplicate of bug 33243.  We don't see anywhere that the lock can be
kept indefinitely after an operation has completed (outside of VM errors
occurring at unexpected times).  Closing.  Will reopen if this happens again.
Comment 15 Philipe Mulet CLA 2003-03-06 05:34:08 EST
Maybe I'm being paranoid, but shouldn't checkOut be protected against exceptions
occurring inside it? If an out-of-memory error occurs there, the lock is never
released.
Comment 16 John Arthorne CLA 2003-03-06 10:41:24 EST
We looked at it, but it's hard to be 100% safe in that code.  All this method
does is decrement the operation depth, and then release the lock if the depth is
zero.  If we fail with a VM error while decrementing the operation depth, or
while comparing the depth with zero, then we don't know whether the lock should
be released or not.  If we release the lock incorrectly while inside a nested
operation, we open ourselves up to other concurrency problems.  With a
significant rewrite it might be possible to make the method safer (i.e., avoid
object creations), but we weren't comfortable doing that at this stage.
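
To make the trade-off concrete, here is a minimal sketch of a depth-counted lock of the kind described above. OperationLock, checkIn, checkOut and run are hypothetical names used for illustration and are assumptions, not the real workspace implementation. The try/finally in run protects against failures in the operation body, but, as noted, it cannot help if the VM fails inside checkOut itself between the decrement and the comparison.

```java
import java.util.concurrent.Semaphore;

/** Hypothetical depth-counted lock (not Eclipse code) with nested operations. */
final class OperationLock {
    private final Semaphore permit = new Semaphore(1);
    private volatile Thread owner;
    private int depth;   // only read or written by the current owner

    void checkIn() {
        if (owner != Thread.currentThread()) {
            permit.acquireUninterruptibly();   // outermost operation takes the lock
            owner = Thread.currentThread();
        }
        depth++;                               // nested operations only bump the depth
    }

    void checkOut() {
        if (owner != Thread.currentThread()) {
            throw new IllegalStateException("checkOut by a thread that is not the owner");
        }
        depth--;
        // A VM error between the decrement above and the test below would leave
        // it unclear whether releasing is correct; that is the window discussed.
        if (depth == 0) {
            owner = null;
            permit.release();                  // only the outermost checkOut releases
        }
    }

    /** Runs an operation so that checkOut happens even if the body throws. */
    void run(Runnable operation) {
        checkIn();
        try {
            operation.run();
        } finally {
            checkOut();
        }
    }

    boolean isHeld() {
        return permit.availablePermits() == 0;
    }
}
```

Releasing unconditionally in an error handler would be wrong for the nested case: an inner checkOut must leave the lock held for the enclosing operation.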