32905 – Deadlock while updating after synchronization

Summary: Deadlock while updating after synchronization

Status:	RESOLVED INVALID

Alias:	None

Product:	Platform
Classification:	Eclipse Project
Component:	Resources
Version:	2.0
Hardware:	PC Windows 2000

Importance:	P1 critical
Target Milestone:	2.1 RC2
Assignee:	Platform-Resources-Inbox

Reported:	2003-02-25 05:05 EST by Philipe Mulet
Modified:	2003-03-11 12:45 EST
CC List:	5 users

Attachments
Stack dump during deadlock (8.66 KB, text/plain) 2003-02-25 05:07 EST, Philipe Mulet
Another stack dump (12.01 KB, text/plain) 2003-02-26 07:19 EST, Philipe Mulet

Description Philipe Mulet

2003-02-25 05:05:13 EST

Build 2.1RC1

The session ran uninterrupted overnight. The next morning, when synchronizing 
with the CVS repository, it hung while 'updating' (after I had checked out a 
couple of changes). Autobuild is off.

See the attached stack dumps below.

Comment 1 Philipe Mulet

2003-02-25 05:07:04 EST

Created attachment 3690
Stack dump during deadlock

Comment 2 Philipe Mulet

2003-02-25 13:26:19 EST

I suspect I had done a "replace with latest" instead of "synchronizing".

Comment 3 Nick Edgar

2003-02-25 23:12:43 EST

Strange.  Two threads are waiting on the workspace lock (main and Snapshot), 
but I don't see any threads that actually hold the workspace lock.
May indicate a race condition in the workspace lock itself?
I'm also puzzled by the fact that the lock has different ids in the two threads:
waiting on <02DD8CE8> in main,
waiting on <02DF07C0> in Snapshot

DJ, any idea?

Comment 4 Philipe Mulet

2003-02-26 05:43:02 EST

It actually happened to me twice yesterday, while performing a "replace with 
latest" action.

Comment 5 Philipe Mulet

2003-02-26 07:19:42 EST

Created attachment 3710
Another stack dump

Comment 6 Philipe Mulet

2003-02-26 07:21:59 EST

I suspect my last stack dump is unrelated, though steps were similar. In both 
cases I had CVS decorators enabled.

Comment 7 Philipe Mulet

2003-02-26 07:48:42 EST

Filed a separate bug against VCM for the last deadlock (bug 33243).

Comment 8 Philipe Mulet

2003-02-26 08:34:29 EST

The UIWorkspaceLock acquire/release logic performs unsafe slot modification 
(#ui). Couldn't this be causing some grief?

Comment 9 John Arthorne

2003-02-26 10:18:01 EST

The fact that the two threads are waiting on different locks is expected.  Each
waiting thread gets its own Semaphore object to wait on.
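
To illustrate why the two waiting threads report different lock ids, here is a
minimal Java sketch of a lock in which every blocked thread waits on its own
semaphore object. It is not the Eclipse implementation: the names
QueueLockSketch, Semaphore, await, signal and waiterCount are hypothetical and
chosen only for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch: a non-reentrant FIFO lock with one semaphore per waiter. */
final class QueueLockSketch {

    /** One-shot semaphore; each blocked thread waits on its own instance. */
    static final class Semaphore {
        final Thread waiter;
        private boolean released;

        Semaphore(Thread waiter) {
            this.waiter = waiter;
        }

        synchronized void await() {
            boolean interrupted = false;
            while (!released) {
                try {
                    wait();
                } catch (InterruptedException e) {
                    interrupted = true; // keep waiting, restore the flag afterwards
                }
            }
            if (interrupted) {
                Thread.currentThread().interrupt();
            }
        }

        synchronized void signal() {
            released = true;
            notifyAll();
        }
    }

    private final Deque<Semaphore> waiters = new ArrayDeque<>();
    private Thread owner;

    void acquire() {
        Semaphore mine;
        synchronized (this) {
            if (owner == null) {
                owner = Thread.currentThread();
                return;
            }
            // A fresh object per waiter: a thread dump shows a different
            // "waiting on <id>" for every blocked thread.
            mine = new Semaphore(Thread.currentThread());
            waiters.addLast(mine);
        }
        mine.await(); // ownership is handed over by release()
    }

    synchronized void release() {
        Semaphore next = waiters.pollFirst();
        if (next == null) {
            owner = null;
        } else {
            owner = next.waiter; // hand the lock directly to the oldest waiter
            next.signal();
        }
    }

    synchronized int waiterCount() {
        return waiters.size();
    }

    public static void main(String[] args) throws InterruptedException {
        QueueLockSketch lock = new QueueLockSketch();
        lock.acquire();
        Thread other = new Thread(() -> {
            lock.acquire();
            System.out.println("other thread got the lock");
            lock.release();
        });
        other.start();
        lock.release();
        other.join();
    }
}
```

In a design like this, a dump with two blocked threads shows two distinct
monitor ids even though both threads are queued for the same lock, so the
differing ids alone do not point to a bug.
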

What's missing from the picture in the first stack trace is the ModalContext
thread.  Main is waiting on the ModalContext, but the thread is missing.  I see
a couple of possibilities:

1. ModalContext somehow died without relinquishing the lock (I don't see how that
could happen, but just throwing it out there).
2. The stack trace is missing the thread.  Sometimes I find Java stack dumps are
incomplete.  In this case it could be a duplicate of the problem in the second
stack dump.

We can tell from the stack trace that the ModalContext operation was almost
finished, because the main thread is processing asyncExec messages from the
FileDocumentProvider's resource change listener, which are fired at the end of
the operation.

Comment 10 Philipe Mulet

2003-02-26 10:39:16 EST

If an exception occurs during #checkOut, the lock will be held forever.

Comment 11 Nick Edgar

2003-02-27 10:56:37 EST

Moving to VCM to consider whether this is likely a dup of bug 33243 based on 
the info above.
See also Philippe's last comment.

Comment 12 Michael Valenta

2003-02-27 11:22:36 EST

No, this is not a dup of bug 33243. From the original stack trace, it looks like 
the workspace lock was not released by the ModalContext thread when it shut 
down. Bug 33243 involves a deadlock caused by the CVS lock, which is absent from 
the original stack trace. This is not to say that bug 33243 isn't responsible 
for surfacing this problem, just that it is not the same problem.

Comment 13 Nick Edgar

2003-03-03 08:22:03 EST

I do not see how it's possible for the workspace lock to not be released by 
the ModalContext thread.  If the background operation does a workspace 
operation, the operation should complete before the thread terminates.

Comment 14 John Arthorne

2003-03-05 17:31:22 EST

Likely a duplicate of bug 33243.  We don't see anywhere that the lock can be
kept indefinitely after an operation has completed (outside of VM errors
occurring at unexpected times).  Closing.  Will reopen if this happens again.

Comment 15 Philipe Mulet

2003-03-06 05:34:08 EST

Maybe I'm being paranoid, but shouldn't checkOut be protected against exceptions 
occurring inside it? If an out-of-memory error occurs there, the lock is never 
released.

Comment 16 John Arthorne

2003-03-06 10:41:24 EST

We looked at it, but it's hard to be 100% safe in that code.  All this method
does is decrement the operation depth, and then release the lock if the depth is
zero.  If we fail with a VM error while decrementing the operation depth, or
while comparing the depth with zero, then we don't know whether the lock should
be released or not.  If we release the lock incorrectly while inside a nested
operation, we open ourselves up to other concurrency problems.  With a
significant rewrite it might be possible to make the method safer (i.e., avoid
object creations), but we weren't comfortable doing that at this stage.
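
The trade-off described in the last two comments can be sketched in Java as
follows. This is a simplified, hypothetical model, not the actual Platform
code: WorkManagerSketch, checkIn, run and isLocked are assumed names, and only
checkOut and the "operation depth" idea come from the discussion above. The
sketch shows the part that is easy to make safe (a try/finally around the
operation body) and marks the part that is not (a VM error inside checkOut
itself).

```java
/** Hypothetical sketch of a nested-operation lock guarded by a depth counter. */
final class WorkManagerSketch {
    private Thread owner; // thread currently holding the lock, or null
    private int depth;    // nesting depth of operations run by the owner

    synchronized void checkIn() {
        Thread me = Thread.currentThread();
        boolean interrupted = false;
        while (owner != null && owner != me) {
            try {
                wait();
            } catch (InterruptedException e) {
                interrupted = true; // keep waiting, restore the flag afterwards
            }
        }
        owner = me;
        depth++;
        if (interrupted) {
            me.interrupt();
        }
    }

    synchronized void checkOut() {
        if (owner != Thread.currentThread()) {
            throw new IllegalStateException("checkOut called by a thread that does not own the lock");
        }
        // Normal path: only primitive arithmetic, no object creation. A VM
        // error between the decrement and the comparison would still leave
        // the state ambiguous; this sketch does not solve that.
        if (--depth == 0) {
            owner = null;
            notifyAll();
        }
    }

    synchronized boolean isLocked() {
        return owner != null;
    }

    /** Runs an operation so that a failure in its body cannot leak the lock. */
    void run(Runnable operation) {
        checkIn();
        try {
            operation.run();
        } finally {
            checkOut();
        }
    }

    public static void main(String[] args) {
        WorkManagerSketch wm = new WorkManagerSketch();
        try {
            wm.run(() -> wm.run(() -> {
                throw new IllegalStateException("simulated failure in a nested operation");
            }));
        } catch (IllegalStateException expected) {
            // both finally blocks have already run at this point
        }
        System.out.println("locked after failure: " + wm.isLocked()); // prints "locked after failure: false"
    }
}
```

The finally block covers exceptions thrown by the operation body, including
nested operations. It does not cover the case raised in comment 16, where the
failure happens inside checkOut and it is unclear whether releasing the lock
would be correct.
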