Bug 297770 - [Undo] Deadlock in DefaultOperationHistory
Summary: [Undo] Deadlock in DefaultOperationHistory
Status: CLOSED WONTFIX
Alias: None
Product: Platform
Classification: Eclipse Project
Component: UI (show other bugs)
Version: 3.5   Edit
Hardware: PC Windows XP
: P3 major (vote)
Target Milestone: ---   Edit
Assignee: Platform UI Triaged CLA
QA Contact:
URL:
Whiteboard: stalebug
Keywords:
Depends on:
Blocks:
 
Reported: 2009-12-14 15:27 EST by Min Idzelis CLA
Modified: 2019-09-16 12:35 EDT (History)
3 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Min Idzelis CLA 2009-12-14 15:27:56 EST
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5
Build Identifier: 3.6m4

There is a deadlock that is caused by the listener notification of DefaultOperationHistory. 

It turns out that DefaultOperationHistory is holding a lock while calling listener code, which is a big no-no. That is because the listener can switch threads and then call back into DefaultOperationHistory which will cause a deadlock. 

The deadlock I'm seeing is this:

From a refactoring (the ModalContextThread) 
DefaultOperationHistory.add -> checkUndoLimit -> forceUndoLimit (acquires lock) -> internalRemove -> notifyRemoved -> notifyListeners -> Listener performs Display.syncExec() and blocks until done

On UI thread: UI is executing the anonymous asyncExec inside of OperationHistoryHandler$1 ( line 144) which is calling UndoActionHandler.update -> shouldBeEnabled -> DefaultOperationHistory.canUndo -> BLOCKED on entry to synchronized block on getUndoOperation


My recommendation -> do NOT hold locks when invoking notifyListeners in forceUndoLimit. 

Reproducible: Always
Comment 1 Paul Webster CLA 2009-12-14 18:15:49 EST
Possibly changed by bug 238397

PW
Comment 2 Paul Webster CLA 2009-12-14 18:16:59 EST
(In reply to comment #1)
> Possibly changed by bug 238397

uh, or not.
PW
Comment 3 Susan McCourt CLA 2009-12-15 13:02:55 EST
Yes, this problem was introduced by the fix to bug 238397.

When the locking mechanism was originally implemented, the locks were structured so that there would never be a call out to listeners while holding the locks.  You can see the full history of this discussion in bug 91343.

At one time (see bug 91343 comment 27, first bullet) locking the list during the entire forceUndoLimit method would have demonstrated the deadlock immediately in the test cases.  It's a bummer that this wasn't caught by the test cases later on, I'm not sure why.

I think the fix for bug 238397 needs to be revisited.  Note that we were worried about that kind of bug when the locks were first added, but we were trying to balance thread safety with deadlock risk at the end of a release.
Comment 4 Paul Webster CLA 2014-07-17 10:30:38 EDT
This is still theoretically possible, but it is hard to reproduce in the wild.

PW
Comment 5 Eclipse Genie CLA 2019-09-16 12:35:36 EDT
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet.

If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.

--
The automated Eclipse Genie.