Bug 199552

Summary: [efs][dstore] Deadlock when creating a folder on dstore-backed remote project
Product: [Tools] Target Management
Reporter: Martin Oberhuber <mober.at+eclipse>
Component: RSE
Assignee: Martin Oberhuber <mober.at+eclipse>
Status: RESOLVED FIXED
QA Contact: Martin Oberhuber <mober.at+eclipse>
Severity: critical    
Priority: P1    
Version: 2.0   
Target Milestone: 2.0.1   
Hardware: All   
OS: All   
Whiteboard:
Bug Depends on:    
Bug Blocks: 187690    
Attachments:
  Description: Thread dump showing the deadlock
  Flags: none

Description Martin Oberhuber 2007-08-10 08:21:14 EDT
Created attachment 75833
Thread dump showing the deadlock

In RSE Perspective, Connect to Linux-dstore host.
Select a folder, right-click > Create Remote Project.
Switch to Resource perspective.
Select the remote project, right-click > New > Folder.
  Name "aaa bbb", OK.

All of Eclipse hangs due to deadlock.

Attached file efs_dstore_deadlock.txt shows the thread dump with the deadlock:

* From the "New Folder" dialog, the main thread walks into EFS
     RSEFileStore.fetchInfo() in order to get project attributes
* From there, it goes into
     synchronized RSEFileStoreImpl#getRemoteFileObject()
* This goes to
     DStoreStatusMonitor.waitForUpdate()
  which runs a nested event loop (Display.waitAndDispatch())

* The nested event loop switches to the ModalContext thread and delivers
  events to it
* In the ModalContext thread, the createFolder() operation also needs
     synchronized RSEFileStoreImpl#getRemoteFileObject()
  but cannot enter it because the main thread still holds the lock

--> The ModalContext thread never returns because it is blocked on the
    lock, so the main thread never leaves the synchronized method and the
    lock is never released. We have a classic deadlock!
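
The shape of this deadlock can be reproduced outside Eclipse. The following is only a minimal sketch, not the RSE code: the class NestedWaitDeadlockSketch and the method holdLockAndWaitForWorker() are hypothetical stand-ins, the latch stands in for the nested event loop waiting on the other thread, and a timeout is added (an assumption, the real wait is unbounded) so that the example terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class NestedWaitDeadlockSketch {
    private final Object cached = new Object();

    // Hypothetical stand-in for the synchronized getRemoteFileObject().
    public synchronized Object getRemoteFileObject() {
        return cached;
    }

    // Stand-in for the main thread: it holds this object's monitor and,
    // while still holding it, waits for work done by another thread.
    // Returns true only if the worker finished within the timeout.
    public synchronized boolean holdLockAndWaitForWorker(long timeoutMs) {
        CountDownLatch workerDone = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            // Stand-in for createFolder() on the ModalContext thread:
            // it needs the same monitor, which the caller still holds.
            getRemoteFileObject();
            workerDone.countDown();
        }, "ModalContext-stand-in");
        worker.setDaemon(true);
        worker.start();
        try {
            // The real code waits without a bound; the timeout is only
            // here so that this sketch can report the problem.
            return workerDone.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        boolean finished =
            new NestedWaitDeadlockSketch().holdLockAndWaitForWorker(200);
        System.out.println("worker finished while lock was held: " + finished);
    }
}
```

The worker can only proceed once holdLockAndWaitForWorker() has returned and released the monitor, which in the unbounded case never happens.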

Comment 1 Martin Oberhuber 2007-08-10 08:35:08 EDT
Once again, the problem was that a method was declared "synchronized" although it called out to other, unknown methods. In this case, the call led to the nested event loop in dstore, which did the thread switch and thus caused the deadlock.

Investigation shows that there is no need to have
   RSEFileStoreImpl#getRemoteFileObject()
be synchronized. It looks like the original intention was that all threads have the same understanding of what the remote file object for the file store is; this is achieved in a more elegant way by declaring the cached _remoteFile variable volatile:
   volatile RSEFileStoreImpl#_remoteFile

The volatile modifier guarantees that a value written to that instance variable by one thread is visible to every later read from any other thread, without any thread having to hold a lock.
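
For illustration, the pattern looks roughly like the sketch below. This is not the committed RSEFileStoreImpl code: the class VolatileCacheSketch, the helper fetchFromRemote() and the String handle are hypothetical stand-ins, and the fetch counter exists only to make the caching observable.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class VolatileCacheSketch {
    // Stand-in for the cached _remoteFile field; volatile makes a
    // write by one thread visible to later reads by other threads.
    private volatile String remoteFile;

    // Only for this sketch: counts how often the slow path ran.
    private final AtomicInteger fetchCount = new AtomicInteger();

    // Hypothetical stand-in for the remote query, which may block
    // (in the bug, it ran a nested event loop).
    private String fetchFromRemote() {
        fetchCount.incrementAndGet();
        return "remote-file-handle";
    }

    // Not synchronized: no monitor is held while calling out.
    public String getRemoteFileObject() {
        String cached = remoteFile;      // single volatile read
        if (cached == null) {
            cached = fetchFromRemote();  // may block, but holds no lock
            remoteFile = cached;         // publish to all threads
        }
        return cached;
    }

    public int fetchCount() {
        return fetchCount.get();
    }

    public static void main(String[] args) {
        VolatileCacheSketch store = new VolatileCacheSketch();
        String first = store.getRemoteFileObject();
        String second = store.getRemoteFileObject();
        System.out.println("same object: " + (first == second));
        System.out.println("fetches: " + store.fetchCount());
    }
}
```

The trade-off in this sketch is that two threads arriving at the same time may both run the fetch and one result overwrites the other; that is harmless as long as the fetched objects are interchangeable, and no thread can block another inside the getter.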

Fix committed:

[199552] fix deadlock with dstore-backed efs access
   RSEFileStoreImpl.java  1.5