Created attachment 94558: dump of hung eclipse
Build ID: M20071023-1652
Steps To Reproduce: Eclipse will no longer boot up in my default workspace. I've dumped a log while it's hung, and there are several references to waiting on an object monitor, followed by some references to RSE classes, e.g.: org.eclipse.rse.internal.subsystems.files.core.Activator$1.run(Activator.java:54)
It was working OK yesterday, except I noticed that at one point I had to restart to browse one remote connection. I can load a different workspace (not my default) after I have hung the original. It says something like "default workspace is in use". Unfortunately, I don't have another workspace with my info in it. I'm kinda screwed here as I need to use the workspace, and I'm not sure if I can somehow migrate my settings from this crashed workspace to another. Please help get my eclipse back up and running.
Created attachment 94596: copy of .log file with error messages.
After further investigation, I attempted to follow these instructions (from http://dev.zhourenjian.com/blog/2007/11/07/eclipse-freezing-on-start.html):
>I opened file workspace\.metadata\.log there was a line saying: The workspace exited with unsaved changes in the previous session; refreshing workspace to recover changes
>After taking many tries, I found a solution for this bug. Just remove folder workspace\.metadata\.plugins\org.eclipse.core.resources\.root\.indexes and restart Eclipse.
That didn't work; Eclipse was still hanging. So I removed .metadata/.plugins/org.eclipse.core.resources/.safetable and restarted. That gave me a popup error stating "An error has occurred. See the log file ... \workspace\.metadata\.log"
Looking in the log file, I see the following sorts of error messages:
Caused by: org.eclipse.core.internal.dtree.ObjectNotFoundException: Tree element '/RemoteSystemsTempFiles/SERVER/file' not found.
I've attached a complete .log file. I've also verified that the file in question existed in the temp files directory. I've also tried completely removing the directory, but I still get the same error message saying the tree element can't be found. Is there a way to "clean up" so eclipse will start without this problem? I don't really need the temp files, as I can just download new copies.
Ok, now I deleted the .snap file in: workspace\.metadata\.plugins\org.eclipse.core.resources as it seemed to be in the same general area as the other files I deleted and when I viewed it, the contents seemed related to all of the RSE connections I had. I then started eclipse. And it started!!! But it had a couple errors, which is fine. I closed the two files that the editor attempted to open and I think I'm back to normal.
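The recovery steps tried in the comments above removed workspace state files outright, and some connection information appears to have been lost along the way. A more cautious variant is to move the same files into a backup folder so they can be restored. The following is only a minimal sketch of that idea, not an official Eclipse recovery procedure: the class name, method name and backup location are hypothetical, the paths are the ones mentioned in this report (assuming the snapshot file is literally named `.snap`), and it assumes Eclipse is not running and the backup folder is on the same file system as the workspace.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class WorkspaceStateQuarantine {

    // Folder named in this report; relative to the workspace root.
    static final String RESOURCES = ".metadata/.plugins/org.eclipse.core.resources";

    // The three items the reporter removed, relative to RESOURCES.
    static final String[] CANDIDATES = { ".snap", ".safetable", ".root/.indexes" };

    /**
     * Moves each candidate that exists into backupDir instead of deleting it.
     * Returns the new locations so the files can be moved back if needed.
     */
    static List<Path> moveAside(Path workspace, Path backupDir) throws IOException {
        Path resources = workspace.resolve(RESOURCES);
        List<Path> moved = new ArrayList<>();
        Files.createDirectories(backupDir);
        for (String name : CANDIDATES) {
            Path source = resources.resolve(name);
            if (!Files.exists(source)) {
                continue; // nothing to do for this candidate
            }
            // Flatten ".root/.indexes" to ".root_.indexes" inside the backup folder.
            Path target = backupDir.resolve(name.replace('/', '_'));
            // Throws FileAlreadyExistsException if an earlier backup is still there.
            Files.move(source, target);
            moved.add(target);
        }
        return moved;
    }

    public static void main(String[] args) throws IOException {
        Path workspace = Path.of(args[0]);
        Path backupDir = Path.of(args[1]);
        for (Path p : moveAside(workspace, backupDir)) {
            System.out.println("moved to " + p);
        }
    }
}
```

In this report only the removal of the .snap file let Eclipse start again, so moving one item at a time and retrying between moves may be the less invasive order.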
Analyzing the log, the following seems to happen:
1.) Thread "files.ui adapter loader": On Eclipse start, the subsystems.files.core plugin starts the adapter loader Thread.
1.1.) That Thread forces activation of the rse.files.ui plugin very early. As part of its Activator initialization sequence, it calls SystemRemoteEditManager.refreshRemoteEditProject(), which performs a Workspace Operation (and thus acquires a Workspace Lock) very early.
2.) Thread "[main]": At Eclipse start, the dispatch Thread is used to restore the editors that you had open during the previous session:
    EditorManager.setVisibleEditor()
This, in turn, leads to loading the text editor (also on the main Thread):
    StatusTextEditor.doSetInput()
which in turn leads the FileDocumentProvider to refresh its file:
    File.refreshLocal()
This is a Workspace Operation, so it requires the workspace lock that had been obtained in (1.1) above, and blocks the main Thread until that lock is available.
3.) There are two more Threads which require Display.syncExec() access to the main Thread in order to do their work. Since that one is blocked due to (2), they cannot continue:
3.1.) Thread "Worker-0" is busy refreshing an EFS-shared file:
    Display.syncExec()
    RSEFileStoreImpl.getConnectedFileSubSystem()
    localstore.UnifiedTree.createChildForLinkedResource()
    localstore.FileSystemResourceManager.refresh()
but cannot continue because the dispatch Thread is currently taken.
3.2.) Thread "Thread-2" is busy initializing the Workbench itself:
    Display.syncExec()
    EditorManager.restoreState()
    WorkbenchPage.restoreState()
    WorkbenchConfigurer.restoreState()
    WorkbenchAdvisor$1.run()
This Thread seems to be the reason for the editor activation taking place in step (2).
Looking at this analysis, I see two possible causes for the deadlock that you observe:
(a) You had quit the Workbench with an open editor that was editing an EFS-shared file provided by RSE. On Workbench startup, Eclipse tries to re-open that editor on the dispatch Thread, but RSE cannot provide the editor contents because it also requires the dispatch Thread in order to do the subSystem.connect() that's required.
(b) A somewhat more complex scenario that also involves the "files.ui adapter loader", which cannot continue because it's being blocked by a related Workspace Job on the same scheduling rule.
Now in case option (b) is true, that would be good for us because the problem would most likely be fixed with the fix for bug 197167, which defers the UI Adapter Loading to a later time. So the issue would be fixed with RSE 3.0M6.
In case the adapter loader is not related, and option (a) is true, there are really only two ingredients in the deadlock: the editor performing a load-file on the dispatch Thread (which it shouldn't do), and the RSE EFS Provider requiring the dispatch Thread for connect (which it also shouldn't do). Following these thoughts, there are two possibilities for breaking the deadlock and thus fixing the issue:
(i) The Platform Editor not performing any load-file operations while the dispatch Thread is owned. This is IMHO a no-no anyway, because load-file can be a long-running operation and should thus not happen while the dispatch Thread is owned. We should either file a bug against the Platform for this, or (most likely) find an existing bug in the Platform for this and link to it. At the very least, we can expect the Platform Editor to have a watchdog which kills a hanging Thread that tries to load something after a fixed timeout, just to avoid deadlock (like OSGi Bundle Activators do it).
(ii) In RSE, we could avoid the Display.syncExec() in the subSystem connect Thread, if we can. This means that if we already have a saved password, we should use it without switching to the dispatch Thread, and only switch to the dispatch Thread if we need to ask for a password. Now this would fix the problem in most cases (where a stored password is available), but it would not fix the problem when we need to ask for a password. In any case, problem (ii) can most likely be addressed along with bug 190231, so I'm marking that one as a dependent bug.
Based on the analysis, I set severity CRITICAL since it locked out all of Eclipse. I also changed the Summary; the previous value was: eclipse won't start up with RSE
Ryan: in order to verify the theory, and in order to see whether problem (b) can be ruled out: Can you confirm that you had an EFS-shared RSE-provided file open in the editor before you quit, and that this was in fact the problem? And: Can you please update to RSE 3.0M6 and see whether you can still reproduce the problem there, or whether it's indeed fixed with RSE 3.0M6? Thanks!
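The lock cycle described in the analysis above can be illustrated with a small stand-alone simulation. This is only a sketch under stated assumptions and does not use the real Eclipse or SWT APIs: a single-thread executor stands in for the dispatch Thread, `submit(...).get()` stands in for `Display.syncExec()`, a `ReentrantLock` stands in for the workspace lock, and all class and method names are hypothetical. The dispatch side uses `tryLock` with a timeout so the demo terminates, whereas the real main Thread waited indefinitely.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class StartupDeadlockSketch {

    /**
     * Returns true if the simulated dispatch thread obtained the workspace lock,
     * false if it timed out because the lock holder was waiting for the dispatch thread.
     */
    static boolean uiThreadGetsLock(boolean workerUsesSyncExec) throws Exception {
        ReentrantLock workspaceLock = new ReentrantLock();                // stand-in for the workspace lock
        ExecutorService uiThread = Executors.newSingleThreadExecutor();   // stand-in for the dispatch thread
        ExecutorService worker = Executors.newSingleThreadExecutor();     // stand-in for a background thread
        CountDownLatch lockHeld = new CountDownLatch(1);
        CountDownLatch uiWaiting = new CountDownLatch(1);
        try {
            Future<?> workerDone = worker.submit(() -> {
                workspaceLock.lock();                 // background thread takes the lock first
                try {
                    lockHeld.countDown();
                    uiWaiting.await();                // wait until the dispatch thread is busy
                    if (workerUsesSyncExec) {
                        // Simulated syncExec: block until the dispatch thread runs our task.
                        uiThread.submit(() -> { }).get();
                    }
                } finally {
                    workspaceLock.unlock();
                }
                return null;
            });

            Future<Boolean> uiResult = uiThread.submit(() -> {
                lockHeld.await();
                uiWaiting.countDown();
                // Simulated File.refreshLocal(): needs the workspace lock on the dispatch thread.
                boolean acquired = workspaceLock.tryLock(500, TimeUnit.MILLISECONDS);
                if (acquired) {
                    workspaceLock.unlock();
                }
                return acquired;
            });

            boolean acquired = uiResult.get();
            workerDone.get();
            return acquired;
        } finally {
            uiThread.shutdownNow();
            worker.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("worker uses syncExec: lock acquired = " + uiThreadGetsLock(true));
        System.out.println("worker avoids syncExec: lock acquired = " + uiThreadGetsLock(false));
    }
}
```

Passing `false` corresponds to possibility (ii): when the lock holder does not need the dispatch Thread (for example because a stored password is available), the cycle never forms and the simulated refresh goes through.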
Ryan: At any rate, please provide as exact as possible description what file you had in the editor before you quit Eclipse and made your workspace hang; and, what sort of RSE connection it was; and, whether you have saved a password for that connection; and, whether you can reproduce it with RSE 3.0M6 or not. Thanks!
Correction: The problem (b) would not be fixed with bug 197167, but with bug 218304. The fix for bug 218304, however, apparently has some other problematic implication as shown in bug 227944: In that case, deferred loading of the adapters again leads to issues though not as problematic as here (only an errorlog entry, but not a deadlock).
>please provide as exact as possible description what file you had in the editor before you quit Eclipse and made your workspace hang;
I believe I had two files in the editor; both would have been temp files that were downloaded from RSE using SSH Only. Both would have been saved, and should have been in sync with the server when I exited eclipse. I believe they were either shell files or ant files. If they were shell files, they would have used the ShellEd editor, and if ant, then the built-in ant editor. It's also possible they were xml files and used the xml editor provided by web tools. I would have had saved passwords for the connections the files used. It's possible they were on different servers, or on the same server using different connections (because of a different username).
It occurs to me that if eclipse were to hang for another reason and exit ungracefully (which NEVER happens), that could leave the files in an 'unsaved changes' mode, which might be a different case altogether. I'm probably not going to be able to reproduce this bug (unless it just happens again). I think when I deleted the various files to get eclipse to boot up, I also lost some connection information that I had to recreate.
I've recently downloaded 3.0M6 and am using that in conjunction with eclipse 3.4. I'm not using it for all my daily work yet, but certainly if the error happens again, you'll be the first to know.
It would be good to know a safe way to restore a workspace, and a safe way to copy/backup all of the connection settings. It's a fair amount of work to recreate connections with all of the filters. I don't mind so much if a temp copy of a file I have on the server is lost in the event of a system failure.
A feature for exporting all your profiles, connections and filters into a ZIP file will be added with bug 216858 / bug 189274. Until this is complete, you can simply back up your connections with
    zip -r C:/rse_backup.zip <workspace>/.metadata/.plugins/org.eclipse.rse.core/Profiles
Are you sure that you did not have any EFS-provided files? (File > New > Advanced > Link to folder in file system > RSE > ... or RSE "Create Remote Project") ? So you were using the RSE SystemView in the Remote Systems Perspective only?
With 3.0M6, the deadlock should no longer happen and you should be in the bug 227944 situation. I think, though, that the fix for this should be relatively easy by moving SystemRemoteEditManager initialization out of the files.ui Activator.start() method and into a deferred startup Thread -- I'll file a separate bug for that. The final solution, however, will be with bug 182363, I guess.
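For machines without a `zip` command on the path, the same backup of the Profiles folder can be done with the JDK alone. This is a minimal sketch, not an RSE feature: the class and method names are hypothetical, the Profiles path is the one given in the comment above, and it assumes Eclipse is closed while the folder is copied.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class RseProfileBackup {

    // Location of the RSE profiles inside a workspace, as named in this report.
    static final String PROFILES = ".metadata/.plugins/org.eclipse.rse.core/Profiles";

    /** Writes every regular file under the Profiles folder into zipFile; returns the file count. */
    static int backup(Path workspace, Path zipFile) throws IOException {
        Path profiles = workspace.resolve(PROFILES);
        List<Path> files;
        try (Stream<Path> walk = Files.walk(profiles)) {
            files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        try (OutputStream out = Files.newOutputStream(zipFile);
             ZipOutputStream zip = new ZipOutputStream(out)) {
            for (Path file : files) {
                // ZIP entry names use forward slashes regardless of platform.
                String entryName = profiles.relativize(file).toString().replace('\\', '/');
                zip.putNextEntry(new ZipEntry(entryName));
                Files.copy(file, zip);
                zip.closeEntry();
            }
        }
        return files.size();
    }

    public static void main(String[] args) throws IOException {
        int count = backup(Path.of(args[0]), Path.of(args[1]));
        System.out.println("archived " + count + " file(s)");
    }
}
```

Entry names are stored relative to the Profiles folder, so restoring means unpacking the archive back into that folder of the target workspace.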
>Are you sure that you did not have any EFS-provided files? (File > New > Advanced > Link to folder in file system > RSE > ... or RSE "Create Remote Project") ? So you were using the RSE SystemView in the Remote Systems Perspective only?
I am pretty sure I was using the Remote Systems View in the Remote Systems Perspective. I haven't really done anything with File > New [folder] > Advanced > Link to folder in file system > RSE > ... or RSE "Create Remote Project". I'm sure you'll hear about it when I do though ;-)
Thanks. Based on your word, I changed the summary; the previous value was: [efs] Deadlock on Startup with an EFS-shared RSE-provided file in the Editor
The thing, however, is that this part of your thread dump definitely identifies an EFS-shared file:
>3.1.) Thread "Worker-0" is busy refreshing an EFS-Shared file:
>    Display.syncExec()
>    RSEFileStoreImpl.getConnectedFileSubSystem()
>    localstore.UnifiedTree.createChildForLinkedResource()
>    localstore.FileSystemResourceManager.refresh()
>but cannot continue because the Dispatch Thread is currently taken.
So you MUST have created a remote project or remote linked resource somehow. Is there any chance that your workspace (.project file(s)) is still available? The .project file(s) hold the information about the linked resources.
Bulk update of target milestone
After checking the logs again, I'm very confident that this has actually been fixed with the fix for bug 228353. We haven't got any report about such behavior again, and bug 228353 does address the most problematic issues. *** This bug has been marked as a duplicate of bug 228353 ***