Community
Participate
Working Groups
Driver: eclipse-SDK-3.1-win32 with eclipse-test-framework-3.1 Every we open and close an editor. That editor instance is being leaked. We have a testcase that can demostrate the problem. The testcase is really simple. It creates a new simple project and a new file. It opens up the new file in the editor that comes with the testcase, then close the editor. Repeat 500 times. What's interesting is that the editor, upon open, will allocate a 200000 size String array as a private field. So this String array can be GC-ed if the editor itself can be GC-ed. If you run this testcase with -Xmx256M, you will run out of memory. However, if you explicitly set the String array to null in the dispose() method of the editor, then the same testcase will not run out of memory. This leads us to believe that the editor instance is being leaked.
Created attachment 24562 [details] Test plugin that demostrates the memory leak
I see your code uses a multipage editor. Can you reproduce this using a direct subclass of EditorPart?
I'm going to try to reproduce right now.
Any relation to bug 102787?
This problem is reproducible on my Linux GTK+ box, but not if I convert the XSDEditor to a plain EditorPart. I'll trace the leak with OptimizeIt....
Same here. I converted it to EditorPart and the leak is gone.
I don't think this has much to do with the navigation history (Bug 102787). The navigation history holds on to the IEditorInput, not the part itself. The problem is that the WorkbenchPage is not calling dispose on an UndoActionHandler. This means that the UndoActionHandler remains in the part listener list. The UndoActionHandler holds on to the MultiPageEditorSite, which in turn holds on to the TextEditor created by XSDEditor. There are three things I think should be done: 1.) MultiPageEditorSite should null out its reference to multiPageEditor in the dispose method. 2.) OperationHistoryActionHandler should null out its reference to site in its dispose method. 3.) We must find out why dispose is not being called on OperationHistoryActionHandler and resolve the problem. If I were to guess, I'd bet that the dispose is never called because a particular event doesn't arrive for the nested part.
Okay, I'm done looking at this bug for now. I'll wait for Nick to ring in with his thoughts.
The dispose is called by the action handler itself when it receives the partClosed event. However it relies on the part parameter in partClosed to match the site's part. (see OperationHistoryActionHandler.PartListener.partClosed(IWorkbenchPart)): if (part.equals(site.getPart())) { dispose() } Is this a bad assumption for multi-page editors?
It's reasonable to assume that the part is 1:1 with its site, but the workbench does not know about nested parts and will not issue partClosed() for any nested editor. Looks like the undo/redo action handlers will need to be disposed explicitly by the editor itself. I agree that the site can let go of the editor when it is disposed. Stefan, it looks like PartSite will has the same problem. The part is assigned in the constructor but is not cleared in dispose(). IWorkbenchPartSite.getPart() (new in 3.1) does not specify that it can return null, but it probably should return null after the part and its site have been disposed (and the site should be disposed after the part).
In ViewReference.doDisposePart() and EditorReference.doDisposePart() the site and its services are being disposed before the part. This seems wrong, as the part's dispose() method may need to talk to the site or its services. At the least, (part.getSite().getPart() == part) should hold during dispose().
Susan, would you be able to change the action handler lifecycle so that the editors dispose the action explicitly?
if (part.equals(site.getPart())) { dispose(); } else if ((site instanceof MultiPageEditorSite) && (part.equals(((MultiPageEditorSite) site) .getMultiPageEditor()))) { dispose(); }
That would work, but it's yucky to introduce special case checks for the MPE case. We don't do this anywhere else in the workbench. It would be better if the action could get notified of disposal via a service directly on the part's site, rather than registering on the page's part service, which assumes that the part is top-level.
Perhaps we don't have special case for a MultiPageEditorPart yet, but we do have extensive special cases for MultiEditor. My thought was that Stefan's component framework would keep this class of problems from new code, while such a small fix would be a low risk way of addressing existing code. But either way. It doesn't really matter.
In response to comment #12, it's not a problem for me to change the action handler so that it doesn't rely on part events to dispose itself, but it would then be up to all clients to dispose. The current behavior was added due to bug #88352. See the discussion from bug #88352 comment #10 on. Do you think we should revert back to having clients do it? I agree that the comment #13 solution is kind of yucky. It seems odd for such a special case to be handled by an action handler that can be placed on any view or editor. (I'd feel better about the hack if the handler were editor- specific).
Sorry I didn't think of the nested part case when the part listener change for bug 88352 went in. Since we shipped 3.1 with the undo/redo action handlers disposing themselves, maybe it would be best to just use Doug's suggestion for now.
What about this as an alternative to Doug's suggestion. It's more clear what the underlying problem is (the site's part is null) but is a more general change (more risky?) public void partClosed(IWorkbenchPart part) { IWorkbenchPart currentPart = site.getPart(); // Dispose if the part is closing. The site's part may be null // in certain cases at this point, but this also means we should dispose. // See https://bugs.eclipse.org/bugs/show_bug.cgi?id=103379 if (part.equals(currentPart) || currentPart == null) { dispose(); } }
Never mind my last remark, that snippet was bogus. I was trying the different proposed fixes against the leak test and observing that they passed with the fix. However the leak test will pass without any changes to OperationHistoryActionHandler, as long as the MultiPageEditorSite change (nulling out multiPageEditor in dispose()) is made. I assume that Doug's proposed OperationHistoryActionHandler changes (the change in comment #13 and also nulling out the site in the dispose() method) are also needed based on the trace from OptimizeIt, even though the absence of them does not cause the supplied test case to fail. Doug, is this true? If so, I don't mind releasing the proposed changes to OperationHistoryActionHandler, but note that neither will make a difference until MultiPageEditorSite is also fixed.
Susan: The test case provided in comment #1 fails for me when run with the steps in the description. If you have been unable to reproduce the failure, could you please try so again. The (1) fix in comment #7 is only for robustness, and I feel it shouldn't be relied on for solving this kind of problem.
Doug - Here's what I observed yesterday (and verified again today) running the provided test case in a as a JUnit-Plugin test in a workspace with the latest from HEAD loaded. (I won't list all the plug-ins in my workspace, but can minimize my workspace to only the affected plug-ins if we believe this is affecting the difference in observations). - Latest code from HEAD unaltered - test case runs with java.lang.OutOfMemory Error - Apply the (1) fix you propose in comment #7 - test case passes, no errors - Apply the OperationHistoryActionHandler fixes - (2) fix from comment #7 and the fix from comment #13, without applying the (1) fix - test case runs with java.lang.OutOfMemoryError That's why I think the (1) fix must be made also, and why I can't prove with only the test case that the other two fixes are correct. If you are seeing different results I can debug this further. I don't have a problem with making the OperationHistoryActionHandler fixes, but need a way to prove they do something. I don't have OptimizeIt (yet) so I'm relying strictly on the test case. I realize that the nature of the failure (out of memory) is such that there could still be a leak and the test could pass if only a partial fix is made, but what I'm seeing is that the test will never pass without also making the MultiPageEditorSite fix. Are you seeing something different?
With only the patch suggested in comment #13, the leak is fixed. The other suggested fixes are only for robustness. The remaining problem can be traced back to the following missing lines from the test case (append to the openXSD method): Display display = workbenchWindow.getShell().getDisplay(); while (display.readAndDispatch());
Yes, the patch in comment 13 should suffice to fix the patch. If not, then the site is still being held onto. Sites should still release their reference to the part (and vice versa) when the part is disposed, but this requires a bit of tweaking to part lifecycle in PartSite, PartReference and subclasses, not just MultiPageEditorSite. I've filed bug 103810 for the site changes. Reassigning this one to Susan for the operation changes.
Doug, do you know why we need to spin the event loop for this to work? Lifecycle like disposing a part and the corresponding cleanup should not depend on having to spin the event loop.
The chain holding on to these instances is roughly as follows: Display.synchronizer -> UISynchronizer.messages -> RunnableLock -> AbstractTextEditor.TextListener.partSite -> MultiPageEditorSite.multiPageEditor Loose translation: The AbstractTextEditor has a listener for text changes that posts an asyncExec. When the part is created, this listener is attached. After the listener is attached, a text changed event is fired (initializeSourceViewer).
With Doug's change to the testcase, I can now verify that the OperationHistoryActionHandler fixes will prevent the java.lang.OutOfMemory error, so I've released the changes. Fixed >20050714. thanks, Doug, for tracking down and for making it simpler for me to verify.
Doug, Kim - I think this is a candidate for 3.1.1. Thoughts? (Would include the related fixes in bug #104657).
Nick says +1. This is good for 3.1.1.
committed change to R3_1_maintenance branch. Available >20050801
Verified in I20050808-2000 (3.2 M1 test pass) using the OpenEditorOAGISXSDTestcase with the event loop mods in comment #22. The test case passes with no out of memory error. Leaving this bug marked "fixed" (vs. "verified") so it can be verified again against 3.1.1 when the time comes
verified on 3.1.1 (M20050923-1430) on Linux Motif