Bug 150994 - Eclipse Update Site Manager Leaks File Handles
Summary: Eclipse Update Site Manager Leaks File Handles
Status: RESOLVED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Update (deprecated - use Eclipse>Equinox>p2)
Version: 3.2
Hardware: PC Windows XP
Importance: P3 critical
Target Milestone: 3.2.1
Assignee: Platform-Update-Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Duplicates: 107051 151251
Depends on:
Blocks:
 
Reported: 2006-07-18 14:57 EDT by Chris McGee CLA
Modified: 2008-08-11 14:35 EDT
CC List: 19 users

See Also:


Attachments
A quick fix for the problem we are observing (2.90 KB, patch)
2006-07-19 11:49 EDT, Chris McGee CLA
1089051361234 (deleted)
2007-11-21 06:47 EST, asarie CLA

Description Chris McGee CLA 2006-07-18 14:57:12 EDT
In a large product with thousands of plugins and hundreds of features we are noticing that file handles are getting leaked.

I am able to run one simple use case end-to-end with no problem. However, if I first do a Help -> Software Updates -> Manage Configuration after Eclipse starts up, the class loader then fails to load one of the plugin JARs in the use case.

I tried debugging the scenario and found that the class loader was throwing a ZipException with the description "Too Many Files Open ...". I ran the whole use case again with an exception breakpoint on ZipException and noticed the exact same sort of exception being thrown in the update site manager code before I even started my use case. A CoreException is thrown and logged, but evidently not all of the file handles are released, because the leak interferes with the use case that follows.

This problem is blocking us because we have code that uses the update site APIs to identify whether a particular feature is installed. Whenever this code gets executed, it causes other parts of the application to fail.
Comment 1 Eric Moffatt CLA 2006-07-18 15:17:22 EDT
McQ, you may know more about this issue than anybody available now...I'm adding you to the CC list just as a 'heads up'.

Comment 2 Mike Wilson CLA 2006-07-18 15:31:48 EDT
I'm not sure what's going on here, but we've seen similar issues before. Adding John and Pascal to widen the visibility.
Comment 3 Vishy Ramaswamy CLA 2006-07-18 16:06:46 EDT
All fundamental modeling scenarios in our products are blocked by this defect. Requesting to fix this in 3.2.1.
Comment 4 John Arthorne CLA 2006-07-18 17:28:15 EDT
I can confirm that there is a significant leak when opening the "Manage Configuration" window. In fact, the window itself, and the view, tree viewer, and all other widgets within it are leaked. If I open/close this window ten times in a profiler, I see 70 classes with exactly ten extra instances, and another 30 or so with more than 10 leaked instances.  From looking at a few of the objects, the reference graphs mostly lead back to a static field on InternalSiteManager.  Here is an allocation trace of the "ConfigurationView" object, as a representative example:

ConfigurationView
 Object[]
  listeners of org.eclipse.update.internal.core.ListenerList
   listeners of org.eclipse.update.internal.core.LocalSite
    static variable of org.eclipse.update.internal.core.InternalSiteManager

However, this doesn't look like the cause of leaking file handles.
Comment 5 Dejan Glozic CLA 2006-07-18 21:25:49 EDT
(In reply to comment #4)
> ConfigurationView
>  Object[]
>   listeners of org.eclipse.update.internal.core.ListenerList
>    listeners of org.eclipse.update.internal.core.LocalSite
>     static variable of org.eclipse.update.internal.core.InternalSiteManager

It smells like some part of the config view is not removing itself as a listener somewhere.
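
As a rough illustration of the pattern suspected here, the sketch below shows a view that registers itself with a statically reachable object and unregisters on dispose. SiteModel, ConfigView and ListenerLeakDemo are hypothetical stand-ins, not the actual update classes; the real listener API may look different.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a long-lived site object that keeps a listener list. */
class SiteModel {
    interface ChangeListener { void siteChanged(); }

    private final List<ChangeListener> listeners = new ArrayList<>();

    void addListener(ChangeListener l) { listeners.add(l); }
    void removeListener(ChangeListener l) { listeners.remove(l); }
    int listenerCount() { return listeners.size(); }
}

/** Hypothetical view that registers itself and must unregister when disposed. */
class ConfigView implements SiteModel.ChangeListener {
    private final SiteModel site;

    ConfigView(SiteModel site) {
        this.site = site;
        site.addListener(this);     // the site now holds a strong reference to this view
    }

    @Override
    public void siteChanged() { /* refresh widgets */ }

    void dispose() {
        site.removeListener(this);  // without this call the view stays reachable from the site
    }
}

class ListenerLeakDemo {
    // Static root, playing the role of the static field at the bottom of the trace.
    static final SiteModel SITE = new SiteModel();

    public static void main(String[] args) {
        for (int i = 0; i < 10; i++) {
            ConfigView view = new ConfigView(SITE);
            view.dispose();
        }
        System.out.println("listeners still registered: " + SITE.listenerCount());
    }
}
```

If dispose() skipped the removeListener call, each opened view would remain in the listener list, which would match the "exactly ten extra instances" observation in comment #4.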
Comment 6 Mike Wilson CLA 2006-07-19 08:05:48 EDT
Since I'm about to go on vacation, I can't track the progress on this. John/Dejan, you need to follow this far enough to at least find a real owner for it, and make sure that it gets fixed.

Note that the original report is about _file_handles_. Other leaks may be related, but aren't the primary cause of the blockage.
Comment 7 Chris McGee CLA 2006-07-19 08:37:03 EDT
I just found something very interesting. There is a private static list in JarContentReference called referenceList that contains every instance of JarContentReference ever constructed. Each instance holds on to a File object, which could be causing the file handle leak. The parent class, ContentReference, has no such static list and this list would only be used for JAR'ed plugins, which is new since the last time that we released our product with the thousands of plugins in it.

Here's the only method that actually reads the referenceList (the list is private to this class, so it's the only one that could ever have access to it):

/**
 * Perform shutdown processing for jar archive handling.
 * This method is called when platform is shutting down.
 * It is not intended to be called at any other time under
 * normal circumstances. A side-effect of calling this method
 * is that all jars referenced by JarContentReferences are closed.
 * 
 * @since 2.0
*/
public static void shutdown() {
	for (int i = 0; i < referenceList.size(); i++) {
		JarContentReference ref = (JarContentReference) referenceList.get(i);
		try {
			ref.closeArchive(); // ensure we are not leaving open jars
		} catch (IOException e) {
			// we tried, nothing we can do ...
		}
	}
}

I see no factory methods that would allow reuse of the same JarContentReference objects, so this referenceList grows every time someone grabs the local site and therefore leaks more file handles.
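
To make the growth concrete, here is a minimal sketch of the pattern described above, using a hypothetical TrackedRef class in place of JarContentReference. It only models the bookkeeping (every constructor call appends to a static list that is never pruned); it does not open any files.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a class that records every instance in a static list. */
class TrackedRef {
    // Every instance is added here and nothing is ever removed,
    // so no instance can become unreachable while the class is loaded.
    private static final List<TrackedRef> REFERENCE_LIST = new ArrayList<>();

    TrackedRef() {
        REFERENCE_LIST.add(this);
    }

    static int tracked() {
        return REFERENCE_LIST.size();
    }
}

class StaticListDemo {
    /** Simulates one lookup that creates a fresh reference per JAR. */
    static int simulateLocalSiteLookup(int jarCount) {
        for (int i = 0; i < jarCount; i++) {
            new TrackedRef();
        }
        return TrackedRef.tracked();
    }

    public static void main(String[] args) {
        System.out.println("after first lookup:  " + simulateLocalSiteLookup(1000));
        System.out.println("after second lookup: " + simulateLocalSiteLookup(1000));
    }
}
```

Because the list holds strong references, anything each instance keeps open stays open until something walks the list and closes it explicitly.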
Comment 8 Chris McGee CLA 2006-07-19 11:48:14 EDT
This has been confirmed as the source of the file handle leak.

Java's ZipFile has a close() method that closes all of the input streams that have been opened for entries in that zip file and releases the primary file descriptor. close() can also be called as a result of garbage collection (from finalize()), but update core maintains this static list, which keeps the instances reachable and prevents that from ever happening.

I have tried hacking the update core code so that we close all of these JarContentReference objects' JarFiles after parsing the manifest and plugin.xml. This completely solves our problem and the manage configuration dialog actually shows some content. The patch will be attached shortly.
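
The general idea of closing each JAR as soon as its manifest has been read can be sketched as follows. This is not the attached patch; ManifestPeek and its methods are hypothetical names, and the sketch uses try-with-resources, which requires a newer Java than the one Eclipse 3.2 targets.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

class ManifestPeek {
    /** Reads one main manifest attribute, then closes the JAR before returning. */
    static String readMainAttribute(File jar, String name) throws IOException {
        try (JarFile jarFile = new JarFile(jar)) {
            Manifest manifest = jarFile.getManifest();   // null if the JAR has no manifest
            return manifest == null ? null : manifest.getMainAttributes().getValue(name);
        } // jarFile.close() runs here, even if reading the manifest throws
    }

    /** Builds a throwaway JAR containing only a manifest and reads it back. */
    static String demo() {
        try {
            File jar = File.createTempFile("demo-plugin", ".jar");
            jar.deleteOnExit();
            Manifest manifest = new Manifest();
            manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
            manifest.getMainAttributes().putValue("Bundle-SymbolicName", "com.example.demo");
            try (JarOutputStream out = new JarOutputStream(new FileOutputStream(jar), manifest)) {
                // manifest only, no further entries needed for this sketch
            }
            return readMainAttribute(jar, "Bundle-SymbolicName");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The trade-off is that a JAR has to be reopened if it is needed again later, which is the performance concern raised further down in this bug.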

I don't know why this static list is being kept. It must (have) serve(d) some useful purpose.
Comment 9 Chris McGee CLA 2006-07-19 11:49:42 EDT
Created attachment 46510
A quick fix for the problem we are observing

This patch is far from complete but it illustrates how I was able to coerce the update manager to free up its OS file handles.
Comment 10 John Arthorne CLA 2006-07-19 17:55:32 EDT
Chris, this seems to be a misuse of the API on your part.  If you are programmatically creating JarContentReference instances, and not later calling closeArchive(), then it will certainly leak a file.  If I open the "Manage Configuration" window and then close it again, I don't see a leak of JarFile instances because the JARs are later closed.  I see from browsing the update code that JarContentReference.shutdown() is used in various places to avoid these file leaks.  It is possible that this shutdown() method has an inappropriate name and should be specified as a general method that clients can use to clean up cached jar files when they are done with them. If you call JarContentReference.shutdown() in your use case at the end of any update work, it should avoid the leak.

Note also that the update API, although it has been around for a few years, is really in an interim, embryonic form, and is not really battle-hardened and recommended for general consumption. It would be best if you could avoid using this API entirely.
Comment 11 Chris McGee CLA 2006-07-20 09:05:55 EDT
John,

I am not using the JarContentReference class at all on my side. In the provided patch I had fixed up the update site code so that it doesn't keep all of these references in that static list and calls close on each JarFile (ZipFile) to allow all of the OS file handles to get cleaned up. I am sure that there is some good reason for the static list and the shutdown() method, but I know that removing these things fixes the problems that we are encountering.

For the purpose of testing this defect, I have removed any use of the update site manager public APIs (SiteManager, ILocalSite, etc.) that my component is using. All that I am doing right now is opening the "Manage Configuration" dialog before executing my use case (simulating the effect of my code that looks for a particular installed feature using public APIs), and class loaders start failing to load our plugin JARs. At the very least the "Manage Configuration" dialog is not working properly, because it doesn't show any of the hundreds of features we have currently installed.

Is there another way that I can discover an installed feature other than using the SiteManager APIs? If there is a more stable API then I could migrate my code to use that instead.
Comment 12 Chris McGee CLA 2006-07-21 10:46:40 EDT
I am changing this bugzilla from a blocker to just critical. 

We no longer depend on the update site manager for our use case. However, the original defect still stands because someone could open up the "Manage Configuration" dialog and disrupt many of our use cases because of class loader exceptions.
Comment 13 John Arthorne CLA 2006-07-21 16:07:32 EDT
This needs more investigation. I wasn't able to find a file leak from repeatedly opening the Manage Configuration dialog.
Comment 14 Chris McGee CLA 2006-07-21 17:03:52 EDT
I have dug into the JDK code a little bit to try to understand how file handles (descriptors) are consumed in Java. One way to consume file handles is, of course, to construct a FileInputStream object. The constructor for this class actually constructs a FileDescriptor object. That file descriptor class appears to employ a simple reference counting scheme to keep track of all of the different streams that may have opened this file before allowing the OS to reclaim the file handle.

The JDK ZipFile class (update manager code uses a subclass called JarFile) makes no use of this FileDescriptor class. Instead, it has a lot of native calls, including this mysterious native close(long jzfile) method.

Is it possible that the way the native code handles zips in Java is confusing your file handle analyzer?
Comment 15 Steven Wasleski CLA 2006-07-27 12:06:56 EDT
I believe Kit and I are seeing this same problem while in TVT for a large project.   There are many plugins and nl fragments loaded.  We only see this on Linux.  All we have to do is open the Manage Configuration dialog and we see loads of errors, not because the configuration is bad but because the SiteFileFactory.parsePackagedPlugins method cannot open all the jars.  A ZipException with the message "Too many open files..." is thrown.  Note that this exception is NOT being logged.  It probably should be.

Also, note that we did not see this during TVT of a project roughly the same size as Callisto.  It appears a larger project built on top of most of Callisto is required to trigger the error.  Contact Kit for a testcase.
Comment 16 Steven Wasleski CLA 2006-08-09 16:56:21 EDT
Kit, have you been contacted for the testcase?

John, getting this one fixed is critical for any large products based on Eclipse that want to run on Linux.
Comment 17 Kit Lo CLA 2006-08-09 17:00:08 EDT
No one contacted me for the testcase.
Comment 18 John Arthorne CLA 2006-08-09 17:04:21 EDT
Branko, have you looked at this?
Comment 19 Branko Tripkovic CLA 2006-08-11 10:39:56 EDT
I am looking at the patch, and it looks like it will introduce a performance hit; I am also still not sure it is safe. A better way would be to limit the size of the cache (to something less than the default allowed number of file handles per process, usually 1024) rather than to eliminate it completely. Also, for the Linux problem there is an easy workaround: increase the number of allowed file handles per process (as far as I remember this works on all Unixes, such as Solaris and AIX, but it's been almost 2 years since I worked with them, so I am not 100% sure). This can be done as part of the installation process.
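
One way such a size-limited cache could look is sketched below, using java.util.LinkedHashMap in access order so that the least recently used handle is closed and evicted once a limit is exceeded. BoundedHandleCache is a hypothetical name, and nothing here is taken from the committed fix; the limit of 2 in the demo is only for illustration.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** LRU cache that closes a handle when it is evicted. Not thread-safe. */
class BoundedHandleCache<K, V extends Closeable> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxOpen;

    BoundedHandleCache(int maxOpen) {
        super(16, 0.75f, true);   // true = iterate in access order, eldest first
        this.maxOpen = maxOpen;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        if (size() <= maxOpen) {
            return false;         // still under the limit, keep everything
        }
        try {
            eldest.getValue().close();   // release the OS handle before dropping the entry
        } catch (IOException e) {
            // best effort: the entry is evicted either way
        }
        return true;
    }
}

class BoundedHandleCacheDemo {
    public static void main(String[] args) {
        BoundedHandleCache<String, Closeable> cache = new BoundedHandleCache<>(2);
        cache.put("a.jar", () -> System.out.println("closed a.jar"));
        cache.put("b.jar", () -> System.out.println("closed b.jar"));
        cache.get("a.jar");                       // a.jar becomes most recently used
        cache.put("c.jar", () -> System.out.println("closed c.jar"));
        System.out.println("cached: " + cache.keySet());
    }
}
```

A caller that still holds an evicted handle would find it closed, so a real implementation would need some way to reopen on demand or to pin handles that are in use.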
Comment 20 Branko Tripkovic CLA 2006-08-11 11:16:52 EDT
*** Bug 151251 has been marked as a duplicate of this bug. ***
Comment 21 Chris McGee CLA 2006-08-22 15:01:37 EDT
Is there any workaround for Windows?

The problem was originally discovered on Windows.
Comment 22 Branko Tripkovic CLA 2006-08-22 15:09:47 EDT
I do not know how to increase the number of handles on Windows, but I will commit the patch today, so if you plan to be based on 3.2.1, this might be good enough.
Comment 23 Branko Tripkovic CLA 2006-08-22 15:58:11 EDT
committed.
Comment 24 Barys Dubauski CLA 2006-08-24 19:19:45 EDT
Here is an update on the patch behaviour. I've picked the bits from 08/24/2006 Eclipse nightly build. Running this on Linux RHD4.

I'm still getting problems launching an Eclipse that has a large number of plugins. Our setup currently has approximately 2500 plugins and fragments. Here is the exception I'm getting:

!ENTRY system.bundle 4 0 2006-08-24 16:00:14.270
!MESSAGE FrameworkEvent.ERROR
!STACK 0
java.util.zip.ZipException: Too many open files /home/eclipse/plugins/com.ibm.ccl.mapping.codegen.xslt.ui_1.0.0.v20060802.jar
        at java.util.zip.ZipFile.open(Native Method)
        at java.util.zip.ZipFile.<init>(ZipFile.java:238)
        at java.util.zip.ZipFile.<init>(ZipFile.java:268)
        at org.eclipse.osgi.framework.util.SecureAction.getZipFile(SecureAction.java:226)
        at org.eclipse.osgi.baseadaptor.bundlefile.ZipBundleFile.basicOpen(ZipBundleFile.java:79)
        at org.eclipse.osgi.baseadaptor.bundlefile.ZipBundleFile.getZipFile(ZipBundleFile.java:92)
        at org.eclipse.osgi.baseadaptor.bundlefile.ZipBundleFile.checkedOpen(ZipBundleFile.java:65)
        at org.eclipse.osgi.baseadaptor.bundlefile.ZipBundleFile.getEntry(ZipBundleFile.java:234)
        at com.ibm.cds.CDSBundleFile.getEntry(CDSBundleFile.java:83)
        at org.eclipse.osgi.baseadaptor.BaseData.getEntry(BaseData.java:93)
        at org.eclipse.osgi.internal.baseadaptor.AdaptorUtil.loadManifestFrom(AdaptorUtil.java:189)
        at org.eclipse.core.runtime.internal.adaptor.EclipseStorageHook.getGeneratedManifest(EclipseStorageHook.java:294)
        at org.eclipse.core.runtime.internal.adaptor.EclipseStorageHook.createCachedManifest(EclipseStorageHook.java:290)
        at org.eclipse.core.runtime.internal.adaptor.EclipseStorageHook.getManifest(EclipseStorageHook.java:395)
        at org.eclipse.osgi.internal.baseadaptor.BaseStorage.loadManifest(BaseStorage.java:247)
        at org.eclipse.osgi.internal.baseadaptor.BundleInstall.begin(BundleInstall.java:82)
        at org.eclipse.osgi.framework.internal.core.Framework.installWorkerPrivileged(Framework.java:823)
        at org.eclipse.osgi.framework.internal.core.Framework$2.run(Framework.java:739)
        at java.security.AccessController.doPrivileged(AccessController.java:242)
        at org.eclipse.osgi.framework.internal.core.Framework.installWorker(Framework.java:790)
        at org.eclipse.osgi.framework.internal.core.Framework.installBundle(Framework.java:734)
        at org.eclipse.osgi.framework.internal.core.BundleContextImpl.installBundle(BundleContextImpl.java:221)
        at org.eclipse.update.internal.configurator.ConfigurationActivator.installBundles(ConfigurationActivator.java:197)
        at org.eclipse.update.internal.configurator.ConfigurationActivator.start(ConfigurationActivator.java:82)
        at org.eclipse.osgi.framework.internal.core.BundleContextImpl$2.run(BundleContextImpl.java:995)
        at java.security.AccessController.doPrivileged(AccessController.java:242)
        at org.eclipse.osgi.framework.internal.core.BundleContextImpl.startActivator(BundleContextImpl.java:989)
        at org.eclipse.osgi.framework.internal.core.BundleContextImpl.start(BundleContextImpl.java:970)
        at org.eclipse.osgi.framework.internal.core.BundleHost.startWorker(BundleHost.java:317)
        at org.eclipse.osgi.framework.internal.core.AbstractBundle.resume(AbstractBundle.java:329)
        at org.eclipse.osgi.framework.internal.core.Framework.resumeBundle(Framework.java:1037)
        at org.eclipse.osgi.framework.internal.core.StartLevelManager.resumeBundles(StartLevelManager.java:573)
        at org.eclipse.osgi.framework.internal.core.StartLevelManager.incFWSL(StartLevelManager.java:495)
        at org.eclipse.osgi.framework.internal.core.StartLevelManager.doSetStartLevel(StartLevelManager.java:275)
        at org.eclipse.osgi.framework.internal.core.StartLevelManager.dispatchEvent(StartLevelManager.java:455)
        at org.eclipse.osgi.framework.eventmgr.EventManager.dispatchEvent(EventManager.java:189)
        at org.eclipse.osgi.framework.eventmgr.EventManager$EventThread.run(EventManager.java:291)


Note that the plugin name mentioned in the first line of the exception is a random one, which means that this is an 'out of file handles' issue and not an issue with a specific plugin/fragment.

Is there anything else that could be done to fix this problem on Linux?
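
For anyone trying to confirm that the process is actually running out of descriptors, a small probe like the one below can report the current and maximum counts. It assumes a JVM that exposes com.sun.management.UnixOperatingSystemMXBean (HotSpot/OpenJDK on Unix-like systems do); other JVMs, and any JVM on Windows, may not, in which case the method returns null. FdUsage is a hypothetical helper name.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

class FdUsage {
    /**
     * Returns {open, max} file descriptor counts for this JVM process,
     * or null when the platform bean does not expose them.
     */
    static long[] snapshot() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] counts = snapshot();
        if (counts == null) {
            System.out.println("file descriptor counts are not available on this JVM");
        } else {
            System.out.println("open descriptors: " + counts[0] + " of " + counts[1]);
        }
    }
}
```

Calling snapshot() before and after an operation such as opening the Manage Configuration dialog would show whether the open count keeps climbing.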
Comment 25 asarie CLA 2007-11-21 06:47:38 EST
Created attachment 83419
1089051361234

my carbide ui can not run
Comment 26 Denis Roy CLA 2007-11-21 13:10:53 EST
The content of attachment 83419 has been deleted by Denis Roy, who provided the following reason:

Garbage. See bug 210560

The token used to delete this attachment was generated at 2007-11-21 13:10:37 -0400.
Comment 27 John Arthorne CLA 2008-08-11 14:35:26 EDT
*** Bug 107051 has been marked as a duplicate of this bug. ***