Bug 129940 - Improve performance of split packages
Summary: Improve performance of split packages
Status: RESOLVED WONTFIX
Alias: None
Product: Equinox
Classification: Eclipse Project
Component: Framework (show other bugs)
Version: 3.2   Edit
Hardware: PC Windows XP
: P3 normal (vote)
Target Milestone: 3.3   Edit
Assignee: equinox.framework-inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords: performance
Depends on:
Blocks:
 
Reported: 2006-03-01 09:00 EST by Thomas Watson CLA
Modified: 2006-11-14 09:28 EST (History)
2 users (show)

See Also:


Attachments
proposed patch (3.42 KB, patch)
2006-03-01 10:05 EST, Thomas Watson CLA
no flags Details | Diff
Better patch (6.01 KB, patch)
2006-03-01 12:54 EST, Thomas Watson CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Thomas Watson CLA 2006-03-01 09:00:53 EST
The org.eclipse.core.runtime package has been split across many bundles in 3.2.  As a result we may be seeing "lots" of class load misses when loading classes from that package.  An investigation should be done to see if this is causing slowdown.

A possible improvement would be to consult each of the split package classloader caches for the class before consulting each classloader of a split package.
Comment 1 Thomas Watson CLA 2006-03-01 10:05:35 EST
Created attachment 35540 [details]
proposed patch

This patch will consult the class cache of each classloader on a split package before delegating to each one when loading a class.

Unfortunately I actually see a slowdown with this patch because of the extra synchronization done on the classloader to call findLoadedClass.  Removing the synchronization seems to improve proformance a little bit (in MultiSourcePackage#loadClass).  I have no hard numbers on the performance degrade/improvement.  This is purely off of the timing information that is displayed when tracing is enabled.  I think NOT synchronizing on the classloader could lead to problems.
Comment 2 Thomas Watson CLA 2006-03-01 12:54:47 EST
Created attachment 35551 [details]
Better patch

This patch is better.  It solves two issues with the current MultiSourcePackage implementation

- Currently MultiSourcePackage objects are not shared across bundles even when the suppliers are the same for each bundle.  This patch solves that by sharing common MultiSourcePackage objects when the sources are identical

- Each MultiSourcePackage keeps a cache of loaded classes so that each source does not need to be consulted if a class is loaded successfully.  The cache makes it very important that the MultiSourcePackage objects are shared otherwise the cache would be of no value and would just take up extra memory.

Sharing the MultiSourcePackage objects should help offset the amount of memory it will take to keep a cache.

Oleg, can you do some performance tests with this patch.  Thanks.
Comment 3 Oleg Besedin CLA 2006-03-01 14:34:26 EST
It looks like the "Better patch" makes no significant difference in the headless startup timing. (There might be a difference < 1% but that is hard to measure due to the high variablity of the startup time.)
Comment 4 Thomas Watson CLA 2006-03-01 14:59:44 EST
An actual UI application (SDK) startup may have better improvements because most every bundle in the SDK requires core.runtime.  Also ui has many split packages as well.

Also, I used a Hashtable for the class cache because of synchronization concerns in MultiSourcePackage.  I would like some review of that logic to see if there is a way to avoid synchronized code.
Comment 5 Jeff McAffer CLA 2006-03-01 23:28:27 EST
not sure that the explicit caching of loaded classes by the bundle loaders is a great idea.  As you point out, the synchronization has to be done.  The syncin on findLoadedClass can't be that bad because we are doing it already every time we consult a classloader.  On cache misses we will sync, on average, n + n/2 times where n = number of sources.  but on second load request/reference we will sync only n/2 times.  this is exactly the same number of times as without the cache checking.

It is possible that somehow the class cache data structure is not as effecient at lookup as a Hashtable but that is kinda hard ot believe.

is there a patch that does findLoadedClass and does NOT sync?  It would be interesting to try that against a non-trivial test case.

Also, do we have actual numbers for the lookup failures due to split packages?  If this is only a few then we are barking up the wrong tree.
Comment 6 Thomas Watson CLA 2006-04-09 17:10:33 EDT
Moving out of NEW inbox.

Oleg do we have any numbers that indicate the split packages are causing slowdown?  Unless it is significant I would prefer not to do this in 3.2.
Comment 7 Oleg Besedin CLA 2006-04-10 09:45:23 EDT
I would move this to 3.3.

"Merge" of the split packages didn't have any significant effect on the UI startup test; in the headless startup test the difference was about 35ms out of the about 700ms total startup time. 

Based on this data, I would say that split packages have some negative impact on performance, but probably not enough to warrant any risky changes. 
Comment 8 Thomas Watson CLA 2006-04-10 09:55:58 EDT
Moving to 3.3.  But we should exercise caution if no significant gain is achieved with the change.
Comment 9 Thomas Watson CLA 2006-11-14 09:28:53 EST
The performance gains are not significant and do not warrent the change.