Bug 308157 - 10x slower bundle classpath resolution in Eclipse 3.5.2+ over 3.4.2-
Summary: 10x slower bundle classpath resolution in Eclipse 3.5.2+ over 3.4.2-
Status: RESOLVED WORKSFORME
Alias: None
Product: Equinox
Classification: Eclipse Project
Component: Framework (show other bugs)
Version: 3.6   Edit
Hardware: PC Windows 7
: P3 normal (vote)
Target Milestone: ---   Edit
Assignee: equinox.framework-inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords: performance
Depends on:
Blocks:
 
Reported: 2010-04-06 04:04 EDT by Wendell Beckwith CLA
Modified: 2010-04-15 04:34 EDT (History)
3 users (show)

See Also:


Attachments
New load times after moving hibernate jars to front on classpath (2.63 KB, text/plain)
2010-04-06 04:12 EDT, Wendell Beckwith CLA
no flags Details
Manifest for the platform plug-in (86.94 KB, application/octet-stream)
2010-04-06 04:28 EDT, Wendell Beckwith CLA
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Wendell Beckwith CLA 2010-04-06 04:04:10 EDT
Build Identifier:  I20100312-1448

We have a plug-in offering that has been stuck at Eclipse 3.3.2 due to a customer's issue.  The customer is now moving to Eclipse 3.5.2 and we now see startup performance that is on average 10x slower with 3.5.2 over 3.3.2.  I downloaded every released SDK from Eclipse 3.3.2 to 3.5.2 and even tried with 3.6M6, and after multiple runs, the results still hold that startup performance is roughly 10x slower.

The following results where achieve by only setting the target platform to each of the SDK downloads.

#
# Eclipse 3.3.2 SDK
#
LISA_HOME set to C:\Lisa\
Total JPA Providers Configure Time (ms): 902
JPA Providers DB Start Time (ms): 366
Total JPA Providers DB Start Time (ms): 1268
JPA Providers Rebuild Session Time (ms): 1610
Total JPA Providers Rebuild Session Time (ms): 2878
JPA Providers Validate Schema Time (ms): 169
Total JPA Providers Validate Schema Time (ms): 3047
*** loading data
JPA Providers Load Data Time (ms): 727
Total JPA Providers Load Data Time (ms): 3774
Total JPA Providers Load Time (ms): 3774

#
# Eclipse 3.4.0 SDK
#
LISA_HOME set to C:\Lisa\
Total JPA Providers Configure Time (ms): 892
JPA Providers DB Start Time (ms): 372
Total JPA Providers DB Start Time (ms): 1264
JPA Providers Rebuild Session Time (ms): 2114
Total JPA Providers Rebuild Session Time (ms): 3378
JPA Providers Validate Schema Time (ms): 725
Total JPA Providers Validate Schema Time (ms): 4103
*** loading data
JPA Providers Load Data Time (ms): 1152
Total JPA Providers Load Data Time (ms): 5255
Total JPA Providers Load Time (ms): 5255

#
# Eclipse 3.4.1 SDK
#
LISA_HOME set to C:\Lisa\
Total JPA Providers Configure Time (ms): 1359
JPA Providers DB Start Time (ms): 322
Total JPA Providers DB Start Time (ms): 1681
JPA Providers Rebuild Session Time (ms): 1558
Total JPA Providers Rebuild Session Time (ms): 3239
JPA Providers Validate Schema Time (ms): 452
Total JPA Providers Validate Schema Time (ms): 3691
*** loading data
JPA Providers Load Data Time (ms): 402
Total JPA Providers Load Data Time (ms): 4093
Total JPA Providers Load Time (ms): 4093

#
# Eclipse 3.4.2 SDK
#
LISA_HOME set to C:\Lisa\
Total JPA Providers Configure Time (ms): 1230
JPA Providers DB Start Time (ms): 284
Total JPA Providers DB Start Time (ms): 1514
JPA Providers Rebuild Session Time (ms): 1824
Total JPA Providers Rebuild Session Time (ms): 3338
JPA Providers Validate Schema Time (ms): 159
Total JPA Providers Validate Schema Time (ms): 3497
*** loading data
JPA Providers Load Data Time (ms): 411
Total JPA Providers Load Data Time (ms): 3908
Total JPA Providers Load Time (ms): 3908

#
# Eclipse 3.5.0 SDK
#
LISA_HOME set to C:\Lisa\
Total JPA Providers Configure Time (ms): 12338
JPA Providers DB Start Time (ms): 2497
Total JPA Providers DB Start Time (ms): 14835
JPA Providers Rebuild Session Time (ms): 18635
Total JPA Providers Rebuild Session Time (ms): 33470
JPA Providers Validate Schema Time (ms): 3292
Total JPA Providers Validate Schema Time (ms): 36762
*** loading data
JPA Providers Load Data Time (ms): 8164
Total JPA Providers Load Data Time (ms): 44926
Total JPA Providers Load Time (ms): 44926

#
# Eclipse 3.5.1 SDK
#
LISA_HOME set to C:\Lisa\
Total JPA Providers Configure Time (ms): 19910
JPA Providers DB Start Time (ms): 2651
Total JPA Providers DB Start Time (ms): 22561
JPA Providers Rebuild Session Time (ms): 25876
Total JPA Providers Rebuild Session Time (ms): 48437
JPA Providers Validate Schema Time (ms): 4176
Total JPA Providers Validate Schema Time (ms): 52613
*** loading data
JPA Providers Load Data Time (ms): 11439
Total JPA Providers Load Data Time (ms): 64052
Total JPA Providers Load Time (ms): 64052

#
# Eclipse 3.5.2 SDK
#
LISA_HOME set to C:\Lisa\
Total JPA Providers Configure Time (ms): 16077
JPA Providers DB Start Time (ms): 2634
Total JPA Providers DB Start Time (ms): 18711
JPA Providers Rebuild Session Time (ms): 20225
Total JPA Providers Rebuild Session Time (ms): 38936
JPA Providers Validate Schema Time (ms): 3264
Total JPA Providers Validate Schema Time (ms): 42200
*** loading data
JPA Providers Load Data Time (ms): 9317
Total JPA Providers Load Data Time (ms): 51517
Total JPA Providers Load Time (ms): 51517

#
# Eclipse 3.6M6 SDK
#
LISA_HOME set to C:\Lisa\
Total JPA Providers Configure Time (ms): 16595
JPA Providers DB Start Time (ms): 2611
Total JPA Providers DB Start Time (ms): 19206
JPA Providers Rebuild Session Time (ms): 22332
Total JPA Providers Rebuild Session Time (ms): 41538
JPA Providers Validate Schema Time (ms): 3550
Total JPA Providers Validate Schema Time (ms): 45088
*** loading data
JPA Providers Load Data Time (ms): 8235
Total JPA Providers Load Data Time (ms): 53323
Total JPA Providers Load Time (ms): 53323


Reproducible: Always
Comment 1 Wendell Beckwith CLA 2010-04-06 04:12:54 EDT
Created attachment 163867 [details]
New load times after moving hibernate jars to front on classpath

After identifying what seems to be class load issues with searching through a biggee of a plug-in, I made a single change where I move the hibernate jars and their dependencies to the front of the classpath.  This dropped approx. 30 seconds off the load times as seen in the loadtimes2.txt attachment.
Comment 2 Wendell Beckwith CLA 2010-04-06 04:28:22 EDT
Created attachment 163872 [details]
Manifest for the platform plug-in

The plug-in basically wraps the classpath for a standalone java application which must be isolated and exports its packages to other plug-ins that integrate the java application into the workbench.
Comment 3 Thomas Watson CLA 2010-04-06 12:32:00 EDT
I noticed that you are doing buddy class loading (Eclipse-BuddyPolicy: registered).  There was a change to fix some threading issues in buddy class loading in bug 254542.  Perhaps this fix is adding to your slowdown.  It seem unlikely though since I would not expect class path order cause more slowdown because of the buddy class loading fix.

If you have any profile data available between 3.4 vs 3.5 that would help.
Comment 4 Thomas Watson CLA 2010-04-06 14:45:58 EDT
I have identified one bottleneck that has changed from 3.4 to 3.5.  We changed the default value of the osgi.bundlefile.limit configuration property to be 100.  This limits the number of bundle files we keep open at a time in the framework.  The default in 3.5 is to only keep 100 bundle files open at a time.  Your bundle has well over 300 jars on its class path :-)

It is easy to see how a value of 100 would be bad for performance for you.  Since you have a huge flat bundle class path you will get constant thrashing of zip files being open/closed over and over just to search through the bundle classpath for a single class/resource.  And if the resource does not even exist then you have to go through the complete classpath.  This will result in over 200 zip files being open/closed for each class/resource load from your bundle.

Try setting osgi.bundlefile.limit=1000 in your config.ini to see if it makes a difference.  You can also try setting it to zero osgi.bundlefile.limit=0 to completely disable the bundle file limit.
Comment 5 Wendell Beckwith CLA 2010-04-06 16:26:44 EDT
(In reply to comment #4)
> I have identified one bottleneck that has changed from 3.4 to 3.5.  We changed
> the default value of the osgi.bundlefile.limit configuration property to be
> 100.  This limits the number of bundle files we keep open at a time in the
> framework.  The default in 3.5 is to only keep 100 bundle files open at a time.
>  Your bundle has well over 300 jars on its class path :-)
> 
> It is easy to see how a value of 100 would be bad for performance for you. 
> Since you have a huge flat bundle class path you will get constant thrashing of
> zip files being open/closed over and over just to search through the bundle
> classpath for a single class/resource.  And if the resource does not even exist
> then you have to go through the complete classpath.  This will result in over
> 200 zip files being open/closed for each class/resource load from your bundle.
> 
> Try setting osgi.bundlefile.limit=1000 in your config.ini to see if it makes a
> difference.  You can also try setting it to zero osgi.bundlefile.limit=0 to
> completely disable the bundle file limit.

I set the osgi.bundle.limit=1000 as a vm arg and the startup was approx. 9.1 secs.  I then reran it about 12 times in run as well as debug mode and the times were on average like the following:

LISA_HOME set to C:\Lisa\
Total JPA Providers Configure Time (ms): 936
JPA Providers DB Start Time (ms): 280
Total JPA Providers DB Start Time (ms): 1216
JPA Providers Rebuild Session Time (ms): 1601
Total JPA Providers Rebuild Session Time (ms): 2817
JPA Providers Validate Schema Time (ms): 176
Total JPA Providers Validate Schema Time (ms): 2993
*** loading data
JPA Providers Load Data Time (ms): 357
Total JPA Providers Load Data Time (ms): 3350
Total JPA Providers Load Time (ms): 3350


I did have a couple runs where the times were like 15 secs, but on those runs the entire workbench was slow to load even before it ever got to me trying to open the editor to trigger plug-in activation so I threw those values out.
Comment 6 Thomas Watson CLA 2010-04-06 16:58:48 EDT
(In reply to comment #5)
> I set the osgi.bundle.limit=1000 as a vm arg and the startup was approx. 9.1
> secs.  I then reran it about 12 times in run as well as debug mode and the
> times were on average like the following:
> 
> LISA_HOME set to C:\Lisa\
> Total JPA Providers Configure Time (ms): 936
> JPA Providers DB Start Time (ms): 280
> Total JPA Providers DB Start Time (ms): 1216
> JPA Providers Rebuild Session Time (ms): 1601
> Total JPA Providers Rebuild Session Time (ms): 2817
> JPA Providers Validate Schema Time (ms): 176
> Total JPA Providers Validate Schema Time (ms): 2993
> *** loading data
> JPA Providers Load Data Time (ms): 357
> Total JPA Providers Load Data Time (ms): 3350
> Total JPA Providers Load Time (ms): 3350
> 
> 
> I did have a couple runs where the times were like 15 secs, but on those runs
> the entire workbench was slow to load even before it ever got to me trying to
> open the editor to trigger plug-in activation so I threw those values out.

Just to be sure, this is about the same as the run on 3.2.2, correct?

#
# Eclipse 3.3.2 SDK
#
LISA_HOME set to C:\Lisa\
Total JPA Providers Configure Time (ms): 902
JPA Providers DB Start Time (ms): 366
Total JPA Providers DB Start Time (ms): 1268
JPA Providers Rebuild Session Time (ms): 1610
Total JPA Providers Rebuild Session Time (ms): 2878
JPA Providers Validate Schema Time (ms): 169
Total JPA Providers Validate Schema Time (ms): 3047
*** loading data
JPA Providers Load Data Time (ms): 727
Total JPA Providers Load Data Time (ms): 3774
Total JPA Providers Load Time (ms): 3774

If so, can we close this as worksforme (as long as you set the osgi.bundlefile.limit to the appropriate number for your application)?  I think 100 is a reasonable default for a vast majority of Equinox applications.
Comment 7 Wendell Beckwith CLA 2010-04-07 01:40:44 EDT
(In reply to comment #6)
> 
> Just to be sure, this is about the same as the run on 3.2.2, correct?
> 
> #
> # Eclipse 3.3.2 SDK
> #
> LISA_HOME set to C:\Lisa\
> Total JPA Providers Configure Time (ms): 902
> JPA Providers DB Start Time (ms): 366
> Total JPA Providers DB Start Time (ms): 1268
> JPA Providers Rebuild Session Time (ms): 1610
> Total JPA Providers Rebuild Session Time (ms): 2878
> JPA Providers Validate Schema Time (ms): 169
> Total JPA Providers Validate Schema Time (ms): 3047
> *** loading data
> JPA Providers Load Data Time (ms): 727
> Total JPA Providers Load Data Time (ms): 3774
> Total JPA Providers Load Time (ms): 3774
> 
> If so, can we close this as worksforme (as long as you set the
> osgi.bundlefile.limit to the appropriate number for your application)?  I think
> 100 is a reasonable default for a vast majority of Equinox applications.

Execution times are back to Eclipse 3.3/3.4 levels and this is great.  This will help us when we move to using an RCP if we haven't broken up the platform plug-in before then.  However, before we close this out can anyone shed light on why the value was changed?  Is there a bugzilla that deals with the osgi.bundle.limit option?  

When you say you think "100 is a reasonable default for a vast majority of Equinox applications" what do you mean?  How is this valued applied within the framework because as an example, Eclipse 3.5.2 has 366 plug-ins?

A lot of questions I know, but I plan to update my newsgroup post with the relevant info so Google and the like can index it for others to find.
Comment 8 Thomas Watson CLA 2010-04-07 09:15:54 EDT
(In reply to comment #7)
> Execution times are back to Eclipse 3.3/3.4 levels and this is great.  This
> will help us when we move to using an RCP if we haven't broken up the platform
> plug-in before then.  However, before we close this out can anyone shed light
> on why the value was changed?  Is there a bugzilla that deals with the
> osgi.bundle.limit option?  
> 
> When you say you think "100 is a reasonable default for a vast majority of
> Equinox applications" what do you mean?  How is this valued applied within the
> framework because as an example, Eclipse 3.5.2 has 366 plug-ins?

Please see bug 234069 for the discussion.  As to Eclipse's 366 bundles, half of these are source bundles that never get loaded.  The other half are either single jars or have only a few jars (2 - 3) on their bundle classpath.  When loading a single resource/class for a client bundle in OSGi we find a list of providers for the package the resource is in.  Typically there is only one provider for the resource (except in cases where the package is split).  When searching a provider for a resource we must search the providers bundle class path.  Since most bundles are either single jars or only have a small number of jars on their class path this will result in only a small number of jars to search for the resource.  In your case we must search over 300 jars each time.  This is far from typical in my opinion.
Comment 9 Wendell Beckwith CLA 2010-04-07 12:27:26 EDT
(In reply to comment #8)
> 
> Please see bug 234069 for the discussion.  As to Eclipse's 366 bundles, half of
> these are source bundles that never get loaded.  The other half are either
> single jars or have only a few jars (2 - 3) on their bundle classpath.  When
> loading a single resource/class for a client bundle in OSGi we find a list of
> providers for the package the resource is in.  Typically there is only one
> provider for the resource (except in cases where the package is split).  When
> searching a provider for a resource we must search the providers bundle class
> path.  Since most bundles are either single jars or only have a small number of
> jars on their class path this will result in only a small number of jars to
> search for the resource.  In your case we must search over 300 jars each time. 
> This is far from typical in my opinion.

I do agree that our current classpath setup is severely non-optimal, however if I understand correctly this limit is not per bundle but per framework, and with eclipse itself constantly getting bigger not to mention with all the other plug-ins from mylyn, emf, gef and the commercial offerings, this limit seems really low.  Your own comments in https://bugs.eclipse.org/bugs/show_bug.cgi?id=162799#c3 push for a higher limit and that was found in real products 3 years ago.  The plug-in universe hasn't shrunk and is only getting bigger, so this should likely be adjusted for 3.6.
Comment 10 Thomas Watson CLA 2010-04-07 12:45:39 EDT
(In reply to comment #9)
> I do agree that our current classpath setup is severely non-optimal, however if
> I understand correctly this limit is not per bundle but per framework, and with
> eclipse itself constantly getting bigger not to mention with all the other
> plug-ins from mylyn, emf, gef and the commercial offerings, this limit seems
> really low.  Your own comments in
> https://bugs.eclipse.org/bugs/show_bug.cgi?id=162799#c3 push for a higher limit
> and that was found in real products 3 years ago.  The plug-in universe hasn't
> shrunk and is only getting bigger, so this should likely be adjusted for 3.6.

My comment in bug 162799 has proven not to be true.  It is true that on initial startup where we have to read every bundle installed there is a bit of thrashing to get through all the bundles.  If you have 1000s of bundles then many bundle jars will be opened/closed in the coarse of initializing the cache on initial startup.  But on restart the thrashing is greatly reduced.  We have products well over 2500 bundles and the value of 100 has not shown any measurable startup performance issues as long as we are launching from a cached configuration.

In the typical execution scenario unless you have a very high number of threads doing concurrent resource/class loading then you should not see a high level of thrashing of the jar files.  We have had the default at 100 for well over one release and this is the first time we have had any reports of slowdown.  I will also mention that I know of a number of our products based on 3.4 which are setting the value explicitly to 100 and have not seen issues (many of these products have well over a 1000 bundles).

The one enhancement that I can see is that we try to be a bit smarter about setting the default.  We could consider setting it to something like: 

(Highest # of entries on a single bundles class-path) + 100

I would like to avoid doing something magical there because then another corner case will come up where some bundle has over 1000 jars on their classpath so we will have a limit of 1100 which I know some linux configuration will not allow.
Comment 11 Wendell Beckwith CLA 2010-04-07 13:33:10 EDT
Well I'm not one to unnecessarily beat a horse to death, nor to add N new options to admittedly solve a "bad/excessive" plug-in classpath.  The only thing I can point to is a vague tremble in the force that the value _may_ be too low, but since I plan to rework our dependencies RSN anyway, I wouldn't likely use any new settings as the current limit would work just fine.  Thus, your bigger @$$ product deployments trump my vague feelings, so I think we're good.  Now we can close it. :)
Comment 12 Thomas Watson CLA 2010-04-07 13:50:44 EDT
Thanks, closing.