Bug 309968 - Avoid call to File.isDirectory() to improve the performance of a cold start
Summary: Avoid call to File.isDirectory() to improve the performance of a cold start
Status: RESOLVED FIXED
Alias: None
Product: Equinox
Classification: Eclipse Project
Component: Framework (show other bugs)
Version: 3.6   Edit
Hardware: PC Windows XP
: P3 normal (vote)
Target Milestone: 3.7 M3   Edit
Assignee: Thomas Watson CLA
QA Contact:
URL:
Whiteboard:
Keywords: performance
Depends on:
Blocks:
 
Reported: 2010-04-21 10:13 EDT by Markus Schorn CLA
Modified: 2010-10-20 13:47 EDT (History)
4 users (show)

See Also:


Attachments
possible solution (1.54 KB, patch)
2010-04-21 12:19 EDT, Thomas Watson CLA
no flags Details | Diff
better solution (2.05 KB, patch)
2010-10-20 12:38 EDT, Thomas Watson CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Markus Schorn CLA 2010-04-21 10:13:56 EDT
Build ID: Eclipse 3.6m6, Sun JVM

I am testing with an eclipse application that shows the Project Explorer, an
editor with a c++-file and the Outline View. I have opened the application with
a workspace several times. Now I reboot my machine and open the application on
the same workspace. That's my cold-start.

Profiling the cold-start reveals that there is a significant portion of time spent in determining whether bundle-files are directories or files. This is done in BaseStorage.createBundleFile(...). 

How often that has to be done depends on the size of the installed application. In my case there are > 700 bundles. By making the assumption that a file named xxx.jar is not a directory I can speed up the cold start by ~1 sec out of 40 sec.
Comment 1 Thomas Watson CLA 2010-04-21 11:49:30 EDT
(In reply to comment #0)
> How often that has to be done depends on the size of the installed application.
> In my case there are > 700 bundles. By making the assumption that a file named
> xxx.jar is not a directory I can speed up the cold start by ~1 sec out of 40
> sec.

This is not a good assumption.  The framework does not care what the file name is of the bundle it is installing.  Depending on the file name will lead to some hard to debug issues.  We could cache the type of file it is in BaseStorageHook and make the decision based off of that.  That would only work for the root bundle files.  Any nested files on the class path would still need an isDirectory check.
Comment 2 Thomas Watson CLA 2010-04-21 12:19:21 EDT
Created attachment 165597 [details]
possible solution

Here is one possible solution.  We assume reference URLs will end with '/' if they are a directory.  Could you measure with this patch?
Comment 3 Pascal Rapicault CLA 2010-04-21 14:12:32 EDT
We definitely have to be cautious here because p2 rightfully allows for jars to be named otherwise than just .jar.
Comment 4 Markus Schorn CLA 2010-04-22 02:53:28 EDT
(In reply to comment #2)
> Created an attachment (id=165597) [details]
> possible solution
> 
> Here is one possible solution.  We assume reference URLs will end with '/' if
> they are a directory.  Could you measure with this patch?

Thanks, I'll test the patch.
Comment 5 Markus Schorn CLA 2010-04-22 10:04:57 EDT
Thomas, your patch works fine in my test scenario.
Comment 6 Thomas Watson CLA 2010-04-22 10:42:11 EDT
(In reply to comment #5)
> Thomas, your patch works fine in my test scenario.

Does fine == performance improvement?  ;-)
Comment 7 Markus Schorn CLA 2010-04-23 04:39:43 EDT
(In reply to comment #6)
> (In reply to comment #5)
> > Thomas, your patch works fine in my test scenario.
> 
> Does fine == performance improvement?  ;-)

Yes, it does. The improvement is the same as when I use the file-extension to avoid the isDirectory() check. To satisfy my curiosity I have also tested for how many xxx.jar files the optimization cannot be used and found just a few of them.
Comment 8 Martin Oberhuber CLA 2010-05-03 05:18:34 EDT
This is performance related, and the patch was successfully validated by us.
Can we have this committed to 3.6 RC1 ?
Comment 9 Thomas Watson CLA 2010-05-03 10:27:44 EDT
(In reply to comment #8)
> This is performance related, and the patch was successfully validated by us.
> Can we have this committed to 3.6 RC1 ?

I'm not that happy with the approach.  It makes the assumption that the reference URL handler always normalizes connections to a directory to have a trailing '/' and non-directory connections to not have a trailing '/'.

I think the storage hook should do a bit of extra work when bundles are installed to ensure that the bundle file name is normalized that way we need it to be to do this performance improvement.  The reference URL handler connection code is too far removed from the BundleFile creation code.  I think depending on reference URL normalization could lead to unexpected behavior if we change the reference handler in the future.
Comment 10 Thomas Watson CLA 2010-05-03 10:32:11 EDT
I will see what can be done for RC1.  But the improvement needs to be significant and the solution needs to be very safe at this point.

Is it safe to say this improvement would give you about a 2% improvement on cold start?  Of coarse, it all depends on the number of bundles installed.
Comment 11 Martin Oberhuber CLA 2010-05-04 08:10:38 EDT
(In reply to comment #10)
Commercially, the issue is not so urgent for us any more since we now do our own OSGi framework extension as suggested in bug 309961. Since we know that in our environment *.jar bundles are always files, we are putting this into our framework extension. Bottom line is, we can certainly live with deferring this.

In terms of the potential performance gain, it's hard to give a number. It depends on the speed of the storage being used, but clearly in a system with 700 bundles of which only 50 are loaded on startup, the extra overhead of performing an unnecessary "stat" on 650 bundles is significant. I don't want to make the effort measuring this again, but 2.5% seems approximately what we have seen in our system as per comment 0.

My take is that in a system like Equinox which puts a lot of effort into caching bundle status, it's a pity that much of the performance improvement that could be gained from caching is lost again when each and every bundle is subject to a stat call on startup, regardless of whether it's going to be loaded or not.
Comment 12 Thomas Watson CLA 2010-05-10 09:21:54 EDT
deferring to next release.
Comment 13 Thomas Watson CLA 2010-10-20 12:38:25 EDT
Created attachment 181312 [details]
better solution

Here is a better solution that marks the type of bundledata as a directory based or file based bundle data.  This way we can quickly check to see if the bits are set on start up, if not do the isDirectory check and then set the correct bits appropriately for next startup.
Comment 14 Thomas Watson CLA 2010-10-20 13:46:54 EDT
A modified patch (only to add comments) has been released to head.