Bug 239529 - CU attachment expensive due to large number of equal capabilities
Summary: CU attachment expensive due to large number of equal capabilities
Status: RESOLVED FIXED
Alias: None
Product: Equinox
Classification: Eclipse Project
Component: Framework (show other bugs)
Version: 3.4   Edit
Hardware: PC Windows XP
: P3 normal (vote)
Target Milestone: 3.5 M1   Edit
Assignee: equinox.framework-inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords: performance
Depends on:
Blocks: 238312 240933
  Show dependency tree
 
Reported: 2008-07-03 17:44 EDT by Simon Kaegi CLA
Modified: 2008-07-15 14:01 EDT (History)
3 users (show)

See Also:


Attachments
patch (1.61 KB, patch)
2008-07-09 11:50 EDT, Simon Kaegi CLA
no flags Details | Diff
insertionIndex patch (3.05 KB, patch)
2008-07-14 17:31 EDT, Simon Kaegi CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Simon Kaegi CLA 2008-07-03 17:44:45 EDT
In the planner when calling attachCUs we use the equinox resolver. While profiling large installs I noticed a lot of overhead associated with adding objects to the VersionHashMap it uses. In installs with many IUs this ends up dominating the time spent in the planner.

It looks like we're ending up adding a number of the identical meta capabilities that we're using as tags and this ends up causing problems. The VersionHashMap keys off the capability name which is always the same for these meta capabilities. The VersionHashMap value is a sorted array of the capabilities that unfortunately gets re-sorted every time a new entry is added and this ends up being expensive when each IU provides the identical capability.

Here's a list of capabilities that I noticed are causing problems:
<provided namespace='org.eclipse.equinox.p2.eclipse.type' name='bundle' version='1.0.0'/>
<provided namespace='org.eclipse.equinox.p2.localization' name='df_LT' version='1.0.0'/>
<provided namespace='org.eclipse.equinox.p2.eclipse.type' name='source' version='1.0.0'/>

Talking to John, short term we might want to filter these out in the planner. Longer term we might want to make these capabilities more unique.
Comment 1 Pascal Rapicault CLA 2008-07-04 09:22:37 EDT
This is a good finding and something we should look at addressing for 3.4.1 if it makes a difference.
Long term, the CU attachment information should directly be obtained from the result of the solver.
Comment 2 John Arthorne CLA 2008-07-07 10:52:50 EDT
Simon, can you get some timing results on the impact of fixing this? When we looked at it last week it appeared in the profiler to be a hotspot.
Comment 3 Simon Kaegi CLA 2008-07-07 11:51:00 EDT
Doing a prototype is pretty easy so I will post the before and after when I get some results. For now what I can say is that from the profiler it looks like we would save about 15% on an 80s install. e.g. 10s

Comment 4 Simon Kaegi CLA 2008-07-09 11:50:09 EDT
Created attachment 106963 [details]
patch

This is a patch to the VersionHashMap used in the frameworks resolver. Instead of re-sorting each time we had a new VersionSupplier we just insert in the right place. Since the set of VersionSuppliers already present are pre-sorted we can do a binary search to find the right place to insert so the operation is O(log n) vs. O(n log n).
Comment 5 Simon Kaegi CLA 2008-07-10 09:40:44 EDT
This patch is done in the framework, so moving it and CC'ing Tom.
Comment 6 Thomas Watson CLA 2008-07-14 16:08:54 EDT
(In reply to comment #4)
> Created an attachment (id=106963) [details]
> patch
> 
> This is a patch to the VersionHashMap used in the frameworks resolver. Instead
> of re-sorting each time we had a new VersionSupplier we just insert in the
> right place. Since the set of VersionSuppliers already present are pre-sorted
> we can do a binary search to find the right place to insert so the operation is
> O(log n) vs. O(n log n).
> 

I think this is safe for 3.4.1, but for 3.5 I would like to consider restructuring the code to get ride of the protected sort() method.  Perhaps a protected method that returns an insertionIndex (e.g. insertionIndex(Object[], object)).  This could help reduce the duplicate code.
Comment 7 Simon Kaegi CLA 2008-07-14 17:31:47 EDT
Created attachment 107388 [details]
insertionIndex patch

Here's a patch that adds and uses insertionIndex. Tom I've removed "sort" altogether and directly call Arrays.sort(values, this) in the "reorder" method. Not sure if this is what you meant or if we still want to keep "sort" around for another reason.

I've tested this with the p2 junit tests and also in the performance benchmark I've been using to test scalability in p2.
Comment 8 Thomas Watson CLA 2008-07-15 13:58:59 EDT
(In reply to comment #7)
> Created an attachment (id=107388) [details]
> insertionIndex patch
> 
> Here's a patch that adds and uses insertionIndex. Tom I've removed "sort"
> altogether and directly call Arrays.sort(values, this) in the "reorder" method.
> Not sure if this is what you meant or if we still want to keep "sort" around
> for another reason.
> 
> I've tested this with the p2 junit tests and also in the performance benchmark
> I've been using to test scalability in p2.
> 

That is what I was thinking.  I release the insertionIndex patch to 3.5 M1.  I will clone this defect for 3.4.1 consideration.