239529 – CU attachment expensive due to large number of equal capabilities

Bug 239529 - CU attachment expensive due to large number of equal capabilities

Summary: CU attachment expensive due to large number of equal capabilities

Status:	RESOLVED FIXED

Alias:	None

Product:	Equinox
Classification:	Eclipse Project
Component:	Framework (show other bugs)
Version:	3.4
Hardware:	PC Windows XP

Importance:	P3 normal (vote)
Target Milestone:	3.5 M1
Assignee:	equinox.framework-inbox
QA Contact:

URL:
Whiteboard:
Keywords:	performance

Depends on:
Blocks:	238312 240933
	Show dependency tree

Reported:	2008-07-03 17:44 EDT by Simon Kaegi
Modified:	2008-07-15 14:01 EDT (History)
CC List:	3 users (show)

See Also:

Attachments
patch (1.61 KB, patch) 2008-07-09 11:50 EDT, Simon Kaegi	no flags	Details \| Diff
insertionIndex patch (3.05 KB, patch) 2008-07-14 17:31 EDT, Simon Kaegi	no flags	Details \| Diff
View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Simon Kaegi

2008-07-03 17:44:45 EDT

In the planner when calling attachCUs we use the equinox resolver. While profiling large installs I noticed a lot of overhead associated with adding objects to the VersionHashMap it uses. In installs with many IUs this ends up dominating the time spent in the planner.

It looks like we're ending up adding a number of the identical meta capabilities that we're using as tags and this ends up causing problems. The VersionHashMap keys off the capability name which is always the same for these meta capabilities. The VersionHashMap value is a sorted array of the capabilities that unfortunately gets re-sorted every time a new entry is added and this ends up being expensive when each IU provides the identical capability.

Here's a list of capabilities that I noticed are causing problems:
<provided namespace='org.eclipse.equinox.p2.eclipse.type' name='bundle' version='1.0.0'/>
<provided namespace='org.eclipse.equinox.p2.localization' name='df_LT' version='1.0.0'/>
<provided namespace='org.eclipse.equinox.p2.eclipse.type' name='source' version='1.0.0'/>

Talking to John, short term we might want to filter these out in the planner. Longer term we might want to make these capabilities more unique.

Comment 1 Pascal Rapicault

2008-07-04 09:22:37 EDT

This is a good finding and something we should look at addressing for 3.4.1 if it makes a difference.
Long term, the CU attachment information should directly be obtained from the result of the solver.

Comment 2 John Arthorne

2008-07-07 10:52:50 EDT

Simon, can you get some timing results on the impact of fixing this? When we looked at it last week it appeared in the profiler to be a hotspot.

Comment 3 Simon Kaegi

2008-07-07 11:51:00 EDT

Doing a prototype is pretty easy so I will post the before and after when I get some results. For now what I can say is that from the profiler it looks like we would save about 15% on an 80s install. e.g. 10s

Comment 4 Simon Kaegi

2008-07-09 11:50:09 EDT

Created attachment 106963 [details]
patch

This is a patch to the VersionHashMap used in the frameworks resolver. Instead of re-sorting each time we had a new VersionSupplier we just insert in the right place. Since the set of VersionSuppliers already present are pre-sorted we can do a binary search to find the right place to insert so the operation is O(log n) vs. O(n log n).

Comment 5 Simon Kaegi

2008-07-10 09:40:44 EDT

This patch is done in the framework, so moving it and CC'ing Tom.

Comment 6 Thomas Watson

2008-07-14 16:08:54 EDT

(In reply to comment #4)
> Created an attachment (id=106963) [details]
> patch
> 
> This is a patch to the VersionHashMap used in the frameworks resolver. Instead
> of re-sorting each time we had a new VersionSupplier we just insert in the
> right place. Since the set of VersionSuppliers already present are pre-sorted
> we can do a binary search to find the right place to insert so the operation is
> O(log n) vs. O(n log n).
> 

I think this is safe for 3.4.1, but for 3.5 I would like to consider restructuring the code to get ride of the protected sort() method.  Perhaps a protected method that returns an insertionIndex (e.g. insertionIndex(Object[], object)).  This could help reduce the duplicate code.

Comment 7 Simon Kaegi

2008-07-14 17:31:47 EDT

Created attachment 107388 [details]
insertionIndex patch

Here's a patch that adds and uses insertionIndex. Tom I've removed "sort" altogether and directly call Arrays.sort(values, this) in the "reorder" method. Not sure if this is what you meant or if we still want to keep "sort" around for another reason.

I've tested this with the p2 junit tests and also in the performance benchmark I've been using to test scalability in p2.

Comment 8 Thomas Watson

2008-07-15 13:58:59 EDT

(In reply to comment #7)
> Created an attachment (id=107388) [details]
> insertionIndex patch
> 
> Here's a patch that adds and uses insertionIndex. Tom I've removed "sort"
> altogether and directly call Arrays.sort(values, this) in the "reorder" method.
> Not sure if this is what you meant or if we still want to keep "sort" around
> for another reason.
> 
> I've tested this with the p2 junit tests and also in the performance benchmark
> I've been using to test scalability in p2.
> 

That is what I was thinking.  I release the insertionIndex patch to 3.5 M1.  I will clone this defect for 3.4.1 consideration.