Bug 251630 - [repo] Intern version objects
Summary: [repo] Intern version objects
Status: RESOLVED INVALID
Alias: None
Product: Equinox
Classification: Eclipse Project
Component: p2 (show other bugs)
Version: 3.5   Edit
Hardware: PC Mac OS X - Carbon (unsup.)
: P3 enhancement (vote)
Target Milestone: ---   Edit
Assignee: P2 Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords: performance
Depends on:
Blocks:
 
Reported: 2008-10-21 18:06 EDT by Pascal Rapicault CLA
Modified: 2011-03-04 13:50 EST (History)
2 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Pascal Rapicault CLA 2008-10-21 18:06:10 EDT
While profiling, John noticed a large number of Version and VersionRange objects. We are thinking that it may worth "interning" the version objects when loading them from repositories.
Comment 1 Henrik Lindberg CLA 2008-12-30 15:08:32 EST
I assume they should be garbage collected at some point - so what controls the lifecycle of the interned versions? Were you thinking that Version and VersionRange should handle the interning on their own, or that something that makes use of them handles this?

If every version is interned (i.e. kept in a cache), it would perhaps also be of value to have this cache be sorted
Comment 2 Pascal Rapicault CLA 2008-12-30 15:59:39 EST
The initial thinking was to "intern" by repository. If you had two repositories, they would *not* share the same version object for "1.0.0" for example. This had the advantage of being relatively simple and still get us some good benefit compare to what we have today.
Comment 3 Thomas Hallgren CLA 2008-12-30 17:21:02 EST
The solution I'm working on now maintains a cache of integers in the range 0-128 since Integer.valueOf(xxx) doesn't comply with the required execution environment. That will save some memory too.

For the final solution, why not use a WeakHashMap? That way the version will go out of scope as soon as no one's referencing it and all versions would be shared.

Either way, we must switch from using "new Version(xxx)" to a version factory.

Comment 4 Henrik Lindberg CLA 2008-12-30 18:13:14 EST
Seems easier to use a WeakHashMap and a factory method than having to create Version instances in some context such as a repository. Isn't there is already a "parseVersion" method that could be used for this purpose, and the constructor would just create the instance.
Comment 5 Pascal Rapicault CLA 2008-12-30 19:42:03 EST
I think we are not on the same wavelength. We are not trying to uniquify every single version number in p2 whatever the way they are created because we don't see this as a big enough win in comparison to the breadth of the changes required (moving to a factory). 
The initial bug report stemmed from a simple observation, under the vast majority of p2 use cases (install / updating / uninstalling) there is a lot of duplicate version and version range objects all coming from repositories. We have already solved a similar issue for the identifiers contained in namespace and ids and we were proposing to do the same here. The "advantage" here is that we don't need a buy in from everyone, no problem with lifecycle of the content of the cache (the cache is an implementation detail of the repo) since it is discarded at the same time the repo is (softcache in the repo mgr).
Comment 6 Henrik Lindberg CLA 2008-12-30 20:05:06 EST
(In reply to comment #5)
> I think we are not on the same wavelength. We are not trying to uniquify every
> single version number in p2 whatever the way they are created because we don't
> see this as a big enough win in comparison to the breadth of the changes
> required (moving to a factory). 
> The initial bug report stemmed from a simple observation, under the vast
> majority of p2 use cases (install / updating / uninstalling) there is a lot of
> duplicate version and version range objects all coming from repositories. We
> have already solved a similar issue for the identifiers contained in namespace
> and ids and we were proposing to do the same here. The "advantage" here is that
> we don't need a buy in from everyone, no problem with lifecycle of the content
> of the cache (the cache is an implementation detail of the repo) since it is
> discarded at the same time the repo is (softcache in the repo mgr).
> 
Aha - so you are saying that this takes place outside of Version. That works, and works with current implementation of Version too. Sorry for the noise :)
Comment 7 John Arthorne CLA 2009-01-02 09:43:36 EST
I should also mention that when I did profiling on this, the space usage from version objects was not large compared to other objects held by p2 (mostly strings). I will do more investigation late in 3.5 time permitting but I don't currently see this as a high priority enhancement.
Comment 8 John Arthorne CLA 2009-04-16 14:14:18 EDT
See some related discussion in bug 272386. I'm moving this back to the inbox because I have no near term plans to work on this.
Comment 9 John Arthorne CLA 2011-03-04 13:50:07 EST
I think my original analysis is no longer useful given all the optimization work Dean has done in 3.7. We would need to take a fresh look to see what optimizations would give the best improvement.