Bug 60825 - Registry cache reader needs to be optimized
Status: RESOLVED FIXED
Alias: None
Product: Equinox
Classification: Eclipse Project
Component: Incubator
Version: 3.0
Hardware: PC Windows XP
Importance: P3 normal
Target Milestone: 3.0 M9
Assignee: Pascal Rapicault CLA
QA Contact:
URL:
Whiteboard:
Keywords: performance
Depends on:
Blocks:
 
Reported: 2004-05-03 17:41 EDT by Chris Laffra CLA
Modified: 2006-03-29 11:56 EST

Description Chris Laffra CLA 2004-05-03 17:41:12 EDT
Plugin registry loading is expensive (involves XML parsing). Therefore, 
Eclipse has a cache. However, the algorithm for reading this cache should
be optimized to read less, and burn less memory.

Reading the cache currently consumes around 10% of startup resources, which is
too high.

I know Pascal and Jeff are working on this, but I could not find a PR on it.
Comment 1 Jeff McAffer CLA 2004-05-03 22:56:02 EDT
Agreed that it takes a lot of time.  I am curious about the burn you mention, 
however.  RegistryCacheReader seems to read about as directly as you could 
imagine.  It may appear to burn a number of Strings due to intern()ing, but the 
alternative is to take more memory in the long term.  For the rest of the data, 
the objects are read directly using readInt()-style calls.  Arrays of 
subelements are correctly and directly sized (no growing).

As for partial loading, there is no silver bullet.  There will be a trade-off 
point between loading it all up front thus eating memory and loading the config 
elements lazily saving memory (and some initial time) but then paying for it 
with increased overhead to load extensions one by one.  The breakeven point 
will depend on the exact composition of plugins installed (i.e., what the 
registry actually looks like) and the relative speeds of the disk, CPU and 
memory.
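
A minimal sketch of the lazy side of this trade-off, assuming a hypothetical
cache layout (an entry count followed by length-prefixed strings). The class
name, the layout, and the loadCount counter are illustrative assumptions, not
the actual registry cache format. Opening the cache only records where each
entry starts; an entry is decoded the first time it is requested.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not the Eclipse registry cache format.
class LazyCacheSketch {
    private final ByteBuffer cache;   // entire cache contents
    private final int[] offsets;      // start position of each entry
    private final String[] loaded;    // entries decoded so far
    int loadCount = 0;                // number of entries actually decoded

    LazyCacheSketch(byte[] bytes) {
        cache = ByteBuffer.wrap(bytes);
        int count = cache.getInt();
        offsets = new int[count];
        loaded = new String[count];
        for (int i = 0; i < count; i++) {
            offsets[i] = cache.position();
            int length = cache.getInt();
            // Skip the characters (2 bytes each) instead of decoding them.
            cache.position(cache.position() + 2 * length);
        }
    }

    // Decodes entry i on first access and returns the stored copy afterwards.
    String get(int i) {
        if (loaded[i] == null) {
            cache.position(offsets[i]);
            char[] chars = new char[cache.getInt()];
            for (int c = 0; c < chars.length; c++) {
                chars[c] = cache.getChar();
            }
            loaded[i] = new String(chars);
            loadCount++;
        }
        return loaded[i];
    }

    // Writes the entries in the layout the constructor expects.
    static byte[] write(String... entries) {
        int size = 4;
        for (String e : entries) {
            size += 4 + 2 * e.length();
        }
        ByteBuffer out = ByteBuffer.allocate(size);
        out.putInt(entries.length);
        for (String e : entries) {
            out.putInt(e.length());
            for (int c = 0; c < e.length(); c++) {
                out.putChar(e.charAt(c));
            }
        }
        return out.array();
    }
}
```

In this sketch the up-front cost is one pass over the length prefixes; each
later get() pays a seek plus a decode, which is the per-extension overhead
described above.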

This is not to diminish the importance of this issue but simply to point out 
that the problem is more fundamental.  We are being forced to (eagerly or 
lazily) load massive quantities of information that is either not needed, could 
be in memory temporarily, could be loaded on demand or is inefficiently 
represented.  

We can address some of this but may miss the point in various scenarios (e.g., 
lazy cache loading is slower for base Eclipse but may well be faster for large 
systems like WSAD).  It even depends on what views/perspectives are open as 
this influences what extensions get traversed.

I am frankly skeptical that we can do much more than offer a choice of loading 
options (i.e., lazy or not) as we do now.  Lazy loading is currently turned off 
because of a cache coherence problem, but it works just fine if you can ensure 
there will be only one instance of a given Eclipse configuration.  (We have a 
plan for fixing this problem for M9.)
Comment 2 Chris Laffra CLA 2004-05-03 23:08:17 EDT
The figures I saw were:

   CPU                      2.5s          9%  
   number of method calls   2451700       11%
   number of new objects    324207        9%

Comment 3 Jeff McAffer CLA 2004-05-03 23:20:01 EDT
What was the burn rate (i.e., how many of those objects were actually 
garbage)? Were you able to compare with lazy loading turned on?
Comment 4 Pascal Rapicault CLA 2004-05-06 11:23:01 EDT
As Jeff mentioned in comment #1, the registry was eagerly loaded because of
some cache consistency problems.
I just released a fix in HEAD for this problem, and also re-enabled the lazy
loading of the registry cache.

However, from previous measurements, I doubt that this will make Eclipse startup
faster (it might actually make it slower because, in comparison to 2.1, a lot
more things are read from the registry on startup). However, I think the lazy
loading can actually be beneficial to WSAD.

Chris, could you please try again? (You should only need to rebuild runtime and osgi.)
Comment 5 Jeff McAffer CLA 2004-05-10 22:12:40 EDT
I have implemented a slight modification after observing with XRay that 11% of 
the CPU time was spent in readUTF.  Rather than store the registry strings UTF-8 
encoded, I changed to storing them in raw Unicode format.  In preliminary 
tests this reduced registry read time for regular Eclipse from ~160-170ms to 
~130-140ms.  I still need to test larger registries like WSAD and redo the 
comparison on tomorrow's integration build.
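
A minimal sketch of the idea of storing strings as raw 16-bit characters
instead of going through writeUTF/readUTF. The class and method names
(RawStringIO, writeRaw, readRaw, roundTrip) are hypothetical and this is not
the code that was committed; it only shows one way to do it with the standard
java.io data streams.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration only; not the RegistryCacheWriter/Reader code.
class RawStringIO {
    // Writes the character count, then each char as two bytes (no UTF-8 encoding).
    static void writeRaw(DataOutputStream out, String s) throws IOException {
        out.writeInt(s.length());
        out.writeChars(s);
    }

    // Reads the character count, then that many raw two-byte chars.
    static String readRaw(DataInputStream in) throws IOException {
        char[] chars = new char[in.readInt()];
        for (int i = 0; i < chars.length; i++) {
            chars[i] = in.readChar();
        }
        return new String(chars);
    }

    // Convenience helper: writes a string to memory and reads it back.
    static String roundTrip(String s) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            writeRaw(out, s);
            out.flush();
            return readRaw(new DataInputStream(
                    new ByteArrayInputStream(buffer.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("org.eclipse.core.runtime"));
    }
}
```

The raw form avoids the decoding work on read at the price of two bytes per
character on disk, so a mostly-ASCII cache file gets larger.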
Comment 6 Chris Laffra CLA 2004-05-11 09:34:58 EDT
I have seen similar UTF slowdown in JDT core in the past. I cannot remember 
exactly where.
Comment 7 Pascal Rapicault CLA 2004-05-19 10:48:23 EDT
As of M9, the registry cache reading has been improved.
The net result is that on WSAD the registry reading takes around 250ms instead
of 510ms.

Here is a brief description of what has been done:
- Share the names of configuration elements and configuration properties using
our own unification mechanism instead of interning. This reduced the size of
the registry (from 2.4M to 1.9M).
- Remove the other String interning and replace it with our string unification
mechanism (which is constant time on read since it accesses an array).
- When lazy loading, the unnecessary UTF8 strings are skipped rather than read.
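
A minimal sketch of array-based string unification as described in the list
above, under the assumption that the writer stores each distinct string once in
a table and refers to it by index afterwards. The class name, method names, and
file layout are hypothetical; the actual mechanism in RegistryCacheReader and
RegistryCacheWriter may differ.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the Eclipse registry cache format.
class StringTableSketch {
    // Writes a table of distinct strings, then one table index per name.
    static byte[] write(String[] names) throws IOException {
        Map<String, Integer> indexOf = new HashMap<>();
        List<String> table = new ArrayList<>();
        int[] indices = new int[names.length];
        for (int i = 0; i < names.length; i++) {
            Integer index = indexOf.get(names[i]);
            if (index == null) {
                index = table.size();
                table.add(names[i]);
                indexOf.put(names[i], index);
            }
            indices[i] = index;
        }
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        out.writeInt(table.size());
        for (String s : table) {
            out.writeUTF(s);
        }
        out.writeInt(indices.length);
        for (int index : indices) {
            out.writeInt(index);
        }
        out.flush();
        return buffer.toByteArray();
    }

    // Reads the table once; every name is then a constant-time array lookup.
    static String[] read(byte[] bytes) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        String[] table = new String[in.readInt()];
        for (int i = 0; i < table.length; i++) {
            table[i] = in.readUTF();
        }
        String[] names = new String[in.readInt()];
        for (int i = 0; i < names.length; i++) {
            names[i] = table[in.readInt()];
        }
        return names;
    }

    // Convenience helper: writes the names to memory and reads them back.
    static String[] roundTrip(String... names) {
        try {
            return read(write(names));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch equal names come back as the same String object without any
call to intern(), and each repeated name costs four bytes in the file instead
of a full encoded copy.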

Comment 8 Pascal Rapicault CLA 2004-05-20 17:05:18 EDT
Marking as fixed since we've tried out all the ideas we had so far :)

If you have other ideas, feel free to try them out. The classes doing the
persistence of the registry are
org.eclipse.core.internal.registry.RegistryCacheReader and RegistryCacheWriter