90947 – [registry] Platform Extension Registry: TableReader reads too much

Bug 90947 - [registry] Platform Extension Registry: TableReader reads too much

Summary: [registry] Platform Extension Registry: TableReader reads too much

Status:	RESOLVED DUPLICATE of bug 121766

Alias:	None

Product:	Equinox
Classification:	Eclipse Project
Component:	Compendium (show other bugs)
Version:	3.1
Hardware:	PC Windows XP

Importance:	P3 normal (vote)
Target Milestone:	---
Assignee:	equinox.compendium-inbox
QA Contact:

URL:
Whiteboard:
Keywords:	performance

Depends on:
Blocks:

Reported:	2005-04-10 21:11 EDT by Chris Laffra
Modified:	2006-04-10 16:16 EDT (History)
CC List:	2 users (show)

See Also:

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Chris Laffra

2005-04-10 21:11:26 EDT

On 3.1M6, I ran filemon.exe while loading M6 into an empty workspace. I 
captured filemon's read log, and my eye directly fell on 
<eclipse>/configuration/org.eclipse.core.runtime/.mainData.9
On disk, on my M6 installation (which has one extra tiny plugin), this file 
measures 702,669.

When I exported filemon's log, and processed the "read" numbers, I saw that in 
total, for startup, TableReader reads 5,162,288 bytes from .mainData.9.
And this is just for going into an empty workspace into Resource perspective.
After studying the code in TableReader, I noticed a new TableReader is created 
for each registry access, and it open the mainData file, and creates a new 
InputStream on that (as a result, hundreds of InputStreams are created and 
destructed just during startup). To get the actual data, a given number of 
bytes are skipped on the InputStream to jump to an offset where the extension 
point is actually located in the registry cache.

A better solution is to use a RandomAccessFile (see 
http://java.sun.com/docs/books/tutorial/essential/io/rafs.html for 
argumentation). I propose TableReader.<clinit> to open mainData as a random 
access file and keep the file open for the duration of the platform's 
execution. Then, each registry access would synchronize on the raf and get its 
entry at the proper offset. Result is less I/O, and fewer JNI calls.

I do understand that when using the currently applied "lazy registry loading" 
technique, the registry cache will be in the OS disk cache, so no expensive 
disk accesses are really made for subsequent registry accesses. However, 
opening/reading/closing an InputStream is always slower than using a static 
RandomAccessFile, I think. The API is very similar, in fact almost identical 
in its use, so a quick verification using RandomAccessFile is not hard.

Even better is for the registry to save the registry in another way by 
generating java bytecodes on the fly. Instead of serializing the registry into 
a couple of binary files, save it into a set of Java classes that implement 
the registry directly, as Java class (in the form of binary Java bytecodes). 
No file I/O is done at all. The platform classloader will do the lazy loading 
for us in this case, by loading the required classes on demand. 
The platform is littered with various registries, and each has its own 
serialization model. A common model should be designed (org.eclipse.registry?) 
that provides the option for coordinated efforts to optimize registry loading.

If you like, I could prototype the binary class solution for the extension 
registy, and prove/disprove the benefits.

Comment 1 Jeff McAffer

2005-04-10 21:55:19 EDT

There is probably some room for optimization here.  Perhaps recognizing the 
special character of startup.  Some additional thoughts.

- Pascal benched RandomAccessFile and found it to be quite slow.  Perhaps there 
is a better way of using them?

- The registry is not just lazily loaded but also flushed.  How would that work 
with the class approach?  

- How does the class approach work for representing trees?  I.e., how many 
classes are there?  What are the methods?  Who instantiates them (if 
anyone), ... (I'm curious, tell me more!)

- another approach that we thought of was to keep a pool of say 10 regular 
input streams open.  We would always toss the one that was furthest along in 
favour of one positioned earlier.  Then when you need to read something, you 
look in the pool for one that is nearest (but before) the position you need to 
read.  Take it out of the pool, do your reading and then put it back (or toss 
it).  We'd have to try it to see if the cache management overhead is worth it.

- A related note, we need to investigate how to handle these caches when 
running from a shared drive.  We may want to cache the cache locally or turn it 
off.

Comment 2 Chris Laffra

2005-04-11 07:53:47 EDT

I figured Pascal tried RandomAccessFile already, as he has done, what is it, 
some 15 iterations now, with each one a bit faster than the previous.
I am surprised RandomAccessFile should be slower. Is this on all VMs?

I thought of the pool of inputstreams also, I just thought of it as having one 
element in it, not 10.

The class approach is best modeled in your mind by doing this thought 
experiment: write the registry cache by hand in Java source. What would it 
look like then? Once you have this 'prototype', when you serialize the 
registry, generate similar code (preferably in bytecodes directly). If you use 
weak references in the connections between the tree elements, the instances 
will eventually be collected. In theory, even the classes should be unloaded, 
at some point, but this depends on the VM being used.

Of course, the problem is those looooooong names everyone uses for their 
plugins and extension points. When you use classes with static String fields, 
those will be interned by the VM, rather than duplicated on the heap. That 
keeps the cost minimal. Some VMs have trouble with performance of user-defined 
interned Strings, but those in class files are well dealt with.

Comment 3 Oleg Besedin

2006-04-05 11:21:05 EDT

This has been fixed in the 3.2 - we now keep the streams open and re-use-them. 

Also BufferedRandomInputStream is added - a combination of the random access and buffering. It is not perfect yet in terms of performance, but works reasonably well. (As Pascal specified above, RandomAccessFile is unbuffered and in our case doesn't provide much of a benefit.)

Just tried with M6 and we read about 137K out of the 805K main table on startup.

On a side note, I would be careful about using classes to store data. In the recent startup performance tests, it seemed that class loading is a rather expensive operation and represents a major portion of the headless startup time. Loading an "average" class takes about 0.3ms on my computer; reading and processing 30K table takes 1.5ms. Seems good so far, but loading a little more complicated classes often takes 10 - 20ms (e.g., 10ms for the RegistryObjectManager which is not all that complex).

Comment 4 Thomas Watson

2006-04-10 16:16:55 EDT

fix was dropped in M5 from comment 3 in bug 121766.

*** This bug has been marked as a duplicate of 121766 ***