Bug 62131 - CodeStream should do bounds checks
Summary: CodeStream should do bounds checks
Status: VERIFIED FIXED
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Core (show other bugs)
Version: 3.0   Edit
Hardware: PC Windows XP
: P3 normal (vote)
Target Milestone: 3.0 RC1   Edit
Assignee: Kent Johnson CLA
QA Contact:
URL:
Whiteboard:
Keywords: performance
Depends on:
Blocks:
 
Reported: 2004-05-13 12:24 EDT by Kent Johnson CLA
Modified: 2004-05-28 15:37 EDT (History)
3 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Kent Johnson CLA 2004-05-13 12:24:55 EDT
Based on bug 38839, we should consider changing the CodeStream to do bounds 
checks instead of catching IndexOutOfBounds exceptions.

Or at least make the initial size of the code stream & constant pool big 
enough to handle most .class files... plus adjust the grow size so that the 
exception is thrown once every 5-10,000 accesses & not every 500.
Comment 1 Kent Johnson CLA 2004-05-14 13:05:22 EDT
I'm seeing each ClassFile get its own CodeStream & 1000's of grow operations 
during a full build. Add this line to CodeStream.resizeByteArray()

  System.out.println("Resizing bCodeStream " + actualLength + " -> " + 
requiredSize);

We should really consider sharing the same CodeStream between all ClassFiles & 
if that is not possible, then grow the bCodeStream more efficiently... while 
changing every catch IndexOutOfBoundsException into a normal bounds check.

Likewise for the ConstantPool.
Comment 2 Philipe Mulet CLA 2004-05-14 15:58:36 EDT
We did tune the default sizes a while ago, and there were little resizing.
Reusing would be interesting, but risky at this stage.

Can you try performance with bound checks ?
Comment 3 Kent Johnson CLA 2004-05-14 17:04:53 EDT
"Can you try performance with bound checks?"

Already did - the numbers are posted in bug 38839.

Given the typical size the bCodeStream, its not worth using catch 
(IndexOutOfBounds) for control flow... we should just do the bounds check.

As for tuning the sizes, the grow amount of 400 seems low. There are cases 
when the bCodeStream needs to be 5-20K... that means 1000, 1400, 1800, 2200, 
2600, 3000, 3400, 3800, 4200, 4600, 5K.

So for a 10K array, we allocate 134,400 bytes in 20 arrays as we grow it by 
400 bytes... far from efficient. If we doubled each time, it would be 1K, 2K, 
4K, 8K & 16K (that's 100K less!).

We should also re-examine the initial size since we should try to handle 60-
80% of the cases with the initial size.

I got some #'s from rebuilding a workspace with swt & jdt core/debug/ui... 
CodeStream.resizeByteArray() needs to create 600+ new arrays >= 10K... it 
allocates new arrays totalling 20,596,605 bytes.

By using 5K for the initial size & doubling each time we grow (ClassFile has 
its own grow amount which I didn't change), then we get: 189 new arrays >= 
10K... totalling 5,538,000 bytes.

That's 15Mb we don't allocate just by choosing a larger default size & 
doubling when we grow.
Comment 4 Kent Johnson CLA 2004-05-14 17:08:56 EDT
Do we know the size of the source file when we create the CodeStream instance?

If we do then we should set the initial size of the bCodeStream based on the 
size of the source file... its the best guess we have whether we need 1K or 
20K to start.
Comment 5 Olivier Thomann CLA 2004-05-19 12:00:31 EDT
Should this be investigated for RC1?
Comment 6 Kent Johnson CLA 2004-05-19 12:03:47 EDT
I think so... if we can create the bCodeStream based on the number of methods 
to be put in the .class file, we will seldom need to grow.
Comment 7 Philipe Mulet CLA 2004-05-19 12:19:24 EDT
At least a better default value would be a good improvement already, even if 
not based on method count.
Comment 8 Kent Johnson CLA 2004-05-26 14:20:16 EDT
So for a full build of the eclipse 2.1.x source workspace, we get:

13690 calls to ClassFile.getBytes(),
32,118,242 bytes used of 72,262,000 allocated bytes for the header,
18,293,437 bytes used of 26,733,053 allocated bytes for the contents.

That's almost 100,000,000 bytes allocated BUT only half (50,411,679) were 
needed.

When you add in the amount of the header & content arrays that grew during the 
build, we get:
32,118,242 bytes of 92,794,000 allocated bytes for the header,
18,293,437 bytes of 149,243,226 allocated bytes for the contents

That's 242,037,226 bytes allocated, where we used 20.8%.

So I tried a simple change of 2500 & 1200 bytes as the initial sizes for the 
header(1000) & contents(1000), and changed the resize methods to double the 
arrays (instead of growing by a fixed amount):
32,118,242 bytes of 78,560,000 allocated bytes for the header,
18,293,437 bytes of 49,812,000 allocated bytes for the contents.

That's 128,372,000 bytes allocated, where we used 39.3%.

Next I'll try some initial & grow sizes based on method count.
Comment 9 Kent Johnson CLA 2004-05-26 14:31:14 EDT
Also tried 2000 & 1000 bytes as the initial sizes:
32,118,242 bytes of 76,948,000 allocated bytes for the header,
18,293,437 bytes of 48,736,000 allocated bytes for the contents.

That's 125,684,000 bytes allocated, where we used 40.1%.

So bigger initial sizes are not better when we double to grow.
Comment 10 Olivier Thomann CLA 2004-05-28 11:43:18 EDT
Fixed and released in HEAD.
Comment 11 Olivier Thomann CLA 2004-05-28 15:37:31 EDT
There is no more try/catch IndexOutOfBoundsException in the CodeStream and the
ConstantPool.
Verified in 200405281200