Bug 84872 - Improve string sharing in JavaModelCache
Status: VERIFIED FIXED
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Core
Version: 3.1
Hardware: PC Windows XP
Importance: P3 normal
Target Milestone: 3.1 M6
Assignee: Jerome Lanneluc CLA
QA Contact:
URL:
Whiteboard:
Keywords: performance
Depends on:
Blocks:
 
Reported: 2005-02-10 06:44 EST by Jerome Lanneluc CLA
Modified: 2018-08-05 07:08 EDT


Attachments
compress strings using utf-8 encoding (4.93 KB, patch)
2005-04-20 09:34 EDT, Noel Grandin CLA

Description Jerome Lanneluc CLA 2005-02-10 06:44:06 EST
3.1 M4

From an email from Ed Burnette:

You asked me to track down the mail about JDT memory usage. Sorry it took so long.

http://www.eclipsepowered.org/archives/2004/11/30/eclipse-grand-challenges/#comment-652

Quote:

See these postings for comments on Eclipse memory usage:
http://dev.eclipse.org/mhonarc/lists/platform-core-dev/msg00648.html
http://dev.eclipse.org/mhonarc/lists/equinox-dev/msg00348.html

With a smaller workspace, under a newer Eclipse (3.1M4), the JavaModelCache and
things under it were still using 25% of total memory. Most of that was strings.
In all, 53% of memory was going into text strings. This is with YourKit 3.2.
Comment 1 Tod Creasey CLA 2005-03-07 11:57:29 EST
Adding my name to the cc list as we are now tracking performance issues more
closely. Please remove the performance keyword if this is not a performance bug.
Comment 2 Jerome Lanneluc CLA 2005-03-18 05:34:05 EST
Changed ClassFileInfo and CompilationUnitStructureRequestor to intern the field
names, the method selectors, and the method parameter types. Note that the
JavaModelManager intern mechanism is used rather than the VM's String#intern().
Using the VM intern would prevent the strings from being garbage collected on
some VMs.
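
For readers unfamiliar with the idea, here is a minimal sketch of a weak intern
pool. It is not the actual JavaModelManager code; the class name WeakStringPool
and its layout are assumptions made for illustration. Equal strings are mapped
to one canonical instance, and the pool holds that instance only weakly, so it
can be collected once nothing else references it.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

/** Hypothetical weak intern pool; a sketch, not the JDT implementation. */
final class WeakStringPool {
    // WeakHashMap holds its keys weakly. The value must be weak as well:
    // a strong value referencing its own key would pin the entry forever.
    private final Map<String, WeakReference<String>> pool = new WeakHashMap<>();

    /** Returns the canonical instance equal to s, registering s if none exists. */
    synchronized String intern(String s) {
        WeakReference<String> ref = pool.get(s);
        String canonical = (ref == null) ? null : ref.get();
        if (canonical != null) {
            return canonical;
        }
        pool.put(s, new WeakReference<>(s));
        return s;
    }
}
```

Two distinct but equal strings passed through intern(...) come back as the same
object, which is what lets many model elements share one copy of a common
selector or type name.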
Comment 3 Jerome Lanneluc CLA 2005-03-25 09:42:18 EST
Noticed that String#substring(...) doesn't make a copy of the underlying char
array. As a result, ClassFile handles still point to the whole zip entry path.

Changed ClassFile constructor to make a copy of the given name.

Similarly changed ImportContainer#getImport(String) to make a copy of the string
without the trailing '.*'.

Finally changed JavaModelManager#intern(String) to catch other cases.

Total memory gain: 440 KB out of 3,380 KB (size of the Java model cache in my
test workspace)
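
The copying described above can be sketched as follows. The class and method
names (NameCopies, entryName, importedPackage) are hypothetical and only
illustrate the pattern. On the JVMs of that time, substring(...) returned a
string sharing the parent's char array, so wrapping the result in
new String(...) was the usual way to get a right-sized copy. Since Java 7
update 6, substring(...) copies on its own, which makes the wrapper redundant
there.

```java
/** Hypothetical helpers illustrating the defensive copy; not the JDT code. */
final class NameCopies {
    private NameCopies() {}

    /** Returns the last path segment with its own backing storage. */
    static String entryName(String zipEntryPath) {
        int slash = zipEntryPath.lastIndexOf('/');
        // lastIndexOf returns -1 when there is no '/', so substring(0) is used then.
        // new String(...) forces a copy on JVMs where substring shares the array.
        return new String(zipEntryPath.substring(slash + 1));
    }

    /** Strips a trailing ".*" from an on-demand import, copying the result. */
    static String importedPackage(String importName) {
        if (importName.endsWith(".*")) {
            return new String(importName.substring(0, importName.length() - 2));
        }
        return importName;
    }
}
```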
Comment 4 Jerome Lanneluc CLA 2005-03-25 10:33:47 EST
All ClassFile handles have an element name that ends with ".class". 
Changed ClassFile to store only the element name without the ".class".
This saves another 120KB.
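
A small sketch of the idea, assuming a hypothetical ClassFileHandle class
rather than the real ClassFile: the constant suffix is dropped on construction
and appended again only when the full element name is requested.

```java
/** Hypothetical handle that stores its name without the ".class" suffix. */
final class ClassFileHandle {
    private static final String SUFFIX = ".class";

    // Stored without the suffix, saving six chars per handle.
    private final String name;

    ClassFileHandle(String elementName) {
        if (!elementName.endsWith(SUFFIX)) {
            throw new IllegalArgumentException("not a class file name: " + elementName);
        }
        this.name = elementName.substring(0, elementName.length() - SUFFIX.length());
    }

    /** The name as stored, without the suffix. */
    String storedName() {
        return name;
    }

    /** Rebuilds the full element name on demand. */
    String getElementName() {
        return name + SUFFIX;
    }
}
```

The trade-off is a short-lived allocation each time the full name is rebuilt,
in exchange for less memory retained per handle.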
Comment 5 Jerome Lanneluc CLA 2005-03-25 11:12:28 EST
Added WeakHashSetOfCharArray and added JavaModelManager#intern(char[]) that uses
this weak hashset.
Changed CompilationUnitStructureRequestor to intern the following char arrays:
- field info's type name
- method info's parameter names, return type and exception types
- type info's superclass and superinterfaces
This saves another 87KB.
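
Interning char arrays needs more care than interning strings, because char[]
uses identity for equals and hashCode. The sketch below shows one way to do it
with a content-based weak key. It is an assumption-laden illustration, not the
WeakHashSetOfCharArray added to JDT Core; the names WeakCharArrayPool and Key
are hypothetical.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical weak intern pool for char arrays; a sketch only. */
final class WeakCharArrayPool {
    /** Weak key comparing the referenced arrays by content. */
    private static final class Key extends WeakReference<char[]> {
        private final int hash; // cached so a cleared key can still be removed

        Key(char[] chars, ReferenceQueue<char[]> queue) {
            super(chars, queue);
            this.hash = Arrays.hashCode(chars);
        }

        @Override public int hashCode() {
            return hash;
        }

        @Override public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Key)) return false;
            char[] mine = get();
            char[] theirs = ((Key) other).get();
            return mine != null && theirs != null && Arrays.equals(mine, theirs);
        }
    }

    private final ReferenceQueue<char[]> queue = new ReferenceQueue<>();
    private final Map<Key, Key> entries = new HashMap<>();

    /** Returns the canonical array with the same content, registering chars if none exists. */
    synchronized char[] intern(char[] chars) {
        // Drop entries whose arrays have been garbage collected.
        for (Object stale; (stale = queue.poll()) != null; ) {
            entries.remove(stale);
        }
        Key probe = new Key(chars, queue);
        Key existing = entries.get(probe);
        if (existing != null) {
            char[] canonical = existing.get();
            if (canonical != null) return canonical;
        }
        entries.put(probe, probe);
        return chars;
    }
}
```

Each entry costs a reference object plus a map slot, so sharing only pays off
when the same content recurs often enough to cover that overhead.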
Comment 6 Ed Burnette CLA 2005-03-25 11:34:34 EST
Excellent; if I'm reading this right you've eliminated about 20% of the Java
Model cache memory so far.
Comment 7 Jerome Lanneluc CLA 2005-03-25 12:10:28 EST
Thanks! Along with the fixes to bug 89090, bug 89092 and bug 89110, the saved
memory in JDT Core is 22.5%.

Unfortunately, I think I hit the limit of string sharing. ClassFile's names take
the most space now, but interning them has the opposite effect (the weak hashset
entries take more space than sharing the file names saves). This is because
there are not enough common names in class libraries.
Comment 8 Jerome Lanneluc CLA 2005-03-28 12:25:44 EST
Since there is no other opportunity to share strings in the Java model cache
that would reduce memory use, I'm marking this bug as fixed. Note that people
are welcome to open new bugs if other memory performance problems are noticed
in JDT Core.
Comment 9 David Audel CLA 2005-03-31 12:30:39 EST
Verified in I20050330-0500
Comment 10 Noel Grandin CLA 2005-04-20 09:34:54 EDT
Created attachment 20123
compress strings using utf-8 encoding

I had a weird idea, so I thought I'd try it out.

Since most strings are ASCII, the normal Java string encoding (UTF-16) is only
50% efficient for them.

So I converted ClassFile to store its strings in UTF-8 form using byte arrays.


This reduced the memory retained by ClassFile instances by 34% (as measured by
YourKit).

I'm not sure what this will do to CPU performance, since there is now
additional conversion happening, but the memory saved may be worth it for
larger projects.

Also, I tried converting PackageFragment#names to perform the same trick, but
that only effected a 1.2% saving.
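
The approach in the attached patch can be pictured with the following sketch.
It is not the patch itself; CompactName is a hypothetical class. The name is
kept as UTF-8 bytes, one byte per ASCII character, and decoded back to a
String on each access, which is where the extra CPU cost mentioned above comes
from. On Java 9 and later, compact strings already store Latin-1 content in one
byte per character, so the saving applies mainly to older VMs.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical holder that keeps a name as UTF-8 bytes instead of a String. */
final class CompactName {
    private final byte[] utf8;

    CompactName(String name) {
        // One byte per ASCII char, instead of two in a UTF-16 char[].
        this.utf8 = name.getBytes(StandardCharsets.UTF_8);
    }

    /** Decodes a fresh String on every call; this is the CPU side of the trade. */
    String get() {
        return new String(utf8, StandardCharsets.UTF_8);
    }

    /** Number of bytes retained for the name. */
    int storedBytes() {
        return utf8.length;
    }
}
```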