Bug 92339 - Reduce memory usage of strings in JavaModelCache
Summary: Reduce memory usage of strings in JavaModelCache
Status: RESOLVED WONTFIX
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Core
Version: 3.1
Hardware: PC Windows 2000
Importance: P3 enhancement
Target Milestone: 3.3
Assignee: JDT-Core-Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-04-22 03:07 EDT by Noel Grandin CLA
Modified: 2009-08-30 02:17 EDT (History)
0 users

See Also:


Attachments
compress strings using utf-8 encoding (4.93 KB, patch)
2005-04-22 03:08 EDT, Noel Grandin CLA

Description Noel Grandin CLA 2005-04-22 03:07:58 EDT
Hi

This is linked to bug 84872
(https://bugs.eclipse.org/bugs/show_bug.cgi?id=84872).

I had a weird idea, so I thought I'd try it out.

Since most Java identifier strings are ASCII, the normal Java string encoding
(UTF-16) spends two bytes per character where one would do, so it is only
about 50% space-efficient.

So I converted ClassFile to store its strings in UTF-8 form using byte arrays.

This reduced the memory retained by ClassFile instances by 34% (as measured by
YourKit).
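
A minimal sketch of the idea, for illustration only; it is not the code from
the attached patch. The class name CompactName and its methods are
hypothetical, and StandardCharsets assumes Java 7 or later:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical holder: keeps a name as UTF-8 bytes instead of a String.
final class CompactName {
    // One byte per ASCII character, versus two in UTF-16 character data.
    private final byte[] utf8;

    CompactName(String name) {
        this.utf8 = name.getBytes(StandardCharsets.UTF_8);
    }

    // Decodes a fresh String on every call; this is the extra CPU cost.
    String get() {
        return new String(utf8, StandardCharsets.UTF_8);
    }

    // Number of bytes of character data actually retained.
    int payloadBytes() {
        return utf8.length;
    }
}
```

For "java.lang.String" the byte array holds 16 bytes of character data,
against 32 bytes in UTF-16. Object and array headers are not counted here.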

I'm not sure what this will do to CPU performance, since there is now
additional conversion happening, but the memory saved may be worth it for
larger projects.

Also, I tried converting PackageFragment#names to perform the same trick, but
that only effected a 1.2% saving, so I dumped that part of the patch.
Comment 1 Noel Grandin CLA 2005-04-22 03:08:43 EDT
Created attachment 20225
compress strings using utf-8 encoding
Comment 2 Philipe Mulet CLA 2005-04-22 03:51:21 EDT
Recreating strings all the time must be deadly for GC.
It would help to recode the algorithms to operate on byte[] directly, so as to
avoid creating intermediate strings.
Comment 3 Noel Grandin CLA 2005-04-22 04:04:23 EDT
Not necessarily. 
Virtually all of the "re-created" objects would be short-lived and would cycle
through the eden space almost immediately. We create temporary objects
everywhere constantly. 

I contemplated writing a "UTF8String" class that would reduce the conversions
required and also interoperate with String, StringBuffer, etc. but concluded
that until a real need grew, we were better off with a simple solution.

But I could be persuaded -grin- ...

Also, writing a UTF8String class would remove some of the memory benefit, since
an additional object would be required. 
Unless I made all of the methods on UTF8String static, and left the stored type
as a byte[]... Hmmm, slightly awkward to use, but it could be workable.
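
A rough sketch of what the all-static variant might look like, assuming
modern Java (StandardCharsets needs Java 7 or later). The method names
encode, decode and contentEquals are made up for illustration and do not come
from the patch. contentEquals also shows one way an algorithm could work on
the byte[] directly and skip the String allocation for ASCII names:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical all-static helper; callers keep storing plain byte[] fields,
// so no wrapper object is allocated per name.
final class UTF8String {
    private UTF8String() {}

    static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    static String decode(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8);
    }

    // Compares stored bytes with a String. Pure-ASCII data is compared in
    // place without allocating; anything else falls back to decoding.
    static boolean contentEquals(byte[] utf8, String s) {
        boolean ascii = utf8.length == s.length();
        for (int i = 0; ascii && i < utf8.length; i++) {
            if (utf8[i] < 0) {
                ascii = false;              // non-ASCII byte: take the slow path
            } else if (utf8[i] != s.charAt(i)) {
                return false;               // ASCII prefix already differs
            }
        }
        return ascii || decode(utf8).equals(s);
    }
}
```

Usage would look like UTF8String.contentEquals(nameBytes, "Object"), where
nameBytes is whatever byte[] field the owning class stores.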

Comment 4 Jerome Lanneluc CLA 2006-03-28 10:52:45 EST
Interesting idea. Will consider post 3.2.
Comment 5 Denis Roy CLA 2009-08-30 02:17:12 EDT
As of now 'LATER' and 'REMIND' resolutions are no longer supported.
Please reopen this bug if it is still valid for you.