92339 – Reduce memory usage of strings in JavaModelCache

Status:	RESOLVED WONTFIX

Alias:	None

Product:	JDT
Classification:	Eclipse Project
Component:	Core
Version:	3.1
Hardware:	PC Windows 2000

Importance:	P3 enhancement
Target Milestone:	3.3
Assignee:	JDT-Core-Inbox
QA Contact:

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2005-04-22 03:07 EDT by Noel Grandin
Modified:	2009-08-30 02:17 EDT
CC List:	0 users

See Also:

Attachments
compress strings using utf-8 encoding (4.93 KB, patch) 2005-04-22 03:08 EDT, Noel Grandin

Description Noel Grandin

2005-04-22 03:07:58 EDT

Hi

This is linked to bug 84872 (https://bugs.eclipse.org/bugs/show_bug.cgi?id=84872).

I had a weird idea, so I thought I'd try it out.

Since most java identifier strings are ascii, the normal java string encoding
(utf-16) is thus about 50% efficient.

So I converted ClassFile to store its strings in utf-8 form using byte arrays.

This reduced the memory retained by ClassFile instances by 34% (as measured by
YourKit).
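
To make the idea concrete, here is a minimal sketch of the technique. It is not the code from the attached patch: the class name CompactName and its methods are hypothetical, and it uses StandardCharsets, which only exists in Java 7 and later. It assumes a char-based String payload of 2 bytes per character.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical holder for illustration only; not the class from the attached patch.
final class CompactName {
    // UTF-8 form: one byte per ASCII character, more for anything else.
    private final byte[] utf8;

    CompactName(String name) {
        this.utf8 = name.getBytes(StandardCharsets.UTF_8);
    }

    // Decodes on every call. This is the extra conversion cost:
    // each call allocates a new, usually short-lived, String.
    String get() {
        return new String(utf8, StandardCharsets.UTF_8);
    }

    // Number of payload bytes actually retained (excluding array header).
    int storedBytes() {
        return utf8.length;
    }

    public static void main(String[] args) {
        String name = "org.eclipse.jdt.internal.core.ClassFile";
        CompactName compact = new CompactName(name);
        // Assumes 2 bytes per char for the String payload.
        System.out.println("UTF-16 payload: " + (name.length() * 2) + " bytes");
        System.out.println("UTF-8 payload:  " + compact.storedBytes() + " bytes");
        System.out.println("round trip ok:  " + compact.get().equals(name));
    }
}
```

The sketch only compares character payloads. Object and array headers are not counted, which is one reason a measured saving can be well below the 50% suggested by the encoding alone.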

I'm not sure what this will do to CPU performance, since there is now
additional conversion happening, but the memory saved may be worth it for
larger projects.

Also, I tried converting PackageFragment#names to perform the same trick, but
that only effected a 1.2% saving, so I dumped that part of the patch.

Comment 1 Noel Grandin

2005-04-22 03:08:43 EDT

Created attachment 20225
compress strings using utf-8 encoding

Comment 2 Philipe Mulet

2005-04-22 03:51:21 EDT

Recreating strings all the time must be deadly for GC.
It would benefit from recoding algorithms to perform on byte[] as well, so as to
save internal string creations.

Comment 3 Noel Grandin

2005-04-22 04:04:23 EDT

Not necessarily. 
Virtually all of the "re-created" objects would be short-lived and would cycle
through the eden space almost immediately. We create temporary objects
everywhere constantly. 

I contemplated writing a "UTF8String" class that would reduce the conversions
required and also interoperate with String, StringBuffer, etc. but concluded
that until a real need grew, we were better off with a simple solution.

But I could be persuaded -grin- ...

Also, writing a UTF8String class would remove some of the memory benefit, since
an additional object would be required. 
Unless I made all of the methods on UTF8String static, and left the stored type
as a byte []... Hmmm, slightly awkward to use, but it could be workable.
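
A rough sketch of what the all-static variant might look like. None of this exists in the patch: the method names are assumptions chosen for illustration, and standard UTF-8 is assumed rather than the modified UTF-8 used inside class files. It also shows how some operations could run directly on the byte[] without creating a String.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical utility: the stored type stays a plain byte[], so no wrapper
// object is allocated per string. All method names are illustrative.
final class UTF8String {
    private UTF8String() {
        // static helpers only
    }

    static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    static String decode(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8);
    }

    // Byte-wise equality matches String equality for validly encoded input,
    // and needs no intermediate String.
    static boolean equals(byte[] a, byte[] b) {
        return Arrays.equals(a, b);
    }

    // Consistent with equals above, but not equal to String.hashCode().
    static int hashCode(byte[] utf8) {
        return Arrays.hashCode(utf8);
    }

    // Returns the BYTE index of the last occurrence of an ASCII character,
    // or -1. Safe because ASCII bytes never occur inside a multi-byte
    // UTF-8 sequence.
    static int lastIndexOf(byte[] utf8, char ascii) {
        if (ascii > 0x7F) {
            throw new IllegalArgumentException("ASCII characters only");
        }
        for (int i = utf8.length - 1; i >= 0; i--) {
            if (utf8[i] == (byte) ascii) {
                return i;
            }
        }
        return -1;
    }
}
```

Callers would write UTF8String.equals(a, b) instead of a.equals(b), which is the awkwardness mentioned above. The byte index returned by lastIndexOf equals the char index only when everything before it is ASCII.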

Comment 4 Jerome Lanneluc

2006-03-28 10:52:45 EST

Interesting idea. Will consider post 3.2.

Comment 5 Denis Roy

2009-08-30 02:17:12 EDT

As of now 'LATER' and 'REMIND' resolutions are no longer supported.
Please reopen this bug if it is still valid for you.