Bug 227986 - Avoid duplicated strings in Java model
Summary: Avoid duplicated strings in Java model
Status: VERIFIED FIXED
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Core
Version: 3.4
Hardware: PC Windows XP
Importance: P3 normal
Target Milestone: 3.5 M4
Assignee: Jerome Lanneluc
QA Contact:
URL:
Whiteboard:
Keywords: performance
Depends on:
Blocks:
 
Reported: 2008-04-21 07:11 EDT by Martin Aeschlimann
Modified: 2008-12-09 10:07 EST
CC List: 8 users

See Also:


Attachments
Proposed fix (4.82 KB, patch)
2008-11-26 10:43 EST, Jerome Lanneluc

Description Martin Aeschlimann 2008-04-21 07:11:23 EDT
20080421

Using YourKit 7.0 on my development workspace, I found that there were 2,600 instances of the string 'refactoring' in memory, taking 2.7 MB.

Almost all references to this string came from instances of type IPackageFragment. Each instance of an IPackageFragment seems to have its own String instance for each package name segment.

When creating an IPackageFragment, maybe you can use the string from the IPath of the underlying resource: the resource model already makes sure that all IPath elements share their segment Strings.
Comment 1 Martin Aeschlimann 2008-04-21 07:15:18 EDT
Other duplicates are 'org' (1.8 MB wasted), 'eclipse' (1.9 MB), 'jdt' (1.6 MB), 'ui' (1.5 MB), 'internal' (1.1 MB), ...

So a fix in this area could really pay off...
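
To illustrate what a duplicate-string report measures, here is a minimal sketch that counts how many distinct String instances exist for each value among a set of references. The class and method names (DuplicateStrings, instancesPerValue) are hypothetical; this is not how YourKit or JDT implement anything, only an illustration of equal strings that are not the same object.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, for illustration only.
class DuplicateStrings {
    // For each string value, count the distinct String objects that carry it.
    static Map<String, Integer> instancesPerValue(Iterable<String> refs) {
        Map<String, Set<String>> byValue = new HashMap<>();
        for (String s : refs) {
            // The inner set compares by identity (==), not by equals().
            byValue.computeIfAbsent(s, k ->
                    Collections.newSetFromMap(new IdentityHashMap<String, Boolean>()))
                   .add(s);
        }
        Map<String, Integer> counts = new HashMap<>();
        byValue.forEach((value, instances) -> counts.put(value, instances.size()));
        return counts;
    }
}
```

A value with a count above 1 is held in memory more than once; every extra instance is avoidable if the holders share one String.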
Comment 2 Jerome Lanneluc 2008-04-23 09:43:25 EDT
Martin, do you have more details on your scenario? Interning Strings has a cost, and I don't want to slow down every IPackageFragment handle creation, so I will optimize for space only if it does not come at the cost of speed. This is why I need to know more about your scenario: I need to know how the IPackageFragments that you saw were created.
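
The trade-off can be made concrete with a small sketch of canonicalizing name segments through a shared pool. This assumes a plain HashMap-based pool as a stand-in for interning; SegmentPool is a hypothetical class, not a JDT type. Each handle creation would pay one hash lookup per segment in exchange for sharing the String instances.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical pool, for illustration only; not part of JDT.
class SegmentPool {
    private final Map<String, String> pool = new HashMap<>();

    // Returns the pooled instance equal to the given segment,
    // adding the segment to the pool if it is not there yet.
    String canonical(String segment) {
        String existing = pool.putIfAbsent(segment, segment);
        return existing != null ? existing : segment;
    }

    // Canonicalizes every segment: one hash lookup per segment,
    // which is the speed cost paid on each handle creation.
    String[] canonical(String[] segments) {
        String[] result = new String[segments.length];
        for (int i = 0; i < segments.length; i++) {
            result[i] = canonical(segments[i]);
        }
        return result;
    }

    int size() {
        return pool.size();
    }
}
```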
Comment 3 Martin Aeschlimann 2008-04-23 10:37:00 EDT
The memory snapshot I took was after a day of work, so I can't say exactly. But just before taking the snapshot I was doing searches in the jdt.ui source code, for example searching for 'IJavaElement.getElementName', and looking at all search results in the Search view. Hope this helps.

If I find some time I'll try to construct some steps.


Comment 4 Markus Kohler 2008-05-20 05:00:23 EDT
Hi all,
I can help you out with this.
See my blog at http://kohlerm.blogspot.com/2008/05/analyzing-memory-consumption-of-eclipse.html for how to analyze this with the Eclipse Memory Analyzer.

If Martin could provide us with an hprof heap dump, the analysis should be easy.

Regards,
Markus
Comment 5 Jerome Lanneluc 2008-09-02 05:23:57 EDT
Steps or an hprof heap dump are still needed.
Comment 6 Jerome Lanneluc 2008-11-26 10:41:21 EST
I was able to find a case where package fragments hold duplicate strings. To observe this, I took a snapshot of my development workspace using YourKit 7.5.11 and ran the "Duplicate Strings" inspection. It showed that "jdt" and "org" were mostly duplicated.

After investigation, it appears that NameLookup would create instances of PackageFragment with the String[] resulting from splitting the package name, instead of reusing the String[] from the packageFragments cache.
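
The difference between the two approaches can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual NameLookup code or the attached patch: FragmentNameCache and its methods are hypothetical, and a HashMap keyed by the dotted name stands in for the packageFragments cache.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the lookup logic; not the JDT implementation.
class FragmentNameCache {
    // One shared String[] per package name (assumed cache layout).
    private final Map<String, String[]> cache = new HashMap<>();

    // Problematic pattern: every call allocates a new array and
    // new String instances for each segment of a dotted name.
    String[] splitEveryTime(String packageName) {
        return packageName.split("\\.");
    }

    // Preferred pattern: split once, then hand out the cached array,
    // so all handles for the same package share the same segments.
    String[] reuseCached(String packageName) {
        return cache.computeIfAbsent(packageName, name -> name.split("\\."));
    }
}
```

With splitEveryTime, two handles for "org.eclipse.jdt" each hold their own "org", "eclipse" and "jdt" instances; with reuseCached they hold the same array.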
Comment 7 Jerome Lanneluc 2008-11-26 10:43:15 EST
Created attachment 118802
Proposed fix

Note that no regression tests can be written for memory improvements. To verify, either run YourKit's inspection or check the code.
Comment 8 Jerome Lanneluc 2008-11-27 07:13:44 EST
Fix released for 3.5M4
Comment 9 David Audel 2008-12-09 10:07:49 EST
Verified for 3.5M4 using build I20081208-1800