Bug 245008 - Workspace's ElementTree object holding 500MB in large workspace
Summary: Workspace's ElementTree object holding 500MB in large workspace
Status: RESOLVED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Resources
Version: 3.4
Hardware: PC Windows XP
Importance: P3 critical
Target Milestone: 3.5.2
Assignee: Platform-Resources-Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on: 292267 294429
Blocks:
Reported: 2008-08-22 18:03 EDT by Ernest Mah CLA
Modified: 2010-10-07 06:44 EDT (History)
15 users

See Also:


Attachments
Workspace's ElementTree object memory use (1.41 MB, image/jpeg)
2008-08-22 18:09 EDT, Ernest Mah CLA
no flags Details
Image of JavaBuilder's ElementTree (333.28 KB, image/jpeg)
2008-08-25 12:47 EDT, Ernest Mah CLA
no flags Details
Patch 1 (671 bytes, patch)
2009-09-29 11:25 EDT, James Blackburn CLA
no flags Details | Diff
Multiple refs to Folder name (153.69 KB, image/png)
2009-10-02 07:05 EDT, James Blackburn CLA
no flags Details
Patch instrumenting AbstractDataTreeNode to find substring memory pigs (1.87 KB, patch)
2009-10-05 10:18 EDT, Martin Oberhuber CLA
no flags Details | Diff
Benchmark of string shortening (2.10 KB, text/x-java)
2009-10-28 11:41 EDT, John Arthorne CLA
no flags Details
Lots of core.resource heap even after pruning fs tree (168.79 KB, image/png)
2010-07-30 08:19 EDT, James Blackburn CLA
no flags Details
char arrays in core.resources (268.15 KB, image/png)
2010-07-30 08:45 EDT, James Blackburn CLA
no flags Details
large_project.tar.bz2 (437.87 KB, application/x-bzip2)
2010-08-02 08:14 EDT, James Blackburn CLA
no flags Details
Leaked ProjectDescriptionReader. (99.41 KB, image/png)
2010-08-02 15:40 EDT, Konstantin Scheglov CLA
no flags Details
String duplicates (21.05 KB, image/gif)
2010-08-03 12:22 EDT, Martin Oberhuber CLA
no flags Details
Minipatch to suppress duplicates due to symlinks (1.01 KB, patch)
2010-08-03 12:48 EDT, Martin Oberhuber CLA
no flags Details | Diff

Description Ernest Mah CLA 2008-08-22 18:03:59 EDT
Build ID: v20080604-1400

Steps To Reproduce:
We have a customer workspace that is imported and a clean build run against it.  During the course of the build (84%) I get an OOME.  This is running with a max heap size of 768M.

In the heap dump, one of the larger objects is org.eclipse.core.internal.watson.ElementTree (hosted by the Workspace object).  This is 500MB.  It contains 2 large objects:
DeltaDataTree (283MB)
DataTreeLookup (217MB)


Workspace details: 548MB, 166 projects and 40,841 files

I can provide the heapdump on request.
Comment 1 Ernest Mah CLA 2008-08-22 18:09:29 EDT
Created attachment 110723 [details]
Workspace's ElementTree object memory use

Added a screenshot of heapdump details for the Workspace's ElementTree object.
Comment 2 Thomas Watson CLA 2008-08-25 12:01:09 EDT
com.ibm.cds.CDSBundleFile is not part of Eclipse.  This class is used to support shared classes on the IBM VM.  The memory held by that class may be a red herring.  CDSBundleFile has references to objects from the Shared Classes support in the IBM VM, which likely shows up as large chunks of memory used for the shared class data.

It would be interesting to know if you have the same OOME when com.ibm.cds is disabled.
Comment 3 Ernest Mah CLA 2008-08-25 12:47:36 EDT
Created attachment 110817 [details]
Image of JavaBuilder's ElementTree

295MB ElementTree retained by JavaBuilder
Comment 4 Ernest Mah CLA 2008-08-25 12:49:24 EDT
I'll see if I can retry the run without the shared class support enabled.  Could someone explain why the ElementTree can grow to such a size during a build?  It looks like, in the JavaBuilder case, it's almost half the size of the workspace.
Comment 5 John Arthorne CLA 2008-08-25 13:28:35 EDT
The ElementTree really *is* the workspace. This structure holds the tree of every project, folder, and file in the workspace. These resources also have metadata such as markers, session properties, repository sync info, etc. The tree also holds onto many objects owned by other components, such as builders. It's hard to look at these numbers and trace them back to any particular culprit.

The first trace shows that a URLImageDescriptor that someone is stashing in a resource property transitively refers to the memory held by CDS. Since, as you point out, it is only a weak reference, it means someone else has a strong reference to this data and the ImageDescriptor object isn't actually responsible for holding onto this large amount of memory. The second trace with the Java builder is pointing back to the same tree, so it's likely the same culprit.
Comment 6 Ernest Mah CLA 2008-08-25 23:01:18 EDT
Understood that ElementTree represents the workspace.  Just wondering if the size is justified.  In the scenario, the raw space used by the files is 548MB, so a little over a half of that size is in the representation of the workspace.  Anything I can do to dig into what might be the culprit(s) here?
Comment 7 James Blackburn CLA 2008-09-25 07:10:11 EDT
I've had a similar OOM exception when building a workspace consisting of CDT + a couple of core platform plugins.  My default heap size is 768M as well.

Eclipse Memory Analyzer reports 2 leak suspects:
1) SaveManager 10% (~69M) for 1 instance
2) DeltaDataTree: 36.9% (~237M) for 52 instances 
Comment 8 Martin Oberhuber CLA 2009-09-29 08:19:20 EDT
A similar discussion is currently on the platform-core-dev mailing list:
   http://dev.eclipse.org/mhonarc/lists/platform-core-dev/msg01378.html

It would certainly be good if those interested in this issue always run with a
JVM option to have a heap dump generated on OutOfMemoryError. See
   http://wiki.eclipse.org/index.php/MemoryAnalyzer#Getting_a_Heap_Dump
For Sun JVMs, the option is
   -vmargs -XX:+HeapDumpOnOutOfMemoryError
Comment 9 James Blackburn CLA 2009-09-29 11:25:07 EDT
Created attachment 148334 [details]
Patch 1

A trivial patch to prevent Data Tree from holding onto the IResource's FullPath.

IResources are lightweight and created at time of use (see Container#get{File|Folder}). The segment returned from IPath#lastSegment is backed by the full IPath String, which wastes memory when the IResources are garbage collected.
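A minimal sketch of the effect the patch targets, assuming the pre-Java-7 String.substring behavior that shares the parent's backing char[] (the path below is hypothetical, not actual ElementTree code):

```java
public class LastSegmentBaggage {
    public static void main(String[] args) {
        String fullPath = "/org.eclipse.cdt.releng/org.eclipse.cdt.core.tests/Foo.java";
        // On pre-Java-7 JVMs this substring keeps fullPath's entire char[] alive:
        String segment = fullPath.substring(fullPath.lastIndexOf('/') + 1);
        // The fix: copy the characters so a tree node retains only the short name.
        String name = new String(segment.toCharArray());
        System.out.println(name); // Foo.java
    }
}
```

The copy costs one small allocation per node but lets the long full-path string become garbage as soon as the IResource does.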
Comment 10 Martin Oberhuber CLA 2009-09-29 17:41:38 EDT
Excellent! Now with a small modification to core.tests.resources / WorkspacePerformanceTest, which does some gc() and prints used memory after the tests (at least the refresh one), we should be able to verify that this change actually brings an improvement without sacrificing performance...
Comment 11 Markus Keller CLA 2009-09-30 10:47:42 EDT
(In reply to comment #9)
This pruning could also be worth putting into the StringPool (then it shouldn't be necessary in ElementTree any more).
Comment 12 Pawel Pogorzelski CLA 2009-10-01 16:01:23 EDT
(In reply to comment #9)
> Created an attachment (id=148334) [details]
> Patch 1
> 
> A trivial patch to prevent Data Tree from holding onto the IResource's
> FullPath.

James, the patch supplied addresses only a fraction of the problem. I opened a workspace with a large set of Java projects and ran a build on it. During the process there were more than 4 million calls to AbstractDataTreeNode(String name, AbstractDataTreeNode[] children) with a string that points to a buffer holding more chars than necessary. Not even 1% of the calls are triggered by ElementTree.createElement(IPath key, Object data).

I'm still investigating the issue. We have to make sure the fix covers the problem comprehensively and doesn't impact execution time and consumed memory (in case references to the original string are not lost).
Comment 13 James Blackburn CLA 2009-10-01 17:07:34 EDT
(In reply to comment #12)
> James, the patch supplied addresses only a fraction of the problem. I opened a
> workspace with a large set of java projects and run a build on it. During the
> process there were more than 4 million calls to AbstractDataTreeNode(String
> name, AbstractDataTreeNode[] children) with a string that points to a buffer
> holding more chars than necessary. Not even 1% of the calls is triggered by
> ElementTree.createElement(IPath key, Object data).

Ok. But the story may be more nuanced than that.  From looking at the heapdump it appeared that many elements in the tree were backed by the same Strings, possibly different segments of the same path.

Things, for me at least, only started getting large when the nodes of the DeltaDataTree were created.

I haven't looked into this too deeply, so no doubt you know better than I do and can come up with a good solution :)
Comment 14 Martin Oberhuber CLA 2009-10-01 22:16:44 EDT
(In reply to comment #13)
> Ok. But the story may be more nuanced than that.  From looking at the heapdump
> it appeared that many elements in the tree were backed by the same Strings,
> possible different segments of the same path.

How did you come to that conclusion? When I did a "group by value" of all the large char[] arrays in my sample heap dump, it seemed to me that each value was held on to by only one reference.

But in general I agree that this needs to be closely investigated, and especially tested for memory and performance impact. If it turns out that a single char[] array is in fact re-used by many references looking at different portions of the array, a "new String()" may actually worsen the situation rather than improve it.

I have not yet found a really good testcase. The "WorkspacePerformanceTest" seemed to take only 3MB of heap; and looking at a "live" workspace manually raises the question of what operations should be monitored for potential impact and how. I tried looking at the heap meter in the status bar and manually performing garbage collections, but with mixed success, since the large memory consumption during a build is only transient. A measured unit test still feels like the most reliable way to go; at least it should help us gain confidence that we are not making things worse compared to the existing performance baseline.

> Things, for me at least, only started getting large when the nodes of the
> DeltaDataTree were created.

Yes, I also had that impression. Just opening a large workspace (which reads persisted state information from disk) does not seem to be a problem, whereas a large refresh operation, which may arguably be triggered by a Java build, creates the DeltaDataTree that is a problem.
Comment 15 Pawel Pogorzelski CLA 2009-10-02 06:02:26 EDT
(In reply to comment #14)
> When I did a "group by value" of all the large char[] arrays in my sample heap 
> dump, it seemed to me that each value was held on to by only one reference.

I got the same impression from looking at the dump.
Comment 16 James Blackburn CLA 2009-10-02 07:05:37 EDT
Created attachment 148627 [details]
Multiple refs to Folder name

(In reply to comment #15)
> (In reply to comment #14)
> > When I did a "group by value" of all the large char[] arrays in my sample heap 
> > dump, it seemed to me that each value was held on to by only one reference.
> 
> I got the same impression from looking at the dump.

You're right, there don't appear to be multiple segments from the same IPath, but the IPath is held onto by very many ElementTree objects in their childIDsCache, which I think is what I saw.

In the attached screenshot the org.eclipse.cdt.core.tests String value (a folder under the org.eclipse.cdt.releng project) is referenced from 1400+ ElementTree ChildIDsCaches.  The same IPath segment backs the Strings, and the DataDeltaNode & ElementTree instances appear to be unique.  The picture is confused because the ChildIDsCache class isn't static, so it has a pointer back to the containing ElementTree.
Comment 17 James Blackburn CLA 2009-10-02 08:11:20 EDT
(In reply to comment #12)
> Not even 1% of the calls is triggered by
> ElementTree.createElement(IPath key, Object data).

That is pretty obvious when you look at the call graph of the AbstractDataTreeNode constructor.

However, most of these call sites are _internal_ to the o.e.c.internal.dtree package, where there are methods for copying trees (copyCompleteSubtrees), calculating deltas, etc.

I made the assumption (perhaps wrongly) that the String wastage is caused by the call-sites responsible for building the tree in the first place.  If you do a Java search for references to elements in o.e.c.i.dtree you'll get many fewer hits.

The main entry point does appear to be ElementTree.

There are a few calls in ResourceDeltaFactory & BuildManager for calculating deltas, and some calls from the serialization classes. But unless I'm missing something, TreeNode creation seems to be funneled through ElementTree.
Comment 18 Martin Oberhuber CLA 2009-10-05 08:53:19 EDT
Would it be possible to completely replace the current
    String AbstractDataTreeNode#name
field by a
    char[] AbstractDataTreeNode#name
field?

Some facts in favor of doing this:
  - Low memory consumption is important for the ElementTree. A String object
    takes 24 bytes more than the underlying char[] array. In a workspace
    with 300,000 elements, this equals 7.2 MB just for the String
    overhead.
  - The name is immutable anyway, so no advanced operations are necessary.
  - Replacing the signature will make 100% sure that there won't be any
    leftovers of substring() eating precious memory.

Some caveats:
  - Using a char[] array internally requires *always* copying. String sharing
    via the StringPool won't work for this. Do we have any numbers on how much
    StringPool.getSavedStringCount() we get for the ElementTree specifically
    in a large workspace?
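A sketch of what the proposed signature change might look like (hypothetical class, not the actual AbstractDataTreeNode code):

```java
public class CharArrayNode {
    // Storing the name as char[] saves the per-String object overhead and makes
    // substring baggage impossible, at the cost of always copying on the way in.
    private final char[] name;

    public CharArrayNode(String name) {
        this.name = name.toCharArray(); // always an exact-length copy
    }

    public String getName() {
        return new String(name); // materialize a String only on demand
    }

    public static void main(String[] args) {
        CharArrayNode n = new CharArrayNode("/long/path/Foo.java".substring(11));
        System.out.println(n.getName()); // Foo.java
    }
}
```

The trade-off is exactly the caveat above: every lookup that needs a String pays for a fresh copy, and pool-based sharing no longer applies.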
Comment 19 Martin Oberhuber CLA 2009-10-05 09:33:30 EDT
(In reply to comment #11)
> This pruning could also be worth putting into the StringPool (then it shouldn't
> be necessary in ElementTree any more).

I had a look at this suggestion, and at first sight it seems a great idea
to perform the substring pruning in StringPool's
   String StringPool#add(String)
but I'm afraid that for the issue we are discussing here, pruning only in StringPool may be too late.

The problem that at least I am seeing is that the *temporary* memory usage of DeltaDataTree leads to an OutOfMemoryError during a large workspace refresh or Java build job. This is before the StringPoolJob even gets a chance to internalize the Strings.

This doesn't mean, though, that substring pruning in StringPool doesn't make sense *in addition to* anything we are doing in the DeltaDataTree, if we think that the StringPool should be optimized for memory. The only caveat is that if multiple substring sections of one and the same large String are being held onto by different clients, the substring pruning may actually lead to *increased* memory usage. As said before, it's important to test the real-life outcome of any changes.
Comment 20 Martin Oberhuber CLA 2009-10-05 10:18:00 EDT
Created attachment 148779 [details]
Patch instrumenting AbstractDataTreeNode to find substring memory pigs

In order to find out what clients lead to creating Strings in the ElementTree that use more memory than needed, I wrote attached patch which instruments the AbstractDataTreeNode constructor.

Using introspection, this is able to find out whether a String being passed in as the "name" actually is a "plain" string or a substring that carries more characters in its underlying character array than necessary.

Set a breakpoint at line 59 to see who creates Strings with an extra stowaway payload.

This patch is of course not ready to go into any product, but I hope it will help us diagnose the problem further.
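The core of the instrumentation idea can be sketched as follows. This is an assumption-laden illustration: it relies on a private 'value' field inside String, which exists on pre-Java-9 class libraries but is encapsulated on newer JVMs, so the helper degrades gracefully rather than failing:

```java
import java.lang.reflect.Array;
import java.lang.reflect.Field;

public class BaggageCheck {
    /**
     * Returns the length of the String's backing array, or -1 if the class
     * library doesn't expose it (e.g. Java 9+ encapsulates String internals).
     */
    static int backingLength(String s) {
        try {
            Field value = String.class.getDeclaredField("value");
            value.setAccessible(true);
            return Array.getLength(value.get(s));
        } catch (ReflectiveOperationException | RuntimeException e) {
            return -1; // field missing or inaccessible: can't diagnose here
        }
    }

    public static void main(String[] args) {
        String seg = "/a/very/long/workspace/path/Foo.java".substring(28);
        // On a pre-Java-7 JVM the backing array would be 36 chars (the whole
        // path) while length() is only 8 -- that gap is the "stowaway payload".
        System.out.println("length=" + seg.length()
                + " backing=" + backingLength(seg));
    }
}
```

A breakpoint where backingLength(name) exceeds name.length() reveals the call site that introduced the baggage.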
Comment 21 Martin Oberhuber CLA 2009-10-05 10:22:34 EDT
Using the patch just attached, I found that Strings with stowaway characters are created during the "open project" and "close project" operations, when Markers go into DeltaDataTree.setData():

DataDeltaNode(AbstractDataTreeNode).<init>(String, AbstractDataTreeNode[]) line: 59	
DataDeltaNode(DataTreeNode).<init>(String, Object) line: 32	
DataDeltaNode.<init>(String, Object) line: 27	
DeltaDataTree.setData(IPath, Object) line: 919	
ElementTree.openElementData(IPath) line: 661	
Workspace.getResourceInfo(IPath, boolean, boolean) line: 1243	
MarkerManager.changedMarkers(IResource, IMarkerSetElement[]) line: 228	
MarkerReader_3.read(DataInputStream, boolean) line: 86	
MarkerReader.read(DataInputStream, boolean) line: 50	
MarkerManager.restoreFromSave(IResource, boolean) line: 528	
MarkerManager.restore(IResource, boolean, IProgressMonitor) line: 513	
SaveManager.restoreMarkers(IResource, boolean, IProgressMonitor) line: 721	
SaveManager.restore(Project, IProgressMonitor) line: 699	
Project.open(int, IProgressMonitor) line: 908	
Project.open(IProgressMonitor) line: 947	

The IPath in this case is a workspace path, from which only the last segment is being used. I do believe that using a "new String()" in this place as well should be beneficial, since the entire workspace path should not be a long-lived object whereas the DataTree maybe is.

Unfortunately I cannot invest much more time in this right now, but I hope that we'll now be able to eliminate those places that lead to excessive memory consumption, while still avoiding much extra work.
Comment 22 Martin Oberhuber CLA 2009-10-16 10:03:05 EDT
For everybody's information, another OutOfMemoryError during workspace refresh has been fixed as per bug 292267. That issue (in UnifiedTree, and transient during refresh) is unrelated though to the issue discussed here (in DeltaDataTree, and potentially persistent).
Comment 23 Martin Oberhuber CLA 2009-10-27 18:06:48 EDT
(In reply to comment #20)

At Wind River, we have been testing for a while now using the patch from attachment 148779 [details], and we have found the patch to be very effective at reducing memory consumption without any perceivable negative impact.

Counter to my original assessment, I'm wondering whether that patch might be allowable in a release even though it uses introspection. The fact is that the patch fixes the problem exactly at its root: at the point where Strings with unnecessary payload are about to be added to a DataTree (which is meant to be a memory-efficient data structure).

What do other committers think? Would it make sense to give it a try and commit this patch into HEAD once 3.6m3 is out -- just to see how the performance results change?

The other option (not using introspection) is using the patch to detect at what "entry points" those Strings with extra payload get into the system, and replace those occurrences with "new String(...)" constructs. The drawback of that approach is that it always creates new Strings whereas the introspection method creates a new String only if necessary.
Comment 24 John Arthorne CLA 2009-10-28 11:41:36 EDT
Created attachment 150740 [details]
Benchmark of string shortening

From a quick benchmark (attached here), the reflective approach seems to be 100x slower than simply doing new String(name) every time regardless of input. The reflective code may be useful for finding where this is happening, but if we were to release something in HEAD I would opt for a simpler non-reflective solution. I'm also not sure it's specified anywhere that String has a private field called 'value', and maybe it's not the same for all class library implementations.
Comment 25 John Arthorne CLA 2009-10-28 11:45:01 EDT
I also tested with "new String(name.toCharArray())" which is about the same cost as "new String(name)". The latter case isn't guaranteed to copy/shorten the underlying character array whereas the first one always will (unfortunately it will have to copy it twice but it didn't seem to have measurable performance impact).
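The two variants compared above can be put side by side in a small sketch (hypothetical path; whether new String(name) trims the backing array is implementation-specific, as noted):

```java
public class CopyVariants {
    public static void main(String[] args) {
        String full = "/projects/apps_x/results/long_test_parameter_name/Foo.java";
        String seg = full.substring(full.lastIndexOf('/') + 1);
        // May still share an oversized backing array on some class libraries:
        String maybeShared = new String(seg);
        // Always copies exactly length() chars, guaranteeing no stowaway payload
        // (at the cost of copying twice):
        String alwaysTrimmed = new String(seg.toCharArray());
        System.out.println(maybeShared.equals(alwaysTrimmed)); // true
    }
}
```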
Comment 26 Martin Oberhuber CLA 2009-10-28 11:55:09 EDT
(In reply to comment #24)
Thanks John, this is very interesting!

I am a little afraid of just blindly doing "new String(foo)" but it may be worth a try. A more interesting replacement would be something like we do in StringPool today, assuming an identity map in there which always only hosts shortened Strings:

   String shortString = stringPool.get(longString)
   if (shortString == null) {
      shortString = new String(longString);
      stringPool.add(shortString);
   }
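Fleshed out, that sketch might look like this (hypothetical helper class; the real StringPool API differs):

```java
import java.util.HashMap;
import java.util.Map;

public class DeltaStringPool {
    // Maps each string to a canonical trimmed copy, so duplicates collapse
    // to a single instance while a large DeltaDataTree is being built.
    private final Map<String, String> pool = new HashMap<>();

    public String intern(String longString) {
        String shortString = pool.get(longString);
        if (shortString == null) {
            // toCharArray() guarantees the copy drops any oversized backing array.
            shortString = new String(longString.toCharArray());
            pool.put(shortString, shortString);
        }
        return shortString;
    }

    public static void main(String[] args) {
        DeltaStringPool p = new DeltaStringPool();
        String a = p.intern("/x/y/Foo.java".substring(5));
        String b = p.intern("/z/w/Foo.java".substring(5));
        System.out.println(a == b); // true: equal names share one instance
    }
}
```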

Trying that in a benchmark with, say, 400,000 random Strings where some of them are actually duplicates and/or actually substrings would be extremely interesting. I'd especially be interested in the cost of hash lookup versus the cost of new String(), and the cost of time (processing power) versus the cost of wasting memory and garbage collecting. Given that Strings store their hashCode in an internal cache, I could imagine that the performance is OK -- and it would give the extra benefit of avoiding duplicate Strings from the start.

Could you try that out?
Comment 27 John Arthorne CLA 2009-10-28 12:04:48 EDT
(In reply to comment #26)
> (In reply to comment #24)
> Thanks John, this is very interesting!
> 
> I am a little afraid of just blindly doing "new String(foo)" but it may be
> worth a try. A more interesting replacement would be something like we do in
> StringPool today, assuming an identity map in there which always only hosts
> shortened Strings:

The problem with doing that is maintaining the pool over time. If I delete a very large project, all of its strings should go away quickly. This could perhaps be done as part of the background string pool job, which would help with average memory consumption but not with peak consumption.
Comment 28 Martin Oberhuber CLA 2009-10-28 12:15:56 EDT
What about creating a fresh StringPool just for the sake of building the DeltaDataTree, and then throwing it away?

Our problem at the moment is building very large trees due to very large deltas. Once the tree is built, processing it is not a problem and the current "storeStrings()" will take care of not making string count explode over time.

Martin
Comment 29 Thomas Watson CLA 2009-10-28 13:47:54 EDT
(In reply to comment #25)
> I also tested with "new String(name.toCharArray())" which is about the same
> cost as "new String(name)". The latter case isn't guaranteed to copy/shorten
> the underlying character array whereas the first one always will (unfortunately
> it will have to copy it twice but it didn't seem to have measurable performance
> impact).

See bug 287183 for a case where this hit our memory consumption in the NLS class.
Comment 30 Martin Oberhuber CLA 2010-06-05 02:36:18 EDT
FYI, I'm about 95% sure that this problem has been fixed with the fixes for 
bug 292267 and bug 294429, both of which have not only been fixed in Eclipse 3.6 but also been backported to Eclipse 3.5.2.

I just haven't found the time yet to verify this with our original usage scenario.

If anybody still sees this issue (or something related to it) with Eclipse 3.6 or 3.5.2, please let us know here on the bug (because then I know my assumption is wrong and I don't need to go verifying).

Likewise, it would be very helpful if any other commenters here could see if their problem has been fixed, particularly the original reporter.
Comment 31 James Blackburn CLA 2010-07-30 08:19:41 EDT
Created attachment 175560 [details]
Lots of core.resource heap even after pruning fs tree

I've just had a user hit this, using Eclipse 3.6 + core.resources HEAD.  Starting Eclipse prompts a full refresh, which causes an OOM and crash. Repeatable for the user with a 768M heap.

I tried adding resource filters to the .project after the fact. The workspace still fails to load.  The user then removed the large subtrees from the project and still the workspace fails to load (see screenshot of heapdump).

I've tar'd up their (pruned) project tree and workspace .metadata area, but of course having moved the .metadata, I can't launch Eclipse to exactly replicate the user's setup. Re-importing the pruned project works fine.  I've asked him to re-create the large tree so I can attempt to reproduce this.
Comment 32 Martin Oberhuber CLA 2010-07-30 08:25:30 EDT
James, this is very interesting. A couple of times I was close to closing this bug but always refrained from doing so since I had no verification proof that we fixed all issues of memory pigs in core.resources.

Can you zip up and share the heapdump somewhere for more in-depth analysis? (I used http://getdropbox.com , which worked fine for my heap dumps.) Since you have the heapdump, it may not be necessary to reproduce this. Is the heapdump from an OutOfMemoryError?
Comment 33 James Blackburn CLA 2010-07-30 08:26:26 EDT
The NoDataDeltaNodes' names seem to be backed by the resource FullPaths (and examining the references of a couple, there doesn't seem to be much sharing).  These referenced resources are no longer in the workspace tree.  These FullPaths are ~500 bytes long (whereas the filenames are just tens of bytes long), which I guess accounts for the massive leak.
Comment 34 James Blackburn CLA 2010-07-30 08:45:38 EDT
Created attachment 175565 [details]
char arrays in core.resources

Thanks Martin. I'm not sure I can distribute the heapdump as some of the strings may be sensitive to the project :s.

Digging into the memory analyzer report on core.resources shows:
Class Name	Objects	Shallow Heap
char[]	516,681	258,567,368
org.eclipse.core.internal.resources.ResourceInfo	1,376,333	121,117,304
org.eclipse.core.internal.dtree.DataTreeNode	873,509	34,940,360
java.lang.String	624,472	24,978,880
org.eclipse.core.internal.dtree.DataDeltaNode	502,837	20,113,480
org.eclipse.core.internal.dtree.AbstractDataTreeNode[]	262,590	18,175,000
java.lang.Object[]	190,337	10,005,928
org.eclipse.core.internal.utils.ObjectMap	188,749	6,039,968 
...

Examining the char[] shows the same symptom of String names backed by very long char[] with many of them unshared (screenshot).
Comment 35 Martin Oberhuber CLA 2010-07-30 09:10:13 EDT
Hm... this is, in fact, clearly substring baggage. The question is by what operation that baggage gets into the ElementTree model. AFAIK, all of the ElementTree deals with IPath objects (which use arrays of individual segments as storage).

Your paths appear to be file system paths, and not workspace paths. This might indicate a problem in EFS (IFileStore / IFileInfo and implementations), since I cannot think of another way that file system paths with substring baggage could get into the system.

Whenever the file system is queried (like the list() or stat() calls), a full path String is needed. Typically, that full path String is constructed out of IPath elements and then discarded; or, *if* a substring operation is performed to get the file name of an element only, the special code is used which discards substring baggage.

I know that in 3.6, some code has been added to deal with UNIX file permissions in a more flexible way. Perhaps a regression was introduced. I'd try looking for a substring() operation in code that was changed for 3.6 in the core.filesystem plugin after Nov 2009 -- or a substring() operation in core.filesystem in general. Note that java.io.File performs substring() internally, so the culprit might be somewhat hidden.

Do you have a hs_err*.log of the OOME? Unlikely this helps here, but it should not be ignored.
Comment 36 Martin Oberhuber CLA 2010-07-30 09:15:03 EDT
PS: in case these are not file system paths but resource fullpaths, discard my previous statements. In that case it's unlikely the culprit is in core.filesystem and more likely it's in the ElementTree itself, or the DataTreeReader (which unpersists previous workspace state).

RefreshLocal / UnifiedTree could be a culprit but that's unlikely since we analyzed and fixed it last year. 

The trick that helped me find the culprit last year was my patch from
attachment 148779 [details], with a breakpoint that strikes early in case substring baggage is introduced. Of course, this does require a reproducible test case that you can debug.
Comment 37 James Blackburn CLA 2010-07-30 09:21:21 EDT
(In reply to comment #35)
> Your paths appear to be file system paths, and not workspace paths. This might
> indicate a problem in EFS (IFileStore / IFileInfo and implementations). 

They are workspace FullPaths (the project is called apps_...).  [It used to be the case that we saw fs locations, but I think you fixed that.] For some reason the test framework used encodes the test parameters in the path name (ugh), which results in the ~250-character-long paths in the project -- but as you say this wouldn't be a problem if these strings were shared.

> Do you have a hs_err*.log of the OOME? Unlikely this helps here, but it should
> not be ignored.

I couldn't find an hs_err_*.log here, the JVM didn't crash we just passed -XX:+HeapDumpOnOutOfMemoryError to it on startup. 

Hopefully I'll be able to reproduce this in a debugger when the user has re-created the test trees in his project.
Comment 38 James Blackburn CLA 2010-08-02 08:14:24 EDT
Created attachment 175704 [details]
large_project.tar.bz2

I wrote a quick program to re-create a project tree with segment names obfuscated (preserving length, extension and uniqueness).  Attached is a tar of the results.

What's interesting is that there are quite a few symlinks in this project, and I suspect this may be the root of the problems we're having.

A note on the complexity of this tree:
Unique Path segments: 10455  ==> 215k characters
FullPath total: find large_project |wc
  27737 entries  ==>  3126127 characters
3343 symlinks

If common segments were shared, Eclipse should use less than 1MB of space for the paths.  Even if we use a new string for each entry, the total usage should be bounded at 4MB.

If you switch the .project for .project_filter (which filters 'results'), the project imports easily in less than 10s for me.  This removes ~1/3 of the paths, and most of the symlinks.

I'm wondering if this is caused by bug 105554 comment 53, with the 2s timeout leading to not all cyclic links being correctly detected.  Certainly the refresh seems to never end before Eclipse eventually OOMs.
Comment 39 Martin Oberhuber CLA 2010-08-02 09:24:47 EDT
(In reply to comment #38)
> I'm wondering if this is caused by bug 105554 comment 53 because of the 2s

The 2 second timeout problem should have been resolved with the fix for
bug 232426. Due to the algorithm, it can still sometimes take a long time until a symbolic link cycle is detected, but it should eventually terminate; the tradeoff we made was that we reduced the number of "getCanonicalPath()" calls to a minimum so that projects which don't have any symbolic links at all don't get a performance hit from calling getCanonicalPath() unnecessarily.

Also, symbolic link cycles still don't explain the existence of substring baggage, which is causing the OutOfMemoryError. So regardless of whether we can do another round at improving symbolic link cycle handling, the substring baggage problem should be found and fixed.

Do you have simple instructions for reproducing the issue with your attached artificial project structure? Again, I'd recommend using attachment 148779 [details] with a breakpoint to see where the substring baggage gets first introduced.
Comment 40 James Blackburn CLA 2010-08-02 09:34:21 EDT
Thanks Martin, I'll need to dig a bit deeper. At the outset I wanted to be able to attach something that would allow others to see what I'm seeing.

(In reply to comment #39)
> Do you have simple instructions for reproducing the issue with your attached
> artificial project structure? Again, I'd recommend using attachment 148779 [details] with
> a breakpoint to see where the substring baggage gets first introduced.

On a *nix machine expand it:
> tar -jxvf large_project.tar.bz2
> eclipse
  File > Import > Existing projects into workspace
  Point at : .../large_project

Depending on the size of your heap, it uses lots of CPU for a time and then OOMs.

Adding the filter for 'results' before import allows the import to happen quickly.
> mv large_project/.project_filter large_project/.project

I'd be very interested to know if you can get the project to import...
Comment 41 James Blackburn CLA 2010-08-02 10:02:07 EDT
Just interrupting the runtime Eclipse as the project is being imported, I'm seeing pretty bad behaviour:

There's a notifyJob being run during the import. And the CDT listener is iterating over the delta calling IFolder#members() => 
  RefreshJob.addRequest(IResource) line: 67	
  RefreshJob.refresh(IResource) line: 141	
  RefreshManager.refresh(IResource) line: 86	
  Folder(Container).members(int) line: 268	
  ...
The RefreshManager fRequests ArrayList has 73k items and counting...
Comment 42 Konstantin Scheglov CLA 2010-08-02 15:38:58 EDT
I also see memory leak related to DeltaDataTree.
http://dl.dropbox.com/u/76691/Eclipse/bug245008/PDReader-leak.7z

All projects deleted from workspace, but still a lot of objects in memory.
Comment 43 Konstantin Scheglov CLA 2010-08-02 15:40:12 EDT
Created attachment 175730 [details]
Leaked ProjectDescriptionReader.
Comment 44 Martin Oberhuber CLA 2010-08-02 22:06:11 EDT
(In reply to comment #42)
> All projects deleted from workspace, but still a lot of objects in memory.

In your screenshot, I see that ProjectDescription holds a builder, which holds an "oldState" of the workspace tree that accounts for the 24MB memory usage.

The workspace (and builders specifically) are made in a way that allows rolling back changes. So while you have deleted your projects, they are still available (until after you quit Workbench - perhaps you have already seen a small dialog like "compressing local history" on shutdown).

So I wouldn't claim that this is a memory leak.
Comment 45 Konstantin Scheglov CLA 2010-08-03 09:14:05 EDT
(In reply to comment #44)

> 
> The workspace (and builders specifically) are made in a way that allows rolling
> back changes. So while you have deleted your projects, they are still available
> (until after you quit Workbench - perhaps you have already seen a small dialog
> like "compressing local history" on shutdown).

1. I delete the project not only from the workspace, but also from disk, and Eclipse states that "this cannot be undone" when I do this in the GUI. So what is the reason to keep anything in memory, if this information cannot be used?

2. Is there any way to delete a project fully, including undo information? I know that I will not use undo, and I create and delete projects named TestProject hundreds of times.
Comment 46 Martin Oberhuber CLA 2010-08-03 09:33:37 EDT
(In reply to comment #45)
> states that "this cannot be undone" when I do this in GUI. So, what is the
> reason to keep anything in memory, if this information can not be used?

I looked again at your screenshot from attachment 175730 [details], and the reason why your 24MB are still in memory is in the lower pane: There is a singletonParser for parsing the XML project descriptions, and it keeps the most recent projectDescription in memory (which in turn keeps the builder, which keeps the oldState of your workspace).

> 2. Is there any way to delete project fully, including undo information? I
> know, that I will not use undo and I create and delete projects with name
> TestProject hundreds times.

If you are only concerned about this specific case: the next time you create or open a project, the singletonParser will be re-used for the new project description, so the old one will be gone.

Again, I do not believe that this qualifies as a memory leak since it doesn't grow indefinitely. Only the last-used project description is kept.

It might be fairly easy to resolve this behavior (set the "projectDescription" field to null in the singletonParser when parsing is done). If you think this should be done, please file a separate bug so that it can be tracked separately from the current discussion.
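The suggested fix can be sketched in miniature (hypothetical class, field, and method names; the real ProjectDescriptionReader is not reproduced here): clear the result field once parsing is done, so the reused singleton does not pin the last-parsed description in memory.

```java
// Illustrative only: a reused singleton parser that releases its result
// field after each parse, so the last-parsed object can be garbage
// collected instead of being retained until the next parse.
public class SingletonParser {
    private Object projectDescription; // stand-in for the real field

    public Object parse(String input) {
        projectDescription = "parsed:" + input; // stand-in for XML parsing
        Object result = projectDescription;
        projectDescription = null; // the proposed fix: drop the reference
        return result;
    }

    public boolean retainsResult() {
        return projectDescription != null;
    }
}
```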
Comment 47 Martin Oberhuber CLA 2010-08-03 11:34:40 EDT
(In reply to comment #40)
> I'd be very interested to know if you can get the project to import...

I reproduced the OOME with your large_project and Eclipse SDK M20100728-0800, as well as Eclipse SDK 3.5.2, with default VM args (-Xmx256m -XX:MaxPermSize=256m). Since you have so many symbolic links, perhaps the PrefixPool could be responsible for some substring baggage.

Since I didn't have CDT installed, there appears to be an issue independent of the CDT listener problem you mentioned in comment 41.

At any rate, the patch from bug 232426 comment 27 should very likely alleviate your problem, since it detects resource tree duplicates caused by symbolic links much earlier and thus greatly reduces the workspace tree. I think that unless we find an even more intelligent option for dealing with symbolic links, this patch should go into Eclipse 3.7 as either a UI option (project level or workspace level), or as a non-UI option like the current patch.
Comment 48 Martin Oberhuber CLA 2010-08-03 12:11:54 EDT
(In reply to comment #47)

Correction: with both M20100728-0800 and 3.5.2, the "Refreshing Workspace" operation seemed to complete without an OOME, but afterwards "Building Workspace" ran into one. I was presented with a dialog, but was able to continue using the Workbench.

Since I was able to continue using the Workbench, I could perform a clean shutdown and restart. After the restart,
 - total heap size is 94 MB
 - 80% of this is in 929000 Objects of type ResourceInfo / DataTreeNode

So although your "large_project" has only 27000 distinct files, we get 929000 nodes in the workspace tree (a factor of 30). This is clearly an artifact of your symbolic link cycles (duplicates) and should be addressed by the patch on bug 232426 comment 27.

The substring baggage is clearly gone after quit/restart, so the DataTreeReader creates "clean" workspace tree nodes. I also looked at the heap dump created by "Build", and it too shows the main problem being the sheer size of the workspace tree; while the retained size of all String and char[] objects is larger than after quit/restart, it is not ridiculously large, so I could not find any clear culprits of substring baggage.

So - all in all - to me it looks like there is a clear problem of duplicates due to symbolic links, plus (probably) a problem with the CDT listener (that I haven't looked at).
Comment 49 Martin Oberhuber CLA 2010-08-03 12:22:25 EDT
Created attachment 175795 [details]
String duplicates

My final comment: In the heapdump obtained from the "build" OOME, there are clearly some duplicate Strings (that could be optimized away with String Pooling, and apparently are pooled after quit / restart - so that's the reason for the smaller heap in that case).

For instance, the file name "rsMOT1,PCn6T502.cfg" is found more than 115000 times in memory, while it occurs only 657 times in the physical file tree. The many duplicates of this single string account for 6.5 MB in my heap dump.

I seem to remember that string pooling is performed in a low-priority background job, independent of workspace refresh/build. It might make sense to require a pooling job to run after refresh but before build, when the refresh produced a huge number of new items.
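The pooling idea is simply canonicalization: map each distinct string value to one shared instance, so equal file names stored in many tree nodes reference a single object. A self-contained sketch (not the platform's actual string pool implementation):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the string-pooling idea: map each distinct string value to a
// single canonical instance, so thousands of equal file names (e.g.
// "rsMOT1,PCn6T502.cfg") share one object instead of one copy per node.
public class StringPool {
    private final Map<String, String> pool = new HashMap<>();

    public String add(String s) {
        String canonical = pool.putIfAbsent(s, s);
        return canonical == null ? s : canonical;
    }
}
```

Interning via a map like this trades one lookup per string for eliminating the per-node copies.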

But still, the core problem here is insufficient handling of symlink cycles.
Comment 50 James Blackburn CLA 2010-08-03 12:24:25 EDT
Thanks Martin. I'll have a look at the patch you referenced.

(In reply to comment #48)
> So - all in all - to me it looks like there is a clear problem of duplicates
> due to symbolic links, plus (probably) a problem with the CDT listener (that I
> haven't looked at).

I don't think there's anything wrong with the CDT listener. It's simply iterating over the resource delta. The problem comes because:
  - the refresh is breadth-first
  - a notification is fired during the refresh (when the tree is partially built, and has great width)
  - the CDT resource change listener iterates over the tree calling Container#members().

All the folders with ICoreConstants.M_CHILDREN_UNKNOWN (i.e. the entire lower level of the tree) are added to the refresh manager for refresh.

The problem wouldn't occur -- or would be much less severe -- if the refresh were depth-first. There would then be (exponentially) fewer containers marked with ICoreConstants.M_CHILDREN_UNKNOWN at any one time.

I'll file a separate issue for this.
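The breadth-first versus depth-first point can be illustrated with a toy frontier-size count (this models traversal order only, not the platform's refresh code): on a complete tree with branching factor b and depth d, a FIFO queue peaks at b^d pending entries, while a LIFO stack peaks at roughly (b-1)*d+1.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy illustration of why a breadth-first refresh leaves a huge frontier
// of "children unknown" folders, while depth-first keeps it small. Each
// queued integer is the remaining depth below a synthetic tree node.
public class FrontierSize {
    static int bfsPeak(int branch, int depth) {
        Deque<Integer> queue = new ArrayDeque<>();
        queue.add(depth);
        int peak = 1;
        while (!queue.isEmpty()) {
            int d = queue.poll(); // FIFO: whole level queued before descending
            if (d == 0) continue;
            for (int i = 0; i < branch; i++) queue.add(d - 1);
            peak = Math.max(peak, queue.size());
        }
        return peak;
    }

    static int dfsPeak(int branch, int depth) {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(depth);
        int peak = 1;
        while (!stack.isEmpty()) {
            int d = stack.pop(); // LIFO: descend before siblings
            if (d == 0) continue;
            for (int i = 0; i < branch; i++) stack.push(d - 1);
            peak = Math.max(peak, stack.size());
        }
        return peak;
    }
}
```

For b=4, d=5 that is 1024 versus 16, matching the exponential gap described above.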
Comment 51 Martin Oberhuber CLA 2010-08-03 12:48:18 EDT
Created attachment 175798 [details]
Minipatch to suppress duplicates due to symlinks

Here is a minipatch to suppress workspace duplicates introduced due to symbolic links. With this patch applied, James' structure loads into Eclipse at 21 MB heap usage only.

At the heart, this is exactly the same patch I referenced earlier, but without any Preference stuff. Applies cleanly against o.e.core.resources R3_6_maintenance.
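The core idea of suppressing symlink-induced duplicates can be sketched on a toy in-memory directory model (illustrative names only; the real patch works against the filesystem via canonical paths): prune any directory whose canonical path is already on the current root-to-leaf chain.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model (not the platform's code): directories are nodes identified
// by a "canonical path" string; a symlink is simply a second edge to an
// existing node. Traversal prunes any node already on the current
// root-to-leaf chain, so it terminates even when symlinks form a cycle.
public class CycleGuard {
    final String canonicalPath;
    final List<CycleGuard> children = new ArrayList<>();

    CycleGuard(String canonicalPath) { this.canonicalPath = canonicalPath; }

    int countDirs() { return countDirs(new HashSet<String>()); }

    private int countDirs(Set<String> chain) {
        if (!chain.add(canonicalPath))
            return 0; // canonical path already on this chain: cycle, prune
        int n = 1;
        for (CycleGuard child : children)
            n += child.countDirs(chain);
        chain.remove(canonicalPath); // backtrack so siblings may revisit
        return n;
    }
}
```

The cost, as noted in comment 39, is that resolving canonical paths on a real filesystem requires getCanonicalPath() calls, which is why the platform tries to minimize them for projects without symlinks.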
Comment 52 James Blackburn CLA 2010-08-04 05:08:48 EDT
(In reply to comment #51)
> Created an attachment (id=175798) [details]
> Minipatch to suppress duplicates due to symlinks

Magic!  I can import both of the user's very large projects :) Thanks for looking at this Martin.
Comment 53 John Arthorne CLA 2010-10-06 14:58:37 EDT
Can this bug be closed now? From my quick reading of it, all issues are addressed, or covered by different bugs (such as bug 251159).
Comment 54 Szymon Brandys CLA 2010-10-06 19:38:12 EDT
I would like to see a bug raised to backport changes made during 3.7 to the 3.6 maintenance stream. Could you Martin or James raise the bug and attach the patch there? Thanks.
Comment 55 James Blackburn CLA 2010-10-07 05:22:35 EDT
(In reply to comment #53)
> Can this bug be closed now? From my quick reading of it, all issues are
> addressed, or covered by different bugs (such as bug 251159).

Possibly. Certainly core.resources on HEAD can't open our projects with deep symlinks.  

Applying Martin's minipatch makes things work very much better.  Perhaps this is all wrapped up in bug 251159?  It would definitely be good to have a fix in for 3.7.
Comment 56 Martin Oberhuber CLA 2010-10-07 06:44:38 EDT
(In reply to comment #53)

I never got to performing a final verification, but from everything I know it seems right to close this bug. I'm not aware of anything committed to 3.7 specifically. All fixes are in the two dependent bugs, which have both been fixed in 3.6 M4 and backported to 3.5.2, so that seems to be the right target milestone.

We can use bug 251159 to follow up on the issue seen by James.