Bug 210470 - Class Reloading and Load Time Weaving
Summary: Class Reloading and Load Time Weaving
Status: RESOLVED FIXED
Alias: None
Product: AspectJ
Classification: Tools
Component: LTWeaving (show other bugs)
Version: 1.5.3   Edit
Hardware: PC Windows Vista
: P2 major (vote)
Target Milestone: 1.6.1   Edit
Assignee: aspectj inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-11-20 19:48 EST by Eric Grunzke CLA
Modified: 2008-06-08 13:37 EDT (History)
3 users (show)

See Also:


Attachments
Contains code that demonstrates the bug (8.87 KB, application/x-zip-compressed)
2007-11-20 19:48 EST, Eric Grunzke CLA
no flags Details
5 HTML files from jhat analysis (152.24 KB, application/x-zip-compressed)
2007-12-06 13:31 EST, Eric Grunzke CLA
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Eric Grunzke CLA 2007-11-20 19:48:05 EST
Created attachment 83378 [details]
Contains code that demonstrates the bug

I'm working on a project that generates a lot of code and uses a custom ClassLoader to pull generated classes into the JVM.  My structure allows for reloading (by orphaning the old loader and creating a new one) in the event that code is regenerated. Additionally, I have enabled load-time weaving to add some behavior to the generated classes as they are loaded.

My problem here is that once I enable load-time weaving, orphaned class loaders are *not* garbage collected.  After some investigation, I'm convinced that the Load Time Weaver is holding on to references of class loaders that have otherwise been abandoned.  As long as a reference is maintained to the loader, it will not be garbage collected, and all of the classes it loaded will similarly stay in memory.

I've attached a zip with some code that demonstrates the bug.  README.txt goes into more detail about the code and how to execute it to demonstrate the bug.
Comment 1 Andrew Clement CLA 2007-12-04 14:52:42 EST
I've started to look into this.  Do you actually have a problem where things don't work or are you just concerned about memory consumption?

The weavers are held onto via a WeakHashMap indexed by the classloader instance - as it is a weakhashmap then if the only remaining reference to the loader is this key, the entry is a candidate for garbage collection.  Being a candidate does not mean it will be GC'd the next time a gc occurs and System.gc() is only a hint to the garbage collector, not a request for it to collect everything it can.  Finalizers are also not a completely reliable way of detecting the gc of an object since there is no guarantee when they will run or if they ever will run.  I initially thought I had found a potential problem with the mechanism we use but it doesn't look like I have.  I did some analysis with hprof/hat and could only see connections through weak references.  I then changed your src program, instead of this bit in MemoryLeak:

 releaseMe = null;
 leaky = null;
 clazz = null;
 loadClass = null;
 System.gc();

I changed it to:

 releaseMe = null;
 leaky = null;
 clazz = null;
 loadClass = null;
 System.gc();
 System.gc();
 System.gc();
 System.gc();
 System.gc();
 System.gc();
 System.gc();
 System.gc();
 System.gc();
 System.gc();
 System.gc();
 System.gc();
 System.gc();
 System.gc();
 System.gc();
 System.gc();

just to give the JVM more than a few hints :)  After recompilation, I now get this output when it runs:

java -javaagent:C:\aspectj1.5-dev\lib\aspectjweaver.jar bug.aspectj.rtw.MemoryLeak
[AppClassLoader@9fbe93] info AspectJ Weaver Version DEVELOPMENT built on Monday Dec 3, 2007 at 19:23:20 GMT
[AppClassLoader@9fbe93] info register classloader sun.misc.Launcher$AppClassLoader@9fbe93
[AppClassLoader@9fbe93] info using configuration /D:/temp/ltw/bin/META-INF/aop.xml
[AppClassLoader@9fbe93] info register aspect bug.aspectj.rtw.BuggyWeaver
CREATE: bug.aspectj.rtw.CustomLoader@56f631
CREATE: bug.aspectj.rtw.CustomLoader@13d28e3
[AppClassLoader@9fbe93] info processing reweavable type bug.aspectj.rtw.BuggyWeaver: bug\aspectj\rtw\BuggyWeaver.aj
Woven behavior
Unwoven behavior
Regular behavior
DELETE: bug.aspectj.rtw.CustomLoader@13d28e3
DELETE: bug.aspectj.rtw.CustomLoader@56f631
Comment 2 Eric Grunzke CLA 2007-12-04 18:10:49 EST
This issue came up while testing in my development environment.  Running my complete test suite started to throw OutOfMemoryErrors, and casual monitoring of the heap seemed to indicate that classes that had been reloaded were not being garbage collected.

I tried adding 30 or gc() calls to mine, and that did fix the problem in my little Sandbox environment.  Of course, my development environment is a bit more complicated.

And you're completely correct: finalize() is definitely a weak way to monitor garbage collection.  I relied on it because 1) I'm lazy, and 2) nobody likes bug exposition code that contains dependencies to other libraries.

I'll profile my development environment more closely and see if I can get more specific information about my OutOfMemoryErrors.
Comment 3 Andrew Clement CLA 2007-12-05 05:15:10 EST
Just noting some of my findings...

With hprof and hat I noticed references back to the classloader through the weavingadaptors map - even though I asked hat to not show me weakrefs, these were still displayed, which worried me.  Problem routes to the object appeared to be from CompilationAndWeavingContext.contextMap and from ReflectionBasedReferenceTypeDelegate.BootClassLoader

By making a change to the map from

Map<ClassLoader,WeavingAdaptor>

to

Map<ClassLoader,WeakReference(WeavingAdaptor)>

I was able to entirely eliminate the two problematic references.  But in running the whole test suite I get problems where the weavers are discarded too early because we attempt to define an ajc$ generated class more than once (clearly we already discarded the weaver that knew about the generated type and we build a new weaver and generate it again).  Tests that failed were in AtAjLTWTests

I could investigate a 'load to check if its about before generating/defining' strategy but I need to think through other ramifications of that WeakReference change.
Comment 4 Eric Grunzke CLA 2007-12-06 13:31:42 EST
Created attachment 84651 [details]
5 HTML files from jhat analysis

5 HTML files from jhat analysis of my development heap:
instance_counts - The list of all active instances (by type) in the heap
loader_decorator - The singleton instance of LoaderDecorator.  Confirms the active reference to the current TFCL
current_loader - The current instance of TerraFrameClassLoader.  Is referenced by LoaderDecorator
orphaned_loader - One of 68 orphaned instances of TerraFrameClassLoader.  Is not referenced by LoaderDecorator
rootset_references_to_orphaned_loader - All rootset references to the orphaned instace of TFCL from above.  Excludes weak references.
Comment 5 Eric Grunzke CLA 2007-12-06 13:32:56 EST
I've done some investigation on my real codebase with jmap and jhat, and might have some interesting information.  But first I should briefly describe my design, to give a little context:

I generate a lot of .java files automatically, compiling them with ECJ and loading them via a custom classloader.  LTW then kicks in and weaves in some behavior.  Classloading is accomplished through two classes: LoaderDecorator and TerraFrameClassLoader.  LoaderDecorator is a singleton decorator that extends URLClassloader but delegates to a private instance of TFCL to do all actual loading.  LoaderDecorator maintains the only reference to the current TFCL, so reloading consists of orphaning the existing TFCL and replacing it with a new one.

All that said, I ran one of my test suites that has a lot of reloading, put a breakpoint about halfway through the suite and dumped the heap with jmap.  The instance count shows 1 instance of LoaderDecorator (as expected), but 69 instances of TerraFrameClassLoader (not expected).

I opened an orphaned instance of TFCL in jhat and it showed 6 references from aspectj.  Now, I'm relatively unfamiliar with jmap/jhat, so I don't claim to understand the full depth of what that means.  I'm attaching some HTML files straight out of jhat: hopefully they'll provide you with some info.
Comment 6 Rob Gratz CLA 2008-02-14 16:58:40 EST
I'm glad to see someone else with this problem.  I'm working on a project using AspectJ 1.5.3 with LTW along with a Derby DB.  Derby has a classloader implementation that dynamically creates new distinct classes for each non-prepared statement executed.  As such, every call creates a new class that is never reclaimed due to what I believe is the result of this defect.
Comment 7 Eric Grunzke CLA 2008-02-27 13:35:27 EST
I'm glad to see this is still on the burner.  Just as a quick update, I've implemented a workaround for my project by programmatically calling the AspectJ compiler (as opposed to the ECJ + LTW solution I was using when the bug arose).  This works, as I still get aspect behavior woven into my generated code, but being able to 'lazily' weave at load time had much better performance in my architecture.

I can switch between compile time and load time weaving with a simple change of a constant, so I'll happily test out any solutions to this bug.
Comment 8 Andrew Clement CLA 2008-03-13 12:13:50 EDT
ok.... I have a weaver, I've done a lot of testing of it with profiling and the hat tool and it doesn't persist once the loader is gone in my tests.  With a 32M heap I could create 700 weavers but now I can create them indefinetly.

The weaver is here:

http://www.eclipse.org/downloads/download.php?file=/tools/aspectj/dev/aspectjweaver_leakproof.jar

If you could try it out, that would be fantastic.

Basically to address the problems of it being anchored in the heap, I had to locate each occurrence of the ClassLoader type in the weaver and wrap it in a weak reference.  I found these locations through the use of hprof/hat.  Now I fixed all the ones I found in my simple scenario, but it is feasible there are more and that I missed them because I didn't exercise a particular code path.  For the full fix I would find all occurences of classloader in the weaver and check if they need wrapping, but I wanted to give you something to try out first...
Comment 9 Andrew Clement CLA 2008-03-25 13:04:23 EDT
have had no feedback after I posted on the list for people to try it.  Cannot put the changes in without more testing.  Will revisit for 1.6.1.
Comment 10 Eric Grunzke CLA 2008-03-25 13:45:43 EDT
(In reply to comment #9)
> have had no feedback after I posted on the list for people to try it.  Cannot
> put the changes in without more testing.  Will revisit for 1.6.1.
> 

Apologies for my lack of communication.  I did download the new jar and run a quick test on my development environment.  Unfortunately, surface testing of the new jar appeared to behave similarly to the old one.  I have not had the time to run real tests (like jhat analysis), though, so I didn't want to post without concrete data.

This is a busy time for me at work, which is why I haven't given this the time it deserves.  I'm gonna be busy for a while longer, but I will definitely post again when I've got helpful feedback.

Thanks.
Comment 11 Rob Gratz CLA 2008-03-25 15:33:04 EDT
I have been somewhat in the same boat.  I ran against the new jar (without profiling) and was not able to duplicate the PermGen error I was getting before, but did get an out of memory error.  Without profiling, wasn't sure if it was from some other area or due to the change that was put in.  When I have more time, I will definitely get a more definitive answer on whether the changes have fixed the problem completely.
Comment 12 Andrew Clement CLA 2008-03-26 02:25:55 EDT
Since my original upload, I found another couple of places that could lead to trouble and an anchored weaver - a jhat check if those are the places you are still seeing problems would be very helpful so do let me know if you get the time to check it out.  In the meantime, if I find time I will upload another variant with these addition trouble spots fixed up.
Comment 13 Andrew Clement CLA 2008-05-02 20:19:49 EDT
Ok, I've addressed everything I can find in the weaver, so there should be no more concrete references to the classloader once it is orphaned by the user.  Here it is (I just uploaded it, so it may take a couple of hours to replicate around the servers and the link starts working):

http://www.eclipse.org/downloads/download.php?file=/tools/aspectj/dev/aspectjweaver_leakproof3.jar

I've tested as follows.  I have a simple driver program that creates a custom classloader that loads and weaves all public methods from the jdk tools.jar (~2000) classes.  I am using hprof to get heap info:

-agentlib:hprof=heap=all,cpu=samples,format=b

And then hat to process it.  Looking in the hat output I identify my custom classloader, I look at any references to it excluding weak references (weak will not prevent it being GC'd).  With AspectJ1.6.0, I get these non-weak references to my loader:

org.aspectj.bridge.context.CompilationAndWeavingContext.contextMap

so i have fixed that.

Now, depending on the kind of weaving done, different parts of the weaver will be exercised and may lead to references from places.  The weaver above produces, for me, a completely clean report on hat - there are no-non weak references to my custom loader after it has finished weaving tools.jar and is orphaned.

If anyone tries this, and it doesn't work for you, please collect a suitable heap dump for me to take a look at or report here what the references are (full stack trace if possible)
Comment 14 Andrew Clement CLA 2008-05-02 22:45:34 EDT
I am gradually committing pieces of this change - the ones i'm 100% confident are required initially, and waiting to hear back from anyone who tries it out before I commit everything :)
Comment 15 Andrew Clement CLA 2008-05-26 14:45:29 EDT
Ok - new driver.  This link will become valid shortly once the file has replicated across the servers:

http://www.eclipse.org/downloads/download.php?file=/tools/aspectj/dev/aspectjweaver_leakproof4.jar

I'm stressing this one over and over at the moment, 60mins with 30 threads, each continuously weaving 500 classes in a new classloader then discarding the classloader.  Works perfectly - the weavers attached to loaders are discarded as expected (I have some nice memory profile graphs I will attach at some point)  For a comparison AspectJ1.6.0 lasts 90seconds in the same stress test before blowing the PermGen limit.

The fixes this time relate to:
- better management of the weak references with a reference queue and making sure we do tidy up properly.  
- another problem in CompilationAndWeavingContext that only arises in a multi-threaded environment (horrible NPE)
- removed sharing of bootstrap types, it causes problems and isn't designed properly - I might make it conditional if users think it is helping them but I think a slightly fatter but disposable weaver is better than a lightweight one that can't be gotten rid of.
Comment 16 Andrew Clement CLA 2008-05-27 14:57:26 EDT
these changes are so important I have put them into HEAD.  They pass all my tests and behave well in a multi-threaded stress testcase.  So they will be in a dev build of AspectJ 1.6.1 today.

For consumers there are two differences that may affect them:

- if you implement IWeavingContext yourself, there is one new method 'getClassLoader()' that you need to implement
- due to turning off sharing of bootstrap types (because its a memory leak), the weaver instances may be larger, but I'm not quite sure by how much - really depends what the weaver is doing.  Depending on how seriously this manifests, I will prioritize implementing proper sharing appropriately.  Hopefully the fact that weavers will now go away over time (if their loader goes away...) - this will offset any memory problems with them being larger. (Obviously depends on the scenario)
Comment 17 Andrew Clement CLA 2008-05-27 18:50:28 EDT
pasting in changes made from the build log, in case I need to revisit:

modified 	aclement 	/org.aspectj/modules/loadtime/src/org/aspectj/weaver/loadtime/Aj.java 	05/27/2008 10:53:22 	210470: preventing weaver leaks: big changes to better manage the lifecycle of weaver adaptors
modified 	aclement 	/org.aspectj/modules/loadtime/src/org/aspectj/weaver/loadtime/WeavingURLClassLoader.java 	05/27/2008 10:53:03 	210470: preventing weaver leaks: use a weak classloader ref
modified 	aclement 	/org.aspectj/modules/loadtime/src/org/aspectj/weaver/loadtime/ClassLoaderWeavingAdaptor.java 	05/27/2008 10:52:47 	210470: preventing weaver leaks: use a weak classloader ref
modified 	aclement 	/org.aspectj/modules/loadtime/src/org/aspectj/weaver/loadtime/DefaultWeavingContext.java 	05/27/2008 10:52:33 	210470: preventing weaver leaks: implement new IWeavingContext method
modified 	aclement 	/org.aspectj/modules/weaver/src/org/aspectj/weaver/tools/PointcutParser.java 	05/27/2008 10:52:24 	210470: preventing weaver leaks: use a weak classloader ref
modified 	aclement 	/org.aspectj/modules/weaver/src/org/aspectj/weaver/ltw/LTWWorld.java 	05/27/2008 10:51:52 	210470: preventing weaver leaks: removed bootstrap type sharing for now, it is badly designed
modified 	aclement 	/org.aspectj/modules/loadtime/testsrc/org/aspectj/weaver/loadtime/WeavingContextTest.java 	05/27/2008 10:51:23 	210470: preventing weaver leaks: implement new IWeavingContext method
modified 	aclement 	/org.aspectj/modules/weaver5/java5-src/org/aspectj/weaver/reflect/Java15AnnotationFinder.java 	05/27/2008 10:51:06 	210470: preventing weaver leaks: use a weak classloader ref
modified 	aclement 	/org.aspectj/modules/weaver5/java5-src/org/aspectj/weaver/reflect/Java15ReflectionBasedReferenceTypeDelegate.java 	05/27/2008 10:51:06 	210470: preventing weaver leaks: use a weak classloader ref
modified 	aclement 	/org.aspectj/modules/weaver/testsrc/org/aspectj/weaver/bcel/ClassLoaderRepositoryTests.java 	05/27/2008 10:50:53 	210470: preventing weaver leaks: cast to select right ctor in new bcel API
modified 	aclement 	/org.aspectj/modules/weaver/src/org/aspectj/weaver/reflect/ReflectionBasedReferenceTypeDelegate.java 	05/27/2008 10:50:34 	210470: preventing weaver leaks: use a weak classloader ref
modified 	aclement 	/org.aspectj/modules/weaver/src/org/aspectj/weaver/reflect/ReflectionWorld.java 	05/27/2008 10:50:34 	210470: preventing weaver leaks: use a weak classloader ref
modified 	aclement 	/org.aspectj/modules/weaver/src/org/aspectj/weaver/bcel/BcelWorld.java 	05/27/2008 10:50:19 	210470: preventing weaver leaks: use a weak classloader ref
modified 	aclement 	/org.aspectj/modules/weaver/src/org/aspectj/weaver/WeakClassLoaderReference.java 	05/27/2008 10:49:58 	210470: preventing weaver leaks: Wraps a classloader reference in a weak reference, for use everywhere inside a weaver
modified 	aclement 	/org.aspectj/modules/weaver/src/org/aspectj/weaver/loadtime/IWeavingContext.java 	05/27/2008 10:49:31 	210470: preventing weaver leaks: added getClassLoader() to interface
modified 	aclement 	/org.aspectj/modules/lib/bcel/bcel-src.zip 	05/27/2008 10:48:36 	210470: preventing weaver leaks: get control of the classloader references in Bcel
modified 	aclement 	/org.aspectj/modules/lib/bcel/bcel.jar 	05/27/2008 10:48:36 	210470: preventing weaver leaks: get control of the classloader references in Bcel
modified 	aclement 	/org.aspectj/modules/bridge/src/org/aspectj/bridge/context/CompilationAndWeavingContext.java 	05/27/2008 10:46:58 	210470: preventing weaver leaks: do not mess up the context stack in a multi-threaded environment
modified 	aclement 	/org.aspectj/modules/bcel-builder/src/org/aspectj/apache/bcel/util/ClassLoaderReference.java 	05/27/2008 10:46:34 	210470: preventing weaver leaks: get control of the classloader references in Bcel
modified 	aclement 	/org.aspectj/modules/bcel-builder/src/org/aspectj/apache/bcel/util/ClassLoaderRepository.java 	05/27/2008 10:46:34 	210470: preventing weaver leaks: get control of the classloader references in Bcel
modified 	aclement 	/org.aspectj/modules/bcel-builder/src/org/aspectj/apache/bcel/util/DefaultClassLoaderReference.java 	05/27/2008 10:46:34 	210470: preventing weaver leaks: get control of the classloader references in Bcel
modified 	aclement 	/org.aspectj/modules/bcel-builder/src/org/aspectj/apache/bcel/util/NonCachingClassLoaderRepository.java 	05/27/2008 10:46:34 	210470: preventing weaver leaks: get control of the classloader references in Bcel
modified 	aclement 	/org.aspectj/modules/weaver/src/org/aspectj/weaver/CrosscuttingMembersSet.java 	05/27/2008 10:43:59 	organized imports
modified 	aclement 	/org.aspectj/modules/org.aspectj.ajdt.core/testsrc/org/aspectj/tools/ajc/AjcTestCase.java 	05/27/2008 10:43:36 	allow use of DOLLARsandbox in the arguments to a 'run' test
Comment 18 Andrew Clement CLA 2008-06-08 13:37:40 EDT
Believed fixed - confirmed by myself and a couple of others.

Stale weavers (that are attached to now dead classloaders) are tidied up whenever a new class gets woven.  That is better than spawning a thread to deal with stale weaver removal (threads yuck).  And so there may be a weaver left for the last classloader that gets used and then discarded. To remove that one too, the method: 

int org.aspectj.weaver.loadtime.Aj.removeStaleAdaptors(boolean)

can be called when the environment believes it has finished spawning classloaders.  The boolean tells the method whether to report what it is up to (to stderr, just for debugging really).  The return value is the number of stale adaptors that were tidied.

The current number of active adaptors is returned from:

int org.aspectj.weaver.loadtime.Aj.getActiveAdaptorCount()

These methods are really just for debugging/testing - with the other changes coming such that unused weavers shrink to a fraction of their size over time, having a couple of not yet GC'd weavers around will be no big deal.