Bug 457106 - Slow memory leak when doing runtime weaving using IBM JDK 1.6 running on AIX
Summary: Slow memory leak when doing runtime weaving using IBM JDK 1.6 running on AIX
Status: NEW
Alias: None
Product: AspectJ
Classification: Tools
Component: Runtime (show other bugs)
Version: 1.7.3   Edit
Hardware: Power PC AIX
: P3 critical (vote)
Target Milestone: ---   Edit
Assignee: aspectj inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-01-09 02:58 EST by Rudolf Visagie CLA
Modified: 2015-01-24 14:20 EST (History)
1 user (show)

See Also:


Attachments
Screenshots of verbose GC graph and Eclipse Memory Analyzer output (429.71 KB, application/x-zip-compressed)
2015-01-09 02:58 EST, Rudolf Visagie CLA
no flags Details
Test program for reference handling (1.67 MB, application/octet-stream)
2015-01-14 13:13 EST, Andrew Clement CLA
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Rudolf Visagie CLA 2015-01-09 02:58:42 EST
Created attachment 249809 [details]
Screenshots of verbose GC graph and Eclipse Memory Analyzer output

Hi

It would appear that AspectJ 1.7.3 is leaking memory. Please see the attached verbose GC graph as well as the output from Eclipse Memory Analyzer.

#java -version
--------------
java version "1.6.0"
Java(TM) SE Runtime Environment (build pap6460sr15fp1-20140110_01(SR15 FP1))
IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 AIX ppc64-64 jvmap6460sr15-20131231_180656 (JIT enabled, AOT enabled)
J9VM - 20131231_180656
JIT  - r9_20130920_46510ifx3
GC   - GA24_Java6_SR15_20131231_1152_B180656)
JCL  - 20140107_01

aspectjweaver.jar manifest
--------------------------

Manifest-Version: 1.0
Name: org/aspectj/weaver/
Specification-Title: AspectJ Weaver Classes
Specification-Version: 1.7.3
Specification-Vendor: aspectj.org
Implementation-Title: org.aspectj.weaver
Implementation-Version: 1.7.3
Implementation-Vendor: aspectj.org
Premain-Class: org.aspectj.weaver.loadtime.Agent
Can-Redefine-Classes: true

I have scanned the release release notes of subsequent releases and could not find anything which addresses a memory leak (but did notice that there were memory leaks in prior versions - 1.6.1 I think).
Comment 1 Andrew Clement CLA 2015-01-09 18:17:26 EST
Thanks for the detailed info. Looks like the weaver adaptors are not being tidied up once the associated classloaders are removed.

The reference queue for GC'd loaders is processed the next time an attempt is made to get a weaver for a new loader. Processing this queue removes entries from the map that you appear to be indicating is leaking. In a system where loaders are constantly coming and going this ensures they are tidied up. If there are lots of loaders created and then discarded but no subsequent loader is created, the reference queue would remain unhandled and you'd be left with a chunk of memory held onto by this map, but that isn't necessarily a leak.

I just created a test program that spawned hundreds of classloaders (and I'm running with the weaver javaagent). This created lots of entries in the weavingAdaptors map. I then threw them all away, they remained in the weavingAdaptors map and the reference queue was empty. I then forced GC a few hundred times (I know on some JVMs references need a good prod to indicate they should be processed). I then created one more loader - this triggered processing of the reference queue which emptied out the stale weavingAdaptors entries.  So it is working as expected from my point of view. But I'm not on AIX and not using J9 either.

We have some options:

- You could try using a more recent AspectJ, like 1.8.4. I wouldn't be supplying any fixes on a 1.7.3 base anyway, they'd always be on the latest. It may have inadvertently been an unnoticed regression that has since been fixed.

- Can you use a non J9 JVM? I am aware that different JVMs can treat weak references very differently. Or a later version of the J9 vm?

- Can you debug into the problem code? The interesting methods are in the Aj class. See Aj.checkQ() which processes stale entries that have been added to the reference queue, is that ever finding anything to process? If it isn't then either weak references aren't being queued up correctly when they go stale or the classloaders are not being thrown away and someone else is still holding onto them (in which case the weakreferences AspectJ is holding will continue to live on too)

- If I produced a debug build that checked some of this, could you try that? I'd probably put some extra info in the checkQ() method to tell us if anything is ever being tidied up.

- I could give you my test program to try which verifies the JVM behaviour is what we expect.

- If you are able to inspect the heap - look at who is holding onto the classloaders that are hanging around. Is it just AJ?

From memory I don't think that code has really changed for a few years.
Comment 2 Rudolf Visagie CLA 2015-01-10 09:38:57 EST
(In reply to Andrew Clement from comment #1)
> Thanks for the detailed info. Looks like the weaver adaptors are not being
> tidied up once the associated classloaders are removed.
> 
> The reference queue for GC'd loaders is processed the next time an attempt
> is made to get a weaver for a new loader. Processing this queue removes
> entries from the map that you appear to be indicating is leaking. In a
> system where loaders are constantly coming and going this ensures they are
> tidied up. If there are lots of loaders created and then discarded but no
> subsequent loader is created, the reference queue would remain unhandled and
> you'd be left with a chunk of memory held onto by this map, but that isn't
> necessarily a leak.
> 
> I just created a test program that spawned hundreds of classloaders (and I'm
> running with the weaver javaagent). This created lots of entries in the
> weavingAdaptors map. I then threw them all away, they remained in the
> weavingAdaptors map and the reference queue was empty. I then forced GC a
> few hundred times (I know on some JVMs references need a good prod to
> indicate they should be processed). I then created one more loader - this
> triggered processing of the reference queue which emptied out the stale
> weavingAdaptors entries.  So it is working as expected from my point of
> view. But I'm not on AIX and not using J9 either.
> 
> We have some options:
> 
> - You could try using a more recent AspectJ, like 1.8.4. I wouldn't be
> supplying any fixes on a 1.7.3 base anyway, they'd always be on the latest.
> It may have inadvertently been an unnoticed regression that has since been
> fixed.
> 
> - Can you use a non J9 JVM? I am aware that different JVMs can treat weak
> references very differently. Or a later version of the J9 vm?
> 
> - Can you debug into the problem code? The interesting methods are in the Aj
> class. See Aj.checkQ() which processes stale entries that have been added to
> the reference queue, is that ever finding anything to process? If it isn't
> then either weak references aren't being queued up correctly when they go
> stale or the classloaders are not being thrown away and someone else is
> still holding onto them (in which case the weakreferences AspectJ is holding
> will continue to live on too)
> 
> - If I produced a debug build that checked some of this, could you try that?
> I'd probably put some extra info in the checkQ() method to tell us if
> anything is ever being tidied up.
> 
> - I could give you my test program to try which verifies the JVM behaviour
> is what we expect.
> 
> - If you are able to inspect the heap - look at who is holding onto the
> classloaders that are hanging around. Is it just AJ?
> 
> From memory I don't think that code has really changed for a few years.

Thanks for all the suggestions. Regarding the options:

- Can you please provide both the test program and the debug build? I'd like to run these on AIX before upgrading to 1.8.4 to confirm the existence of a bug.

- Switching/upgrading to another JVM is scheduled (IBM JDK 1.7) but might still be a couple of months away as a lot of regression testing is required

- With regards to the inspecting the heap: Where can I find the classloaders being held by AspectJ? Is it held by the org.aspectj.weaver.loadtime.DefaultWeavingContext class that's a member of the org.aspectj.weaver.loadtime.ClassLoaderWeavingAdaptor class? With the actual reference being kept in the  org.aspectj.weaver.bcel.BcelWeakClassLoaderReference class? If so I will be able to have a look at this when I have access to the full heap dump again.

I should also mention that the system is a Weblogic Portal 10.3.6 server monitored using DynaTrace. To my knowledge the Weblogic Application classloaders and child classloaders are created once when an application (WAR or EAR) is activated. However, I'm not sure what the classloader behavior is of DynaTrace instrumenting the classes.

PS I'm going on a business trip the coming week, so there might be delayed response.
Comment 3 Andrew Clement CLA 2015-01-14 13:13:56 EST
Created attachment 249946 [details]
Test program for reference handling

Unzip this to get the test program.  Then:

cd Refcheck
java -javaagent:aspectjweaver.jar -classpath bin:aspects.jar Runner

The weaver included is AspectJ 1.8.4 - but any vaguely recent one should work.

When I run it:
Creating a bunch of classloaders...
In 6 seconds created 444 classloaders
WeavingAdaptors map size before any classloaders discarded = 444
WeavingAdaptors map size after *ALL* classloaders discarded = 444
This next number will likely be quite large:
WeavingAdaptors map size after hammering GC (but not reference processing) = 444
This next number should be 1
WeavingAdaptors map size after reference processing = 1
Comment 4 Andrew Clement CLA 2015-01-14 14:46:20 EST
> - With regards to the inspecting the heap: Where can I find the class loaders being
> held by AspectJ? Is it held by the org.aspectj.weaver.loadtime.DefaultWeavingContext
> class that's a member of the org.aspectj.weaver.loadtime.ClassLoaderWeavingAdaptor
> class? With the actual reference being kept in the 
> org.aspectj.weaver.bcel.BcelWeakClassLoaderReference class? If so I will be able to
> have a look at this when I have access to the full heap dump again.

The weavingAdaptors map is declared:
 Map<AdaptorKey,ExplicitlyInitializedClassLoaderWeavingAdaptor>

It is the AdaptorKey (which is a subclass of weak reference) that holds the class loader link. Once a class loader is collected, when the reference queue is processed the adaptor key and the associated ExplicitlyInitializedClassLoaderWeavingAdaptor should be removed from the map. Other places will also likely have weak links to the class loader (like the one you mentioned) - any of these routes should be followable in a memory dump. The interesting thing is who is holding it *not* via a weak/soft reference.
Comment 5 Andrew Clement CLA 2015-01-14 14:50:38 EST
We often find that that you don't necessarily want a weaver for every type of class loader that might be created. There are configuration options on more recent AspectJ releases that allow you to exclude a set of class loaders by name (e.g jsp loaders). I know this could be considered 'working around' the problem but it can help performance if you can prevent creation of a weaver that is not going to find anything to do.
Comment 6 Rudolf Visagie CLA 2015-01-14 23:01:38 EST
(In reply to Andrew Clement from comment #3)
> Created attachment 249946 [details]
> Test program for reference handling
> 
> Unzip this to get the test program.  Then:
> 
> cd Refcheck
> java -javaagent:aspectjweaver.jar -classpath bin:aspects.jar Runner
> 
> The weaver included is AspectJ 1.8.4 - but any vaguely recent one should
> work.
> 
> When I run it:
> Creating a bunch of classloaders...
> In 6 seconds created 444 classloaders
> WeavingAdaptors map size before any classloaders discarded = 444
> WeavingAdaptors map size after *ALL* classloaders discarded = 444
> This next number will likely be quite large:
> WeavingAdaptors map size after hammering GC (but not reference processing) =
> 444
> This next number should be 1
> WeavingAdaptors map size after reference processing = 1

Hi Andrew

I ran the test on the same machine where the heap dump was from and it does not seem like it's leaking:

Output for AspectJ 1.8.4:

Creating a bunch of classloaders...
In 6 seconds created 259 classloaders
WeavingAdaptors map size before any classloaders discarded = 258
WeavingAdaptors map size after *ALL* classloaders discarded = 258
This next number will likely be quite large:
WeavingAdaptors map size after hammering GC (but not reference processing) = 258
This next number should be 1
WeavingAdaptors map size after reference processing = 1

Output for AspectJ 1.7.3:

Creating a bunch of classloaders...
In 6 seconds created 246 classloaders
WeavingAdaptors map size before any classloaders discarded = 246
WeavingAdaptors map size after *ALL* classloaders discarded = 246
This next number will likely be quite large:
WeavingAdaptors map size after hammering GC (but not reference processing) = 246
This next number should be 1
WeavingAdaptors map size after reference processing = 1
Comment 7 Rudolf Visagie CLA 2015-01-14 23:18:46 EST
(In reply to Andrew Clement from comment #5)
> We often find that that you don't necessarily want a weaver for every type
> of class loader that might be created. There are configuration options on
> more recent AspectJ releases that allow you to exclude a set of class
> loaders by name (e.g jsp loaders). I know this could be considered 'working
> around' the problem but it can help performance if you can prevent creation
> of a weaver that is not going to find anything to do.

I will have access to the full heap dump on Monday and then be able to check which ClassLoader references are being held. But your suggestion to limit the list of classloaders sounds promising and I think we will be going that route. Are you referring to this option?

-Daj.weaving.loadersToSkip
Comment 8 Andrew Clement CLA 2015-01-24 14:20:49 EST
Yes, that is the option I mean, it is discussed a bit more in https://bugs.eclipse.org/bugs/show_bug.cgi?id=368046