Bug 128650 - Performance and memory usage
Summary: Performance and memory usage
Status: RESOLVED FIXED
Alias: None
Product: AspectJ
Classification: Tools
Component: Compiler
Version: DEVELOPMENT
Hardware: PC Windows XP
Importance: P2 enhancement
Target Milestone: 1.5.1
Assignee: Andrew Clement CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-02-20 09:46 EST by Andrew Clement CLA
Modified: 2006-04-04 14:26 EDT
CC: 2 users

See Also:


Description Andrew Clement CLA 2006-02-20 09:46:32 EST
This bug will track the work done to address the performance and memory usage issues, and will include some stats.
Comment 1 Andrew Clement CLA 2006-02-20 09:57:28 EST
First batch:

1. In MethodGen, rather than converting the linked list of instructions to an array each time we want to search it, we do it once and reuse the array.
2. For couldMatchKinds support in pointcuts, instead of using a Collection (Set), we now use a bit flag (see the sketch after this list).
3. TypeFactory - use char rather than string manipulation.
4. NamePattern/SignaturePattern - reduced cost of hashcode and equals.
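
For illustration, a minimal sketch of the bit-flag idea in item 2 (the constant names and kinds below are illustrative, not the actual weaver API): set union, intersection and membership become single bitwise operations and no collection objects are allocated per pointcut.

   // Hypothetical sketch of the bit-flag idea; kind names are illustrative.
   public final class ShadowKinds {
       public static final int METHOD_CALL      = 1 << 0;
       public static final int METHOD_EXECUTION = 1 << 1;
       public static final int FIELD_GET        = 1 << 2;
       public static final int FIELD_SET        = 1 << 3;
       public static final int ALL_KINDS = METHOD_CALL | METHOD_EXECUTION | FIELD_GET | FIELD_SET;

       // Union of what two pointcuts could match (was Set.addAll)
       public static int union(int kindsA, int kindsB)     { return kindsA | kindsB; }
       // Intersection (was Set.retainAll)
       public static int intersect(int kindsA, int kindsB) { return kindsA & kindsB; }
       // Membership test (was Set.contains)
       public static boolean couldMatch(int kinds, int kind) { return (kinds & kind) != 0; }
   }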
Comment 2 Andrew Clement CLA 2006-02-21 05:47:18 EST
5. Modified the reset logic for AjBuildManager to remove the 'world' instances stored against primitive types (which are anchored in static fields in ResolvedType) - this enables the memory usage to shrink right down after compilation (if states aren't being maintained for later incremental compiles).  Remaining static state (preventing us from getting back to zero heap) would appear to be JDT compiler related.

6. Modified BCEL to create Tags rather than Gens.  Tags are cheaper and are what we work with anyway.  We used to let BCEL create gens and then iterate through all instructions for a woven type, transforming them into tags.  Creating tags directly means we avoid creating all the garbage for gens and we save the time spent transforming gens into tags.  Tags/Gens relate to local variable and line number information.  Implementation:
  - added a flag to the MethodGen ctor in bcel that allows you to specify what you want.
  - had to push Tag/LineNumberTag/LocalVariableTag down from weaver into BCEL, which meant removing the UnresolvedType dependency that LVTag had - no big deal.
  - rather than 'unpack' line numbers (converting gens to tags) in LazyMethodGen, I now have to ensure all instructions have the tags.  Once we have a lazy method gen, all instructions need to be tagged with the right line number, so that if we extract some instructions to another method (implementing around advice) the tags are preserved.  We could jump through some fiery hoops to discover the tags at the necessary time, but for now I just put the right tag on all the instructions (see the sketch at the end of this comment).  This isn't worse than before: we used to unpack the gens and put tags on all instructions; we now go through the tags and just make sure every instruction gets the right one.

  If this optimization goes wrong, it can be deactivated by:
   -Xset:optimizeWithTags=false
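
A rough sketch of the tagging idea above - the types here are simplified, self-contained stand-ins for the real BCEL/weaver classes, not the actual implementation: walk the instruction list once, carrying the current line-number tag forward and attaching it to every instruction, so the information survives if instructions are later extracted into another (around-advice) method.

   import java.util.List;

   // Placeholder types standing in for the real BCEL instruction/tag classes.
   class LineNumberTag { final int line; LineNumberTag(int line) { this.line = line; } }

   class Instruction {
       LineNumberTag lineTag;   // null unless this instruction starts a new source line
   }

   class LineNumberTagger {
       // Carry the most recent line-number tag forward so every instruction ends up tagged.
       static void tagAllInstructions(List<Instruction> instructions) {
           LineNumberTag current = null;
           for (Instruction in : instructions) {
               if (in.lineTag != null) {
                   current = in.lineTag;     // a new source line starts here
               } else if (current != null) {
                   in.lineTag = current;     // propagate the last known line
               }
           }
       }
   }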
Comment 3 Andrew Clement CLA 2006-02-21 06:08:59 EST
Time for some stats, although these include some BCI changes, in addition to those above, that I've not yet described.

Compiling the shadows module (our copy of the JDT compiler) from source (target source.compile), with -XnotReweavable, -Xlint:ignore and 1.3 source/target set:

Allocated bytes during the whole process with AspectJ 1.5.0: 2,232Mb
Allocated bytes during the whole process with AspectJ-ANDY: 1,997Mb

Saving: 235 megabytes (~10%)
Comment 4 Andrew Clement CLA 2006-02-22 10:26:08 EST
7. OK, the big one.  This is a switch to using the ASM bytecode toolkit to create lightweight delegates that represent types which will not be woven - for example, all the types we load from rt.jar (String, etc.) that are merely referenced during compilation/weaving but not actually woven themselves.  Delegates for these types can be more lightweight because we don't need to know anything about the code in the methods, and we don't need to know about line number tables or local variable tables.  The option to use lightweight delegates defaults to ON; to run the compiler with it turned OFF, specify:

  -Xset:activateLightweightDelegates=false

This commit also includes a few other optimizations/fixes/etc:
   - The AnnotationAJ type has been introduced to represent an annotation in a bytecode-toolkit-independent way.  We previously always used BCEL annotations during weaving, wrapping them in an AnnotationX.  The ASM visitors create AnnotationAJ objects, and AnnotationX has been modified to be a facade over either kind.  (Longer term, BCEL would spit out AnnotationAJ objects too...)  Perhaps these classes should have been put in their own package.
   - To unpack annotations as lazily as possible, the usual approach is to store in each member that may be annotated some flags indicating whether unpacking has been done *and* the unpacked form.  That is a lot of wasted space on a method or a field when you consider that 99% of the world's methods and fields are *not* annotated.  So the AnnotationsForMemberHolder class encapsulates the lazy unpacking and requires only one object reference from the method/field (see the sketch after this list).
   - Modified who holds the source context so it isn't stored in too many places.
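
To illustrate the lazy-unpacking point: the real class is AnnotationsForMemberHolder, but its internals below are only sketched from the description, with AnnotationAJ as a placeholder type.  A member keeps a single reference that stays null for the ~99% of members with no annotations, and the expensive decode is deferred until first request.

   // Illustrative, self-contained sketch; not the actual weaver implementation.
   class AnnotationAJ { }

   class AnnotationsForMemberHolderSketch {
       private final byte[] rawAttribute;   // the undecoded bytecode attribute
       private AnnotationAJ[] unpacked;     // decoded lazily, on first request

       AnnotationsForMemberHolderSketch(byte[] rawAttribute) { this.rawAttribute = rawAttribute; }

       AnnotationAJ[] getAnnotations() {
           if (unpacked == null) {
               unpacked = decode(rawAttribute);   // expensive decode deferred until needed
           }
           return unpacked;
       }

       private AnnotationAJ[] decode(byte[] raw) {
           // parse the RuntimeVisibleAnnotations attribute here
           return new AnnotationAJ[0];
       }
   }

   // A method or field then needs only one field, null for almost all members:
   //   private AnnotationsForMemberHolderSketch annotations;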


I am still concerned about the CompilationAndWeavingContext calls we make when matching every single shadow! (see BcelClassWeaver.match)
Comment 5 Andrew Clement CLA 2006-02-22 11:20:33 EST
8. OK, I made the low-level context capturing switchable.

   -Xset:captureAllContext=true

   will switch on capturing context information about all shadows being matched (meaning that if AJ ever crashes on you, you can tell which shadow it was weaving at the time)
Comment 6 Andrew Clement CLA 2006-02-23 12:17:53 EST
9. Just discovered the reason why AJDT memory usage seems to increase over time - or at least what is responsible for 90% of the problem.  The AjBuildManager wraps the incoming message handler and sets it on the world - unfortunately, on the incremental compiles that follow, it was wrapping the old weaver message handler and resetting that on the world - so after 10 builds you have a chain of 10 weaver message handlers.  Each weaver message handler holds onto a compiler instance, which holds onto a ton of state.  The fix is to only wrap it if it isn't already a weaver message handler - if it already is, just tell it about the new compiler instance.  In my minimal example, I did a full build of spacewar then did 10 incremental compiles, changing only a comment each time.  Without my fix the heap grows by 0.9 megabytes (imagine scaling that up to a *big* project); with my fix the heap grows by less than 0.1 megabytes (and I'm still trying to nail that).
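
A sketch of the fix - the types below are self-contained stand-ins, not the real AjBuildManager/handler classes: wrap the incoming handler only once, and on later builds just refresh the compiler reference so a chain of wrappers (each pinning an old compiler) never forms.

   // Approximate sketch; stand-in types for the real AjBuildManager message handling.
   interface IMessageHandler { }

   class Compiler { }

   class WeaverMessageHandler implements IMessageHandler {
       private final IMessageHandler delegate;   // the handler AJDT passed in
       private Compiler compiler;                // refreshed on each build
       WeaverMessageHandler(IMessageHandler delegate, Compiler compiler) {
           this.delegate = delegate;
           this.compiler = compiler;
       }
       void setCurrentCompiler(Compiler compiler) { this.compiler = compiler; }
   }

   class MessageHandlerInstaller {
       // Returns the handler to set on the world: reuse an existing wrapper rather than chaining.
       static IMessageHandler install(IMessageHandler incoming, Compiler newCompiler) {
           if (incoming instanceof WeaverMessageHandler) {
               ((WeaverMessageHandler) incoming).setCurrentCompiler(newCompiler);
               return incoming;
           }
           return new WeaverMessageHandler(incoming, newCompiler);
       }
   }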
Comment 7 Andrew Clement CLA 2006-02-28 06:31:12 EST
10.  Not sure this has been captured anywhere, so I'll mention it here.  Adrian did some work reducing the size of AjState.  The aim was to minimize what we need to keep around; he got to the point where we were keeping a copy of the bytecode so that on the subsequent incremental compile we could compare the bytecode produced for a class with the bytecode it had on the previous run and determine whether there were any structural changes to the class (new methods, etc.).  Structural changes require dependent projects to be rebuilt.  He has reduced it so that we keep the shape of the class post-compile (methods/modifiers/fields/etc.) but not the bytecode, as the bytecode is not involved in the comparison.  Further to this, we agreed on a change such that if you aren't going to do a subsequent incremental compile, we don't keep anything at all.  This is different from the 'old' behaviour, which always allowed any batch build to be followed by an incremental build.  There are now flags that need to be set to indicate that you may follow up a batch build with an incremental build.  The path AJDT takes to call the compiler sets these correctly (so no AJDT changes are required) - it should mean command line builds and Ant builds use less memory, and get a performance boost from not having to calculate the class shape object.
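
A rough sketch of the 'class shape' idea - the names below are illustrative, not the actual AjState representation: keep just a structural summary per class instead of the bytecode, and compare summaries after the next compile to decide whether dependents need rebuilding.

   import java.util.Set;

   // Illustrative sketch only.
   final class ClassShape {
       final Set<String> methodSignaturesAndModifiers;  // e.g. "public int size()"
       final Set<String> fieldSignaturesAndModifiers;   // e.g. "private int count"
       final int typeModifiers;

       ClassShape(Set<String> methods, Set<String> fields, int typeModifiers) {
           this.methodSignaturesAndModifiers = methods;
           this.fieldSignaturesAndModifiers = fields;
           this.typeModifiers = typeModifiers;
       }

       // A structural change means dependent projects/types need rebuilding.
       boolean structurallyDifferentFrom(ClassShape other) {
           return typeModifiers != other.typeModifiers
               || !methodSignaturesAndModifiers.equals(other.methodSignaturesAndModifiers)
               || !fieldSignaturesAndModifiers.equals(other.fieldSignaturesAndModifiers);
       }
   }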
Comment 8 Andrew Clement CLA 2006-02-28 12:29:51 EST
Just some ramblings before the next commit... related to bug 129297 and Ron's patch included in it - I'll stick some comments in there too, based on specific commits relating to the patch.

I've been measuring compilation of spacewar.  I anchor the world instance that gets used in a static field so I can look at the 'active' objects at the end of compilation - this enables me to see what would be active in memory either in a long-running compiler behind an AJDT build or in an LTW system.

In the following observations, the first number (megabytes) is a kind of 'max heap we got to' and the second is 'live residual heap' (effectively, this second number is how much space the world is consuming...).

Compiling spacewar straight: 5.76/5.67
Implementing a weavingCompleted lifecycle method (based on Ron's eviction code in the patch, but going a bit further and being actively triggered rather than triggered on GC of some refs - sketched at the end of this comment): 4.09/4.01
Fix for the static state preserved in Bcel that I stumbled across this morning (*sigh*): 4.01/3.93
Slight backstep, as some of the optimizations in my weavingCompleted support only work when compiling with -XnoInline: 5.52/5.43
Flushing the xcut member set if there is no scheduled incremental compile (based on Adrian's support for that feature): 5.49/5.40
Sticking on -XnoInline again just to see how low it now goes: 3.99/3.91

So, all progress in the right direction - I'm hoping that the static BCEL state fix will address the last blob of memory leak I alluded to in an earlier comment on this bug report.
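
For illustration, a hypothetical shape for the weavingCompleted lifecycle hook mentioned in the measurements above - this is not the committed code, just the idea of actively evicting caches once weaving ends rather than waiting for references to be GC'd:

   // Hypothetical sketch of an actively-triggered eviction hook.
   class WorldSketch {
       private java.util.Map<String, Object> typeMap = new java.util.HashMap<String, Object>();
       private java.util.Map<String, Object> annotationCache = new java.util.HashMap<String, Object>();
       private boolean incrementalCompileMayFollow;

       // Called once, when the weave for a build has finished.
       void weavingCompleted() {
           annotationCache.clear();          // per-weave scratch state is always safe to drop
           if (!incrementalCompileMayFollow) {
               typeMap.clear();              // nothing will ask for these types again
           }
       }
   }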
Comment 9 Andrew Clement CLA 2006-03-04 05:45:04 EST
Fix for static state remaining in Bcel is now committed - removes yet another small memory leak.

Agreed with Ron to make the choice of reference types used for the typemap conditional on the 'world' instance, so LTW can have weak references and AJDT can have soft references - rather than everyone getting soft as is the case right now.  We might add an Xset: option for overriding this so it's easy for us to see whether we are getting the benefits we expect.
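
A sketch of the weak-vs-soft choice - the class below is illustrative, not the real type map: the same map hands out either reference flavour depending on how the world was configured, so an LTW world lets types go as soon as they are unreferenced while an AJDT world keeps them until memory pressure demands otherwise.

   import java.lang.ref.Reference;
   import java.lang.ref.SoftReference;
   import java.lang.ref.WeakReference;
   import java.util.HashMap;
   import java.util.Map;

   // Illustrative sketch of making the reference type a per-world policy.
   class TypeMapSketch {
       private final boolean useWeakReferences;   // true for an LTW world, false for AJDT
       private final Map<String, Reference<Object>> map = new HashMap<String, Reference<Object>>();

       TypeMapSketch(boolean useWeakReferences) { this.useWeakReferences = useWeakReferences; }

       void put(String signature, Object resolvedType) {
           map.put(signature, useWeakReferences
                   ? new WeakReference<Object>(resolvedType)
                   : new SoftReference<Object>(resolvedType));
       }

       Object get(String signature) {
           Reference<Object> ref = map.get(signature);
           return ref == null ? null : ref.get();   // may be null if the type was collected
       }
   }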

Re: Thilo's memory situation on the list - it does seem important to get some of Ron's work on sharing state across worlds (for common classes) in.
Comment 10 Adrian Colyer CLA 2006-03-05 03:10:31 EST
While we're tracking all this stuff... some more changes I committed that aren't in this thread yet.  As part of the AjState reduction, it is very important that no one is inadvertently holding onto references to worlds or states when they shouldn't be.  More HAT analysis showed a few places where this was happening, and they have now been fixed.  Two of them were singleton aspects (obvious when you say it, but not obvious beforehand, is that the state in a singleton aspect is anchored in a static field).  CompilationAndWeavingContext also held onto expensive references in its diagnostic stack (which was also improved to only collect per-thread data when running the weaver multi-threaded), and the Dump class also retained static references.
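
To illustrate the singleton-aspect point (the aspect and field below are made up; only the generated ajc$perSingletonInstance field is real): ajc compiles a singleton aspect into roughly this shape, so what looks like instance state in the aspect source is reachable from a static field for the lifetime of the classloader unless it is explicitly cleared.

   // Made-up illustration of why singleton aspect state becomes a static root.
   class Cache {
       // generated by the compiler for singleton aspects
       public static final Cache ajc$perSingletonInstance = new Cache();

       // looks like instance state in the aspect source, but is reachable from the
       // static field above, so a cached world/state here is never collected
       private final java.util.Map<String, Object> cache = new java.util.HashMap<String, Object>();
   }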
Comment 11 Adrian Colyer CLA 2006-03-05 03:12:42 EST
Also for the record while I think about it .... the profiling suite we're using to do all this is in tests/profiling and described here: http://dev.eclipse.org/mhonarc/lists/aspectj-dev/msg01963.html
Comment 12 Adrian Colyer CLA 2006-03-05 03:18:34 EST
Final comment (for the moment). The goal I'm working towards with the AjState reductions etc. is to change to a "pipelining" architecture for compilation. Currently we ramp up memory all the way through the source compilation phase, and then, when we have source-compiled everything (and held the results in memory), we start the weaving phase. Weaving also works in strict phases across the set of types exposed to the weaver (e.g. aspects before classes, type mungers before shadow mungers - except for 'late' type mungers, but that's another story). The goal is to order the pipeline so that individual types (or small batches of types) can progress through all the stages together, and then have the bulk of the memory released (leaving just a compact structure record behind). This should give our memory usage profile a lower and flatter look (rather than the ramp it is now). This change is *not* scheduled for 1.5.1.
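
A very rough sketch of the pipelining idea - purely illustrative pseudostructure, not the planned implementation: small batches of types flow through compile + weave + write together, after which their heavy structures go out of scope and only a compact record is kept.

   import java.util.List;

   // Purely illustrative; all types and steps are placeholders.
   abstract class PipelineSketch<SourceFile, CompiledType, CompactRecord> {

       abstract Iterable<List<SourceFile>> partitionIntoSmallBatches(List<SourceFile> all);
       abstract List<CompiledType> sourceCompile(List<SourceFile> batch);     // heavy structures
       abstract List<CompiledType> weave(List<CompiledType> batch);           // aspects applied
       abstract void writeToDisk(List<CompiledType> wovenBatch);
       abstract CompactRecord recordCompactStructure(List<CompiledType> wovenBatch);

       void build(List<SourceFile> allSourceFiles) {
           for (List<SourceFile> batch : partitionIntoSmallBatches(allSourceFiles)) {
               List<CompiledType> woven = weave(sourceCompile(batch));
               writeToDisk(woven);
               recordCompactStructure(woven);
               // the heavy compiled/woven structures for this batch go out of scope here,
               // so memory stays flat instead of ramping up across the whole build
           }
       }
   }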
Comment 13 Andrew Clement CLA 2006-03-10 09:24:30 EST
*sigh*, still some problems with the latest AJDT:

1. The 'active heap' for a big project is less than it used to be, but not by as much as expected.  We need to find where the remaining memory is going.
2. The state/world/etc. for a project isn't GC'd when the project is closed; clearly someone we've forgotten about is holding onto it.

For part (2), using HAT I can see these two places are to blame:

Static reference from org.aspectj.weaver.Dump.nodes (from class org.aspectj.weaver.Dump) :
--> java.util.HashMap@0x5000144e (32 bytes) (field table:)
--> Instance of [Ljava.util.HashMap$Entry[];@0x5000144f (20 bytes) (Element 4 of Instance of [Ljava.util.HashMap$Entry[];:)
--> java.util.HashMap$Entry@0x50002074 (16 bytes) (field value:)
--> org.aspectj.weaver.bcel.BcelWorld@0x50001c77 (79 bytes)


Static reference from org.aspectj.ajde.Ajde.INSTANCE (from class org.aspectj.ajde.Ajde) :
--> org.aspectj.ajde.Ajde@0x500001d2 (52 bytes) (field buildManager:)
--> org.aspectj.ajde.internal.AspectJBuildManager@0x500017de (33 bytes) (field compilerAdapter:)
--> org.aspectj.ajde.internal.CompilerAdapter@0x50001aa3 (16 bytes) (field buildManager:)
--> org.aspectj.ajdt.internal.core.builder.AjBuildManager@0x50000dcf (46 bytes) (field state:)
--> org.aspectj.ajdt.internal.core.builder.AjState@0x500015b9 (114 bytes) (field world:)
--> org.aspectj.weaver.bcel.BcelWorld@0x50001c77 (79 bytes) 
Comment 14 Adrian Colyer CLA 2006-03-10 11:32:08 EST
Ah yes... Dump - of course. I put some logic in to clear Dump when we have completed a compile via main, but AJDT doesn't go down that route. I did add a reset() or similar method to Dump though, so it should just be a case of AJDT calling it at the appropriate time.
Comment 15 Andrew Clement CLA 2006-03-10 14:18:35 EST
I have builds with Dump.reset() called at the appropriate time, but some tests fail - needs a little more tweaking.

But I found where the rest of the memory is - I'll post again when I've double-checked my numbers...
Comment 16 Andrew Clement CLA 2006-03-14 11:19:40 EST
Changes committed to address the memory problem - should behave much better now ;)
Still not optimal, but much improved.

Problems were:
- compiler artifacts being held onto by the code building the structure model (anchoring even one leads down a chain that gets you to all of them)
- program elements being bloated - they have now been stripped down

I see a reduction in 'live' heap at the end of a compiler run of about 200 megabytes (from 280 down to 80).
Comment 17 Andrew Clement CLA 2006-03-18 07:07:43 EST
Still wasn't right... the following changes have now been made:

- We no longer cache every program element created in a handle map

- EclipseSourceContexts are flushed of their CompilationResult when weaving is complete (we don't need it after that) - it is re-set if the file is compiled again.  This saves loads of memory (see the sketch at the end of this comment).

In all cases this now seems to mean AJDT requires less memory to build a project and holds onto much less between compiles.  Usually 60-70% less!
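
A rough sketch of the source-context flushing mentioned above - the class and member names below are stand-ins, not the real EclipseSourceContext code: drop the reference to the compilation result once weaving finishes, and let it be set again if the file is recompiled.

   // Stand-in sketch, not the actual EclipseSourceContext implementation.
   class SourceContextSketch {
       private Object compilationResult;   // large JDT structure; only needed until weaving finishes

       void setCompilationResult(Object result) { this.compilationResult = result; }  // set on (re)compile

       // Called once weaving of this compilation unit is complete.
       void tidy() {
           compilationResult = null;       // release the heavy structure; re-set on the next compile
       }
   }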
Comment 18 Andrew Clement CLA 2006-03-18 09:25:13 EST
I haven't collected any more formal benchmarks.  But:

Normally, to build the JDT compiler we use for AspectJ, I had to start Eclipse with 768M of memory or I'd get an OutOfMemoryError and we'd crash.  I've just done it reliably with an -Xmx of 384M - the resting state between compiles is ~100M.  (The compiler is 1054 source .java files and 5 aspects.)  There is a fair amount of GC, but it *does* work ;)
Comment 19 Andrew Clement CLA 2006-04-04 14:26:17 EDT
A massive amount of work was done in 1.5.1 - closing this as it documents the work done for that target release.