Bug 35547 - memory footprint in weaver
Status: RESOLVED FIXED
Alias: None
Product: AspectJ
Classification: Tools
Component: Compiler
Version: unspecified
Hardware: PC Windows 2000
Importance: P2 enhancement
Target Milestone: 1.5.1
Assignee: Andrew Clement CLA
Reported: 2003-03-23 13:56 EST by Martin Lippert CLA
Modified: 2006-03-27 03:44 EST

Description Martin Lippert CLA 2003-03-23 13:56:04 EST
The footprint of the weaver is pretty high even if classes are not woven. I use
the weaver implementation to weave on a class-by-class basis in the weaving
class loader and observed immense memory consumption, because every class I
give to the weaver gets parsed and it seems to me that all of this parsing
information is kept somewhere.
Is this really necessary? Would it be possible to improve the memory footprint
somehow? Maybe by discarding unnecessary information after the parsing?
Even nicer would be a way to do fastmatch without parsing the incoming class
completely. That would reduce the number of objects created during dynamic
class loading tremendously.
Comment 1 Per S Hustad CLA 2003-08-22 10:26:50 EDT
Is there any activity on this one?
Comment 2 Martin Lippert CLA 2003-08-25 13:02:25 EDT
For my project I implemented a workaround for this. I enhanced the BcelWorld
class with a special method that removes all elements from the typeMap. This new
method is called after every class that is passed to the weaver. I did a rough
measurement for my setting. Here are the results for weaving about 3000 classes:

- with total cache cleaning after every weaving: 56sec, 53MB
- without cache cleaning at all: 54sec, 118MB
- with deleteSourceObjectType called for the class that got woven: 51sec, 69MB
(this seems to be the situation where only the instances with delegates are
removed from the cache)

(The numbers are not measured for the weaving activity only, but for using the
weaver for dynamic load-time weaving within the Eclipse platform.)
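
The workaround described in this comment (clear the type map after each woven
class, then restore the default entries) can be sketched roughly as below. This
is only an illustration: SketchWorld, resolve, resetTypeMap and weaveOne are
hypothetical names, not the real BcelWorld API, and the "resolved" values are
placeholders for parsed type information.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a weaver "world"; NOT the real BcelWorld API.
class SketchWorld {
    private final Map<String, Object> typeMap = new HashMap<>();

    SketchWorld() {
        addDefaultEntries();
    }

    // Re-create the small set of entries the world always needs (assumed here).
    private void addDefaultEntries() {
        typeMap.put("java.lang.Object", "resolved:java.lang.Object");
    }

    // Resolving a type caches its parsed form, so the map grows with every class seen.
    Object resolve(String typeName) {
        return typeMap.computeIfAbsent(typeName, n -> "resolved:" + n);
    }

    // The workaround: drop everything, then restore the defaults.
    void resetTypeMap() {
        typeMap.clear();
        addDefaultEntries();
    }

    int cachedTypeCount() {
        return typeMap.size();
    }
}

class PerClassWeavingDemo {
    // Weave one class, then clear the cache so entries do not accumulate.
    static void weaveOne(SketchWorld world, String className) {
        world.resolve(className);            // stands in for the real weaving work
        world.resolve(className + "$Super"); // a referenced type pulled in as a side effect
        world.resetTypeMap();
    }

    public static void main(String[] args) {
        SketchWorld world = new SketchWorld();
        for (String name : new String[] {"a.Foo", "a.Bar", "a.Baz"}) {
            weaveOne(world, name);
        }
        System.out.println("cached types after weaving: " + world.cachedTypeCount());
    }
}
```

The trade-off matches the measurements above: clearing after every class keeps
the cache small, but any type needed again has to be parsed again.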
Comment 3 Andrew Clement CLA 2004-02-25 08:53:08 EST
Martin - can you tell me how you tracked heap usage after emptying the type 
map - was it purely through verbose Garbage Collection messages, or something 
smarter?
Comment 4 Martin Lippert CLA 2004-02-25 13:05:28 EST
Something far more stupid. ;-)
I just tracked the memory usage of the complete VM via the Windows Task Manager.
This was definitely not the most intelligent or exact method, but it was the
easiest. ;-)

I tried it with my load-time weaving implementation on a per-class basis. I
re-initialized the map with the default entries after every loaded class.
Comment 5 Andrew Clement CLA 2004-03-03 07:57:10 EST
I thought I'd write out my findings ...

I've been investigating around the area Martin has talked about.  It does seem 
there are a lot of types added to the weaver early on in compilation that are 
not really required during the weaving phase.   Firstly, I added a method that 
summarized the list of types in the world in terms of those exposed to the 
weaver and those not exposed to the weaver.  I call this method at the start 
of weave() and at the end of weave().  For my smallish project, ~110 java 
files including a few aspects, I got this:

Prior to weave: exposed=109  notexposed=268
 After weaving: exposed=109  notexposed=268

I then added a hook to trim the world by emptying it of the notexposed types 
prior to weave (I know some of them will be brought back in of course, but I 
was interested in seeing how many).  I now got:

Prior to weave: exposed=109  notexposed=268
 After weaving: exposed=109  notexposed=24

So, compiling my app with AspectJ1.1 right now I have about 228 types sitting 
in the weaver during weave() that don't appear to be required for weaving.
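
The summarize-and-trim hook described above might look something like the
following sketch. TypeEntry, WorldTrimmer and the exposedToWeaver flag are
hypothetical names chosen for illustration; the real world keeps richer
objects than this.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for one entry in the world's type map.
final class TypeEntry {
    final String name;
    final boolean exposedToWeaver;

    TypeEntry(String name, boolean exposedToWeaver) {
        this.name = name;
        this.exposedToWeaver = exposedToWeaver;
    }
}

final class WorldTrimmer {
    // Count entries by exposure, formatted like the output quoted in this comment.
    static String summarize(Map<String, TypeEntry> typeMap) {
        int exposed = 0;
        for (TypeEntry e : typeMap.values()) {
            if (e.exposedToWeaver) exposed++;
        }
        return "exposed=" + exposed + "  notexposed=" + (typeMap.size() - exposed);
    }

    // Remove every entry not exposed to the weaver; returns how many were dropped.
    static int trimNotExposed(Map<String, TypeEntry> typeMap) {
        int before = typeMap.size();
        typeMap.values().removeIf(e -> !e.exposedToWeaver);
        return before - typeMap.size();
    }

    public static void main(String[] args) {
        Map<String, TypeEntry> typeMap = new HashMap<>();
        typeMap.put("app.Main", new TypeEntry("app.Main", true));
        typeMap.put("java.util.List", new TypeEntry("java.util.List", false));
        System.out.println("Prior to weave: " + summarize(typeMap));
        trimNotExposed(typeMap);
        System.out.println("After trimming: " + summarize(typeMap));
    }
}
```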

How does that affect memory consumption? I changed the methods involved to 
force gc with a System.gc() when they are finished and got the following 
verbosegc output:

Here are the stats for a compile that doesn't trim the world:

Prior to weave: exposed=109  notexposed=268
[Full GC 14086K->12199K(23064K), 0.1604967 secs]
After weaving: exposed=109  notexposed=268
[Full GC 13040K->11137K(23064K), 0.2531081 secs]

So, at the end of a build I am using 11Meg.

Here are the stats when we trim the world before weaving:

Prior to weave: exposed=109  notexposed=268
[Full GC 14129K->12232K(23092K), 0.1983992 secs]
Emptying the world of notexposed types
[Full GC 12236K->7493K(23092K), 0.1182385 secs]
After weaving: exposed=109  notexposed=24
[Full GC 8390K->8098K(23092K), 0.0927232 secs]

When we finish, our heap is 8Meg.

So, by emptying the weaver prior to weaving we have a much smaller working set 
of memory during the weave phase.  BUT, in both cases we can see the max heap 
during the whole compilation/weave process is 23Meg.  Emptying the weaver 
isn't bringing the max heap size down - this tells us that the compile phase 
is when memory is really chewed up.
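
For anyone wanting to repeat this kind of measurement without reading verbosegc
output, a minimal probe along these lines could be called at the same points.
HeapProbe is a hypothetical helper, and System.gc() is only a request to the
VM, so the numbers are approximate.

```java
// Hypothetical helper for rough heap measurements around a weave.
final class HeapProbe {
    // Heap currently in use: allocated heap minus the free part of it.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Ask for a collection first so the figure is closer to the live set.
    static long usedAfterGc() {
        System.gc(); // a hint only; the VM may ignore it
        return usedHeapBytes();
    }

    // One line per measurement point, e.g. report("Prior to weave").
    static String report(String label) {
        long usedK = usedAfterGc() / 1024;
        long totalK = Runtime.getRuntime().totalMemory() / 1024;
        return label + ": " + usedK + "K used of " + totalK + "K";
    }

    public static void main(String[] args) {
        System.out.println(report("Prior to weave"));
        byte[][] garbage = new byte[64][];
        for (int i = 0; i < garbage.length; i++) {
            garbage[i] = new byte[64 * 1024]; // stands in for weaving work
        }
        garbage = null;
        System.out.println(report("After weaving"));
    }
}
```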

The final test I did was to empty the weaver of notexposed types after 
compilation of every file - to see what that did for memory usage:

[Full GC 9514K->8891K(16364K), 0.1414252 secs]

So now our max heap usage is only 16Meg rather than 23Meg.

Interesting, eh?
Comment 6 Martin Lippert CLA 2004-03-05 09:12:34 EST
Interesting. What puzzles me is that I am not using the compiler part of
AspectJ in my setting. I am only using the weaver implementation, and that
alone seems to consume a significant amount of memory.

But maybe what is of interest to me is not how much memory is used in total
during the weaving. I observed that the weaver uses more and more memory if it
is used on a per-class basis (weaving each class when it is loaded by some
class loader). Therefore two options might be interesting:

- reducing the memory consumption of the weaver generally (that would help
everybody).
- removing unused objects and therefore freeing memory as early as possible
during the weaving process (which might help to reduce the total memory
consumption as well as the ever-growing consumption when used on a per-class
basis).

I found the typeMap of World to be the most prominent memory consumer when used
in the per-class setting. I don't know how much of that information is really
needed to weave the next class. The map contains the complete parsed information
for each class that is stored in it, and I don't know whether it would be
possible to throw this information away from time to time to free up some
memory.
Comment 7 Adrian Colyer CLA 2004-03-15 11:27:19 EST
As part of the change to the compile loop, I did a bunch of investigation into
memory usage (to ensure that my changes used the same or less memory than 
before).

I've checked in a couple of enhancements as part of the work on enh 50458. These 
are:

* types that go into the world's type map but are not exposed to the weaver 
are now stored in a WeakHashMap - this allows the gc to reclaim the memory if it 
needs it, but otherwise consumes memory for maximum performance.

* I've added a new subclass of UnwovenClassFile called 
UnwovenClassFileWithThirdPartyManagedByteCodes (bit of a mouthful I know!). When 
taking compilation results from the compiler (instances of ClassFile), I now 
wrap these using the aforementioned subclass. This avoids the need to take a 
second copy of the bytecodes and store them in a UCF (a ClassFile creates a 
whole new copy when you ask for its bytes).
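
The first enhancement (strong storage for exposed types, weak storage for the
rest) could be sketched as below. TwoTierTypeMap and its methods are
hypothetical names, not the actual World code. Note that java.util.WeakHashMap
holds its keys weakly, so an entry can disappear once nothing else references
its key; the sketch therefore re-resolves on a miss.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

// Hypothetical two-tier type map; NOT the actual AspectJ World implementation.
final class TwoTierTypeMap {
    // Types exposed to the weaver must survive until the weave is finished.
    private final Map<String, Object> exposed = new HashMap<>();
    // Everything else may be reclaimed by the collector.
    private final Map<String, Object> notExposed = new WeakHashMap<>();

    void putExposed(String name, Object type) {
        exposed.put(name, type);
    }

    void putNotExposed(String name, Object type) {
        // Use a fresh String as the key: literals are interned and strongly
        // reachable, which would keep the entry alive indefinitely. The value
        // must not reference this key object either.
        notExposed.put(new String(name), type);
    }

    // Look in both tiers; on a miss (never seen, or already collected) resolve again.
    Object lookup(String name, Function<String, Object> resolver) {
        Object type = exposed.get(name);
        if (type == null) {
            type = notExposed.get(name);
        }
        if (type == null) {
            type = resolver.apply(name);
            putNotExposed(name, type);
        }
        return type;
    }

    int exposedCount() {
        return exposed.size();
    }
}
```

A caller would use it as `map.lookup("java.util.List", n -> parse(n))`, where
`parse` stands for whatever produces the resolved type.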

In a compile that previously used 52MB (AspectJ 1.1.1), the new version will run 
in 40MB (if you force gc'ing), or more of course if you let it.
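
The second enhancement, avoiding a stored second copy of the bytecodes, can be
illustrated with a holder that delegates to whoever already owns the bytes.
CopyingHolder and DelegatingHolder are hypothetical names for illustration
only; they are not UnwovenClassFile or its new subclass.

```java
import java.util.function.Supplier;

// Keeps its own private copy: two arrays now exist for the same class.
final class CopyingHolder {
    private final byte[] bytes;

    CopyingHolder(byte[] source) {
        this.bytes = source.clone();
    }

    byte[] getBytes() {
        return bytes;
    }
}

// Keeps no copy at all: the bytes stay managed by a third party (the supplier).
final class DelegatingHolder {
    private final Supplier<byte[]> owner;

    DelegatingHolder(Supplier<byte[]> owner) {
        this.owner = owner;
    }

    byte[] getBytes() {
        return owner.get(); // fetched on demand, never stored here
    }
}

final class HolderDemo {
    public static void main(String[] args) {
        byte[] compiled = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE};
        CopyingHolder copy = new CopyingHolder(compiled);
        DelegatingHolder view = new DelegatingHolder(() -> compiled);
        System.out.println("copy shares array: " + (copy.getBytes() == compiled));
        System.out.println("view shares array: " + (view.getBytes() == compiled));
    }
}
```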

Here are some things I found out / ideas to investigate when we return to this 
issue (post 1.2 now I suspect):

* We create a lot of copies of byte codes. Roughly speaking it goes as follows:

JDT compiler produces one ClassFile per output class file
Each ClassFile has an UnwovenClassFile created for it
Each UnwovenClassFile has a JavaClass created for it
Each class that needs weaving has a LazyClassGen
Each woven class has a new Unwoven(!)ClassFile created for it

Can we reduce the number of forms? Can we release some copies earlier? 

* The typeMap in the world is very expensive. Entries of type ResolvedTypeX.Name 
have a delegate which is a ResolvedTypeX.ConcreteName, which is either a 
BcelObjectType or an EclipseSourceType. These contain the JavaClasses, and for 
many of them an ISourceContext which is an instance of EclipseSourceContext. 
EclipseSourceContext holds within it a CompilationResult, which holds within it 
in compiledTypes copies of all the byte codes produced for the compilation unit 
during compile. The type map currently serves two purposes - it is used for 
resolving questions regarding types, and as a repository for the JavaClasses and 
results (JavaClasses are used in answering type questions too though). Can we 
separate out the data structures needed to answer type queries (which must 
persist until the end of the whole weave) from the data structures that hold 
byte code copies (which could potentially be released earlier)? A look at the 
State class in the JDT compiler might help us here.

* A lot of the state we keep around is kept for future incremental compiles. If 
we know we're doing a batch compile, and that the process will exit afterwards, 
then we could use a lot less memory for ajc and ant compiles by just never 
keeping any of the incremental state.

* The weaving process creates a lot of garbage - this consumes memory as the gc 
doesn't recover it as fast as it is used and then released. Could we create 
fewer temporary objects somehow? Fewer temporary byte arrays? (I haven't looked 
into this.)
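
One possible shape for the separation raised in the typeMap item above: keep
small, long-lived facts for answering type questions apart from the bulky byte
code, so the bytes can be released per class. TypeFacts, SplitWorld and their
methods are hypothetical, and superclass names stand in for the much richer
information a real type query needs.

```java
import java.util.HashMap;
import java.util.Map;

// Small, long-lived facts that must persist until the end of the weave.
final class TypeFacts {
    final String name;
    final String superclassName; // null for a root type

    TypeFacts(String name, String superclassName) {
        this.name = name;
        this.superclassName = superclassName;
    }
}

// Hypothetical world that stores type facts and byte code separately.
final class SplitWorld {
    private final Map<String, TypeFacts> facts = new HashMap<>();
    private final Map<String, byte[]> bytecode = new HashMap<>();

    void add(String name, String superclassName, byte[] bytes) {
        facts.put(name, new TypeFacts(name, superclassName));
        bytecode.put(name, bytes);
    }

    // Answers a type question using only the facts, never the bytes.
    boolean isSubtypeOf(String name, String ancestor) {
        String current = name;
        while (current != null) {
            if (current.equals(ancestor)) return true;
            TypeFacts f = facts.get(current);
            current = (f == null) ? null : f.superclassName;
        }
        return false;
    }

    // Drop the byte code for a class once it has been woven and written out.
    void releaseBytes(String name) {
        bytecode.remove(name);
    }

    // Total size of the byte code still held.
    long retainedBytes() {
        long total = 0;
        for (byte[] b : bytecode.values()) total += b.length;
        return total;
    }
}
```

With this split, `releaseBytes` can be called as soon as a class is woven,
while `isSubtypeOf` keeps working for the rest of the weave.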

Comment 8 Andrew Clement CLA 2006-03-27 03:44:16 EST
I'm closing this bug. Much of the work suggested in the comments has been done,
and now that the changes in bug 128650 are through, AspectJ/AJDT creates far
less garbage and consumes far less memory. There are a few more changes to make
in LTW to support sharing of some types, but that is being done under more
recent bugs raised by Ron (where he has contributed some patches).