Bug 367673 - LTW weaved and generated byte code caching
Summary: LTW weaved and generated byte code caching
Status: RESOLVED FIXED
Alias: None
Product: AspectJ
Classification: Tools
Component: LTWeaving (show other bugs)
Version: unspecified   Edit
Hardware: PC Mac OS X - Carbon (unsup.)
: P3 enhancement (vote)
Target Milestone: 1.7.0   Edit
Assignee: aspectj inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-12-30 16:21 EST by John Kew CLA
Modified: 2012-04-17 16:53 EDT (History)
2 users (show)

See Also:


Attachments
patch for caching (55.76 KB, application/octet-stream)
2011-12-30 16:22 EST, John Kew CLA
no flags Details
rss usage (6.26 KB, image/png)
2011-12-30 16:23 EST, John Kew CLA
no flags Details
cpu usage (6.87 KB, image/png)
2011-12-30 16:23 EST, John Kew CLA
no flags Details
Modified patch for caching (62.46 KB, application/octet-stream)
2012-01-08 07:36 EST, John Kew CLA
no flags Details
loadtime module patch (2.93 KB, application/octet-stream)
2012-01-08 07:37 EST, John Kew CLA
no flags Details
weaver patch for just the module (59.53 KB, application/octet-stream)
2012-01-08 07:38 EST, John Kew CLA
no flags Details
Caching where there is a CRC check on the classfile in an index (69.56 KB, application/octet-stream)
2012-02-22 19:30 EST, John Kew CLA
aclement: iplog+
Details

Note You need to log in before you can comment on or make changes to this bug.
Description John Kew CLA 2011-12-30 16:21:21 EST
Build Identifier: 

Summary:
------------------
Similarly to Eclipse Equinox; I implemented (patch attached) a type of byte-code caching for aspectj, using the WeavingAdapter to hook into the right part of WeavingAdapter.weaveClass. Much of this code *could* exist totally separate from aspectj; the only advantage of having it part of the weaver is that we can ensure we are only caching classes which have passed through the weaver and it allows other folks to make use of it.

Use Cases:
------------------
1. As a user, I would like my container to startup faster with a large number aspects
2. As a user, I would like to reduce the RSS/Resident Size of the JVM process
3. As a user, I would like more visibility into the number and nature of the classes weaved using LTW

Configuration:
------------------
To enable caching set the system properties:
    -Daj.weaving.cache.enabled=true
    -Daj.weaving.cache.dir=/tmp/aspectj-cache/ ( for DefaultFileCacheBacking implementation )

How It Works:
------------------
For each classloader used by ClassLoaderWeavingAdapter a directory will be created within /tmp/aspectj-cache/ with a hash generated from the name of the ClassLoader or the list of URLs if it is a URLClassLoader. These hashes are created with an implementation of the CacheKeyResolver Interface. Within each class-loader scoped directory classes are stored under a hashed name derived from the class name, length of the original bytes, and optionally the md5sum of the original bytes (if -Daj.weaving.cache.hash=true is specified).

Generated classes are cached and handled a little differently; On a restart of the JVM, if the cache already exists, these classes will be passed through the GeneratedClassHandler which will presumably redefine them.

Management:
------------------
WeavedClassCache provides a static method for retrieving a list of all WeavedClassCache objects; which subsequently provide methods for getting a CacheStatistics object and some methods for invalidating items within the cache. More thought is needed here.

Extensibility:
------------------
The interfaces CacheBacking and CacheKeyResolver allow users to override the behavior of the cache.  These need to be set statically on the WeavedClassCache before aspectj creates the WeavingAdapter. This would allow a developer to associate a particular ClassLoader with a web application context by associating the object reference with the application name at start.

Known Patch Issues:
------------------
I have not been working with the aspectj code base for long; so there will probably be a bunch of hidden ugly. While it passes "ant test", and there are a bunch new unit tests, it isn't clear what jvm gauntlet this needs to run through before it is shippable. I may also need to sign some contributor agreement or something.

I'm not sure how it performs with containers other than tomcat; or on OS's other than Linux and MacOSX.

I also do not feel that I have properly explained, or avoided, those conditions under which caching is inadvisable; namely those where the type hierarchy has been modified.

Initial Memory Characteristics:
------------------
I originally started looking into this after discovering that LTW + JIT seems to use an inordinate amount of non-managed memory. Under resource constrained systems, this could create a problem. While I did not find the exact cause of this I wanted to test a theory that caching would create a modest improvement.

We have an isolated perf environment where we could compare RSS/Heap/CPU characteristics of an application with lots of LTW under a reasonably high load (1000 concurrent users). After the class-files are cached on the second execution of the JVM, the heap usage dips by an average of ~50MB (670 -> 622) and the RSS seems to drop by about 100 to 200MB between runs (non-managed memory always seems to vary more widely in these tests). GCT is also reduced, from about 26secs over the course of the run to 20.

Initial Observed CPU Characteristics:
------------------
There seems to be a modest improvement in CPU usage (maybe 1%) but the stddev of the CPU usage drops a bit. It's not completely clear yet what the impact here is. Generally CPU usage always approaches the non-weaved baseline asymptotically for the first 30 minutes of the test (well, + delta for the extra code). With cached classes, it approaches this limit faster.

Reproducible: Always
Comment 1 John Kew CLA 2011-12-30 16:22:15 EST
Created attachment 208883 [details]
patch for caching
Comment 2 John Kew CLA 2011-12-30 16:23:22 EST
Created attachment 208884 [details]
rss usage
Comment 3 John Kew CLA 2011-12-30 16:23:40 EST
Created attachment 208885 [details]
cpu usage
Comment 4 John Kew CLA 2012-01-08 07:36:12 EST
Created attachment 209173 [details]
Modified patch for caching

Modified patch for caching; 

1. Added crc checking of the class (possibly overkill)
2. Made sure that only modified classes were cached ( need to check with andy on this)
3. license headers
4. better docs
5. Changed the ClassLoader scope generator to look at the list of aspects for generating a unique scope. This should make the caching a little bit more conservative.
Comment 5 John Kew CLA 2012-01-08 07:37:19 EST
Created attachment 209174 [details]
loadtime module patch

Same thing as the big patch, but for just the loadtime module
Comment 6 John Kew CLA 2012-01-08 07:38:00 EST
Created attachment 209175 [details]
weaver patch for just the module

Same thing as the big patch, but for just the weaver module
Comment 7 John Kew CLA 2012-01-10 15:40:29 EST
In the modified patch I changed it so that only the classes which were actually transformed were cached (point 2 above). This was at the point where the bytes were just compared (WeavingAdapter:374)

I finally got some perf numbers from this approach and it looks like it may be better to cache everything:

Caching Only where bytes are transformed - Statistics from PS
    psCPU psCPUSDev     RSS       System
1 33.4941   10.4539 1216335    agent1000
2 26.0798    6.2286 1047651 baseline1000

Caching Everything - Statistics from PS
    psCPU psCPUSDev     RSS       System
1 31.9987    8.9866 1135415    agent1000
2 26.2027    6.3209 1045701 baseline1000

The RSS number is a mean for a test of 3600 seconds; but the maximums are similarly lower and closer to the baseline (no aspectj) when caching everything.

In retrospect, this difference makes sense; aspectj may hold on to these untransformed classes to deal with type-hierarchy changes; but it was surprising.

Since this patch attempts to take a very naive approach, I'm inclined to remove the second clause of line 374 in the WeavingAdapter. If we want fine-grained modification of the caching behavior the existing interfaces should be able to provide that.
Comment 8 John Kew CLA 2012-02-22 19:30:22 EST
Created attachment 211463 [details]
Caching where there is a CRC check on the classfile in an index

A new patch generated from the git aspectj with the command:
   org.aspectj(caching-squash) $ git format-patch master --stdout

This includes weaver and loadtime module changes.

The patch is a better version of the caching code, which stores only weaved and generated classes. All classes are registered in a per-classloader index file (in the default backing store ) which has entries for files which should never be transformed.

For my container tests, which consist of tcServer with about five or six different webapps this code caches roughly 231 weaved or generated classes including one index file per classloader. Once the cache is filled the loadtime weaver is rarely engaged and the startup time is quick and memory usage quite a bit lower.

From the start of the tcServer to the point where a single webapp responds we have the following timings and RSS values:

With no aspects nocache:
	1m39.808s	RSS: 374764
	1m40.242s	RSS: 371716
	1m38.849s	RSS: 367816
With fully filled cache:
	0m30.675s	RSS: 255736
	0m30.091s	RSS: 253408
	0m27.639s	RSS: 258392
Comment 9 Andrew Clement CLA 2012-03-01 11:23:43 EST
changes are committed, thanks John.