Bug 166246 - Avoid Memory Overhead Per ClassLoader
Summary: Avoid Memory Overhead Per ClassLoader
Status: NEW
Alias: None
Product: AspectJ
Classification: Tools
Component: LTWeaving (show other bugs)
Version: DEVELOPMENT   Edit
Hardware: PC Windows XP
: P3 enhancement (vote)
Target Milestone: ---   Edit
Assignee: aspectj inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-11-29 13:56 EST by Ron Bodkin CLA
Modified: 2010-11-22 10:26 EST (History)
2 users (show)

See Also:


Attachments
Proposed patch (57.32 KB, patch)
2010-11-01 16:32 EDT, Abraham Nevado CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Ron Bodkin CLA 2006-11-29 13:56:04 EST
The biggest issue I see with AspectJ load-time weaving in Java EE containers is the duplication of weaver state per class loader. In Particular, keeping a copy of weaving state for each JSP in a large system can be onerus. This has come up previously in the AspectJ users mailing list and is a major issue for Glassbox users too, e.g., see http://www.glassbox.com/forum/forum/viewthread?thread=92  I measure that AspectJ load-time weaving state for Glassbox is about 1 megabyte per ClassLoader. In a large system you can easily have hundreds and even thousands of loaders per JSP.

I have some ideas for how one might handle this, but since Matthew prefers feature requests without design I'd like to know if there's any prospect of a fix for this major problem any time soon.
Comment 1 Matthew Webster CLA 2006-12-01 08:21:35 EST
The memory footprint for LTW is too large. More importantly its long term footprint is too large. Once the system has reached a steady state with no class loading or weaving then state cached in either the World or Weaver should be available for GC. The trick is to avoid hard references which I believe is the root of the problem.

Here are some things to consider:
1. It is important that we maintain the existing relationship ClassLoader -> Adaptor -> Weaver -> World for the lifetime of the class loader. This enables lifecycle management, namespace and problem diagnosis. I think breaking this chain to save memory only to rebuild it later when further weaving is required would be counter productive.
2. Sharing Weaver/World state is problematic. We have done this for bootstrap classes which has reduced footprint. This cannot reliably be done for other classes/class loaders because of assumptions that would need to be made about their relationships and delegation patterns. This applies to both "non-delegating" J2EE class loaders whose behaviour is not determined by any standard external meta-data and OSGi bundle loaders whose behaviour is dynamic.
3. We cannot discard aspects because they must be constant for the lifetime of a class loader and can we, at least for the moment, discard types that are the targets of ITDs. 

>In a large system you can easily have hundreds and even
>thousands of loaders per JSP.
What is the lifetime of these loaders?
Comment 2 Eugene Kuleshov CLA 2006-12-01 12:04:32 EST
Matthew, jsp classloader live until jsp is reloaded, and reload is very unlikely in production.

Making World collectable seem good idea. Also it might make sense to limit its size, like for other adaptable caching approaches and make possible to load that data again on demand. I was thinking about building something like that within ASM framework.
Comment 3 Ron Bodkin CLA 2006-12-01 13:52:13 EST
Unfortunately as Eugene says, the lifetime of the JSP loaders is typically the time the application is up and running unless it is redeployed (when new ones are created). The loaded JSP class stays in memory and it has a hard reference to its loader. 

It's also noteworthy that the time it takes to reparse definitions and reload aspect definitions can add significant latency (100's of milliseconds each time a new JSP loads with the Glassbox AOP configuration).

JSP loaders are interesting in particular because:
1) they should never have any changes to the aspect configuration of their parent
2) they should always see the exact same resources as the parent loader in particular they can't add new aop.xml configuration to the mix

This might make it easier to optimize (e.g., by sharing some state). I could imagine making a copy of the servlet loader weaver and world, weaving, then just saving off the type map for the jsp loader and discarding the rest of the state (this is a strategy that wouldn't work for other loaders).

The other question I have is how much memory overhead can be optimized by using better data structures to allow caching and sharing of some common configuration parameters. It would be great to measure the memory savings from not having hard references in the world and weaver. I can share recent memory profile output to show the memory use, in particular in the case of these small loaders holding lots of memory.
Comment 4 Abraham Nevado CLA 2010-11-01 16:31:13 EDT
We were facing a similar issue at customer site, impossible to deploy and make run AspectJ instrumenting everything -in a similar way Glassbox used to- with our solution. 

As a proposed solution -and previously discussed with Andy-  we have created a Hybrid HashMap which maps the first 25 weavingAdaptors in memory, the rest are dehydrated to disk and rehydrated when required in a LRU fashion, that is the last used Weaving Adaptor is serialized to disk in case more than 25 Weaving Adaptor are required. We have relayed on current test system to test this function as well as our internal tests.  Additionally it has been running several weeks right now at several customers sites with no issue associated. 

Now we are able to install our solution no matter the number of JSPs, additionally heap pressure has been dramatically reduced in those environments where several JSPs are used.

The main modification has been identifying Serialized objects and make them Serializable to allow everything run smoothly. Some sort of bypass to keep weak references work properly has been as well implemented.

Hope helps,

Best regards.
Comment 5 Abraham Nevado CLA 2010-11-01 16:32:18 EDT
Created attachment 182173 [details]
Proposed patch

Serialization patch to avoid overhead due to several ClassLoaders, specially when hundreds of JSPs are used. Drastically reduces Heap pressure as well on this environments.
Comment 6 Andrew Clement CLA 2010-11-02 11:45:21 EDT
I was just taking a look at the patch.  Serialization isn't the route I want to take in the longer term - I want more control over what gets serialized, so will like go down implementing a real write/read mechanism with my own defined format.  However, I can see that this addresses your problem here.  I am nervous about the changes around weak references as there are no tests for those things - I typically just got them right through endless profiling and so if there are changes I can't be 100% confident problems haven't been introduced.  I presume you are measuring everything to check the references are still behaving?

How large are the serialized weavers?

I'd prefer some control over where the serialized weavers go - isn't it a security risk to just dump them all with File.createTempFile()?

Have you been able to do much testing with rehydrated weavers?  I'd like to know they are still as capable as they were prior to the serialization.  Perhaps we can change the weave to periodically serialize/deserialize itself during a run of the tests (just for a one off test run).
Comment 7 Andrew Clement CLA 2010-11-02 12:02:56 EDT
I see the new Hybrid hashmap type - but I didn't see anyone creating instances of it?
Comment 8 Abraham Nevado CLA 2010-11-22 09:52:41 EST
Regarding the tmpFile: totally agree, a better and more customizable option is required. Do you think adding some sort of -X option including the path, if set up and use the TMP in case no is suitable enough?

Regarding tests, we have been able to run in different production environments serializing and deserializing the weavers without not noticeable impact. 

In fact, you can define the max number of Weavers in memory adjusting HybridHashMAp capacity at construction level. So we in order to do real hard tests we have setup in our devel and QA environments to 1 Weaver in memory, for WebSphere 6.X and 7.x, for Tomcats 5.X and 6.X, and other sort of APP servers and we have been able to work properly and pass all our tests despite only one Weaver is in memory at a time. 

I mean that by constraining the Memory instance to 1 weaver only, everytime a JSP is loaded or whatever the root Weaver is serialized to disk, and afterwards rehydrated from disk... this is a real hard test and seems to pass properly. Of course apart from the experience we are getting from the productions environments where our solution is already deployed and working smoothly.

Anyway as aspectJ has a lot of features and options we could not be using,  I do not know if it is possible to ask for further help to other community members, and add this change as something we are going to test and that can be easily turned off. And wait for others inputs before toggling it on by default.

More things, about the serialized weavers size, I will let you know about the ones at production in the next response (within 50 minutes) one colleague is arriving there will let me know.

And finally about the creation of the HybridHashMap, it is created at Aj.Weaver, during initialization of weavingAdaptors field....

Let me know what do you think about all of this, and we will start coding modifications in order to make this happen :)

Thanks and regards.
Comment 9 Abraham Nevado CLA 2010-11-22 10:26:09 EST
Hi, regarding sizes in disk:

about 11660 bytes for simple JSP classloaders...

Regards.