Bug 457760 - Need a disk-usage plugin that can check usage without dragging all build data into memory
Status: NEW
Alias: None
Product: Hudson
Classification: Technology
Component: Plugins
Version: 3.1.2
Hardware: PC Mac OS X
Importance: P3 major
Target Milestone: ---
Assignee: Winston Prakash
QA Contact: Geoff Waymark
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-01-16 17:06 EST by Bob Foster
Modified: 2015-02-02 12:32 EST
Description Bob Foster 2015-01-16 17:06:07 EST
When a Hudson instance has a very large number of jobs and builds, the job/build cache prevents memory from filling up with them, as long as no plugin drags them all back into memory, which is exactly what the disk-usage plugin does. We've recently had two different sites report periodic freezes with high CPU utilization and no responsiveness. If the memory limit is set high enough, the problem may resolve itself; otherwise it leads to an OutOfMemoryError:

2015-01-16 02:17:23.737:WARN:oeji.nio:handle failed
java.lang.OutOfMemoryError: GC overhead limit exceeded

This is a sign of memory manager thrashing, which is what will happen if several gigabytes of data are suddenly dragged into memory and weakly referenced. If this fills memory to near the threshold, each new allocation triggers a GC cycle to find memory, which frees just enough weakly referenced data to make room, and so on.

In both cases, the disk-usage plugin was in use and is a strong suspect.

Obviously the build data does not need to reside in memory just to check the disk locations where it is stored. We need a new version of the plugin that is cache-aware, and perhaps some API extensions to make this possible.
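
To illustrate the point, here is a minimal sketch of measuring a directory's size from file system metadata alone, so that no build records are deserialized. The class and method names (DiskUsageSketch, sizeOf) are hypothetical and are not part of Hudson or the existing plugin; the caller is assumed to already know the build directory path.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

/** Hypothetical helper; not an existing Hudson or plugin class. */
class DiskUsageSketch {

    /** Sums the sizes of regular files under root using file attributes only. */
    static long sizeOf(Path root) throws IOException {
        final long[] total = {0L};
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                // Symbolic links are not followed, so only real files are counted.
                if (attrs.isRegularFile()) {
                    total[0] += attrs.size();
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                // Skip files that disappear or cannot be read during the walk.
                return FileVisitResult.CONTINUE;
            }
        });
        return total[0];
    }
}
```

Nothing here touches the job/build cache: the walk reads only directory entries and file attributes, so memory use does not grow with the number of builds.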
Comment 1 Bob Foster 2015-02-02 12:32:40 EST
The current plan is to incorporate disk-usage functionality into core and remove the disk-usage plugin.

The _runmap.xml files will be extended to incorporate disk usage information, per build and for the job. The job disk usage will include the trend data, the size of which will still be settable in Configure Hudson in the same way.

In the meantime, the disk-usage plugin for versions earlier than 3.3.0 will be rewritten to mimic the above behavior as nearly as possible, by keeping an extra cache file alongside _runmap.xml with the additional usage information.
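
As a rough illustration of the interim approach, the sketch below keeps per-build sizes in a small side file in the same directory as _runmap.xml. Everything specific here is an assumption made for the example: the file name _diskusage.properties, the properties format, and the class UsageCacheSketch are hypothetical, and the real plugin may well use a different name and layout.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Hypothetical side cache; file name and format are assumptions. */
class UsageCacheSketch {
    static final String CACHE_FILE = "_diskusage.properties"; // assumed name

    /** Writes build-number -> bytes entries next to _runmap.xml in dir. */
    static void save(Path dir, Map<String, Long> bytesPerBuild) throws IOException {
        Properties props = new Properties();
        for (Map.Entry<String, Long> e : bytesPerBuild.entrySet()) {
            props.setProperty(e.getKey(), Long.toString(e.getValue()));
        }
        try (OutputStream out = Files.newOutputStream(dir.resolve(CACHE_FILE))) {
            props.store(out, "disk usage in bytes, keyed by build number");
        }
    }

    /** Reads the cache back; returns an empty map if no cache file exists. */
    static Map<String, Long> load(Path dir) throws IOException {
        Map<String, Long> result = new TreeMap<>();
        Path file = dir.resolve(CACHE_FILE);
        if (!Files.exists(file)) {
            return result;
        }
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(file)) {
            props.load(in);
        }
        for (String key : props.stringPropertyNames()) {
            result.put(key, Long.parseLong(props.getProperty(key)));
        }
        return result;
    }
}
```

With a file like this, the usage page could be rendered from the cached numbers alone, and only builds missing from the cache would need their directories measured.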