Bug 231936 - Improve API description memory usage
Summary: Improve API description memory usage
Status: VERIFIED FIXED
Alias: None
Product: PDE
Classification: Eclipse Project
Component: API Tools
Version: 3.4
Hardware: PC Windows XP
Importance: P3 major
Target Milestone: 3.6 M5
Assignee: Michael Rennie CLA
QA Contact:
URL:
Whiteboard:
Keywords: performance
Depends on: 246139 247509 247510 255222 266695
Blocks: 265525 280464
Reported: 2008-05-13 17:00 EDT by Michael Rennie CLA
Modified: 2010-03-01 09:56 EST

Attachments
work in progress (14.13 KB, patch)
2009-04-23 14:34 EDT, Michael Rennie CLA
fix for on-disk descriptions (42.66 KB, patch)
2009-04-24 14:29 EDT, Michael Rennie CLA

Description Michael Rennie CLA 2008-05-13 17:00:16 EDT
Currently the single largest use of memory in API tools is the API descriptions, since descriptions are directly dependent on the API package / type count. We should look at ways to minimize the use of LRU / MRU caches for components / packages, and we should examine our usage of HashMaps (avoid default-sized maps if smaller will suffice, and make intelligent guesses at default map sizes to minimize map resizing / rehashing).
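
As a rough illustration of the HashMap sizing point, here is a minimal sketch. MapSizing and its methods are hypothetical names, not existing API tools code. It assumes the default HashMap load factor of 0.75 and a caller that has a reasonable guess at the entry count.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of API tools.
final class MapSizing {
    private MapSizing() {}

    // Initial capacity that lets a HashMap hold 'expected' entries
    // without resizing, assuming the default load factor of 0.75.
    static int capacityFor(int expected) {
        if (expected < 0) {
            throw new IllegalArgumentException("expected must be >= 0");
        }
        return (int) Math.ceil(expected / 0.75d);
    }

    // Creates a map pre-sized for the expected number of entries.
    static <K, V> Map<K, V> newMap(int expected) {
        return new HashMap<>(Math.max(1, capacityFor(expected)));
    }
}
```

A description that is known to hold only a handful of members could call MapSizing.newMap(3) instead of new HashMap<>(), which would otherwise allocate a 16-slot table on first insert.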

Some classes to consider (which hold, or are part of, the largest amount of retained memory):

ApiDescriptionManager
ApiProfile
ProjectApiDescription
BundleApiComponent
ReferenceTypeDescriptorImpl
FieldDescriptorImpl
Comment 1 Markus Keller CLA 2008-05-19 11:12:40 EDT
I just ran out of memory with -Xmx400. 100MB of the heap was used by the API Tools.
Comment 2 Mike Wilson CLA 2008-09-11 16:44:17 EDT
Any plans to fix this?
Comment 3 Michael Rennie CLA 2008-09-12 10:16:37 EDT
yes, it is up next, now that the builder test suite is complete.
Comment 4 Darin Wright CLA 2008-09-16 10:27:54 EDT
(In reply to comment #2)
> Any plans to fix this?

We have several work items in 3.5 that will address memory usage by API tools:

* Migrate architecture to support binary or source analysis - Currently, API tools only analyzes binaries. As we move to support source analysis we will leverage the bounded Java model element cache. We will use a similar caching strategy for analysis in a non-OSGi/non-workspace environment (i.e. releng-build).

* Investigate integration as Java compilation participant - Currently API tools uses its own builder and has to create its own ASTs, incremental dependency analysis, etc. If we can piggyback as a compilation participant, the total amount of work during a build should be reduced/shared with the Java builder.

However, we should also add a work item to explicitly examine memory use by API descriptions. Currently there is one description per plug-in project - we may need to bound this with some MRU/LRU caching scheme.
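
One possible shape for such a bound, as a sketch only: an LRU cache built on java.util.LinkedHashMap in access order. DescriptionCache and the maxEntries parameter are hypothetical names; nothing here is taken from the API tools code base.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded cache; the least recently used entry is
// dropped once more than maxEntries descriptions are held.
final class DescriptionCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    DescriptionCache(int maxEntries) {
        // accessOrder = true makes iteration order least-recently-used first
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by put(); returning true evicts the eldest entry
        return size() > maxEntries;
    }
}
```

An evicted description would have to be rebuilt on the next lookup, so the bound trades memory for extra work during a build.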
Comment 5 Michael Rennie CLA 2008-09-16 11:11:51 EDT
created bug 247509 for the re-architecting
Comment 6 Michael Rennie CLA 2008-09-16 11:20:15 EDT
created bug 247510 for the builder participant
Comment 7 Michael Rennie CLA 2008-10-28 17:20:17 EDT
bug 246139 removed invalid field items that were bloating API descriptions
Comment 8 Michael Rennie CLA 2008-10-30 10:02:05 EDT
bug 246139 found that we were leaking annotations into an API description when an invalid tag on a field should have been ignored.
Comment 9 Darin Wright CLA 2008-11-20 15:13:14 EST
See bug 256006, which has been fixed. We no longer keep entire manifests in memory - only the few (4) headers that we need once BundleDescription objects have been created.
Comment 10 Michael Rennie CLA 2009-03-03 09:10:27 EST
adding bug 266695 as a dependency, since part of the changes in its patch reduces the number of members that appear in the .api_description file, which has a noticeable effect for larger bundles like jdt.ui
Comment 11 Michael Rennie CLA 2009-03-03 09:30:25 EST
one last area we should consider fixing is API types that have no restrictions appearing in the .api_description file. There is no need for them to be there (since they are unrestricted) unless they have a child (member) that has restrictions.

Take for example the .api_description for jdt.ui, where we have entries like:

<type handle="=org.eclipse.jdt.ui/ui&lt;org.eclipse.jdt.ui.text.java{IJavaCompletionProposal.java[IJavaCompletionProposal" modificationStamp="1" restrictions="0" visibility="0"/>

type entries that are unrestricted should only show up when a member has restrictions, like:

<type handle="=org.eclipse.jdt.ui/ui&lt;org.eclipse.jdt.ui{ProblemsLabelDecorator.java[ProblemsLabelDecorator" modificationStamp="2" restrictions="0" visibility="0">
            <method name="&lt;init&gt;" restrictions="8" signature="(Lorg/eclipse/jdt/internal/ui/viewsupport/ImageDescriptorRegistry;)V" visibility="0"/>
        </type>
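
The rule can be sketched as a filter applied before writing the file. This is an illustration only: Member, TypeEntry and DescriptionPruner are hypothetical stand-ins, not the real descriptor classes, and the sketch looks at the restrictions value alone (0 meaning unrestricted, as in the entries above) and ignores visibility.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the entries in an .api_description file.
final class Member {
    final String name;
    final int restrictions;

    Member(String name, int restrictions) {
        this.name = name;
        this.restrictions = restrictions;
    }
}

final class TypeEntry {
    final String name;
    final int restrictions;
    final List<Member> members;

    TypeEntry(String name, int restrictions, List<Member> members) {
        this.name = name;
        this.restrictions = restrictions;
        this.members = members;
    }
}

final class DescriptionPruner {
    private DescriptionPruner() {}

    // A type is kept if it is restricted itself or has a restricted member.
    static boolean worthWriting(TypeEntry type) {
        if (type.restrictions != 0) {
            return true;
        }
        for (Member m : type.members) {
            if (m.restrictions != 0) {
                return true;
            }
        }
        return false;
    }

    // Returns only the type entries that need to appear in the file.
    static List<TypeEntry> prune(List<TypeEntry> types) {
        List<TypeEntry> kept = new ArrayList<>();
        for (TypeEntry t : types) {
            if (worthWriting(t)) {
                kept.add(t);
            }
        }
        return kept;
    }
}
```

With the two entries above as input, IJavaCompletionProposal would be dropped and ProblemsLabelDecorator kept because of its restricted constructor.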

Comment 12 Darin Wright CLA 2009-03-03 10:19:49 EST
API descriptions for binary bundles are sparse as they have static descriptions. API descriptions for workspace projects are not sparse, as they are dynamic. When we look up a description, we dynamically parse the associated compilation unit for API tags (the result is cached with a timestamp in the description). If we don't cache "empty" results, I think there is an issue with being able to tell that there are no tags vs. we have not yet parsed to find out there are no tags.

I think it is safe to not cache types from non-API packages (since we don't analyze descriptions/tags on non-API types), as long as we retain that a package is non-API.
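
To make the "no tags" vs. "not yet parsed" distinction concrete, here is a minimal sketch. TagCache and its State enum are hypothetical names and do not reflect how the real description stores its entries; timestamps are left out for brevity.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical cache of parse results, keyed by type name.
final class TagCache {
    enum State { NOT_PARSED, NO_TAGS, HAS_TAGS }

    // An absent key means the compilation unit has not been parsed yet;
    // a stored 0 means it was parsed and no restrictions were found.
    private final Map<String, Integer> restrictionsByType = new HashMap<>();

    void record(String typeName, int restrictions) {
        restrictionsByType.put(typeName, restrictions);
    }

    State stateOf(String typeName) {
        Integer r = restrictionsByType.get(typeName);
        if (r == null) {
            return State.NOT_PARSED;
        }
        return r == 0 ? State.NO_TAGS : State.HAS_TAGS;
    }
}
```

Dropping the zero entries would save memory, but stateOf could then no longer tell NO_TAGS apart from NOT_PARSED, and every lookup of an unrestricted type would trigger a re-parse.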
Comment 13 Michael Rennie CLA 2009-04-23 14:34:36 EDT
Created attachment 132994 [details]
work in progress

here is an updated version of the patch that was reverted for bug 266695. the only problem with the original patch was that it did not consider derived visibilities.
Comment 14 Michael Rennie CLA 2009-04-24 14:29:27 EDT
Created attachment 133160 [details]
fix for on-disk descriptions

This patch optimizes our on-disk API descriptions. It also fixes up the API description export task, which was reporting that it was the latest version (1.2) but was still writing out 1.0 version attributes.
Comment 15 Michael Rennie CLA 2009-04-24 15:56:54 EDT
applied patch for on-disk API descriptions
Comment 16 Michael Rennie CLA 2009-05-05 15:15:13 EDT
moving to 3.6, as we have had limited success pruning the in-memory descriptions without causing a lot more processor work during a build - more investigation is needed for a solid solution.
Comment 17 Darin Wright CLA 2009-08-21 14:50:14 EDT
The fix to bug 286408 reduces memory used by project API descriptions. The API type containers are no longer duplicated in the API description - they are now retrieved lazily from the API workspace baseline.
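
The general idea of looking a container up on demand instead of holding a second copy can be sketched as follows. ProjectDescriptionSketch and baselineLookup are hypothetical names; this is not the code from bug 286408, only an illustration of the pattern.

```java
import java.util.function.Function;

// Hypothetical description that keeps no type container of its own
// and asks the baseline for it whenever it is needed.
final class ProjectDescriptionSketch<C> {
    private final String projectName;
    private final Function<String, C> baselineLookup;

    ProjectDescriptionSketch(String projectName, Function<String, C> baselineLookup) {
        this.projectName = projectName;
        this.baselineLookup = baselineLookup;
    }

    // Resolved on each call, so the baseline stays the single owner
    // of the container.
    C typeContainer() {
        return baselineLookup.apply(projectName);
    }
}
```

The description then costs one string and one function reference per project, regardless of how large the container is.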
Comment 18 Darin Wright CLA 2010-01-12 09:20:54 EST
Marking fixed. Bug 296487 also reduces API tooling memory footprint (sharing OSGi state with PDE).
Comment 19 Darin Wright CLA 2010-01-12 09:32:17 EST
(marking fixed).