20040604
New workspace, import all projects as binary/linked.

1. Search for all types named *. This results in 22.8k types. The operation takes 5 minutes, and my task manager shows an increase of > 200M during the operation (from 60M to 290M). Both of these numbers seem high.
2. Pressing "group by type" makes Eclipse non-responsive for 10 minutes. (This is related to bug 63247 (treeviewer slow with many items); annotating that PR to reflect this case.)
Tested the following scenario:
- all 3.0 plug-ins in linked binary format
- 22812 types
- opened Eclipse and forced all indexes to be built
- closed/reopened Eclipse
- opened search result view
- opened search dialog
- took memory snapshot
- searched for all types (*, declaration)
- memory went up ~200 MB
- took memory snapshot

The actual diff between the two snapshots is 28MB (see attached screenshot). So it seems that we are generating a lot of intermediate garbage.
Created attachment 11631 [details] The difference between Heap before and Heap after search for all types
I created another workspace containing the following projects as binary:
- org.apache.ant
- org.eclipse.core.resources
- org.eclipse.core.resources.win32
- org.eclipse.core.runtime
- org.eclipse.core.runtime.compatibility
- org.eclipse.jdt.core
- org.eclipse.osgi
- org.eclipse.team.core
- org.eclipse.text
- org.eclipse.update.configurator

Searching for all types in these projects, excluding the runtime jars, produces a result of 2078 types. While computing the result, temporary objects with a total size of ~400MB are created. I will upload the memory dump to our ftp server on Monday. Thomas, please have a look at the dump. Philippe, can you please look at the dump as well? It seems that a lot of objects are allocated in JDT/Core.
Created attachment 11632 [details] Screen shot showing the allocated objects
I've looked at the allocation statistics in the 2000 type snapshot. 95% of all allocations happen inside JavaSearchQuery.run(...), which pretty much just calls the search engine. However, only 1% of the allocations occur inside the SearchRequestor I pass to the search engine (see NewSearchResultCollector.acceptSearchMatch(...)). If I understand this right, 94% of the allocations happen in the SearchEngine in Core, outside of our influence. Unless I misunderstand the trace, there's nothing I can do on the search UI side. Moving to JDT-Core.
Scanner#getLineEnds() alone is allocating close to 20MB of transient objects (copies of line tables). Reduced this number significantly by reworking clients. There were 2 hot spots: JavadocParser creating a copy for every single Javadoc, and source element requestors (which should have taken the existing copy from the compilation result). Early measurements show the transient memory for this drops to 3MB.
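The cost pattern described above can be illustrated with a small sketch. This is not the JDT Core code: `LineTable`, `LineEndsDemo`, and their methods are hypothetical names, and the only assumption carried over from the comment is that the getter returns a fresh copy of the line table on every call. A client that asks for the table once per Javadoc comment pays for one copy per comment, while a client that fetches it once and reuses it pays for a single copy.

```java
import java.util.Arrays;

// Hypothetical sketch; class and method names are illustrative, not JDT Core API.
class LineEndsDemo {
    // Maps a source position to a 1-based line number using the line-end table.
    static int lineOf(int[] lineEnds, int position) {
        int idx = Arrays.binarySearch(lineEnds, position);
        return (idx >= 0 ? idx : -idx - 1) + 1;
    }

    // Wasteful client: requests a fresh copy of the table for every comment.
    static int copyPerComment(LineTable table, int[] commentStarts) {
        for (int start : commentStarts) {
            lineOf(table.getLineEnds(), start);
        }
        return table.copiesMade;
    }

    // Reworked client: fetches the table once and reuses it for all comments.
    static int copyOnce(LineTable table, int[] commentStarts) {
        int[] shared = table.getLineEnds();
        for (int start : commentStarts) {
            lineOf(shared, start);
        }
        return table.copiesMade;
    }

    public static void main(String[] args) {
        int[] ends = {9, 19, 29};
        int[] comments = {5, 12, 25};
        System.out.println("copies (per comment): " + copyPerComment(new LineTable(ends), comments));
        System.out.println("copies (shared):      " + copyOnce(new LineTable(ends), comments));
    }
}

// Stand-in for a scanner whose getter hands out a defensive copy on each call.
class LineTable {
    private final int[] lineEnds;
    int copiesMade = 0;

    LineTable(int[] lineEnds) {
        this.lineEnds = lineEnds;
    }

    int[] getLineEnds() {
        copiesMade++;
        return Arrays.copyOf(lineEnds, lineEnds.length);
    }
}
```

With many Javadoc comments per file and large line tables, the per-comment variant allocates transient arrays in proportion to (comments x lines), which is the kind of garbage the measurements above point at.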
Rescheduling for RC3. I have a change in progress which should halve the source char[] stored in the scanner (by making unicode support lazier).
On a smaller testcase (linked jdtcore with prereqs), a search for '*' type decls allocated 1,474 megs of transient memory. With the above scanner optimization, it drops to 971 megs.
Released scanner changes to HEAD; JCK tests are ok (lots of unicode tests). Need to further check the search engine behavior on this scenario:
- it should not resolve any type name to find type declarations
- would reducing the number of units processed at once benefit search (500 -> 250?)

Note: search is usually more memory intensive than build, since it will go and parse source attachments for all binaries (where build simply skips binaries). So search is always dealing with a larger amount of source to process.
Several problems remain:

1. The Java model is populated when it is not necessary:
   - By using IType#isMember(), #isLocal() and #isAnonymous(), SourceMapper#findSourceFileName(IType, IBinaryType) forces a new ClassFileReader to be created even though we already have one in hand: the IBinaryType.
     -> Propose to change #findSourceFileName(...) to use IBinaryType#isMember(), #isLocal() and #isAnonymous() instead.
   - By using IMember#getNameRange(), MatchLocator#reportBinaryMemberDeclaration(IResource, IMember, IBinaryType, int) forces the IMember to be opened.
     -> Propose to change #reportBinaryMemberDeclaration(...) to use the SourceMapper directly to find the name range if the member is not opened.
2. Resolution of possible matches is always requested even if the search pattern doesn't need it.
   -> Propose to change MatchLocator#locateMatches(JavaProject, PossibleMatch[], int, int) to skip the resolution and process each possible match if the pattern doesn't need resolution.
3. ASTNodes are kept in the MatchingNodeSet after a possible match has been processed.
   -> Propose to change MatchLocator#locateMatches(JavaProject, PossibleMatch[], int, int) to nullify the node set when done with the possible match, as we do for the source field.

With these 4 changes, I'm able to search for all types in a workspace containing all Eclipse SDK plugins without increasing the VM's maximum Java heap size. All JDT Core and JDT UI tests are green.
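Items 2 and 3 above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the attached patch: `LocatorSketch` and `PossibleMatchSketch` are hypothetical stand-ins for MatchLocator and PossibleMatch, the `mustResolve` flag stands in for whatever the pattern reports about needing resolution, and a list of strings stands in for the MatchingNodeSet.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; these classes are illustrative, not the JDT Core types.
class LocatorSketch {
    int resolved = 0;  // how many possible matches went through (expensive) resolution
    int reported = 0;  // how many matching nodes were reported

    void locateMatches(List<PossibleMatchSketch> matches, boolean mustResolve) {
        for (PossibleMatchSketch match : matches) {
            if (mustResolve) {
                resolved++; // placeholder for binding resolution; skipped when not needed
            }
            reported += match.nodeSet.size();
            // Release per-match data as soon as the match has been processed,
            // so it does not stay reachable for the rest of the search.
            match.source = null;
            match.nodeSet = null;
        }
    }

    public static void main(String[] args) {
        List<PossibleMatchSketch> matches = new ArrayList<>();
        matches.add(new PossibleMatchSketch("class A {}", "A"));
        matches.add(new PossibleMatchSketch("class B { class C {} }", "B", "C"));
        LocatorSketch locator = new LocatorSketch();
        locator.locateMatches(matches, false); // a '*' type declaration search needs no resolution
        System.out.println("resolved=" + locator.resolved + ", reported=" + locator.reported);
    }
}

// Stand-in for a possible match holding its source and its set of matching AST nodes.
class PossibleMatchSketch {
    char[] source;
    List<String> nodeSet = new ArrayList<>();

    PossibleMatchSketch(String source, String... nodes) {
        this.source = source.toCharArray();
        for (String node : nodes) {
            nodeSet.add(node);
        }
    }
}
```

The point of clearing both fields is that the array of possible matches outlives each individual match, so anything still referenced from an already-processed entry cannot be collected until the whole batch is done.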
Reporting progress also slows the whole process considerably. We report progress for each possible match. Changing this to report progress once per batch of 500 possible matches makes the whole search twice as fast.
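A minimal sketch of batched progress reporting, assuming a monitor that accepts a count of work units. The `BatchedProgress` class and the `IntConsumer` callback are hypothetical and stand in for the real progress monitor, and the item count in `main` is made up for illustration.

```java
import java.util.function.IntConsumer;

// Hypothetical sketch; not the JDT Core implementation.
class BatchedProgress {
    /**
     * Processes `total` items and notifies the monitor once per `batchSize`
     * items (plus once for any remainder). Returns the number of notifications.
     */
    static int process(int total, int batchSize, IntConsumer monitor) {
        int reports = 0;
        int pending = 0;
        for (int i = 0; i < total; i++) {
            // ... process possible match i here ...
            pending++;
            if (pending == batchSize) {
                monitor.accept(pending);
                reports++;
                pending = 0;
            }
        }
        if (pending > 0) {
            monitor.accept(pending); // flush the last partial batch
            reports++;
        }
        return reports;
    }

    public static void main(String[] args) {
        int[] worked = {0};
        int reports = process(1250, 500, n -> worked[0] += n);
        // prints "3 reports for 1250 units"
        System.out.println(reports + " reports for " + worked[0] + " units");
    }
}
```

The total amount of work reported is unchanged; only the number of calls into the monitor drops, which matters when each call is expensive on the UI side.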
Jerome: pls attach patch to this defect.
Created attachment 12160 [details] Proposed patch
Entered bug 67276 against Platform Search for the slowness in their progress monitor.
Testing with the following scenario:
- workspace with org.eclipse.jdt.core as a linked binary project
- JRE: JDK1.4.2
- group by project in the search view
- search for '*' Type Declarations in Workspace

11 169 matches are found.

With 3.0 RC2:
- memory peak: 403 124 K
- time to find all the matches: 1 minute 50 sec

With 3.0 RC2 + Philippe's changes + attached patch:
- memory peak: 85 240 K
- time to find all the matches: 32 sec
Impressive numbers.
Approved by John and me for RC3.
Patch + change to batch progress reporting released in HEAD.
Verified for 3.0RC3 I200406180010