Bug 354332 - DeltaProcessor exhibits O(N^2) behavior
Summary: DeltaProcessor exhibits O(N^2) behavior
Status: VERIFIED FIXED
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Core (show other bugs)
Version: 3.8   Edit
Hardware: PC Linux
: P3 major with 3 votes (vote)
Target Milestone: 3.8 M3   Edit
Assignee: Jay Arthanareeswaran CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-08-10 02:54 EDT by Terry Parker CLA
Modified: 2012-01-09 06:37 EST (History)
8 users (show)

See Also:


Attachments
Moves the call to IJavaProject.resetResolvedClasspath() out of DeltaProcessor.createExternalArchiveDelta()'s loop (2.58 KB, text/plain)
2011-08-10 02:56 EDT, Terry Parker CLA
jarthana: iplog+
Details
Patch with performance test (6.65 KB, patch)
2011-09-07 04:44 EDT, Jay Arthanareeswaran CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Terry Parker CLA 2011-08-10 02:54:22 EDT
Build Identifier: I20110613-1736

We have been seeing instances where a trigger of DeltaProcessor.createExternalArchiveDelta() is hanging Eclipse for multiple minutes for projects with large classpaths (~5000 external jars). That function loops over all of the library jars in the classpath.  In that loop, if a jar has changed it calls IJavaProject.resetResolvedClasspath() "in case it contains a chained jar" (this was introduced in 1.330 to fix http://bugs.eclipse.org/198572).

The problem we are seeing is that on a big update where nearly all of the jars on the classpath are being updated, IProject.resolveClasspath() is reconstructing the resolved classpath needlessly over and over again, causing O(M*N) accesses to jar files, where N is the number of jars on the classpath and M is the number of changed jars.

The attached patch removes the call to IProject.resolveClasspath() from inside the loop and instead keeps a boolean value that tracks whether JavaModelManager.isNonChainingJar() ever returns true and calls IProject.resolveClasspath() only if needed.  isNonChainingJar() is backed by a HashSet.

I tried to follow all of the code paths to see if there would be any side effects to delaying the resolving of chained jars completely outside of the loop.  The bottom half of the loop seemsonly to be creating the delta tree, and the upper part of the loop is only scheduling rather than performing indexer updates, so it looked safe to pull out to me, but I may have missed something.

I did some timing tests on a contrived example where I coppied a single jar hundreds to thousands of times and added all of those copies to the classpath, then updated all of those jar copies with a different jar, and then immediately called IJavaModel.refreshExternalArchives() (whether DeltaProcessor.createExternalArchiveDelta() actually sees changed jar files seems to depend on whether an earlier call to DeltaProcessor.resourceChanged() runs first over all the "classpath change deltas" before the "external archive delta" runs.

                   @HEAD  PATCHED
100 jars updated:     28       23
100 jars updated:     21       28
100 jars updated:     28       41
5000 jars updated: 11617     2419
5000 jars updated: 11644     2841
5000 jars updated: 12376     2686

That is a 4x speed up.  In a more realistic project this change brought the number from ~2 minutes down to 4 seconds.


Reproducible: Always

Steps to Reproduce:
1. Duplicate a jar 5000 times
2. IProject.setRawClasspath() with those jars
3. Change those jars copies to a different jar
4. Call jproject.getJavaModel().refreshExternalArchives()
Comment 1 Terry Parker CLA 2011-08-10 02:56:59 EDT
Created attachment 201207 [details]
Moves the call to IJavaProject.resetResolvedClasspath() out of DeltaProcessor.createExternalArchiveDelta()'s loop
Comment 2 Srikanth Sankaran CLA 2011-08-10 23:34:41 EDT
Jay, please follow up. Thanks.
Comment 3 Jay Arthanareeswaran CLA 2011-08-18 03:59:50 EDT
In an isolated performance test, I see about 36-42% improvement in terms of elapsed process time when I tried with 10 and 100 jars respectively. Good work!

I will take a look at the patch closely again and see if this will have any side-effects.
Comment 4 Jay Arthanareeswaran CLA 2011-08-29 07:11:45 EDT
Patch looks good to me. Besides all existing tests pass. 
Satyam, what do you think? Do we need a new performance test for this scenario?
Comment 5 Satyam Kandula CLA 2011-09-05 01:44:09 EDT
(In reply to comment #4)
> Patch looks good to me. Besides all existing tests pass. 
> Satyam, what do you think? Do we need a new performance test for this scenario?
It will be good to have a test for this.
Comment 6 Jay Arthanareeswaran CLA 2011-09-05 10:21:23 EDT
I have a performance test at hand, which I will add to the patch after some clean-up and one more round of comparison runs.
Comment 7 Jay Arthanareeswaran CLA 2011-09-07 04:44:11 EDT
Created attachment 202867 [details]
Patch with performance test

The same fix with new performance test. Here are some numbers from the test for refreshing 500 jars:

=================
Without Fix:
=================

Scenario 'org.eclipse.jdt.core.tests.performance.FullSourceWorkspaceModelTests#testRefreshExternalArchives()' (average over 10 samples):
  Used Java Heap:        10.11M         (95% in [6.35M, 13.88M])       Measurable effect: 6.66M (1.3 SDs) (required sample size for an effect of 5% of mean: 1735)
  Working Set:           -33.2K         (95% in [-132.85K, 66.45K])    Measurable effect: 176.19K (1.3 SDs) (required sample size for an effect of 5% of stdev: 6401)
  Committed:             -65.2K         (95% in [-257.83K, 127.43K])   Measurable effect: 340.6K (1.3 SDs) (required sample size for an effect of 5% of stdev: 6400)
  Working Set Peak:          0          (95% in [0, 0])               
  Elapsed Process:       21.14s         (95% in [20.35s, 21.94s])      Measurable effect: 1.4s (1.3 SDs) (required sample size for an effect of 5% of mean: 18)
  Kernel time:           13.94s         (95% in [13.3s, 14.59s])       Measurable effect: 1.14s (1.3 SDs) (required sample size for an effect of 5% of mean: 28)
  Page Faults:              74          (95% in [26, 122])             Measurable effect: 85 (1.3 SDs) (required sample size for an effect of 5% of mean: 5273)
  CPU Time:              19.97s         (95% in [19.05s, 20.88s])      Measurable effect: 1.62s (1.3 SDs) (required sample size for an effect of 5% of mean: 27)
  GDI Objects:               0          (95% in [0, 0])       


=============
With Fix:
=============

Scenario 'org.eclipse.jdt.core.tests.performance.FullSourceWorkspaceModelTests#testRefreshExternalArchives()' (average over 10 samples):
  Used Java Heap:        23.08M         (95% in [20.53M, 25.63M])      Measurable effect: 4.51M (1.3 SDs) (required sample size for an effect of 5% of mean: 153)
  Working Set:            14.4K         (95% in [-34.78K, 63.58K])     Measurable effect: 86.96K (1.3 SDs) (required sample size for an effect of 5% of stdev: 6400)
  Committed:              59.6K         (95% in [-60.71K, 179.91K])    Measurable effect: 212.74K (1.3 SDs) (required sample size for an effect of 5% of stdev: 6401)
  Working Set Peak:          0          (95% in [0, 0])               
  Elapsed Process:        8.43s         (95% in [8.31s, 8.55s])        Measurable effect: 216ms (1.3 SDs)
  Kernel time:             6.2s         (95% in [5.97s, 6.43s])        Measurable effect: 408ms (1.3 SDs) (required sample size for an effect of 5% of mean: 18)
  Page Faults:              37          (95% in [22, 52])              Measurable effect: 26 (1.3 SDs) (required sample size for an effect of 5% of mean: 1991)
  CPU Time:               8.25s         (95% in [7.94s, 8.55s])        Measurable effect: 541ms (1.3 SDs) (required sample size for an effect of 5% of mean: 18)
  GDI Objects:               0          (95% in [0, 0])
Comment 8 Jay Arthanareeswaran CLA 2011-09-09 04:52:53 EDT
Released in HEAD with performance test. Thanks for the patch, Terry!
Comment 9 Terry Parker CLA 2011-09-09 11:49:42 EDT
Thanks for getting to the patch so quickly, Jay.

FYI,I patched this into the 3.7.0 JDT release and pushed it to all of our internal Google developers, so this will get a fair bit of usage before the 3.8 release.
Comment 10 Jay Arthanareeswaran CLA 2011-09-16 07:34:08 EDT
Reopening the bug since it's causing the regression bug 357425.
Comment 11 Olivier Thomann CLA 2011-09-16 08:20:39 EDT
Moving to M3. Will be closed again once bug 357425 is fixed.
Comment 12 Srikanth Sankaran CLA 2011-10-19 08:54:08 EDT
(In reply to comment #11)
> Moving to M3. Will be closed again once bug 357425 is fixed.

Jay, does this need to stay open anymore ?
Comment 13 Jay Arthanareeswaran CLA 2011-10-19 09:02:52 EDT
(In reply to comment #12)
> Jay, does this need to stay open anymore ?

No, it doesn't. As bug 357425 has been resolved, this bug can also be marked resolved.
Comment 14 Stephan Herrmann CLA 2011-10-25 09:25:42 EDT
Verified for 3.8 M3 using build N20111022-2000

For performance data I'm relying on comment 7.
The regression reported in bug 357425 has been fixed and verified, too.
Comment 15 Terry Parker CLA 2012-01-04 11:42:16 EST
Can we get this patch and the one in bug 357425 applied to the 3.7 maintenance branch as well?
Comment 16 Jay Arthanareeswaran CLA 2012-01-04 12:28:53 EST
(In reply to comment #15)
> Can we get this patch and the one in bug 357425 applied to the 3.7 maintenance
> branch as well?

Dani, what do you think? This is a performance improvement.
Comment 17 Dani Megert CLA 2012-01-05 05:25:48 EST
(In reply to comment #16)
> (In reply to comment #15)
> > Can we get this patch and the one in bug 357425 applied to the 3.7 maintenance
> > branch as well?
> 
> Dani, what do you think? This is a performance improvement.

-1 to backport. It's not fixing a critical issue and already caused a regression.

Also, it looks like the performance test does not work as it was probably not backported to the reference branch. If so, please do so now.
Comment 18 Jay Arthanareeswaran CLA 2012-01-05 06:59:04 EST
(In reply to comment #17)
> Also, it looks like the performance test does not work as it was probably not
> backported to the reference branch. If so, please do so now.

The performance test is already present in R3_7_maintenance, but looks like perf_37x branch has not been created yet. I am creating it now.
Comment 19 Jay Arthanareeswaran CLA 2012-01-06 03:25:51 EST
(In reply to comment #18)
> The performance test is already present in R3_7_maintenance, but looks like
> perf_37x branch has not been created yet. I am creating it now.

I have pushed the new performance test to perf_37x branch.
Comment 20 Jay Arthanareeswaran CLA 2012-01-09 00:23:55 EST
(In reply to comment #19)
> I have pushed the new performance test to perf_37x branch.

The perf_37x branch has been tagged with the performance test and maps files updated accordingly. We should see the test results from next build.
Comment 21 Dani Megert CLA 2012-01-09 06:37:08 EST
(In reply to comment #20)
> (In reply to comment #19)
> > I have pushed the new performance test to perf_37x branch.
> 
> The perf_37x branch has been tagged with the performance test and maps files
> updated accordingly. We should see the test results from next build.

Nope. It looks like you tagged just today, hence we have to wait for the next baseline build (usually on Friday).