Community
Participate
Working Groups
Created attachment 225063 [details] profiler session showing the quatratic number of calls to TreeWalk.isPathPrefix() This issue was introduced by commit c9e507c: https://github.com/eclipse/jgit/commit/c9e507c0b9fe9a91ae65dfa8a3bcaa1f739a5686 and related to the file: org.eclipse.jgit/src/org/eclipse/jgit/dircache/DirCache.java I have a use case for which this commit c9e507c actually decreases the performance by 500x when invoking the method dirCacheBuilder.getDirCache().write(). In this scenario: 1) create a new working repository 2) insert 10 random blobs of average size 1MB using objectInserter.insert(); 3) add 100,000 entries into the dirCache, in a for loop, each time creating a new DirCacheEntry at a random path, setting objectId to and objectId from (2) and then dirCacheBuilder.add(dirCacheEntry0); 4) dirCacheBuilder.finish(); 5) dirCacheBuilder.getDirCache().write(); 6) dirCacheBuilder.getDirCache().commit(); the performance issue is noted on step 5) dirCacheBuilder.getDirCache().write(); with this test case on versions: 1.3.0.201202151440-r, 2.0.0.201206130900-r: step 5 executes in approx 150ms with this test case on versions: 2.1.0.201209190230-r, 2.2.0.201212191850-r, 2.3.0-SNAPSHOT (master 706f8eb 20/12/2012): step 5 executes in approx 66,000 ms I appreciate that commit c9e507c introduces an improvement for a different, more typical use case, however this same commit has negative performance effects (for example in the test case illustrated 500x performance decrease). I can send the code of the test case which I have used to interested parties. -- as can been seen by the following profiler session (attached), the performance issue introduced by this commit seems to be due to the quadratic nature of the call which DirCache.updateSmudgedEntries() makes to TreeWalk.isPathPrefix(). The number of invocations of TreeWalk.isPathPrefix() is approximately 0.5*N*N, (where N is the number of DirEntry's).
affected versions are: 2.1.0.201209190230-r, 2.2.0.201212191850-r, 2.3.0-SNAPSHOT issue does not affect versions: 1.3.0.201202151440-r, 2.0.0.201206130900-r.
Created attachment 225064 [details] testcase maven project
Perhaps this one solves the problem: https://git.eclipse.org/r/#/c/9411/
(In reply to comment #3) > Perhaps this one solves the problem: > > https://git.eclipse.org/r/#/c/9411/ Hi Robin, thanks, I tried out your patch on my test case, and list results compared to other versions. Test scenario: "step 5) dirCacheBuilder.getDirCache().write(); with 100,000 entries." * with this test case on versions: 1.3.0.201202151440-r, 2.0.0.201206130900-r: step 5 executes in approx 150 ms * with this test case on versions: 2.1.0.201209190230-r, 2.2.0.201212191850-r, 2.3.0-SNAPSHOT (master 706f8eb 20/12/2012): step 5 executes in approx 66,000 ms * with this test case on version: 2.3.0-SNAPSHOT (patched with https://git.eclipse.org/r/#/c/9411/): step 5 executes in approx 400 ms -- Therefore in this case the performance of the patch (https://git.eclipse.org/r/#/c/9411/) is much better than 2.1.0.201209190230-r, 2.2.0.201212191850-r, 2.3.0-SNAPSHOT (150x improvement), but not as good as 1.3.0.201202151440-r, 2.0.0.201206130900-r (2.5x slowdown). Performance of patch is anyway acceptable on this particular test case (commit on 100,000 entries).
Merged
The fix was not good. It caused an IAE on every of my repositories [1]. This indicates that the comparator is not correct. [1] !MESSAGE An internal error occurred during: "Re-indexing repository eclipse.jdt.debug". !STACK 0 java.lang.IllegalArgumentException: Comparison method violates its general contract! at java.util.TimSort.mergeLo(TimSort.java:747) at java.util.TimSort.mergeAt(TimSort.java:483) at java.util.TimSort.mergeCollapse(TimSort.java:408) at java.util.TimSort.sort(TimSort.java:214) at java.util.TimSort.sort(TimSort.java:173) at java.util.Arrays.sort(Arrays.java:659) at org.eclipse.jgit.treewalk.filter.PathFilterGroup$Group.<init>(PathFilterGroup.java:222) at org.eclipse.jgit.treewalk.filter.PathFilterGroup$Group.<init>(PathFilterGroup.java:179) at org.eclipse.jgit.treewalk.filter.PathFilterGroup.create(PathFilterGroup.java:143) at org.eclipse.jgit.treewalk.filter.PathFilterGroup.createFromStrings(PathFilterGroup.java:91) at org.eclipse.egit.core.internal.indexdiff.IndexDiffCacheEntry.calcIndexDiffDataIncremental(IndexDiffCacheEntry.java:400) at org.eclipse.egit.core.internal.indexdiff.IndexDiffCacheEntry.access$9(IndexDiffCacheEntry.java:381) at org.eclipse.egit.core.internal.indexdiff.IndexDiffCacheEntry$5.run(IndexDiffCacheEntry.java:341) at org.eclipse.core.internal.jobs.Worker.run(Worker.java:53)
I'm using org.eclipse.egit.core_2.3.0.201301082326.
Can you post the list of files, or a reference to a place where I can repeat the error. Short-term workaround(?) Set the system property java.util.Arrays.useLegacyMergeSort=true
(In reply to comment #8) > Can you post the list of files, or a reference to a place where I can repeat > the error. 1. start with fresh workspace with Eclipse that has latest EGit 2. clone the jgit repo (ssh://git.eclipse.org/gitroot/jgit/jgit.git) 3. import all projects ==> boom!
Reverted the change. Nothing near that code can work.
(In reply to comment #10) > Reverted the change. Nothing near that code can work. Verified in org.eclipse.jgit_2.3.0.201301091911 that the problem is gone.
New attempt at https://git.eclipse.org/r/#/c/9607/
Merged https://git.eclipse.org/r/#/c/9607/
I did not verify the performance improvement, but verified that it doesn't cause IAEs.