Created attachment 277744 [details]
YourKit snapshot showing the reconciler initialization method taking 10 hours to finish

In bug 544921 we report poor performance with a 370k-line source file. This is of course an extreme case; however, I see that SemanticHighlightingReconciler.reconcilePositions(ASTNode[]) takes 10 hours to finish once I open the source file. The work is done on a thread, not in a job, so once this huge file is opened and Eclipse is closed at some point, the Eclipse process lives on until the reconcile thread finishes. The time is spent solely in:

org.eclipse.jdt.core.dom.DefaultBindingResolver.internalGetTypeBinding(TypeBinding, IBinding)

There are two problems with this:

1. If we know that a file is too big, we should at least be able to turn off features that will run forever. Editing a 370k-line file in a Linux editor (e.g. KWrite) takes a few seconds for some operations (e.g. opening) but otherwise works.

2. The source file with 370k lines of code has a mostly trivial structure: some inner class definitions and a lot of public final fields. The file size is about 30 MB. It's absurd that any operation takes 10 hours on this amount of data. There must be some code with sub-optimal complexity somewhere in the stack traces of the attached YourKit snapshot.
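The "process lives on" symptom can be illustrated outside Eclipse: the JVM only exits once all non-daemon threads have finished, so a long reconcile on a plain thread keeps the process alive after the UI is gone. A minimal plain-Java sketch (hypothetical names, not the actual JDT code):

```java
// Sketch: a plain (non-daemon) worker thread keeps the JVM alive until it
// finishes, even after main() returns. An Eclipse Job, by contrast, can be
// cancelled on shutdown.
public class WorkerLifetime {
    public static void main(String[] args) {
        Thread worker = new Thread(() -> {
            // stands in for a long-running reconcile
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "reconciler");
        // worker.setDaemon(true); // with this, the JVM would not wait for it
        worker.start();
        // main() ends here, but the process only terminates once 'worker' does
        System.out.println("main done, worker isDaemon=" + worker.isDaemon());
    }
}
```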
Do note that in the attached YourKit snapshot I had experimentally replaced the concurrent hash maps of org.eclipse.jdt.core.dom.DefaultBindingResolver.BindingTables with non-concurrent ones. So the actual running time would be even longer than shown in the snapshot.
To reproduce to some extent, the following code can be used to generate a class:

public static void main(String[] args) {
	System.out.println("package a;");
	System.out.println("");
	System.out.println("public class A {");
	System.out.println("");
	System.out.println("	public static final class B extends Unknown {");
	System.out.println("		public int x = 2;");
	System.out.println("	}");
	System.out.println("");
	int n = 1_000;
	int m = 100;
	for (int i = 0; i < n; ++i) {
		System.out.println("	public static final class C" + i + " {");
		for (int j = 0; j < m; ++j) {
			System.out.println("		public final B b" + ((i * m) + j) + " = new B();");
		}
		System.out.println("	}");
	}
	System.out.println("}");
}

The "key" seems to be the unknown super type of "B". Without it, the reconciler finishes in 4-5 seconds. With the unknown super type the reconciler runs for about:

* 544 seconds at 150k lines of code (~5.1 MB)
* 304 seconds at 125k lines of code (~4.2 MB)
* 105 seconds at 100k lines of code (~3.4 MB)
* 4 seconds at 75k lines of code
* 1 second at 50k lines of code
* less than a second (about 200 ms) at 10k lines of code
I do notice that if I occasionally restart the IDE, the time spent in the initialization method drops to a few seconds, even with 150k lines of code. If, on the other hand, I keep replacing the contents of the source file with a larger class before closing and re-opening it, I observe the really long running time. Even with the source file with which we initially observed the problem in our product (370k lines of code), if I open the file in a freshly restarted IDE, the initialization code finishes in 30 seconds. Maybe the backing hash map becomes cluttered at some point and runs into lots of collisions.
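The collision hypothesis is easy to illustrate: if the keys held in a long-lived HashMap hash poorly, every entry lands in the same bucket and lookups no longer run in constant time. A minimal sketch with a deliberately degenerate hashCode (BadKey is a made-up key class, not the actual JDT binding key):

```java
import java.util.HashMap;
import java.util.Map;

// A key whose hashCode sends every instance to the same bucket. Lookups
// remain correct, but the map can no longer spread entries across buckets,
// so get() degrades badly as the table grows.
final class BadKey {
    final int id;
    BadKey(int id) { this.id = id; }
    @Override public int hashCode() { return 42; } // every key collides
    @Override public boolean equals(Object o) {
        return o instanceof BadKey && ((BadKey) o).id == id;
    }
}

public class CollisionDemo {
    public static void main(String[] args) {
        Map<BadKey, String> table = new HashMap<>();
        for (int i = 0; i < 10_000; i++) {
            table.put(new BadKey(i), "binding" + i);
        }
        // Correct result, but found by scanning one overcrowded bucket.
        System.out.println(table.get(new BadKey(9_999))); // binding9999
    }
}
```

(Since Java 8, HashMap converts long collision chains into trees, which softens but does not eliminate the slowdown for keys that are not Comparable.)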
OK, the extremely long running times seem to be the product of the following sequence:

1. Open a 100k-line source file (e.g. generate one with the code from comment 3).
2. Select all, copy, paste the same 100k lines.
3. Close the file and re-open it.

In this case there are a lot of removed positions, due to the file contents being replaced with other contents. There is a linear search at the following location:

org.eclipse.jdt.internal.ui.javaeditor.SemanticHighlightingReconciler.PositionCollector.addPosition(int, int, Highlighting)

For each new highlight, the code looks for a previously removed position by iterating over all removed positions. Since all positions were removed and are being re-added, this has quadratic complexity.

I'm not certain how valid a use case this is; it seems like a product of my profiling actions more than anything.
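The quadratic scan described above can be sketched as follows, together with an indexed variant that would make the re-add pass roughly linear. All names here are illustrative, not the actual JDT implementation:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the re-add pass: for each newly computed highlight, check
// whether an equal position was previously removed, so it can be reused.
public class PositionReuse {
    record Pos(int offset, int length) { }

    // Quadratic: each of n added positions scans up to n removed ones.
    static int reuseLinear(List<Pos> removed, List<Pos> added) {
        int reused = 0;
        for (Pos p : added) {
            if (removed.remove(p)) { // O(n) scan per lookup
                reused++;
            }
        }
        return reused;
    }

    // Near-linear: index the removed positions once, then look each one up.
    static int reuseIndexed(List<Pos> removed, List<Pos> added) {
        Map<Pos, Integer> index = new HashMap<>();
        for (Pos p : removed) {
            index.merge(p, 1, Integer::sum); // count duplicates, if any
        }
        int reused = 0;
        for (Pos p : added) {
            Integer count = index.get(p);
            if (count != null && count > 0) {
                index.put(p, count - 1);
                reused++;
            }
        }
        return reused;
    }

    public static void main(String[] args) {
        List<Pos> removed = new ArrayList<>();
        List<Pos> added = new ArrayList<>();
        for (int i = 0; i < 1_000; i++) {
            removed.add(new Pos(i * 10, 5));
            added.add(new Pos(i * 10, 5));
        }
        System.out.println(reuseIndexed(removed, added)); // 1000
    }
}
```

Both variants return the same count; only the lookup cost differs, which is what turns "all positions removed and re-added" into a quadratic pass in the linear version.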
(In reply to Simeon Andreev from comment #4)
> In this case there are a lot of removed positions due to replacing file
> contents with other contents. There is a linear search at the following
> location:
> org.eclipse.jdt.internal.ui.javaeditor.SemanticHighlightingReconciler.PositionCollector.addPosition(int,
> int, Highlighting)

Maybe Noopur can shed some light on this.
(In reply to Simeon Andreev from comment #4)
> I'm not certain how valid of a use case this is, it seems like a product of
> my profiling actions more than anything.

Not sure how to interpret this. Are you saying it is not a real issue but caused by your test steps?
(In reply to Dani Megert from comment #6)
> (In reply to Simeon Andreev from comment #4)
> > I'm not certain how valid of a use case this is, it seems like a product of
> > my profiling actions more than anything.
> Not sure how to interpret this. Are you saying it is not a real issue but
> caused by your test steps?

I've not been able to reproduce very long running times, other than with the following sequence:

1. Open a very large source file (e.g. 100k LOC) in a Java editor.
2. Select all, paste equally large source file contents (again e.g. 100k LOC).
3. Close the editor of the source file.
4. Open the source file again in a Java editor.

I can't imagine how that would be a use case for Java development. It seems rather unlikely that an Eclipse user would run into the problem with a standard workflow.
(In reply to Simeon Andreev from comment #7)
> I've not been able to reproduce very long running times, other than with the
> following sequence:
>
> 1. Open a very large source file (e.g. 100k LOC) in a Java editor.
> 2. Select all, paste equally large source file contents (again e.g. 100k LOC).
> 3. Close the editor of the source file.
> 4. Open the source file again in a Java editor.
>
> I can't imagine how that would be a use case for Java development. It seems
> to be rather unlikely that an Eclipse user would run into the problem with a
> standard workflow.

I agree. So, are you OK to close it?
(In reply to Dani Megert from comment #8)
> I agree. So, are you OK to close it?

Andrey?
(In reply to Simeon Andreev from comment #7)
> I've not been able to reproduce very long running times, other than with the
> following sequence:
>
> 1. Open a very large source file (e.g. 100k LOC) in a Java editor.
> 2. Select all, paste equally large source file contents (again e.g. 100k LOC).
> 3. Close the editor of the source file.
> 4. Open the source file again in a Java editor.
>
> I can't imagine how that would be a use case for Java development. It seems
> to be rather unlikely that an Eclipse user would run into the problem with a
> standard workflow.

Yep. If this is the only way to force the performance issue, we can close the bug.