Bug 126901 - [Commands] handlers: ExpressionAuthority and HandlerAuthority cause huge performance degradation
Summary: [Commands] handlers: ExpressionAuthority and HandlerAuthority cause huge perf...
Status: RESOLVED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: UI (show other bugs)
Version: 3.2   Edit
Hardware: All All
: P1 critical (vote)
Target Milestone: 3.2 M6   Edit
Assignee: Douglas Pollock CLA
QA Contact:
URL:
Whiteboard:
Keywords: performance
Depends on: 114363
Blocks:
  Show dependency tree
 
Reported: 2006-02-08 08:51 EST by Dani Megert CLA
Modified: 2006-03-26 09:36 EST (History)
4 users (show)

See Also:


Attachments
method and class diffs between R3.1 and I20060207-1305 (553.17 KB, application/octet-stream)
2006-02-08 08:55 EST, Dani Megert CLA
no flags Details
Patch for when Bug 114363 is Fixed (27.11 KB, patch)
2006-02-22 13:10 EST, Douglas Pollock CLA
no flags Details | Diff
Revised Patch (72.06 KB, patch)
2006-02-28 14:43 EST, Douglas Pollock CLA
no flags Details | Diff
New numbers, comparing R3_1 with I20060314-0800 (396.91 KB, application/octet-stream)
2006-03-23 12:08 EST, Dani Megert CLA
no flags Details
Updated comparsion with R3.1 (463.73 KB, application/x-zip-compressed)
2006-03-24 12:57 EST, Dani Megert CLA
no flags Details
Picture of comparison - all threads - per class (125.57 KB, image/png)
2006-03-26 04:26 EST, Dani Megert CLA
no flags Details
Picture of comparison - all threads - per method (127.62 KB, image/png)
2006-03-26 04:26 EST, Dani Megert CLA
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Dani Megert CLA 2006-02-08 08:51:54 EST
I20060131-1200

Many editor tests are badly red:
http://download.eclipse.org/downloads/drops/I20060131-0800/performance/org.eclipse.jdt.text.php?
the "Activate 30 Java Editors" being the worst.

I've profiled the ActivateJavaEditorTest i.e. 5 times activating 30 Java editors. I started profiling after the 10 warmup runs. The attached reports reflect this and show that the huge regression is caused by ExpressionAuthority and HandlerAuthority. During the run ExpressionAuthority methods are called 173 million (!) times and with it methods on ArrayList.Itr.
Comment 1 Dani Megert CLA 2006-02-08 08:55:48 EST
Created attachment 34339 [details]
method and class diffs between R3.1 and I20060207-1305
Comment 2 Dani Megert CLA 2006-02-08 08:57:56 EST
Let me know if you need more data. I can provide backtraces if needed.
Comment 3 Douglas Pollock CLA 2006-02-08 09:19:34 EST
Backtraces would be helpful, if you have them available.
Comment 4 Dani Megert CLA 2006-02-09 02:28:03 EST
Doug, I've sent you an e-mail re comment 3
Comment 5 Douglas Pollock CLA 2006-02-22 13:10:58 EST
Created attachment 35164 [details]
Patch for when Bug 114363 is Fixed

This patch adapts to changes in the proposed patch for Bug 114363.

Even without Bug 114363 being fixed, I've gone ahead and added equals and hashCode to those expressions under control of the workbench.  I've restructured the handler authority so that it attempts to test only one of a set of equivalent expressions.  This allows for a significant speed-up.  I'm still not happy with this approach  While it should be much faster, it still means that every event can trigger O(n) operations; where O(n) is the number of handlers.  My changes have only played with the constant factor on O(n), but have not succeeded in reducing it below O(n).  With two or three hundred handlers per Java editor, this is can be a lot of operations.

The problem is really the centralized approach.  Instead of asking the newly active editor for its handlers, the handler authority is trying to check all handlers that might have been affected by the editor change.  This makes the operation linear over the number of editors, rather than constant.
Comment 6 Douglas Pollock CLA 2006-02-28 14:43:12 EST
Created attachment 35490 [details]
Revised Patch

A revision to the patch taking into account the most recent revisions to the patch in Bug 114363.  I'm going to hold off on committing these changes until I get an answer as to whether the API has been approved.
Comment 7 Dani Megert CLA 2006-03-01 04:48:35 EST
Doug, can you make a statement (in numbers) about the improvements you made (I assume you also profiled your fix) and how this compares to R3.1?

Note that to me the performance results from I20060228-1207 look strange: something else seems to have slowed down things (see my message on platform-releng).
Comment 8 Douglas Pollock CLA 2006-03-01 16:25:28 EST
(In reply to comment #7)
> Doug, can you make a statement (in numbers) about the improvements you made
> (I assume you also profiled your fix) and how this compares to R3.1?

When I speak of the patches, I'm speaking about all of the patches in Bug 114363 and this bug -- in their final desired form.  I ran the activate editor test, and came up with these results:
Elapsed Process (without patches):  18.84s
Elapsed Process (with patches):      8.25s

Curious to see how much I might still be contributing to the problem, I instrumented the methods that serve as entry points into HandlerAuthority (except for updateShellKludge, which is GTK+ specific).  With the patches still in place, I then ran again....

    total time = 353.42s
    handler authority = 13.38s (3.8%)

I tried an alternative approach (commenting out all of the handler authority methods -- including updateShellKludge).  This time round, I got 9.45s, which (oddly enough) is slower than running with the methods in place.

So, I dug a bit deeper, and profiled with the patches in place.  The hot spots were:
 1.) 9.18%  OS.gdk_pixbuf_get_from_drawable
 2.) 9.00%  OS.gdk_window_at_pointer
 3.) 8.73%  OS.g_main_context_iteration
 4.) 7.91%  OS.gdk_window_get_origin
 5.) 5.99%  Display.sleep
 6.) 5.11%  OS.gtk_main_do_event
 7.) 1.87%  OS.gdk_draw_layout
 8.) 1.81%  JavaElement.getElementInfo
 9.) 1.77%  OS.gdk_window_get_events
10.) 1.60%  Lock.unlock
11.) 1.50%  HandlerAuthority.resolveConflicts

Poking around a bit in the same snapshot from which I got the above hot spots, I found that the time seems to be taken up by a few things:
+ Extraneous updating of the Problems view.  I've mentioned this problem to Tod, but haven't opened a bug for it.
+ GTK+ exposing and painting.  This may mean the performance results are different per platform.
+ ToolBarManager.update().  This would have existed in 3.1 (and earlier), and likely won't be improved until 3.3 or later.

When I look at the performance results in the graph, however, I see a much different story.  The CPU time on RHEL3 matches what I see on my own machine, but the Elapsed Process time is crazy.  How does Elapsed Process vary from CPU time?

As a note, if you run with "instrumentation" rather than "sampling" in a profiler, the handler authority will figure more predominantly than it really is.  The methods are called frequently, but they are typically quite quickly (i.e., the results are cached).

As mentioned in comment #5, I still think there is room for improvement in how HandlerAuthority is implemented -- particularly in regards to conflict resolution.  However, I feel I've done as much as is reasonable for now.


I'm leaving this bug open to wait for final resolution on Bug 114363.  The Revised Patch on this bug needs to be applied when Bug 114363 is closed.
Comment 9 Dani Megert CLA 2006-03-02 02:12:46 EST
Doug, thanks a bunch for the first set of numbers showing before and after your patch. Very interesting for us at the consumer end for comparison is to see the numbers on your machine when running 3.1, i.e. what are the numbers for a real life scenario like the one "Activate 30 Java Editors" when using 3.1 and the code you think is going into 3.2.

Thanks.
Comment 10 Douglas Pollock CLA 2006-03-02 13:12:21 EST
I re-ran the 3.2 tests with the patches in place, just to make sure....
Elapsed Process (3.2):  8.37s
Elapsed Process (3.1):  3.13s

The hot spots in 3.1 were:
 1.) 49.27%  Display.sleep
 2.)  4.00%  OS.XReconfigureWMWindow
 3.)  3.85%  OS.gdk_pixbuf_get_from_drawable
 4.)  3.34%  OS.gdk_window_get_origin
 5.)  3.02%  HandlerAuthority.sourceChanged
 6.)  2.63%  OS.g_main_content_iteration
 7.)  2.12%  HandlerAuthority.trimInactive
 8.)  2.03%  OS.gtk_main_do_event
 9.)  1.62%  EvaluationContext.getVariable
10.)  1.27%  LegacyHandlerSubmissionExpression.evaluate
11.)  1.24%  Lock.unlock

Comparing the two, there are 117 problem markers in the 3.2, including 17 errors.  In 3.1, there are 97 warnings and no errors.  I've opened Bug 130171 to to track the refreshing of the Problems view.
Comment 11 Tod Creasey CLA 2006-03-02 15:47:56 EST
Doug you need to run this without the ui tests. The ui tests define a filter on selected element and it's children which will cause an update on editor selection (I am still checking if this is happening too often in Bug 130171).

The 3.1 tests and the performance tests do not have this extra filtering defined.
Comment 12 Douglas Pollock CLA 2006-03-02 15:49:14 EST
With "org.eclipse.ui.tests*" closed, the Elapsed Process time is 7.14s.
Comment 13 Dani Megert CLA 2006-03-03 07:29:16 EST
Doug, the errors depend on the setup (I also saw them once): they should not appear if you start a clean workspace with a 1.4 VM.
Comment 14 Douglas Pollock CLA 2006-03-03 09:40:40 EST
I used a 1.5 VM, so that would explain it.
Comment 15 Tom Hofmann CLA 2006-03-13 09:51:56 EST
(In reply to comment #8)
> 1.) 9.18%  OS.gdk_pixbuf_get_from_drawable
> 2.) 9.00%  OS.gdk_window_at_pointer
> 3.) 8.73%  OS.g_main_context_iteration
> 4.) 7.91%  OS.gdk_window_get_origin

Interesting that you measure these methods taking a lot of time, too. When I compared the 3.1 and I20060301 builds, I found that the time spent in these methods increased by something around 3x (300%). While I cannot explain why this should happen, I filed bug 131107 to track this.

> + ToolBarManager.update().  This would have existed in 3.1 (and earlier), and
> likely won't be improved until 3.3 or later.

My measurements indicate that the time spent in this method was really multiplied by the same factor, due to the above methods taking more time.
Comment 16 Douglas Pollock CLA 2006-03-15 14:59:08 EST
I've adapted to the changes in Expression as outlined in this bug and Bug 114363.
Comment 17 Dani Megert CLA 2006-03-23 12:08:04 EST
Created attachment 36816 [details]
New numbers, comparing R3_1 with I20060314-0800
Comment 18 Dani Megert CLA 2006-03-23 12:08:28 EST
The numbers show huge progress but not yet back to R3_1 performance.
Comment 19 Douglas Pollock CLA 2006-03-23 12:37:23 EST
Dani: For some reason, I can't read your PDFs.  Do they work for you?
Comment 20 Darin Swanson CLA 2006-03-23 12:42:04 EST
I was able to read them fine...
Comment 21 Dani Megert CLA 2006-03-23 13:00:55 EST
Work for me as well.
Comment 22 Dani Megert CLA 2006-03-24 12:57:18 EST
Created attachment 36898 [details]
Updated comparsion with R3.1

I'm sorry but when looking at the main thread only we still see major difference to 3.1, see updated attached numbers.

'Instr' means profiling  using instrumentation and 'sample' means sampling.
Comment 23 Dani Megert CLA 2006-03-24 12:58:08 EST
Reopening since there's sill major degradation compared to R3.1.
Comment 24 Douglas Pollock CLA 2006-03-25 23:26:36 EST
I'm not quite sure what to say.  I still can't view the PDFs, for whatever reason.  The HTML files show HandlerAuthority as a larger hot spot in 3.2 than in 3.1, but (1) it was a hot spot in 3.1; and (2) code that used to be elsewhere is now routed through HandlerAuthority.  So, in some ways, this is expected.

Let's take a look at these in text, so that it's a bit clearer.

In 3.1:
72.42% - 19034 ms - org.eclipse.swt.widgets.Display.sleep()
 2.16% - 567 ms - o.e.ui.internal.handlers.HandlerAuthority.sourceChanged()
 1.56% - 409 ms - org.eclipse.swt.graphics.TextLayout.draw()
 1.18% - 311 ms - org.eclipse.swt.internal.win32.OS.CallWindowProc()
 1.04% - 273 ms - org.eclipse.swt.widgets.Widget.SetWindowPos()

In 3.2:
48.01% - 18626 ms - org.eclipse.swt.widgets.Display.sleep()
5.42% - 2102 ms - org.eclipse.swt.ole.win32.OleFrame.getMsgProc()
3.82% - 1482 ms - o.e.u.internal.handlers.HandlerAuthority.resolveConflicts()
3.20% - 1243 ms - org.eclipse.swt.widgets.Control.forceFocus()
2.42% - 940 ms - o.e.c.i.filesystem.local.LocalFileNatives.fetchFileInfo()

So, while I can appreciate that HandlerAuthority is still high up on the stack, I think you are a long way from being able to claim what is said in this summary ("ExpressionAuthority and HandlerAuthority cause huge performance degradation").

I expected resolveConflicts to take more time.  Code has been re-routed from other methods, and it is simply more feature-rich than in 3.2: all actions declared in extension points are converted into commands and handlers on start-up, retarget actions are now treated purely as commands, etc.

I think it would be more constructive at this point to start answering questions about other items in the list of hot spots.  For example, while resolveConflicts takes 915ms more time in 3.2 than in 3.1 over the span of the test, it seems that OleFrame.getMsgProc takes 2000ms more time.

Also, if you are using YourKit, I would recommend against using it for CPU profiling.  I feel its results tend to be inaccurate.  I speak from the experience of trying to profile perspective switching with Billy back in 3.0.
Comment 25 Dani Megert CLA 2006-03-26 04:24:53 EST
Unfortunately the comparisons can only be generated as PDF and not as HTML. I'll take a picture and attach them here.

>I would recommend against using it for CPU profiling.
I do know that both methods have pros and cons - that's why I attached both ;-)

>I think it would be more constructive at this point to start answering
>questions about other items in the list of hot spots.
Doug, sorry if it looks like it is just this code - it is not and it is not about blaming anyone - I filed PRs for the other hot spots as well (except for the OleFrame.getMsgProc() which I cannot yet explain). In the comparisons where all threads are shown for example, you see that the parser is at the top. This boils down to a code fix we had to put in where JDT Text simply forgot to create the AST. Due to this fix the parser is now on top and it is not something we can change but are happy to understand the difference. So, if you think the new numbers for the handler story reflect the addtional functionality then feel free to close it
Comment 26 Dani Megert CLA 2006-03-26 04:26:07 EST
Created attachment 36951 [details]
Picture of comparison - all threads - per class
Comment 27 Dani Megert CLA 2006-03-26 04:26:59 EST
Created attachment 36952 [details]
Picture of comparison - all threads - per method
Comment 28 Douglas Pollock CLA 2006-03-26 09:36:18 EST
The problem with those two comparison images is that they are using instrumented data.  This will tend to make HandlerAuthority and ExpressionAuthority appear more significant than they actually are.  These two classes are called a *lot* (as you can see), but the execution time of each call should be very small (especially in the case of ExpressionAuthority).  I already mentioned this in comment #8.

This is not to say that HandlerAuthority doesn't take more time.  I'm sure it does.  But I think there are good reasons why it would be taking a little bit longer.

Closing again.  Please open a new bug if you want to talk about improving performance of handlers again.