Bug 311553 - [perfs] Huge regression for HelpServerTest#testServletRead100x()
Summary: [perfs] Huge regression for HelpServerTest#testServletRead100x()
Status: RESOLVED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: User Assistance
Version: 3.6
Hardware: PC Windows XP
Importance: P3 major
Target Milestone: 3.6 RC2
Assignee: platform-ua-inbox
QA Contact:
URL:
Whiteboard:
Keywords: test
Depends on:
Blocks: 286955 313596
 
Reported: 2010-05-04 10:02 EDT by Frederic Fusier
Modified: 2010-05-26 04:50 EDT
CC: 4 users

See Also:


Attachments

Description Frederic Fusier 2010-05-04 10:02:39 EDT
Verifying the 3.6M7 perf results, it appears that I failed to report a regression
on the HelpServerTest#testServletRead100x() test, which has occurred since build
N20100422-2133 (around -100%).

Note that this regression is only noticeable on the RHEL 5.0 Sun 6.0_04 and Win XP Sun 6.0_04 test machines.
Comment 1 Chris Goldthorpe 2010-05-04 12:42:41 EDT
This is almost certainly a regression caused by the new version of Jetty that was included in the build that day; see Bug 309052 - Upgrade to Jetty 6.1.23. Simon - any thoughts on this one?
Comment 2 Simon Kaegi 2010-05-04 14:04:39 EDT
Nothing off the top of my head, but I'll start digging. Could you narrow this down to the time to accept the connection vs. request performance vs. response performance?
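
One way to get that breakdown from the client side is to time each phase of a single request separately. The sketch below is illustrative only: the URL, port, and class name are assumptions and are not taken from the actual test or LoadServletUtil.

    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    // Times the three phases of one request: connect, time to first byte, body read.
    public class RequestPhaseTimer {
        public static void main(String[] args) throws Exception {
            URL url = new URL("http://localhost:8080/help/index.jsp"); // assumed address

            long t0 = System.nanoTime();
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.connect();                          // TCP connect to the server
            long t1 = System.nanoTime();

            InputStream in = conn.getInputStream();  // send request, wait for response
            in.read();                               // first byte of the response
            long t2 = System.nanoTime();

            byte[] buf = new byte[4096];
            while (in.read(buf) != -1) {
                // drain the rest of the body
            }
            long t3 = System.nanoTime();
            in.close();

            System.out.println("connect: " + (t1 - t0) / 1000000 + " ms, "
                    + "first byte: " + (t2 - t1) / 1000000 + " ms, "
                    + "body: " + (t3 - t2) / 1000000 + " ms");
        }
    }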
Comment 3 Chris Goldthorpe 2010-05-04 20:03:29 EDT
I'll see what I can find out. So far I have confirmed that the test is slower but have not isolated it any further.
Comment 4 Simon Kaegi 2010-05-04 23:02:30 EDT
Doing a little profiling, I can confirm that this test is slower because of the new version of Jetty. All of our "extra" time is spent in the client calling url.openConnection in LoadServletUtil.readLoadServlet, which would indicate Jetty is slower to respond.

From what I saw in my trace, most requests were served in roughly the same time; however, every once in a while there was a much larger delay. Also, for the servlet itself there was no real difference in the total time spent in doGet, so this appears to be something internal to how the server handles connections.
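
To see those occasional stalls outside the performance harness, one could also log per-request times and count the outliers. A rough sketch under the same assumptions as above (hypothetical URL, request count, and threshold):

    import java.io.InputStream;
    import java.net.URL;

    // Issues many requests and counts those that take noticeably longer than the rest.
    public class StallCounter {
        public static void main(String[] args) throws Exception {
            URL url = new URL("http://localhost:8080/help/index.jsp"); // assumed address
            byte[] buf = new byte[4096];
            int stalls = 0;
            long worstMs = 0;

            for (int i = 0; i < 2000; i++) {
                long start = System.nanoTime();
                InputStream in = url.openConnection().getInputStream();
                try {
                    while (in.read(buf) != -1) {
                        // drain the response body
                    }
                } finally {
                    in.close();
                }
                long elapsedMs = (System.nanoTime() - start) / 1000000;
                if (elapsedMs >= 40) { // threshold just below the suspected 50 ms pause
                    stalls++;
                }
                worstMs = Math.max(worstMs, elapsedMs);
            }
            System.out.println("requests over 40 ms: " + stalls + ", worst: " + worstMs + " ms");
        }
    }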
Comment 5 Simon Kaegi 2010-05-05 10:17:12 EDT
Looking through the differences, I noticed there is a new workaround for a JVM bug in Jetty 6.1.24 in org.mortbay.io.nio.SelectorManager.

If we define "-Dorg.mortbay.io.nio.JVMBUG_THRESHHOLD=8192" we get roughly the same performance as before; however, I'm more than a little hesitant, as I tend to believe Greg knows what he's doing here. If left undefined, the default is 512, and from what I'm seeing this means we're pausing for 50ms roughly every 512th request when under load, as in this case where we're issuing about 1000 requests per second.
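
For anyone who wants to experiment with that setting, the property presumably has to be in place before the help server's embedded Jetty is started, since it is read by SelectorManager. A sketch only; the class name and setUp() placement are illustrative, and this is not a change being proposed for the product:

    // Equivalent to launching the test JVM with
    // -Dorg.mortbay.io.nio.JVMBUG_THRESHHOLD=8192 (experimental, illustrative only).
    public class JvmBugThresholdSetup {
        protected void setUp() throws Exception {
            // Must run before the help server (and its embedded Jetty) starts.
            System.setProperty("org.mortbay.io.nio.JVMBUG_THRESHHOLD", "8192");
            // ... start the help server and run the load test as usual ...
        }
    }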
Comment 6 Chris Goldthorpe 2010-05-07 18:42:41 EDT
Simon, are you proposing that we live with the performance hit, or is there anything we can do? I'm guessing that the change which caused the performance hit also fixed a bug, so our options are limited.
Comment 7 Simon Kaegi 2010-05-07 22:17:56 EDT
Yes. I'm not comfortable messing with these settings; furthermore, I think this test is load-testing Jetty in a way that the regular help system is not used. And even if it were, I don't think a 50 ms delay every 500th or so request is going to impact usability.

I'll take this up further with the Jetty team to see if we might do something for 3.6.1+; however, I do not think we should mess with this further for 3.6.0. Can we live with this?
Comment 8 Greg Wilkins 2010-05-10 14:21:29 EDT
Typically the 50ms delay (to work around stupid Sun JVM bugs) should not cause a
performance loss... UNLESS you have a few very busy connections (which unfortunately many benchmarks do).

If you have lots of mostly idle connections, then this latency is not going to be a problem, as it will just batch up the dispatches of all the connections that need to be handled.

The real fix is to get snoracle to fix the bugs, but apparently they are not going to be fixed in Java 6, only Java 7.

It could also be a false positive detection of the bug, triggered by a benchmark.
The result of setting JVMBUG_THRESHHOLD to a high value suggests this might be the case.

I'll look at setting this value higher by default.
Comment 9 Chris Goldthorpe 2010-05-12 11:16:31 EDT
Unless there is an objection before the end of this week I think we should close this as WONTFIX.
Comment 10 Frederic Fusier 2010-05-12 11:19:01 EDT
(In reply to comment #9)
> Unless there is an objection before the end of this week I think we should
> close this as WONTFIX.

I'd prefer that you add a comment to the test to make it gray and set this bug to FIXED...
Comment 11 Chris Goldthorpe 2010-05-12 12:19:20 EDT
If I can get the results to show as gray that seems like the right thing to do (although I'm still not sure that would count as fixing the bug). How do I cause the test result to be gray instead of red?
Comment 12 Frederic Fusier 2010-05-19 04:37:15 EDT
(In reply to comment #11)
> If I can get the results to show as gray that seems like the right thing to do
> (although I'm still not sure that would count as fixing the bug). How do I
> cause the test result to be gray instead of red?

Sorry Chris, I missed your comment. The way to make the result gray instead of red is to add the following line to the performance test:
    setComment(Performance.EXPLAINS_DEGRADATION_COMMENT, "explanation");
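
For context, this is roughly where such a call sits, assuming the test extends org.eclipse.test.performance.PerformanceTestCase; the class name, loop, and explanation text below are illustrative, not the actual change made for this bug:

    import org.eclipse.test.performance.Performance;
    import org.eclipse.test.performance.PerformanceTestCase;

    public class HelpServerPerfExample extends PerformanceTestCase {
        public void testServletReadExample() throws Exception {
            // Marks the expected slowdown so the fingerprint shows gray instead of red.
            setComment(Performance.EXPLAINS_DEGRADATION_COMMENT,
                    "Slowdown caused by the Jetty upgrade, see bug 311553");
            for (int i = 0; i < 10; i++) {
                startMeasuring();
                // ... exercise the help servlet here ...
                stopMeasuring();
            }
            commitMeasurements();
            assertPerformance();
        }
    }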
Comment 13 Chris Goldthorpe 2010-05-19 13:33:26 EDT
I have made that change in HEAD. I'll wait till I see some new performance results before closing out this bug and opening a new bug to remove that line in the 3.7 performance base.
Comment 14 Frederic Fusier 2010-05-26 04:50:18 EDT
I'm closing this bug as the test now shows gray in the Platform/UA fingerprints, and bug 313892 has been opened to remove the comment early in 3.7 development...