Bug 322807 - [perfs] Not last baseline is used while generating the performance results
Status: CLOSED WONTFIX
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Releng
Version: 3.6
Hardware: All All
Importance: P3 normal
Target Milestone: ---
Assignee: Satyam Kandula CLA
QA Contact:
URL:
Whiteboard: stalebug
Keywords:
Depends on:
Blocks:
 
Reported: 2010-08-16 11:35 EDT by Frederic Fusier CLA
Modified: 2019-11-14 03:52 EST
CC List: 4 users

See Also:


Attachments
Sample of launch config to generate HTML results similarily as the Releng Ant script does (2.58 KB, text/plain)
2010-11-17 07:10 EST, Frederic Fusier CLA

Description Frederic Fusier CLA 2010-08-16 11:35:16 EDT
Looking at the results for builds N20100814-2000 and M20100811-0800, it appears that the wrong baseline is used when generating the results.

E.g. for the maintenance build, the generated HTML pages show the following title:
Performance of M20100811-0800 relative to R-3.5-200906111540 (201007261800)

However, the console log says:
+ no baseline specified => use last one: R-3.5-200906111540_201008091800
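
The mismatch is between the run date in the page title (201007261800) and the one in the console log (201008091800). As a minimal sketch of what "use last one" would be expected to do, assuming that the part of a baseline name after the last underscore is a fixed-width yyyyMMddHHmm run date (inferred from the names above, not confirmed from the tool's source), and using a hypothetical helper class that is not part of org.eclipse.test.performance:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical helper for illustration; not part of the performance tools. */
final class BaselineNames {

    /** Returns the run-date suffix after the last '_', e.g. "201008091800". */
    static String runDate(String baselineName) {
        int idx = baselineName.lastIndexOf('_');
        if (idx < 0 || idx == baselineName.length() - 1) {
            throw new IllegalArgumentException("No run date in: " + baselineName);
        }
        return baselineName.substring(idx + 1);
    }

    /**
     * Picks the baseline with the greatest run date. Assumes the run dates are
     * fixed-width digit strings, so lexicographic order equals date order.
     */
    static Optional<String> lastBaseline(List<String> baselineNames) {
        return baselineNames.stream().max(Comparator.comparing(BaselineNames::runDate));
    }

    public static void main(String[] args) {
        List<String> known = Arrays.asList(
                "R-3.5-200906111540_201007261800",  // run date shown in the HTML title
                "R-3.5-200906111540_201008091800"); // run date reported in the console log
        // prints R-3.5-200906111540_201008091800
        System.out.println(lastBaseline(known).orElse("<none>"));
    }
}
```

With both names available, the later run (201008091800) is selected, which is what the console log announces but not what the generated title shows.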
Comment 1 Frederic Fusier CLA 2010-08-16 11:36:12 EDT
Increasing the severity, as this gives incorrect performance results and may lead to wrong conclusions during the verification!
Comment 2 Frederic Fusier CLA 2010-08-16 11:37:07 EDT
(In reply to comment #1)
> Increase the severity as this gives to incorrect performance results and may
> lead to wrong conclusions during the verification!

I would have said "inaccurate results" instead of "incorrect results". But this issue must still be fixed asap...
Comment 3 Frederic Fusier CLA 2010-08-16 12:06:43 EDT
I've temporarily fixed the results for builds N20100814-2000 and M20100811-0800 by generating them manually. It looks like the problem is that the last baseline is not added to the local files and hence cannot be used when generating the results.

I'll investigate further...
Comment 4 Frederic Fusier CLA 2010-09-14 09:51:57 EDT
As the verification is done using a specific tool and not the generated HTML pages, I think the severity can be reduced and the bug deferred to the next milestone.
Comment 5 Dani Megert CLA 2010-10-26 03:59:13 EDT
Ping.
Comment 6 Frederic Fusier CLA 2010-10-26 07:56:14 EDT
Sorry, not for this milestone... :-(
Maybe Satyam could have a look for the next one?
Comment 7 Dani Megert CLA 2010-10-26 08:38:48 EDT
(In reply to comment #4)
> As the verification is done using a specific tool and not the generated HTML
> pages, I think the severity can be reduced and the bug deferred to next
> milestone.
Increasing severity again: most people look at the web pages and don't use the tool.
Comment 8 Satyam Kandula CLA 2010-11-16 11:50:40 EST
I do see that there is a lag in the baseline that is used, because of which the tool and the generated HTML files sometimes use different baselines. Is that the problem, or am I missing something?
Comment 9 Dani Megert CLA 2010-11-17 06:17:14 EST
Frédéric, can you give Satyam some hints?
Comment 10 Frederic Fusier CLA 2010-11-17 07:09:18 EST
(In reply to comment #8)
> I do see that there is a lag in the baseline that is used, because of which the
> tool and the generated html files sometimes use a different baseline. Is that
> the problem or am I missing anything?

Yes, that's the problem. It seemed that the tool was using the correct baseline whereas the generation used an older one. The first step is to try to generate the results using the same command line as the one used in the Ant script and see which baseline is used during the generation.

I'll attach a launch config I used to generate the HTML results manually from my workspace... If the baseline used is not the same, then it's easy to run the launch config in debug mode (after having changed the directories, of course) to track the problem down.

HTH
Comment 11 Frederic Fusier CLA 2010-11-17 07:10:44 EST
Created attachment 183287
Sample of launch config to generate HTML results similarily as the Releng Ant script does

Satyam, feel free to contact me if you still have trouble reproducing or debugging this issue...
Comment 12 Kim Moir CLA 2010-11-17 09:26:32 EST
It seems the performance results generation is failing with the following error message. Is this related to this bug?

!ENTRY org.eclipse.test.performance 4 1 2010-11-17 09:18:51.016
!MESSAGE Execution failed due to a distribution protocol error that caused deallocation of the conversation.  The command requested could not be completed because of a permanent error condition detected at the target system.

!ENTRY org.eclipse.test.performance 4 1 2010-11-17 09:18:51.024
!MESSAGE Execution failed due to a distribution protocol error that caused deallocation of the conversation.  The command requested could not be completed because of a permanent error condition detected at the target system.

!ENTRY org.eclipse.osgi 2 0 2010-11-17 09:18:51.208
!MESSAGE One or more bundles are not resolved because the following root constraints are not resolved:
!SUBENTRY 1 org.eclipse.osgi 2 0 2010-11-17 09:18:51.209
!MESSAGE Bundle update@plugins/org.eclipse.equinox.p2.ui.sdk_1.0.200.v20100927-1600.jar was not resolved.
!SUBENTRY 2 org.eclipse.equinox.p2.ui.sdk 2 0 2010-11-17 09:18:51.209
!MESSAGE Missing required bundle org.eclipse.compare_0.0.0.

!ENTRY org.eclipse.osgi 2 0 2010-11-17 09:18:51.220
!MESSAGE The following is a complete list of bundles which are not resolved, see the prior log entry for the root cause if it exists:
!SUBENTRY 1 org.eclipse.osgi 2 0 2010-11-17 09:18:51.220
!MESSAGE Bundle org.eclipse.equinox.launcher.win32.win32.x86_1.1.100.v20101004 [49] was not resolved.
!SUBENTRY 1 org.eclipse.osgi 2 0 2010-11-17 09:18:51.221
!MESSAGE Bundle org.eclipse.swt.win32.win32.x86_3.7.0.v3712b [69] was not resolved.
!SUBENTRY 1 org.eclipse.osgi 2 0 2010-11-17 09:18:51.221
!MESSAGE Bundle org.eclipse.equinox.p2.ui.sdk_1.0.200.v20100927-1600 [113] was not resolved.
!SUBENTRY 2 org.eclipse.equinox.p2.ui.sdk 2 0 2010-11-17 09:18:51.222
!MESSAGE Missing required bundle org.eclipse.compare_0.0.0.

!ENTRY org.eclipse.osgi 4 0 2010-11-17 09:18:51.222
!MESSAGE Application error
!STACK 1
java.lang.NullPointerException
	at org.eclipse.test.internal.performance.results.db.DB_Results.getLastBaselineBuild(DB_Results.java:669)
	at org.eclipse.test.internal.performance.results.db.PerformanceResults.setDefaults(PerformanceResults.java:766)
	at org.eclipse.test.internal.performance.results.db.PerformanceResults.<init>(PerformanceResults.java:108)
	at org.eclipse.test.performance.ui.GenerateResults.setPerformanceResults(GenerateResults.java:1041)
	at org.eclipse.test.performance.ui.GenerateResults.parse(GenerateResults.java:475)
	at org.eclipse.test.performance.ui.GenerateResults.run(GenerateResults.java:823)
	at org.eclipse.test.performance.ui.Main.start(Main.java:40)
	at org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:196)
	at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:110)
	at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:79)
	at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:369)
	at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:179)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.eclipse.equinox.launcher.Main.invokeFramework(Main.java:621)
	at org.eclipse.equinox.launcher.Main.basicRun(Main.java:576)
	at org.eclipse.equinox.launcher.Main.run(Main.java:1409)
	at org.eclipse.equinox.launcher.Main.main(Main.java:1385)
Comment 13 Satyam Kandula CLA 2010-11-18 06:33:26 EST
(In reply to comment #12)
> It seems the performance results generation is failing with the following error
> message.  Is this related to this bug
> 
> !ENTRY org.eclipse.test.performance 4 1 2010-11-17 09:18:51.016
> !MESSAGE Execution failed due to a distribution protocol error that caused
> deallocation of the conversation.  The command requested could not be completed
> because of a permanent error condition detected at the target system.
I guess the exception is caused by this error. I am not able to connect to the server because of it. I guess this is also why the I-build results are currently unavailable.
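
To illustrate that guess only: the code at DB_Results.java:669 is not shown in this bug, so the following is a hypothetical stand-in, not the real class. It assumes that the list of build names is left null when the database connection fails, and sketches how a lookup of the last baseline could fail with an explicit message instead of a NullPointerException:

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a results store; NOT the real DB_Results class. */
final class BaselineLookup {

    /** Build names read from the database, or null if the connection failed (assumption). */
    private final List<String> buildNames;

    BaselineLookup(List<String> buildNames) {
        this.buildNames = buildNames;
    }

    /** Returns the last baseline build name, failing with a clear message instead of an NPE. */
    String getLastBaselineBuild(String baselinePrefix) {
        if (buildNames == null) {
            throw new IllegalStateException(
                    "Build names were not loaded; check the database connection");
        }
        String last = null;
        for (String name : buildNames) {
            // Names sharing a prefix are compared as strings, so the latest run date wins.
            if (name.startsWith(baselinePrefix) && (last == null || name.compareTo(last) > 0)) {
                last = name;
            }
        }
        if (last == null) {
            throw new IllegalStateException("No baseline build found for prefix " + baselinePrefix);
        }
        return last;
    }

    public static void main(String[] args) {
        BaselineLookup ok = new BaselineLookup(Arrays.asList(
                "R-3.6-201006080911_201011121800", "N20101118-2000"));
        System.out.println(ok.getLastBaselineBuild("R-3.6"));

        try {
            new BaselineLookup(null).getLastBaselineBuild("R-3.6");
        } catch (IllegalStateException e) {
            System.out.println("Failed cleanly: " + e.getMessage());
        }
    }
}
```

A message like this in the log would point directly at the connection problem rather than at the baseline lookup.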
Comment 14 Satyam Kandula CLA 2010-11-22 03:51:02 EST
(In reply to comment #11)
> Created an attachment (id=183287) [details]
> Sample of launch config to generate HTML results similarily as the Releng Ant
> script does
> 
> Satyam, feel free to contact me if you still have some troubles to reproduce or
> debug this issue...

Frederic, thanks. I am able to generate the HTML results using this launch config, but I am not able to reproduce the problem: I see the proper baselines, as expected.

Here are some other points to note. The generated results for N20101118-2000 do show the proper baseline (R-3.6-201006080911_201011121800). There was supposed to be a new baseline on the 19th, but I don't see its results in the DB yet. It probably didn't run.

The N20101120-2000 results are not generated (I don't know why), but I could see the results in the DB. These results use the baseline from the 12th, as the one from the 19th is not available and doesn't seem to be OK either.

While debugging, I could see that this could go wrong under two conditions:
1) The baseline build is not available.
2) The particular test has no result in the baseline build.
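
A rough sketch of how these two conditions would lead to an older baseline being picked, under stated assumptions: the class, the test names and the baseline name for the 19th are hypothetical, and the real selection logic in the tools may differ. The idea is simply "take the newest baseline that actually has a result for this test":

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical illustration of per-test baseline selection; not the tool's code. */
final class BaselineResolver {

    /** Baseline name -> names of the tests that have a result in that baseline. */
    private final TreeMap<String, Set<String>> testsByBaseline = new TreeMap<>();

    void addBaseline(String baselineName, String... testNames) {
        testsByBaseline.put(baselineName, new HashSet<>(Arrays.asList(testNames)));
    }

    /**
     * Returns the newest baseline that has a result for the given test.
     * Assumes names share a prefix, so string order equals run-date order.
     */
    Optional<String> baselineFor(String testName) {
        for (String baseline : testsByBaseline.descendingKeySet()) {
            if (testsByBaseline.get(baseline).contains(testName)) {
                return Optional.of(baseline);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        BaselineResolver resolver = new BaselineResolver();
        resolver.addBaseline("R-3.6-201006080911_201011121800", "testA", "testB");
        // Hypothetical name for the baseline of the 19th; it only has a result for testA.
        resolver.addBaseline("R-3.6-201006080911_201011191800", "testA");

        System.out.println(resolver.baselineFor("testA")); // newest baseline
        System.out.println(resolver.baselineFor("testB")); // condition 2: falls back to the 12th
        System.out.println(resolver.baselineFor("testC")); // no baseline result at all
    }
}
```

Condition 1 corresponds to the newest baseline never being added at all, in which case every test falls back to the previous one.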

I will monitor the tests in the coming weeks and get more details. 

By the way, the generated results should be correct unless the performance test is modified.
Comment 15 Dani Megert CLA 2010-11-22 10:03:47 EST
>By the way, the generated results should be correct unless the performance test
>is modified.
What about newly added ones?
Comment 16 Frederic Fusier CLA 2010-11-26 12:45:40 EST
(In reply to comment #15)
> >By the way, the generated results should be correct unless the performance test
> >is modified.
> What about newly added ones?

There's no problem for newly added tests, as long as the same test has been correctly released in the performance reference stream (e.g. perf_36x for 3.7).

BTW, I think I was too pessimistic about this issue. The baseline result is supposed to be flat for each performance test. Hence, it's not a really big issue if the latest one is not used to generate the performance results.

If the baseline results are not stable enough, the performance tools are able to notice it and we can remove the test from the verification, thus avoiding unnecessary performance alarms...
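
As an illustration of what "flat" could mean in practice, here is a small sketch that flags a baseline series as unstable when its coefficient of variation exceeds a threshold. This is an assumption for the sake of example: the criterion actually used by the performance tools is not stated in this bug, and the class name, sample values and the 5% threshold are all hypothetical.

```java
/** Hypothetical stability check for a series of baseline results; not the tool's code. */
final class BaselineStability {

    /** Sample standard deviation divided by the absolute mean. */
    static double coefficientOfVariation(double[] values) {
        if (values.length < 2) {
            throw new IllegalArgumentException("Need at least two values");
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        double mean = sum / values.length;
        if (mean == 0.0) {
            throw new IllegalArgumentException("Mean is zero; variation is undefined");
        }
        double squaredDeviations = 0.0;
        for (double v : values) {
            squaredDeviations += (v - mean) * (v - mean);
        }
        double stdDev = Math.sqrt(squaredDeviations / (values.length - 1));
        return stdDev / Math.abs(mean);
    }

    /** A series counts as stable when its variation stays at or below the threshold. */
    static boolean isStable(double[] values, double maxVariation) {
        return coefficientOfVariation(values) <= maxVariation;
    }

    public static void main(String[] args) {
        double[] flat = {100, 101, 99, 100};   // made-up elapsed times for successive baselines
        double[] noisy = {100, 140, 70, 120};  // made-up values with large swings
        System.out.println(isStable(flat, 0.05));  // prints true
        System.out.println(isStable(noisy, 0.05)); // prints false
    }
}
```

For a test whose baseline passes such a check, comparing against an older baseline run should give nearly the same numbers as comparing against the latest one.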

So I'm reducing the severity of this bug to normal and updating the title to make it less frightening... :-S

Note three things:
1) It seems to occur only on the maintenance branch: for 3.7, it's always the last baseline which is used :-)
2) I was also not able to reproduce this issue locally :-(
3) I've updated the HTML pages with the results generated using the last baseline...
Comment 17 Satyam Kandula CLA 2010-12-08 01:19:00 EST
Need some more time to validate this. As the severity is reduced, moving it to the next milestone.
Comment 18 Satyam Kandula CLA 2011-01-27 00:02:13 EST
Moving this again :(
Comment 19 Satyam Kandula CLA 2011-03-09 00:09:17 EST
Moving again :(
Comment 20 Satyam Kandula CLA 2011-05-13 05:16:04 EDT
I still couldn't figure out the real reason. Kim thinks this could be because the baseline process stays alive even after the test has completed (bug 345331 comment 9). She has now made changes to kill the hanging processes (bug 343814 comment 8), so this will probably get fixed.

Will have to wait and watch and hence moving it again.
Comment 21 Satyam Kandula CLA 2011-05-23 08:00:52 EDT
Need more time to test and validate
Comment 22 Satyam Kandula CLA 2011-05-23 08:01:32 EDT
(In reply to comment #21)
> Need more time to test and validate
Clicked the button too quickly.
Need more time to test and validate. Hence moving it to 3.8.
Comment 23 Lars Vogel CLA 2019-11-14 03:52:35 EST
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet.

If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.

If the bug is still relevant, please remove the "stalebug" whiteboard tag.