Bug 444243 - Improve scripts to handle performance tests
Summary: Improve scripts to handle performance tests
Status: RESOLVED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Releng
Version: 4.5
Hardware: PC Linux
Importance: P3 normal
Target Milestone: 4.5 M4
Assignee: David Williams CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 374441 434596
 
Reported: 2014-09-16 07:54 EDT by David Williams CLA
Modified: 2015-06-02 09:29 EDT
CC List: 1 user

See Also:


Attachments
screen shot showing "history" on personal "build test" machine (24.06 KB, image/png)
2014-11-17 04:45 EST, David Williams CLA

Description David Williams CLA 2014-09-16 07:54:05 EDT
General-purpose bug to track changes related to getting performance tests to run automatically, be "collected" automatically, etc.
Comment 1 David Williams CLA 2014-11-10 07:42:53 EST
Made good progress this past week/weekend in getting some "real" performance results displayed. ("Real" in quotes, since I have no idea if they are "accurate" or not ... just getting a glimpse of what used to be displayed.)

As linked from the download page, see

http://download.eclipse.org/eclipse/downloads/drops4/N20141109-2000/performance/performance.php

Still no "fingerprints" and the rest is, I'd estimate, still only about 70% done ... but I think what is there is near useful! 

Besides untangling the dozen moving parts (and in some cases tangling them more!), the major accomplishment was getting an Xvfb display running during the analysis, so the "pretty pictures" can be displayed. They are not very pretty at the moment ... but I suspect that's just a matter of tweaking a few spots?
Comment 2 Markus Keller CLA 2014-11-10 08:48:55 EST
Looks like this broke the rendering of the non-performance test results table:

bad:  http://download.eclipse.org/eclipse/downloads/drops4/N20141109-2000/testResults.php
good: http://download.eclipse.org/eclipse/downloads/drops4/N20141108-1500/testResults.php
Comment 3 David Williams CLA 2014-11-10 10:48:56 EST
(In reply to Markus Keller from comment #2)
> Looks like this broke the rendering of the non-performance test results
> table:
> 
> bad: 
> http://download.eclipse.org/eclipse/downloads/drops4/N20141109-2000/
> testResults.php
> good:
> http://download.eclipse.org/eclipse/downloads/drops4/N20141108-1500/
> testResults.php

Yes, that's actually a "separate bug" (but fine if I fix it as part of this one). 

I had forgotten I was "in the middle" of changing some of the other parts of the "generate index" custom Ant task, and had to rebuild that feature to pick up "performance.ui". 

I'll look closer tonight to see if there's an easy fix, or else try to revert to previous "good" state for those other bundles.
Comment 4 David Williams CLA 2014-11-10 10:52:10 EST
FWIW, my original Xvfb had 800x600x8 by default, but some of the line graphs had "black blobs" instead of text. 

I set it to 
-screen 0 1024x768x24
and they are legible again. 

On my test machine, I increased it to 1600x1200, but that did not appear to make a difference, so I think 1024x768x24 is a good (and required) "minimum" resolution.
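
For the record, a minimal sketch of the kind of Xvfb setup involved (the display number, background/kill handling, and the analysis step here are illustrative, not the actual releng script):

  # Start a virtual framebuffer at 1024x768 with 24-bit depth so the
  # line graphs render legibly (8-bit depth produced "black blobs").
  Xvfb :42 -screen 0 1024x768x24 &
  XVFB_PID=$!
  export DISPLAY=:42

  # ... run the performance analysis that generates the graphs ...

  # Shut the framebuffer down once the analysis is done.
  kill $XVFB_PID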
Comment 5 David Williams CLA 2014-11-11 01:03:45 EST
(In reply to David Williams from comment #3)
> (In reply to Markus Keller from comment #2)

> 
> Yes, that's actually a "separate bug" (but, fine if I fix as part of this
> one). 
> 
> I had forgotten I was "in the middle" of changing some of the other parts of
> the "generate index" custom ant task, and had to rebuild that feature to
> pick up "performance.ui". 
> 

In the interest of fessing up, I was wrong. That "mess" wasn't due to the "build tool" changes, but to several other bugs in the "performance processing" part of my additions. 

Not sure I can stay up late enough to confirm the fix tonight, but if not, I have at least "turned off" the part that is making the mess. 

In addition, I realized the performance list is shorter than the usual "short list" because I had forgotten to re-enable some tests that I had temporarily disabled (just for quicker "top to bottom" testing on my local machine). 

I do have a question (mostly unrelated to this bug), and that is whether 'uircp' should be included in performance tests. We actually do NOT include it in unit tests, due to bug 380553. I'll check soon if "it is now ready". (But I have seen some "hangs" ... though I don't recall if it was that suite or not ... perhaps only in "baseline" runs?)
Comment 6 David Williams CLA 2014-11-11 01:32:19 EST
Just so I have it handy ... this is the commit to revert, if local test builds don't go well before the I-build. 

http://git.eclipse.org/c/platform/eclipse.platform.releng.aggregator.git/commit/?id=983f7b3e66b7a7701bd97a275daef02624e5f1a5
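
For reference, the revert itself would be the usual one-liner (assuming the commit still reverts cleanly against the aggregator's current state):

  # Back out the index-generator changes if test builds go badly.
  git revert 983f7b3e66b7a7701bd97a275daef02624e5f1a5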
Comment 7 David Williams CLA 2014-11-11 17:04:47 EST
(In reply to David Williams from comment #6)
> Just so I have it handy ... this is the commit to revert, if local test
> builds don't go well before the I-build. 
> 
> http://git.eclipse.org/c/platform/eclipse.platform.releng.aggregator.git/
> commit/?id=983f7b3e66b7a7701bd97a275daef02624e5f1a5

Surprisingly, the test results from the I-build (still) look "messed up" in 

http://download.eclipse.org/eclipse/downloads/drops4/I20141111-0830/testResults.php

I'll see if I can fix it "manually". From looking at the build machine, it appears the Hudson results are retrieved to the correct place, and if we're lucky, it's just a side effect of the performance test "finishing last" and somehow stepping on the unit test "output" location.
Comment 8 David Williams CLA 2014-11-11 22:32:10 EST
(In reply to David Williams from comment #5)
> (In reply to David Williams from comment #3)
> > (In reply to Markus Keller from comment #2)
> 
> > 
> > Yes, that's actually a "separate bug" ...

> > I had forgotten I was "in the middle" of changing some of the other parts of
> > the "generate index" custom ant task, and had to rebuild that feature to
> > pick up "performance.ui". 

> ... I was wrong. That "mess" wasn't due to the
> "build tool" changes, but several other bugs in the "performance processing"
> part of my additions. 
> 

I hate to admit it, but I was wrong about being wrong! The error was ultimately due to a "half changed" index generator in "build tools" ... so I corrected that ... and a few other bugs in my performance processing scripts ... and now it looks correct, at 

http://download.eclipse.org/eclipse/downloads/drops4/I20141111-0830/testResults.php

Well, correct except for some formatting problems ... and the fact that there are absolutely no failures, which always looks suspicious :) [But I think it is accurate, as the summary table also says zero errors, and the data for that comes from another source -- directly from Hudson.] 

And besides that, the patient lived! Performance results are still at 
http://download.eclipse.org/eclipse/downloads/drops4/I20141111-0830/performance/performance.php

Some of which "look right" ... though don't but suspect they are explainable by inability of some "4.4 tests" to run against "4.5".   

Some links, such as 
 
 org.eclipse.ant*
   don't have any data ... I assume because those tests fail completely on the "baseline" version (due to a change in prerequisites; there is a bug for that). 

Some such as 
   org.eclipse.jdt.ui*
     only have one data point in the graphs ... I suspect there were many tests that failed (due to "incompatible change in hierarchy" type errors), but a few that passed? 

Some such as 
   org.eclipse.jdt.ui*
     have two data points (I-build and reference build) ... but only one pair, since we've only collected one "I-build" so far, so line graphs are not too interesting. 

Next steps, in order of priority, are to get the unit test results (from the performance tests) to display, so it will be easier to see which fail where, and whether they can be fixed or eventually branched. 

And, then the "fingerprint graphs". 

AND THEN I will add more Hudson jobs to run the "long performance tests" as well (such as "jdt.core").  

I'd estimate that will be in place by next week's I-build ... unless there are surprises.
Comment 9 David Williams CLA 2014-11-17 04:45:25 EST
Created attachment 248690 [details]
screen shot showing "history" on personal "build test" machine

To update status: 

I think good progress: the "unit test results" are now showing, for both "current performance tests" and "baseline performance tests" ... and I think in a good way, in that they use the same/similar format as the "regular" JUnit results (which apparently used to not be the case?). [Though some formatting work is still needed, such as making the tables the same width.] 

One change is that I changed the indexer to take a new argument, to NOT display the "missing files" list (since that list is so inaccurate, especially for performance tests). Note: behavior is the same for regular unit tests. Hopefully I can create "the right" fix eventually, but it won't be this milestone. I did belatedly think I should at least leave a reminder note that they are not displayed (bug 451810). 
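
Purely to illustrate the shape of the change (the property name and build file below are hypothetical, not the actual argument the indexer takes):

  # Hypothetical invocation: pass a flag to the index-generator Ant
  # task to suppress the "missing files" section of the generated page.
  ant -f generateIndex.xml -DsuppressMissingFilesList=true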

But getting these unit tests to display was so hard that I've not looked at "fingerprints" at all yet. 

I am beginning to wonder if I should focus on getting the "long running" tests running (for I-builds only) and worry about "fingerprints" after that? 

The other "issue" -- and purpose of this "attachment" is that apparently only for I-builds is a "history" kept, and displayed. See attachment for an example, from my local test machine, where I normally do only "I-builds". Versus the N-builds on eclipse.org, there are always only "two points", the reference, and current build. 

I think I've seen that that's "configurable" ... need to investigate. 

But, bottom line, having the "unit tests" displayed is a huge step forward for committers, since besides investigating "performance problems", many will need to investigate how to get performance tests to run again. Or, I suppose, if that is not possible, we should probably disable performance tests that cannot be fixed any time soon, just to avoid the noise and "wasted processing".
Comment 10 David Williams CLA 2014-12-02 12:05:20 EST
I'm going to count the "script part" of this work done. There are still problems with the tests, but I don't think the problem is with the scripts ... rather with either the "writing" of the data or the "reading" of it. 

Any other "script changes" needed for specific issues would deserve it's own bug.