Bug 377365 - get unit tests running (again) on build.eclipse.org
Summary: get unit tests running (again) on build.eclipse.org
Status: RESOLVED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Releng
Version: 4.2
Hardware: PC Linux
Importance: P2 normal
Target Milestone: 4.2 RC1
Assignee: David Williams CLA
Depends on: 372880 377453 377592 377670 377680 377857 377859 378303 378631 378784 379013 379018
Blocks: 355430
 
Reported: 2012-04-22 20:15 EDT by David Williams CLA
Modified: 2012-05-23 09:54 EDT (History)
6 users

Description David Williams CLA 2012-04-22 20:15:54 EDT
I guess there's not a general bug for this, not one that I could find ... so, thought I'd open this one to document the general state of getting the unit tests running (again) on build.eclipse.org, via hudson. 

As I understand it, before leaving, Kim had tests running for 3.8 builds, but not fully automated and integrated; she had solved several issues of running the tests, and was in the middle of tracking down problems in the new environment, such as 
bug 372880 and others. 

And then I come along and break all that :) 

The first issue I encountered was that the hudson job used to check out both basebuilder and eclipsebuilder (from CVS). That was no longer possible, since eclipsebuilder is in git and, apparently, Hudson only lets you use one SCM for its "check out" step. 

So, I changed  it so that it gets the eclipsebuilder from git, added a script there that gets the base builder from cvs, and then invokes what used to be invoked on Hudson. 

I'm focusing on Linux first, trying semi-manual invocation of the tests (not from the build process itself), and making progress that way. Things will get more complicated for Windows and Mac, since, I'm assuming, the code to fetch the base builder and invoke the original "ant runner xml file" has to be machine specific. 

I'm not sure how much of the "base builder" is actually needed. Perhaps some of that code could move to eclipsebuilder so there would be just one "checkout"? 
I'm not sure that much is needed from eclipsebuilder either ... I mean, there are definitely scripts, etc., that live there under the "buildconfigs" directory, but it seems like (long term) there should be a whole separate "testeclipse" project.
Comment 1 David Williams CLA 2012-04-22 20:20:48 EDT
The first (and probably continuing) problem I'm encountering is that there are many assumptions about "relative directories" ... such as that eclipsebuilder and basebuilder would be "peers" in the directory structure.

That didn't seem very good to me, so I'm trying to solve it by "passing in" the directory locations of where they are. The initial steps now work, but I'm not sure if the relative directory issue will resurface. 

I'll also confess, I'm learning the "meaning" of variables a bit by trial and error.  

For example, the test results are coming back to 

/shared/eclipse/eclipse4N/siteDir/eclipse/downloads/drops4/testresults

which was probably supposed to be 
/shared/eclipse/eclipse4N/siteDir/eclipse/downloads/drops4/${buildId}/testresults
Comment 2 David Williams CLA 2012-04-22 20:34:41 EDT
So the tests are (sort of) running, and you can (temporarily) see some of the "raw" results at 
http://build.eclipse.org/eclipse/eclipse4N/siteDir/eclipse/downloads/drops4/testresults/2012-02-28_16-58-09/eclipse-testing/results/html/ 

(Not sure yet how/where the summary gets created) 

I've looked at only a few of the test results; some pass ... some completely fail (likely due to setup/script issues). 

And there are "disturbing" messages in the Hudson log, such as the most recent one (which won't be around for long) at 

https://hudson.eclipse.org/hudson/view/Eclipse%20and%20Equinox/job/eclipse-JUnit-Linux2/46/console

ends with ... 

     [exec] Executing 'bash' with arguments:
     [exec] './testAll.sh'
     [exec] 
     [exec] The ' characters around the executable and arguments are
     [exec] not part of the command.
     [exec] caution: filename not matched:  */plugins/org.eclipse.test*
     [exec]      [java] Java Result: 1
     [exec]      [java] Java Result: 2
FATAL: command execution failed
hudson.util.IOException2: Failed to join the process

After the 
     [exec] caution: filename not matched:  */plugins/org.eclipse.test*
     [exec]      [java] Java Result: 1
is where the process spins for a long time, apparently finding and running some tests, since by watching the hudson workspace I can see the test results slowly getting more and more *.html and *.xml files as it finishes tests. 

And then it does eventually "send" a zip file of results back to the build machine process that invoked the test via the invokeTestsJSON.xml file. 
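The wait-and-fetch step described above can be sketched as a simple polling loop. This is a sketch only: the file name, timeout, and interval are illustrative placeholders, not the actual values used by invokeTestsJSON.xml (the real job waited hours for a much larger archive).

```shell
#!/bin/bash
# Poll for the results zip the test job sends back, up to a timeout.
# "results.zip" and the 3-second timeout are placeholders for this sketch.
resultsZip="results.zip"
timeout=3
waited=0

while [ ! -f "$resultsZip" ] && [ "$waited" -lt "$timeout" ]; do
  sleep 1
  waited=$((waited + 1))
done

if [ -f "$resultsZip" ]; then
  echo "results received after ${waited}s"
else
  echo "timed out waiting for results after ${waited}s"
fi
```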

My current plan is to keep working through these issues, learning the scripts, variables, etc., to get them fully working/automated enough so that we can at least present test results for Linux. THEN, I'd start on Windows, and then on the Mac.
Comment 3 David Williams CLA 2012-04-23 02:06:06 EDT
FWIW, I've tracked down (mostly) the reason for the message about 

caution: filename not matched:  */plugins/org.eclipse.test*

It is coming from a piece of code that says, 

unzip -qq -o -C eclipse-junit-tests-*.zip */plugins/org.eclipse.test* -d eclipse/dropins/

And, the reason it doesn't match, is due to the initial '/' in the pattern. If instead, it said

unzip -qq -o -C eclipse-junit-tests-*.zip *plugins/org.eclipse.test* -d eclipse/dropins/

Then it would match exactly three files: 

org.eclipse.test.performance_3.8.0.N20120422-2000.jar
org.eclipse.test.performance.win32_3.1.100.jar
org.eclipse.test.source_3.3.100.jar

So ... I guess we don't care about those anyway (at the moment), and, I'm assuming, the other tests are unzipped/installed somewhere else ... which I have yet to find. :/ 

This code, btw, is in "runtest*" scripts in 
/org.eclipse.releng.eclipsebuilder/eclipse/buildConfigs/sdk.tests/testScripts
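The leading-slash mismatch can be illustrated with plain shell glob matching. A sketch, with assumptions: the entry name below is hypothetical, chosen to mirror the layout implied by the jars listed above (plugins/ apparently sits at the zip root, with no parent directory), and bash `case` patterns are used here only as an approximation of unzip's shell-style wildcard matching.

```shell
#!/bin/bash
# Hypothetical archive entry: plugins/ at the zip root, so there is no
# parent directory component in front of it.
entry="plugins/org.eclipse.test.performance_3.8.0.N20120422-2000.jar"

# Pattern with the leading "*/": requires a literal "/" before "plugins",
# so a root-level plugins/ entry never matches.
case "$entry" in
  */plugins/org.eclipse.test*) bad_pattern="match" ;;
  *)                           bad_pattern="no match" ;;
esac

# Pattern without the slash: "*" can match the empty string, so the
# root-level entry does match.
case "$entry" in
  *plugins/org.eclipse.test*) good_pattern="match" ;;
  *)                          good_pattern="no match" ;;
esac

echo "with */ : $bad_pattern"
echo "without : $good_pattern"
```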
Comment 4 Kim Moir CLA 2012-04-23 21:09:48 EDT
Hi David

I don't know if you know, but there is an invokeTestsJSON.xml script in eclipsebuilder that I was using to invoke the tests in both the 4.2 and 3.8 stream builds.  It would probably have to be updated to reflect the current changes.  Anyways, I thought I would mention this in case it was helpful. Of course, the token in the Hudson job has to be added to the script that invokes the tests on the three platforms.
Comment 5 David Williams CLA 2012-04-23 21:43:11 EDT
(In reply to comment #4)
> Hi David
> 
> I don't know if you know but the there is an invokeTestsJSON.xml script in
> eclipsebuilder that I was using to invoke the tests in both the 4.2 and 3.8
> stream builds.  It would probably have to be updated to reflect the current
> changes.  Anyways, I thought I would mention this in case this was helpful. Of
> course, the token in the Hudson job has to be added to the script that invokes
> the tests on the three platforms.

Yes, luckily you'd given me just enough hints at EclipseCon that I could figure this out after only an hour or so of study :) 

To test the unit tests, I'm currently invoking that "manually" (with a "canned" set of values) and waiting waiting waiting for it to run, then waiting waiting waiting waiting for it to finish. I have gotten back results, once or twice ... but ... several hiccups. And it's slow going, since hudson is slow and busy.
Comment 6 David Williams CLA 2012-04-23 23:16:20 EDT
I think I might see the source of the problem for some of the "property" issues. 

I do see a file copied to the main working directory, called testing.properties, which has values such as 

#directory on test machine where automated testing framework will be installed
testDir=${testDir}

#directory where test scripts are launched
executionDir=${testDir}/eclipse-testing


And, I _suspect_ those "embedded" values are not being expanded to the actual values, either because I've changed where that property file is loaded ... 
or, changed the order of initialization. 

So, I'll be looking at it from that angle.
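A quick way to confirm that suspicion would be to check the copied file for literal, unexpanded placeholders. A sketch only: it recreates the testing.properties fragment quoted above, and the grep check is my own illustration, not part of the actual build scripts.

```shell
#!/bin/bash
# Recreate the suspect fragment of testing.properties for illustration.
cat > testing.properties <<'EOF'
#directory on test machine where automated testing framework will be installed
testDir=${testDir}

#directory where test scripts are launched
executionDir=${testDir}/eclipse-testing
EOF

# If whatever loads the file had expanded the properties, no literal "${"
# would survive in the values.
unexpanded=$(grep -c '\${' testing.properties)
echo "unexpanded placeholder lines: $unexpanded"
```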
Comment 7 David Williams CLA 2012-04-26 10:54:24 EDT
Status: 

I think I've fixed the fundamental issues of getting the builds running, again. 

I get the eclipsebuilder, using hudson SCM control, when a build starts. 

As a first build step, I use an ant script in eclipsebuilder to get basebuilder. 

As a second build step, I invoke "antrunner" (from base builder) to run  runTests2.xml (from eclipsebuilder). 

One test actually completed, after about 11 hours, using an I-build from a few days ago (and I don't think I had all the right paths to find parameters, etc., correctly for that build, so it may be less accurate than it should be) but ... it did finish! 

1022 failures from 81,369 tests. Sounds like the right neighborhood. 

Nothing was "copied back" to the build machine (the script was originally made to wait only 6 hours to get results ... I've now changed that to 12 ... but I'm not sure if that's a sign something is badly wrong ... or if it just never got to the "completion point" before). But the build results for that run can be seen on Hudson itself, at 
https://hudson.eclipse.org/hudson/view/Eclipse%20and%20Equinox/job/JUnit-win2/lastCompletedBuild/testReport/

A similar test on the Mac took about the same amount of time, but failed at the end, I think while producing the test summaries on hudson, with an "out of memory" error: 

https://hudson.eclipse.org/hudson/job/eclipse-JUnit-mac2/26/console

The linux tests, on a similarly old I-build, should finish soon, and I think that one had all the "property paths" basically correct, so it should be the most accurate (if it doesn't run out of memory or otherwise crash): 

https://hudson.eclipse.org/hudson/view/Eclipse%20and%20Equinox/job/eclipse-JUnit-Linux2/64/
Comment 8 David Williams CLA 2012-04-26 11:22:23 EDT
Some other misc. notes: 

There is a job "rmf-nightly" defined in the "Eclipse and Equinox" view. 

From appearances, it seems unrelated to "Eclipse and Equinox", so unless someone knows otherwise, I'll try to remove it from the view, assuming I have permission to edit the view. 

Also, there are two versions of all the eclipse tests, such as 

eclipse-JUnit-Linux and 
eclipse-JUnit-Linux2

I _think_ this is for the differences between 3.8 and 4.2, such as 3.8 downloads being in "drops" and 4.2 in "drops4", with similar differences in update sites: .../updates/3.8-I-builds vs .../updates/4.2-I-builds. 
I don't think there is a need for two jobs; with a little more bash/ant programming to "compute" the right values, I plan to get the "2" versions working and eventually remove the "blank" versions if unused. 
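The "compute the right values" idea amounts to deriving the stream-specific locations from one variable. A sketch with hypothetical variable names (these are not the actual hudson job parameters), using the drops/drops4 and update-site differences noted above:

```shell
#!/bin/bash
# Derive download and update-site paths from a single stream value, so one
# job definition could serve both 3.8 and 4.2. Names here are illustrative.
stream="4.2"

case "$stream" in
  3.*) dropsDir="drops"  ;;
  4.*) dropsDir="drops4" ;;
  *)   echo "unknown stream: $stream" >&2; exit 1 ;;
esac
updateSite="updates/${stream}-I-builds"

echo "$dropsDir $updateSite"
```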

There is also a job on that page called
eclipse-equinox-test-I
no idea yet what that's for. 

There is also a job named 
eclipse-e4-test
which has never run. 
Not sure if that's for "incubator" or if it's something erroneously created, but I'll try to track it down. 

If anyone has any insights into these extra jobs, that would be appreciated.
Comment 9 David Williams CLA 2012-04-26 11:28:46 EDT
Oh, I see that the eclipse-e4-test has a description of 

test integration build from the R4_HEAD branch of org.eclipse.releng to test the integration of 3.7 stream bundles with e4 bundles

and last "attempt" to run it was 1 year ago. So, pretty sure that safely be removed. 


Plus, a scary thought: I wonder if the "12 hour instead of 6 hour" time differences are due to performance differences between 4.2 and 3.8. Guess that means 3.8 I-builds deserve some quick testing attention. (I've been focused on 4.2 I-builds so far ... I plan to look at N-builds last.)
Comment 10 David Williams CLA 2012-04-26 14:55:11 EDT
(In reply to comment #9)
> Oh, I see that the eclipse-e4-test has a description of 
> 
> test integration build from the R4_HEAD branch of org.eclipse.releng to test
> the integration of 3.7 stream bundles with e4 bundles
> 
> and last "attempt" to run it was 1 year ago. So, pretty sure that safely be
> removed. 
> 
> 

Update on "hudson jobs" in view, I did remove the rmf-nightly and sent a personal email to the "notification" address listed in the job, in case it was supposed to be there, for some hard to grop reason. 

While looking at the whole list of jobs, I also saw 

eclipse-equinox-test-N (as well as eclipse-equinox-test-I), so I added it to the view, until I figure out what they are, or if they are needed. (They "last ran" two months ago, so ... might be something Kim was actively working on?) 

Also saw Eclipse-test-1, which looks like one of Kim's early attempts ... so I'll plan on deleting that job, unless someone says otherwise (last run was over a year ago). 

Also saw eclipse-sdk-perf-test, which I don't think is actively being used, but there are probably plans to, so I will leave that (and put it on the "view" so it won't be forgotten about).
Comment 11 David Williams CLA 2012-04-27 00:21:13 EDT
Just to document it: I'm simplifying "runTests2.xml" and invokeTestsJSON.xml so that only three variables need to be specified; it can figure out the rest (I think it already could ... and invokeTestsJSON.xml was just never updated?). This means the "parameters" defined for hudson jobs can be greatly reduced. 

Also, I'm adding a change so the test script does not fail outright if the cvstest.properties file is not found. Those tests would fail without it, but ... it seems easier to try the build tests "locally" without requiring each and every little thing the full official tests need.
Comment 12 David Williams CLA 2012-04-27 12:28:32 EDT
I'm not sure how to interpret these results, exactly, but after "cranking down" the timeout limit to 15 minutes per suite (bug 377859), the latest attempt to run the linux tests (using a 3.8 build) took 6 hours of elapsed time, reporting only 12 failures, but also reporting only 28,722 tests.  Also odd: Hudson's test report summary claims only an hour and a half of elapsed time.  Maybe the other 4.5 hours and missing tests were due to all the suites that took more than 15 minutes? 


https://hudson.eclipse.org/hudson/view/Eclipse%20and%20Equinox/job/eclipse-JUnit-Linux2/72/testReport/

Be aware too that the webmasters are investigating overall sluggishness (bug 377453) ... looks like there are a few rogue processes ... hope we aren't some of them :) 

My next experiment might be to turn the limit up to 30 minutes, but take out the "known 2 hour suite" (jdt core) temporarily, to see if we can get a better picture ... still within 5 or 6 hours?
Comment 13 David Williams CLA 2012-04-29 14:56:22 EDT
Latest status: taking out jdt.core didn't help much, and at John's suggestion, I limited the tests to just, basically, "platform" test suites. With the 30 minute upper bound, these (still) take 5 to 6 hours, and I expect there are still 5 or 6 of those that really take more than 30 minutes and end up timing out, due to the artificial limit. 

You can see the current list of test suites commented out in bug 377718. Of those left "in", if anyone has any suggestions of other known long-running cases, that would be helpful.  

But at least at 5 or 6 hours I can "see" results reliably, so I think I can make progress getting the setup a little better, and the results we get back re-integrated with the build pages. 

But, in parallel, if anyone is able (i.e. has time), especially those of you at the "platform or equinox or p2" level, to look at the results directly on "hudson", you might spot some obvious setup issues given the logs and failures that can be seen there. I'll give an outline of "how to look at hudson results", but a) eventually we should put these on a wiki :) and b) some of you probably know more about it than me. 

The following URLs refer to the "windows" tests for "job 90", but similar URLs exist for the other platforms and jobs. 

The "starting point" for looking at all our test jobs is 

https://hudson.eclipse.org/hudson/view/Eclipse%20and%20Equinox/


For overall job results, you'd drill down to a platform and a recent job, such as 

https://hudson.eclipse.org/hudson/view/Eclipse%20and%20Equinox/job/JUnit-win2/90/

The console link on the left is a good one to see how unit testing is started; I added <echoproperties> there so you can see everything that is "set" as the tests are called. Here you might find some errors obvious to you, if you are expecting certain values for properties. 

https://hudson.eclipse.org/hudson/view/Eclipse%20and%20Equinox/job/JUnit-win2/90/console

You can use the "parameters" link on the left to see what build was actually used for that test run. 

https://hudson.eclipse.org/hudson/view/Eclipse%20and%20Equinox/job/JUnit-win2/90/parameters/


To actually "see reulsts" you have to start back as the top, such as 

https://hudson.eclipse.org/hudson/view/Eclipse%20and%20Equinox/job/JUnit-win2/

and drill down to and through the "workspace" to end up at "results", such as 

https://hudson.eclipse.org/hudson/view/Eclipse%20and%20Equinox/job/JUnit-win2/ws/ws/eclipse-testing/results/

From there you can see the HTML results, and the logs, under a platform specific name, such as 

https://hudson.eclipse.org/hudson/view/Eclipse%20and%20Equinox/job/JUnit-win2/ws/ws/eclipse-testing/results/win32.win32.x86_6.0/

So, with all of that information ... it should be an easy matter for you to tell me what to fix in the build scripts :) ... well, at least maybe a hint. (A separate bug, blocking this one, is probably best, if you do see something specific.)
Comment 14 David Williams CLA 2012-04-30 15:27:01 EDT
Two important findings: 

a) I can use buildId to put test results in, instead of BUILD_ID. 
Not sure if I've "discovered" this or (more likely) I broke it along the way, but "buildId" is our usual build ID, such as I20120429-2000, whereas BUILD_ID is Hudson's build identifier at the time of the test job itself, such as 2012-04-29_21-32-44. That'll sure make it easier to get our results to a useful location. 
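The distinction, sketched with the example values above (the path layout follows the testresults location quoted in comment 1; the sketch itself is illustrative, not the actual build script):

```shell
#!/bin/bash
# buildId: the Eclipse build label under test.
# BUILD_ID: Hudson's timestamp for the test job itself -- the wrong key for
# results, since it varies per test run rather than per build.
buildId="I20120429-2000"
BUILD_ID="2012-04-29_21-32-44"

resultsDir="/shared/eclipse/eclipse4N/siteDir/eclipse/downloads/drops4/${buildId}/testresults"
echo "$resultsDir"
```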

b) Our windows test results were coming back identified as being run with a 1.6 VM, whereas I guess before, on IBM hardware, we were running them with a 1.7 VM? While I'm not positive what's being used, what's accurate, or where we need to get to, for the moment I changed the "template" to expect 1.6 reports (being identified as 1.7, the results generator was missing them completely). 

In a similar vein, I changed our test identification to be more generic: 
 Linux, 1.6 VM; MacOSX, 1.6 VM; Windows 7, 1.6 VM
They were hard coded in the testResults.php template as 
RHEL5, SUN 1.6.0; MacOSX, Apple 1.6.0_26-b03-384-10M3425; WIN XP, SUN 1.7.0
which I'm guessing was accurate for the old IBM environment. (We'll make all those variables, in the future ... the info is probably stored somewhere and I just don't know where to find it yet.) 

So, the almost-good-news is that with a fair amount of tweaking I can get the "test results generator" to run on my local machine and produce the traditional summary pages ... so with any luck tomorrow, for the one-third subset we are running, everyone will be able to easily see the hundreds of errors we are producing. :) [Due to setup, I'm sure, not code breakages.]
Comment 15 John Arthorne CLA 2012-04-30 17:09:26 EDT
I was quickly investigating all of the ProductActionTest test failures that look like this:

java.lang.NullPointerException
	at org.eclipse.equinox.p2.publisher.eclipse.ApplicationLauncherAction.createLauncherAdvice(ApplicationLauncherAction.java:89)

I can't see any possible reason for an NPE here. The test creates an object, passes it as an argument, and at the other end the argument is null. The only possibility that comes to mind is we aren't running the latest test and the line numbers are actually different, but I don't know how that could happen either.
Comment 16 David Williams CLA 2012-04-30 23:08:33 EDT
(In reply to comment #15)
> I was quickly investigating all of the ProductActionTest test failures that
> look like this:
> 
> java.lang.NullPointerException
>     at
> org.eclipse.equinox.p2.publisher.eclipse.ApplicationLauncherAction.createLauncherAdvice(ApplicationLauncherAction.java:89)
> 
> I can't see any possible reason for an NPE here. The test creates an object,
> passes it as an argument, and at the other end the argument is null. The only
> possibility that comes to mind is we aren't running the latest test and the
> line numbers are actually different, but I don't know how that could happen
> either.

Well ... good luck figuring that out. :/ 

I did "clean" all the test workspaces this evening and set the hudson jobs to "clean workspace" before each run. Not sure why it wasn't that way or what all effects will be ... but ... seems safest way to start off. 

And I wouldn't do it yet, but in some cases, if tests fail for no obvious reason, extra diagnostics might have to be added to the tests to see if things are as expected. But, at this point it could "be anything" ... there are lots of things to investigate before changing tests. 

FWIW, on the "jre 7" issue on windows, it _does_ appear a 1.7 jre is being used, from the name, 
c:\java\jdk7u2\jre\bin\java
so the test results are likely correctly marked with _7 ... I'll have to instigate more about why not picked up by "test summary generator" (but, will leave as _6, for the short term. (and hard to tell what's being used on the macs ... one implies java 5?) 

In other news, I _think_ I've hooked things up so tests will be kicked off automatically at the end of each build (starting with Wednesday morning's build ... I might try a few test builds on Tuesday). (But not yet the "summary pages" ... I will try a few more "manual" runs of that first.)
Comment 17 David Williams CLA 2012-05-01 19:09:44 EDT
To give some status: 

I am able to "summarize" the linux test results for the 3.8 build: 

http://build.eclipse.org/eclipse/eclipse3I/siteDir/eclipse/downloads/drops/I20120430-2000/testResults.php

(and probably the mac, based on 4.2; they just aren't done yet for 3.8). 

But windows still doesn't work ... I've changed the _6 back to _7 for Wednesday's builds as I don't think that was the problem I thought it was ... something more complicated. 

Worse, the "createIndex" script completely "trashes" the main build download page: 

http://build.eclipse.org/eclipse/eclipse3I/siteDir/eclipse/downloads/drops/I20120430-2000/

Not sure if that's due to other changes I've made, or if I don't know how to use the "createIndex" script, or both, but I am surprised it appears to be designed to "start fresh" and recreate everything, instead of just updating the "test" portion of the page(s). 

Conclusion: complicated.
Comment 18 David Williams CLA 2012-05-02 02:45:07 EDT
I've gotten some results up on the download server for the final builds from Monday. 

4.2: 
http://download.eclipse.org/eclipse/downloads/drops4/I20120430-1800/

3.8:
http://download.eclipse.org/eclipse/downloads/drops/I20120430-2000/

I figured out the "step on" problem so now the scripts re-generate the main page sanely. 

Still don't know why it doesn't "summarize" the windows tests (we do "get results back" from hudson). 

Still many "manual" steps involved that will take a day or three to automate.  

I enabled a few more test suites for Wednesday's builds, but left the time limit at 30 minutes ... so am hoping only an hour or two is added to the current 5 or 6 hour test time, for 7 or 8 total. There are still 10 or so suites disabled. 

But, hopefully this will give some results so others can begin to figure out what's not configured correctly, if the wrong tests or VMs are used, or whatever. 

Another strategy, moving forward, in the short term, is to remove any test suites we see with DNF. They either take more than 30 minutes normally, or are hanging for some reason. But I am pretty sure if we re-enabled them all and set the time limit back to 2 hours per suite, we'd be seeing 12 to 20 hours to complete (and then still lots of errors and DNFs). Guess we could set that in motion on the last build on Wednesday or something ... but ... if the overall test "times out" we sort of "lose" the results (they are probably still there on hudson but ... no summaries, and we'd have to look at each HTML file in Hudson).

And, we now clean the workspace whenever a test starts, so we can't look at results after the next test build starts. (I think "artifacts" are saved for 3 jobs ... but I don't think they are saved if we hit the overall time limit ... I think that's currently set at 15 hours.)
Comment 19 Thomas Watson CLA 2012-05-02 08:35:46 EDT
The OSGi test failures (all 37 of them) are due to the fact that the registry fragment's signer (org.eclipse.core.runtime.compatibility.registry) does not match the org.eclipse.equinox.registry host's signer.

You can see this in the logs at:

http://download.eclipse.org/eclipse/downloads/drops4/I20120430-1800/testresults/linux.gtk.x86_6.0/org.eclipse.osgi.tests.AutomatedTests.txt

I took a look at the signing certificate of each bundle and they do appear to be different.  Did the foundation change the signing certificate recently?

More information: this causes an issue when runtime verification is enabled, because additional checks are enabled that prevent two different signers from contributing classes to the same package for the same class loader.  This ultimately causes the registry bundle to fail to start when running the security tests that enable full runtime verification of signers.
Comment 20 David Williams CLA 2012-05-02 09:18:32 EDT
(In reply to comment #19)

> 
> I took a look at the signing certificate of each bundle and they do appear to
> be different.  Did the foundation change the signing certificate recently?
> 

Yes, they did, about 3 or 4 weeks ago. I'm not sure what the cure is here; I assume you do? We normally do NOT get the new certificate unless a jar's version/qualifier changes. (The comparator log is full of messages about "ECLIPSEF.SF is not in the new bundle"; the new "certificate" is named "ECLIPSE_.SF".) See bug 362445 for more detail.
Comment 21 Thomas Watson CLA 2012-05-02 09:59:24 EDT
I opened bug 378239 to force the fragment to get signed again.
Comment 22 Dani Megert CLA 2012-05-02 11:20:13 EDT
> But, hopefully this will give some results so others can begin to figure out
> what's not configured correctly, if the wrong tests or VMs are used, or what
> ever. 

Javadoc logs are missing, see bug 378272.
Comment 23 David Williams CLA 2012-05-03 02:24:19 EDT
To give some status at the "end of my day": I did get some tests summarized up on downloads for the final M7 I-builds. The windows ones are still running ... not sure why they take longer? One of the sets of mac tests seems to have gotten lost ... not sure what happened. 

At any rate, I did re-enable all the tests, left in the 30 minute timeout, and started another set of test runs with those. With any luck, by the end of Thursday we'd have some results from all the tests. (Though, I'm sure, plenty of DNFs, since some (many?) take more than 30 minutes.)
Comment 24 David Williams CLA 2012-05-03 10:40:38 EDT
The bad news: many of the tests I'd hoped would be long running and "run everything" failed for misc. reasons (not fully understood or investigated). 

The good news: the initial windows tests finally finished, and the missing Mac test "magically" showed up back in our results section. 

AND, the "create summary" task now works for the windows results too! 

So, end result is we have a summary for all three platforms, but not yet all tests (and, still 30 minute timeout). 

4.2: 
http://download.eclipse.org/eclipse/downloads/drops4/I20120502-1800/testResults.php

3.8:
http://download.eclipse.org/eclipse/downloads/drops/I20120502-2000/testResults.php


One question: does anyone know right off for sure what a dash (hyphen) in the summary table means? How's that different from a "DNF"? If no one knows, I'm sure I'll figure it out eventually ... I'm guessing there's some positive way to tell a test tried to run but never finished, vs. test results just missing? Or, could it mean no test results are expected on that particular platform? I'm assuming the former, since I don't see any dashes in the M6 test result summaries, but thought I'd ask to save time if anyone knows.
Comment 25 David Williams CLA 2012-05-13 13:12:15 EDT
Just to keep notes for myself, in finishing the automation for this, noted odd (or complicated) behavior from ant. 

We invoke a target, in parallel, three times to test on the mac, windows, and linux, and then it "spins and waits" for some results to come back. That part's been working (relatively) well, but to put in the "finishing touches": as results come back, we want to "regen" the test results page and upload the test results to download.eclipse.org. So ... I put those "final steps" into their own target, and at the end of the task waiting for the results put in a simple <antcall> to the new final-steps target. 

Turns out, somehow, ant was waiting until ALL parallel tasks were complete before calling even one of the "antcall" targets, and if that wasn't bad enough, once that target ran it got a weird message about the "echoproperties" task not being defined ... as though it had a completely different environment, or something. 

So, I moved "up" the code to do the finishing touches into the parallel task (and removed echoproperties ... just in case :) 

This was the final error message, repeated exactly the same, exactly three times. 



The following error occurred while executing this line:
/shared/eclipse/eclipse4I/build/supportDir/org.eclipse.releng.eclipsebuilder/invokeTestsJSON.xml:196: The following error occurred while executing this line:
/shared/eclipse/eclipse4I/build/supportDir/org.eclipse.releng.eclipsebuilder/invokeTestsJSON.xml:236: The following error occurred while executing this line:
/shared/eclipse/eclipse4I/build/supportDir/org.eclipse.releng.eclipsebuilder/invokeTestsJSON.xml:246: The following error occurred while executing this line:
/shared/eclipse/eclipse4I/build/supportDir/org.eclipse.releng.eclipsebuilder/genTestIndexes.xml:77: Problem: failed to create task or type echoproperties
Cause: the class org.apache.tools.ant.taskdefs.optional.EchoProperties was not found. 
        This looks like one of Ant's optional components.
Action: Check that the appropriate optional JAR exists in
        -/usr/share/ant/lib
        -/opt/buildhomes/e4Build/.ant/lib
        -a directory added on the command line with the -lib argument

Do not panic, this is a common problem.
The commonest cause is a missing JAR.

This is not a bug; it is a configuration problem
Comment 26 David Williams CLA 2012-05-13 17:51:53 EDT
The next "test run" to test automation was not much better. part of the code I "moved up" to parallal task still had an <ant task in it (not <antcall) but apparently that still caused all three of those "calls" to wait until all were done, and once they ran, got a blocking error message: 

/shared/eclipse/eclipse4I/build/supportDir/org.eclipse.releng.basebuilder/plugins/org.eclipse.build.tools/scripts/publish.xml:23: taskdef class org.eclipse.releng.generators.TestResultsGenerator

And, this one _might_ make sense. The "invokeTestsJSON.xml" might be running as a "pure" ant task, not under antRunner (and, just saying maybe ... I'd be surprised) ... 
but, in any case, either that, or it is not "inheriting" what it needs to. 

I think I'll change approaches. As the final step, we always would have had to create a "promote script", since the e4Build id cannot upload to "downloads" (only a committer id can), and I think (conceptually) I'll try to expand that script to invoke the "generate indexes" task. This will work nicely, if I can figure out why we have "permission" problems on some (not all) of the build machine directories. 
Tracked in bug 379359.
Comment 27 David Williams CLA 2012-05-15 14:52:34 EDT
My latest test to completely automate the "wait, generate, and re-upload" seemed to work well in its "proof of concept" form, so I've fixed the various remaining errors and typos involved with it, and tonight's build (5/15) will produce "the real thing", so test results should be uploaded as soon as they are ready ... well, within the 15 to 30 minutes involved with all the "waits" and processing. 


I think as a general-issue umbrella bug this has served its purpose. 

There are still many important fixes and improvements to make, such as bug 377718 and others, but if specific issues are found, please open a specific bug.
Comment 28 David Williams CLA 2012-05-16 13:03:48 EDT
FWIW, the "corrections to typos" I did yesterday to finish the automation I did in some dead code no longer called instead of the "real" code. That'll teach me to remove dead code :) So, I have just now, for tonight's build, make the corrections to the "real" code. (And, I've removed the dead code).